Monday, June 27, 2011

Awk - convert epoch to date in the same file


Example 1)
Input file:
$ cat test1.txt
2|Z|1219071600|AF|0
3|N|1219158000|AF|89
4|N|1220799600|AS|12
1|Z|1220886000|AS|67
5|N|1220972400|EU|23
6|R|1221058800|OC|89

Required output:
2|Z|Mon 18 Aug 2008 03:00:00 PM UTC|AF|0
3|N|Tue 19 Aug 2008 03:00:00 PM UTC|AF|89
4|N|Sun 07 Sep 2008 03:00:00 PM UTC|AS|12
1|Z|Mon 08 Sep 2008 03:00:00 PM UTC|AS|67
5|N|Tue 09 Sep 2008 03:00:00 PM UTC|EU|23
6|R|Wed 10 Sep 2008 03:00:00 PM UTC|OC|89

i.e. convert the UNIX epoch values in the 3rd field of the above file to a standard human-readable date format using Awk.

The Awk solution using the "strftime" function (available in GNU awk):
$ awk 'BEGIN {FS=OFS="|"}{$3=strftime("%c",$3)} {print}' test1.txt

More about strftime format specifiers can be found here
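
Note that "%c" depends on the current locale and time zone, so the same command may print a different layout on another machine. A minimal sketch, assuming GNU awk (gawk) and its optional third strftime argument (a UTC flag), spells the format out and pins the locale. The AWK, sample and converted variable names and the inlined rows are only for illustration:

```shell
# Prefer gawk if installed; plain awk is only a fallback and may lack strftime.
AWK=$(command -v gawk || command -v awk)

# Two rows from test1.txt, inlined so the snippet is self-contained.
sample='2|Z|1219071600|AF|0
3|N|1219158000|AF|89'

# LC_ALL=C keeps day and month names in English; the trailing 1 requests UTC.
converted=$(printf '%s\n' "$sample" | LC_ALL=C "$AWK" '
BEGIN { FS = OFS = "|" }
{ $3 = strftime("%a %d %b %Y %I:%M:%S %p UTC", $3, 1) }
{ print }')

printf '%s\n' "$converted"
```

The literal "UTC" in the format string is safe here only because the UTC flag is set; without the flag, "%Z" would be the honest way to print the zone name.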

I have also written a Python program that produces the same output; the script is here

Example 2)
Input file:
$ cat test2.txt
2|Z|time:1219071600|AF|0
3|N|time:1219158000|AF|89
4|N|time:1220799600|AS|12
1|Z|time:1220886000|AS|67
5|N|time:1220972400|EU|23
6|R|time:1221058800|OC|89

Required output:
2|Z|time:Mon 18 Aug 2008 03:00:00 PM UTC|AF|0
3|N|time:Tue 19 Aug 2008 03:00:00 PM UTC|AF|89
4|N|time:Sun 07 Sep 2008 03:00:00 PM UTC|AS|12
1|Z|time:Mon 08 Sep 2008 03:00:00 PM UTC|AS|67
5|N|time:Tue 09 Sep 2008 03:00:00 PM UTC|EU|23
6|R|time:Wed 10 Sep 2008 03:00:00 PM UTC|OC|89

The awk solution:
$ awk 'BEGIN {FS=OFS="|"}{$3="time:"strftime("%c",substr($3,6,10))} {print}' test2.txt
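
The substr($3,6,10) call assumes a five-character "time:" prefix and a ten-digit epoch. A possible variant, sketched here under the assumption of gawk's strftime with its UTC flag, splits the field on ":" instead so that neither length is hard-coded. The kv array and the shell variable names are hypothetical:

```shell
# Prefer gawk if installed; plain awk is only a fallback and may lack strftime.
AWK=$(command -v gawk || command -v awk)

# Two rows from test2.txt, inlined so the snippet is self-contained.
sample='2|Z|time:1219071600|AF|0
6|R|time:1221058800|OC|89'

labelled=$(printf '%s\n' "$sample" | LC_ALL=C "$AWK" '
BEGIN { FS = OFS = "|" }
{
    split($3, kv, ":")    # kv[1] is the label, kv[2] the epoch seconds
    $3 = kv[1] ":" strftime("%a %d %b %Y %I:%M:%S %p UTC", kv[2], 1)
    print
}')

printf '%s\n' "$labelled"
```

Because the label is taken from kv[1], the same command would keep a different prefix such as "ts:" unchanged.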


Monday, June 20, 2011

Filter line using Awk split function

Input file "file.txt" has data in the following format:
$ cat file.txt
232323 90 /T/1382/8558/365p/133 100 234 679
S/1234 90 /N/1389/5000/365s/5000 800 134 679
792363 80 /T/1381/9858/365q/133 100 234 779
136383 90 /K/1382/5000/365p/5000 500 934 979
136383 90 /T/1382/5000/279p/9000 100 134 601

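Required output: From the above file, print only the lines whose 3rd (space-separated) field, when split on the delimiter "/", has the value "5000" as its 4th piece. Because the 3rd field starts with "/", the 1st piece is empty, so "/N/1389/5000/365s/5000" has "5000" in 4th position. i.e. required output: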
S/1234 90 /N/1389/5000/365s/5000 800 134 679
136383 90 /K/1382/5000/365p/5000 500 934 979
136383 90 /T/1382/5000/279p/9000 100 134 601

Using awk:
$ awk '{
    split($3, arr, "/")
    if (arr[4] == 5000) {
        print $0
    }
}' file.txt

The awk function split(s,a,sep) splits a string "s" into an awk array "a" using the delimiter "sep".
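
To see why the index is 4 and not 3, it can help to print every piece that split produces. The following is a small illustration (the pieces variable is only for this demo); split also returns the number of pieces, stored here in n:

```shell
pieces=$(echo "/N/1389/5000/365s/5000" | awk '{
    n = split($0, arr, "/")    # n is the number of pieces
    for (i = 1; i <= n; i++)
        printf "%d=[%s] ", i, arr[i]
    print "n=" n
}')

echo "$pieces"
# prints: 1=[] 2=[N] 3=[1389] 4=[5000] 5=[365s] 6=[5000] n=6
```

The leading "/" produces an empty first piece, which shifts every later piece up by one.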

A simple UNIX bash script to solve this:
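$ while IFS= read -r line
do
    # 3rd whitespace-separated field of the line
    thirdf=$(echo "$line" | awk '{print $3}')
    # 4th "/"-separated piece of that field
    fourthf=$(echo "$thirdf" | awk -F "/" '{print $4}')
    # string comparison, and quote the line so its spacing is preserved
    [ "$fourthf" = "5000" ] && echo "$line"
done < file.txt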
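
The loop above starts two awk processes per input line. As a rough alternative, assuming bash (it relies on arrays and here-strings), the splitting can be done with the read builtin alone. The function name filter_5000 and its variables are made up for this sketch:

```shell
#!/usr/bin/env bash

filter_5000() {
    local line f1 f2 f3 rest parts
    while IFS= read -r line; do
        # Default IFS splits on whitespace: f3 receives the 3rd field.
        read -r f1 f2 f3 rest <<< "$line"
        # Split the 3rd field on "/"; the leading "/" gives an empty parts[0].
        IFS=/ read -r -a parts <<< "$f3"
        # Bash arrays are 0-based, so awk's arr[4] corresponds to parts[3].
        if [ "${parts[3]}" = "5000" ]; then
            printf '%s\n' "$line"
        fi
    done
}

# Three rows from file.txt, inlined so the snippet is self-contained.
filtered=$(filter_5000 <<'EOF'
232323 90 /T/1382/8558/365p/133 100 234 679
S/1234 90 /N/1389/5000/365s/5000 800 134 679
136383 90 /T/1382/5000/279p/9000 100 134 601
EOF
)

printf '%s\n' "$filtered"
```

With the real data this would be called as filter_5000 < file.txt. For large files the single awk command is still likely to be faster than any shell loop.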


© Jadu Saikia http://unstableme.blogspot.com