Tuesday, October 30, 2012

Time bucketing using Awk in Unix


Each line of input file file.txt is having UNIX epoch timestamp and some value.
$ cat file.txt 
Epoch,Value
1351605000,120
1351605060,130
1351605120,340
1351605180,200
1351605240,120
1351605300,890
1351605360,124
1351605420,450
1351605480,120
1351605540,120
1351605600,200
1351605660,120
1351605720,340
1351605780,670
1351605840,990
The UNIX epoch timestamps are of 1 minute interval. Converting the UNIX epoch timestamps to human readable format using awk 'strftime' function:
$ awk 'BEGIN {FS=OFS=","} NR != 1 {$1=strftime("%c",$1)} {print}' file.txt 
Output:
Epoch,Value
Tue 30 Oct 2012 01:50:00 PM GMT,120
Tue 30 Oct 2012 01:51:00 PM GMT,130
Tue 30 Oct 2012 01:52:00 PM GMT,340
Tue 30 Oct 2012 01:53:00 PM GMT,200
Tue 30 Oct 2012 01:54:00 PM GMT,120
Tue 30 Oct 2012 01:55:00 PM GMT,890
Tue 30 Oct 2012 01:56:00 PM GMT,124
Tue 30 Oct 2012 01:57:00 PM GMT,450
Tue 30 Oct 2012 01:58:00 PM GMT,120
Tue 30 Oct 2012 01:59:00 PM GMT,120
Tue 30 Oct 2012 02:00:00 PM GMT,200
Tue 30 Oct 2012 02:01:00 PM GMT,120
Tue 30 Oct 2012 02:02:00 PM GMT,340
Tue 30 Oct 2012 02:03:00 PM GMT,670
Tue 30 Oct 2012 02:04:00 PM GMT,990
Required: Group each 5 minutes entries together to one time bucket and perform a corresponding sum(Value) of the entries. E.g. All entries from '01:50:00 PM' (1351605000) till '01:54:00 PM' (1351605240) should be grouped together to '01:50:00 PM' (1351605000) time bucket, '01:55:00 PM' (1351605300) till '01:59:00 PM' (1351605540) to '01:55:00 PM' (1351605300) bucket and so on.
$ awk 'BEGIN {FS=OFS=","}  NR != 1 {arr[$1-($1%300)]+=$2} END {for (i in arr) print i,arr[i]}' file.txt 
Output:
 
1351605000,910
1351605300,1704
1351605600,2320
Converting UNIX epoch timestamps to human readable format:
$ awk 'BEGIN {FS=OFS=","}  NR != 1 {arr[$1-($1%300)]+=$2} END {for (i in arr) print i,arr[i]}' file.txt  | awk '
BEGIN {FS=OFS=","} {$1=strftime("%c",$1)} {print}'
Output:
Tue 30 Oct 2012 01:50:00 PM GMT,910
Tue 30 Oct 2012 01:55:00 PM GMT,1704
Tue 30 Oct 2012 02:00:00 PM GMT,2320
Related posts:
- Grouping files using Awk in Bash shell
- Awk convert epoch to date format
- SUM and GROUP BY using Awk in UNIX post1 post2
- COUNT number of occurrences using Awk
- Awk Associative array

© Jadu Saikia http://unstableme.blogspot.com