Sunday, September 28, 2008
Sum of and group by using awk
$ cat cont_details.txt
continent|mval|cval|kval
SA|2345|10|2.3
AF|123|12|4.5
SA|89|12.67
OC|890|10|2.3
EU|24|45|2.4
AF|90|10|10
NA|5678|12|89
AF|345|12|3.5
OC|90|78|5.6
OC|23|12|4.5
SA|1234|12|6.7
EU|90|12|10
AF|12|12|34
SA|909|12|56
Output required:
Select continent, sum(mval),sum(cval),sum(kval) group by continent
i.e. required output:
continent|mval|cval|kval
NA|5678|12|89
OC|1003|100|12.4
AF|570|46|52
SA|4577|46.67|77.67
EU|114|57|12.4
Awk solution:
$ awk -F"|" '
NR==1 {print}
NR!=1 {OFS="|";a[$1]+=$2;b[$1]+=$3;c[$1]+=$NF}
END{for (i in a){print i,a[i],b[i],c[i]}}
' cont_details.txt
Output:
continent|mval|cval|kval
NA|5678|12|89
OC|1003|100|12.4
AF|570|46|52
SA|4577|46.67|77.67
EU|114|57|12.4
Related post:
Implement group by clause functionality using awk
Subscribe to:
Post Comments (Atom)
© Jadu Saikia http://unstableme.blogspot.com

15 comments:
that is great! thanks a lot
Nice one
Excellent.. Much needed help indeed..Thanks
I used two of your tricks and perfect work for me. I just wonder how does your script work, because you refer to $1 and then your add with $2, it is a big question in my head. Could you please explain more clear for me. Thank you very much for your contribution here. Hope to hear from you, my professor.
@awkseeker,
Thanks for reading this blog.
I am putting an example below that will help you to understand awk associative-array. Please let me know if you still have any doubt on this.
$ cat file1.txt
Alex,147,1
Brian,056,7
Nanda,559,2
Lisa,221,0
Nanda,268,2
Alex,200,5
Alex,408,8
i)
#a[$1] registers the first field of the lines seen in the associative-array "a" (arrays are always associative in Awk)
$ awk -F "," '{a[$1]} END {for (i in a) print i}' file.txt
Nanda
Lisa
Alex
Brian
ii)
#Count of first field, sum of 2nd field, sum of 3rd field group by each first field
$ awk -F "," '{a[$1]++; b[$1]+=$2; c[$1]+=$3 } END {for (i in a) print i,a[i],b[i],c[i]}' file1.txt
Nanda 2 827 4
Lisa 1 221 0
Alex 3 755 14
Brian 1 56 7
iii)
#Sum of 2nd field, sum of 3rd field group by each first field
$ awk -F "," '{a[$1]+=$2; b[$1]+=$3 } END {for (i in a) print i,a[i],b[i]}' file1.txt
Nanda 827 4
Lisa 221 0
Alex 755 14
Brian 56 7
iv)
#Sum of 2nd field, sum of 3rd field group by each first field, same as ii)
$ awk -F "," '{a[$1]+=$2; b[$1]+=$3 } END {for (i in b) print i,a[i],b[i]}' file1.txt
Nanda 827 4
Lisa 221 0
Alex 755 14
Brian 56 7
Hope this helps.
Thank Juda, your blog is so valuable and save huge time to my work. I use you trick to do tabular text file and much more efficiency compare to traditional sql.
I am new to awk and have some basic of turbo c in previous time. however, your good explanation make me more clear and I would bother you if some doubts.
Best regards,
@Awkseeker : thanks so much for your comment. Its really inspiring. Feel free to ask any doubt on Unix one liner, awk, sed and I will definitely try to help.
nice tutor,
How about grouping with more than a column, i.e:
Data:
a, b, c, d
Require:
select a, b, sum(c), sum(d) group by a,b;
Thnk's
nice tutor,
How about grouping with more than a column, i.e:
Data:
a, b, c, d
Require:
select a, b, sum(c), sum(d) group by a,b;
Thnk's
@cef_luxan, something like this?
$ cat file.txt
US|A|23
IN|A|12
US|B|12
LK|C|20
US|B|12
$ awk -F"|" '
{OFS="|";a[$1]+=$NF}
END{for (i in a){print i,a[i]}}
' file.txt
LK|20
IN|12
US|47
$ awk -F"|" '
{OFS="|";a[$1"|"$2]+=$NF}
END{for (i in a){print i,a[i]}}
' file.txt
IN|A|12
LK|C|20
US|A|23
US|B|24
Hi Mr. Jadu,
Thank you for your excellent answer
Hi Juda, love to bother you again. I tried some SOAP and got the result in big XML. I want to use awk to archive the needed parameters only. Could you help me.
@awkseeker, sure, could you please put your input and expected output. Thanks.
Post a Comment