Monday, November 24, 2008
Print range of columns using awk - exclude range
This is useful when a file has a large number of columns and you want either to print a range of columns or to exclude a range and print the rest.
Let's look at a small example.
Input file:
$ cat details.txt
Name Age Sex Add DOB CARD
AXU 12 M IN 12-Jul Y
ANI 13 F IN 10-Jan N
JCK 16 M JP 03-Frb Y
LON 12 M IN 12-Oct Y
A) Print a range of columns using awk:
e.g. Print from column 2 till column 4 using awk
$ awk -v f=2 -v t=4 '{ for (i=f; i<=t;i++) printf("%s%s", $i,(i==t) ? "\n" : OFS) }' details.txt
Age Sex Add
12 M IN
13 F IN
16 M JP
12 M IN
B) Exclude column range from printing in awk:
e.g. Exclude column range 2-4 and print the rest of the columns from the above file details.txt
$ awk -v f=2 -v t=4 '
{ for (i=1; i<=NF; i++)
    if (i>=f && i<=t) continue;
    else
      printf("%s%s", $i, (i!=NF) ? OFS : ORS) }' details.txt
Name DOB CARD
AXU 12-Jul Y
ANI 10-Jan N
JCK 03-Frb Y
LON 12-Oct Y
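One thing to watch with the exclude example above: the line ending is printed only when the last field is printed, so if the excluded range reaches the last column the records would run together. Below is a minimal sketch of one way around that, assuming a hypothetical shell function exclude_cols (not part of the original post) and a temporary copy of the sample file. It puts the separator before each field instead of after it.

```shell
# Hypothetical helper: print every field except those in the range [from, to].
# Usage: exclude_cols FROM TO FILE
exclude_cols() {
    awk -v f="$1" -v t="$2" '{
        sep = ""                          # nothing goes before the first printed field
        for (i = 1; i <= NF; i++) {
            if (i >= f && i <= t) continue
            printf "%s%s", sep, $i
            sep = OFS                     # every later field is preceded by OFS
        }
        printf "%s", ORS                  # always terminate the record
    }' "$3"
}

# Recreate the sample input in a temporary directory (an assumption for this sketch).
tmp=$(mktemp -d)
cat > "$tmp/details.txt" <<'EOF'
Name Age Sex Add DOB CARD
AXU 12 M IN 12-Jul Y
ANI 13 F IN 10-Jan N
JCK 16 M JP 03-Frb Y
LON 12 M IN 12-Oct Y
EOF

exclude_cols 2 4 "$tmp/details.txt"   # same result as the example above
exclude_cols 4 6 "$tmp/details.txt"   # range that includes the last column
```

With the range 4-6 each record still ends with a newline, because the terminator no longer depends on which field was printed last.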
Labels:
Awk,
awk newbie,
Bash,
bash scripts,
bash shell
Wednesday, November 19, 2008
Awk FNR variable usage example
I have discussed the awk FNR variable in several of my previous posts. Here is one more example of using NR==FNR in awk.
Description:
NR is the Number of the current Record (line) being processed.
FNR is the Number of the current Record within the current file.
So while the first file is being processed, NR and FNR are equal; on the first line of the second and each subsequent file, FNR starts again from 1 while NR keeps counting.
A common use of this: when NR==FNR (i.e. while processing the first file), an associative array is built up that stores the first field in an array element indexed by that same first field,
and
when NR!=FNR (i.e. while processing the second and subsequent files), the array is checked for an element indexed by the first field; if one exists, the default action (printing the whole line being processed) is carried out.
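The array idiom described above can be sketched like this. The file names keys.txt and data.txt and their contents are made up for illustration; they are not from the original post.

```shell
tmp=$(mktemp -d)
printf '%s\n' '10' '30' > "$tmp/keys.txt"                        # hypothetical first file
printf '%s\n' '10 AD' '20 NA' '30 PS' '50 KR' > "$tmp/data.txt"  # hypothetical second file

matched=$(awk '
    NR == FNR { seen[$1] = $1; next }   # first file: store each first field, indexed by itself
    $1 in seen                          # later files: default action prints lines whose first field was stored
' "$tmp/keys.txt" "$tmp/data.txt")

printf '%s\n' "$matched"
```

Only the lines of data.txt whose first field also appears in keys.txt are printed. Note that if the first file is empty, NR==FNR is also true for the second file, so the idiom assumes a non-empty first file.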
Input file:
$ cat file.txt
10 AD
20 NA
30 PS
50 KR
Required:
Add a third column to the above file containing the sum of the values in the first column.
i.e. required output:
10 AD 110
20 NA 110
30 PS 110
50 KR 110
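awk solution (the file is passed twice: the first pass computes the sum, the second pass appends it to each line):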
$ awk 'NR==FNR{sum+=$1;next}$0=$0 FS sum' file.txt file.txt
10 AD 110
20 NA 110
30 PS 110
50 KR 110
Related posts for NR==FNR in awk:
- Match words between two file using awk
- Perform join using awk
- Update a file based on another file using sed
- Delete lines based on another file using awk
- Update file based on another file using awk
Labels:
Awk,
awk FNR,
awk newbie,
awk variables,
Bash,
bash scripts,
bash shell
Monday, November 17, 2008
Concatenate lines using awk in bash
Input file:
$ cat rft01.txt
data set 01
unid=ef023; pid=34
data set 03
data set 09
unid=ef028; pid=36
data set 02
unid=ef021; pid=54
Output Required:
Concatenate lines in the above file so that the output looks like this:
data set 01 unid=ef023; pid=34
data set 03
data set 09 unid=ef028; pid=36
data set 02 unid=ef021; pid=54
Awk solution:
$ awk 'END{print RS}$0=(/^data set/?NR==1?_:RS:FS)$0' ORS= rft01.txt
data set 01 unid=ef023; pid=34
data set 03
data set 09 unid=ef028; pid=36
data set 02 unid=ef021; pid=54
And if you want the output like this:
data set 01 unid=ef023; pid=34
data set 09 unid=ef028; pid=36
data set 02 unid=ef021; pid=54
The awk solution would be:
$ awk '/^data set/{s=$0;next}{print s " "$0}' rft01.txt
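The first one-liner above is quite compact, so here is a longer sketch of what I understand it to do, written with explicit printf calls instead of assigning to $0 with an empty ORS. The temporary directory is only an assumption to keep the sketch self-contained.

```shell
tmp=$(mktemp -d)
cat > "$tmp/rft01.txt" <<'EOF'
data set 01
unid=ef023; pid=34
data set 03
data set 09
unid=ef028; pid=36
data set 02
unid=ef021; pid=54
EOF

joined=$(awk '
    /^data set/ {                     # a "data set" line starts a new output line
        if (NR > 1) printf "\n"       # close the previous output line first
        printf "%s", $0
        next
    }
    { printf " %s", $0 }              # any other line is appended after a space
    END { printf "\n" }               # terminate the last output line
' "$tmp/rft01.txt")

printf '%s\n' "$joined"
```

Each "data set" line starts a fresh output line and every other line is glued onto the current one, so "data set 03", which has no following unid line, stays on its own.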
Related post:
- Merging lines using awk
- Merge previous line using sed
Labels:
Awk,
awk concatenation,
awk newbie,
Bash,
bash shell
Wednesday, November 12, 2008
List empty directories using find in bash
The find command provides a test called "-empty", which lets us list empty regular files or empty directories.
e.g.
To list all the empty directories
$ find . -type d -empty
Output:
./bdb/prac
./sim/old/data
./prac/testdir
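The same test works for empty regular files with -type f. A small sketch, assuming a find that supports -empty (GNU and BSD find do; it is not in POSIX) and a throwaway directory tree with made-up names:

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/full" "$tmp/hollow"     # hypothetical directories
echo data > "$tmp/full/notes.txt"      # non-empty file
: > "$tmp/full/blank.txt"              # empty file

empty_dirs=$(cd "$tmp" && find . -type d -empty)
empty_files=$(cd "$tmp" && find . -type f -empty)

printf 'empty directories: %s\n' "$empty_dirs"
printf 'empty files: %s\n' "$empty_files"
```

Here only ./hollow is reported as an empty directory and only ./full/blank.txt as an empty file.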
Related post:
- find file names with only digits, no text
- Use logical operator with linux find command
- exclude directory from find command
Labels:
Bash,
bash shell newbie,
find command,
Linux Commands
Friday, November 7, 2008
Print individual records using awk array - bash
Office lunch time. JSingh, Vis and KKR were playing a game and they wanted me to count their scores.
I made a rough text file which took this form after completion of 2 rounds.
$ cat officegame.txt
Name|Round1|Round2
JSingh|0|20
Vis|50|0
KKR|20|20
JSingh|10|40
Vis|50|20
KKR|40|10
JSingh|40|60
Vis|30|20
KKR|90|20
JSingh|0|60
Vis|20|20
KKR|50|50
After 2 rounds, they asked me for their individual total scores in each round. This is what I did:
$ awk -F"|" '
NR==1 {print}
NR!=1 {OFS="|";a[$1]+=$2;b[$1]+=$3}
END{for (i in a){print i,a[i],b[i]}}
' officegame.txt
Output:
Name|Round1|Round2
Vis|150|60
JSingh|50|180
KKR|200|100
I sent them the output. JSingh then asked me for a breakdown of each player's individual scores in each round.
I wrote this to meet his requirement:
$ awk -F "|" 'NR > 1 {
    if (n[$1] == $1) {
        r1[$1] = r1[$1] "+" $2
        r2[$1] = r2[$1] "+" $3
    } else {
        n[$1] = $1
        r1[$1] = $2
        r2[$1] = $3
    }
}
END {
    for (i in n) {
        printf "%s [Round1={%s}, Round2={%s}]\n", n[i], r1[i], r2[i]
    }
}' officegame.txt
Output:
Vis [Round1={50+50+30+20}, Round2={0+20+20+20}]
JSingh [Round1={0+10+40+0}, Round2={20+40+60+60}]
KKR [Round1={20+40+90+50}, Round2={20+10+20+50}]
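Both END blocks above loop with for (i in ...), and awk does not specify the order of that loop, so the names may come out in a different order on another awk. If the order matters, one option is to remember each name the first time it appears. A minimal sketch for the totals, where the order array and the counter n are my own additions:

```shell
tmp=$(mktemp -d)
cat > "$tmp/officegame.txt" <<'EOF'
Name|Round1|Round2
JSingh|0|20
Vis|50|0
KKR|20|20
JSingh|10|40
Vis|50|20
KKR|40|10
JSingh|40|60
Vis|30|20
KKR|90|20
JSingh|0|60
Vis|20|20
KKR|50|50
EOF

totals=$(awk -F'|' -v OFS='|' '
    NR == 1 { print; next }               # pass the header line through unchanged
    !($1 in a) { order[++n] = $1 }        # remember each name the first time it appears
    { a[$1] += $2; b[$1] += $3 }          # running totals for Round1 and Round2
    END {
        for (k = 1; k <= n; k++)          # walk the names in first-seen order
            print order[k], a[order[k]], b[order[k]]
    }
' "$tmp/officegame.txt")

printf '%s\n' "$totals"
```

The totals are the same as before; only the row order is now fixed to the order in which the players first appear in the file (JSingh, Vis, KKR).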
Related post:
- sum of and group by using awk
- group-by clause functionality in awk
- awk associative array examples
Labels:
Awk,
awk array,
awk newbie,
Bash,
bash scripts,
bash shell
Sunday, November 2, 2008
Reverse order of few lines awk - bash
Input file:
$ cat details.txt
line1
line2
line3
line4
line5
line6
line7
line8
line9
line10
Required: Reverse the order of the lines from line5 to line8. i.e. required output:
line1
line2
line3
line4
line8
line7
line6
line5
line9
line10
The awk solution:
$ awk -v from=5 -v to=8 'NR==from {
    s=$0
    for(i=from+1;i<to;i++){
        getline;s=$0"\n"s
    }
    getline;print;print s
    next
}1' details.txt
The trailing 1 in the awk one-liner above can be replaced with {print}.
To reverse the order of the lines of the whole file, there is the tac command, which prints files in reverse.
$ tac details.txt
The same can be achieved using sed and awk as mentioned below:
$ sed -n '1!G;h;$p' details.txt
$ awk '{a[i++]=$0} END {for (j=i-1; j>=0;) print a[j--] }' details.txt
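The range reversal can also be done without getline, by holding the lines of the range in an array and printing them backwards once the range is over. This is only a sketch of an alternative; the function name emit_reversed and the temporary file are assumptions of mine.

```shell
tmp=$(mktemp -d)
printf 'line%s\n' 1 2 3 4 5 6 7 8 9 10 > "$tmp/details.txt"

reversed=$(awk -v from=5 -v to=8 '
    function emit_reversed(   i) {        # i is a local variable
        for (i = last; i >= from; i--) print buf[i]
        last = 0                          # mark the buffer as already printed
    }
    NR >= from && NR <= to { buf[NR] = $0; last = NR; next }   # hold lines inside the range
    last { emit_reversed() }              # first line after the range: print the held lines backwards
    { print }
    END { if (last) emit_reversed() }     # the file ended inside the range
' "$tmp/details.txt")

printf '%s\n' "$reversed"
```

Because nothing is read with getline, the sketch also behaves sensibly when the file has fewer lines than the "to" value: whatever was held is printed in reverse at the end.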
Labels:
Awk,
awk newbie,
Bash Array,
bash scripts,
Linux Commands
© Jadu Saikia http://unstableme.blogspot.com
