Wednesday, August 6, 2014

Linux/UNIX Awk Command Tutorial with Examples

http://www.linuxtechi.com/awk-command-tutorial-with-examples

AWK  Stands for 'Aho, Weinberger, and Kernighan'
Awk is a scripting language which is used  for  processing or analyzing text files. Or we can say that awk is mainly used for grouping of data based on either a column or field , or on a set of columns. Mainly it's used for reporting data in a usefull manner. It also employs Begin and End Blocks to process the data.
Syntax of awk :
# awk 'pattern {action}' input-file > output-file
Lets take a input file with the following data
$ cat  awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000


Example:1 Print all the lines from a file.
By default, awk prints all lines of a file , so to print every line of above created file use below command :
linuxtechi@mail:~$ awk '{print;}' awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000


Example:2 Print only Specific field like 2nd & 3rd.

linuxtechi@mail:~$ awk -F "," '{print $2, $3;}' awk_file
Marks Max Marks
200 1000
500 1000
1000
800 1000
600 1000
400 1000

In the above command we have used the option  -F “,”  which specifies that comma (,) is the field separator in the file

Example:3 Print the lines which matches the pattern
I want to print the lines which contains the word “Hari & Ram”
linuxtechi@mail:~$ awk '/Hari|Ram/' awk_file
Ram,200,1000
Hari,600,1000
Ram,400,1000


Example:4 How do we find unique values in the first column of name
linuxtechi@mail:~$ awk -F, '{a[$1];}END{for (i in a)print i;}' awk_file
Abharam
Hari
Name
Ghyansham
Ram
Shyam

 
Example:5  How to find the sum of data entry in a particular column .
Synatx :  awk -F, '$1=="Item1"{x+=$2;}END{print x}' awk_file
linuxtechi@mail:~$ awk -F, '$1=="Ram"{x+=$2;}END{print x}' awk_file
600

Example:6  How to find the  total of all numbers in a column.
For eg we take the 2nd and the 3rd column.
linuxtechi@mail:~$ awk -F"," '{x+=$2}END{print x}' awk_file
3500
linuxtechi@mail:~$ awk -F"," '{x+=$3}END{print x}' awk_file
5000

Example:7  How to find the sum of individual group records.
Eg if we consider the first column than we can do the summation for the first column based on the items
linuxtechi@mail:~$ awk -F, '{a[$1]+=$2;}END{for(i in a)print i”, “a[i];}' awk_file
Abharam, 800
Hari, 600
Name, 0
Ghyansham, 1000
Ram, 600
Shyam, 500

Example:8 How to find the sum of all entries in second column  and append it to the end of the file.
linuxtechi@mail:~$ awk -F"," '{x+=$2;y+=$3;print}END{print "Total,"x,y}' awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
Total,3500 5000


Example:9 How to find the count of entries against every column based on the first column:
linuxtechi@mail:~$ awk -F, '{a[$1]++;}END{for (i in a)print i, a[i];}' awk_file
Abharam 1
Hari 1
Name 1
Ghyansham 1
Ram 2
Shyam 1


Example:10 How to print only the first record of every group:
linuxtechi@mail:~$ awk -F, '!a[$1]++' awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000


AWK Begin Block
Syntax for BEGIN block is
 # awk ‘BEGIN{awk initializing code}{actual AWK code}’ filename.txt
Let us create a datafile with below contents

 
Example:11  How to populate each column names along with their corresponding data.
linuxtechi@mail:~$ awk 'BEGIN{print "Names\ttotal\tPPT\tDoc\txls"}{printf "%-s\t%d\t%d\t%d\t%d\n", $1,$2,$3,$4,$5}' datafile
awk-begin

Example:12 How to change the Field Separator
As we can see space is the field separator in the datafile , in the below example we will change field separator  from space to "|"
linuxtechi@mail:~$ awk 'BEGIN{OFS="|"}{print $1,$2,$3,$4,$5}' datafile
awk-field-separator

 

No comments:

Post a Comment