the Techy Talks..: Linux: awk - Group By Count File Data

Tuesday, 25 November 2014

Linux: awk - Group By Count File Data

While its pretty easy to do 'Group By' at database level, 'awk' enables us to do same at file level.

Consider a file as below:

# cat test.csv
anita 111
rama 555
david 555
raj 111
shaik 222
naren 222

Here, I want to know how many names in the 1st column are having same values in the second column.

That's nothing but a simple Group By in MySQL or any other database.

So, load to a table and execute the query.

But, if the file having crores of data, MySQL query would be a costly operation.

In this time, 'awk' command will come for your help.

Below is the command:

# awk '{arr[$2]++;}END{for(i in arr)print i, arr[i] ;}' test.csv

555 2

111 2

222 2

Here, awk is fetching the second column into an array and counting the appearance.

If, the file is comma separated(CSV), use the below command:

# awk -F, '{arr[$2]++;}END{for(i in arr)print i, arr[i] ;}' test.csv

555 2

111 2

222 2

the Techy Talks..

Tuesday, 25 November 2014

Linux: awk - Group By Count File Data

No comments:

Post a Comment