Thursday, 13 February 2014

Linux: Sorting a File

Here's a simple 'sort' command that removes duplicate entries from a file and sorts it in ascending order:

Consider a file with a few numbers:

[root@myhost tmp]# cat testSort.txt
23
4
56
001
34
3
 


To sort it:

[root@myhost tmp]# sort -u testSort.txt > sortd_testSort.txt
 

wherein, -u --> unique (only one copy of each duplicate line is kept)


It's sorted:

[root@myhost tmp]# cat sortd_testSort.txt
001
23
3
34
4
56

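Oops! The command ran, but the order is wrong. By default 'sort' compares lines as text, character by character, so '23' comes before '3'. The file contains numeric data, so it needs a numeric sort.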
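To see the text comparison in isolation, here is a small sketch with just two values fed in through 'printf' (the variable name text_order is only for illustration):

```shell
# Text comparison goes character by character: '2' sorts before '3',
# so "23" lands ahead of "3" even though 3 is the smaller number.
text_order=$(printf '%s\n' 3 23 | sort)
echo "$text_order"
```

The output lists 23 first and 3 second, the same ordering problem seen in the file above.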

So use the following:

[root@myhost tmp]# sort -u -n testSort.txt > numeric_sortd_testSort.txt 
wherein, -n --> numeric (lines are compared by their numeric value)

 

And now it's correct:

[root@myhost tmp]# cat numeric_sortd_testSort.txt
001
3
4
23
34
56
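One thing to watch when combining -u with -n: lines that are numerically equal count as duplicates, even if they are written differently. A rough sketch with made-up sample values (001, 1 and 01 are not from the file above):

```shell
# Three spellings of the same number (hypothetical sample data).
# With -n they compare as equal, so -u keeps only one of them.
numeric_count=$(printf '%s\n' 001 1 01 | sort -u -n | wc -l | tr -d ' ')

# Without -n they are three different strings, so all three survive.
text_count=$(printf '%s\n' 001 1 01 | sort -u | wc -l | tr -d ' ')

echo "numeric: $numeric_count line(s), text: $text_count line(s)"
```

So if the differently written forms matter to you, check the result before overwriting the original file.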

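However, this can fail on very large files. When the input does not fit in memory, 'sort' writes temporary files, by default under '/tmp', so the sort fails if the '/tmp' partition runs out of space.

If you want to sort huge files, use the following: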

[root@myhost lib]# sort -u -n -T. testSort.txt > numeric_sortd_testSort_new.txt 
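wherein, -T. --> -T sets the directory used for temporary files. By default it's '/tmp' (or the directory named in $TMPDIR, if that is set).

'.' indicates the current directory, so make sure the partition holding it has enough free space.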
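The temporary directory does not have to be the current one. Here is a minimal sketch that points -T at a dedicated directory and checks its free space first. The paths are hypothetical scratch locations created with 'mktemp', so adjust them to a partition that really has room:

```shell
# Scratch area for the demo (hypothetical paths, removed at the end).
workdir=$(mktemp -d)
sort_tmp="$workdir/sort_tmp"
mkdir "$sort_tmp"

# Recreate the sample file from this post.
printf '%s\n' 23 4 56 001 34 3 > "$workdir/testSort.txt"

# Check how much space is free where the temporary files will go.
df -h "$sort_tmp"

# Same sort as before, but with temporary files written to $sort_tmp.
sort -u -n -T "$sort_tmp" "$workdir/testSort.txt" > "$workdir/numeric_sortd_testSort_new.txt"
big_sorted=$(cat "$workdir/numeric_sortd_testSort_new.txt")
echo "$big_sorted"

rm -rf "$workdir"
```

With a file this small 'sort' will probably never touch the temporary directory, so the sketch only shows the syntax; the option starts to matter once the input is larger than the available memory.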
