Q. How do I sort date type of data stored in a text file which is in following sample format:
Code Date TIME
HOS05 23/12/2008 10AM
HOS06 15/12/2008 2PM
HOS62 29/12/2008 10AM
HOS64 23/12/2008 2PM
HOS70 26/12/2008 10AM
ZFT01 06/12/2008 10AM
HOS73 11/12/2008 2PM
MHOS051 05/12/2008 10AM
MHOS041 20/12/2008 2PM
MHOS042 27/12/2008 2PM
MHOS053 11/12/2008 10AM
MHOS043 30/12/2008 2PM
How do I sort data date and time wise?
A. Use sort command to displays the lines of its input listed in sorted order. It has special option callede -M to sort months. There are total 3 fields. First one is some sort of code. Second is actual date in DD/MM/YYYY format and third one is time. You need to sort 2nd and 3rd column using the -k option of sort command as follows:
$ sort -k 2,2n -k 3 data.file.txt
Sample output:
MHOS051 05/12/2008 10AM ZFT01 06/12/2008 10AM MHOS053 11/12/2008 10AM HOS73 11/12/2008 2PM HOS06 15/12/2008 2PM MHOS041 20/12/2008 2PM HOS05 23/12/2008 10AM HOS64 23/12/2008 2PM HOS70 26/12/2008 10AM MHOS042 27/12/2008 2PM HOS62 29/12/2008 10AM MHOS043 30/12/2008 2PM
Where,
- -k 2,2n -k 3 : Sort data using the given column number. The option -k 2,2n -k 3 sorts each column. First, it will sort 2nd column (date dd field) and then 3rd column (time).
- data.file.txt : Input file
Multilevel sort
Now consider data as follows:
MHOS051 05/12/2008 10AM ZFT01 06/12/2008 10AM MHOS053 11/12/2008 10AM MHOS051 10/01/2009 10AM HOS73 11/12/2008 2PM HOS06 15/12/2008 2PM MHOS041 20/12/2008 2PM HOS05 23/12/2008 10AM HOS64 23/12/2008 2PM MHOS051 12/04/2009 10AM HOS70 26/12/2008 10AM MHOS042 27/12/2008 2PM HOS62 29/12/2008 10AM MHOS051 27/01/2009 10AM MHOS043 30/12/2008 2PM
Now, you need to first sort out year and then date using F[.C][OPTS] format, where F is the number of the field to use, and C is the number of the first character from the beginning of the field. You need to use sort command as follows:
$ sort -n -k 2.9 -k 2.5 -k 2 data.file.txt
Sample output:
MHOS051 05/12/2008 10AM ZFT01 06/12/2008 10AM HOS73 11/12/2008 2PM MHOS053 11/12/2008 10AM HOS06 15/12/2008 2PM MHOS041 20/12/2008 2PM HOS05 23/12/2008 10AM HOS64 23/12/2008 2PM HOS70 26/12/2008 10AM MHOS042 27/12/2008 2PM HOS62 29/12/2008 10AM MHOS043 30/12/2008 2PM MHOS051 10/01/2009 10AM MHOS051 27/01/2009 10AM MHOS051 12/04/2009 10AM
Where,
- -n : sort numeric data
- -k 2.9 : Select 2nd filed and 9th character for sorting (i.e sort on last digit of year)
- -k 2.5 : Select 2nd field and 5th character for sorting (i.e. sort on last digit of month)
- -k 2 : Select 2nd field and sort it out
- data.file.txt : Input file
Further readings:
- sort command man page
You should follow me on twitter here or grab rss feed to keep track of new changes.
Featured Articles:
- 30 Handy Bash Shell Aliases For Linux / Unix / Mac OS X
- Top 30 Nmap Command Examples For Sys/Network Admins
- 25 PHP Security Best Practices For Sys Admins
- 20 Linux System Monitoring Tools Every SysAdmin Should Know
- 20 Linux Server Hardening Security Tips
- Linux: 20 Iptables Examples For New SysAdmins
- Top 20 OpenSSH Server Best Security Practices
- Top 20 Nginx WebServer Best Security Practices
- 20 Examples: Make Sure Unix / Linux Configuration Files Are Free From Syntax Errors
- 15 Greatest Open Source Terminal Applications Of 2012

- My 10 UNIX Command Line Mistakes
- Top 10 Open Source Web-Based Project Management Software
- Top 5 Email Client For Linux, Mac OS X, and Windows Users
- The Novice Guide To Buying A Linux Laptop














{ 10 comments… read them below or add one }
Umm, you say it’s the -M option and then you don’t use -M as an option. That’s funny.
Dan,
-M will only compare month if they are in following format i.e. a month name abbreviation:
JAN
FEB
MAR
…
..
DEC
HTH
of course, if people started using ISO data format e.g. “20081225″ to store and export the date, it would simplify their sorting immensely. Well, for dates anyway ;’)
.h
@harry
Agree about that.
That is not the ISO format
2008-12-25
is the ISO date format, but that is still sortable.
Thanks for this usefull tip (especially the -kx.y notation)
I’m working through a similar sorting problem. Where I get stuck is getting the sorted output to sort on the am and pm data. looking at your output it looks like the same issue. ie: on the last table on 11/12/2008 2pm comes before 10am. i have tried sorting the am/pm part first but with no luck. any ideas?
This has been a great help, but for some reason when I go to sort it will only sort one part of the date , the DD, the MM or the YYYY but never more than one. The only difference in my code is that instead of a space delimiter I have a comma so I added the -t”,” part to the code, any ideas why mine isn’t working?
1 PM will get sorted before 12 PM . . . which is wrong
@Brad
1 PM will get sorted before 12 PM . . . which is wrong
And yes it is wrong. The solution to this is to do some extra text munging so that AM becomes larger that PM but any 12 PM time is converted to the numericical convention of 0 AM. This is why the convention of
YYYY-MM-DD_HH:MM:SS
is important. It remains highly readable but is sortable both numerically and alphabetically.
Anyone like to give the convertsion a go?