Linux / UNIX Shell: Sort Date

by on October 23, 2008 · 11 comments· LAST UPDATED October 23, 2008

in , ,

Q. How do I sort date type of data stored in a text file which is in following sample format:

Code Date TIME
HOS05 23/12/2008 10AM
HOS06 15/12/2008 2PM
HOS62 29/12/2008 10AM
HOS64 23/12/2008 2PM
HOS70 26/12/2008 10AM
ZFT01 06/12/2008 10AM
HOS73 11/12/2008 2PM
MHOS051 05/12/2008 10AM
MHOS041 20/12/2008 2PM
MHOS042 27/12/2008 2PM
MHOS053 11/12/2008 10AM
MHOS043 30/12/2008 2PM

How do I sort data date and time wise?

A. Use sort command to displays the lines of its input listed in sorted order. It has special option callede -M to sort months. There are total 3 fields. First one is some sort of code. Second is actual date in DD/MM/YYYY format and third one is time. You need to sort 2nd and 3rd column using the -k option of sort command as follows:
$ sort -k 2,2n -k 3 data.file.txt
Sample output:

MHOS051	05/12/2008	10AM
ZFT01	06/12/2008	10AM
MHOS053	11/12/2008	10AM
HOS73	11/12/2008	2PM
HOS06	15/12/2008	2PM
MHOS041	20/12/2008	2PM
HOS05	23/12/2008	10AM
HOS64	23/12/2008	2PM
HOS70	26/12/2008	10AM
MHOS042	27/12/2008	2PM
HOS62	29/12/2008	10AM
MHOS043	30/12/2008	2PM

Where,

  • -k 2,2n -k 3 : Sort data using the given column number. The option -k 2,2n -k 3 sorts each column. First, it will sort 2nd column (date dd field) and then 3rd column (time).
  • data.file.txt : Input file

Multilevel sort

Now consider data as follows:

MHOS051	05/12/2008	10AM
ZFT01	06/12/2008	10AM
MHOS053	11/12/2008	10AM
MHOS051	10/01/2009	10AM
HOS73	11/12/2008	2PM
HOS06	15/12/2008	2PM
MHOS041	20/12/2008	2PM
HOS05	23/12/2008	10AM
HOS64	23/12/2008	2PM
MHOS051	12/04/2009	10AM
HOS70	26/12/2008	10AM
MHOS042	27/12/2008	2PM
HOS62	29/12/2008	10AM
MHOS051	27/01/2009	10AM
MHOS043	30/12/2008	2PM

Now, you need to first sort out year and then date using F[.C][OPTS] format, where F is the number of the field to use, and C is the number of the first character from the beginning of the field. You need to use sort command as follows:
$ sort -n -k 2.9 -k 2.5 -k 2 data.file.txt
Sample output:

MHOS051	05/12/2008	10AM
ZFT01	06/12/2008	10AM
HOS73	11/12/2008	2PM
MHOS053	11/12/2008	10AM
HOS06	15/12/2008	2PM
MHOS041	20/12/2008	2PM
HOS05	23/12/2008	10AM
HOS64	23/12/2008	2PM
HOS70	26/12/2008	10AM
MHOS042	27/12/2008	2PM
HOS62	29/12/2008	10AM
MHOS043	30/12/2008	2PM
MHOS051	10/01/2009	10AM
MHOS051	27/01/2009	10AM
MHOS051	12/04/2009	10AM

Where,

  • -n : sort numeric data
  • -k 2.9 : Select 2nd filed and 9th character for sorting (i.e sort on last digit of year)
  • -k 2.5 : Select 2nd field and 5th character for sorting (i.e. sort on last digit of month)
  • -k 2 : Select 2nd field and sort it out
  • data.file.txt : Input file

Further readings:

  1. sort command man page
TwitterFacebookGoogle+PDF versionFound an error/typo on this page? Help us!

{ 11 comments… read them below or add one }

1 Dan Shumaker October 24, 2008 at 5:08 pm

Umm, you say it’s the -M option and then you don’t use -M as an option. That’s funny.

Reply

2 nixCraft October 24, 2008 at 6:32 pm

Dan,

-M will only compare month if they are in following format i.e. a month name abbreviation:
JAN
FEB
MAR

..
DEC

HTH

Reply

3 harry October 24, 2008 at 9:27 pm

of course, if people started using ISO data format e.g. “20081225” to store and export the date, it would simplify their sorting immensely. Well, for dates anyway ;’)

.h

Reply

4 Ace September 30, 2009 at 9:21 pm

@harry
Agree about that.

Reply

5 Anthony Thyssen December 17, 2010 at 3:32 am

That is not the ISO format
2008-12-25
is the ISO date format, but that is still sortable.

Reply

6 Richard April 23, 2011 at 5:57 pm

Thanks for this usefull tip (especially the -kx.y notation)

Reply

7 Kevin December 26, 2011 at 3:37 pm

I’m working through a similar sorting problem. Where I get stuck is getting the sorted output to sort on the am and pm data. looking at your output it looks like the same issue. ie: on the last table on 11/12/2008 2pm comes before 10am. i have tried sorting the am/pm part first but with no luck. any ideas?

Reply

8 Mike January 3, 2012 at 4:27 pm

This has been a great help, but for some reason when I go to sort it will only sort one part of the date , the DD, the MM or the YYYY but never more than one. The only difference in my code is that instead of a space delimiter I have a comma so I added the -t”,” part to the code, any ideas why mine isn’t working?

Reply

9 Brad February 24, 2013 at 9:08 pm

1 PM will get sorted before 12 PM . . . which is wrong

Reply

10 Anthony Thyssen February 25, 2013 at 11:47 pm

@Brad
1 PM will get sorted before 12 PM . . . which is wrong

And yes it is wrong. The solution to this is to do some extra text munging so that AM becomes larger that PM but any 12 PM time is converted to the numericical convention of 0 AM. This is why the convention of
YYYY-MM-DD_HH:MM:SS
is important. It remains highly readable but is sortable both numerically and alphabetically.

Anyone like to give the convertsion a go?

Reply

11 Anthony Thyssen June 3, 2013 at 3:58 am

I had to actually sort a file without using ISO formal…

Example input…
[03/07/2013 08:18:45 AM] PikaPika01 logged in.
[03/07/2013 08:19:55 AM (world: 975,86,2494)] PikaPika01: hey

To sort I needed to handle..
1/ The date order is american MM/DD/YYYY format
2/ AM and PM must be sorted before the actual time field
3/ “AM]” and “AM (..” words sort differently (ignore trailing chars)
4/ “12” is before “01” that is 12 AM is before 01 AM

This was my resulting sorting script…
cat log |
sed ‘s/ 12:/ 00:/’ |
sort -t\ -k1.8,1.11 -k1.2,1.3 -k1.4,1.5 -k3,3.2 -k2,2 -k4 |
sed ‘s/ 00:/ 12:/’ > sorted_log

Reply

Leave a Comment

Tagged as: , , , , , , , , , , ,

Previous Faq:

Next Faq: