≡ Menu

Linux / UNIX Shell: Sort Date

Q. How do I sort date type of data stored in a text file which is in following sample format:

Code Date TIME
HOS05 23/12/2008 10AM
HOS06 15/12/2008 2PM
HOS62 29/12/2008 10AM
HOS64 23/12/2008 2PM
HOS70 26/12/2008 10AM
ZFT01 06/12/2008 10AM
HOS73 11/12/2008 2PM
MHOS051 05/12/2008 10AM
MHOS041 20/12/2008 2PM
MHOS042 27/12/2008 2PM
MHOS053 11/12/2008 10AM
MHOS043 30/12/2008 2PM

How do I sort data date and time wise?

A. Use sort command to displays the lines of its input listed in sorted order. It has special option callede -M to sort months. There are total 3 fields. First one is some sort of code. Second is actual date in DD/MM/YYYY format and third one is time. You need to sort 2nd and 3rd column using the -k option of sort command as follows:
$ sort -k 2,2n -k 3 data.file.txt
Sample output:

MHOS051	05/12/2008	10AM
ZFT01	06/12/2008	10AM
MHOS053	11/12/2008	10AM
HOS73	11/12/2008	2PM
HOS06	15/12/2008	2PM
MHOS041	20/12/2008	2PM
HOS05	23/12/2008	10AM
HOS64	23/12/2008	2PM
HOS70	26/12/2008	10AM
MHOS042	27/12/2008	2PM
HOS62	29/12/2008	10AM
MHOS043	30/12/2008	2PM

Where,

  • -k 2,2n -k 3 : Sort data using the given column number. The option -k 2,2n -k 3 sorts each column. First, it will sort 2nd column (date dd field) and then 3rd column (time).
  • data.file.txt : Input file

Multilevel sort

Now consider data as follows:

MHOS051	05/12/2008	10AM
ZFT01	06/12/2008	10AM
MHOS053	11/12/2008	10AM
MHOS051	10/01/2009	10AM
HOS73	11/12/2008	2PM
HOS06	15/12/2008	2PM
MHOS041	20/12/2008	2PM
HOS05	23/12/2008	10AM
HOS64	23/12/2008	2PM
MHOS051	12/04/2009	10AM
HOS70	26/12/2008	10AM
MHOS042	27/12/2008	2PM
HOS62	29/12/2008	10AM
MHOS051	27/01/2009	10AM
MHOS043	30/12/2008	2PM

Now, you need to first sort out year and then date using F[.C][OPTS] format, where F is the number of the field to use, and C is the number of the first character from the beginning of the field. You need to use sort command as follows:
$ sort -n -k 2.9 -k 2.5 -k 2 data.file.txt
Sample output:

MHOS051	05/12/2008	10AM
ZFT01	06/12/2008	10AM
HOS73	11/12/2008	2PM
MHOS053	11/12/2008	10AM
HOS06	15/12/2008	2PM
MHOS041	20/12/2008	2PM
HOS05	23/12/2008	10AM
HOS64	23/12/2008	2PM
HOS70	26/12/2008	10AM
MHOS042	27/12/2008	2PM
HOS62	29/12/2008	10AM
MHOS043	30/12/2008	2PM
MHOS051	10/01/2009	10AM
MHOS051	27/01/2009	10AM
MHOS051	12/04/2009	10AM

Where,

  • -n : sort numeric data
  • -k 2.9 : Select 2nd filed and 9th character for sorting (i.e sort on last digit of year)
  • -k 2.5 : Select 2nd field and 5th character for sorting (i.e. sort on last digit of month)
  • -k 2 : Select 2nd field and sort it out
  • data.file.txt : Input file

Further readings:

  1. sort command man page
Tweet itFacebook itGoogle+ itPDF itFound an error/typo on this page?

{ 11 comments… add one }

  • Dan Shumaker October 24, 2008, 5:08 pm

    Umm, you say it’s the -M option and then you don’t use -M as an option. That’s funny.

  • nixCraft October 24, 2008, 6:32 pm

    Dan,

    -M will only compare month if they are in following format i.e. a month name abbreviation:
    JAN
    FEB
    MAR

    ..
    DEC

    HTH

  • harry October 24, 2008, 9:27 pm

    of course, if people started using ISO data format e.g. “20081225” to store and export the date, it would simplify their sorting immensely. Well, for dates anyway ;’)

    .h

  • Ace September 30, 2009, 9:21 pm

    @harry
    Agree about that.

  • Anthony Thyssen December 17, 2010, 3:32 am

    That is not the ISO format
    2008-12-25
    is the ISO date format, but that is still sortable.

  • Richard April 23, 2011, 5:57 pm

    Thanks for this usefull tip (especially the -kx.y notation)

  • Kevin December 26, 2011, 3:37 pm

    I’m working through a similar sorting problem. Where I get stuck is getting the sorted output to sort on the am and pm data. looking at your output it looks like the same issue. ie: on the last table on 11/12/2008 2pm comes before 10am. i have tried sorting the am/pm part first but with no luck. any ideas?

  • Mike January 3, 2012, 4:27 pm

    This has been a great help, but for some reason when I go to sort it will only sort one part of the date , the DD, the MM or the YYYY but never more than one. The only difference in my code is that instead of a space delimiter I have a comma so I added the -t”,” part to the code, any ideas why mine isn’t working?

  • Brad February 24, 2013, 9:08 pm

    1 PM will get sorted before 12 PM . . . which is wrong

    • Anthony Thyssen February 25, 2013, 11:47 pm

      @Brad
      1 PM will get sorted before 12 PM . . . which is wrong

      And yes it is wrong. The solution to this is to do some extra text munging so that AM becomes larger that PM but any 12 PM time is converted to the numericical convention of 0 AM. This is why the convention of
      YYYY-MM-DD_HH:MM:SS
      is important. It remains highly readable but is sortable both numerically and alphabetically.

      Anyone like to give the convertsion a go?

  • Anthony Thyssen June 3, 2013, 3:58 am

    I had to actually sort a file without using ISO formal…

    Example input…
    [03/07/2013 08:18:45 AM] PikaPika01 logged in.
    [03/07/2013 08:19:55 AM (world: 975,86,2494)] PikaPika01: hey

    To sort I needed to handle..
    1/ The date order is american MM/DD/YYYY format
    2/ AM and PM must be sorted before the actual time field
    3/ “AM]” and “AM (..” words sort differently (ignore trailing chars)
    4/ “12” is before “01” that is 12 AM is before 01 AM

    This was my resulting sorting script…
    cat log |
    sed ‘s/ 12:/ 00:/’ |
    sort -t\ -k1.8,1.11 -k1.2,1.3 -k1.4,1.5 -k3,3.2 -k2,2 -k4 |
    sed ‘s/ 00:/ 12:/’ > sorted_log

Leave a Comment