Q. How do I sort date type of data stored in a text file which is in following sample format:
Code Date TIME
HOS05 23/12/2008 10AM
HOS06 15/12/2008 2PM
HOS62 29/12/2008 10AM
HOS64 23/12/2008 2PM
HOS70 26/12/2008 10AM
ZFT01 06/12/2008 10AM
HOS73 11/12/2008 2PM
MHOS051 05/12/2008 10AM
MHOS041 20/12/2008 2PM
MHOS042 27/12/2008 2PM
MHOS053 11/12/2008 10AM
MHOS043 30/12/2008 2PM
How do I sort data date and time wise?
A. Use sort command to displays the lines of its input listed in sorted order. It has special option callede -M to sort months. There are total 3 fields. First one is some sort of code. Second is actual date in DD/MM/YYYY format and third one is time. You need to sort 2nd and 3rd column using the -k option of sort command as follows:
$ sort -k 2,2n -k 3 data.file.txt
Sample output:
MHOS051 05/12/2008 10AM ZFT01 06/12/2008 10AM MHOS053 11/12/2008 10AM HOS73 11/12/2008 2PM HOS06 15/12/2008 2PM MHOS041 20/12/2008 2PM HOS05 23/12/2008 10AM HOS64 23/12/2008 2PM HOS70 26/12/2008 10AM MHOS042 27/12/2008 2PM HOS62 29/12/2008 10AM MHOS043 30/12/2008 2PM
Where,
- -k 2,2n -k 3 : Sort data using the given column number. The option -k 2,2n -k 3 sorts each column. First, it will sort 2nd column (date dd field) and then 3rd column (time).
- data.file.txt : Input file
Multilevel sort
Now consider data as follows:
MHOS051 05/12/2008 10AM ZFT01 06/12/2008 10AM MHOS053 11/12/2008 10AM MHOS051 10/01/2009 10AM HOS73 11/12/2008 2PM HOS06 15/12/2008 2PM MHOS041 20/12/2008 2PM HOS05 23/12/2008 10AM HOS64 23/12/2008 2PM MHOS051 12/04/2009 10AM HOS70 26/12/2008 10AM MHOS042 27/12/2008 2PM HOS62 29/12/2008 10AM MHOS051 27/01/2009 10AM MHOS043 30/12/2008 2PM
Now, you need to first sort out year and then date using F[.C][OPTS] format, where F is the number of the field to use, and C is the number of the first character from the beginning of the field. You need to use sort command as follows:
$ sort -n -k 2.9 -k 2.5 -k 2 data.file.txt
Sample output:
MHOS051 05/12/2008 10AM ZFT01 06/12/2008 10AM HOS73 11/12/2008 2PM MHOS053 11/12/2008 10AM HOS06 15/12/2008 2PM MHOS041 20/12/2008 2PM HOS05 23/12/2008 10AM HOS64 23/12/2008 2PM HOS70 26/12/2008 10AM MHOS042 27/12/2008 2PM HOS62 29/12/2008 10AM MHOS043 30/12/2008 2PM MHOS051 10/01/2009 10AM MHOS051 27/01/2009 10AM MHOS051 12/04/2009 10AM
Where,
- -n : sort numeric data
- -k 2.9 : Select 2nd filed and 9th character for sorting (i.e sort on last digit of year)
- -k 2.5 : Select 2nd field and 5th character for sorting (i.e. sort on last digit of month)
- -k 2 : Select 2nd field and sort it out
- data.file.txt : Input file
Further readings:
- sort command man page
🐧 Get the latest tutorials on Linux, Open Source & DevOps via:
- RSS feed or Weekly email newsletter
- Share on Twitter • Facebook • 13 comments... add one ↓
Category | List of Unix and Linux commands |
---|---|
File Management | cat |
Firewall | Alpine Awall • CentOS 8 • OpenSUSE • RHEL 8 • Ubuntu 16.04 • Ubuntu 18.04 • Ubuntu 20.04 |
Network Utilities | dig • host • ip • nmap |
OpenVPN | CentOS 7 • CentOS 8 • Debian 10 • Debian 8/9 • Ubuntu 18.04 • Ubuntu 20.04 |
Package Manager | apk • apt |
Processes Management | bg • chroot • cron • disown • fg • jobs • killall • kill • pidof • pstree • pwdx • time |
Searching | grep • whereis • which |
User Information | groups • id • lastcomm • last • lid/libuser-lid • logname • members • users • whoami • who • w |
WireGuard VPN | Alpine • CentOS 8 • Debian 10 • Firewall • Ubuntu 20.04 |
Umm, you say it’s the -M option and then you don’t use -M as an option. That’s funny.
Dan,
-M will only compare month if they are in following format i.e. a month name abbreviation:
JAN
FEB
MAR
…
..
DEC
HTH
of course, if people started using ISO data format e.g. “20081225” to store and export the date, it would simplify their sorting immensely. Well, for dates anyway ;’)
.h
@harry
Agree about that.
That is not the ISO format
2008-12-25
is the ISO date format, but that is still sortable.
Thanks for this usefull tip (especially the -kx.y notation)
I’m working through a similar sorting problem. Where I get stuck is getting the sorted output to sort on the am and pm data. looking at your output it looks like the same issue. ie: on the last table on 11/12/2008 2pm comes before 10am. i have tried sorting the am/pm part first but with no luck. any ideas?
This has been a great help, but for some reason when I go to sort it will only sort one part of the date , the DD, the MM or the YYYY but never more than one. The only difference in my code is that instead of a space delimiter I have a comma so I added the -t”,” part to the code, any ideas why mine isn’t working?
1 PM will get sorted before 12 PM . . . which is wrong
@Brad
1 PM will get sorted before 12 PM . . . which is wrong
And yes it is wrong. The solution to this is to do some extra text munging so that AM becomes larger that PM but any 12 PM time is converted to the numericical convention of 0 AM. This is why the convention of
YYYY-MM-DD_HH:MM:SS
is important. It remains highly readable but is sortable both numerically and alphabetically.
Anyone like to give the convertsion a go?
I had to actually sort a file without using ISO formal…
Example input…
[03/07/2013 08:18:45 AM] PikaPika01 logged in.
[03/07/2013 08:19:55 AM (world: 975,86,2494)] PikaPika01: hey
To sort I needed to handle..
1/ The date order is american MM/DD/YYYY format
2/ AM and PM must be sorted before the actual time field
3/ “AM]” and “AM (..” words sort differently (ignore trailing chars)
4/ “12” is before “01” that is 12 AM is before 01 AM
This was my resulting sorting script…
cat log |
sed ‘s/ 12:/ 00:/’ |
sort -t\ -k1.8,1.11 -k1.2,1.3 -k1.4,1.5 -k3,3.2 -k2,2 -k4 |
sed ‘s/ 00:/ 12:/’ > sorted_log
I always have trouble with this, this article was very useful but I still can’t sort by MMM
For example:
asd [14/Feb/2010:10:52:30 -0500] num 1
asd [08/Feb/2018:09:52:30 -0500] num 6
ads [14/Mar/2017:09:52:30 -0500] num 4
a [10/Feb/2017:09:52:20 -0500] num 3
ds [14/Mar/2017:10:00:00 -0500] num 5
das [14/Feb/2017:09:52:10 -0500] num 2
I tried to sort by year, then month , then day and even that doesn’t really come out right … sigh .. .
closest I got was this:
sort -k2.11n -M -k2.3 -k2.2 -k2.15n test