Shell: How To Remove Duplicate Text Lines
Q. I need to sort data from a log file but there are too many duplicate lines. How do I remove all duplicate lines from a text file under GNU/Linux?
A. You need to use shell pipes along with the following two utilities:
a] sort command - sort lines of text files
b] uniq command - report or omit repeated lines
Removing Duplicate Lines With Sort, Uniq and Shell Pipes
Use the following syntax:
sort {file-name} | uniq -u
sort file.log | uniq -u
Here is a sample test file called garbage.txt:
this is a test
food that are killing you
wings of fire
we hope that the labor spent in creating this software
this is a test
unix ips as well as enjoy our blog
Type the following command to get rid of every line that occurs more than once (no copy of a duplicated line is kept):
$ sort garbage.txt | uniq -u
Sample output:
food that are killing you
unix ips as well as enjoy our blog
we hope that the labor spent in creating this software
wings of fire
Where,
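- -u : only print lines that are not repeated in the input; every copy of a duplicated line is dropped. To keep one copy of each line instead, omit -u (sort garbage.txt | uniq).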
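As a rough illustration of how the uniq options differ, the sketch below recreates the sample file in a temporary directory and runs uniq three ways on the sorted input. The variable names (tmpdir, only_unique, one_of_each, only_repeated) are made up for this example.

```shell
# Recreate the sample file in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' \
  'this is a test' \
  'food that are killing you' \
  'wings of fire' \
  'we hope that the labor spent in creating this software' \
  'this is a test' \
  'unix ips as well as enjoy our blog' > "$tmpdir/garbage.txt"

# uniq -u: print only lines that occur exactly once;
# all copies of a repeated line are dropped.
only_unique=$(sort "$tmpdir/garbage.txt" | uniq -u)

# uniq with no option: collapse each run of identical lines to one copy.
one_of_each=$(sort "$tmpdir/garbage.txt" | uniq)

# uniq -d: print one copy of each line that is repeated.
only_repeated=$(sort "$tmpdir/garbage.txt" | uniq -d)

printf '%s\n' '-- uniq -u --' "$only_unique" \
              '-- uniq --'    "$one_of_each" \
              '-- uniq -d --' "$only_repeated"

rm -r "$tmpdir"
```

uniq only compares adjacent lines, which is why the input is sorted first in every case.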
Discussion on This FAQ
September 20th, 2008 at 7:43 am
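You can use:
sort -u filename
It gives the same result as sort filename | uniq, that is, one copy of each line is kept. This differs from uniq -u, which drops repeated lines entirely.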
September 20th, 2008 at 8:41 am
How can I change your example so that the output has no duplicate lines, but still keeps one copy of each duplicated line?
this is a test
food that are killing you
wings of fire
we hope that the labor spent in creating this software
unix ips as well as enjoy our blog
Thank you.
September 20th, 2008 at 10:12 am
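Plain uniq (without -u) will do it, Martin: sort garbage.txt | uniq. uniq -c also keeps one copy of each line, but prefixes each with the number of times it occurred. Either way the output is in sorted order, not the original order.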
September 21st, 2008 at 3:34 am
One more approach, which keeps the lines in the same order as the input. The good thing about it is that it can also be applied when duplicates need to be removed based on a field or fields.
$ awk '!x[$0]++' garbage.txt
Output:
this is a test
food that are killing you
wings of fire
we hope that the labor spent in creating this software
unix ips as well as enjoy our blog
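The field-based variant mentioned in this comment could look something like the sketch below. The comma-separated sample records and the names records, seen and first_per_key are invented for illustration; the first field is assumed to be the key.

```shell
# Hypothetical comma-separated records; field 1 is treated as the key.
records='alice,login
bob,login
alice,logout
carol,login
bob,logout'

# Keep only the first line seen for each value of field 1, preserving input order.
# seen[$1]++ evaluates to 0 the first time a key appears, so !seen[$1]++ is
# true only for that first occurrence, and awk prints the line.
first_per_key=$(printf '%s\n' "$records" | awk -F, '!seen[$1]++')

printf '%s\n' "$first_per_key"
```

Changing $1 to another field, or to a combination such as $1 FS $2, would change what counts as a duplicate.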
Today at 5:51 am (15 hours ago)
What will be the command to save the output of this command to the same file? Or maybe a script?
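One possible answer, as a sketch rather than the only way: redirecting straight back to the input (sort file > file) does not work, because the shell truncates the file before sort reads it. sort has an -o option that is safe to point at its own input, and for tools without such an option a temporary file followed by mv is a common pattern. The file contents and variable names below are made up for the example.

```shell
# Work on a throwaway file so the sketch is safe to run as-is.
file=$(mktemp)
printf '%s\n' 'b' 'a' 'b' 'c' 'a' > "$file"

# Option 1: sort -o writes the result to the named file, and it is safe
# to name the input file itself as the output.
sort -u -o "$file" "$file"
sorted_result=$(cat "$file")

# Option 2: for a command with no in-place option (awk here), write to a
# temporary file first, then replace the original only if the command succeeded.
printf '%s\n' 'b' 'a' 'b' 'c' 'a' > "$file"
tmp=$(mktemp)
awk '!x[$0]++' "$file" > "$tmp" && mv "$tmp" "$file"
ordered_result=$(cat "$file")

printf '%s\n' 'sort -u -o:' "$sorted_result" 'awk + mv:' "$ordered_result"

rm -f "$file"
```

Note that with the mv approach the file takes on the ownership and permissions of the temporary file, so those may need to be restored afterwards.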