How do I find duplicate files in a given set of directories and delete them using a shell script or command-line options? How do I get rid of duplicate files stored in the ~/foo and /u2/foo directories?
You need to use a tool called fdupes. It searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison. fdupes is a handy tool for getting rid of duplicate files.
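To make the three comparison stages concrete, here is a minimal sketch of the same idea built from standard tools. It is not how fdupes is implemented; find_dupes_sketch is a hypothetical helper name, and the sketch assumes md5sum (GNU coreutils) is installed, looks only at the top level of one directory, skips hidden files, and expects file names without newlines.

```shell
# Hypothetical helper (not part of fdupes): report duplicate regular files
# directly inside one directory. Assumes md5sum is installed and that file
# names contain no newlines.
find_dupes_sketch() (
    dir=$1
    prev_key=""
    prev_file=""
    for file in "$dir"/*; do
        # Skip anything that is not a plain file, and skip symlinks.
        [ -f "$file" ] && [ ! -L "$file" ] || continue
        size=$(( $(wc -c < "$file") ))              # stage 1: size in bytes
        sum=$(md5sum < "$file" | cut -d' ' -f1)     # stage 2: MD5 signature
        printf '%s:%s %s\n' "$size" "$sum" "$file"
    done |
    LC_ALL=C sort |                                 # equal keys become adjacent
    while read -r key file; do
        # Stage 3: same size and MD5, so confirm byte by byte with cmp.
        if [ "$key" = "$prev_key" ] && cmp -s "$prev_file" "$file"; then
            printf '%s duplicates %s\n' "$file" "$prev_file"
        else
            prev_key=$key
            prev_file=$file
        fi
    done
)
```

Running find_dupes_sketch /some/dir prints one line per extra copy and nothing when every file is unique. For real work, fdupes itself remains the better choice.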
Install fdupes
Type the following command under Debian / Ubuntu Linux:
# apt-get install fdupes
Under Red Hat / RHEL / Fedora / CentOS Linux, turn on the rpmforge repo and then type the following yum command:
# yum install fdupes
How Do I Use fdupes?
To find duplicate files in the /etc/ directory, enter:
# fdupes /etc
Sample outputs:
/etc/vimrc
/etc/virc
How Do I Delete Unwanted Files?
You can force fdupes to prompt you for the files to preserve, deleting all others (use this with care, otherwise you may lose data):
# fdupes -d /etc
Sample outputs:
[1] /etc/vimrc
[2] /etc/virc

Set 1 of 1, preserve files [1 - 2, all]: 1

   [+] /etc/vimrc
   [-] /etc/virc
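Before deleting anything, it can help to review which copies would go. The sketch below reads plain fdupes output (one path per line, sets separated by a blank line) on standard input and prints every path except the first one in each set. list_extra_copies is a hypothetical helper name, and keeping the first file of each set is an assumption made for illustration, not an fdupes rule.

```shell
# Hypothetical helper: print every path except the first of each duplicate
# set. Expects plain fdupes output on stdin (do not combine with -S).
list_extra_copies() {
    awk '
        BEGIN { first = 1 }
        /^$/  { first = 1; next }   # a blank line ends the current set
        first { first = 0; next }   # first path of a set: keep it, print nothing
              { print }             # any other path is an extra copy
    '
}
```

A typical use would be fdupes -r /dir1 | list_extra_copies > candidates.txt, followed by reading candidates.txt carefully before removing anything.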
How Do I Recursively Search A Directory?
To search recursively, use the -r option: for every directory given, fdupes follows the subdirectories encountered within it. Enter:
# fdupes -r /dir1
How Do I Find Dupes In Two Directories?
Type the command as follows:
# fdupes /dir1 /dir2
OR
# fdupes -r /etc /data/etc /nas95/etc
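For the two directories from the question (~/foo and /u2/foo), the same recursive form applies. The wrapper below is only a convenience sketch: find_dupes_in is a hypothetical name, and the directory check is an added guard against mistyped paths, not something fdupes requires.

```shell
# Hypothetical wrapper: check that every argument is a directory, then
# search all of them recursively with fdupes.
find_dupes_in() {
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "not a directory: $dir" >&2
            return 1
        fi
    done
    fdupes -r "$@"
}
```

With this in place, find_dupes_in ~/foo /u2/foo lists duplicates found anywhere under either directory.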
How Do I See Size Of Duplicate Files?
Type the following command with the -S option:
# fdupes -S /etc
Sample outputs:
1533 bytes each:
/etc/vimrc
/etc/virc
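The -S output can also be totalled to estimate how much space the extra copies occupy. The sketch below assumes each set starts with a header line of the form "N bytes each:" (or "1 byte each:") followed by one path per line; reclaimable_bytes is a hypothetical helper name, and it counts every file after the first in each set as reclaimable.

```shell
# Hypothetical helper: sum the size of every extra copy reported by
# "fdupes -S". Reads the fdupes output on stdin and prints a byte count.
reclaimable_bytes() {
    awk '
        /^[0-9]+ bytes? each:$/ { size = $1; n = 0; next }  # header starts a set
        /^$/                    { next }                    # separator line
                                { if (n > 0) total += size; n++ }
        END                     { printf "%d\n", total }
    '
}
```

For example, fdupes -r -S /etc | reclaimable_bytes prints a single number of bytes. Hard-linked files may make the real saving smaller than this estimate.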
Further readings:
- man page fdupes
13 comments:
cool. thx
Thanks forgot that one, there’s always a guarantee that Debian will make it easier.
Do you know an equivalent with a GUI front end on Ubuntu?
Tech Observer
I don’t think so,
apt-cache search dupes
fdupes - identifies duplicate files within given directories
findimagedupes - Finds visually similar or duplicate images
Hi Vivek
Nice topic,
Best regards,
–Philippe
Cool tool, but does it take hard links into consideration? They are the same file, not a dup, just with a different name.
I use fdupes with sed to output a shell file that will delete unwanted duplicates:
fdupes -r -n -S /directory | sed -r "s/^/#rm \"/" | sed -r "s/$/\"/" >duplicate-files.sh
The shell file created by this has each line commented out. It’s just a matter of uncommenting the files you want to delete. I don’t recall where I first saw this idea – credit to original poster though.
c.
I think the “yum install fdupes” is missing in the post
Great post, as usual.
Regards,
Rajagopal
@Rajagopal,
Thanks for the heads up!
@Wido,
Yes, take a look at the man page; it describes the options that control how soft and hard links are handled.
I’ve read about a similar tool, included in Ubuntu distro, with a GUI front end whose name is
fslint
Hello, happy to see your article.
I am interested in what tools you are using to create the PDF versions of your posts.
Could you tell me?
“Download PDF version”
Try FSlint maybe. It has a GUI if you insist.
Hope it helps
I’m running CentOS 5 with fdupes v1.4. fdupes runs fine if I’m checking for duplicates on the Linux box itself. But I need to check for duplicates on an NFS mounted drive. When I run the command, nothing seems to happen other than I get the command line again. I’m typing fdupes -r /dir1 /dir2
My NFS mount was mounted with the following command: mount -t nfs -o tcp ipaddress:/ifs /dupeChk
Thanks in advance for any help.