Use a tool called fdupes. It searches the given path for duplicate files. Duplicates are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison. fdupes is a handy tool for getting rid of duplicate files.
Another option is to use a tool called fslint to find and fix common problems in file storage, such as duplicate files.
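To make the size, MD5, and byte-by-byte sequence concrete, here is a minimal sketch of the same idea built from standard tools. It is not how fdupes is implemented. find_dupes is a hypothetical helper name, and the sketch assumes GNU find (for -printf), md5sum, and file names without tabs or newlines:

```shell
# find_dupes DIR: print tab-separated pairs of identical files under DIR.
# Hypothetical helper for illustration only; not part of fdupes.
find_dupes() {
    tab=$(printf '\t')
    # Pass 1: list "size<TAB>path" and keep only files whose size is shared.
    find "$1" -type f -printf '%s\t%p\n' |
    awk -F'\t' '{ count[$1]++; line[NR] = $0; size[NR] = $1 }
        END { for (i = 1; i <= NR; i++) if (count[size[i]] > 1) print line[i] }' |
    # Pass 2: hash the remaining candidates; sort so equal keys are adjacent.
    while IFS=$tab read -r size path; do
        sum=$(md5sum < "$path" | cut -d' ' -f1)
        printf '%s:%s\t%s\n' "$size" "$sum" "$path"
    done | LC_ALL=C sort |
    # Pass 3: confirm neighbouring candidates byte by byte with cmp.
    {
        prev_key=''
        prev_path=''
        while IFS=$tab read -r key path; do
            if [ "$key" = "$prev_key" ] && cmp -s "$prev_path" "$path"; then
                printf '%s\t%s\n' "$prev_path" "$path"
            fi
            prev_key=$key
            prev_path=$path
        done
    }
}
```

Running find_dupes /etc would print one line per pair of identical files. The cheap size check comes first so that only files that could possibly match are hashed, and cmp has the final say in case two different files share a hash.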
Install fdupes on Linux
Type the following apt-get command under Debian / Ubuntu Linux:
# apt-get install fdupes
Under Red Hat / RHEL / Fedora / CentOS Linux, turn on the rpmforge repo first, then type the following yum command:
# yum install fdupes
How Do I Use The fdupes Command?
To find duplicate files in the /etc/ directory, enter:
# fdupes /etc
Sample outputs:
/etc/vimrc
/etc/virc
How Do I Delete Unwanted Files?
You can force fdupes to prompt you for the files to preserve, deleting all others (use this with care, otherwise you may lose data):
# fdupes -d /etc
Sample outputs:
[1] /etc/vimrc
[2] /etc/virc

Set 1 of 1, preserve files [1 - 2, all]: 1

   [+] /etc/vimrc
   [-] /etc/virc
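Before deleting anything, it can help to preview which copies are the "extra" ones. By default fdupes prints one path per line with a blank line between sets, so a small filter can list every path except the first of each set. The sketch below only reads text on stdin and never deletes anything. list_extra_copies is a hypothetical helper name, and it assumes file names without embedded newlines:

```shell
# list_extra_copies: read fdupes-style output on stdin (one path per line,
# sets separated by blank lines) and print every path except the first of
# each set. Hypothetical helper; it does not delete anything.
list_extra_copies() {
    awk 'BEGIN { first = 1 }          # the very first path starts a set
         /^$/  { first = 1; next }    # blank line: the next path starts a new set
         first { first = 0; next }    # skip the copy that would be kept
               { print }              # everything else is an extra copy'
}
```

For example, fdupes /etc | list_extra_copies would list /etc/virc for the set shown above. fdupes also documents an -f (--omitfirst) option with a similar effect; check fdupes(1) on your system before relying on it.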
How Do I Recursively Search A Directory?
To search recursively, pass the -r option. For every directory given, fdupes then follows the subdirectories encountered within it:
# fdupes -r /dir1
How Do I Find Dupes In Two Directories?
Type the command as follows:
# fdupes /dir1 /dir2
OR
# fdupes -r /etc /data/etc /nas95/etc
How Do I See Size Of Duplicate Files?
Type the following command with the -S option:
# fdupes -S /etc
Sample outputs:
1533 bytes each:
/etc/vimrc
/etc/virc
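The per-set sizes can be added up to estimate how much space the duplicates waste. The following sketch assumes the -S output has a "N bytes each:" header line followed by one path per line and a blank line between sets. wasted_bytes is a hypothetical helper name:

```shell
# wasted_bytes: read `fdupes -S`-style output on stdin and print how many
# bytes would be freed if only one copy per set were kept.
# Hypothetical helper; assumes "N bytes each:" headers and one path per line.
wasted_bytes() {
    awk '/^[0-9]+ bytes? each:$/ { size = $1; copies = 0; next }  # set header
         /^$/                    { next }                          # set separator
         { if (++copies > 1) total += size }                       # extra copies only
         END { printf "%.0f\n", total }'
}
```

For the sample above, fdupes -S /etc | wasted_bytes would print 1533, because one of the two 1533-byte copies is redundant.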
Remove duplicate files with fslint
fslint is a tool to find various problems with filesystems, including duplicate files and problematic filenames. It is a recommended tool for desktop users. To install it, type the following on Debian/Ubuntu Linux:
$ sudo apt-get install fslint
Sample outputs:
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  python-glade2
Suggested packages:
  python-gtk2-doc
The following NEW packages will be installed:
  fslint python-glade2
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 149 kB of archives.
After this operation, 849 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://ftp.us.debian.org/debian/ stable/main python-glade2 amd64 2.24.0-4 [43.3 kB]
Get:2 http://ftp.us.debian.org/debian/ stable/main fslint all 2.44-2 [106 kB]
Fetched 149 kB in 3s (49.4 kB/s)
Selecting previously unselected package python-glade2.
(Reading database ... 63146 files and directories currently installed.)
Preparing to unpack .../python-glade2_2.24.0-4_amd64.deb ...
Unpacking python-glade2 (2.24.0-4) ...
Selecting previously unselected package fslint.
Preparing to unpack .../archives/fslint_2.44-2_all.deb ...
Unpacking fslint (2.44-2) ...
Processing triggers for desktop-file-utils (0.22-1) ...
Processing triggers for mime-support (3.58) ...
Processing triggers for man-db (2.7.0.2-5) ...
Setting up python-glade2 (2.24.0-4) ...
Setting up fslint (2.44-2) ...
Individual command line tools are available in addition to the GUI. To access them, change to the /usr/share/fslint/fslint directory (its location on a standard install) or add it to $PATH. Each command in that directory has a --help option that further details its parameters:
$ ls /usr/share/fslint/fslint/
Sample outputs:
findbl  findid  findns  findsn  findu8  findup  fstool  zipdir
finded  findnl  findrs  findtf  findul  fslint  supprt
Where,
- findup – find DUPlicate files
- findnl – find Name Lint (problems with filenames)
- findu8 – find filenames with invalid utf8 encoding
- findbl – find Bad Links (various problems with symlinks)
- findsn – find Same Name (problems with clashing names)
- finded – find Empty Directories
- findid – find files with dead user IDs
- findns – find Non Stripped executables
- findrs – find Redundant Whitespace in files
- findtf – find Temporary Files
- findul – find possibly Unused Libraries
- zipdir – Reclaim wasted space in ext2 directory entries
Examples
To search for duplicates in current directory and below, enter:
## Set path first ##
export PATH=$PATH:/usr/share/fslint/fslint/
findup
findup .
To search for duplicates in all /nas01/cyberciti.biz/projects source directories and merge using hardlinks, enter:
findup -m /nas01/cyberciti.biz/projects*
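If "merge using hardlinks" is unfamiliar: two identical files are replaced by two names for one copy of the data, so the space is used only once. The sketch below illustrates the idea for a single pair of files. It is not findup's implementation. merge_pair and inode are hypothetical helper names, and both files must live on the same filesystem:

```shell
# merge_pair KEEP DUP: if the two files are byte-identical, replace DUP with
# a hard link to KEEP so the data is stored only once.
# Hypothetical helper for illustration; not part of fslint.
merge_pair() {
    cmp -s "$1" "$2" || return 1   # refuse to merge files that differ
    ln -f "$1" "$2"                # DUP becomes a second name for KEEP's data
}

# inode FILE: print the inode number of FILE.
inode() {
    ls -di "$1" | awk '{ print $1 }'
}
```

After merge_pair a b succeeds, inode a and inode b print the same number. Keep in mind that editing one name in place then changes the content seen through both names, so hard-link merging suits files that are not expected to diverge.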
To search the system for duplicate files over 20K in size, enter:
sudo findup / -size +20k
To search only my files (those that I own and that are in my home directory), enter:
findup ~ -user $(id -u)
To search the system for duplicate files belonging to the user tom, enter:
sudo findup / -user $(id -u tom)
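The extra arguments in these examples (-size +20k, -user ...) are ordinary find tests. Assuming findup hands them to find unchanged, you can preview which files a given filter selects by running find on its own first. candidates is a hypothetical helper name:

```shell
# candidates DIR [find tests...]: list the regular files under DIR that match
# the given find tests, for example -size +20k or -user "$(id -u)".
# Hypothetical helper for previewing a filter; it does not look for duplicates.
candidates() {
    dir=$1
    shift
    find "$dir" -type f "$@" | LC_ALL=C sort
}
```

For example, candidates ~ -user "$(id -u)" -size +20k lists your own files larger than 20K under your home directory, which is the set of files a findup run with the same tests would have to consider.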
Say hello to fslint-gui tool
fslint-gui is a GUI wrapper for the individual fslint command line tools:
fslint-gui &
Further readings:
- man pages fdupes(1), fslint-gui(1)



14 comments
cool. thx
Thanks forgot that one, there’s always a guarantee that Debian will make it easier.
Do you know an equivalent with a GUI front end on Ubuntu?
Tech Observer
I don’t think so,
apt-cache search dupes
fdupes – identifies duplicate files within given directories
findimagedupes – Finds visually similar or duplicate images
Hi Vivek
Nice topic,
Best regards,
–Philippe
Cool tool, but does it take hard links into consideration? They are the same file, not a dup, but with a different name.
I use fdupes with sed to output a shell file that will delete unwanted duplicates:
fdupes -r -n -S /directory | sed -r "s/^/#rm \"/" | sed -r "s/$/\"/" >duplicate-files.sh
The shell file created by this has each line commented out. It’s just a matter of uncommenting the files you want to delete. I don’t recall where I first saw this idea – credit to original poster though.
c.
I think the “yum install fdupes” is missing in the post
Great post, as usual.
Regards,
Rajagopal
@Rajagopal,
Thanks for the heads up!
@Wido,
Yes, take a look at the man page; it lists the options that control how soft and hard links are handled.
I’ve read about a similar tool, included in Ubuntu distro, with a GUI front end whose name is
fslint
hello, happy to see your article.
I am interested in what tools you are using to create the PDF for posts.
Could you tell?
“Download PDF version”
Try FSlint maybe. It has a GUI if you insist.
Hope it helps
I’m running CentOS 5 with fdupes v1.4. fdupes runs fine if I’m checking for duplicates on the Linux box itself. But I need to check for duplicates on an NFS-mounted drive. When I run the command, nothing seems to happen other than that I get the command line again. I’m typing fdupes -r /dir1 /dir2
My NFS mount was mounted with the following command: mount -t nfs -o tcp ipaddress:/ifs /dupeChk
Thanks in advance for any help.
I’ve been using Duplicate Files Deleter for a year now; it’s an easy fix.