Linux fdupes: Get Rid Of (Delete) Duplicate Files In A Directory

How do I find duplicate files in a given set of directories and delete them using a shell script or command line options? How do I get rid of duplicate files stored in the ~/foo and /u2/foo directories on a Linux operating system? How can I remove duplicate files on a Linux based server?

You need to use a tool called fdupes. It searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison. fdupes is a nice tool to get rid of duplicate files.

Tutorial details
Difficulty: Easy
Root privileges: No
Requirements: None
Estimated completion time: N/A
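
To make the size, MD5 and byte-by-byte idea concrete, here is a rough sketch built from standard tools. It is not how fdupes is implemented: the function name find_dupes_sketch is made up for this example, it hashes every file instead of only those that share a size, and it assumes file names contain no tabs or newlines.

```shell
#!/bin/sh
# Illustrative only: mimics the "size, then MD5, then byte-by-byte" idea.
# find_dupes_sketch is a hypothetical helper, not part of fdupes.
find_dupes_sketch() {
    tab=$(printf '\t')
    prev_key=
    prev_file=
    # 1. Emit "size<TAB>md5<TAB>path" for every non-empty regular file.
    find "$1" -type f -size +0 -exec sh -c '
        for f do
            printf "%s\t%s\t%s\n" "$(($(wc -c <"$f")))" \
                "$(md5sum <"$f" | cut -d" " -f1)" "$f"
        done' sh {} + |
    # 2. Sort so files with equal size and checksum become adjacent.
    LC_ALL=C sort |
    # 3. Confirm each candidate pair byte by byte with cmp.
    while IFS=$tab read -r size sum file; do
        if [ "$size $sum" = "$prev_key" ] && cmp -s -- "$prev_file" "$file"; then
            printf '%s == %s\n' "$prev_file" "$file"
        else
            prev_key="$size $sum"
            prev_file=$file
        fi
    done
}
```

Calling find_dupes_sketch /some/dir prints one line per confirmed pair. For real work use fdupes itself, which is faster and handles awkward file names.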

Another option is to use a tool called fslint to find and fix common errors in file storage, such as duplicate files.

Install fdupes on Linux

Type the following apt-get command under a Debian / Ubuntu Linux:

# apt-get install fdupes

Type the following yum command under a Red Hat / RHEL / Fedora / CentOS Linux (turn on the rpmforge repo before running the following yum command):

# yum install fdupes

How Do I Use The fdupes Command?

To find duplicate files in /etc/ directory, enter:

# fdupes /etc

Sample outputs:

/etc/vimrc
/etc/virc

How Do I Delete Unwanted Files?

You can force fdupes to prompt you for files to preserve, deleting all others (use this with care, otherwise you may lose data):

# fdupes -d /etc

Sample outputs:

[1] /etc/vimrc
[2] /etc/virc
Set 1 of 1, preserve files [1 - 2, all]: 1
   [+] /etc/vimrc
   [-] /etc/virc
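
Before deleting anything, it can help to preview which files a "keep the first copy" policy would remove. The sketch below only parses plain fdupes output (one path per line, a blank line between sets) and deletes nothing. The function name list_extra_copies is made up for this example.

```shell
#!/bin/sh
# Reads fdupes-style output on stdin and prints every path except the first
# of each set, i.e. the copies a "keep the first" policy would remove.
# list_extra_copies is a hypothetical helper; nothing is deleted here.
list_extra_copies() {
    awk '
        BEGIN  { first = 1 }
        /^$/   { first = 1; next }   # blank line: a new set starts next
        first  { first = 0; next }   # skip the copy we want to keep
               { print }             # remaining copies are candidates
    '
}
```

For example: fdupes -r /dir1 | list_extra_copies. It does not understand the size headers printed by the -S option. The fdupes man page also documents -f (omit the first file in each set) and -N (with -d, keep the first file without prompting); check the man page of your installed version before relying on them.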

How Do I Recursively Search A Directory?

To search recursively, following the subdirectories encountered within every directory given, use the -r option:

# fdupes -r /dir1

How Do I Find Dupes In Two Directories?

Type the command as follows:

# fdupes  /dir1 /dir2

OR

# fdupes -r /etc /data/etc /nas95/etc

How Do I See Size Of Duplicate Files?

Type the following command with the -S option:

# fdupes -S /etc

Sample outputs:

1533 bytes each:
/etc/vimrc
/etc/virc
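
The -S output can be turned into a rough estimate of how much space deleting the extra copies would free. This sketch assumes the output format shown in the sample above (an "N bytes each:" header followed by the paths of one set) and counts every copy after the first as surplus. The function name reclaimable_bytes is made up for this example.

```shell
#!/bin/sh
# Sums the bytes occupied by all but the first copy in each duplicate set,
# reading `fdupes -S` style output on stdin.
# reclaimable_bytes is a hypothetical helper, not part of fdupes.
reclaimable_bytes() {
    awk '
        /^[0-9]+ bytes? each:$/ { size = $1; seen = 0; next }  # set header
        /^$/                    { next }                       # set separator
        { if (seen++) total += size }   # every copy after the first is surplus
        END { print total + 0 }
    '
}
```

For example: fdupes -r -S /etc | reclaimable_bytes. Hard-linked copies, if your fdupes options report them, do not actually free space when removed, so treat the number as an upper bound.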

Remove duplicate files with fslint

fslint is a tool to find various problems with filesystems, including duplicate files and problematic filenames. It is a recommended tool for desktop users. To install it, type the following on a Debian/Ubuntu Linux:

$ sudo apt-get install fslint

Sample outputs:

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  python-glade2
Suggested packages:
  python-gtk2-doc
The following NEW packages will be installed:
  fslint python-glade2
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 149 kB of archives.
After this operation, 849 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://ftp.us.debian.org/debian/ stable/main python-glade2 amd64 2.24.0-4 [43.3 kB]
Get:2 http://ftp.us.debian.org/debian/ stable/main fslint all 2.44-2 [106 kB]
Fetched 149 kB in 3s (49.4 kB/s)
Selecting previously unselected package python-glade2.
(Reading database ... 63146 files and directories currently installed.)
Preparing to unpack .../python-glade2_2.24.0-4_amd64.deb ...
Unpacking python-glade2 (2.24.0-4) ...
Selecting previously unselected package fslint.
Preparing to unpack .../archives/fslint_2.44-2_all.deb ...
Unpacking fslint (2.44-2) ...
Processing triggers for desktop-file-utils (0.22-1) ...
Processing triggers for mime-support (3.58) ...
Processing triggers for man-db (2.7.0.2-5) ...
Setting up python-glade2 (2.24.0-4) ...
Setting up fslint (2.44-2) ...

Individual command line tools are available in addition to the GUI. To access them, change to the /usr/share/fslint/fslint directory (its location on a standard install) or add it to $PATH. Each command in that directory has a --help option that further details its parameters:

$ ls /usr/share/fslint/fslint/

Sample outputs:

findbl  findid  findns  findsn  findu8  findup  fstool  zipdir
finded  findnl  findrs  findtf  findul  fslint  supprt

Where,

  1. findup - find DUPlicate files
  2. findnl - find Name Lint (problems with filenames)
  3. findu8 - find filenames with invalid utf8 encoding
  4. findbl - find Bad Links (various problems with symlinks)
  5. findsn - find Same Name (problems with clashing names)
  6. finded - find Empty Directories
  7. findid - find files with dead user IDs
  8. findns - find Non Stripped executables
  9. findrs - find Redundant Whitespace in files
  10. findtf - find Temporary Files
  11. findul - find possibly Unused Libraries
  12. zipdir - Reclaim wasted space in ext2 directory entries

Examples

To search for duplicates in current directory and below, enter:

##  Set path first ##
export PATH=$PATH:/usr/share/fslint/fslint/
findup
findup .
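
If you put the PATH change in a shell startup file, a small guard avoids appending the same directory every time the file is sourced. This is only a sketch: the function name add_fslint_to_path is made up, and the default directory is the standard install location mentioned above, which may differ on your system.

```shell
#!/bin/sh
# Append the fslint tools directory to PATH only if it exists and is not
# already listed. add_fslint_to_path is a hypothetical helper.
add_fslint_to_path() {
    dir=${1:-/usr/share/fslint/fslint}
    [ -d "$dir" ] || { echo "not found: $dir" >&2; return 1; }
    case ":$PATH:" in
        *":$dir:"*) ;;                        # already present, nothing to do
        *) PATH="$PATH:$dir"; export PATH ;;
    esac
}
```

Call add_fslint_to_path once, then run findup as shown above.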

To search for duplicates in all /nas01/cyberciti.biz/projects source directories and merge using hardlinks, enter:

findup -m /nas01/cyberciti.biz/projects*
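
Merging with hardlinks means the duplicate names end up pointing at the same data on disk, so the content is stored once. The sketch below shows the idea for a single pair of files. It is not findup's own code, and the function name merge_as_hardlink is made up for this example.

```shell
#!/bin/sh
# Replace DUP with a hard link to KEEP, but only if the two files are
# byte-for-byte identical. Both must live on the same filesystem.
# merge_as_hardlink is a hypothetical helper, not part of fslint.
merge_as_hardlink() {
    keep=$1
    dup=$2
    [ "$keep" -ef "$dup" ] && return 0          # already the same inode
    cmp -s -- "$keep" "$dup" || { echo "files differ" >&2; return 1; }
    ln -f -- "$keep" "$dup"                     # dup now shares keep's inode
}
```

After merge_as_hardlink a b, the test [ a -ef b ] succeeds and ls -li shows the same inode number for both names. Keep in mind that editing one name in place then changes the other as well.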

To search the system for duplicate files over 20K in size, enter:

sudo findup / -size +20k

To search only my files (those that I own and that are in my home directory), enter:

findup ~ -user $(id -u)

To search the system for duplicate files belonging to the user tom, enter:

sudo findup / -user $(id -u tom)

Say hello to fslint-gui tool

fslint-gui is a GUI wrapper for the individual fslint command line tools:

fslint-gui &

Sample outputs:

Fig.01: Linux Remove Duplicate Files With fslint-gui Tool


14 comments

  • jason February 3, 2010, 3:05 pm

    cool. thx

  • Tech Observer February 4, 2010, 5:40 pm

    Thanks forgot that one, there’s always a guarantee that Debian will make it easier.
    Do you know an equivalent with a GUI front end on Ubuntu?

    Tech Observer

    • nixCraft February 4, 2010, 6:08 pm

      I don’t think so,

      apt-cache search dupes
      fdupes – identifies duplicate files within given directories
      findimagedupes – Finds visually similar or duplicate images

  • Philippe Petrinko February 6, 2010, 1:58 pm

      Hi Vivek
        Nice topic,
        Best regards,
    –Philippe

  • Wido February 6, 2010, 5:35 pm

    Cool tool, but does it take hard links into consideration? They are the same file, not a dup, but with a different name.

  • Name February 6, 2010, 9:46 pm

    I use fdupes with sed to output a shell file that will delete unwanted duplicates:
    fdupes -r -n -S /directory | sed -r "s/^/#rm \"/" | sed -r "s/$/\"/" >duplicate-files.sh

    The shell file created by this has each line commented out. It’s just a matter of uncommenting the files you want to delete. I don’t recall where I first saw this idea – credit to original poster though.

    c.

  • Rajagopal February 7, 2010, 11:29 am

    I think the “yum install fdupes” is missing in the post

    Great post, as usual.

    Regards,

    Rajagopal

  • nixCraft February 7, 2010, 12:41 pm

    @Rajagopal,

    Thanks for the heads up!

  • nixCraft February 7, 2010, 6:33 pm

    @Wido,

    Yes, take a look at the man page; you will find options to control how soft and hard links are handled.

  • gio February 12, 2010, 2:36 pm

    I’ve read about a similar tool, included in Ubuntu distro, with a GUI front end whose name is

    fslint

  • 荒野无灯 May 4, 2010, 7:38 am

    hello, happy to see your article.
    I am interested in what tools you are using to create PDFs for posts.
    Could you tell?
    “Download PDF version”

  • haasdas February 29, 2012, 9:08 pm

    Try FSlint maybe. It has a GUI if you insist.

    Hope it helps

  • James March 15, 2012, 5:14 pm

    I’m running CentOS 5 with fdupes v1.4. fdupes runs fine if I’m checking for duplicates on the Linux box itself. But I need to check for duplicates on an NFS mounted drive. When I run the command nothing seems to happen other than I get the command line again. I’m typing fdupes -r /dir1 /dir2
    My NFS mount was mounted with the following command: mount -t nfs -o tcp ipaddress:/ifs /dupeChk

    Thanks in advance for any help.

  • markbun June 13, 2013, 9:24 pm

    I’m using Duplicate Files Deleter for a year now it’s an easy fix .
