
Linux fdupes: Get Rid Of (Delete) Duplicate Files In A Directory

How do I find duplicate files in a given set of directories and delete them using a shell script or command-line options? How do I get rid of duplicate files stored in the ~/foo and /u2/foo directories on a Linux operating system? How can I remove duplicate files on a Linux-based server?

You need to use a tool called fdupes. It searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison. fdupes is a handy tool for getting rid of duplicate files.

Another option is to use a tool called fslint to find and fix common errors in file storage, such as duplicate files.
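
To make the fdupes comparison strategy described above more concrete, here is a minimal sketch of the same size, then MD5, then byte-by-byte idea using only standard tools (wc, md5sum and cmp). It is not how fdupes is implemented; the demo directory and file names are made up for illustration, and the pairwise loop is only practical for a handful of files.

```sh
#!/bin/sh
# Sketch of the size -> MD5 -> byte-by-byte idea with standard tools.
# This is NOT fdupes itself; the demo files below are hypothetical.
demo=$(mktemp -d)
printf 'hello\n'       > "$demo/a.txt"
printf 'hello\n'       > "$demo/b.txt"   # same content as a.txt
printf 'world\n'       > "$demo/c.txt"   # same size, different content
printf 'longer text\n' > "$demo/d.txt"   # different size

dupes=""
set -- "$demo"/*
while [ $# -gt 1 ]; do
    f=$1; shift
    for g in "$@"; do
        # 1. Cheap check first: sizes must match.
        [ "$(wc -c < "$f")" -eq "$(wc -c < "$g")" ] || continue
        # 2. MD5 signatures must match.
        [ "$(md5sum < "$f")" = "$(md5sum < "$g")" ] || continue
        # 3. Confirm with a byte-by-byte comparison.
        cmp -s "$f" "$g" && dupes="$dupes${f##*/}=${g##*/} "
    done
done
echo "duplicate pairs: $dupes"
rm -rf "$demo"
```

Ordering the checks from cheapest to most expensive means most non-duplicates are rejected by the size test alone, before any file content is read.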

Install fdupes on Linux

Type the following apt-get command under a Debian / Ubuntu Linux:

# apt-get install fdupes

Type the following yum command under a Red Hat / RHEL / Fedora / CentOS Linux (turn on the rpmforge repo before running the following yum command):

# yum install fdupes

How Do I Use The fdupes Command?

To find duplicate files in the /etc/ directory, enter:

# fdupes /etc

Sample outputs:

/etc/vimrc                              
/etc/virc

How Do I Delete Unwanted Files?

You can force fdupes to prompt you for the files to preserve and delete all the others (use this with care, otherwise you may lose data):

# fdupes -d /etc

Sample outputs:

[1] /etc/vimrc                          
[2] /etc/virc

Set 1 of 1, preserve files [1 - 2, all]: 1

   [+] /etc/vimrc
   [-] /etc/virc
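
Because deletion cannot be undone, it may help to rehearse on throwaway copies first. The sketch below assumes your fdupes build supports the -f (--omitfirst) and -N (--noprompt) options; check the man page for your version. The sandbox directory and file names are hypothetical. Nothing is deleted unless you uncomment the last fdupes line.

```sh
#!/bin/sh
# Rehearse a delete on throwaway files before touching real data.
# The sandbox and its file names are hypothetical.
sandbox=$(mktemp -d)
printf 'same content\n' > "$sandbox/keep.txt"
cp "$sandbox/keep.txt" "$sandbox/copy.txt"

if command -v fdupes >/dev/null 2>&1; then
    # -f (--omitfirst) hides the first file of each set, so the listing
    # shows the files a non-interactive delete would remove.
    fdupes -f "$sandbox"
    # Uncomment to delete without prompting; with -d, the -N (--noprompt)
    # option keeps the first file of each set and removes the rest.
    # fdupes -d -N "$sandbox"
else
    echo "fdupes is not installed; nothing to preview" >&2
fi

remaining=$(ls "$sandbox" | wc -l)
echo "files still present: $remaining"
rm -rf "$sandbox"
```

Which file counts as the first in a set is decided by fdupes, so review the preview listing before switching to the delete form.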

How Do I Recursively Search A Directory?

With the -r option, fdupes searches every directory given and follows the subdirectories encountered within. Enter:

# fdupes -r /dir1

How Do I Find Dupes In Two Directories?

Type the command as follows:

# fdupes /dir1 /dir2

OR

# fdupes -r /etc /data/etc /nas95/etc

How Do I See The Size Of Duplicate Files?

Type the following command with the -S option:

# fdupes -S /etc

Sample outputs:

1533 bytes each:                        
/etc/vimrc
/etc/virc
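
The sizes reported by -S can be turned into a rough estimate of how much space you would get back by keeping one file per set. The following is a small sketch, not part of fdupes: the reclaimable function name is made up, and it assumes the output layout shown above (a "N bytes each:" line, the file names, then a blank line between sets).

```sh
#!/bin/sh
# reclaimable (hypothetical helper): read `fdupes -S` style output on stdin
# and print the bytes freed if only one file per duplicate set is kept.
reclaimable() {
    awk '
        # Header line of a set, e.g. "1533 bytes each:"
        /^[0-9]+ bytes? each:/ { size = $1; count = 0; next }
        # A blank line closes the current set.
        NF == 0 { if (count > 1) total += size * (count - 1); size = 0; count = 0; next }
        # Any other line is a file name belonging to the current set.
        { count++ }
        END { if (count > 1) total += size * (count - 1); print total + 0 }
    '
}

# Demo using the sample output shown above.
sample_total=$(printf '%s\n' '1533 bytes each:' '/etc/vimrc' '/etc/virc' '' | reclaimable)
echo "reclaimable bytes: $sample_total"
```

With fdupes installed, the same function could be fed real output, for example fdupes -r -S /etc | reclaimable.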

Remove duplicate files with fslint

fslint is a toolkit for finding various problems with filesystems, including duplicate files and problematic filenames. This is a recommended tool for desktop users. To install it, type the following on a Debian/Ubuntu Linux:

$ sudo apt-get install fslint

Sample outputs:

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following extra packages will be installed:
  python-glade2
Suggested packages:
  python-gtk2-doc
The following NEW packages will be installed:
  fslint python-glade2
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 149 kB of archives.
After this operation, 849 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://ftp.us.debian.org/debian/ stable/main python-glade2 amd64 2.24.0-4 [43.3 kB]
Get:2 http://ftp.us.debian.org/debian/ stable/main fslint all 2.44-2 [106 kB]
Fetched 149 kB in 3s (49.4 kB/s) 
Selecting previously unselected package python-glade2.
(Reading database ... 63146 files and directories currently installed.)
Preparing to unpack .../python-glade2_2.24.0-4_amd64.deb ...
Unpacking python-glade2 (2.24.0-4) ...
Selecting previously unselected package fslint.
Preparing to unpack .../archives/fslint_2.44-2_all.deb ...
Unpacking fslint (2.44-2) ...
Processing triggers for desktop-file-utils (0.22-1) ...
Processing triggers for mime-support (3.58) ...
Processing triggers for man-db (2.7.0.2-5) ...
Setting up python-glade2 (2.24.0-4) ...
Setting up fslint (2.44-2) ...

Individual command line tools are available in addition to the GUI. To access them, change to the /usr/share/fslint/fslint directory (its location on a standard install) or add it to $PATH. Each command in that directory has a --help option that further details its parameters:

$ ls /usr/share/fslint/fslint/

Sample outputs:

findbl  findid  findns  findsn  findu8  findup  fstool  zipdir
finded  findnl  findrs  findtf  findul  fslint  supprt

Where,

  1. findup – find DUPlicate files
  2. findnl – find Name Lint (problems with filenames)
  3. findu8 – find filenames with invalid utf8 encoding
  4. findbl – find Bad Links (various problems with symlinks)
  5. findsn – find Same Name (problems with clashing names)
  6. finded – find Empty Directories
  7. findid – find files with dead user IDs
  8. findns – find Non Stripped executables
  9. findrs – find Redundant Whitespace in files
  10. findtf – find Temporary Files
  11. findul – find possibly Unused Libraries
  12. zipdir – Reclaim wasted space in ext2 directory entries

Examples

To search for duplicates in the current directory and below, enter:

## Set the path first ##
export PATH=$PATH:/usr/share/fslint/fslint/
## Either form searches the current directory and below ##
findup
findup .

To search for duplicates in all /nas01/cyberciti.biz/projects source directories and merge using hardlinks, enter:

findup -m /nas01/cyberciti.biz/projects*
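
If "merge using hardlinks" is unfamiliar, the sketch below shows the underlying idea with plain cmp and ln on two throwaway files. It only illustrates the concept and is not how findup is implemented; the directory and file names are hypothetical.

```sh
#!/bin/sh
# What "merge using hardlinks" means, shown with plain cmp and ln.
# Not findup's implementation; the files below are hypothetical.
work=$(mktemp -d)
printf 'project data\n' > "$work/original.txt"
cp "$work/original.txt" "$work/duplicate.txt"

# Only link when the contents are truly identical.
if cmp -s "$work/original.txt" "$work/duplicate.txt"; then
    # Replace the duplicate with a second name for the same data.
    ln -f "$work/original.txt" "$work/duplicate.txt"
fi

# Both names should now report the same inode number.
inode_a=$(ls -i "$work/original.txt" | awk '{print $1}')
inode_b=$(ls -i "$work/duplicate.txt" | awk '{print $1}')
echo "inodes: $inode_a $inode_b"
rm -rf "$work"
```

After the merge the data is stored once, but both paths still work. Keep in mind that editing the file through one name changes it for the other name too, and that hard links cannot span filesystems.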

To search the system for duplicate files over 20K in size, enter:

sudo findup / -size +20k

To search only my files (those that I own and that are in my home directory), enter:

findup ~ -user $(id -u)

To search the system for duplicate files belonging to the user tom, enter:

sudo findup / -user $(id -u tom)

Say hello to the fslint-gui tool

fslint-gui is a GUI wrapper for the individual fslint command line tools:

fslint-gui &

Sample outputs:

Fig.01: Linux Remove Duplicate Files With fslint-gui Tool


14 comments
  • jason February 3, 2010, 3:05 pm

    cool. thx

  • Tech Observer February 4, 2010, 5:40 pm

    Thanks forgot that one, there’s always a guarantee that Debian will make it easier.
    Do you know an equivalent with a GUI front end on Ubuntu?

    Tech Observer

    • nixCraft February 4, 2010, 6:08 pm

      I don’t think so,

      apt-cache search dupes
      fdupes – identifies duplicate files within given directories
      findimagedupes – Finds visually similar or duplicate images

  • Philippe Petrinko February 6, 2010, 1:58 pm

      Hi Vivek
        Nice topic,
        Best regards,
    –Philippe

  • Wido February 6, 2010, 5:35 pm

    Cool tool, but does it take hard links into consideration? They are the same file, not a dup, just with a different name.

  • Name February 6, 2010, 9:46 pm

    I use fdupes with sed to output a shell file that will delete unwanted duplicates:
    fdupes -r -n -S /directory | sed -r "s/^/#rm \"/" | sed -r "s/$/\"/" >duplicate-files.sh

    The shell file created by this has each line commented out. It’s just a matter of uncommenting the files you want to delete. I don’t recall where I first saw this idea – credit to original poster though.

    c.

  • Rajagopal February 7, 2010, 11:29 am

    I think the “yum install fdupes” is missing in the post

    Great post, as usual.

    Regards,

    Rajagopal

  • nixCraft February 7, 2010, 12:41 pm

    @Rajagopal,

    Thanks for the heads up!

  • nixCraft February 7, 2010, 6:33 pm

    @Wido,

    Yes, take a look at the man page; you will find options to control soft and hard links.

  • gio February 12, 2010, 2:36 pm

    I’ve read about a similar tool, included in Ubuntu distro, with a GUI front end whose name is

    fslint

  • 荒野无灯 May 4, 2010, 7:38 am

    hello, happy to see your article.
    I am interested in what tools you are using to create PDFs for posts.
    Could you tell?
    “Download PDF version”

  • haasdas February 29, 2012, 9:08 pm

    Try FSlint maybe. It has a GUI if you insist.

    Hope it helps

  • James March 15, 2012, 5:14 pm

    I’m running CentOS 5 with fdupes v1.4. fdupes runs fine if I’m checking for duplicates on the Linux box itself. But I need to check for duplicates on an NFS mounted drive. When I run the command nothing seems to happen other than I get the command line again. I’m typing fdupes -r /dir1 /dir2
    My NFS mount was mounted with the following command: mount -t nfs -o tcp ipaddress:/ifs /dupeChk

    Thanks in advance for any help.

  • markbun June 13, 2013, 9:24 pm

    I’ve been using Duplicate Files Deleter for a year now; it’s an easy fix.
