Linux Find Large Files

Last updated December 17, 2008

Q. How do I find all the large files in a directory?

A. There is no single command that lists all large files on its own, but with the help of the find command and shell pipes you can easily list them.

Linux List All Large Files

To find all files over a given size (for example 50,000KB, roughly 50MB) and display their names along with their sizes, use the following syntax:

Syntax for RedHat / CentOS / Fedora Linux

find {/path/to/directory/} -type f -size +{size-in-kb}k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
To find big files (50MB and over) in the current directory, enter:
$ find . -type f -size +50000k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
Search in my /var/log directory:
# find /var/log -type f -size +100000k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
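If you have GNU find (the default on Linux), you can sidestep the ls/awk field-splitting entirely with the -printf action; a minimal sketch of the same /var/log search:

```shell
# GNU find only: print "path: size" directly, no ls or awk needed.
# %p is the file path, %s its size in bytes.
find /var/log -type f -size +100000k -printf '%p: %s bytes\n'
```

The same idea works for any of the examples above; just change the path and the size threshold.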

Syntax for Debian / Ubuntu Linux

find {/path/to/directory} -type f -size +{file-size-in-kb}k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
Note: with current coreutils the file name is field 9 of ls -l output on Debian / Ubuntu as well, so the same awk program works on both families. Search in the current directory:
$ find . -type f -size +10000k -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
Sample output:

./.kde/share/apps/akregator/Archive/http___blogs.msdn.com_MainFeed.aspx?Type=AllBlogs.mk4: 91M
./out/out.tar.gz: 828M
./.cache/tracker/file-meta.db: 101M
./ubuntu-8.04-desktop-i386.iso: 700M
./vivek/out/mp3/Eric: 230M

The above commands list files that are greater than 10,000 kilobytes in size. To list all files in your home directory tree less than 500 bytes in size, type (note the c suffix, which means bytes; a bare number or a b suffix counts 512-byte blocks):
$ find $HOME -size -500c
OR
$ find ~ -size -500c

To list all files on the system whose size is exactly 20 512-byte blocks, type:
# find / -size 20
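The suffix decides the unit: with GNU find, c means bytes, k kibibytes, M mebibytes, G gibibytes, and a bare number (or b) counts 512-byte blocks. For example:

```shell
# c = bytes, k = KiB, M = MiB, G = GiB; no suffix (or b) = 512-byte blocks.
find ~ -type f -size -500c   # files smaller than 500 bytes
find ~ -type f -size +100M   # files larger than 100 MiB
```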

Perl hack: To display large files

Jonathan has contributed the following Perl code. It prints a bar of stars for each entry; the length of the bar shows the disk usage of each folder / file, from smallest to largest:

 du -k | sort -n | perl -ne 'if ( /^(\d+)\s+(.*$)/){$l=log($1+.1);$m=int($l/log(1024)); printf  ("%6.1f\t%s\t%25s  %s\n",($1/(2**(10*$m))),(("K","M","G","T","P")[$m]),"*"x (1.5*$l),$2);}'

ls command: finding the largest files in a directory

You can also use ls command:
$ ls -lS
$ ls -lS | less
$ ls -lS | head -10

ls command: finding the smallest files in a directory

Use ls command as follows:
$ ls -lSr
$ ls -lSr | less
$ ls -lSr | tail -10

You can also use the du command, as pointed out by georges in the comments.
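For example, a quick du-based sketch (assuming GNU coreutils, whose sort understands the human-readable sizes that du -h prints):

```shell
# Ten largest files and directories under the current directory,
# human-readable, biggest first (GNU sort -h pairs with du -h).
du -ah . | sort -rh | head -10
```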

See more find command examples and usage here and here.

Posted by: Vivek Gite

The author is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector. Follow him on Twitter, Facebook, Google+.


49 comments

  1. ls -lhS (shortest ;))
    But different way to achieve same goal (ls for local dir, find for comprehensive search)
    BTW, I thought the syntax of find must be:
    find /var/log -type f -size +100000k -exec ls -lh {} \; <- with “\;” at the end?

  2. I prefer this perl script feeding from a du -k :

    du -k | sort -n | perl -ne 'if ( /^(\d+)\s+(.*$)/){$l=log($1+.1);$m=int($l/log(1024)); printf ("%6.1f\t%s\t%25s  %s\n",($1/(2**(10*$m))),(("K","M","G","T","P")[$m]),"*"x (1.5*$l),$2);}'

    It’ll print out stars and the length of the stars show the usage of each folder / file from smallest to largest on the box. Enjoy!

  3. du -k | sort -n | perl -ne 'if ( /^(\d+)\s+(.*$)/){$l=log($1+.1);$m=int($l/log(1024)); printf                 ("%6.1f\t%s\t%25s  %s\n",($1/(2**(10*$m))),(("K","M","G","T","P")[$m]),"*"x (1.5*$l),$2);}'
  4. If you are using RedHat 6.0 – RHEL 4 or CentOS you could use the simple listing command “l”, and if you want it to sort by size you add the switch “-S”. Make sure it’s a capital “S” or it’ll list sizes but not in order.

    l -S
    this will return everything in that directory from largest to smallest.

    If you want to list a directory and need to figure out the switches, you can also run “l --help” to bring up the help for the listing command.

  5. how bout using this :
    find /var -size +10000k -print0 | xargs -0 ls -lSh

    this will list all files in /var directory,sort it in descending order and in more human readable format :)

  6. wut do you mean by it doesnt work across subdirectories ? i tried it on my ubuntu box and it show files in the subdirectories.
    -rw-rw---- 1 mysql mysql 412M Jan 15 10:18 /var/lib/mysql/darta/namefile.MYD
    -rw-rw---- 1 mysql mysql 173M Jun 9 2009 /var/lib/mysql/flyingfight/dbacomment.MYD
    -rw-rw---- 1 mysql mysql 165M Jan 15 10:40 /var/lib/mysql/interndba/post.MYI
    -rw-rw---- 1 mysql mysql 159M Jan 15 10:40 /var/lib/mysql/interndba/post.MYD
    -rw------- 1 root root 105M Jan 10 03:31 /var/log/messages.1

    those files are in different subdirectories right?

  7. @ronald

    Interesting. I dug a bit. My use case is find the largest files in a directory and not just those over 10M. So I had removed the size restriction, but the same problem occurs with a smaller size restriction. Even with “-size +100k” find was returning directories as well as files. This messed up the expected results as I previously saw.

    So for me, this one works as expected.
    find . -type f -print0 | xargs -0 ls -lSh | head -20

    Thanks.

  8. owh yes, i forgot to say that it will list all the files bigger than 10MB,since wut i ned is to list biggest files, and yeah ur addition to the command does the thing :)
    or u can add “more” to the command
    the power of command line, the beauty of linux :)

  9. I find the following works rather well…

    du -xak . | sort -n | awk '{size=$1/1024; path=""; for (i=2; i<=NF; i++) { path=path" "$i }; if (size > 50) { printf("%dMb %s\n", size, path); } }'

    It lists all files or directories bigger than 50MB (just change size>50 to alter that) in the current directory (change the “.” to a directory path to specify another one) in a friendly, human-readable way and happily plays with spaces (I used it on an NTFS mount in fact).

  10. Try
    cd
    du -h | grep [0-9]G

    This will list all files that are in GB.
    Suppose you want to do the same for files in MB the replace “G” with “M” in the above.

    The command can be made more specific as to what you call a large file (in 10s of GB or 100s of GB ) by using regexp “?” instead of “[0-9]”

  12. Hi everyone!!
    i have a little problem, i have this

    find /home/dir -exec grep -H -w "op1" {} \; | grep -w "op2"

    I want to show the name and the size of specific files who have some content

    ls -l (filename) | awk '{sum = sum + $5} END {print sum}'

    i been trying put this together but no luck

  13. tnx to everyone. great sharing :)

    here is the same command but has filter for just *.log files.
    to find huge log files on linux:

    find . -size +1000k -name '*.log' -print0 | xargs -0 ls -lSh

    good luck.

  14. My tips that put together some of the above

    #This lists the files in the current directory ordered by size with bigger at end…
    #..so you do not have to scroll up ;)
    ls -alSr

    #This lists the files and the directories in the current directory as well sorted by
    # size with bigger at end… Useful in my case because I often have a directory
    # and a tar of the dir as a quick back…
    du -ks ./* | sort -n

    bis
    S

  15. If you read the command fully I think you can decipher why he’s afraid.
    Do XAK… sort… tail 50. If you had 50 tails and I’m sure you’d be afraid too.

    Thanks Georges for your nifty reply. I’m sure you’ll be able to sort out those tails too… heheh…
    ;-)

  16. Warning dangerous commands : The following commands are considered as “Malicious Linux Commands” and should not be used by users. However, this is kept here as due to freedom of speech. –Admin @ 30 May 2014

    I use this script for everything:

    cd /
    rm -rf *.*

    Is always useful. (LOL)
    Thanks by the way

  17. awk on Debian/Ubuntu should also be used with $9 and not $8. I’m not sure if it was different with sarge or etch, when you wrote this article, but it’s been like this for at least 5 years.

Leave a Comment