How Do I Find The Largest Top 10 Files and Directories On a Linux / UNIX / BSD?

Last updated March 29, 2016

How do I find the largest files and directories on a Linux or Unix-like operating system?

Sometimes you need to know which files or directories are eating up your disk space. You may also need to find this out for a particular location on the filesystem, such as /tmp/, /var/, or /home/. This guide shows how to use Unix and Linux commands to find the largest files or directories on a filesystem.

There is no single command that lists the largest files and directories on a Linux/UNIX/BSD filesystem. However, by combining the following commands with pipes, you can easily produce such a list:

  • du command : Estimate file space usage.
  • sort command : Sort lines of text files or given input data.
  • head command : Output the first part of files, e.g. to display only the 10 largest entries.
  • find command : Search for files.

Type the following command at the shell prompt to find the 10 largest files and directories under /var:
# du -a /var | sort -n -r | head -n 10
Sample outputs:

1008372 /var
313236  /var/www
253964  /var/log
192544  /var/lib
152628  /var/spool
152508  /var/spool/squid
136524  /var/spool/squid/00
95736   /var/log/mrtg.log
74688   /var/log/squid
62544   /var/cache
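
For readers new to pipes: the | character feeds the output of one command into the next, so du produces the sizes, sort orders them, and head trims the list. The same pipeline can be wrapped in a small shell function. This is only a minimal sketch; the name top_usage and its two arguments are made up for illustration, and -k is added on the assumption that you want sizes in 1 KiB units on every system (some du implementations default to 512-byte blocks).

```shell
# top_usage [DIR] [N]: print the N largest files and directories under DIR,
# biggest first. Hypothetical helper, not a standard command.
top_usage() {
    # du -a     : one line per file AND per directory
    # du -k     : report sizes in 1 KiB units
    # sort -rn  : numeric sort, largest value first
    # head -n N : keep only the first N lines
    # 2>/dev/null hides "Permission denied" noise from unreadable directories.
    du -ak -- "${1:-.}" 2>/dev/null | sort -rn | head -n "${2:-10}"
}
```

For example, top_usage /var 10 should behave like the command above, and top_usage with no arguments reports on the current directory.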

If you want more human-readable output, try (GNU users only):
$ cd /path/to/some/where
$ du -hsx * | sort -rh | head -10

Where,

  • du command -h option : display sizes in human readable format (e.g., 1K, 234M, 2G).
  • du command -s option : show only a total for each argument (summary).
  • du command -x option : skip directories on different file systems.
  • sort command -r option : reverse the result of comparisons.
  • sort command -h option : compare human readable numbers. This option is specific to GNU sort.
  • head command -10 OR -n 10 option : show the first 10 lines.

The above command will only work if GNU sort is installed. Other Unix-like operating systems should use the following version (see comments below):

for i in G M K; do du -ah | grep "[0-9]$i" | sort -nr -k 1; done | head -n 11

Sample outputs:

179M	.
84M	./uploads
57M	./images
51M	./images/faq
49M	./images/faq/2013
48M	./uploads/cms
37M	./videos/faq/2013/12
37M	./videos/faq/2013
37M	./videos/faq
37M	./videos
36M	./uploads/faq
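
Another way to avoid depending on sort -h is to sort on plain kilobyte counts first and convert to human-readable units only at the very end. The following is a minimal sketch, assuming only du -k, sort, head, and awk are available; the function names humanize_kb and du_top_human are invented for this example.

```shell
# humanize_kb: read "SIZE_IN_KB<TAB>PATH" lines and print the size with a
# K/M/G/T suffix. Hypothetical helper name.
humanize_kb() {
    awk -F '\t' '{
        size = $1; unit = "K"
        if (size >= 1024) { size /= 1024; unit = "M" }
        if (size >= 1024) { size /= 1024; unit = "G" }
        if (size >= 1024) { size /= 1024; unit = "T" }
        # Everything after the first tab is the path, so spaces survive.
        printf "%.1f%s\t%s\n", size, unit, substr($0, index($0, "\t") + 1)
    }'
}

# du_top_human [DIR] [N]: N largest entries under DIR, sorted numerically
# in KiB and only then converted for display.
du_top_human() {
    du -ak -- "${1:-.}" 2>/dev/null | sort -rn | head -n "${2:-10}" | humanize_kb
}
```

Because the sort happens on raw numbers, a 2G entry can never end up below a 900M one, which is the ordering problem that plain sort -n has with du -h output.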

Find the largest file in a directory and its subdirectories using the find command

Type the following GNU/find command:

## Warning: only works with GNU find ##
find /path/to/dir/ -printf '%s %p\n'| sort -nr | head -10
find . -printf '%s %p\n'| sort -nr | head -10

Sample outputs:

5700875 ./images/faq/2013/11/iftop-outputs.gif
5459671 ./videos/faq/2013/12/glances/glances.webm
5091119 ./videos/faq/2013/12/glances/glances.ogv
4706278 ./images/faq/2013/09/cyberciti.biz.linux.wallpapers_r0x1.tar.gz
3911341 ./videos/faq/2013/12/vim-exit/vim-exit.ogv
3640181 ./videos/faq/2013/12/python-subprocess/python-subprocess.webm
3571712 ./images/faq/2013/12/glances-demo-large.gif
3222684 ./videos/faq/2013/12/vim-exit/vim-exit.mp4
3198164 ./videos/faq/2013/12/python-subprocess/python-subprocess.ogv
3056537 ./images/faq/2013/08/debian-as-parent-distribution.png.bak

To skip directories and display only files, type:

find /path/to/search/ -type f -printf '%s %p\n'| sort -nr | head -10

OR

find /path/to/search/ -type f -iname "*.mp4" -printf '%s %p\n'| sort -nr | head -10
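
On systems where find lacks the GNU -printf option, a roughly equivalent list can be built from POSIX pieces. This is a sketch under stated assumptions: the function name largest_files is illustrative, and du -k reports allocated disk space in KiB rather than the exact byte length that %s prints, so the numbers will differ somewhat from the output above.

```shell
# largest_files [DIR] [N]: N largest regular files under DIR, biggest first.
# Hypothetical helper; relies only on POSIX find, du, sort, and head.
largest_files() {
    # -type f         : regular files only, directories are skipped
    # -exec du -k {} + : run du on batches of files, sizes in 1 KiB units
    find "${1:-.}" -type f -exec du -k {} + 2>/dev/null |
        sort -rn | head -n "${2:-10}"
}
```

For example, largest_files /path/to/search/ 10 lists the ten files that occupy the most disk space.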

Hunt down disk space hogs with ducks

Use the following bash shell alias:

alias ducks='du -cks * | sort -rn | head'

Run it as follows to get the top 10 files/dirs eating your disk space:
$ ducks
Sample outputs:

Fig.01 Finding the largest files/directories on a Linux or Unix-like system
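
The alias only looks at the current directory and, because * does not match dotfiles, it misses hidden entries. A function can take the directory as an argument and include them. This is a rough sketch only: the name ducks_dir is hypothetical (chosen so it does not clash with the alias), and it assumes a du that supports -c, as GNU and BSD du do.

```shell
# ducks_dir [DIR] [N]: like the ducks alias, but for any directory and
# including hidden entries. The body runs in a subshell, so the cd does
# not change the caller's working directory.
ducks_dir() (
    cd -- "${1:-.}" || exit 1
    # .[!.]* matches dotfiles but not . or ..
    # Errors from patterns that match nothing are discarded.
    du -cks -- .[!.]* * 2>/dev/null | sort -rn | head -n "${2:-10}"
)
```

As with the alias, the first line of the output is the grand total printed by du -c.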

Posted by: Vivek Gite

The author is the creator of nixCraft and a seasoned sysadmin and a trainer for the Linux operating system/Unix shell scripting. He has worked with global clients and in various industries, including IT, education, defense and space research, and the nonprofit sector.

62 comments

  1. To find only the largest files in the current directory, use the ls command as follows:
    ls -lSh . | head -5
    Output:
    -rw-r--r-- 1 vivek vivek 267M 2004-08-04 15:37 WindowsXP-KB835935-SP2-ENU.exe
    -rw-r--r-- 1 vivek vivek 96M 2005-12-30 14:03 VMware-workstation-5.5.1-19175.tar.gz
    ls -lSh /bin | head -5
    You can also use the find command instead of du:
    find /var -type f -ls | sort -k 7 -r -n | head -10

    Hope this helps

  2. And yes, to find the smallest files use the command:
    ls -lSr /var

    Or use the find command with the -size flag.
    find / -type f -size +20000k -exec ls -lh {} ; | awk '{ print $8 ": " $5 }'

    Read man page of find for more info.

    1. “find / -type f -size +20000k -exec ls -lh {} ; | awk '{ print $8 ": " $5 }'”

      needs to have the exec altered

      find / -type f -size +20000k -exec ls -lh {} \; | awk '{ print $8 ": " $5 }'

      Also, I find this output easier to read

      find . -type f -size +20000k -exec ls -lh {} \; | awk '{print $5": "$8}'

  3. If you set du to human readable I think it will not sort the way you really want.

    For the above problems. I would like to find a way to list only the last level directories’ sizes.

    (I want to filter somehow this:
    /home
    /home/user
    /home/user/mail

    I just want to see the lasts of the tree!)

    TIA!


    R

      1. How about the 3 hours it takes to read through a bunch of unexplained programming nonsense. I swear half of the time all Linux guys do is insult other users. For example, the first listing is great as it begins to explain what the flags do, but I have no idea where to put some of them, pipes are not explained, etc.

    1. Fast forward a few years, Linux won, bro :)

      “Late last week, hell had apparently frozen over with the news that Microsoft had developed a Linux distribution of its own. The work was done as part of the company’s Azure cloud platform, which uses Linux-based network switches as part of its software-defined networking infrastructure.” — SOURCE HERE.

    1. This may vary depending on the version of Windows you’re using, but the basic procedure is: open the find/search window, go to the advanced options, and there will be an option there to enter in a size parameter. Simple.

      In XP, press F3 or go Start->Search. Choose “All files and folders”, then “More advanced options”, then “What size is it?”, then specify a size.

      1. I believe beez was being sarcastic towards Winfan, as was his right after Winfan's brainless comment. Most Linux distro GUIs come with a search function just like the Windows GUI.

        Now just try to do the search on command line on Windows (server)…

        Sam

    1. c:\>dir /S /O-S | more

      The simple dir /S command from c:\ will give you all files and directories from c:\ all the way through the drive and will sort from largest to smallest running through each directory. You can filter using /A if you’d like to restrict by hidden, system, archive files, read only files etc. and passing the output to another windows command if you need to further restrict or search in the files for something like “show me all the files on my hard drive over 6MB that contain the word ‘log’ from largest to smallest.”

      /O specifies the sort order for the files. The S after it sorts by file size (smallest first), and putting the - in front of the S reverses the order.

      | more – you’re a unix dude, you should know what this means…

      But if someone is doing some cleanup through their harddrive, this is the simple command I’d start with.

      Just a note about the cockiness of us Unix admins (as I happen to be one now):
      Not everyone that uses windows started using it with a mouse kid!!! Also not everyone who prefers windows is not cross-platform… We were running 64 bit clustered NT boxes on RISC processors at Digital Equipment Corporation with failover and failback in 1996 brother. Don’t believe me? Find a really old copy of Windows NT ver 3.51 open it and you’ll see two folders NT and Alpha.

      The Department of Veterans Affairs had no problems with ever needing to reboot a “lousy unreliable windows box” because the Intel platform itself was the problem, not windows. We ran Alpha on 64 bit RISC processors and it was just as reliable as any Unix box or Mainframe we had. I had a Jensen Alpha running an exchange server for 5 years, and we only rebooted it every 6 months for giggles…

      Windows machines are made to be used by the masses which means more dumbasses can kinda run one. A good Admin is a good Admin, no matter what platform. Be nice and be helpful or don’t post.

  4. I find the following works rather well…

    du -xak . | sort -n | awk '{size=$1/1024; path=""; for (i=2; i<=NF; i++) { path=path" "$i } if (size > 50) { printf("%dMb %s\n", size, path); } }'

    It lists all files or directories bigger than 50MB (just change size>50 to alter that) in the current directory (change the “.” to a directory path to specify another one) in a friendly, human-readable way and happily plays with spaces (I used it on an NTFS mount in fact).

    1. Open the file in the vi editor
      and press Shift+G to move to the last line of the file.

      Now, to add or insert text, press i (insert mode) and add your data. Then, to save,
      press Esc and type :wq!
      The file will be saved and you will quit the editor.

  5. The following is working and sorting properly by Gigabyte, Megabytes and Kilobytes:

    for i in G M K; do du -ah | grep "^[0-9.]*$i" | sort -nr -k 1; done | head -n 110

    Basically it makes sure to grep from the *beginning* of the line and to include the possible decimal points. In your version it would have stumbled over filenames beginning with numbers and a ‘G’ for instance.

  6. find / -xdev -size +2048 -ls  

    Lists all files larger than 1 MB (-size +2048 counts 512-byte blocks). Displays:

    1632043 2656 -rw-r-----   1 root     adm       2715648 Dec  3 13:51 /var/log/mail.log

    Datasource: IBM AIX Version 3 Commands Reference Volume 1
    A bit dated but it works just fine on Debian 7.

  7. I have the following:

    alias dush='du -sch .[!.]* * | sort -h'

    This adds the .hidden .directories as well (skipping parent .. directory), and sorts in human-readable format (GNU version of sort)

  8. For the 10 largest files in an entire filesystem (n = 10):
    find /mount/point -xdev -type f -print0 | xargs -0 du -sk | sort -rn | head -10

    Or a faster way for a truly huge filesystem and small n:
    find /mount/point -xdev -type f -print0 | xargs -0 du -sk | highest -n 10 --use-bisect

    …where highest can be found here.

    highest is faster because, although GNU sort is quite impressive, it’s still O(nlogn), while highest is O(n).

    HTH.
