Shell du command tip – estimate file space usage and exclude particular files

The du command estimates file space usage. It summarizes the disk usage of each FILE, recursively for directories.

It displays the file system block usage for each file argument and for each directory in the file hierarchy rooted in each directory argument. If no file is specified, it uses the current directory.

But why use du command?

You may be wondering why I’m shining a light on the du command. System administrators commonly use du in automated monitoring and notification scripts that help prevent directories from becoming full.
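
As a rough sketch of that kind of monitoring, the script below compares a directory's usage against a limit and prints a warning when the limit is exceeded. The function name check_usage and the limit in the example are hypothetical; a real script would probably mail or log the warning instead of printing it.

```shell
#!/bin/sh
# Hypothetical helper: warn when DIR uses more than LIMIT_KB kilobytes.
# Usage: check_usage DIR LIMIT_KB
check_usage() {
    dir=$1
    limit_kb=$2
    # du -sk prints "<size in KB><TAB><path>"; keep only the size field.
    used_kb=$(du -sk "$dir" | cut -f1)
    if [ "$used_kb" -gt "$limit_kb" ]; then
        echo "WARNING: $dir uses ${used_kb} KB (limit ${limit_kb} KB)"
        return 1
    fi
    echo "OK: $dir uses ${used_kb} KB (limit ${limit_kb} KB)"
    return 0
}

# Example (assumed limit of 1 GB, expressed in KB):
# check_usage /home/vivek 1048576
```

Because the function returns a non-zero status when the limit is exceeded, it can be called from cron or from another script and tested with a plain if statement.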

du command Examples

Type du to display usage for the current directory and its subdirectories:
$ du

Pass the -h option to display sizes in human-readable form (bytes, kilobytes, megabytes, gigabytes, terabytes and petabytes):
$ du -h

Another useful option is -c, which produces a grand total:
$ du -c

Display the name and size of each png file in the /ramdisk/figs/ directory, as well as a total for all of the pngs:
$ du -hc /ramdisk/figs/*.png

Show the disk usage of the /home/vivek directory and each of its subdirectories:
$ du /home/vivek

Show only a summary (a single total) of the disk usage of /home/vivek:
$ du -hs /home/vivek

Use --exclude=PATTERN to skip files that match PATTERN. For example, do not count *.obj or *.jpg files:

$ du -h --exclude='*.obj'
$ du -h --exclude='*.jpg'

A PATTERN is a shell pattern (a glob, not a Perl or other regular expression). The pattern ? matches any one character, whereas * matches any string.
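
To see how these patterns behave, here is a small experiment on a throwaway directory. It assumes GNU du (other du implementations may not support --exclude), and kept_files is a hypothetical helper written for this example, not part of du.

```shell
#!/bin/sh
# Hypothetical helper: list the files du still counts under DIR
# after excluding PATTERN. Assumes GNU du for --exclude.
kept_files() {
    dir=$1
    pattern=$2
    # -a reports files as well as directories; cut drops the size column.
    du -a --exclude="$pattern" "$dir" | cut -f2- | while IFS= read -r path; do
        [ -f "$path" ] && basename "$path"
    done | sort
}

# Build a small test tree of empty files.
demo=$(mktemp -d)
: > "$demo/a.obj"
: > "$demo/b.jpg"
: > "$demo/c1.txt"
: > "$demo/c22.txt"

kept_files "$demo" '*.obj'    # lists b.jpg, c1.txt, c22.txt
kept_files "$demo" 'c?.txt'   # lists a.obj, b.jpg, c22.txt (? matched only the "1")

rm -rf "$demo"
```

Quoting the pattern matters: without the quotes the shell would expand *.obj itself before du ever sees it.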

Pipes and filters with du

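Now display everything in the current directory, hidden files included, sorted by size. The pattern .[!.]* matches hidden names without matching the parent directory (..):
$ du -sk .[!.]* * | sort -n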

Display one screenful of output at a time, since du often generates more output than can fit on the console / screen:
$ du -h | less

To find the top 3 directories, enter:
$ cd /chroot
$ du -sk * | sort -nr | head -n 3

4620348 var
651972  home
27896   usr
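
The same pipeline can be wrapped in a small function so that the directory and the number of entries become arguments. This is only a sketch: top_usage is a made-up name, and like the command above it skips hidden entries because * does not match them.

```shell
#!/bin/sh
# Hypothetical helper: print the COUNT largest entries directly under DIR.
# Usage: top_usage [DIR] [COUNT]   (defaults: current directory, 3 entries)
top_usage() {
    dir=${1:-.}
    count=${2:-3}
    # Run in a subshell so the caller's working directory is unchanged.
    # -k keeps sizes as plain KB numbers so sort -n can order them;
    # "--" stops du from treating names that start with "-" as options.
    ( cd "$dir" && du -sk -- * | sort -nr | head -n "$count" )
}

# Example: top_usage /chroot 3
```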

Working without du

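Finally, here is a one-liner (without the du command) that prints the 10 largest files with their sizes in megabytes (thanks to dreyser for submitting the idea). The -print0 and -0 options keep file names that contain spaces from being split into separate arguments on the way from find to xargs:
# find /var -type f -print0 | xargs -0 ls -s | sort -rn | awk '{size=$1/1024; printf("%dMb %s\n", size,$2);}' | head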

31Mb /var/crash/_usr_lib_firefox_firefox-bin.1000.crash
22Mb /var/cache/apt/archives/linux-image-2.6.20-16-generic_2.6.20-16.28_i386.deb
16Mb /var/lib/apt/lists/in.archive.ubuntu.com_ubuntu_dists_feisty_universe_binary-i386_Packages
15Mb /var/cache/apt/archives/linux-restricted-modules-2.6.20-16-generic_2.6.20.5-16.28_i386.deb
9Mb /var/cache/apt/srcpkgcache.bin
9Mb /var/cache/apt/pkgcache.bin
8Mb /var/cache/apt/archives/firefox_2.0.0.4+1-0ubuntu1_i386.deb
7Mb /var/cache/apt/archives/linux-headers-2.6.20-16_2.6.20-16.28_i386.deb
5Mb /var/lib/apt/lists/archive.ubuntu.com_ubuntu_dists_feisty_main_binary-i386_Packages
5Mb /var/lib/apt/lists/in.archive.ubuntu.com_ubuntu_dists_feisty_universe_source_Sources

A note about GUI tools

You can use GUI tools for finding the sizes of files and directory trees. Just right-click a file or folder name and then select Properties from the popup menu.
This is good for new users, but it doesn’t provide the scripting facility and fine-grained reporting options that du gives us.

More on du command…

  1. Correction on the exclude. It should read:

    $ du -h --exclude='*.obj'
    $ du -h --exclude='*.jpg'

    Excellent article on du usage, and yes thanks to dreyser for the one-liner.

  2. I use a very similar findTop10, but I let find do the printing.
    find . -xdev -printf '%s %p\n' |sort -nr| head -10

  3. Is there a du for Debian? apt-get install du always ends with “Couldn’t find package”. Is the tool part of another package?

  4. Is it possible to filter “du -sh /home” output, not to show the full path (and just the space occupied)?

    $ du -sh /home

    8,5G /home/

    I need only the “8,5G” part of the output. How to do this?

  5. A simpler way to show directory space usage (can be inserted into a script, added to $PATH and then run from any location)


    $ du -hs | cut -f1

    And in a script:

    #!/bin/sh
    du -hs | cut -f1

    You save this with the name dsu in, let’s say, /usr/bin then

    chmod +x dsu

    and presto, you have a directory space usage command which you can run from anywhere in the system. Not much but hope it helps.

  6. Has anyone seen the output of “du -csh” where the listed sizes do not add up to the grand total?

    For example: du -csh *
    10M File_A
    14M Dir_A
    22M total

    but I expected the total to be 24M.

