agedu: Unix / Linux Command For Tracking Down Wasted Disk Space

Posted on October 20, 2012 · 11 comments · Last updated October 20, 2012


Most sysadmins will run low on disk space at some point. Users demand more space, and you need to free some up. You look for files that are wasting space and delete them or move them to an archive medium. But how do you find the right files to delete so that you recover the most space? Say hello to the agedu tool (pronounced 'age dee you'). It scans a directory tree and produces reports showing how much disk space is used in each directory and subdirectory, and how that usage corresponds to files that were last accessed a long time ago. In other words, this command might help you free up disk space.

du vs agedu

The du command summarizes the disk usage of each file, recursively for directories. agedu is like du, but unlike du, it also distinguishes between large collections of data which are still in use and ones which have not been accessed in months or years - for instance, large archives downloaded, unpacked, used once, and never cleaned up. Where du helps you find what's using your disk space, agedu helps you find what's wasting your disk space.

How does agedu work?

From the man page:

Most Unix file systems, in their default mode, helpfully record when a file was last accessed. Not just when it was written or modified, but when it was even read. So if you generated a large amount of data years ago, forgot to clean it up, and have never used it since, then it ought in principle to be possible to use those last-access time stamps to tell the difference between that and a large amount of data you're still using regularly.

agedu is a program which does this. It does basically the same sort of disk scan as du, but it also records the last-access times of everything it scans. Then it builds an index that lets it efficiently generate reports giving a summary of the results for each subdirectory, and then it produces those reports on demand.
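
Since everything agedu reports rests on last-access times, it is worth checking how your system records them before trusting the results. Mount options such as noatime or relatime change when atime is updated. The following is only a rough sketch: it assumes GNU stat (with a fallback to the BSD syntax), and the scratch file is purely illustrative.

```shell
# Sketch: look at the access and modification times of a scratch file.
scratch=$(mktemp)
echo "sample data" > "$scratch"

if atime=$(stat -c '%X' "$scratch" 2>/dev/null); then
    # GNU stat: %X = last access, %Y = last modification (seconds since epoch)
    mtime=$(stat -c '%Y' "$scratch")
else
    # BSD stat: %a = last access, %m = last modification
    atime=$(stat -f '%a' "$scratch")
    mtime=$(stat -f '%m' "$scratch")
fi
echo "atime=$atime mtime=$mtime"

# Show the mount options of the filesystem holding your home directory,
# if findmnt (util-linux) is available. Look for noatime or relatime.
if command -v findmnt >/dev/null 2>&1; then
    findmnt -n -o OPTIONS --target "${HOME:-/}" || true
fi

rm -f "$scratch"
```

If the filesystem is mounted with noatime, access times are not updated on reads, so agedu's "age" view will be less meaningful there.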

Install agedu

Debian / Ubuntu Linux users can type the following apt-get command to install agedu:
# apt-get install agedu
Sample outputs:

Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
  agedu
0 upgraded, 1 newly installed, 0 to remove and 13 not upgraded.
Need to get 46.4 kB of archives.
After this operation, 131 kB of additional disk space will be used.
Get:1 http://debian.osuosl.org/debian/ squeeze/main agedu amd64 8928-1 [46.4 kB]
Fetched 46.4 kB in 1s (29.8 kB/s)
Selecting previously deselected package agedu.
(Reading database ... 274216 files and directories currently installed.)
Unpacking agedu (from .../agedu_8928-1_amd64.deb) ...
Processing triggers for man-db ...
Setting up agedu (8928-1) ...

FreeBSD users can use the port as follows:
# cd /usr/ports/sysutils/agedu/
# make install clean

OR alternatively use the binary package provided by FreeBSD:
# pkg_add -r agedu

RHEL / CentOS / Fedora / Scientific Linux users can turn on the EPEL repo and type the following yum command to install agedu:
# yum install agedu
Sample outputs:

 
Loaded plugins: auto-update-debuginfo, product-id, protectbase, rhnplugin,
              : subscription-manager
Updating certificate-based repositories.
Unable to read consumer identity
0 packages excluded due to repository protections
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package agedu.x86_64 0:0-2.r9153.el6 will be installed
--> Finished Dependency Resolution
 
Dependencies Resolved
 
================================================================================
 Package         Arch             Version                  Repository      Size
================================================================================
Installing:
 agedu           x86_64           0-2.r9153.el6            epel            47 k
 
Transaction Summary
================================================================================
Install       1 Package(s)
 
Total download size: 47 k
Installed size: 83 k
Is this ok [y/N]: y
Downloading Packages:
agedu-0-2.r9153.el6.x86_64.rpm                           |  47 kB     00:00
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : agedu-0-2.r9153.el6.x86_64                                   1/1
Installed products updated.
  Verifying  : agedu-0-2.r9153.el6.x86_64                                   1/1
 
Installed:
  agedu.x86_64 0:0-2.r9153.el6
 
Complete!
 

How do I use the agedu command?

First, you need to scan your disk and build an index file containing a special data structure. Enter one of the following:
$ agedu -s /home/vivek
$ sudo agedu -s /var
$ agedu -s /nas05

Sample outputs:

Built pathname index, 103484 entries, 8642180 bytes of index
Faking directory atimes
Building index
Final index file size = 23654856 bytes
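
Note that, by default, each scan writes its index to agedu.dat in the current directory, so running several scans from the same directory leaves you with only the last index. Here is a minimal sketch for keeping one index per tree, assuming the -f option described in the agedu man page. The helper names index_for and scan_tree and the agedu-*.dat naming scheme are my own illustration, not part of agedu.

```shell
# Map a directory path to a hypothetical per-tree index file name,
# e.g. /home/vivek -> agedu-home-vivek.dat
index_for() {
    # Strip the leading slash, then turn the remaining slashes into dashes
    name=$(printf '%s' "${1#/}" | tr '/' '-')
    printf 'agedu-%s.dat\n' "${name:-root}"
}

# Scan a tree into its own index file (requires agedu to be installed)
scan_tree() {
    agedu -f "$(index_for "$1")" -s "$1"
}

# Usage sketch:
#   scan_tree /var
#   agedu -f "$(index_for /var)" -w
```

The same -f option then has to be passed to every later query so that agedu reads the matching index.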

In the examples above, each command tells agedu to scan a directory tree (/home/vivek, /var, or /nas05) and build an index for it. The next logical step is to query the index by typing the following command:
$ agedu -w
Sample outputs:

Using Linux /proc/net magic authentication
URL: http://127.0.0.1:42113/

Fire up a graphical web browser and type the following URL:
http://127.0.0.1:42113/
Sample outputs:

Fig.01: agedu report (Unix / Linux: Correlate Disk Usage With Last-access Times)


You can see a graphical representation of the disk usage in /home/vivek and its immediate subdirectories, with varying colours used to show the difference between disused and recently-accessed data. Feel free to click on any subdirectory to descend into it and see a report for its subdirectories in turn; click on parts of the pathname at the top of any page to return to higher-level directories. To terminate this mode, just press [CTRL]+[D].

You can set the network address and port number on which agedu should listen when running its web server:
$ agedu -w --address addr[:port]
$ agedu -w --address 192.168.1.5:9000

You can also control access to the web pages it serves:
$ agedu -w --address 192.168.1.5:9000 --auth basic
Sample outputs:

Username: agedu
Password: 696cv6r297upqzmt
URL: http://192.168.1.5:9000/

agedu will normally make up a username and password for you, but you can set your own:
$ agedu -w --address 192.168.1.5:9000 --auth basic --auth-fd 0
Then type the authentication details on standard input:

vivek:cAnd1Bar

The authentication details should consist of the username (vivek), followed by a colon (:), followed by the password (cAnd1Bar), followed immediately by end of file (press [CTRL]+[D]).
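
If you would rather not type the credentials each time, one option is to keep them in a private file and hand that file to agedu on a spare file descriptor. This is only a sketch: the write_creds helper and the use of descriptor 3 are my assumptions, and the username and password are the sample values from above. Note that the file must not end with a newline, since the password has to be followed immediately by end of file.

```shell
# Write "user:password" to a file readable only by its owner, with no
# trailing newline (printf without \n).
write_creds() {
    # $1 = file, $2 = username, $3 = password
    ( umask 077 && printf '%s:%s' "$2" "$3" > "$1" )
}

creds=$(mktemp)
write_creds "$creds" vivek 'cAnd1Bar'

# Usage sketch (needs agedu and an existing index, so left commented out):
#   agedu -w --address 192.168.1.5:9000 --auth basic --auth-fd 3 3< "$creds"
```

Using a descriptor other than 0 keeps standard input free, so [CTRL]+[D] should still be available to stop the web server.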

Can I access agedu reports using terminal mode?

Type the following command (replace /home/wwwroot with actual path):
$ agedu -t /home/wwwroot
Sample outputs:

53569312    /home/wwwroot
30427672    /home/wwwroot/logs
83997004    /home/wwwroot/images

You will get a summary of the disk usage in /home/wwwroot and its subdirectories. The output is in much the same format as that of the du command. To see how much old data is there, pass the -a option to show only files last accessed a certain length of time ago. For example, to see only files which haven't been looked at in twelve months or more:
$ agedu -t /home/wwwroot -a 12m
Sample outputs:

2220        /home/wwwroot
2236        /home
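
Because the text report uses the same "size path" layout as du, you can post-process it with the usual tools. The following is a small sketch that lists the biggest entries first. The largest_first helper is my own name, not an agedu feature.

```shell
# Sort "size path" lines numerically, largest first, and keep the top N
# entries (default 10).
largest_first() {
    sort -rn | head -n "${1:-10}"
}

# Usage sketch (requires agedu and an existing index):
#   agedu -t /home/wwwroot -a 12m | largest_first 5
```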

How do I see only the disk space taken by MP3 files or .AVI files?

Run one of the following commands in the current directory:
$ agedu -s . --exclude '*' --include '*.mp3'
$ agedu -s . --exclude '*' --include '*.avi'

To view reports, run:
$ agedu -w

The first scan command above causes everything to be omitted from the scan, but then the MP3 files to be put back in; the second does the same for AVI files. If you then wanted only a subset of those MP3s, you could exclude some of them again by adding, say, --exclude-path './steviewonder/*':
$ cd ~/Downloads/drm-free/music
$ agedu -s . --exclude '*' --include '*.mp3' --exclude-path './steviewonder/*'
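
To sanity-check a filtered scan, you can total the same files with find and du, neither of which needs to read the file contents. This is just a rough cross-check under my own assumptions (the kb_matching helper is hypothetical, and du's idea of a file's size may differ slightly from agedu's):

```shell
# Print the total size, in kilobytes, of regular files under a directory
# whose names match a pattern.
kb_matching() {
    # $1 = directory, $2 = filename pattern such as '*.mp3'
    find "$1" -type f -name "$2" -exec du -k {} + |
        awk '{ s += $1 } END { print s + 0 }'
}

# Usage sketch:
#   kb_matching ~/Downloads/drm-free/music '*.mp3'
```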

How do I delete files and reclaim disk space again?

In my case, /home/vivek/iso-images/ was taking up too much space. I found old vmware-workstation packages, Linux / Unix ISO images, and binary files there. I deleted all those old unwanted files and recovered 16.5GB of disk space using nothing but the simple rm command:
$ rm ~/iso-images/vmware*
$ rm ~/iso-images/centos-4* ~/iso-images/centos-5*

How do I remove agedu index file?

Use the ls command to see the size of agedu index file:
$ ls -lh agedu.dat
Sample outputs:

-rw------- 1 vivek vivek 23M Oct 21 01:55 agedu.dat

To remove index file, enter:
$ agedu -R
OR
$ rm agedu.dat

You can also put -R at the end of a command line to indicate that agedu should delete its index file after it finishes performing other operations, such as displaying web pages:
$ agedu -w -R

This blog post gives you a quick tour of what agedu does. The command has many more options for complex situations, along with the usual array of Unix command-line options. I recommend that you read the man page for more information, or visit the project home page to grab the latest source code:
$ man agedu


If you've got a favorite command or hack I didn't mention, let us know about it in the comments below.



  • David

    Wow, great post, very detailed. Thanks.

  • Ashok

    Good one !

  • Sumit Goel

    Indeed Nice Post!

    Just wanted to check whether it is possible to install agedu on one server and use it to fetch the information from other servers in the same network?

    Thank you.

  • nixCraft

    Yes, you can generate the index (called a dump) on Windows (use ageduscan.exe) or on a Linux or Unix server:

    ### on server42 ###
    agedu -s /path/to/dir1
    agedu -D > server42.dump
    rm agedu.dat
    

    Copy the server42.dump file to another server and load it there using the following syntax:

    ### on wks01 ###
    scp user@server42:/path/to/server42.dump /path/to/local/
    agedu -L < /path/to/local/server42.dump

    See the man page for more info :P

  • gouthamk10

    Hi
    I just want to see agedu with the root directory.

    thanks
    goutham .k

  • Ed Greenberg

    I love it. Thank you.

    I was wondering about why stuff I hadn’t thought of in years had shown up as current, but I believe that by running grep over the directory structure, I might have reset all those access times. :(

    Excellent utility though.

    Ed

  • Anthony G. Basile

    Nice. I just added agedu to Gentoo’s mirrors. I’m surprised such a nice little tidbit wasn’t already there.

  • matt

    I’m getting 403 errors when trying to visit the url…

    Anyone able to point me into the right direction of solving this?

    Thanks!

  • Anonymous

    try this: "--auth none" (without quotes)

  • setevoй червь™

    Do I need to run # agedu -s every time before I want to see disk space? Can I use cron for it? Thanks!

  • Syafeuq

    Why does it not bind to my server IP? URL: http://127.0.0.1:41520/
