Linux wget: Your Ultimate Command Line Downloader

by nixcraft · 15 comments

It is common practice to manage UNIX/Linux/BSD servers remotely over an ssh session. While managing servers, you often need to download software or other files for installation, or the latest ISO image of a Linux distribution (or even MP3s). These days there are plenty of GUI downloaders for the X Window System, such as:

  • d4x: http://www.krasu.ru/soft/chuchelo
  • kget: KDE download manager
  • gwget2: GNOME 2 wget front-end

However, when it comes to the command line (shell prompt), wget, the non-interactive downloader, rules. It supports the HTTP, HTTPS, and FTP protocols, along with authentication and tons of options. Here are some tips to get the most out of it:

Download a single file using wget

$ wget http://www.cyberciti.biz/here/lsst.tar.gz
$ wget ftp://ftp.freebsd.org/pub/sys.tar.gz
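
By default wget saves each file in the current directory under the last path component of the URL. The following is a minimal sketch, not part of wget itself, that predicts that name so a script can check for an existing copy first. The helper name target_name is hypothetical, and the sketch assumes a URL with no query string and no redirect that changes the file name:

```shell
#!/bin/bash
# Hypothetical helper: strip everything up to the last slash of a URL.
# Assumes no query string and no server-side redirect that renames the file.
target_name() {
    echo "${1##*/}"
}

url=http://www.cyberciti.biz/here/lsst.tar.gz
file=$(target_name "$url")

if [ -e "$file" ]; then
    echo "$file already exists here"
else
    echo "would download $url as $file"   # real call: wget "$url"
fi
```

The download itself is left as a comment so the sketch can be tried without touching the network.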

Download multiple files on the command line using wget

$ wget http://www.cyberciti.biz/download/lsst.tar.gz ftp://ftp.freebsd.org/pub/sys.tar.gz ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm

OR

i) Create a variable that holds all URLs, then use a bash for loop to download all files:
$ URLS="http://www.cyberciti.biz/download/lsst.tar.gz ftp://ftp.freebsd.org/pub/sys.tar.gz ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm http://xyz.com/abc.iso"

ii) Use the for loop as follows:
$ for u in $URLS; do wget "$u"; done

iii) However, a better way is to put all URLs in a text file and use the -i option to wget to download all files:

(a) Create a text file using vi:
$ vi /tmp/download.txt

Add the list of URLs:
http://www.cyberciti.biz/download/lsst.tar.gz
ftp://ftp.freebsd.org/pub/sys.tar.gz
ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm
http://xyz.com/abc.iso
(b) Run wget as follows:
$ wget -i /tmp/download.txt

(c) Force wget to resume a download
You can use the -c option to wget. This is useful when you want to finish a download that a previous instance of wget started but could not complete, for example because the network connection was lost. In that case, add the -c option as follows:
$ wget -c http://www.cyberciti.biz/download/lsst.tar.gz
$ wget -c -i /tmp/download.txt
Please note that not all FTP/HTTP servers support the download resume feature.
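
The list-file steps above can also be scripted instead of typed into vi. This is a minimal sketch under a few assumptions: the URLs are the article's placeholder examples, the list goes to a temporary file rather than /tmp/download.txt, and the actual wget call is left as a comment:

```shell
#!/bin/bash
# Write the URL list non-interactively; the URLs are the article's placeholders.
list=$(mktemp)
cat > "$list" <<'EOF'
http://www.cyberciti.biz/download/lsst.tar.gz
ftp://ftp.freebsd.org/pub/sys.tar.gz
ftp://ftp.redhat.com/pub/xyz-1rc-i386.rpm
http://xyz.com/abc.iso
EOF

# Count the non-empty lines as a sanity check before starting a long batch.
count=$(grep -c . "$list")
echo "$count URLs queued in $list"

# To fetch them with resume support (not executed here):
#   wget -c -i "$list"
```

Counting the entries first is a cheap way to catch an empty or truncated list before a long unattended run.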

Force wget to download all files in the background and log the activity to a file:

$ wget -cb -o /tmp/download.log -i /tmp/download.txt

OR

$ nohup wget -c -o /tmp/download.log -i /tmp/download.txt &

nohup runs the given command (in this example wget) with hangup signals ignored, so that the command can continue running in the background after you log out.
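
Once a download runs in the background, it helps to remember its process ID so you can check on it later. Below is a rough sketch of that idea. The function names start_bg_download and is_running, and the WGET override variable, are hypothetical conveniences and not part of wget or nohup:

```shell
#!/bin/bash
# Hypothetical helper: start a resumable batch download that survives logout.
# WGET may be overridden (e.g. for a dry run); it defaults to plain wget.
start_bg_download() {
    local list=$1 log=$2
    nohup "${WGET:-wget}" -c -o "$log" -i "$list" >/dev/null 2>&1 &
    echo $!    # PID of the background job
}

# Report whether a PID is still alive (kill -0 sends no signal, it only checks).
is_running() {
    kill -0 "$1" 2>/dev/null
}

# Usage (not executed here):
#   pid=$(start_bg_download /tmp/download.txt /tmp/download.log)
#   is_running "$pid" && tail -n 5 /tmp/download.log
```

Redirecting nohup's own output to /dev/null keeps it from creating a nohup.out file; wget's progress still goes to the log named with -o.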

Limit the download speed to a given number of bytes or kilobytes per second

This is useful when you download a large file, such as an ISO image. Recently an admin started downloading the SuSE Linux DVD on a production server for evaluation purposes. Soon wget was eating up all the bandwidth. No need to predict the end result of such a disaster.
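
Before picking a cap, it can be worth estimating how long the transfer will take at that rate. The following is a back-of-the-envelope sketch only. The helper name eta_seconds is hypothetical, and the 4 GiB image size is an assumed figure for illustration, not the size of any particular DVD image:

```shell
#!/bin/bash
# Estimate how long a transfer takes at a given rate cap.
# Arguments: size in KiB, rate in KiB/s. Integer arithmetic, so the result rounds down.
eta_seconds() {
    echo $(( $1 / $2 ))
}

# Assumed example: a 4 GiB image (4194304 KiB) capped at 50 KiB/s.
secs=$(eta_seconds 4194304 50)
printf '%d h %d min\n' $(( secs / 3600 )) $(( secs % 3600 / 60 ))
```

At that cap the assumed image needs roughly 23 hours, which is a reminder to combine a rate limit with -c and background mode for very large files.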
$ wget -c -o /tmp/susedvd.log --limit-rate=50k ftp://ftp.novell.com/pub/suse/dvd1.iso

The above command limits the retrieval rate to 50KB/s. Use the m suffix for megabytes (--limit-rate=1m).

It is also possible to specify a disk quota for automatic retrievals to avoid a disk DoS attack. The following command will be aborted once the quota (100MB+) is exceeded:
$ wget -cb -o /tmp/download.log -i /tmp/download.txt --quota=100m

F) Use an HTTP username/password on an HTTP server:
$ wget --http-user=foo --http-password=bar http://cyberciti.biz/vivek/csits.tar.gz

G) Download all mp3 or pdf files from a remote FTP server:
Generally you can use shell special characters, aka wildcards, such as *, ?, and [] to specify selection criteria for files. The same can be used with FTP servers while downloading files.
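
One caveat when typing such patterns at a shell prompt: the shell gets the first chance to expand them. It normally leaves a pattern alone when nothing local matches, but bash options such as nullglob or failglob change that, so quoting the URL is the safer habit. A small sketch of the difference, using a scratch directory and a made-up file name report.pdf:

```shell
#!/bin/bash
# Demonstrate why a wildcard is safer inside quotes.
# A scratch directory with one local PDF stands in for "files that happen to match".
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch report.pdf

unquoted=$(echo *.pdf)      # the shell expands this to the local file name
quoted=$(echo '*.pdf')      # quotes keep the pattern literal

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# With wget, the quoted form hands the pattern to wget unchanged (not executed here):
#   wget 'ftp://somedom.com/pub/downloads/*.pdf'
```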
$ wget 'ftp://somedom.com/pub/downloads/*.pdf'
OR
$ wget -g on 'ftp://somedom.com/pub/downloads/*.pdf'
(The -g on form comes from older wget releases; current versions turn FTP globbing on automatically when the URL contains a wildcard.)

H) Use aget when you need multithreaded HTTP download:
aget fetches HTTP URLs in a manner similar to wget, but splits the retrieval into multiple segments to increase download speed. It can be many times faster than wget in some circumstances (it is much like FlashGet on MS Windows, but with a CLI):
$ aget -n=5 http://download.soft.com/soft1.tar.gz

The above command will download soft1.tar.gz in 5 segments.

Please note that the wget command is available on Linux and UNIX/BSD-like operating systems.

See the wget(1) man page for more advanced options.

{ 15 comments }

1 JZA 11.01.06 at 10:49 pm

I remember there was a way to download all .txt files from a URL, however I haven’t been able to find it. You mention that wget http://../../*.txt will do it, however it generates an error saying:
HTTP request sent, awaiting response… 404 Not Found

2 nixcraft 11.01.06 at 10:56 pm

Yes. It can be done provided that the remote FTP/web server supports this feature. Due to abuse/security concerns, or to avoid server load, most remote systems disable this feature.

Try the -i option as described above to fetch a list of files from a text file.
wget -i list.txt

3 dushyant 01.03.07 at 11:32 am

if some site not giving ( showing its full URL )and also talking http passwd and user login than
how to get the data directory and how can i use these with wget command
plz mail me on my id i cant remember this site name .

4 mangesh 07.13.07 at 7:42 am

download recursively, resume with passive

wget -c -r --passive-ftp -nH ftp://:@//ftp_dir/*.zip

5 akhil 12.04.07 at 1:14 pm

wget is not working for HTTPS protocol..please tell me y?

6 JoshuaTaylor.info 05.09.08 at 4:46 am

Lol, these instructions are great for my dedicated server for spreading out my RS.com downloads.

Cheers!

7 Interesting 11.20.08 at 2:54 pm

Very interesting thanks for the information on Wget my favorite way of uploading to my dedicated servers. Be it an over complicated and complex way but my favorite none the less

8 sara 02.17.09 at 6:10 am

thanks alot for your help and usefull informations

9 David D. 03.08.09 at 10:03 pm

Is there a way to specify the filename wget writes to?
Would the following do the trick?
“wget $options $url > customfilenameiwant.otherextension”,

10 Vivek Gite 03.08.09 at 10:47 pm

Try the following to save file.pdf as output.pdf:
wget -O output.pdf http://example.com/file.pdf

11 Lakshmipathi.G 04.29.09 at 6:42 am

Really useful info….exactly what i wanted …. thanks

12 mimo 05.02.09 at 6:57 am

Nice hints!

What I am missing (and still searching for) is –> how to limit file size of a file to download? One of my current bash scripts is ’spidering’ certain sites for certain archive files. IF encountered a 5GB *.zip file, It would happily download it, which I don’t want. So: what would be a good practise to limit downloads to, say, 2 MB?

Cheers

13 Neela.V 09.15.09 at 6:31 am

Hi, i need to retrieve a file from an http site, which asks for username and password to access the site. How can i retrieve the file using wget? please guide me.

14 vivian 09.18.09 at 4:46 am

Hello, i am not able to use wget in my ubuntu system.. whenever i try to download anything for example,
sudo wget http://www.cyberciti.biz/here/lsst.tar.gz
what i get is
--2009-09-18 10:15:15-- http://www.cyberciti.biz/here/lsst.tar.gz
Resolving www.cyberciti.biz... 74.86.48.99
Connecting to www.cyberciti.biz|74.86.48.99|:80... ^C
The above is same for any website.
i am able to do sudo apt-get install and this is the evidence that i am connecting to the internet. But for wget case i am not able to. Is there any config settings that i need to do?

15 Palash 10.15.09 at 8:28 am

Many many thanks for give important tips of wget.
