
Wget recursive download options

wget --recursive -np -nc -nH --cut-dirs=4 --random-wait --wait 1 -e robots=off https://site.example/aaa/bbb/ccc/ddd/

This downloads the files into the directory where you ran the command. To download recursively over FTP instead, change https:// to ftp:// and point the URL at the FTP directory.

--recursive

download recursively, recreating the server's directory tree on your PC

--recursive --level=1

recurse, but follow links only one level deep from the starting URL, so subdirectories are not descended into (the default depth is 5)

-Q 1g

overall download quota (--quota): stop starting new downloads once 1 GB has been fetched in total. The file in progress is always completed, so the total can exceed the quota.

-np

Never get parent directories (sometimes a site will link upwards)

-nc

no clobber – don’t re-download files you already have

-nd

no directory structure on download (put all files in one directory: the current one, or the one given with -P)

-nH

don’t create a top-level directory named after the host (here site.example/) on your PC

-A

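only accept files whose names match a comma-separated list of suffixes or glob patterns, for example -A 'pdf,*.zip' (quote the list so the shell does not expand it)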
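
To make the suffix-versus-pattern distinction concrete, here is a minimal sketch in plain shell. An element of the -A list that contains a wildcard character (*, ?, [ or ]) is treated as a pattern, and any other element is treated as a file name suffix. The helper name matches_accept_list is hypothetical and only imitates that rule; it is not Wget's own code.

```shell
#!/bin/sh
# Illustration only: a hypothetical helper that mimics how an -A list is read.
matches_accept_list() {
    name=$1
    list=$2
    set -f                      # keep the shell from expanding * and ? while splitting
    old_ifs=$IFS
    IFS=,
    set -- $list                # one positional parameter per list element
    IFS=$old_ifs
    set +f
    for item in "$@"; do
        case $item in
            *\**|*\?*|*\[*|*\]*) pattern=$item ;;    # has a wildcard: use as a pattern
            *)                   pattern="*$item" ;; # no wildcard: treat as a suffix
        esac
        case $name in
            $pattern) return 0 ;;
        esac
    done
    return 1
}

matches_accept_list report.pdf 'pdf,*.zip' && echo "report.pdf accepted"
matches_accept_list notes.txt  'pdf,*.zip' || echo "notes.txt rejected"
```

With the list 'pdf,*.zip', report.pdf matches the suffix pdf and notes.txt matches nothing, so only the first would be kept.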

--cut-dirs=4

don’t recreate the hierarchy of directories above the desired directory on your PC. Set the number to the count of leading directories in the URL path (here aaa/bbb/ccc/ddd is four).
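
Counting the directories by hand is easy to get wrong for long paths, so the count can be derived from the URL itself. The sketch below assumes the URL points at a directory (not a file) and uses a hypothetical helper named count_dirs; neither the name nor the approach comes from Wget.

```shell
#!/bin/sh
# Hypothetical helper: count the directory components in a URL's path.
count_dirs() {
    path=${1#*://}                # drop the scheme: site.example/aaa/bbb/ccc/ddd/
    case $path in
        */*) path=${path#*/} ;;   # drop the host: aaa/bbb/ccc/ddd/
        *)   path='' ;;           # host only, no path
    esac
    path=${path%/}                # drop a trailing slash
    set -f                        # no glob expansion while splitting
    old_ifs=$IFS
    IFS=/
    set -- $path                  # one positional parameter per component
    IFS=$old_ifs
    set +f
    echo "$#"
}

url='https://site.example/aaa/bbb/ccc/ddd/'
echo "--cut-dirs=$(count_dirs "$url")"    # prints --cut-dirs=4
```

The result can be passed straight to Wget, for example --cut-dirs="$(count_dirs "$url")".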

-e robots=off

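Many sites publish a robots.txt file asking automated clients to stay out of certain paths, often to stop them from mindlessly consuming huge amounts of data. Wget obeys it by default during recursive downloads; -e robots=off tells Wget to ignore it. Nothing is sent to the server, so this does not make the download look any more human.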
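
The -e flag runs a .wgetrc-style command, so the same setting can also live in a startup file. The sketch below writes such a file with the robots, wait, and random_wait commands and prints the Wget command that would use it. It assumes a Wget build that supports the --config option; the temporary file is only for illustration.

```shell
#!/bin/sh
# Write a small startup file holding the same settings as
# -e robots=off --wait 1 --random-wait.
rc=$(mktemp)
cat > "$rc" <<'EOF'
# .wgetrc-style commands, one per line
robots = off
wait = 1
random_wait = on
EOF

# Print the command instead of running it; remove "echo" to download.
echo wget --config="$rc" --recursive -np https://site.example/aaa/bbb/ccc/ddd/
```

Keeping the settings in a file saves retyping them for every download from the same site.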

--random-wait

To avoid bursts of requests (which can get you auto-banned from downloading), we politely pause between file downloads. --random-wait varies the length of each pause randomly around the --wait value.

--wait 1

sets the base wait to 1 second, so with --random-wait each pause falls between 0.5 and 1.5 seconds and averages about 1 second. This helps avoid anti-leeching measures.
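
As a quick worked example of the arithmetic, --random-wait draws each pause from between 0.5 and 1.5 times the --wait value. The helper name wait_range below is hypothetical and only prints that range for a given base wait.

```shell
#!/bin/sh
# Hypothetical helper: print the range of pauses for a given --wait value.
wait_range() {
    awk -v w="$1" 'BEGIN { printf "%.1f-%.1f seconds\n", 0.5 * w, 1.5 * w }'
}

wait_range 1    # 0.5-1.5 seconds
wait_range 4    # 2.0-6.0 seconds
```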
