Showing posts from April, 2014

bulk download from archive.org cache - wget

--random-wait - vary the wait between requests so the download goes slower (wget applies this on top of a --wait value)
-r - recurse into all the subfolders/pages
-p - get the page requisites (images and other files) needed to display the page
-e robots=off - ignore robots.txt rules
-E - save files that have HTML content with an .html extension
-np - don't recurse up to parent folders
-U "Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14" - user agent string (Opera 12.14)
--no-check-certificate - do not check SSL certificates for HTTPS connections

----------------------archive.org.bat----------------------
REM random wait, recursive, get all page requisites needed to display the page,
REM ignore robots.txt rules, save HTML content with an .html extension,
REM don't recurse up to parent folders, Opera user agent,
REM do not check SSL certificates (for HTTPS).
wget --random-wait -r -p -e robots=off -E -np -U "Opera/9.80 (Windows NT 6.0) Presto/2.12.388 Version/12.14" --no-check-certificate %1
---------------------- ---------------------- ----------------------

Run: archive.org.bat  https://web.