I want to get all the files for a given website from archive.org. Reasons might include:
- the original author did not archive their own website and it is now offline, so I want to make a public cache of it
- I am the original author of some website and lost some content. I want to recover it
- ...
How do I do that?
Keep in mind that the archive.org Wayback Machine is very special: links in an archived page do not point to the archive itself, but to the original web page, which might no longer be there. JavaScript is used client-side to update the links, so a trick like a recursive wget won't work.
gem install wayback_machine_downloader. Run wayback_machine_downloader with the base URL of the website you want to retrieve as a parameter: wayback_machine_downloader http://example.com. More information: https://github.com/hartator/wayback_machine_downloader – Hartator Aug 10 '15 at 06:32
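For anyone who would rather script this by hand than install the gem, here is a rough Python sketch of one possible approach. It is not taken from the tool above and makes some assumptions: that the Wayback Machine's CDX index at web.archive.org/cdx/search/cdx accepts the query parameters shown, and that putting `id_` after the timestamp in a snapshot URL returns the page as originally captured, without rewritten links. The function names (`cdx_query_url`, `raw_snapshot_url`, `parse_cdx_rows`, `list_snapshots`) are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine CDX index.
CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(site):
    """Build a CDX query listing captured URLs under `site`."""
    params = {
        "url": site + "/*",             # everything under the base URL
        "output": "json",               # JSON rows instead of plain text
        "fl": "timestamp,original",     # only the fields we need
        "filter": "statuscode:200",     # skip redirects and errors
        "collapse": "urlkey",           # one capture per distinct URL
    }
    return CDX_ENDPOINT + "?" + urlencode(params)


def raw_snapshot_url(timestamp, original):
    """URL of the capture as archived; `id_` asks for unrewritten content."""
    return f"https://web.archive.org/web/{timestamp}id_/{original}"


def parse_cdx_rows(rows):
    """Drop the header row that the JSON output starts with."""
    return [tuple(row) for row in rows[1:]]


def list_snapshots(site):
    """Query the index over the network and return (timestamp, original) pairs."""
    with urlopen(cdx_query_url(site)) as response:
        return parse_cdx_rows(json.load(response))


if __name__ == "__main__":
    for timestamp, original in list_snapshots("example.com"):
        print(raw_snapshot_url(timestamp, original))
```

Each printed URL could then be fetched and saved under a local path derived from the original URL. Because the files are listed from the index rather than discovered by following links, the link-rewriting problem described in the question does not come up.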