Zombse

The Zombie Stack Exchanges That Just Won't Die

View the Project on GitHub anjackson/zombse

Has archive.org removed or deleted sites from its database?

I've seen many archive.org pages that I can no longer access because, after the original site was deleted, a new owner took over the domain and added a robots.txt exclusion that now blocks the archived copies...

InquilineKea


Answer by alxp

There seem to be a few threads about this in archive.org's forums, such as this example.

It seems to be an issue that they know about but have chosen to leave as it is. My guess is that there would be no reliable way to tell that a 'new owner' is not just the same group that happened to change domain registrars, etc., without devoting many more resources than they have available to checking all of the domains in their archive.


Answer by Henry Mensch

Probably. From IA's own T&Cs:

While we collect publicly available Internet documents, sometimes authors and publishers express a desire for their documents not to be included in the Collections (by tagging a file for robot exclusion or by contacting us or the original crawler group). If the author or publisher of some part of the Archive does not want his or her work in our Collections, then we may remove that portion of the Collections without notice.

I wouldn't be surprised if they collect first and ask permission later.
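
For readers unfamiliar with the "robot exclusion" tagging mentioned in the quoted terms, here is a rough sketch of how a robots.txt rule can be evaluated. It assumes the archive's crawler is matched under the user agent name `ia_archiver`, which is an assumption here and not something the quoted terms state. The robots.txt content, the `is_blocked` helper, and the example.com URL are all hypothetical, and the check uses Python's standard `urllib.robotparser`, not whatever archive.org actually runs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that a new domain owner might publish.
# The "ia_archiver" user agent name is an assumption for illustration.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt forbids user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://example.com/old-page.html"
    print("ia_archiver blocked:", is_blocked(ROBOTS_TXT, "ia_archiver", page))
    print("other bot blocked:", is_blocked(ROBOTS_TXT, "SomeOtherBot", page))
```

The point of the sketch is only that the rule is evaluated against whatever robots.txt the domain serves today, so a later owner's file can affect pages that an earlier owner published.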
