Coping with spam in web archivers

There is a known problem with the Wayback Machine (WM) regarding domain name ownership: often a domain changes hands, and the new owner either puts a spam blog on it or adds a robots.txt that forbids crawling, after which WM promptly deletes the entire website history.

Are there any options to combat this problem? For example, are there archive crawlers that don't retroactively delete content?

EDIT: This question is a spin-off of Preserving website content. This one is about "how to deal with spam", and the other one is about "what websites we could/should preserve".

wizzard0

web-archiving
crawling
digital-born
data-curation

Comments

Ben Fino-Radin: essentially a duplicate of your question here: http://digitalpreservation.stackexchange.com/questions/87/preserving-website-content
wizzard0: No, this is not a duplicate. I specifically split that question into two at the suggestion of @Nick Krabbenhoeft.
Andy Jackson: This question could do with a heavy rewrite. It is true that archive.org applies robots.txt retroactively. However, I am pretty sure this just blocks access and does not delete the history. More importantly, this is a matter of policy and is not inherent to the tool. My Wayback Machine does not apply robots.txt in this way.
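
To illustrate the distinction drawn in the last comment, here is a minimal sketch of robots.txt being applied as an access-time policy rather than as a deletion. It is not how archive.org or any particular Wayback Machine deployment is implemented. The function names, the `retroactive` flag, the `example-archiver` user agent, and the sample URL are all hypothetical; only Python's standard `urllib.robotparser` module is real.

```python
from urllib.robotparser import RobotFileParser


def allowed_by(robots_txt: str, url: str, agent: str = "example-archiver") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`.

    The agent name is a placeholder, not a real crawler's user agent.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


def can_replay(url: str, current_robots: str, retroactive: bool) -> bool:
    """Hypothetical access-time check for an already stored capture.

    The capture itself is never modified or deleted here. The policy only
    decides whether it is served to the person asking for it.
    """
    if not retroactive:
        # Policy A: ignore the robots.txt currently published on the domain.
        return True
    # Policy B: hide captures that the *current* robots.txt disallows.
    return allowed_by(current_robots, url)


# Hypothetical robots.txt published by a new domain owner.
NEW_OWNER_ROBOTS = "User-agent: *\nDisallow: /\n"
URL = "http://example.com/old-page.html"

print(can_replay(URL, NEW_OWNER_ROBOTS, retroactive=True))   # False
print(can_replay(URL, NEW_OWNER_ROBOTS, retroactive=False))  # True
```

Under this assumption, switching `retroactive` off makes the old captures visible again, because nothing was removed from storage in the first place.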