Zombse

The Zombie Stack Exchanges That Just Won't Die

What should go into a decision about how many copies of digital objects libraries should keep for digital preservation?

Is there a right answer for how many copies an organization should keep of its digital content? Given the number of issues involved, the answer is likely no. Clearly, keeping only one copy is a significant liability. Two is better, and three is nice in that you can now break a tie if one of the copies fails a fixity check.
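To make the tie-breaking point concrete, here is a minimal Python sketch of a majority vote over fixity values. It assumes SHA-256 as the checksum and holds the copies in memory; the function names `sha256_of` and `majority_digest` are made up for illustration and do not come from any particular preservation system.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional, Tuple


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 hex digest, used here as the fixity value."""
    return hashlib.sha256(data).hexdigest()


def majority_digest(copies: Dict[str, bytes]) -> Tuple[Optional[str], List[str]]:
    """Return (winning digest, names of the copies that disagree with it).

    The winning digest is None when no digest is held by a strict majority
    of the copies. In that case every copy is reported as suspect.
    """
    if not copies:
        raise ValueError("at least one copy is required")
    digests = {name: sha256_of(data) for name, data in copies.items()}
    digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        # No strict majority, e.g. two copies that disagree with each other.
        return None, sorted(digests)
    return digest, sorted(n for n, d in digests.items() if d != digest)
```

With two copies that disagree, the function cannot say which one is intact. With three copies, a single damaged copy is outvoted by the other two.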

At some point you get diminishing returns from having more copies. So the question at hand is: what factors should go into deciding how many copies an organization should keep?

Trevor Owens

Answer by Henry Mensch

Value of the data and resources available should drive some of this. Set a baseline for ordinary items, and understand which items are exceptional and need more protection.

For my digital library (scientific data for a research lab) I take a full backup every ninety days and store it offsite. This is in addition to daily incrementals, and these are "forever" backups: the tapes are never recycled. You never know when bit rot will set in; having regular copies to refer to gives you a fighting chance of restoring to a known good state.
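As a rough illustration of why never-recycled backups help, the Python sketch below looks through retained backup generations for the newest one that still matches a checksum recorded when the object was known to be good. The `BackupRecord` structure and the `latest_known_good` function are hypothetical and only stand in for whatever catalogue a real backup system keeps.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class BackupRecord:
    """One retained backup generation (hypothetical structure)."""
    taken_on: date
    digest: str  # checksum of the object as stored in this backup


def latest_known_good(
    records: List[BackupRecord], expected_digest: str
) -> Optional[BackupRecord]:
    """Return the newest backup whose digest matches the expected value.

    Returns None when no retained backup matches.
    """
    matching = [r for r in records if r.digest == expected_digest]
    return max(matching, key=lambda r: r.taken_on, default=None)
```

If the tapes were recycled, the older matching generations would be gone, and corruption that crept in before the oldest surviving backup could not be undone.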

Answer by AaronC

The answer is Four. Original, local backup, offsite backup, offline backup.

Hah, if only it were that easy. But that is the approach we've taken. I tend to think of it as layers of infrastructure rather than copies. We put one file in one directory and it is replicated across our entire system. There are four copies, but really it's just one object sitting in our digital preservation environment.
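The "one object, replicated across layers" idea can be sketched roughly as follows. This is only an illustration in Python: plain directories stand in for the local, offsite, and offline layers, SHA-256 is an assumed checksum, and `file_digest` and `replicate` are invented names, not a description of our actual system.

```python
import hashlib
import shutil
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def replicate(source: Path, targets: List[Path]) -> Dict[Path, bool]:
    """Copy source into each target directory and verify every copy.

    Returns a mapping from each copied file to True if its digest matches
    the digest of the source, False otherwise.
    """
    expected = file_digest(source)
    results: Dict[Path, bool] = {}
    for directory in targets:
        directory.mkdir(parents=True, exist_ok=True)
        copy = directory / source.name
        shutil.copy2(source, copy)
        results[copy] = file_digest(copy) == expected
    return results
```

Calling `replicate(original, [local_dir, offsite_dir, offline_dir])` would give the four copies described above: the original plus three verified replicas.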
