Zombse

The Zombie Stack Exchanges That Just Won't Die


What is an inexpensive way to back up a digital collection?

My collecting institution has a small digital collection. Because the collection is rather small, backing items up to DuraCloud or a LOCKSS system is not feasible. Items are currently backed up on the institution's server, but there are no redundant copies of the materials. DuraCloud and LOCKSS back up items multiple times in multiple locations, but both cost about $4k per year.

Raminta


Answer by jeff

Others have pointed out that there are a lot of similarities between backing up a "digital collection" and backing up just about any kind of data.

Some ways in which your situation may differ from a general data backup challenge:

Here's an overview of options, without going into too much detail on each. Any of the options may be combined as you see fit.

Some of the options may not be suitable for all collections. Some of these might not strictly qualify as "backups".

Almost Free

Accession by Others as Backup

If you upload your collection (or a portion of your collection) to the Internet Archive or The Commons on Flickr, there's now one more copy than there was before.

Ad-hoc Reciprocal Storage

Find one or more organizations with similar interests or needs, and arrange to keep a copy of each other's data.

You can make this as simple or as complex as you want. Ideally, it should be automated unless your data is extremely static.
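One piece that is easy to automate is checking that a partner's copy still matches yours. The following is only a minimal sketch, not part of the original answer: it assumes the collection is a plain directory tree and that both sites can run Python. The function names (`build_manifest`, `compare_manifests`) are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large items do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(ours: dict[str, str], theirs: dict[str, str]) -> dict[str, list[str]]:
    """Report files the partner copy is missing, has changed, or has extra."""
    shared = ours.keys() & theirs.keys()
    return {
        "missing": sorted(set(ours) - set(theirs)),
        "changed": sorted(name for name in shared if ours[name] != theirs[name]),
        "extra": sorted(set(theirs) - set(ours)),
    }
```

Each site would run `build_manifest` on its own copy and exchange the result (for example as a JSON file). An empty report from `compare_manifests` suggests the two copies agree. How the manifests get exchanged is left open here.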

Options include:

The BitTorrent option doesn't have a lot of benefit if there are only two sites participating. If you are looking at a larger number of sites, you may wish to consider the next option.

Private LOCKSS Network

In a lot of ways, a "private" LOCKSS network (as opposed to participation in the "global" LOCKSS network) is similar to the ad-hoc options listed above.

There will be some setup involved, and you may need to dedicate more computing resources to it than the ad-hoc methods.

One advantage of LOCKSS is that a lot of thought has gone into handling issues which you'll be dealing with yourself in some of the ad-hoc methods.

It also benefits from having a catchy name. ;-) (albeit trademarked -- read on)

The LOCKSS software itself is available under an open source (BSD) license. It's unclear if you need any kind of license to use the LOCKSS name itself.

Inexpensive

Take External Media Offsite

Get a few hard drives or flash drives, and instead of exchanging them with another organization, place them in a safe deposit box or (depending on policy/etc) have staff members store them at home.

This is simple but not automated. It is somewhat more feasible than it would be for sensitive data, since the collection is likely public anyway.
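Since this option is manual, it may help to script the copy step so nobody has to trust a drag-and-drop. Here is a rough sketch, assuming the external drive is already mounted as an ordinary directory. `copy_and_verify` is a hypothetical helper name, not something from the original answer.

```python
import filecmp
import shutil
from pathlib import Path


def copy_and_verify(source: Path, destination: Path) -> list[str]:
    """Copy source into destination, then return relative paths that failed verification."""
    # dirs_exist_ok lets the same drive be refreshed on later runs (Python 3.8+).
    shutil.copytree(source, destination, dirs_exist_ok=True)
    failures = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(source)
        copy = destination / relative
        # shallow=False forces a byte-by-byte comparison instead of trusting size/mtime.
        if not copy.is_file() or not filecmp.cmp(path, copy, shallow=False):
            failures.append(relative.as_posix())
    return failures
```

An empty list means every file on the drive matched the source at the time of the check. Note that a read immediately after a write may be served from the operating system's cache, so repeating the comparison after remounting the drive is a more convincing test.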

Commercial Online Backup Services

These are perhaps the most generic "backing up any kind of data" solutions. Your data is likely to have a different profile than the traditional "backing up everyone's Documents folders in my company" case, so watch out for fine print on "unlimited storage for a single computer" plans. Your data may not qualify.

Some providers / services:

Some online backup services make use of storage platforms provided by other companies. See the next option for how you may be able to cut out the middleman by using those services directly.

Commodity Cloud Storage

There are a number of companies providing storage platforms supported by Application Programming Interfaces (APIs). You can use open source tools to copy your data into the storage platforms, where the data is then replicated to multiple physical locations.

Software:

(there are a huge number of software packages out there with support for these kinds of storage platforms, both open source and not)

This option is perhaps the easiest to put a price tag on:

Yearly cost for 20 GB of storage:

Providers may have specials for new users (X GB/month free for the first Y months).

Your per-gigabyte costs may decrease with these providers once you pass a certain threshold. There's at least one study which indicates you can get better value from doing it yourself at certain levels of storage, but your mileage may vary.
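To compare providers, the arithmetic can be sketched as below. This is an illustration under assumptions: it supposes pricing is quoted per gigabyte per month in tiers, and the prices in `EXAMPLE_TIERS` are invented placeholders, not any provider's real rates. Request and transfer fees, which some platforms also charge, are ignored.

```python
def yearly_cost(gigabytes: float, tiers: list[tuple[float, float]]) -> float:
    """Estimate yearly storage cost under tiered per-GB monthly pricing.

    tiers is a list of (tier_size_gb, price_per_gb_month) pairs, cheapest
    tiers last; use float("inf") as the size of the final tier.
    """
    remaining = gigabytes
    monthly = 0.0
    for tier_size, price in tiers:
        # Bill only the part of the data that falls inside this tier.
        billed = min(remaining, tier_size)
        monthly += billed * price
        remaining -= billed
        if remaining <= 0:
            break
    return monthly * 12


# Hypothetical prices for illustration only: the first 1000 GB at 0.10 per
# GB per month, everything beyond that at 0.08.
EXAMPLE_TIERS = [(1000.0, 0.10), (float("inf"), 0.08)]
```

With these placeholder numbers, `yearly_cost(20, EXAMPLE_TIERS)` works out to 20 × 0.10 × 12 = 24 per year. Substitute each provider's published rates to get a real comparison.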

Pricing is very similar, so you may want to pick the software first and the platform second -- unless you have a strong preference for/against any of the companies in question.

Extremely Generic Conclusion

There are some creative ways to back up digital collections. Use what works for you, start with the size you need, and grow as you need to.

Thanks for putting thought into keeping your collection safe!


Answer by MGallinger

Small institutional collections have a lot in common with robust personal collections. Perhaps it would make sense to look at practices for backing up personal collections (for example, an off-site external drive or two on which your data is regularly backed up) and see if they can be integrated into your institution's data management practices. The challenge is that, while this is a very cheap alternative, real effort is needed to make sure that the data doesn't get lost in your organization and that its management is systematic and rigorous.
