What aspects of the process for a "complete" digital preservation system are missing from current software packages?

I know a little about Ex Libris' digital preservation system Rosetta, but never saw it live. Yesterday I noticed that there seems to be an open source alternative in development: Archivematica.

Am I correct that both products are developed to fulfill more or less the same purpose?

What critical features should one look for in digital preservation software? Which parts of the digital preservation workflow should be automated by integrated software packages, and in what way? How does one identify a "good" software solution?

And concerning mainly Archivematica: if I used it for digital preservation, which main (technical) issues of long-term preservation would still remain unsolved? It does normalization, format identification, format migration, packaging of AIPs, etc. What is missing for a "complete" digital preservation system, at least on the software side?

Roland Lukesch

Answer by Trevor Owens

Software (and hardware, for that matter) can never become a "complete" digital preservation system. Staff, budgets, infrastructure like buildings, and policy are always part of the system that authenticates, preserves, and manages content and provides access to it.

Software systems (like those you mentioned) are always going to be tools that automate parts of your workflow. I should add that much of that workflow is going to be content-dependent. Now, if you want a list of features, you might take a look at the Trustworthy Repositories Audit & Certification: Criteria and Checklist, or at the ongoing work of the National Digital Stewardship Alliance to identify tiered levels of digital preservation.

Answer by Nick Krabbenhoeft

In general, open-source and commercial digital preservation software offer many of the same features. Most of them even incorporate the same tools, such as DROID. The major differences that I've found are in the storage of metadata and the granularity of user permissions.

Metadata Storage

First, one of the larger differences you'll see right now is the use of databases to record metadata. A lot of individual tools generate a METS/PREMIS object to contain the technical (format, file specifications, etc.), administrative (copyright, etc.) and descriptive metadata for each object or group of objects they process. Digital preservation environments like Archivematica, Ex Libris Rosetta, and Tessella System Preservica/SDB also place this information into a database.

Using a database allows you to hook up your archive to a search system more easily and, I think most importantly, it gives you better control over future preservation actions. For instance, you can run a report to return a list of all the formats you're holding. Then, when you identify a format you'd like to migrate, you can search for a list of those specific files. The metadata and database should then track whatever preservation actions you perform on those objects. Note that it's probably safer to maintain both the metadata object and the database to guard against either becoming disassociated from the preserved objects.
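To make the reporting idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. The schema, the function names and the sample rows are all hypothetical and are not taken from Archivematica, Rosetta or Preservica; the format strings are only illustrative PRONOM-style identifiers. It shows a format report, a lookup of the files in one format, and the recording of a preservation event.

```python
import sqlite3
from datetime import datetime, timezone


def create_schema(conn):
    # Hypothetical two-table layout: preserved files and the events applied to them.
    conn.executescript("""
        CREATE TABLE files (
            id INTEGER PRIMARY KEY,
            path TEXT NOT NULL,
            format TEXT NOT NULL
        );
        CREATE TABLE events (
            id INTEGER PRIMARY KEY,
            file_id INTEGER NOT NULL REFERENCES files(id),
            event_type TEXT NOT NULL,
            event_time TEXT NOT NULL
        );
    """)


def format_report(conn):
    # List every format held, with the number of files in each.
    return conn.execute(
        "SELECT format, COUNT(*) FROM files GROUP BY format ORDER BY format"
    ).fetchall()


def files_in_format(conn, fmt):
    # Find the specific files that a planned migration would touch.
    return conn.execute(
        "SELECT id, path FROM files WHERE format = ? ORDER BY id", (fmt,)
    ).fetchall()


def record_event(conn, file_id, event_type):
    # Track a preservation action against a file, timestamped in UTC.
    conn.execute(
        "INSERT INTO events (file_id, event_type, event_time) VALUES (?, ?, ?)",
        (file_id, event_type, datetime.now(timezone.utc).isoformat()),
    )


def build_demo():
    # In-memory database with three made-up files, for illustration only.
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    conn.executemany(
        "INSERT INTO files (path, format) VALUES (?, ?)",
        [("a.doc", "fmt/40"), ("b.doc", "fmt/40"), ("c.pdf", "fmt/18")],
    )
    return conn
```

In a real system the same events would also be written back into the METS/PREMIS object, so that the database and the metadata file stay in step.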

User Permissions

Second, commercial systems often include more complex systems for defining user access. Rosetta and SDB/Preservica both allow administrators to restrict actions like ingest, preservation, and metadata editing to certain classes of users, and will record the ID of users that perform these actions. Restricting access increases the security of files and builds greater provenance. However, this feature is not as necessary for smaller organizations.
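As a rough illustration of what such permission checks amount to, the following Python sketch restricts actions by role and records who performed them. The role names, the action names and the Repository class are assumptions made up for this example; they do not reflect how Rosetta or SDB/Preservica actually implement permissions.

```python
from dataclasses import dataclass, field

# Hypothetical mapping of user classes to the actions they may perform.
ROLE_PERMISSIONS = {
    "admin": {"ingest", "preserve", "edit_metadata"},
    "archivist": {"ingest", "edit_metadata"},
    "reader": set(),
}


@dataclass
class User:
    user_id: str
    role: str


@dataclass
class Repository:
    # Each entry is (user_id, action, target), kept as a simple provenance trail.
    audit_log: list = field(default_factory=list)

    def perform(self, user, action, target):
        # Refuse the action unless the user's role explicitly allows it.
        allowed = ROLE_PERMISSIONS.get(user.role, set())
        if action not in allowed:
            raise PermissionError(
                f"{user.user_id} ({user.role}) may not {action}"
            )
        # Record the ID of the user who performed the action.
        self.audit_log.append((user.user_id, action, target))
```

A smaller organization could collapse this to a single role and keep only the audit log.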

Edit 2013/03/04: Thanks to Artefactual Systems for clearing up an error on my part.
