Zombse

The Zombie Stack Exchanges That Just Won't Die


What is the recommended practice for retaining logs of fixity checks?

I am wondering what the recommended practice is for retaining event metadata generated by fixity checks. Let's say my collection is in the range of 100 TB. If I have a process running fixity checks on stored archival data continuously (because it's going to take a while to get through everything), this will produce a sizeable log of fixity check event data: file path, hash value, date/time, outcome, etc. Storing this data for every fixity event in a PREMIS XML file seems inefficient and unwieldy (of course, storing the original hash value/algorithm/datetime created in PREMIS makes sense -- this is presumed to be the data that future fixity checks would validate against). Likewise, I'm not sure if storing this data in a database is worthwhile either. It seems that text files with fixity event logs would be the more lightweight option. My interrelated questions on this issue are:

I know some of this is a question of retention policy and specific to my repository's architecture, and that could be case by case for each repository, but I'm wondering what the general recommended practice is for retention and storage of this data.
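To make the text-file option concrete, here is a minimal sketch of what a lightweight fixity event log might look like. It assumes one JSON object per line, appended to a plain text file, with the fields mentioned in the question (file path, hash value, date/time, outcome). The function names, the field names, and the choice of SHA-256 are illustrative assumptions, not a recommended standard or any particular repository's format.

```python
import hashlib
import json
from datetime import datetime, timezone


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file in chunks so large files are never read fully into memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_fixity(path, expected_hash, log_path, algorithm="sha256"):
    """Compare a file with its stored hash and append one event to the log.

    expected_hash is assumed to come from the original record (for example,
    the value stored in PREMIS when the file was ingested).
    """
    actual = file_digest(path, algorithm)
    event = {
        "path": str(path),
        "algorithm": algorithm,
        "hash": actual,
        "datetime": datetime.now(timezone.utc).isoformat(),
        "outcome": "pass" if actual == expected_hash else "fail",
    }
    # Append-only, one JSON object per line, so the log can be rotated,
    # compressed, or filtered with ordinary text tools.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(event) + "\n")
    return event
```

Because each event is a self-contained line, a retention policy could be applied by rotating or pruning log files by date, for instance keeping every failure but only the most recent pass per file. Whether that is appropriate depends on the repository's own policy.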

kvanmalssen

Comments