Zombse

The Zombie Stack Exchanges That Just Won't Die

View the Project on GitHub anjackson/zombse

What storage hardware should I use for a LOCKSS node?

I am planning to set up a LOCKSS (“Lots Of Copies Keep Stuff Safe”) node for a small PLN (“Private Locks Network”) intended to safely store research and publication data from a few universities.

A partner in this PLN recommended against using a RAID (“Redundant Array of Inexpensive Disks”) for storage because RAID controllers can be buggy, leading to a complete loss of all data stored on this node. On the other hand, most data centers rely on RAso it cannot be all that bad. LOCKSS was designed for commodity hardware, but at a university, it can be more difficult to get a consumer-grade computer with a cheap external USB hard drive than to get a server with redundant storage or a virtual machine running on enterprise-class hardware.

In my mind, the two technologies should complement each other fairly well: LOCKSS guards against bitrot and other inconsistencies in the files, and RAID avoids data loss when a hard drive fails. Are there serious reasons for not running a LOCKSS node on a server with RAID storage? How about using a virtual machine?

Christian Pietsch

Comments

Answer by Fulup

De Montfort University is a part of the UK LOCKSS Alliance and public LOCKSS network. We considered and rejected using RAID and currently have 4 separate disks in a separate server.

LOCKSS can be put on a virtual server, but is a busy service and may not play nicely with any other co-hosted servers.

Comments

Answer by Nick Krabbenhoeft

I work for MetaArchive, a LOCKSS PLN. We publish the technical specifications for our LOCKSS nodes here.

We recommend the use of at least RAID 5 on all of our servers. RAID 5 dedicates a portion of each drive for parity data on all the drives. In our recommended server configuration, we ask for 8 2TB drives, but only 14 TB of this is usable for storage and 2 TB is used for parity. If a hard drive crashes, and it will happen, the parity data ensures that you can not only reconstruct the lost data, but also ensure that all the other drives remain safe.

LOCKSS networks have redundant copies spread around a network with voting-and-polling to check fixity. This provides parity on a network level, but it's much faster to recover from a hard drive crash locally than over a network.

For more general advice, Backblaze, a cloud-based backup service, recently published their 180 TB server configurations with lots of advice. Also, put all your drives through a burn-in test, since it appears that drives have higher infant mortality rate (page 4) It's easier to replace an empty drive under warranty than to wait for the warranty to ship so you can rebuild an array.

Comments