Zombse

The Zombie Stack Exchanges That Just Won't Die

View the Project on GitHub anjackson/zombse

How should archives decide between creating logical or forensic disk images of drives on computers accessioned as part of personal papers collections?

The emerging workflow for bringing in born digital media to an archival collection seems to be converging on creating forensic disk images, that is bit-for-bit copies of the entire disk. This sort of copy would capture forensic traces of previous use of the contents of the drive.

There is considerable value in these kinds of copies, but they also have the possibility of making those offering their materials to an archive feel rather exposed. In general, individuals giving their papers to a library have had the ability to review exactly what it is they are handing over. The possibility to read overwrite sectors of disks can creep out a potential individual offering up their papers.

In contrast to forensic disk images, there is the possibility of creating logical disk images. Instead of being a bit-for-bit copy, these images are simply copies of the contents of the directories on a disk. The logical image captures the organization of the disk but does not capture the considerable additional information that a forensic image would capture.

Given these differences, in what cases should archives go with the logical images vs. the forensic ones? On the one hand, the forensic disks seem to be more in keeping with not changing the object and in a sense with the ideals of original order. On the other, the agreements between those offering their papers to an archive frequently involve discussing what's in and what is out. How should archives go about weighing these different values in their decisions to create forensic or logical images?

Trevor Owens

Comments

Answer by Nick Krabbenhoeft

It's very tempting to think that we should save everything, but for two reasons we really shouldn't.

​1) Economics: Our ability to produce information will outpace our ability to store it. David Rosenthal does a fantastic job breaking down the costs of digital storage in a number of posts on his blog (e.g. http://blog.dshr.org/2012/05/lets-just-keep-everything-forever-in.html) In short, although the price per gigabyte will continue falling, we will produce data at a faster rate. Therefore, digital repositories that commit to saving everything will have to commit increasing amounts of their budgets to purchasing storage.

​2) Ethics: As Trevor said in his questions, most donors have little idea that deleted files remain on their disks until they are over written. Even when they are overwritten, because of the way computer storage works, traces of deleted files can remain in the slack space. I would argue most donors are not agreeing to give files that they had deleted.

Archivists work with shelf-space and budget constraints. They also discard portions of accessions during processing and restrict access to other parts. I don't imagine these practices will disappear with mature born-digital archives. Logical images should be the norm in digital archives.

Forensic images might appear to offer the advantage of better representing the original workflow, but with the random writes, defrags, and rewrites of modern storage hardware, I wonder how useful that might be.

Comments

Answer by anarchivist

I think it largely depends on the age of the material. While storing a full sector image of a contemporary 500 GB hard drive, it's probably not going to be terribly useful if you're only interested in obtaining user data.

However, if you're imaging older floppies or optical media, the likelihood of obtaining a useful logical image is likely going to be lower unless you can be certain that your imaging application and hardware can interpret the encoding system, file systems, partition maps, etc. on the media.

The other issue is that containerized logical imaging formats (like AccessData's "Custom Content Image" [AD1] format, used by FTK Imager and Forensic Toolkit) are proprietary and usually cannot be used outside of their creating application.

If you want to use a similar mechanism to the logical imaging process, you can use a tool like Curator's Workbench or Duke Data Accessioner, or if you simply want to package assets for transfer, you could use packages that comply to the BagIt specification.

Comments