Zombse

The Zombie Stack Exchanges That Just Won't Die

Is there any reason to digitize in full colour dissertations that were originally submitted in black & white?

We are in the midst of a large digitization project covering the earliest dissertations at my place of work. For the most part they were submitted in black & white, though some have colour inserts (maps, graphics). At the moment we are digitizing in full colour, as we do for rare items, but I wonder whether this is overkill. Inserting colour pages does monkey with our workflow, but it's manageable. OCR works much better when we use grayscale, and the files are lighter for our institutional repository (IR); both are definite priorities.

jambina

Answer by dsalo

I find that text gets a bit icky if we're talking straight-up bitonal, but grayscale works pretty nicely. Full color does seem like overkill to me; I rather doubt that historians of print culture will find much out from inspecting dissertation scans that they wouldn't learn from reading dissertation-formatting guidelines.

Answer by Flimzy

For documents where you use OCR, it's not as much of an issue--simply do whatever renders the best OCR results, since presumably you're only storing the text, and not the image form long-term.

Where the question matters is for image content--or text that cannot be OCRed for whatever reason. And for these sorts of items, it is important to determine your archival goals. You say you're archiving dissertations, so presumably your goal is simply to preserve the information in a reproducible way--as opposed, say, to recreating an authentic-looking replica.

For sheer information preservation, grayscale (or even true black & white for line drawings or text) will probably be preferable in most cases, for two reasons:

  1. Smaller file sizes.
  2. Imperfections in the original (coffee stains?) are less noticeable.
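
To put rough numbers on the file-size point, here is a small sketch of the uncompressed size of a single page at three bit depths. The page dimensions (8.5 x 11 in) and the 300 dpi resolution are assumptions chosen for illustration, not figures from the question, and real files will be smaller once compressed.

```python
# Rough uncompressed sizes for one scanned page at different bit depths.
# Page size and resolution below are illustrative assumptions.
WIDTH_IN, HEIGHT_IN = 8.5, 11.0
DPI = 300


def raw_bytes(bits_per_pixel: int) -> int:
    """Uncompressed size in bytes, ignoring headers and row padding."""
    pixels = round(WIDTH_IN * DPI) * round(HEIGHT_IN * DPI)
    return pixels * bits_per_pixel // 8


sizes = {
    "24-bit color": raw_bytes(24),
    "8-bit grayscale": raw_bytes(8),
    "1-bit bitonal": raw_bytes(1),
}

for mode, size in sizes.items():
    print(f"{mode}: {size / 1_000_000:.1f} MB")
```

Under these assumptions grayscale is a third of the size of full color before compression, and bitonal is one eighth of grayscale. Compression changes the absolute numbers, so treat this as an upper bound per page rather than a storage estimate.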

Color should only ever be important when the original is in color, and information would thus be lost by storing B&W. The other reason to archive a color version is to reproduce (physically or electronically) something as close to the original as possible (coffee stains included!). This motivation is generally only interesting for objects of historical significance, so probably not so much for most dissertations.

Keep in mind this is meant to answer the question of how to archive the information. It may still be desirable to scan in full color to allow for additional manipulation (manual or automatic removal of coffee stains?) prior to archiving in the final B&W format.

Answer by Brian Herzog

I took an archiving workshop in library school, and the teacher there recommended always scanning in full color. Her reason was that color will render the reproduction as close to the original as possible - right down to paper color, ink color, stains, dog-ears, creases, watermark reflections, etc. It wasn't necessarily for the content itself, but for all of the "ephemera" that might be of interest to researchers. It doesn't seem like that would be necessary for dissertations, but eventually they will be "historical" to someone.

Answer by SJeffery

I work in a science library that makes heavy use of the theses and dissertations produced at other institutions. Having all inserts/foldouts that were produced in color available to us in color is an absolute requirement. Our researchers (geologists, engineers, etc.) require this for their work.

When considering scanning these in anything other than color, you will need to keep in mind how this will impact your library and users in the future. I regularly get in touch with libraries to request either a copy or permission to borrow their copy of a thesis/dissertation because our copy (often ProQuest/UMI) is in black and white. This ends up taking several hours of our time, which could be avoided if the work were made available in its original coloring.

For what it is worth, we microfilmed a set of our records in the 1980s and are now in the process of trying to track down the originals so that they can be scanned in color, as the black and white copies are not of any use.

Answer by M. Alan Thomas II

Do greyscale for the pages submitted in B&W and color for the color inserts. Like you said, it monkeys with the workflow, but the color is presumably there for a reason, and not preserving it loses information and value. However, having full-color inserts does not mean you have to do the whole thing in color, unless there's some technological limitation to that effect on your end.
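
One way to soften the workflow cost is to pick the mode per page automatically from a quick preview scan. The sketch below is only an illustration: `choose_scan_mode`, its `tolerance` and `min_color_fraction` parameters, and the sample pixel values are all hypothetical, and the thresholds would need tuning against real pages (yellowed paper in particular can look "colored" to a naive check).

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def choose_scan_mode(
    pixels: Iterable[Pixel],
    tolerance: int = 12,
    min_color_fraction: float = 0.002,
) -> str:
    """Return "color" if enough pixels are clearly non-gray, else "grayscale".

    A pixel counts as colored when its RGB channels differ by more than
    `tolerance`. Both thresholds are illustrative guesses, not tested values.
    """
    total = 0
    colored = 0
    for r, g, b in pixels:
        total += 1
        if max(r, g, b) - min(r, g, b) > tolerance:
            colored += 1
    if total == 0:
        raise ValueError("page has no pixels")
    return "color" if colored / total > min_color_fraction else "grayscale"


# Hypothetical previews: a typed text page and a page with a red map region.
text_page = [(250, 248, 245)] * 990 + [(20, 20, 22)] * 10
map_insert = [(250, 248, 245)] * 900 + [(200, 30, 40)] * 100

print(choose_scan_mode(text_page))
print(choose_scan_mode(map_insert))
```

A check like this only flags candidates; a person should still review the pages it marks as grayscale before the originals go back on the shelf, since a missed color insert is exactly the loss described above.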
