stub

Codes

META: Defining format, but avoiding baggage.

The First Preservation Strategy: Format

Describe Chomsky Heirarchy more fully.

A format is a social convention for exchanging data. A format is a way of grouping documents that share some features.

Perhaps the difference here is whether we are trying to document all possible ‘strains’ of a format, or whether we are just trying to find what format(s) the item is consistent with, e.g. in order to pass the item on for deeper inspection. Or in other words, are we defining format by implementation (software) and context or by specification?

Format specs must be implemented to come alive, and it is the processes that implement the specs that are definitive.

And in the background of all this, we have the weakest format model. Format, as defined by a set of internal or external features that can be used to group related files. e.g. All files ending in .PDF, or starting with %%PDF1.4, etc. Such identifiers are neither permanent nor unique, and yet, association by extension is precisely how software identifies the

Perhaps we can live with an ambigious concept of format. Perhaps it’s more important to work out how to create the right environment to encourage the generation of more trustworthy data. First, we need to decide what data we really need, and understand the validation processes that will help us ensure we can trust it. Then we can choose the right registry. And then we can work out encourgement.

Trends, social, obsolescense, etc.

Floppies are crazy; CD is crazy but kind of standardized; DVD is one format always https://twitter.com/archivetype/status/378176040696545280

Ah, it seems I misinterpreted ‘specification’ to mean some kind of documentation or definition independent of the specific software/user that reads or writes the bitstreams. But, of course, something somewhere ‘specifies’ the content of a bitstream, but I don’t tend to think of that as ‘a specification’ because it can be so weak/ambiguous. But you are right, a poor specification is still a specification. Not sure I agree, because read and write can disagree, and it can hardly be said to be specified then. i.e. specification is a definition that is used to ensure read and write behave consistently, which is not necessary. http://qanda.digipres.org/38/what-is-a-file-format

On modelling? Theory of generalized markup, 2014

LINK TO What to Preserve

Communicating with the Future
Creative Commons Licence
Keeping Codes
by Andrew N. Jackson
comments powered by Disqus