In the analogue world, few resources that we might want to keep over the long term are revisable. Some can easily be annotated; you can write in the margins of books, and on the backs of photos, although not so easily on films or videos. But the annotations can be readily distinguished from the real thing.
In the digital world, however, most resources are at least plastic and very frequently revisable. By plastic I mean things like email messages or web pages that look very different depending on the tools the reader chooses. By revisable I mean things like word processing documents or spreadsheets. When someone sends me one of the latter as an email attachment, I will almost always open it with a word processor (a machine for writing and revising), rather than a document reader. The same thing happens when I download one from the web. For web pages, the space bar is a shortcut for “page down”, and quite often I find myself attempting to use the same shortcut on a downloaded word processing document. In doing so, I have inserted a space and so revised the document (even if only in a trivial way). Typically if I want to save the downloaded document in some logical place on my laptop, I’ll use “Save As” from within the word processor, potentially saving my revisions as invisible changes.
Annotation is rarer in the digital world, although many word processors now have excellent comment and edit-tracking facilities (by the way, there’s a nice blog post on da blog which points to an OR08 presentation on annotations, and an announcement on the DCC web site about an annotation product that looks interesting, and much of social networking is about annotation, and…).
My feeling is that there’s a default assumption in the analogue world that a document is not revisable, and an opposite assumption in the digital world.
One of the ways we deal with this when we worry about it, at least for documents designed to be read by humans, is to use PDF. We tend to think of PDF as a non-revisable format, although for those who pay for the tools it is perhaps more revisable than we think. PDF/A, I think, was designed mainly for long-term archiving: it requires documents to be self-contained (fonts embedded, no scripts, no encryption), which takes out some of the more dynamic elements rather than removing revisability as such.
If you are given a digital document and are asked to preserve it, the default digital assumption of revisability nearly always seems to kick in. People talk about preserving spreadsheets and worry about whether they can capture the formulae, or about preserving word processing documents and worry about whether the field codes will be damaged. In some cases, this is entirely reasonable; in others it doesn’t matter a hoot.
When I read the InSPECT Framework for the definition of significant properties, I was delighted to find the FRBR model referenced, but disappointed that it was subsequently ignored. To my mind, this model is critical when thinking about preservation, and about significant properties in particular. In the FRBR object model, there are four levels of abstraction:
- Work (the most abstract view of the intellectual creation)
- Expression (a realisation of the work, perhaps a book or a film)
- Manifestation (eg a particular edition of the book)
- Item (eg a particular copy of the book; possibly less important in the digital world, given the triviality and transience of making copies).
Why is this digression important? Because many of the significant properties of digital objects are bound to the manifestation level. Preserving them is only important if the work demands it, or the nature of the repository demands it. Comparatively few digital objects have major significant properties at the work level. Some kinds of digital art would have, and maybe software does (I haven’t read the software significant properties report yet). If you focus on the object, you can get hung up on properties that you might not care about if you focus on the work.
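To make the point about levels concrete, here is a minimal sketch in Python. It is only an illustration: the class names, the example document and, above all, my assignment of each property to a FRBR level are assumptions for the sake of the example, not anything taken from FRBR or the InSPECT Framework.

```python
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    """The four FRBR levels of abstraction."""
    WORK = 1
    EXPRESSION = 2
    MANIFESTATION = 3
    ITEM = 4


@dataclass(frozen=True)
class Property:
    name: str
    level: Level  # the level this property is bound to (my assumption)


@dataclass
class DigitalObject:
    title: str
    properties: list[Property] = field(default_factory=list)

    def properties_to_preserve(self, levels_that_matter: set[Level]) -> list[str]:
        """Keep only the properties bound to levels the repository cares about."""
        return [p.name for p in self.properties if p.level in levels_that_matter]


# A hypothetical word processing document and some of its properties.
report = DigitalObject(
    title="Annual report",
    properties=[
        Property("intellectual content", Level.WORK),
        Property("wording of the text", Level.EXPRESSION),
        Property("field codes", Level.MANIFESTATION),
        Property("revisability", Level.MANIFESTATION),
    ],
)

# A repository focused on the work, not on the particular file it arrived as.
print(report.properties_to_preserve({Level.WORK, Level.EXPRESSION}))
```

A repository that declares only the work and expression levels as important would, in this toy model, feel no obligation to keep the field codes or the revisability of the original file.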
Last time, I said the questions raised in my mind included:
- what properties?
- of which objects?
- for whom?
- for what purposes?
- and maybe where?

On the first of these, I’m suggesting that for many kinds of works, revisability is not an important significant property for most users and many repositories. That would mean that for those works, transforming them into non-revisable forms on ingest is perfectly valid, and indeed might make much more sense than keeping them in revisable form. This isn’t at all what I would have thought a few years ago!