Wednesday, 28 May 2008

Archiving images

One of our Associates recently posted a query on the DCC forum about image archiving formats:

'For long term preservation of digital images should we be archiving the proprietary .RAW files that come out of the digital SLRs we use or should we be converting these to the more open uncompressed .TIFF format and archiving those? Or indeed should we be archiving both?

Whilst they are proprietary, uncompressed .RAW files contain much useful information that both helps us manage the image and may help us preserve the image in the future. They are no larger than typical TIFF files, well, they needn't be. However, future version migration seems inevitable.

TIFF files offer flexibility of working, and many tools are able to manipulate and format shift TIFF to produce a range of dissemination manifestations. However, if we work on the 'do it once, do it right' principle then long term storage becomes an issue of both capacity and cost.

Storing both TIFF and RAW only exacerbates the storage issue, plus maybe creates long term management issues.

What would you do? Store both, store the more open and flexible format or boldly go with the richness of RAW?

Your thoughts and comments would be appreciated.'

As image preservation is relevant to those curating scientific data, I thought I'd reproduce his post (with permission of course) here to try and stimulate a bit more debate.

First off - does it really have to be an 'either or' option? Why not store both? The cost of storage seems to just keep on coming down, and most architectures are more than capable of catering for multiple representations of an object. 'Do it once, do it right' is great if you can. But the 'lots of copies keeps stuff safe' principle has obvious benefits too. Storing multiple representations gives you more options in the future, and whilst it's true that managing additional files may require additional effort, I'm not sure that the potentially minimal costs of this would negate the potential benefits of multiple storage. (Reminder to self - read up on costs!)
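
As a rough illustration of what managing multiple representations might involve, here is a minimal Python sketch. It assumes (and this is only an assumption for the example) that the RAW and TIFF versions of an image share a file name and differ only in extension, for instance img001.nef and img001.tif. The function names and the directory layout are hypothetical. The sketch records a checksum for every representation and reports images that lack one of the formats you decided to keep.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Group files under root by name without extension.

    Returns {image_id: {extension: checksum}}, where image_id is the
    path relative to root with the extension removed (an assumed
    naming convention, not a standard).
    """
    manifest = defaultdict(dict)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            image_id = path.relative_to(root).with_suffix("").as_posix()
            manifest[image_id][path.suffix.lower()] = sha256_of(path)
    return dict(manifest)


def missing_representations(manifest, required):
    """Map each image_id to the required extensions it does not have."""
    wanted = set(required)
    return {
        image_id: sorted(wanted - set(reps))
        for image_id, reps in manifest.items()
        if not wanted <= set(reps)
    }
```

Calling build_manifest on an archive directory and then missing_representations(manifest, [".nef", ".tif"]) would list any image held in only one of the two formats. Re-running the checksums later and comparing them with the stored values is one simple way to notice silent corruption in either copy.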

It's probably an opportune moment to go back to your user and preservation requirements too, to determine what you really want from your image archive and to what extent each format meets these requirements.

In terms of potential migrations, it's not inconceivable that we'll eventually come up with a better format than TIFF, so migration is potentially an issue regardless of which format you go for. More immediate - for me anyway - would be the length of time between migrations, and the ease of migration from one format to another. These issues would need assessing too if going for one or the other (and it wouldn't be a bad thing to be aware of them even if you choose to store both versions).

Finally, Adobe recently submitted their 'DNG Universal RAW format' to ISO, so the issue of this one (because it seems there are several RAW formats) being a proprietary format may not be a concern for long. I'm not that familiar with DNG RAW so I don't know how much extra information it may contain when compared to TIFF. Another thing to add to my 'to-do' list...

I'm sure our Associate would be grateful for any more input so do feel free to leave comments and I'll make sure he gets them.

1 comment:

  1. Hi, I came to this from Jill Hurst-Wahl's blog.

    This is, of course, a difficult situation. I personally face it, and I can see how any archives dealing with this format can also face it and wonder.

    The good news is that the bulk of the raw information is not proprietary and the folks at have tried to get the manufacturers to understand what a bad idea proprietary pieces of metadata are.

    For example, it is my understanding that the camera white balance information in certain Nikon DSLRs (the brand I use) is encrypted, which means that when Photoshop opens the raw image, it needs to make its own assumptions about white balance.

    Saving the file as a TIFF creates another set of problems, as the raw pixel data has been rendered into the TIFF and is no longer accessible.

    What I do is watch the software and make sure that the older RAW files that I have (NEFs actually) don't become orphans with each new release. If they were to become orphans, I'd migrate them.

    My second generation DSLR makes in-camera JPEGs, so I archive both. The in-camera JPEG gives me an idea of Nikon's white balance and other factors, while the raw gives me unlimited access to the original data (including the encrypted data if I use Nikon's software, which I do sometimes).

    I think in the case of cameras which make both RAW and JPEG available, saving both is a reasonable compromise. Having the JPEGs makes browsing faster, and often the JPEG is good enough to use for the web, or something like that.

    Someday, I will also do a major scan of my 35 mm slides and negatives, and that will raise another issue, as scanner NEFs may not open in anything other than Nikon's software, and they are different from camera NEFs as I understand it. Maybe in this case, I'll save NEFs, TIFFs and JPEGs, or maybe NEFs and JPEGs with the quality knob turned up to 10 or 12. I hate not saving raw data.




Please note that this blog has a Creative Commons Attribution licence, and that by posting a comment you agree to your comment being published under this licence. You must be registered to comment, but I'm turning off moderation as an experiment.