Monday, 29 September 2008

iPres 2008: Risks and Costs

Paul Wheatley of the BL talking about LIFE2, a refinement of the earlier LIFE project. The LIFE model is a map of the digital life cycle from the point of view of the preserving organisation. The LIFE V2.0 model is slightly reorganised and refined, for example including sub-elements. It also has a methodology for application. In addition, there is the beginning of a Generic Preservation Model (GPM), derived from some desk study. Many issues still remain with the revised GPM, but further work will be carried out with the associated experts group, in anticipation of a possible LIFE3 project. Paul commented on how expensive it was to obtain these costs. It’s clear that much still needs to be done before these costs can be used as predictors.

Rory McLeod of the BL talking about a risk-based approach to preservation. The 8 Business Change Managers act through the risk assessment to identify the assets, the risks to those assets, and possible reactions to those risks (i.e. “save” those assets). 23 separate risks were identified, and were aggregated into 6 direct and 2 indirect groups of risks. Physical media deterioration was the major risk, together with technical obsolescence for hand-held media. These two groups were complemented by format obsolescence and 3 further software-related risk groups. The two indirect risks are related to policy. In risk assessment, they are using the AS/NZS 4360:2004 risk standard, plus the DCC/DPE DRAMBORA toolkit.

Richard Wright talking about storage and the “cost of risk”. In the early days dropping a storage device meant losing a few kilobytes; now it could be GBytes and years of work. Storage costs are declining and capacity increasing exponentially, roughly related to Moore’s law (doubling every 18 months). Usage is going up, too, and risk is proportionate to usage, so risk is going up too. Risk is proportional to the number of devices and to size and to use… plus the more commonly discussed format obsolescence, IT infrastructure obsolescence etc. So if storage gets really cheap, it gets really risky! Control of loss gets most attention: increase MTBF, make copies, use storage management layers, introduce virtual storage, use digital library technology, etc. Mitigation of loss gets much less attention. Simple files can be read despite errors; more complex compressed files can be extremely fragile. Files with independent units have good properties. Reports work from Manfred Thaller of Köln: one bad byte affects only that byte of a TIFF (does this depend on selected compression?), 2% of a JPEG, and 17% of a JPEG2000. Demonstrated 5 errors on a PNG and a BMP: former illegible, latter has a few dots scattered about. Text files the best: one byte corruption affects only that byte! Risk can always be reduced by adding money: more copies, more devices, more reliable devices, less data per data manager. However, you actually have a finite budget, so the trade-offs are important. Can’t emphasise how important this is: one of the most worrying preconceptions in digital preservation is that the bit preservation element is a solved problem. It isn’t!
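The contrast between simple and compressed files can be tried out in a few lines. What follows is only a rough sketch, not the method used in the work Richard reported: it uses Python's zlib as a stand-in for "a compressed format", and the sample text and the helper names (flip_byte, damaged_bytes, recover_compressed) are invented for illustration. It damages one byte in a plain copy and one byte in a compressed copy of the same data, then counts how much of the original can no longer be recovered.

```python
import zlib


def flip_byte(data: bytes, index: int) -> bytes:
    """Return a copy of data with all eight bits of one byte inverted."""
    damaged = bytearray(data)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def damaged_bytes(original: bytes, recovered: bytes) -> int:
    """Count bytes of the original that are wrong or missing after recovery."""
    matches = sum(a == b for a, b in zip(original, recovered))
    return len(original) - matches


def recover_compressed(blob: bytes) -> bytes:
    """Decompress a zlib stream; give back empty bytes if it is unreadable."""
    try:
        return zlib.decompress(blob)
    except zlib.error:
        return b""


# Illustrative sample text (an assumption, not data from the talk).
text = b"Bit preservation is not a solved problem. " * 50
blob = zlib.compress(text)

# Damage one byte in the plain copy and one byte mid-stream in the compressed copy.
plain_loss = damaged_bytes(text, flip_byte(text, 10))
packed_loss = damaged_bytes(text, recover_compressed(flip_byte(blob, len(blob) // 2)))

print(f"plain text: {plain_loss} of {len(text)} bytes damaged")
print(f"compressed: {packed_loss} of {len(text)} bytes damaged")
```

The plain copy loses exactly the one byte that was flipped. The damaged zlib stream will usually be rejected outright by the decompressor, so the sketch counts the whole file as lost. Real image formats sit somewhere between these extremes, which is what the TIFF, JPEG and JPEG2000 figures above suggest.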

Bill Lefurgy of the Library of Congress setting the scene for a report on the impact of international copyright laws on digital preservation. Adrienne Muir summarising the 4 country reports. All had preservation-related exceptions to copyright and related laws; however, none of them is fully adequate to allow preservation managers to do what they need to do. All inadequate in different ways; UK maybe the strictest (as ever!). Differences in scope (which libraries), purpose and timing of copying, and material types within scope. Different legal and voluntary deposit arrangements; none comprehensive, they are out of date and don’t reflect the digital, online world. Orphan works (those whose copyright owners cannot be identified) a problem. Technical protection measures are a real preservation problem: current laws often prohibit reverse engineering or other circumvention. In the UK, contracts can trump copyright! Access to preserved content is an issue. Rights holders worry about the effects on the market of any legal changes, and aim to prevent such change. Recommendations include: there should be laws and policies to encourage digital preservation.

Questions: should we just ignore the copyright problems like Internet Archive and Google? We don’t have their resources in case there are serious problems, and there are serious potential impacts if we stamp all over these rights. Maybe for high value stuff, rights would be more vigorously pursued by rights holders, but they may also have motivation to preserve it themselves. So maybe we should be looking for the stuff that’s most at risk of loss, some of which carries low risk of action from the rights holders.

