Wednesday, 7 October 2009

iPres 2009: van Horik on MIXED framework for curation of file formats

Scholars in the Netherlands can deposit or search for information in a repository system called DANS EASY, which contains about 500,000 files in a wide diversity of formats. How do I deal with a file called cars.DBF, now an obsolete format? Their system can read such formats and convert them to the XML-based MIXED format, which identifies the data type and contains information on structure and content. So this is a smart conversion from the obsolete, binary dBASE file to a reusable XML file. In the future it can be converted from this format to a current format of choice. This process (allegedly) does not require multiple migrations…
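To make the idea of a binary-to-XML conversion concrete, here is a minimal sketch in Python. It is not the MIXED code and does not use the MIXED schema: the function names and the element names (table, structure, column, content, row, cell) are invented for illustration, and it assumes a plain dBASE III file with no memo fields. It reads the field descriptors into a "structure" part and the records into a "content" part, which is roughly the separation described above.

```python
import struct
import xml.etree.ElementTree as ET


def dbf_to_xml(data: bytes, encoding: str = "ascii") -> ET.Element:
    """Convert the bytes of a dBASE III file to a simple XML tree.

    The element names are hypothetical, not the MIXED schema.
    """
    # Header: record count (bytes 4-7), header length (8-9), record length (10-11).
    n_records, header_len, record_len = struct.unpack_from("<IHH", data, 4)

    # 32-byte field descriptors follow the 32-byte header, ended by 0x0D.
    fields = []
    pos = 32
    while data[pos] != 0x0D:
        name = data[pos:pos + 11].split(b"\x00", 1)[0].decode(encoding)
        ftype = chr(data[pos + 11])   # e.g. C = character, N = numeric
        length = data[pos + 16]
        decimals = data[pos + 17]
        fields.append((name, ftype, length, decimals))
        pos += 32

    root = ET.Element("table")
    structure = ET.SubElement(root, "structure")
    for name, ftype, length, decimals in fields:
        ET.SubElement(structure, "column", name=name, type=ftype,
                      length=str(length), decimals=str(decimals))

    content = ET.SubElement(root, "content")
    for i in range(n_records):
        start = header_len + i * record_len
        record = data[start:start + record_len]
        if record[:1] == b"*":        # first byte '*' marks a deleted record
            continue
        row = ET.SubElement(content, "row")
        offset = 1                    # skip the deletion flag
        for name, _ftype, length, _decimals in fields:
            cell = ET.SubElement(row, "cell", column=name)
            cell.text = record[offset:offset + length].decode(encoding).strip()
            offset += length
    return root


def build_sample_dbf() -> bytes:
    """Build a tiny made-up 'cars' table in dBASE III layout for the demo."""
    fields = [("MAKE", "C", 8), ("YEAR", "N", 4)]
    rows = [("Volvo", "1987"), ("Saab", "1991")]
    record_len = 1 + sum(length for _, _, length in fields)
    header_len = 32 + 32 * len(fields) + 1
    out = struct.pack("<B3BIHH20x", 0x03, 109, 10, 7,
                      len(rows), header_len, record_len)
    for name, ftype, length in fields:
        out += struct.pack("<11sc4xBB14x", name.encode(), ftype.encode(),
                           length, 0)
    out += b"\x0d"                    # end of field descriptors
    for make, year in rows:
        out += b" " + make.ljust(8).encode() + year.rjust(4).encode()
    return out + b"\x1a"              # end-of-file marker


if __name__ == "__main__":
    tree = dbf_to_xml(build_sample_dbf())
    print(ET.tostring(tree, encoding="unicode"))
```

Once the data is in a self-describing form like this, writing it out later to whatever tabular format is current only needs the XML, not the original binary reader.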

They have an SDFP community model for spreadsheet and tabular data. They have created some code, available on SourceForge, for the DBF and DataPerfect formats, which they had to reverse engineer; this is a very labour-intensive activity, and really should be a community effort.

Question: does reverse engineering expose them to risk? Don’t know…

1 comment:

  1. Attended the ECDL workshop on Digital Curation in the Human Sciences where MIXED was presented and discussed. As mentioned, the facility currently supports a limited number of file formats for conversion into intermediate XML. What was interesting, however, was Rene van Horik’s suggestion that we should lobby major software vendors to include, as standard, an export facility into recognised preservation formats. Now that would be progressive!
    I believe that MIXED is/will be made available through the PLANETS suite of software tools.

    Stuart Macdonald


Please note that this blog has a Creative Commons Attribution licence, and that by posting a comment you agree to your comment being published under this licence. You must be registered to comment, but I'm turning off moderation as an experiment.