Wednesday, 2 July 2008

Research Repository System data management

This is the sixth of a series of posts aiming to expand on the idea of the negative click, positive value repository, which I'm now calling a Research Repository System. I've suggested it should contain these elements:
Data management support is where this starts to link more strongly back to digital curation. Bear in mind here, this is a Research Repository System; not all of these functions, or the next group, need to be supported by anything that looks like one of the current repository implementations! I’m not quite clear on all of the features you might need here, but we are beginning to talk about a Data Repository.

It is essential that the Data Management elements support current, dynamic data, not just static data. You may need to capture data from instruments, process it through workflow pipelines, or simply sit and edit objects, eg correcting database entries. Data Management also needs to support the opposite: persistent data that you want to keep unchanged (or perhaps append other data to, while keeping the original elements unchanged).

One important element could be the ability to checkpoint dynamic, changing or growing objects at various points in time (eg the state of the data when an article was written). In support of an article, you might have a particular subset available as supplementary data, and other smaller subsets to link to graphs and tables. These checkpoints would usually, though perhaps not always, be permanent, and would require careful disclosure control (for example, reviewers whose identity you do not know might need access to check your results, prior to publication).
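To make the checkpoint idea a little more concrete, here is a minimal Python sketch. It is not a description of any existing repository system: the class names (DynamicDataset, Checkpoint), the "readers" set used for disclosure control, and the use of a SHA-256 digest as a fixity check are all my assumptions for illustration.

```python
import copy
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Checkpoint:
    """A labelled snapshot of a dataset (hypothetical structure)."""
    label: str          # e.g. the article this snapshot supports
    taken_at: str       # ISO 8601 timestamp in UTC
    digest: str         # SHA-256 of the snapshot, as a fixity check
    records: tuple      # independent copy of the records at that moment
    readers: frozenset  # who may read the snapshot before publication


@dataclass
class DynamicDataset:
    """A dataset that keeps changing, with checkpoints taken along the way."""
    records: list = field(default_factory=list)
    checkpoints: dict = field(default_factory=dict)

    def append(self, record: dict) -> None:
        self.records.append(record)

    def correct(self, index: int, **changes) -> None:
        # Edit a live record, e.g. to fix a database entry.
        self.records[index].update(changes)

    def checkpoint(self, label: str, readers=()) -> Checkpoint:
        # Deep-copy so later edits to the live data cannot alter the snapshot.
        frozen = copy.deepcopy(self.records)
        payload = json.dumps(frozen, sort_keys=True).encode("utf-8")
        cp = Checkpoint(
            label=label,
            taken_at=datetime.now(timezone.utc).isoformat(),
            digest=hashlib.sha256(payload).hexdigest(),
            records=tuple(frozen),
            readers=frozenset(readers),
        )
        self.checkpoints[label] = cp
        return cp

    def read_checkpoint(self, label: str, user: str) -> tuple:
        # Very simple disclosure control: only listed readers get access.
        cp = self.checkpoints[label]
        if user not in cp.readers:
            raise PermissionError(f"{user} may not read checkpoint {label!r}")
        return cp.records
```

With something like this, you could call ds.checkpoint("article-2008", readers={"reviewer-1"}) when the article is submitted, carry on correcting and appending to the live data, and the reviewer would still see exactly the records the article was based on.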

Some parts of Data Management might support laboratory notebook capabilities, keeping records with time-stamps on what you are doing, and automatically providing contextual metadata for some of the captured datasets. Some of these elements might also provide some Health and Safety support (who was doing what, where, when, with whom and for how long).
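As a rough illustration of how notebook records might supply contextual metadata automatically, here is a small Python sketch. Everything in it is hypothetical: the NotebookEntry fields simply mirror the "who, what, where, when, with whom, how long" list above, and matching a dataset to an entry by capture time is just one possible rule.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class NotebookEntry:
    """One time-stamped record of an activity (hypothetical fields)."""
    who: str
    where: str
    activity: str
    started: datetime
    ended: Optional[datetime] = None
    with_whom: tuple = ()

    def duration(self) -> Optional[timedelta]:
        # "For how long": only known once the activity has finished.
        return None if self.ended is None else self.ended - self.started


@dataclass
class Notebook:
    entries: list = field(default_factory=list)

    def start(self, who, where, activity, with_whom=(), now=None):
        entry = NotebookEntry(
            who=who,
            where=where,
            activity=activity,
            started=now or datetime.now(timezone.utc),
            with_whom=tuple(with_whom),
        )
        self.entries.append(entry)
        return entry

    def finish(self, entry, now=None):
        entry.ended = now or datetime.now(timezone.utc)

    def context_for(self, captured_at: datetime) -> dict:
        # Return metadata from the first entry that was active when the
        # dataset was captured; an empty dict if nothing matches.
        for e in self.entries:
            if e.started <= captured_at and (e.ended is None or captured_at <= e.ended):
                return {
                    "who": e.who,
                    "where": e.where,
                    "activity": e.activity,
                    "with_whom": list(e.with_whom),
                }
        return {}
```

A dataset captured from an instrument at a given time could then be tagged with notebook.context_for(capture_time), and the same entries would answer the Health and Safety questions without anyone filling in a separate form.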

1 comment:

  1. Do we need to take the database right into consideration here? Would it fall within the definition of a database (for the purposes of the right) which is:

    A collection of independent works, data or other materials which-
    (a) are arranged in a systematic or methodical way, and
    (b) are individually accessible by electronic or other means


Please note that this blog has a Creative Commons Attribution licence, and that by posting a comment you agree to your comment being published under this licence. You must be registered to comment, but I'm turning off moderation as an experiment.