One of the other trends he observed emerging is that of “re-use” of data. We are no longer interested only in preserving data, but also in evaluating the prospects for its re-use and improving those prospects, where possible, to derive greater value from our data.
Lynch noted that there is a deepening linkage between the tools and workflows that researchers use, so data curation needs to be increasingly integrated into them, as this will help solve the problems of metadata, provenance and so on, and make curation more effective.
Lynch was very happy to hear mention of the notion that we need to get the scientific equipment developers and vendors involved. This could help feed curation into the workflow more effectively – he gave the example of cultural heritage researchers who found their cameras “knew” a lot of the metadata that they had to laboriously enter to fulfil their curation needs, and so could use the equipment to aid in the curation of the data it produced.
Lynch took a lot of heart from the focus on education to give us a generation of data preservers and data curators. He was also heartened by comments that funding agencies were taking data curation seriously as part of the grant proposal and review processes. He also suggested it would be great if we could actually track the progress of this type of cultural shift.
In concluding, Lynch looked at the more speculative elements of the day's discussion, including the Citizen Science debate – referring to Liz Lyon's paper on the topic. However, he wanted us to recall that there is a whole range of computational and observational citizen science tasks, not just the survey-based BBC Lab UK model. He also reminded us that this is not just applicable to science... we are seeing the emergence of Citizen Humanities and amateur study in other areas, which has been revitalised by the web. What we need to look to is building data support for citizen scholarship as a whole.
Finally, Lynch offered a speculation involving the measure “scientific papers per minute”, which underscores how badly out of control scientific communication is and how huge a problem this creates when propagating and curating knowledge. It seems to Lynch that one of the things we need to recognise is that many of these papers don't need to be papers at all, but could be database submissions. This would be a better way to do things if we are going to manage the data – without the emphasis on the traditional individual-voice analysis paper. So what we need to have is a hard conversation about traditional forms of scientific communication and data curation, to determine how data curation fits into scholarly communication and how scholarly communication may need to change to help us manage the sheer volume of the output.
I caught up with Cliff just after his summary of day one to ask what he is looking forward to most from day two of IDCC 09...
“I am looking forward to hearing from Ed Seidel. Most of us in the States know that there are three more DataNet awards in their final stages, so we would love to know who has got the inside track on those... although I suspect he will say that he can't comment on that!
Following on from my summary, I would like to know what people think about how we can track the uptake of data curation in funding bids.
Having been involved in the paper review process, I know that the best peer-reviewed paper is very good, and there are some other great papers being presented tomorrow, so I am very much looking forward to it!”