Monday, 7 December 2009

IDCC 09: Prof. Carole Palmer - "The Data Conservancy: A Digital Resource & Curation Virtual Organisation"

Professor Carole Palmer introduced the Data Conservancy, which was “cooked up” at the IDCC when it was held in Glasgow. The Data Conservancy asserts that research libraries are a core part of the emerging distributed network of data collections and services.

Palmer noted that there is not really an adequate analogy for data services yet (are data sets the new library stacks? Or the new special collections?) but emphasised that data collections and services are consistent with the research library mission.

The Data Conservancy is a diverse group of domain and data scientists, enterprise experts, librarians and engineers. Palmer introduced the range of partners involved in the project, and then discussed how they plan to move forward in a very “non-rigid way”, learning to build principles of navigation and to understand how large the solution space actually is – with technical solutions being only a small part of that. She also noted that an NSF report on how successful infrastructure evolves has inspired the group.

Their goals align with the original programme call for DataNet. They are going to collect, organise, validate and preserve data as part of a data curation strategy, as required by the call. They are also going to examine how to bring data together to address the grand research challenges that society is currently facing. However, the strategy is to connect systems to infrastructure and to be highly informed by user-centred design. They found it was very important to build on existing exemplar projects and to engage with communities that already have deep involvement with scientists.

Palmer took us through diagrams showing who they intend to support and how responsibilities are divided among the teams within the project. They are trying to strike a balance between research and implementation – which is a requirement of DataNet.

The Data Conservancy believes in a flexible architecture, but this has to support a wide range of requirements, data and uses that they have across their constituencies. As a research library project, they are committed to bringing data in, but Palmer noted that not all research libraries can or should do this.

A big part of their project has to do with building a data framework, so they are thinking a lot about the notion of the “scientific observation” as a common concept across scientific disciplines. They will be examining existing models and building on this work. In particular, Palmer talked us through an ORE resource map, noted the need to link data to literature, and explained that, as libraries, they are well positioned to work in this area and improve upon such models.

The launch pad for the project is data from astronomy – specifically the Sloan Digital Sky Survey, which is almost 3 times bigger than all the data held at Johns Hopkins University, and so presents a big initial problem in terms of scale. After a deep study of the history of astronomical research processes, they will take what they learn from working with this core community and apply it to other areas, including Life Sciences, Earth Science and Social Science.

As part of her presentation, Palmer gave us an overview of the types of projects they are involved with and how they intend to start interfacing between these projects. She also explained more about their work at Illinois, as a number of her colleagues from Illinois were present at the conference. Their work has noted that it is not just the big instrument-driven science that will drive this forward, but also smaller science projects. They are also working to understand how they can determine, early on, the long-term potential for data. The IDCC will be hosted at Illinois in 2010, and will be followed by a partner summit for DataNet projects as they move forward.

To conclude, Palmer discussed the education element of their work, which includes a data curation specialisation in the Master of Science, with the third class running this semester and involving 31 students. The Data Conservancy is expected to infuse teaching practices and help to educate a more diverse range of students. She showed us a slide demonstrating the strategy for building the new workforce at Illinois, with the Data Conservancy working across the various areas.

There are lots of connections between the Data Conservancy and research groups, so Palmer is looking forward to sharing results, work practices and ideas as they move forward with DataNet.


