ARROW Repositories day: 3

Dr Alex Cook from the Australian Research Council (a money man! Important!) talking on Excellence in Research for Australia (ERA), the Accessibility Framework and ASHER. ERA appears to be like the UK’s erstwhile RAE, and will use existing Higher Education Research Data Collection rules for publication and research income information where possible. 8 clusters of disciplines have been identified. They are currently looking at the bibliometric and other indicators, which will be discipline-specific (principles, methodologies and a matrix showing which are used where). They are developing the System to Evaluate the Excellence of Research (SEER), which will involve institutions uploading their data; sounds like something that MUST (and they say will) work with repositories (the RAE in the UK too often went for separate bibliographic databases, not interoperable, which meant doing the same thing twice). Copyright is still an issue (I wonder if they are brave enough to take an NIH mandate approach? See below… not so explicit, but some pressure that way). Where a research output is required for review purposes, institutions will be required to store and reference their Research Output Digital Asset. Repositories are a natural home for this.

The Accessibility Framework requires research outputs to be made sufficiently accessible to allow maximum use of the research. ASHER is a short-term (2 year?) funding stream to help institutions make their systems and repositories more suitable for working in this new framework.

Andrew Treloar talking about ANDS, the Australian National Data Service, and its implications for repositories. Mentions again the Australian Code for the Responsible Conduct of Research. Institutions need to think about their obligations under this code, obligations that are quite significant! Good story about the Hubble telescope: data must be made available at worst 6 months after capture; most of the published data from Hubble is not “first use”! (Would this frighten researchers? I put all the work in, but someone else just does the analysis and gets the credit?)

Structure of ANDS: developing frameworks (eg encouraging moves towards discipline-acceptable default sharing practices), providing utilities (building and delivering national technical services to support the data commons, eg discovery, persistent identifiers, and a collections registry), seeding the commons, and building capabilities, plus service development activities. The ISO 2146 information model is important here (I think I’ve already talked about this stuff in the iPres posts [update: no, it was from the e-Science All Hands Meeting, but the link still points to the right place]).

The Australian Strategic Roadmap Review talks about a national data fabric, based on institutional nodes, and a coordination component to integrate eResearch activities, and sees expertise as part of infrastructure. There’s also a review of the National Innovation System: ensure data get into repositories, try to get data more freely available.

Implications for repositories: ANDS dependent on repositories, but doesn’t fund repositories! May need range of repository types for storing data. Big R and little r repositories: real scale issues in data repositories. Lots of opportunities for groups to take part in consultation etc. Different but related discovery service. Links to Research Information Systems. Persistent Identifier Service (in collaboration with Digital Education Revolution). (I worry about this: surely persistence requires local commitment, rather than remote services?)

Panel discussion… question about CAIRS, the Consortium of Australian Institutional Repositories, funded with money left over from ARROW and being run by CAUL, the Council of Australian University Librarians, which now has an Invitation to Offer out to find someone to run the proposed service.

Some questions about data, and whether they should be in repositories or in filestore. Scale issues (numbers, size, rapidity of deposit and re-use etc) suggest filestore, but there are then maybe issues about integrity and authenticity. There will be ways of fixing these, but they imply new solutions that we don’t yet have. The data world will soon go beyond the current EPrints/DSpace/Fedora realms.
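To make the integrity worry a bit more concrete, here is a minimal sketch of one common approach: keep a checksum manifest next to the files and re-check it later. This is my own illustration, not something proposed in the session. The function names (`sha256_of`, `build_manifest`, `verify_manifest`) are made up for the example, and it only addresses integrity (has a file changed or gone missing?), not authenticity (who produced it, and can we trust them?).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks so large data files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file under root (by relative POSIX-style path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return a list of problems: files that are missing or whose content has changed."""
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {name}")
    return problems
```

In use, you would call `build_manifest` at deposit time, store the result somewhere other than the filestore itself, and run `verify_manifest` on a schedule. Note that this sketch does not report files added after the manifest was built, and at real data scales the cost of re-reading everything becomes one of the scale issues mentioned above.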

A few more questions, too hard to summarise, on issues such as the importance of early career scientists, on the nature of raw data etc. But overall, an extremely interesting day!


