Yesterday and today I have been at the JISC Repositories Conference in Manchester. This has turned out to be (at least in my small section, and the two plenaries so far) a much more interesting event than I expected. There has been a useful focus on the fringes of the repository movement, such as the role of data, and also an interesting re-exploration of what repositories are, and what they are for.
The first keynote was from Andy Powell of Eduserv (his slides are available on Slideshare), talking about some work that he and Rachel Heery did for JISC on a repositories roadmap. Drawing an interesting parallel with today’s GPS systems, which will give you a new route if you go wrong, he wanted to look at the recommendations they had made, one year later, and see if they still stood up. By and large, the answer seemed to be that they did; however, he still had criticisms. In part these related to an under-estimation of the role of technology in general and the web in particular, against an over-estimation of the role of policy and advocacy. The repository movement is much less successful than Web 2.0 style “equivalents” (sort of) like Flickr, Slideshare, Scribd, etc. Why? He felt it was because the technology of the latter was so much better; it afforded (not his word) what people wanted to do in a much better way than typical repositories do. So in response he wanted us to re-think repositories in terms of the web architecture, and to re-think scholarly communication as a social activity (in Web 2.0 terms). So we must build services that users choose to use, not try to make them use services that don’t work well for them!
BTW I had trouble with my presentation, done in PowerPoint on a Mac; I couldn’t connect the Mac to the conference centre projector, nor could I get either of the Windows machines to read the file on my memory stick. I mention this because, inspired by Andy, I registered with Slideshare and in a matter of minutes had managed to upload the presentation and have it converted into the Slideshare system. Not until after my (PowerPoint-free) presentation, however!
The second keynote was from Keith Jeffery from STFC (was CCLRC until March 2007). He said many sensible things about the relationship between eprint repositories and data repositories (keep them separate, they are trying to do different things), and the relationship of both to the CRIS (Current Research Information System). The latter is needed for research management and evaluation, the RAE etc. The interesting thing is that it contains a large amount of contextual information, which if captured automatically can make it much easier to create useful metadata cheaply. Building metadata capture into the end-to-end workflow is one of my interest areas, because the high cost of metadata is a significant barrier to the re-use of data, so I was pleased to hear this stated so coherently. Moreover he was implying that using the CERIF data model (from an EU project, I think), you can tie all these entities together in a way that makes harvesting and re-using them (à la Semantic Web) very much easier.
Two other things I’ll briefly mention. Liz Lyon spoke about her consultancy on Rights, Roles, Responsibilities and Relationships in dealing with data, which will be available as a report in a month or so. I won’t attempt to summarise, but it looks an interesting piece of work, and one well worth watching out for (truth in advertising: Liz is a DCC colleague, based at Bath, where she is also, and much more importantly, Director of UKOLN).
The other particularly interesting paper, one that really caught my attention (on about the third hearing, I guess), was Simon Coles talking about the R4L (Repository for the Laboratory) project. There are so many interesting features of this project: the use of a private repository to keep intermediate results; the use of health and safety plans to provide useful metadata; the (possibly generic) probity service where you can register your claim to a discovery; the use of both a human blogging service (so interim results can be easily discussed within the group) and a machine blogging service (so autonomous machines continue to blog their results, unattended, for perhaps days); the ability to annotate results with scribbles, sketches and other annotations; the idea of a structured data report based on a template, from which data for publication can be extracted (and which could form the basis of a citable data object); and the various sustainability options they have in place. This struck me again as one of the most interesting projects (or perhaps family of projects, along with Smart Tea and the eBank group) around.