Friday, 20 March 2009

International Repositories Infrastructure workshop

Amsterdam in Spring, who could turn down the offer? Perhaps it would be irresistible a little later in Spring than early March (brrr), but when the sun did come out, and the workshop was done, it was lovely. I was in Amsterdam for a curious International workshop on repository infrastructure, funded by JISC and SURF, with the DRIVER project. It turns out I had no idea what repository infrastructure meant before I went, and I guess I know only a little more now.

I was asked to take part in this workshop late last year, and was supposed to stimulate discussion on a use case on preservation. There were 4 use cases:
  • Preservation
  • Access
  • Deposit Workflows and Environments
  • Online reputation and reporting
Maybe some of us didn’t do enough work on this, as somewhere along the line this morphed into 4 proposed action plans, each with a breakout group:
  • Identifier infrastructure
  • Citation Services
  • Repository handshake
  • Organisation Structures
The first 3 of those probably fit with the last 2 use cases, but I don’t know what happened to the others, or where the last one came from.

If you use Twitter, you can read (in reverse order) some of the backchat on this through the hashtags #repinf09 (for the whole workshop), #rihs09 (for the handshake group) and #reporg09 (for the organization group). If you do have a look, bear in mind that what you read is devoid of context, and represents only a small part of what was going on, probably from a vocal, cantankerous, ornery, rumbustious and maybe just plain rude subset! (As a new Twit, I haven’t yet worked out how much ruder tweeting during a meeting is, compared with passing notes in class, say.) Oh, and not all the Twits were actually there.

So what happened? The introductory presentation was given by Norbert Lossau of DRIVER. Lots of interesting stuff, but I was a bit taken aback by claims that the defining vision of repositories was the Berlin Open Access Declaration, and that data were out of scope. Personally I think Open Access is most often a Good Thing, but sometimes it’s inappropriate: repositories of sensitive data, for instance. And I’m not at all sure that you can cleanly distinguish document from data, or that it makes sense to do so, especially when supplementary materials and extended documents come into the frame.

I was in the organisation group, so I don’t know in detail what happened in the other groups, but we did come together right at the end for a plenary. I was particularly impressed by what Les Carr said about the Citation Group, including the idea of creating a test corpus, and also a competition for the best text mining algorithms to find citations and references (and even their surrounding context: “we illustrate the errors in Rusbridge (2006)” is clearly different from “following the excellent suggestions given in Carr (2009)”). I don’t think much was said about the citation microformat (perhaps it really is dead?), nor about whether text mining might be aided by embedding RDF etc in documents. Nevertheless, a sensible plan of work was laid out.

On identifiers, it also sounded as if some sensible progress was made. In particular, it seemed as if identifiers for authors (as disambiguation tools for citations etc) went from being ruled out to being a strong part of the work plan. They had the ultimately cool mindmap.

From the repository handshake tweets, it looked like they had a bumpy ride getting their 8 use cases agreed, but did get through their “stormin’, normin’ and formin’” stages into at least the first part of “performin’”. I did like “beg” as a repository verb!

So to the repository organisation group. Here too, there were plenty of storms, and maybe the odd teacup. It was quite hard to work out what sort of organization we were interested in; what it was intended to do. There was a strong feeling of unspoken sub-text; whatever it was we were talking about was proposed to launch on October 7. One of the best quotes for me came from Sandy Payette; she said the DSpace/FEDORA organizations could provide some of the things that were being talked about, but they would have to “feel the hunger”. It was difficult to feel the hunger for whatever this thing was. At one point, I’m afraid I compared it unfavourably with jelly; whenever I thought I had hold of it, something bulged out somewhere else (yucky image, sorry).

At the start of day 2 our organisers had a new approach, which worked better. They broke the breakout into yet smaller breakouts, and gave each of us a role to play and 6 questions to answer. My table was “funders”. That’s OK, I’ve been one, I could handle that (I even funded repository work back in 1998 or thereabouts, think Harnad’s CogPrints and Krichel’s WoPEc/RePEc, which made it particularly annoying to be patronized at one point as someone who obviously didn’t know much about repositories). I think the two main “expectations” for our group were “clarity of aims” and “benefits to justify investment”. Anyway, the result of all this was that, by the skin of our teeth, we did have something to say at the final plenary. Not that I yet know quite what this organization would be for!

Was it entirely coincidental in this context that the DRIVER project was ending in a few months? Well, we had heard that DRIVER did repository infrastructure, that DRIVER did published papers, not data, that the DRIVER Guidelines enabled interoperability, that the DRIVER Guidelines were in demand internationally and had been translated into (2? 3?) other languages, and that the continuation of the DRIVER brand was seen as important. Some Twit asked “If the DRIVER Guidelines are the answer, what is the question?” Join the dots…

The workshop ended with some closing remarks from Cliff Lynch, as perspicacious as ever, but slightly more hoarse than usual (not from shouting at us; too much inter-planetary travel, I suspect). And after the workshop there was a “funders’ meeting”. Real funders, the kind with money. What went on there, I have no idea!

