The discussion then kicked across into a closed JISC repository discussion list. It's not fair to report that conversation here, but I can at least quote from one of my comments:
"Part of Andy's original point (he's been saying this a lot), was that the repository platforms we use (DSpace, ePrints et al) are not good web-natives. He did a second post looking at [an] ePrints-based site [...], which showed some improvement over DSpace, but some related issues. I'm pretty sure some of these issues are also part of what reduces the usefulness of these platforms for data sharing.
[...] Let's remember the list of problems that Andy spotted (in only a cursory examination; there may be more). He's at the DSpace splash page for my item:
- a) PDF instead of (or not as well as) HTML for the deposited item
- b) confusion over 4 different URIs
- c) wrong Title, so Delicious etc won't work well
- d) no embedded metadata in the HTML
- e) no cross-linking of keywords (cf Flickr tags)
- f) ditto author, publisher
- g) unhelpful link text for PDF and additional material etc.
I think if those responsible for these repositories were building them today as web sites, they would not have constructed them with (all of) those deficiencies. (Some are, I think, matters of design choice, or at least in the vanguard of web thinking, e.g. (e).) But the repository platform brings (or is intended to bring) advantages of workflow, scalability, additional features such as OAI-PMH if it is actually useful, etc. It's supposed to save the repository manager the effort of building a purpose-built web site, and to provide some consistency across different repositories, as opposed to the general confusion of University web sites.
So one approach, as I suggested in an earlier email, is simply to invest to improve the repository platforms, so that they are better web natives, using up-to-date technologies, and so present a more appropriate web presence. Many of those issues could be solved with some software effort, benefiting everyone who uses DSpace (and similar effort could improve ePrints etc). Just add money and time.
But the multi-URI thing is a bit more of a problem; it sounds like the repository model we use is conceptually broken here. Herbert's solution sounds like a major design change for repository world. Move from depositing items in a FRBR sense (I know Andy has some issues with this) to depositing something much more akin to works, instantiated as Resource Maps. I think this is sufficiently different that there would be some serious debate about it. It certainly sounds like it would make sense, although I'm not sure how much good it will do in Googlejuice terms (Google dropped the OAI-PMH maps, so no huge reason to believe they will be interested in OAI-ORE Resource Maps per se, although this may not be the issue).
It would be very difficult for repository managers to craft those Resource Maps by hand in today's world (although clearly Oxford is doing SOMETHING with them). However, it does sound like the sort of thing that repository platforms could do fairly automatically for you. But again there would be significant development work involved. So in this case, we first have an effort to refine Herbert's sketch to a shared view across the repository world, followed by an effort to implement in major platforms. So add money twice, and more time.
Of course, Herbert's probably built it by now!
So my proposal to JISC would be:
- invest in upgrading the common UK repository platform software for better web-nativeness (!)
- invest in an effort along the lines of previous SWAP/CRIG approaches to get consensus in the Resource Map approach
- invest in upgrading the common UK repository platform software to support this approach."
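Two of the deficiencies in Andy's list, the wrong Title (c) and the missing embedded metadata (d), are the kind of thing that can be checked mechanically. What follows is only a rough sketch of such a check, not anything DSpace or ePrints actually ships: the class and function names are made up for illustration, and it assumes that "embedded metadata" means Dublin Core style `<meta name="DC.…">` tags in the page head.

```python
from html.parser import HTMLParser


class SplashPageAudit(HTMLParser):
    """Collect the <title> text and named <meta> tags from a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name"):
            # Store meta names lower-cased so "DC.title" and "dc.title" compare equal.
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit(html, expected_title):
    """Report on problems (c) and (d) for one splash page."""
    parser = SplashPageAudit()
    parser.feed(html)
    parser.close()
    # Assumption: Dublin Core style meta names count as embedded metadata.
    has_dc = any(name.startswith(("dc.", "dcterms.")) for name in parser.meta)
    return {
        "title_matches": expected_title.lower() in parser.title.lower(),
        "has_embedded_metadata": has_dc,
    }
```

Calling `audit(page_html, "The title of my item")` on a splash page whose `<title>` is just the repository name and an item number would report both checks as failing, which is roughly the situation Andy described.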
OAI-ORE breaks googleJuice unless we serialise ORE as lists (e.g. foresite -> RSS, yes, but even more juicy if exposed as SiteMaps and browsable terms / breadcrumbs). This is how Oxford makes its resources look like Pac-Man pellets to Google spiders (amongst a dozen other things @oxfordben uses: COinS, unAPI, etc).
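To make the "expose it as a SiteMap" idea a little more concrete, here is a minimal sketch that flattens the resources in one aggregation into a Sitemaps-protocol `urlset`. It is not how foresite or Oxford does it; the function name is invented for illustration, and it assumes you already have the aggregated resource URLs as a plain list.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemaps protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def aggregation_to_sitemap(resource_urls):
    """Serialise a list of aggregated resource URLs as a sitemap document (hypothetical helper)."""
    # Emit the sitemap namespace as the default namespace rather than as "ns0:".
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for url in resource_urls:
        entry = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(entry, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```

Something like `aggregation_to_sitemap(["https://repository.example.org/item/1", "https://repository.example.org/item/1/paper.pdf"])` (placeholder URLs) would give a crawler a flat list to follow, whether or not it understands Resource Maps.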