<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-1303975371294158246</id><updated>2011-12-20T08:08:49.339Z</updated><category term='SWORD'/><category term='Format registries'/><category term='TDR'/><category term='Disaster recovery'/><category term='tools'/><category term='UKOLN30'/><category term='life-cycle'/><category term='#lotf09'/><category term='PASIG'/><category term='Infrastructure'/><category term='RDFa'/><category term='Preservation'/><category term='Social bookmarking'/><category term='Curation'/><category term='Permanent Access'/><category term='IDCC4'/><category term='RRS'/><category term='sneep'/><category term='JISC'/><category term='sustainability'/><category term='Research data'/><category term='Digital Curation'/><category term='Authenticity'/><category term='Rendering'/><category term='Long Term'/><category term='Backup'/><category term='Movage'/><category term='CNI'/><category term='Social media'/><category term='Institutional Repositories'/><category term='Workflows'/><category term='Research project'/><category term='Significant Properties'/><category term='Cool URIs'/><category term='XAM'/><category term='Semantic web'/><category term='Knowledge base'/><category term='Authoring support'/><category term='training'/><category term='Data Re-use'/><category term='OR08'/><category term='supplementary materials'/><category term='curated databases'/><category term='Data Services'/><category term='Persistent IDs'/><category term='Nature'/><category term='#idcc09'/><category term='Digital Preservation'/><category term='RDF'/><category term='RepInf09'/><category term='Disclosure control'/><category 
term='PDF'/><category term='Obsolete media'/><category term='Long Term Accessibility'/><category term='JHOVE2'/><category term='Data publishing'/><category term='Sun-PASIG'/><category term='UKRDS'/><category term='Laboratory Repositories'/><category term='XML'/><category term='Positive value'/><category term='AHDS'/><category term='Citation'/><category term='CRIS'/><category term='Compression'/><category term='microformats'/><category term='AHM08'/><category term='BRTF-SDPA'/><category term='Scolarly HTML'/><category term='Designated Community'/><category term='Email lists'/><category term='Disk errors'/><category term='iPres09'/><category term='Collaboration'/><category term='adding value'/><category term='Revisability'/><category term='ORE'/><category term='JIF08'/><category term='Open Access'/><category term='Science publishing'/><category term='UKDA'/><category term='Engineering data'/><category term='PRONOM'/><category term='eScience'/><category term='archiving images'/><category term='Data sharing'/><category term='FRBR'/><category term='Blog'/><category term='Live Streaming'/><category term='Twitter'/><category term='Open Science'/><category term='Optical drives'/><category term='Image formats'/><category term='skills'/><category term='trust'/><category term='Load test'/><category term='Panton Principles'/><category term='data corruption'/><category term='Statistics'/><category term='Biocuration'/><category term='Geospatial data'/><category term='Representation Information'/><category term='data curation'/><category term='Security'/><category term='Legal issues'/><category term='IDCC09'/><category term='Subject Repositories'/><category term='Persistent storage'/><category term='Archives'/><category term='Libraries'/><category term='Forum'/><category term='Linked Data'/><category term='Text Mining'/><category term='Open formats'/><category term='﻿Authoring support'/><category term='survey'/><category term='deposit formats'/><category 
term='Conference'/><category term='Clouds'/><category term='eJournals'/><category term='Health studies'/><category term='DOI'/><category term='Identity management'/><category term='Provenance'/><category term='AHRC'/><category term='Negative click'/><category term='Data management'/><category term='Licensing'/><category term='Data audit'/><category term='eResearch'/><category term='Metadata'/><category term='Open Data'/><category term='scale'/><category term='digital records'/><category term='iPres'/><category term='Migration'/><category term='IJDC'/><category term='Preservation formats'/><category term='indexing'/><category term='Fun'/><category term='DPC'/><category term='Web 2.0'/><category term='Archival media'/><category term='Open Source'/><category term='publishing'/><category term='Web preservation'/><category term='copyright'/><category term='NLM DTD'/><category term='IDCC3'/><category term='Data'/><category term='iPres-2008'/><category term='Obsolescence scale'/><category term='Qualitative data'/><category term='Data Recovery'/><category term='Technical Specifications'/><category term='Research Repository System'/><category term='Databases'/><category term='legacy formats'/><category term='AdaLovelaceDay09'/><category term='OAIS'/><category term='Repositories'/><category term='Science Commons'/><category term='DCC'/><title type='text'>Digital Curation Blog</title><subtitle type='html'>Blog inspired by the Digital Curation Centre to discuss issues relating to the curation and long term preservation of digital science and research data.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/'/><link rel='hub' 
href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default?start-index=101&amp;max-results=100'/><author><name>Kevin Ashley</name><uri>http://www.blogger.com/profile/15371869767865369079</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>281</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7121489494883418209</id><published>2011-10-06T16:04:00.006+01:00</published><updated>2011-10-06T17:56:01.860+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='JISC'/><category scheme='http://www.blogger.com/atom/ns#' term='DPC'/><category scheme='http://www.blogger.com/atom/ns#' term='Web preservation'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><title type='text'>Thoughts before "The Future of the Past of the Web"</title><content type='html'>Tomorrow I'm going to be in London for a joint JISC/DPC event on web archiving, "&lt;a href="http://www.jisc.ac.uk/events/2011/10/futureoftheweb.aspx"&gt;The Future of the Past of the Web&lt;/a&gt;" (hashtag #fpw11 if you're so inclined.) It's the third in an occasional series; I gave the closing presentation at &lt;a href="http://www.dpconline.org/events/previous-events/425-missing-links-the-enduring-web"&gt;the second event&lt;/a&gt; and I have been asked to be on a closing panel this time round. One of the things we've been asked to reflect on is what changes have taken place since the last event and how far our expectations have been realised. 
I thought it would be useful to set my thoughts on this down in advance, partly to help me articulate my own thinking. It will be interesting to see how various views develop during the panel session tomorrow.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://4.bp.blogspot.com/-PGTDrZUXVug/To3cqOrILjI/AAAAAAAAAAQ/uQiTg1QRY7g/s1600/cybergeography.PNG"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 288px; height: 320px;" src="http://4.bp.blogspot.com/-PGTDrZUXVug/To3cqOrILjI/AAAAAAAAAAQ/uQiTg1QRY7g/s320/cybergeography.PNG" border="0" alt="Image Courtesy Martin Dodge's Cybergeography collection" id="BLOGGER_PHOTO_ID_5660422924726185522" caption="From Martin Dodge's Cybergeography collection"/&gt;&lt;/a&gt;&lt;br /&gt;Looking back at my concerns in mid-2009 I'm greatly reassured. There were a number of worrying trends apparent in web archives at that time and an apparent lack of bold vision in how we might use web archives in the future - or even in the present. My fear was that the collecting policies, preservation policies and interfaces offered were all taking a very human and document-centric view of what a web archive should do. In OAIS terms, the Designated Community was people who wanted to view individual old web pages having done a search for a particular site, or possibly for a keyword of some sort. The National Archives had taken one incremental but powerful step beyond that, automatically linking archived web pages to 404 pages on government web sites via simple plugins for Apache &amp; IIS, but in the end this still involved serving individual pages for people to read.&lt;br /&gt;&lt;br /&gt;That's a valid use case, but by no means the only one. 
I set out a few other things we might want to be able to do but could not with the interfaces that web archives gave us.&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;What search results would we have got on the web of 1998 using the search engines of 1998?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;What results would we have got using current search engines on the web of 1998?&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;How can we visualise the set of links to or from a particular site changing over time?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Treating the web as a corpus of text over time, how can we track the emergence of words or concepts and their movement from specialist vocabulary to general use?&lt;/li&gt;&lt;br /&gt;&lt;li&gt;As historians of technology, how can we use a web archive to track things like the emergence of PNG as an image format and the decline of XPM (the original icon format for graphical browsers such as Mosaic)?&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt; I also wanted to show how open APIs or RESTful interfaces can allow others to develop innovative ways to view content. Since there weren't any web archives with such interfaces I fell back on demonstrating the point with Flickr, more particularly with the simple visual beauty that is &lt;a href="http://www.taggalaxy.de/"&gt;TagGalaxy&lt;/a&gt;. TagGalaxy shows how the ability to search and retrieve images and tags lets someone else build a completely different interface to the Flickr repository, one which minimises textual interaction and which encourages serendipitous discovery. It would have been wonderful to be able to do that with a web archive. 
Similarly, if &lt;a href="http://ukwebfocus.wordpress.com/"&gt;Brian Kelly&lt;/a&gt; had been able to say to the Internet Archive 'give me all the versions of the home page of the University of Bath between these dates' in a single interaction, it would have been much easier for him to build the informative animation he used in his own presentations for &lt;a href="http://jiscpowr.jiscinvolve.org/"&gt;JISC PoWR&lt;/a&gt;. I could go on, and at the time I did.&lt;br /&gt;&lt;br /&gt;Much of what I hoped for then has happened. The architecture of Memento makes it straightforward to view collections of web archives as a single entity from some viewpoints. Projects funded by "&lt;a href="http://www.diggingintodata.org/"&gt;Digging Into Data&lt;/a&gt;" have shown the power of large web collections in viewing the web as data at many levels. And although most (all?) web archives are not yet offering the APIs or interfaces that would permit us to do some of the things above, I think they at least accept that these are valid aspirations.&lt;br /&gt;&lt;br /&gt;Moreover, web archiving has moved from being a specialist concern to something that appears in the &lt;a href="http://jiscpowr.jiscinvolve.org/wp/2010/01/12/web-archiving-in-the-wider-world/"&gt;letters pages of national newspapers&lt;/a&gt;. That, and the type of talks we're going to hear tomorrow, show how far we've moved in 2 1/2 years. 
I'm quietly confident that things are getting better.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7121489494883418209?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7121489494883418209/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2011/10/thoughts-before-future-of-past-of-web.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7121489494883418209'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7121489494883418209'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2011/10/thoughts-before-future-of-past-of-web.html' title='Thoughts before &quot;The Future of the Past of the Web&quot;'/><author><name>Kevin Ashley</name><uri>http://www.blogger.com/profile/15371869767865369079</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/-PGTDrZUXVug/To3cqOrILjI/AAAAAAAAAAQ/uQiTg1QRY7g/s72-c/cybergeography.PNG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1356173642452371244</id><published>2010-04-30T09:50:00.003+01:00</published><updated>2010-04-30T14:16:24.408+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Social media'/><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' term='survey'/><title type='text'>DCC survey on 
social media use</title><content type='html'>The DCC is currently trying to gather information on the use of social media by those looking at research data management issues. It's already been publicised through a number of routes, so you may already be aware of it. If not, please give us 5 minutes of your time (yes, really 5 minutes - perhaps even less!) to answer a few questions on the survey page we set up:&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.surveymonkey.com/s/8KXJDMW"&gt;http://www.surveymonkey.com/s/8KXJDMW&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There's no need to identify yourself and we'll only be using the data in aggregate form.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1356173642452371244?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1356173642452371244/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/04/dcc-survey-on-social-media-use.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1356173642452371244'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1356173642452371244'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/04/dcc-survey-on-social-media-use.html' title='DCC survey on social media use'/><author><name>Kevin Ashley</name><uri>http://www.blogger.com/profile/15371869767865369079</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-6530378811022853458</id><published>2010-03-31T10:29:00.002+01:00</published><updated>2010-03-31T10:55:53.786+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Linked Data'/><category scheme='http://www.blogger.com/atom/ns#' term='curated databases'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Re-use'/><title type='text'>Linked Data and Reality</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;I have a copy of the really interesting book “Data and Reality” by William Kent. It’s interesting at several levels; first published in 1978, this appears to be a “print-on-demand” version of the second edition from 1987. Its imprint page simply says “Copyright © 1998, 2000 by William Kent”. &lt;/p&gt;&lt;p class="MsoNormal"&gt;The book is full of really scary ways in which the ambiguity of language can cause problems for what Kent often calls “data processing systems”. He quotes Metaxides:&lt;/p&gt;&lt;blockquote&gt; “Entities are a state of mind. No two people agree on what the real world view is”&lt;/blockquote&gt; Here’s an example of Kent from the first page: &lt;blockquote&gt;“Becoming an expert in data structures is… not of much value if the thoughts you want to express are all muddled”&lt;/blockquote&gt;But it soon becomes clear that most of us are all too easily muddled, at least when&lt;blockquote&gt; “... the thing that makes computers so hard is not their complexity, but their utter simplicity… [possessing] incredibly little ordinary intelligence”&lt;/blockquote&gt;I do commend this book to those (like me) who haven’t had formal training in data structures and modelling. 
&lt;p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;I was reminded of this book by the very interesting attempt by Brian Kelly to find out whether Linked Data could be used to answer a fairly simple question. His &lt;a href="http://ukwebfocus.wordpress.com/2010/02/12/a-challenge-to-linked-data-developers/"&gt;challenge&lt;/a&gt; was ‘to make use of the data stored in &lt;a href="http://dbpedia.org/About"&gt;DBpedia&lt;/a&gt; (which is harvested from &lt;a href="http://www.wikipedia.org/"&gt;Wikipedia&lt;/a&gt;) to answer the query&lt;span style="font-size:11.0pt;font-family:Verdana;color:#0E0E0E;mso-ansi-language:EN-US"&gt; &lt;/span&gt;&lt;i&gt;&lt;/i&gt;&lt;/p&gt;&lt;i&gt;&lt;blockquote&gt;“Which town or city in the UK has the highest proportion of students?"&lt;/blockquote&gt;&lt;/i&gt;&lt;span style="font-style:normal"&gt;He has written some further posts on the process of &lt;a href="http://ukwebfocus.wordpress.com/2010/02/19/response-to-my-linked-data-challenge/"&gt;answering&lt;/a&gt; the query, and attempting to &lt;a href="http://ukwebfocus.wordpress.com/2010/02/24/approaches-to-debugging-the-dbpedia-query/"&gt;debug&lt;/a&gt; the results.&lt;/span&gt;&lt;p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So what was the answer? The query produced the answer Cambridge. That’s a little surprising, but for a while you might convince yourself it’s right; after all, it’s not a large town and it has 2 universities based there. The table of results shows the student population as 38,696, while the population of the town is… hang on… 12? So there are roughly 3,224 students per resident, a proportion of over 322,000%. Yes, something is clearly wrong here, and Brian goes on to investigate a bit more. No clear answer yet, although it begins to look as if the process of going from Wikipedia to DBpedia might be involved. Specifically, Wikipedia gives (gave, it might have changed) “three population counts: the district and city population (122,800), urban population (130,000), and county population (752,900)”. 
But querying DBpedia gave him “&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;three values for population: 12, 73 and 752,900”.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;There is of course something faintly alarming about this. What’s the point of Linked Data if it can so easily produce such stupid results? Or worse, produce seriously wrong but not quite so obviously stupid results? But in the end, I don’t think this is the right reaction. If we care about our queries, we should care about our sources; we should use curated resources that we can trust. Resources from, say… the UK&lt;span style="mso-spacerun: yes"&gt;  &lt;/span&gt;government? &lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;And that’s what &lt;/span&gt;&lt;span lang="EN-US"&gt;&lt;a href="http://kitwallace.posterous.com/university-problem-again"&gt;Chris Wallace has done&lt;/a&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;. He used pretty reliable data (although the Guardian’s in there somewhere ;-), and built a robust query. He really knows what he’s doing. And the answer is… drum roll… Milton Keynes!&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;I have to admit I’d been worrying a bit about this outcome. For non-Brits, Milton Keynes is a New Town north west of London with a collection of &lt;/span&gt;&lt;span lang="EN-US"&gt;&lt;a href="http://en.wikipedia.org/wiki/Concrete_Cows"&gt;concrete cows&lt;/a&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;, more roundabouts than anywhere (except possibly Swindon, but that’s another story), and some impeccable transport connections. 
It’s also home to Britain’s largest University, the &lt;/span&gt;&lt;span lang="EN-US"&gt;&lt;a href="http://www.open.ac.uk/"&gt;Open University&lt;/a&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;. The trouble is, very few of those students live in Milton Keynes, or even come to visit for any length of time (just the odd Summer School), as the OU operates almost entirely by distance learning. So if you read the query as “Which town or city in the UK is home to one or more universities whose registered students divided by the local population gives the largest percentage?”, then it would be fine.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;And hang on again. I just made an explicit transition there that has been implicit so far. We’ve been talking about students, and I’ve turned that into university students. We can be pretty sure that’s what Brian meant, but it’s not what he asked. If you start to include primary and secondary school students, I couldn’t guess which town you’d end up with (and it might even be Milton Keynes, with a youngish population).&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;My sense of Brian’s question is “Which town or city in the UK is home to one or more university campuses whose registered full or part time (non-distance) students divided by the local population gives the largest percentage?”. Or something like that (remember Metaxides, above). Go on, have a go at expressing your own version more precisely!&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;The point is, these things are hard. Understanding your data structures and their semantics, understanding the actual data and their provenance, understanding your questions, expressing them really clearly: these are hard things. 
That’s why informatics takes years to learn properly. Why people worry about how the &lt;/span&gt;&lt;span lang="EN-US"&gt;&lt;a href="http://www.w3.org/Submission/2010/SUBM-vcard-rdf-20100120"&gt;parameters in a VCard should be expressed in RDF&lt;/a&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;. It matters, and you can mess up if you get it wrong.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;People sometimes say there’s so much dross and rubbish on the Internet, that searches such as Google provides are no good. But in fact with text, the human reader is mostly extraordinarily good at distinguishing dross from diamonds. A couple of side searches will usually clear up any doubts.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;But people don’t do data well. Automated systems do, SPARQL queries do. We ought to remember a lot more from William Kent, about the ambiguities of concepts, but especially that bit about&lt;span style="mso-spacerun: yes"&gt;  &lt;/span&gt;computers possessing incredibly little ordinary intelligence. 
I’m beginning to worry that Linked Data may be slightly dangerous except for very well-designed systems and very smart people…&lt;/span&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-6530378811022853458?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/6530378811022853458/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/03/linked-data-and-reality.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6530378811022853458'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6530378811022853458'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/03/linked-data-and-reality.html' title='Linked Data and Reality'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-876721428839709277</id><published>2010-03-09T22:50:00.003Z</published><updated>2010-03-09T23:01:36.601Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Science Commons'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Data'/><category scheme='http://www.blogger.com/atom/ns#' term='Panton Principles'/><title type='text'>When data shouldn’t be open?</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;There is a big momentum these days about data being accessible, 
available, and re-usable. Increasingly people want open data; Science Commons have been recommending using CC0 to make the fully open status of data clear. More recently the &lt;a href="http://pantonprinciples.org/"&gt;Panton Principle&lt;/a&gt;s start:&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p class="MsoNormal"&gt;“Science is based on building on, reusing and openly criticising the published body of scientific knowledge.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;For science to effectively function, and for society to reap the full benefits from scientific endeavours, it is crucial that science data be made &lt;a href="http://opendefinition.org/"&gt;&lt;b&gt;open&lt;/b&gt;&lt;/a&gt;.”&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p class="MsoNormal"&gt;We’ve been big fans of Open Access at the DCC since its early days. We use a Creative Commons licence for our content by default. This blog was one of the earliest to be specific about a Creative Commons licence not only for the core text that we write, but also for the comments that you might add here.&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So we strongly support the Open Data approach… where possible. For of course in some areas of science and research, there are data that cannot be open. Usually this is because the data are sensitive. They could be personal data, protected under Data Protection laws. Sensitive personal data (such as medical record data) has extra requirements under those laws. They could be financial microdata, commercially sensitive. Or perhaps data with strong commercial exploitation potential. They could be anthropological data, sensitive through cultural requirements. 
Research needs to go anywhere, whatever the issues; we can’t be constrained to only research where the data can be open.&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So perhaps it’s as simple as that: some science should have open data, and some should have closed data?&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Well, maybe not. Because the underlying issue of the Panton Principles must still apply. Research should be verifiable, whether through repeatable experiments or through re-analysable data. Unverifiable research is, well, unreliable, perhaps indistinguishable from fraud. Some access is needed; perhaps we should think of even sensitive data as Less Open Data rather than closed data.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So how do you go about dealing with sensitive data? Keep it secure, transfer securely, provide access under strict licences and controls in data enclaves, aggregate, de-identify, anonymise: there are plenty of tricks in the book. That’s the topic of the 4&lt;sup&gt;th&lt;/sup&gt; Research Data Management Forum starting tomorrow in Manchester. 
I hope to have more to write later about what we learn.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-876721428839709277?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/876721428839709277/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/03/when-data-shouldnt-be-open.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/876721428839709277'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/876721428839709277'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/03/when-data-shouldnt-be-open.html' title='When data shouldn’t be open?'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-6686784291526668946</id><published>2010-03-09T11:21:00.003Z</published><updated>2010-03-09T11:33:02.691Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='sustainability'/><category scheme='http://www.blogger.com/atom/ns#' term='BRTF-SDPA'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><title type='text'>A Blue Ribbon for Sustainability?</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;When we talk 
about long term digital preservation, about access for the future, about the digital records of science, or of government, or of companies, or the designs of ships or aircraft, the locations of toxic wastes, and so on being accessible for tens or hundreds of years, we are often whistling in the dark to keep the bogeys at bay. These things are all possible, and increasingly we know how to achieve them technically. But much more than non-digital forms, the digital record needs to be continuously sustained, and we just don’t know how to assure that. Providing future access to digital records needs action now and into that future to provide a continuous flow of the necessary will, community participation, energy and (not least) money. Future access requires a sustainable infrastructure. Ensuring sustainability is one of the major unsolved problems in providing future access through digital preservation.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;For the past two years I have been lucky enough to be a member of the grandly named &lt;a href="http://brtf.sdsc.edu/"&gt;Blue Ribbon Task Force on Sustainable Digital Preservation and Access&lt;/a&gt;, along with a stellar cast of experts in preservation, in the library and archives worlds, in data, in movies… and in economics. Co-chaired by Fran Berman (previously of SDSC, now of RPI) and Brian Lavoie of OCLC, the Task Force produced an &lt;a href="http://brtf.sdsc.edu/biblio/BRTF_Interim_Report.pdf"&gt;Interim Report&lt;/a&gt; (PDF) a year ago, and has just released its &lt;a href="http://brtf.sdsc.edu/biblio/BRTF_Final_Report.pdf"&gt;Final Report&lt;/a&gt; (Sustainable Economics for a Digital Planet: Ensuring Long-Term Access to Digital Information, also PDF). (The Task Force was itself sustained by an equally stellar cast of sponsors, including the US National Science Foundation and the Andrew W. 
Mellon Foundation, in partnership with the Library of Congress, the UK’s JISC, the Council on Library and Information Resources, and NARA.)&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Sustainability is often equated to keeping up the money supply, but we think it’s much more than that. The Task Force specifically looks at economic sustainability; it says early in the Executive Summary that it’s about&lt;/p&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p class="MsoNormal" style="mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;“&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;… mobilizing resources—human, technical, and financial—across a spectrum of stakeholders diffuse over both space and time.”&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p class="MsoNormal"&gt;If you want a FAQ on funding your project over the long term you won’t find it here. Nor will you find a list of benefactors, or pointers to tax breaks, or arguments for your Provost. Instead you should find a report that helps you think in new ways about sustainability, and apply that new thinking to your particular domain. For one of our major conclusions is that there are no general, across the board answers.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;One of the great things about this Task Force was its sweeping ambition. Not just content with bringing together a new economics of sustainable digital preservation, but thinking so broadly. This was never about some few resources, or this Repository or that Archive, it was about the preservation and long term access of major areas of our intellectual life, like scholarly communication, like research data, like commercially owned cultural content (the movie industry is part of this), and the blogosphere and variants (collectively produced web content). 
Looking at those four areas holistically rather than as fragments forced us to recognise how different they are, and how much those differences affect their sustainability. They aren’t the only areas, and indeed further work on other areas would be valuable, but they were enough to make the Task Force think differently from any activity I have taken part in before.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The report is, to my mind, exceedingly well written, thanks to Abby Smith Rumsey; it far exceeds the many rather muddled conversations we had during our investigations. It has many quotable quotes; among my favourites is&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p class="MsoNormal"&gt;“When making the case for preservation, make the case for use.”&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p class="MsoNormal"&gt;Reading the report is not without its challenges, as you might expect. It has to marry two technical vocabularies and make them understandable to both communities. I’ve been living partly in this world for two years, and still sometimes stumble over it; I remember many times screwing up my forehead, raising my hand and asking “Tell us again, what’s a choice variable?” And the reader will have to think about things like derived demand for depreciable durable assets, nonrival in consumption, temporally dynamic and path-dependent, not to mention the free rider problem. These concepts are there for a reason however; get them straight and you’ll understand the game a lot better.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;And there are not surprisingly big underlying US-based assumptions in places, although the two resident Brits (myself and Paul Ayris of UCL) did manage to inject some internationalism. 
Further work grounded in other jurisdictions would be extremely valuable.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Overall I don’t think this report is too big an ask for anyone anywhere who is serious about understanding the economic sustainability of digital preservation and future access to digital materials. I hope you find the great value that I believe exists here.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-6686784291526668946?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/6686784291526668946/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/03/blue-ribbon-for-sustainability.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6686784291526668946'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6686784291526668946'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/03/blue-ribbon-for-sustainability.html' title='A Blue Ribbon for Sustainability?'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2086137967347287930</id><published>2010-03-01T20:23:00.004Z</published><updated>2010-03-01T20:39:03.907Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='DCC'/><title type='text'>DCC: A new phase, a new perspective, a new Director</title><content type='html'>&lt;div&gt;As the DCC begins 
its third phase today, I am delighted to announce the appointment of our new Director, Kevin Ashley, who will succeed me upon my retirement in April 2010.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Kevin Ashley has been Head of Digital Archives at the University of London Computer Centre (ULCC) since 1997, during which time his multi-disciplinary group has provided services related to the preservation and reusability of digital resources on behalf of other organisations, as well as conducting research, development and training.  The group has operated the National Digital Archive of Datasets for The National Archives of the UK for over twelve years, delivering customised digital repository services to a range of organisations.  As a member of the JISC's Infrastructure and Resources Committee, the Advisory Council for ERPANET, plus several advisory boards for data and archives projects and services, Kevin has contributed widely to the research information community.  As a firm and trusted proponent of the DCC we look forward to his energetic leadership in this new phase of our evolution.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So far so press release. But I'd go further. I can't tell you how pleased I am with this appointment. As some readers will know, I have personally lobbied all and any potential candidates for this post since before I officially announced I was leaving. I understand we had some excellent candidates (I wasn't directly involved), more than one of whom might have made an excellent Director. But I'm particularly pleased at Kevin's appointment for several reasons: he is well engaged in the community (including good connections with JISC, our major funder), he's tough enough to keep this tricky collaboration thing going, he has an excellent technical understanding, and he has great experience of actually managing this stuff in all its crusty awfulness. 
I particularly remember his discussion (on a visit to the Edinburgh Informatics Database Group) about issues like how best to deal with an archived dataset where they came across the characters "five" in a field defined as numeric! You can make it work or make it a record but not both...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So congratulations Kevin, and good luck!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2086137967347287930?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2086137967347287930/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/03/dcc-new-phase-new-perspective-new.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2086137967347287930'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2086137967347287930'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/03/dcc-new-phase-new-perspective-new.html' title='DCC: A new phase, a new perspective, a new Director'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-486953703120520989</id><published>2010-02-04T19:46:00.005Z</published><updated>2010-02-04T20:06:32.523Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='DOI'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Citation'/><category scheme='http://www.blogger.com/atom/ns#' term='Cool URIs'/><category scheme='http://www.blogger.com/atom/ns#' term='Persistent IDs'/><title type='text'>Persistent identifiers workshop comes round again</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;It seems to be the one event that people think is important enough to go to, even though they fear in their hearts that, yet again, not a lot of progress will be made. Most of those at yesterday’s JISC-funded Persistent Identifiers workshop had been to several such meetings before. For my part, I learned quite a lot, but the slightly flat outcome was not all that unexpected. It’s not quite Groundhog Day, as things do move forward slightly from one meeting to the next.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Part of the trouble is in the name. There is this tendency to think that persistent identifiers can be made persistent by some kind of technical solution.&lt;span style="mso-spacerun: yes"&gt;  &lt;/span&gt;To my mind this is a childish belief in the power of magic, and a total abdication of responsibility; the real issues with “persistent” identifiers are policy and social issues. Basically, far too many people just don’t get some simple truths. If you have a resource which has been given some kind of identifier that resolves to its address (so people can use it), and you change that address without telling those who manage the identifier/resolution, then the identifier will be broken. End of, as they say! &lt;/p&gt;&lt;p class="MsoNormal"&gt;This applies whether you have an externally managed identifier (DOI, Handle, PURL) or an internally managed identifier (eg a well-designed HTTP URI… Paul Walk threatened to throw a biscuit at the first person to mention “Cool URLs”, but had to throw it at himself!).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Now clearly some identifiers have traction in some areas. 
Thanks to the efforts of &lt;a href="http://www.crossref.org/"&gt;CrossRef&lt;/a&gt; and its member publishers, the DOI is extremely useful in the scholarly journal literature world. You really wouldn’t want to invent a new identifier for journal articles now, and if you have a journal that doesn’t use DOIs (ahem!), you would be well-advised to sign up. It looks very affordable for a small publisher: $275 per year plus $1 per article.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Even for such a well-established identifier, with well-defined policies and a strong set of social obligations, things do go wrong. I give you &lt;a href="http://home.badc.rl.ac.uk/lawrence/blog/2009/06/04/unhinged_doi%3A_who_ya_gonna_call%3F"&gt;Exhibit A&lt;/a&gt;, for example, in which Bryan Lawrence discovers that dereferencing a DOI for a 2001 article on his publications list leads to "Content not found" (apologies for the “acerbic” nature of my comment there). It looks like this was due to a failure of two publishers to handle a journal transfer properly; the new publisher made up a new DOI for the article, and abandoned the old one. Aaaaarrrrrrggggghhhhhhh! Moving a resource and giving it a new DOI is a failure of policy and social underpinning (let alone competence) that no persistent identifier scheme can survive! CrossRef does its best to prevent such fiascos occurring, but see social issues above. People fail to understand how important this is, or simple things like: the DOI prefix is not part of your brand!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Whether a DOI is the right identifier to use for research data seems to me a much more open question. The issue here is whether the very different nature of&lt;span style="mso-spacerun: yes"&gt;  &lt;/span&gt;(at least some kinds of) research data would make the DOI less useful. 
The &lt;a href="http://www.datacite.org/"&gt;DataCite&lt;/a&gt; group is committed to improving the citability of research data (which I applaud), but also seems to be committed to use of the DOI, which is a little more worrying. While the DOI is clearly useful for a set of relatively small, unchanging digital objects published in relatively small numbers each year (eg articles published in the scholarly literature), is it so useful for a resource type which varies by many orders of magnitude in terms of numbers of objects, rate of production, size of object, granularity of identified subset, and rate of change? In particular, the issue of how a DOI should relate to an object that is constantly changing (as so many research datasets do) appears relatively un-examined.&lt;/p&gt;&lt;p class="MsoNormal"&gt;There was some discussion, interesting to me at least, on the relationships of DOIs to the Linked Data world. If you remember, in that world things are identified by URIs, preferably HTTP URIs. We were told (via the twitter backchannel, about which I might say more later) that DOIs are not URIs, and that the dx.doi.org version is not a DOI (nor presumably is the INFO URI version). This may be fact, but seems to me rather a problem, as it means that "real DOIs" don't work as first-class citizens of a Linked Data world. If the International DOI Foundation were to declare that the HTTP version was equivalent to a DOI, and could be used wherever a DOI could be used, then the usefulness of the DOI as an identifier in a Linked Data world might be greatly increased.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;A question that’s been bothering me for a while is when an “arms-length” scheme, like PURL, Handle, DOI etc is preferable to a well-managed local HTTP identifier. 
We know that such well-managed HTTP identifiers can be extremely persistent; as far as I know all of the eLib programme URIs established by UKOLN in 1995 still work, even though UKOLN web infrastructure has completely changed (and I suspect that those identifiers have outlasted the oldest extant DOI, which must have happened after 1998). Such a local identifier remains under your control, free of external costs, and can participate fully in the Linked Data world; these are quite significant advantages. It seems to me that the main advantage of the set of “arms-length” identifiers is that they are independent of the domain, so they can be managed even if the original domain is lost; at that point, a HTTP URI redirect table could not be set up. So I’m afraid I joked on twitter that perhaps “use of a DOI was a public statement of lack of confidence in the future of your organisation”. Sadly I missed waving the irony flag on this, so it caused a certain amount of twitter outrage that was unintentional!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;In fact the twitter backchannel was extremely interesting. Around a third or so of the twits were not actually at the meeting, which of course was not apparent to all. And it is in the nature of a backchannel to be responding to a heard discourse, not apparent to the absent twits; in other words, the tweets represent a flawed and extremely partial view of the meeting. Some of those who were not present (who included people in the DOI world, the IETF and big publishers) seemed to get quite the wrong end of the stick about what was being said. On the other hand, some external contributions were extremely useful and added value for the meat-space participants!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;I will end with one more twitter contribution. We had been talking a bit about the publishing world, and someone asked how persistent are academic publishers. 
The tweet came back from somewhere “well, their salespeople are always ringing us up ;-)”!&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-486953703120520989?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/486953703120520989/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/02/persistent-identifiers-workshop-comes.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/486953703120520989'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/486953703120520989'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/02/persistent-identifiers-workshop-comes.html' title='Persistent identifiers workshop comes round again'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-9168182737858760532</id><published>2010-02-02T19:38:00.003Z</published><updated>2010-02-02T20:21:59.832Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Semantic web'/><category scheme='http://www.blogger.com/atom/ns#' term='RDFa'/><category scheme='http://www.blogger.com/atom/ns#' term='Linked Data'/><category scheme='http://www.blogger.com/atom/ns#' term='RDF'/><title type='text'>More on contact pages and linked data</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;I wrote about RDF-encoding contact 
information a little &lt;a href="http://digitalcuration.blogspot.com/2010/01/linked-data-and-staff-contact-pages.html"&gt;earlier&lt;/a&gt; and had some very helpful comments. On reflection, and after exploring the “View Source” options for a couple of institutional contact pages, I’ve had some further thoughts. &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;- Contacts pages are rarely authored, they are nearly always created on the fly from an underlying database. This makes them natural for expressing in RDF (or microformats). It’s just a question of tweaking the way the HTML wrapper is assembled. Bath University’s &lt;a href="http://www.bath.ac.uk/contact"&gt;Person Finder &lt;/a&gt;pages do encode their data in &lt;a href="http://microformats.org/wiki/hcard#Specification"&gt;microformats&lt;/a&gt;.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;- I wondered why more universities don’t encode their data in microformats or (even better) in RDF for Linked Data. One possible answer is that the contact pages were probably one of the earliest examples of constructing web pages from databases. It works, it ain’t broke, so they haven’t needed to fix it! If so, a reasonable case would need to be made for any change, but once made it would be comparatively cheap to carry out.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;- A second problem is that it is not at all clear to me what the best encoding and vocabulary for institutional (or organisational unit) contact pages might be. So maybe it’s even less surprising that things have not changed. To say I'm confused is putting it mildly! 
So what follows lists some of the options after further (but perhaps not complete) investigation...&lt;/p&gt;  &lt;p class="MsoNormal"&gt;One approach is the &lt;a href="http://microformats.org/wiki/hcard#Specification"&gt;hCard microformat&lt;/a&gt;, based on the widely used vCard specification, &lt;a href="http://www.ietf.org/rfc/rfc2426.txt"&gt;RFC2426&lt;/a&gt; (this is what Bath uses). That’s fine as far as it goes, but microformats don’t seem to fit directly in the Linked Data world. I’m no expert (clearly!), but in particular, microformats don’t use URIs for the names of things, and don’t use RDF. They appear useful for extracting information from a web page, but not much beyond that (I guess I stand to be corrected here!).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Looking at RDF-based encodings, there are options based on vCard, there are FOAF and SIOC (both really coming from a social networking view point), and there’s the Portable Contacts specification.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Given that vCard is a standard for contact information, it would seem sensible to look for a vCard encoding in RDF. It turns out that there are two RDF encodings of vCard, one supposedly deprecated, and the other apparently unchanged since 2006. I now discover an activity to formalise a W3C approach in this area, with a &lt;a href="http://www.w3.org/Submission/vcard-rdf"&gt;draft submission &lt;/a&gt;to W3C edited by Renato Iannella and dating only from last December (2009), but I would need a W3C username and password to see the latest version, so I can't tell how it's going.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Someone asked me a while ago who sets the standards for Linked Data vocabularies. My response at the time was that the users did, by choosing which specification to adopt. At the time, FOAF seemed to have most mentions in this general area, and I rather assumed (see the previous post) that it would have the appropriate elements. 
However, the “Friend of a Friend” angle really does seem to dominate; this vocabulary does seem to be more about relationships, and to be lacking in some of the elements needed for a contacts page. I suspect this might have stemmed from a desire to stop people compromising their privacy in a spam-laden world. Those of us in public service posts, though, often need to expose our contact details. FOAF does at least have email as foaf:mbox, which apparently includes phone and fax as well, as you can see from the sample FOAF extract in my earlier post.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;In a tweet Dan Brickley suggested: “We'll probably round out FOAF’s address book coverage to align with Portable Contacts spec”, so I had a look at the latter. The main web site didn’t answer, but Google’s &lt;a href="http://66.102.9.132/search?q=cache:E5kvxT0KfKoJ:portablecontacts.net/draft-spec.html+portable+contacts+spec&amp;amp;cd=1&amp;amp;hl=en&amp;amp;ct=clnk&amp;amp;client=safari"&gt;cache provided me with a draft spec&lt;/a&gt;, which does appear to have the elements I need. &lt;/p&gt;  &lt;p class="MsoNormal"&gt;What elements do I need for a contact page? Roughly I would want some or all of:&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Name&lt;/li&gt;&lt;li&gt;Job title/role in DCC (my virtual organisation)&lt;/li&gt;&lt;li&gt;(Optional job title/role in home organisation)&lt;/li&gt;&lt;li&gt;Organisational unit/Organisation&lt;/li&gt;&lt;li&gt;Address/location&lt;/li&gt;&lt;li&gt;Phone/fax numbers&lt;/li&gt;&lt;li&gt;Email address&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;              &lt;p class="MsoNormal"&gt;So what could I do if this information were expressed in RDF in the contact pages for a partner institution (say UKOLN at Bath)? 
Well, presumably the DCC contact pages would be based on a database showing the staff who work on the DCC, with the contact information directly extracted from the remote pages (either linked in real time or perhaps cached in some way). And if Bath changed their telephone numbers again, our contact details would remain up to date. But more. Given that there are some staff members who have roles in several projects, it would be easy to see what the linkages were between the DCC and the other project (eg RSP in the past, or I2S2 now). Part of the point of Linked Data (rather than microformats) is that one can reason with it; follow the edges of the great global graph…&lt;/p&gt;  &lt;p class="MsoNormal"&gt;And perhaps I would be able to find a simple app that extracts a vCard from the contact page to import into my Mac’s Address Book, which is where I started this search from! You wouldn’t think it would be hard, would you? I mean, this isn’t rocket science, surely?&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-9168182737858760532?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/9168182737858760532/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/02/more-on-contact-pages-and-linked-data.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/9168182737858760532'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/9168182737858760532'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/02/more-on-contact-pages-and-linked-data.html' title='More 
on contact pages and linked data'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7079289955504029372</id><published>2010-01-21T16:38:00.003Z</published><updated>2010-01-21T17:04:58.581Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Linked Data'/><category scheme='http://www.blogger.com/atom/ns#' term='Geospatial data'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Data'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Services'/><title type='text'>Digimap is 10</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;I had a very enjoyable day yesterday helping &lt;a href="http://edina.ac.uk/"&gt;EDINA&lt;/a&gt; celebrate &lt;a href="http://edina.ac.uk/event/digimap10/"&gt;10 years&lt;/a&gt; of the &lt;a href="http://edina.ac.uk/digimap/"&gt;Digimap&lt;/a&gt; service. What began as an &lt;a href="http://www.ukoln.ac.uk/services/elib/"&gt;eLib&lt;/a&gt; project and experiment with 6 Universities in 1996 has grown to a mature service with over 100,000 users, 45,000 of them active, in pretty much every UK University, and soon in UK schools as well.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;In 1996 I was Programme Director of the eLib Programme, and my earliest email about Digimap was from the JISC money man, Dave Cook, on 30 January 1996 to Peter Burnhill of the Edinburgh Data Library (as it then was). Dave told Peter we were interested in his idea (for an Images project!) but had a few concerns (that the &lt;a href="http://www.ordnancesurvey.co.uk/"&gt;Ordnance Survey&lt;/a&gt; might not agree to let us use their mapping data; it’s hard to remember now how difficult some of those 1990s persuasions were!). 
Three days later, Dave was offering real money, although it had to be spent by 20 March that year. Done!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;By late 1997 the &lt;a href="http://www.ukoln.ac.uk/services/elib/projects/digimap/"&gt;Digimap project&lt;/a&gt; (*) had a trial service; I remember experimenting with it and having some problems (this was with Netscape 3 on a PowerMac Duo or something like that; woefully under-powered in retrospect). By the end of 1999, they were moving to a new GIS system, and we were beginning to discuss turning Digimap into a service, and that went live in January 2000. They had to get 37 subscribing Universities by a particular deadline, and I think managed 39 somewhat earlier. &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Since then the service has grown in scope, quality, usage and value. In my personal opinion (full disclosure, I’m not neutral here, having been associated with it through advisory groups of various kinds throughout its life), Digimap is the best service funded by JISC. Best in quality, best in professionalism, best in innovation, best in support. A lot of people deserve credit for that, and EDINA should all be extremely proud of what they have created. By the way, the OS have managed some major shifts in attitude over the years, from suspicious tolerance through to strong support, and the success is partly down to them, and to the efforts of the negotiators in what is now &lt;a href="http://www.jisc-collections.ac.uk/"&gt;JISC Collections&lt;/a&gt;.&lt;/p&gt;&lt;p class="MsoNormal"&gt;As well as various forms of OS mapping for GB (whose trademark names always escape me... and it is GB rather than UK, for weird historical reasons), Digimap now offers 4 “epochs” of historic maps from Landmark, plus Geology maps from BGS and Marine maps from SeaZone. 
Due to licence restrictions it is only available to registered staff and students at subscribing UK institutions, but I hope that those of you unlucky enough not to fall in that category can soon read more about it on the pages to be put up related to the celebration.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Digimap has been a bit clunky at times compared with the innovations introduced by some others, but with the new underlying GIS, the interfaces are being upgraded; they now have “slippy maps” (called Digimap Roam) on the base service, and it looks really smart and much more functional. It's tough for a small group to keep up with the likes of Google, Yahoo and MS! Soon this slippy map interface will be extended to the Historic service (“Ancient Roam”?), Geology (“Rock’n Roam”?) and Marine (a rather dull “H2Roam”!)… I think those might be internal names, but if you can complete the set with an even punnier marine name, who knows they might keep them!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The day was good fun, and we heard quite a bit about what Digimap is and how it is being used (far more widely than geography departments). The most exciting was a student project using Digimap and a GPS for a light aircraft CFIT-avoidance system (CFIT is Controlled Flight Into Terrain, referred to as “having a bad day”!). We heard from the data suppliers, with a bit more about what’s coming. It was interesting to hear the OS man talking about moves towards Linked Data; I wasn’t sure how that would square with the closed access, but I think I muddled my question (confused Linked Data with OGC web services, I suspect). The service providers didn’t appear to be talking to each other about Linked Data, which might be a good start.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;A highlight was the closing keynote from Vanessa Lawrence, CEO of OS, clearly extremely supportive of OS. 
Choosing her words very carefully (she is not allowed to influence anyone) she outlined the government’s open data initiative and the &lt;a href="http://www.communities.gov.uk/publications/corporate/ordnancesurveyconsultation"&gt;consultation&lt;/a&gt; on its implications for the OS; this consultation closes late March 2010, but she urged us to make any responses, whether collectively or as private citizens well before then. The consultation isn’t simply “should we open up access to OS data?”, it’s much more “how can we open up access to OS data and still sustain the quality of the data into the future”.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The celebration ended with a reception and dinner, with an amusing after-dinner talk by &lt;a href="http://www.amazon.co.uk/Map-Addict-Mike-Parker/dp/0007300840"&gt;Michael Parker, author of Map Addict&lt;/a&gt;. All in all, a very enjoyable and worthwhile day to celebrate a significant anniversary.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;PS the twitter tag is &lt;a href="http://twitter.com/#search?q=%23digimap10"&gt;#digimap10&lt;/a&gt;; I’m not going to tag the post with it, as I’ve got far too many one-time tags that are a pain to manage…&lt;/p&gt;&lt;p class="MsoNormal"&gt;PPS (*) Unfortunately the original Digimap project pages seem to have vanished, and the earliest Wayback Machine gathers appear to be faulty; the first successful gather I can find is&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;a href="http://web.archive.org/web/20011021051021/edina.ac.uk/digimap/"&gt;http://web.archive.org/web/20011021051021/edina.ac.uk/digimap/&lt;/a&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;... 
which seems to refer to the service, not the project.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7079289955504029372?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7079289955504029372/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/digimap-is-10.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7079289955504029372'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7079289955504029372'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/digimap-is-10.html' title='Digimap is 10'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1349075615660288706</id><published>2010-01-21T14:42:00.003Z</published><updated>2010-01-21T15:14:45.361Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Semantic web'/><category scheme='http://www.blogger.com/atom/ns#' term='microformats'/><category scheme='http://www.blogger.com/atom/ns#' term='RDFa'/><category scheme='http://www.blogger.com/atom/ns#' term='Linked Data'/><category scheme='http://www.blogger.com/atom/ns#' term='RDF'/><title type='text'>Linked data and staff contact pages</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;You may remember that I am interested in the &lt;a 
href="http://digitalcuration.blogspot.com/2009/08/dcc-web-site-and-linked-data.html"&gt;extent to which we should use Semantic Web&lt;/a&gt; (or &lt;a href="http://linkeddata.org/"&gt;Linked Data&lt;/a&gt;) on the DCC web site. After some discussions, I reached the conclusion that we should do so, but the tools were not ready yet (this isn’t quite an Augustinian “Oh Lord, make me good but not yet”; specifically, we are moving our web site to Drupal 6, the Linked Data stuff will not be native until Drupal 7, and our consultants are not yet up to speed with Linked Data). I have to say that not all our staff are convinced of the benefits of using &lt;a href="http://www.w3.org/RDF/"&gt;RDF&lt;/a&gt; etc on the web site, and I have had a mental note to write more about this, real soon now.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;I was reminded of this recently. I wanted to phone a colleague who worked at &lt;a href="http://www.ukoln.ac.uk/"&gt;UKOLN&lt;/a&gt;, one of our partners, and I didn’t have his details in my address book. So I looked on their web site and navigated to his contacts page. Once there I copied his details into the address book, before lifting the phone to give him a ring. After the call (he wasn’t there; the snow had closed the office), I thought about that process. I had to copy all those details! Wouldn’t it be great if I could just import them somehow? How could that be? UKOLN have expertise in such matters, so I tweeted &lt;a href="http://www.ukoln.ac.uk/ukoln/staff/p.walk/"&gt;Paul Walk&lt;/a&gt; (now Deputy Director, previously technical manager) asking whether they had considered making the details accessible as Linked Data using something like &lt;a href="http://xmlns.com/foaf/spec/"&gt;FOAF&lt;/a&gt;. 
You can guess I’m not fully up to speed with this stuff, but I’m certainly trying to learn!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Paul replied that they had considered putting &lt;a href="http://microformats.org/"&gt;microformats&lt;/a&gt; into the page (I guess this is the &lt;a href="http://microformats.org/wiki/hcard"&gt;hCard&lt;/a&gt; microformat), and then asked me whether my address book understood RDF, or whether I was going to script something. I was pretty sure the answer to the second part was “no”, as I suspect such scripting is currently beyond me, and told Paul that I was using MacOSX 10.6 Address Book; it says nothing about RDF, but will import a &lt;a href="http://microformats.org/wiki/rfc-2426"&gt;vcard&lt;/a&gt;. I was thinking that if there was appropriate stuff (either hCard microformat or &lt;a href="http://www.w3.org/TR/xhtml-rdfa-primer/"&gt;RDFa&lt;/a&gt; with FOAF) on the page, I might find an app somewhere that would scrape it off and make a vcard I could import.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Paul’s final tweet was: “@cardcc see the use-case, not sure it's a 'linked data' problem though. What are the links that matter if you're scraping a single contact?”&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Well, I couldn’t think of a 140-character answer to that question, which seemed to raise issues I had not thought about properly. What are the links that matter? Was it linked data, or just coded data that I wanted? Is this really a semantic web question rather than linked data? Or is it an RDF question? Or a vocabulary question? Gulp!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;After some thought, I decided that perhaps Paul was as constrained by his 140 characters as I was. Surely a contacts page contains both facts and links within itself. 
See the &lt;a href="http://en.wikipedia.org/wiki/FOAF_(software)"&gt;Wikipedia page on FOAF&lt;/a&gt; for examples of a FOAF file in Turtle for Jimmy Wales; the coverage is pretty much like a contacts page.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So Paul’s contact page says he works for UKOLN at the University of Bath, and gives the latter’s address (I guess formally speaking he works in UKOLN, an administrative unit, and is employed by the University); that his position in UKOLN is Deputy Director, that his phone, fax and email addresses are x, y and z. All of these are relationships between facts, expressible in the FOAF vocabulary. With RDFa, that information could be explicitly encoded in the HTML of the page and understood by machines, rather than inferred from the co-location of some characters on the page (the human eye is much better at such inferences). So there’s RDF, right there. Is that Linked Data? Is it Semantic Web? I’m not really sure.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;More to the point, would it have been of any greater use to me if it had been so encoded? A FOAF-hunting spider could traverse the web and build up a network of people, and I might be able to query that network, and even get the results downloaded in the form of a vcard that I could import into my Mac Address Book. That sounds quite possible, and the tools may already exist. Or, there may exist an app (what we used to call a Small Matter Of Programming, or a SMOP) that I could point at a web page with FOAF RDFa on it. Perhaps that’s what Paul was after in relation to scripting. Maybe the upcoming Dev8D might find this an interesting task to look at?&lt;/p&gt;  &lt;p class="MsoNormal"&gt;What other things could be done with such a page? Well, Paul or others might use it to disambiguate the many Paul Walk alter egos out there. 
You’ll see I have a simple link to Paul’s contact page above, but if this blog were RDF-enabled, perhaps we could have a more formal link to the assertions on the page, eg to that Paul Walk’s phone number, that Paul Walk’s email address, etc.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Well I’m not sure if this makes sense, and it does feel like one of those “first fax machine” situations. However FOAF has been around for a long while now. Does that mean that folk don’t perceive an advantage in such formal encodings to balance their costs, or is this an absence of value because of a lack of exploitable tools? If so, anyone going to Dev8D want to make an app for me?&lt;/p&gt;  &lt;p class="MsoNormal"&gt;(It’s also possible of course that Paul doesn’t want his details to be spidered up in this way, but I guess none of us should put contact details on the web if that’s our position.)&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  By the way, I found a web page called &lt;a href="http://www.ldodds.com/foaf/foaf-a-matic"&gt;FOAF-a-matic&lt;/a&gt; that will create FOAF RDF for you. 
Here's an extract from what it created for me, in RDF:&lt;div&gt;&lt;span class="Apple-style-span"   style="  white-space: pre-wrap; font-family:'Lucida Grande';font-size:11px;"&gt;&lt;/span&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="  white-space: pre-wrap; font-family:'Lucida Grande';font-size:11px;"&gt;&amp;lt;foaf:Person rdf:ID="me"&amp;gt; &amp;lt;foaf:name&gt;Chris Rusbridge&amp;lt;/foaf:name&amp;gt; &amp;lt;foaf:title&amp;gt;Mr&amp;lt;/foaf:title&amp;gt; &amp;lt;foaf:givenname&amp;gt;Chris&amp;lt;/foaf:givenname&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="  white-space: pre-wrap; font-family:'Lucida Grande';font-size:11px;"&gt;&amp;lt;foaf:family_name&amp;gt;Rusbridge&amp;lt;/foaf:family_name&amp;gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="  white-space: pre-wrap; font-family:'Lucida Grande';font-size:11px;"&gt;&amp;lt;foaf:mbox rdf:resource="mailto:c.rusbridge@xxxxx"/&amp;gt; &amp;lt;foaf:workplaceHomepage rdf:resource="http://www.dcc.ac.uk/"/&amp;gt; &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"   style="font-family:'Lucida Grande', serif;font-size:100%;"&gt;&lt;span class="Apple-style-span"  style=" white-space: pre-wrap;font-size:11px;"&gt;&amp;lt;/foaf:Person&amp;gt; &lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="white-space: pre-wrap; "&gt;&lt;span class="Apple-style-span"  style="font-family:'times new roman';"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;What could I do with that now?&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1349075615660288706?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' 
type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1349075615660288706/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/linked-data-and-staff-contact-pages.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1349075615660288706'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1349075615660288706'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/linked-data-and-staff-contact-pages.html' title='Linked data and staff contact pages'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-9125810657529132626</id><published>2010-01-13T15:34:00.004Z</published><updated>2010-01-13T16:15:24.088Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='PDF'/><category scheme='http://www.blogger.com/atom/ns#' term='Scolarly HTML'/><category scheme='http://www.blogger.com/atom/ns#' term='Science publishing'/><category scheme='http://www.blogger.com/atom/ns#' term='publishing'/><title type='text'>Scholarly HTML would be nice, but...</title><content type='html'>I'm quite interested in the idea of &lt;a href="http://ptsefton.com/2009/08/19/towards-scholarly-html.htm"&gt;Scholarly HTML&lt;/a&gt;, as espoused in Pete Sefton's blog, and I've commented on some of Peter Murray Rust's hamburger PDF comments previously (although I do think a lot of people confuse wild PDF with well-made, should one say Scholarly PDF). 
I've always been slightly worried by one thing though.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A well-known advantage of PDF is that it pretty much ensures that I can save a document, share it, move it around etc and it will still be intact and readable. That's one of the reasons it's so popular.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Mostly we don't do that with HTML. Mostly we just point to it. But if I see an article these days, I want it on my computer if I'm allowed; this lets me study it at leisure, drop it in my Mendeley system, etc. As pointed out, that works a treat with PDF, and pretty well with Word or OpenOffice documents as well. This applies even where the document is quite heavily compound, with many embedded images, tables etc.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But if I try saving an HTML document to my hard disk, nothing very standard happens. OK, if I use Safari on my Mac, I get a .webarchive file, which is quite nice as I can do all the things with it that I could do with a PDF and Word etc, and when I open it later it will be as it was before, with all the images in place. But neither IE nor Firefox seems capable of opening a .webarchive file.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;If I try saving the same article from Firefox, I get a .html file with the main article in it, and a directory with associated files in it (eg images). Safari does seem capable of opening this combination, but it's pretty ugly, and hard to move around. I haven't tried IE as I don't have easy access to it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Is there a standard approach, existing or in development, to packaging the HTML and associated files that would be as convenient as the .webarchive, but usable across all browsers? 
If so, Scholarly HTML would be that little bit closer!&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-9125810657529132626?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/9125810657529132626/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/scholarly-html-would-be-nice-but.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/9125810657529132626'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/9125810657529132626'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/scholarly-html-would-be-nice-but.html' title='Scholarly HTML would be nice, but...'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5698669678663412616</id><published>2010-01-13T15:24:00.003Z</published><updated>2010-01-13T15:33:39.515Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Web preservation'/><category scheme='http://www.blogger.com/atom/ns#' term='Persistent IDs'/><title type='text'>Persistence of domain names</title><content type='html'>&lt;span class="Apple-style-span" style="font-family: Helvetica; font-size: medium; "&gt;&lt;span class="Apple-style-span"  style="font-family:'times new roman';"&gt;I had a chat before Christmas with Henry Thompson, who works both in Edinburgh Informatics and also on the W3C TAG. 
Insofar as the Internet is important in sustaining long term access to information in digital form, there is a sustainability problem that we rather seem to have ignored. Everything on the Internet (literally) depends on domain names, and these are only ever rented. There is no mechanism for permanently reserving a domain name. Domain names can be lost by mistake (overlooking a bill, perhaps having moved in the interim and not informed the relevant domain name registrar), but they can also be lost on business failure. Although domain names can be a business asset, I understand that the registrars have some discretion on transfers, and in particular one cannot make a "domain name will" seeking transfer of the domain name to some benevolent organisation. Note, the mechanism for renting domain names has sustainability advantages, providing sustainability to important services that underpin the DNS.&lt;br /&gt;&lt;br /&gt;There are two kinds of problem, one on a massive scale and one more fine-grained. The massive problem is that the entire infrastructure of the Internet depends on URIs, most of which are http URIs that in turn depend on the domain name system. So there are a number of organisations whose domain names are embedded in that infrastructure in a way and to an extent that is very difficult to change. W3C is clearly such an organisation. Many of these organisations seem rather fragile (not a comment on W3C, by the way, although its sustainability model is opaque to me). 
Should they fail and the domain names disappear, the relevant URIs will cease to work and various pieces of Internet machinery will fall apart.&lt;/span&gt;&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:'times new roman', serif;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Helvetica; font-size: medium; "&gt;&lt;span class="Apple-style-span"  style="font-family:'times new roman';"&gt;(By the way, this does seem to be one case where a persistent ID that is independent of the original domain, such as a DOI, has advantages over a HTTP URI plus a redirect table. If the domain name no longer exists, you can't get to a redirect, whereas someone can still relink the DOI to a new location.)&lt;br /&gt;&lt;br /&gt;On the more fine-grained scale, many documents (particularly in HTML) are not easily separable from their location, depending on other local files and documents. In addition of course, documents in some sense exist through their citations or bookmarks, that begin to exist separately from the document. Moving a document to a new domain can make it "fail" or disappear. So sustainability is linked to the domain as well as the other preservation factors.&lt;br /&gt;&lt;br /&gt;This seems to me to be not at all a technical problem, but it seems to have legal/regulatory, governance, social, business and economic aspects.&lt;br /&gt;&lt;br /&gt;Among the solutions might be creating a new top level domain designed for persistence, with different rules of succession, etc. Another (either instead of or in conjunction with the first) might be creating an organisation designed for persistence, to hold endowed domain names. 
Somehow the ongoing revenue stream for those underpinning services must be retained indefinitely into the future.&lt;br /&gt;&lt;br /&gt;We don't think we have the answers, but we do think there is a problem here; I'm not yet sure if we have articulated it accurately at all. I would appreciate any comments. Thanks,&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5698669678663412616?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5698669678663412616/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/persistence-of-domain-names.html#comment-form' title='8 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5698669678663412616'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5698669678663412616'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/persistence-of-domain-names.html' title='Persistence of domain names'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>8</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-8873600622336308918</id><published>2010-01-13T13:05:00.001Z</published><updated>2010-02-02T19:36:25.625Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='DCC'/><title type='text'>Director of Digital Curation Centre: still time to apply</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;&lt;span 
class="Apple-style-span"  style="font-size:medium;"&gt;I’m particularly keen that there be a good slate of candidates for this post, for which applications close on Friday 15 January, 2010. The details can be found at &lt;/span&gt;&lt;a href="http://www.jobs.ed.ac.uk/vacancies/index.cfm?fuseaction=vacancies.detail&amp;amp;vacancy_ref=3012085"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;http://www.jobs.ed.ac.uk/vacancies/index.cfm?fuseaction=vacancies.detail&amp;amp;vacancy_ref=3012085&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt; (sorry about the dodgy URL; I hope it works)&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;The further details say&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;"The mission of the Digital Curation Centre is to help build capacity, capability and skills for data curation across the UK higher education research community, while supporting and promoting emerging data curation practice.  It also has a key role in supporting JISC, especially its new research data management programme. Overall, the DCC is an agent for change, committed to the diffusion of best practice in the curation of digital research data across the Higher Education sector, and providing an authoritative source of advocacy, resources and guidance to the UK research community.  
This mission is informed by five priorities:&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;•&lt;/span&gt;&lt;span style="font:7.0pt &amp;quot;Times New Roman&amp;quot;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;       &lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;to identify, gather, record and disseminate curation best practice, providing access to resources, tools, training and information that will equip data practitioners to make informed decisions regarding the management of their data assets;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;•&lt;/span&gt;&lt;span style="font:7.0pt &amp;quot;Times New Roman&amp;quot;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;       &lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;to facilitate knowledge exchange between those currently and newly engaged in the generation and management of digital research data;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;•&lt;/span&gt;&lt;span style="font:7.0pt &amp;quot;Times New Roman&amp;quot;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;       &lt;/span&gt;&lt;/span&gt;&lt;span 
class="Apple-style-span"  style="font-size:medium;"&gt;to build and support a community of informed practitioners that has the capacity to sustain itself, with the capability to manage and curate its data appropriately;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;•&lt;/span&gt;&lt;span style="font:7.0pt &amp;quot;Times New Roman&amp;quot;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;       &lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;to identify crucial and important innovations in data curation, and seek additional resources to provide them;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left:36.0pt;text-indent:-18.0pt;mso-list:l0 level1 lfo1"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;•&lt;/span&gt;&lt;span style="font:7.0pt &amp;quot;Times New Roman&amp;quot;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;       &lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;to support JISC, especially in its repository, preservation and data management programmes.&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;To achieve this, the Director must be a persuasive advocate for better curation and management of research data, on a national or international scale. 
Able to listen and engage with researchers and with research management, publishers and research funders, the Director will build a strong, shared vision of the changes needed, and the ability (working with others in the DCC and beyond) to mobilise the community towards that end."&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;So if this fits you, or you know someone that it fits, please persuade that person to apply!&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;By the way, although the DCC may not escape further budget cuts, like all public services in the UK, I have been told that we are funded from core JISC funding rather than capital funding, and as such Phase 3 will not be curtailed as the proposed JISC Managing Research Data programme has been.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;[NOTE: This vacancy has now CLOSED; no further applications will be accepted!]&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-8873600622336308918?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/8873600622336308918/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/director-of-digital-curation-centre.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8873600622336308918'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8873600622336308918'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/director-of-digital-curation-centre.html' title='Director of Digital Curation Centre: still time to apply'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5669024272654263842</id><published>2010-01-05T17:12:00.003Z</published><updated>2010-01-05T17:23:06.885Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Digital Curation'/><category scheme='http://www.blogger.com/atom/ns#' term='DCC'/><title type='text'>Digital Curation Centre User Survey 2009: Highlights</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-size:6;"&gt;&lt;span class="Apple-style-span" style="font-size: 19px;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;span class="Apple-style-span"  style="font-size:6;"&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;My colleague Angus Whyte has provided the following brief summary of two surveys carried out in Phases 1 and 2 of the Digital Curation Centre, in 2006 and 2009 respectively, as part of our evaluations. 
In retrospect, we might have done better revising the questions for the second survey rather more than we did; nevertheless I thought it worth while sharing this with you.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Angus writes:&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;In 2009 DCC users were surveyed, repeating a similar survey carried out in 2006. In the highlights below we draw conclusions both from the more recent results and also changes over the 3 year period. Both surveys were publicised on the DCC website and via several mailing lists, principally the DCC-Associates and (in 2009) the JISC sponsored Research-Dataman list.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Our conclusions take into account that the online questionnaire was self-completed by a self-selected group of respondents (75 in 2009 and 125 in 2006). DCC Associates (640 approx.) provided the bulk of the responses[1]. The results indicated broad patterns, relatively wide differences and consistent responses over the two surveys, even though these are not taken to be statistically representative.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;b&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Highlights&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;In both surveys around 90% of respondents are familiar with the term ‘digital curation’ and regard it as a critical issue within their project or unit. 
The DCC is consistently given as the main source of information on curation issues by around 70% of respondents, with “on the job challenges/ research” second at around 60%.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Between the two surveys there is a large jump (from 13% to 32%) in the proportion of respondents indicating that the DCC has been “very effective” in raising awareness about digital curation, while the proportion believing it to be “slightly effective” has correspondingly fallen from 53% to 31%.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Of a list of DCC resources, five are identified as “most helpful” by at least 1 in 5 of the 2009 survey respondents, these being (in descending order) the DCC website, Briefing Papers (of various sorts), the DCC Curation Lifecycle Model, Case Studies, and the Digital Curation Manual.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Respondents universally associate digital curation with “ensuring the long-term accessibility and re-usability of digital information”, and large majorities (around 90%) also relate it to “performing archiving activities on digital information such as selection, appraisal and retention” and “ensuring the authenticity, integrity and provenance of digital information are maintained over time”. 
Rather lower but still significant numbers (around 60%) associate digital curation with “managing digital information from its point of creation” and “managing risks to digital information” – although many more highlight the latter in 2009 (up from 61% to 84%).&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Curation or preservation addresses risks to the respondents’ organisations, with “loss of organisational memory” consistently topping their list (identified by around 75% of respondents) and “business risks” second, identified by just under half, again across both surveys.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;More than two thirds indicate that their main reasons for curating and preserving digital information are its educational/research or historical value; in both years a minority cites other reasons. Similarly, the main obstacles are indicated as financial or staff resources, with around half also indicating lack of awareness or appropriate policies.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;For around 40% of respondents, management and preservation of digital information has an indefinite timescale. For a further 15% or so it is “beyond the life of the project/organisation”, and similar numbers indicate these are tasks “for the life of the project/organisation”.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;The 2009 survey respondents are no strangers to the ‘data deluge’, most dealing with at least 100GB and some (7%) with more than 100TB. Overall, 79% expect this to increase in the next two years; surprisingly, 3% do not, while 7% do not know. 
Most need to manage a mixture of open and proprietary formats, and report a wide variety of formats in use, predominantly common office applications, PDF documents and multimedia formats. Curation and preservation challenges are most frequently identified with obsolete proprietary formats. Image, video, and geospatial data are also often identified as challenges, as are web sites combining these. &lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Respondents were also asked in 2009 about re-use, and around a third indicate that research data is re-used internally, with similar numbers offering data generated by their project/unit for re-use by others, or re-using external data.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Access issues facing research projects/units are identified in both surveys and along similar lines; intellectual property rights (e.g. copyright) is the most frequently cited issue, followed by “privacy or ethical issues”, while “embargo on research findings” is least prevalent, identified by only a fifth of respondents.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Responses on funding for curation and preservation show no clear picture. Around half of 2009 respondents indicate funding is “accounted for in project or institutional budget”. A large minority have no explicit funding for curation and preservation, and where resources are available these are pooled from other funded areas (e.g. IT budget for project or organisation) or research grants. Spending on curation/preservation is less than £50,000 (for around half of those respondents who were aware of this). 
Around half are unsure whether spending will increase or decrease, with the remainder being evenly split.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Detailed questions and response data are available on request.&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Angus Whyte, Digital Curation Centre&lt;/span&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;[1]       The DCC Associates membership list includes UK data organisations, leading data curators, overseas and supranational standards agencies, and industrial/business communities. Currently research data creators are under-represented (information from registration details). &lt;/span&gt;&lt;/p&gt;&lt;/span&gt;&lt;p&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-size:6;"&gt;&lt;span class="Apple-style-span" style="font-size: 19px; "&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;div style="mso-element:footnote-list"&gt;&lt;div style="mso-element:footnote" id="ftn1"&gt;  &lt;/div&gt;  &lt;/div&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5669024272654263842?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5669024272654263842/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/digital-curation-centre-user-survey.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5669024272654263842'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5669024272654263842'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2010/01/digital-curation-centre-user-survey.html' title='Digital Curation Centre User Survey 2009: Highlights'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-3638285228191617664</id><published>2009-12-17T23:07:00.004Z</published><updated>2009-12-17T23:23:44.382Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Semantic web'/><category scheme='http://www.blogger.com/atom/ns#' term='Data publishing'/><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' term='Science publishing'/><category scheme='http://www.blogger.com/atom/ns#' term='publishing'/><title type='text'>More activity on semantic publishing</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;If you saw tweets from &lt;a href="http://twitter.com/cardcc"&gt;@cardcc&lt;/a&gt; today, you might realise I’ve been very interested in a couple of recent developments in semantic publishing. I wrote &lt;a href="http://digitalcuration.blogspot.com/2009/11/data-and-journal-article.html"&gt;earlier&lt;/a&gt; about linking data to journal articles, including David Shotton’s adventures in semantic publishing. David’s work was one of those included in the review article in the Biochemical Journal by Attwood, Kell, McDermott et al. (2009). The article ranged over the place of ontologies and databases, science blogs, and various experiments. 
These included&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;RSC and Project Prospect,&lt;/li&gt;&lt;li&gt;The ChemSpider Journal of Chemistry,&lt;/li&gt;&lt;li&gt;The FEBS Letters experiment,&lt;/li&gt;&lt;li&gt;PubMed Central and BioLit,&lt;/li&gt;&lt;li&gt;Shotton’s PLoS experiment,&lt;/li&gt;&lt;li&gt;Elsevier Grand Challenge,&lt;/li&gt;&lt;li&gt;Liquid Publications,&lt;/li&gt;&lt;li&gt;The semantic Biochemical Journal experiment.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;                &lt;p class="MsoNormal"&gt;The last of these was the real focus of the article, which is available in PDF but, when read through a special reader called Utopia Documents, displayed some active capabilities. These included the ability to visualise and rotate 3-d images of proteins, to see tables represented as graphs (or vice versa) and to link to entries in nucleic acid databases. The capabilities were perhaps a bit awkward to spot and to manipulate, but still interesting. This article is (gold) open access. Other articles in the issue have also been instrumented in this way.&lt;/p&gt;  &lt;p class="MsoNormal"&gt; It’s clearly early days for Utopia, and I wasn’t wholly impressed with it as a PDF reader, but I was certainly very excited at some of what I read and saw.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;I also read today a very different article (I think not available on open access), by Ruthensteiner and Hess (2008). They describe the processes involved in making 3-d models of biological specimens and presenting them in PDF, readable by a standard Acrobat Reader. The 3-d capability was at least as good as, if not better than, the Utopia results.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Because it’s getting late, I’ll end with my last tweet: &lt;/p&gt;&lt;p class="MsoNormal"&gt;“My head is &lt;b&gt;spinning&lt;/b&gt;&lt;span style="font-weight:normal"&gt; with semantic article possibilities. 
I hope some get picked up in new &lt;a href="http://search.twitter.com/search?q=%23jiscmrd"&gt;#jiscmrd&lt;/a&gt; proposals, see &lt;a href="http://www.jisc.ac.uk/fundingopportunities/funding_calls/2009/12/1409researchdata.aspx"&gt;http://www.jisc.ac.uk/fundingopportunities/funding_calls/2009/12/1409researchdata.aspx&lt;/a&gt;”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;Attwood, T. K., Kell, D. B., McDermott, P., Marsh, J., Pettifer, S. R., Thorne, D., et al. (2009). &lt;/span&gt;&lt;span lang="EN-US"&gt;&lt;a href="http://www.biochemj.org/bj/424/0317/bj4240317.htm"&gt;Calling International Rescue: knowledge lost in literature and data landslide!&lt;/a&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt; &lt;i&gt;Biochemical Journal&lt;/i&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;, &lt;i&gt;424&lt;/i&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;(3), 317-33. doi: 10.1042/BJ20091474.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;Ruthensteiner, B., &amp;amp; Hess, M. (2008). Embedding 3D models of biological specimens in PDF publications. &lt;i&gt;Microscopy Research and Technique&lt;/i&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;, &lt;i&gt;71&lt;/i&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;(11), 778-86. doi: 10.1002/jemt.20618. 
&lt;/span&gt;&lt;span lang="EN-US"&gt;&lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/18785246"&gt;PubMed abstract&lt;/a&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-3638285228191617664?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/3638285228191617664/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/more-activity-on-semantic-publishing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3638285228191617664'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3638285228191617664'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/more-activity-on-semantic-publishing.html' title='More activity on semantic publishing'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7712146577270064415</id><published>2009-12-15T23:11:00.003Z</published><updated>2009-12-15T23:20:29.977Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' 
term='Linked Data'/><category scheme='http://www.blogger.com/atom/ns#' term='RDF'/><title type='text'>Linked Statistics &amp; other data</title><content type='html'>Someone pointed me to the blog Jeni's Musings, written by Jeni Tennison. I don't know who Jeni is, but there's some really interesting stuff here, with some obvious links to UK Government Data activity. Among other things, there's a post about &lt;a href="http://www.jenitennison.com/blog/node/132"&gt;expressing statistics with RDF&lt;/a&gt; (it looks pretty horribly verbose, but it's the first attempt I've seen to address some real data that could be relevant to science research), a thoughtful post about the &lt;a href="http://www.jenitennison.com/blog/node/134"&gt;provenance of linked data&lt;/a&gt;, and a series of 5 posts (from &lt;a href="http://www.jenitennison.com/blog/node/135"&gt;here&lt;/a&gt; to &lt;a href="http://www.jenitennison.com/blog/node/139"&gt;here&lt;/a&gt;) on creating linked data.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7712146577270064415?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7712146577270064415/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/linked-statistics-other-data.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7712146577270064415'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7712146577270064415'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/linked-statistics-other-data.html' title='Linked Statistics &amp; other data'/><author><name>Chris 
Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-3349199704298072312</id><published>2009-12-09T17:39:00.003Z</published><updated>2009-12-09T17:49:36.972Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='DCC'/><title type='text'>Director, The Digital Curation Centre (DCC)</title><content type='html'>&lt;div&gt;So, time to come fully out into the open, after various coy hints over the past week or so. I'm planning to retire around the start of DCC Phase 3. Adverts are starting to appear with the following text:&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;"We wish to appoint a new Director to take the DCC forward into an exciting third phase, from March 2010. You must be a persuasive advocate for better management of research data on a national and international scale. Able to listen to and engage with researchers and with research management, publishers and research funders, you will build a strong, shared vision of the changes needed, working with and through the community. You should have a sound knowledge of all aspects of digital curation and preservation, an understanding of higher education structures and processes, and appropriate management skills to be the guiding force in the DCC’s progress as an effective and enduring organisation with an international reputation. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"This post is fixed term for three years. [I would suggest: in the first instance...]&lt;/div&gt;&lt;div&gt;Closing date: Friday 15th January 2010."&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;This is a great job, and I think an important one. 
I have been bending the ears of many people in the last couple of weeks to ask them to think of appropriate people to point this advert at (yes, I know that's rotten English; it's been a long week).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Further details will be on the University of Edinburgh's jobs web site, &lt;a href="http://www.jobs.ed.ac.uk/"&gt;http://www.jobs.ed.ac.uk/&lt;/a&gt; (they weren't there when I checked a few minutes ago, maybe tomorrow).&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-3349199704298072312?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/3349199704298072312/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/director-digital-curation-centre-dcc.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3349199704298072312'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3349199704298072312'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/director-digital-curation-centre-dcc.html' title='Director, The Digital Curation Centre (DCC)'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7764484922560066111</id><published>2009-12-08T23:36:00.002Z</published><updated>2009-12-08T23:39:11.891Z</updated><category scheme='http://www.blogger.com/atom/ns#' 
term='sustainability'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Curation'/><category scheme='http://www.blogger.com/atom/ns#' term='data curation'/><category scheme='http://www.blogger.com/atom/ns#' term='DCC'/><title type='text'>Leadership opportunities</title><content type='html'>Those interested in leadership in Digital and Data Curation should keep an eye on the relevant UK press and lists over the next week or so for anything of interest...&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7764484922560066111?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7764484922560066111/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/leadership-opportunities.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7764484922560066111'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7764484922560066111'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/leadership-opportunities.html' title='Leadership opportunities'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5268906091468016344</id><published>2009-12-08T23:23:00.006Z</published><updated>2009-12-09T17:14:02.899Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='IJDC'/><title type='text'>Last volume 4 issue of IJDC just published</title><content 
type='html'>&lt;blockquote&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;/blockquote&gt;On Monday this week, we published volume 4, issue 3 of &lt;a href="http://www.ijdc.net/"&gt;IJDC&lt;/a&gt;. In one respect, this was a miracle of speed publishing, as 7 of the peer-reviewed articles had just been delivered the previous week as part of the International Digital Curation Conference. But we also included an independent article, plus 1 peer-reviewed paper and 3 articles with a rather longer gestation, originating in papers at iPres 2008! There are good and bad reasons for that too-lengthy delay.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I wrote in the editorial that I would reproduce part of it for this blog, to attract comment, so here that part is.&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;"But first, some comments on changes, now and in the near future, that are needed. One major change is that Richard Waller, our indefatigable Managing Editor, has decided to concentrate his energies on Ariadne. Richard has done a grand job for us over the past few years, in his supportive relationships with authors, his detailed and careful editing, and in commissioning general articles. To quote one author: “I note that the standard of Richard’s reviewing is much better than [a leading publisher's]; they let an article of mine through with very bad mistakes in the references without flagging them for review, and were not so careful about flagging where they had changed my text, not always for the better”. The success of IJDC is in no small way a result of Richard’s sterling efforts over the years. 
I am very grateful to him, and wish him well for the future: Ariadne authors are very lucky!&lt;/blockquote&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;"Looking to the future of IJDC, we will have Shirley Keane as Production Editor, working with Bridget Robinson who provides a vital link to the International Digital Curation Conference, and several other members of the DCC community. We are seeking to work more closely with the Editorial Board in the commissioning role and to draw on the significant expertise of this group.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"In parallel, we have been reviewing how IJDC works, and are proposing some changes to enhance our business processes; I shall be writing to the Editorial Board shortly. For example, we expect to include articles in HTML as well as PDF format, to introduce changes to reduce the publishing lead times, and to add a possible new section with particular practitioner orientation. As part of reduced publishing lead times, we are considering releasing articles once they have been edited after review, leading to a staggered issue which is “closed” once complete. I’m planning to repeat this part of the editorial in the Digital Curation Blog [here], perhaps with other suggestions, and comments [here] would be very welcome."&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;Oh, we then did a little unashamed puffery...&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;"We are, of course, very interested in who is reading IJDC, and the level of impact it is having on the community. To find out, Alex Ball from UKOLN/DCC has been trying several different approaches to get as full a picture as possible.&lt;/div&gt;&lt;div&gt;&lt;div&gt;One approach we have used is to examine the server log for the IJDC website. 
The statistics for the period December 2008 to June 2009 show that around 100 people visit the site each day, resulting in about 3,000 papers and articles being downloaded each month. It was pleasing to discover we have a truly global readership; while it is true that a third of our readers are in the US and the UK, our content is being seen in around 140 countries worldwide, from Finland to Australia and from Argentina to Zimbabwe. As one would expect, we principally attract readers from universities and colleges, but we also receive visits from government departments, the armed forces and people browsing at home.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;"The Journal is also having a noticeable impact on academic work. We have used Google Scholar to collect instances of journal papers, conference papers and reports citing the IJDC. In 2008, there were 44 citations to the 33 papers and articles published in the Journal in 2006 and 2007, excluding self-citations, giving an average of 1.33 citations per paper. Overall, three papers have citation counts in double figures. One of our papers (“Graduate Curriculum for Biological Information Specialists: A Key to Integration of Scale in Biology” by Palmer, Heidorn, Wright and Cragin, from Volume 2, Issue 2) has even been cited by a paper in Nature, which gives us hope that digital curation matters are coming to the attention of the academic mainstream."&lt;/div&gt;&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;/div&gt;&lt;div&gt;OK, so we're not Nature! Nevertheless, we believe there is a valuable role for IJDC, and we'd like your help in making it better. Suggestions please...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;(I made this plea at our conference, and someone approached me immediately to say our RSS feed was broken. It seems to work, at least from the title page. So if it still seems broken, please get in touch and explain how. 
Thanks)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5268906091468016344?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5268906091468016344/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/last-volume-4-issue-of-ijdc-just.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5268906091468016344'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5268906091468016344'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/last-volume-4-issue-of-ijdc-just.html' title='Last volume 4 issue of IJDC just published'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7910561111526871954</id><published>2009-12-08T21:39:00.003Z</published><updated>2009-12-09T06:04:39.492Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Delegate Interview: Melissa Cragin and Allen Renear</title><content type='html'>Melissa Cragin and Allen Renear [corrected; apologies] from the University of Illinois, who will be chairing IDCC next year in Chicago, give their reactions to IDCC 09 and a hint of their preparations for next year's event in this final, double-act, video interview....&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" 
height="300"&gt;&lt;param name="allowfullscreen" value="true"&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=8055927&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1"&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=8055927&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/8055927"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://vimeo.com/"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7910561111526871954?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7910561111526871954/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview-melissa.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7910561111526871954'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7910561111526871954'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview-melissa.html' title='IDCC 09 Delegate Interview: Melissa Cragin and Allen Renear'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2535617906796396977</id><published>2009-12-08T21:35:00.002Z</published><updated>2009-12-08T21:38:30.634Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Delegate Interview: Neil Grindley</title><content type='html'>JISC Programme Manager and session chair Neil Grindley gives us his response to IDCC 09 in this quick video interview as the event draws to a close...&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=8055712&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=8055712&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/8055712"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2535617906796396977?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2535617906796396977/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview-neil.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2535617906796396977'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2535617906796396977'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview-neil.html' title='IDCC 09 Delegate Interview: Neil Grindley'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1419801042392633847</id><published>2009-12-08T21:30:00.002Z</published><updated>2009-12-08T21:33:52.327Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Keynote: Prof. 
Ed Seidel</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_rEQRc2BOVZs/Sx7GFQJW7HI/AAAAAAAAAHk/sgeYgDL4Oos/s1600-h/DSC_0213.JPG"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 133px;" src="http://4.bp.blogspot.com/_rEQRc2BOVZs/Sx7GFQJW7HI/AAAAAAAAAHk/sgeYgDL4Oos/s200/DSC_0213.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5412981595681778802" /&gt;&lt;/a&gt;In a content-packed keynote talk, Ed Seidel wanted to give us a preview of the types of project that are driving the National Science Foundation's need to think about data, the cyber infrastructure questions, the policy questions and the cultural issues surrounding data that are deeply rooted in the scientific community.&lt;br /&gt;&lt;br /&gt;To illustrate this initially, Seidel gave the example of some visualisation work on colliding black holes that he had conducted whilst working in Germany with data collected in Illinois, explaining that in order to achieve this he had to do a lot of work on remote visualisation and high performance networking – but that moving the data by network to create the visualisations was not practical, so the team had to fly to Illinois to do the visualisations, then bring the data back.  He also cited projects that are already expecting to generate an exabyte of data – vastly more than is currently being produced – so the problem of moving data is only going to get bigger.&lt;br /&gt;&lt;br /&gt;Seidel looked first to the cultural issues that influence scientific methods when it comes to the growing data problem.  He demonstrated the 400-year-old scientific model of collecting data in small groups or as individuals, writing things down in notebooks and using small amounts of data that could be measured in kilobytes in modern terms, with calculations carried out by hand.  
This did not change from Galileo and Newton through to Stephen Hawking in the 1970s.  However, within 20-30 years, the way of doing science changed – with teams of people working on projects using high performance computers to create visualisations of much larger amounts of data.  This is a big culture shift, and Seidel pointed out that many senior scientists are still trained in the old method.  You now need larger collaborative teams to solve problems and manage the data volumes to do true data-driven science.  He used the example of the Large Hadron Collider, where scientists are looking at generating tens of petabytes of data, which need to be distributed globally to be analysed – with around 15,000 scientists working on around six experiments.&lt;br /&gt;&lt;br /&gt;Seidel then went on to discuss how he sees this trend of data sharing developing, using the example of the recent challenge of predicting the route of a hurricane.  This involved the sharing of data between several communities to achieve all the necessary modelling to respond to the problem in a short space of time.  Seidel calls the groups solving these complex problems “grand challenge communities”.  The scientists involved will have three or four days to share data and create models and simulations to solve these problems, but will not know each other!  The old modality of sharing data with people that you know will not work and so these communities will have to find ways to come together dynamically to share data if they are going to solve these sorts of problems.  Seidel predicted that these issues are going to drive both technical development and policy change.&lt;br /&gt;&lt;br /&gt;To illustrate the types of changes already in the pipeline, Seidel cited colleagues who are playing with the use of social networking technologies to help scientists to collaborate – particularly Twitter and Facebook.  
Specifically, they have set up a system whereby their simulation code tweets its status, and have also been uploading the visualisation images directly into Facebook in order to share them.&lt;br /&gt;&lt;br /&gt;Seidel noted that high-dimensional, collaborative environments and tremendous amounts of bandwidth are needed, so technical work will be required.  The optical networks often don't exist – with universities viewing such systems like the plumbing and funding bodies not looking to support the upgrade of such infrastructure.  Seidel argued that we need to find ways to catalyse this sort of investment.&lt;br /&gt;&lt;br /&gt;To summarise, Seidel highlighted two big challenges in science trends at the moment: multi-skilled collaborations and the dominance of data, which are tightly linked.  He explained that he had calculated that compute, data and networks have grown 9-12 orders of magnitude in 20-30 years after 400 years unchanged, which shows the scale of the change and the change in culture that it represents.&lt;br /&gt;&lt;br /&gt;NSF has a vision document which highlights four main areas – virtual organisations for distributed communities, high performance computing, data visualisation, and learning and work practices.  Focusing on the “Data and Visualisation” section, Seidel quoted their dream for data to be routinely deposited in a well-documented form, regularly and easily consulted and analysed by people, and openly accessible, protected and preserved.  He admitted this is a dream that is nowhere near being realised yet.   He recognised that there need to be incentives for the changes and new tools to deal with the data deluge.  
They are looking to develop a national data framework, but emphasised that the scientific community really needs to take the issues to heart.&lt;br /&gt;&lt;br /&gt;Taking the role of the scientist, Seidel took us through some of the questions and concerns which a research scientist may raise in the face of this cultural shift.  They included concerns about replication of results – which Seidel noted could be a particular problem when services come together in an ad hoc way, but which needs to be addressed if the data produced is to be credible.&lt;br /&gt;&lt;br /&gt;Seidel moved on to discuss the types of data that need to be considered, in which he included software.  He stressed that software needs to be considered as a type of data and therefore needs to be given the same kind of care in terms of archiving and maintenance as traditional scientific collection or observation data.  He also included publications as data, as many of these are now in electronic form.&lt;br /&gt;&lt;br /&gt;In discussing the hazards faced, Seidel noted that we are now producing more data each year than we have done in the entirety of human history up to this point – which demonstrates a definite phase change in the amount of data being produced.  &lt;br /&gt;&lt;br /&gt;The next issue of concern Seidel highlighted was that of provenance – particularly how we collect the metadata related to the data that we are considering how to move around.  He admitted that we simply don't know how to do this annotation at the moment, but this is being worked on.&lt;br /&gt;&lt;br /&gt;Having identified these driving factors, Seidel explained the observations and workgroup structures that NSF has in place to think more deeply and investigate solutions to these problems, which include the DataNet project.  $100 million is being invested in five different projects as part of this programme.  Seidel hopes that this investment will help catalyse the development of a data-intensive science culture.  
He made some very “apple-pie” idealistic statements about how the NSF sees data, and then used these to explain why the issues are so hard, emphasising the need to engage the library community who have been curating data for centuries, and the need to consider how to enforce data being made available post-award. &lt;br /&gt;&lt;br /&gt;Discussions at the NSF are suggesting that each individual project should have a data management policy which is then peer-reviewed.  They don't currently have consistency, but this is the goal.  &lt;br /&gt;&lt;br /&gt;In conclusion, Seidel emphasised that many more difficult cases are coming... However, the benefits of making data available and searchable – potentially with the help of searchable journals and electronic access to data – are great for the progress of science, and the requirement to make many more things available than before is percolating down from the US Government to the funding bodies.  Open access to information online is a desirable priority and clarification of policy will be coming soon.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1419801042392633847?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1419801042392633847/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-keynote-prof-ed-seidel.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1419801042392633847'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1419801042392633847'/><link rel='alternate' type='text/html' 
href='http://digitalcuration.blogspot.com/2009/12/idcc-09-keynote-prof-ed-seidel.html' title='IDCC 09 Keynote: Prof. Ed Seidel'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_rEQRc2BOVZs/Sx7GFQJW7HI/AAAAAAAAAHk/sgeYgDL4Oos/s72-c/DSC_0213.JPG' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-3594103778929434139</id><published>2009-12-08T15:51:00.002Z</published><updated>2009-12-08T16:05:01.029Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Keynote: Timo Hannay</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_rEQRc2BOVZs/Sx55D1f17II/AAAAAAAAAHc/7lGtUWuUzbU/s1600-h/IMG_1278.JPG"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_rEQRc2BOVZs/Sx55D1f17II/AAAAAAAAAHc/7lGtUWuUzbU/s200/IMG_1278.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5412896908953119874" /&gt;&lt;/a&gt;Timo Hannay presented a talk entitled “From Web 2.0 to the Global Database”, providing a publishing perspective on the need for cultural change in scientific communication.&lt;br /&gt;&lt;br /&gt;Hannay took a step back to take a bigger-picture view.  He began by giving an overview of his work at Nature, noting that the majority of their business is through the web – although not everyone reads the work electronically, they do access the content through the web.  He then explained how journals are becoming more structured, with links providing supplementary information.  
He admitted that this information is not yet structured enough, but it is there – making journals more like databases.&lt;br /&gt;&lt;br /&gt;Hannay moved on to explain that Nature is getting involved in database publishing.  They help to curate and peer-review database content and commission additional articles to give context to the data.  This is a very different way of being a science publisher – so the change is not just for those doing the science!&lt;br /&gt;&lt;br /&gt;After taking us through Jim Gray's four scientific paradigms, Hannay asked us to think back to a talk by Clay Shirky in 2001, which led to the idea that the defining characteristic of the computer age is not the devices, but the connections.  If a device is not connected to the network, it hardly seems like a computer at all.  This led Tim O'Reilly to develop the idea of the Internet Operating System, which morphed into the name “Web 2.0”.  O'Reilly looked at the companies that survived and thrived after the dot-com bubble and created a list of features which defined Web 2.0 companies, including the Long Tail, software as a service, peer-to-peer technologies, trust systems and emergent data, tagging and folksonomies, and “Data as the new 'Intel Inside'” – the idea that you can derive business benefit from powering data behind the scenes.&lt;br /&gt;&lt;br /&gt;Whilst we have seen Web 2.0 affect science, science blogging hasn't really taken off as much as it could have done – particularly in the natural sciences – and is still not a mainstream activity.  However, Hannay did note some of the long-term changes we are seeing as a result of the web and the tools it brings: increasing specialisation, more information sharing, a smaller 'minimum publishable unit', better attribution, merging of journals and databases – with journals providing more structure to databases – and new roles for librarians, publishers and others.  
Hannay asserted that these changes are leading, gradually, to a speeding up of discovery.&lt;br /&gt;&lt;br /&gt;Hannay took us through some of the resources that are available on the web, from Wikipedia to PubChem and ChemSpider, where the data is structured and annotated through crowdsourcing to make the databases searchable and useable.  &lt;br /&gt;&lt;br /&gt;He asserted that we are moving away from the cottage-industry model of science, with one person doing all the work in the process from designing the experiment to writing the paper.  We are now seeing whole teams with specialisms collaborating across time and space in a more industrial-scale science.  Different areas of science are at different stages with this.&lt;br /&gt;&lt;br /&gt;Hannay referred to Chris Anderson's claim in Wired magazine that we no longer need theory.  He rejected this, but did agree that more is different, so we will be seeing changes.  He gave the example of Google, which didn't develop earlier in the history of the web simply because it was not necessary until the web reached a certain degree of scale for it to be useful.&lt;br /&gt;&lt;br /&gt;Hannay believes that publishers have a role to play in helping to capture, structure and preserve data.  Journals are there to make information more readable for human beings, but they need to think about how they present information to help both humans and computers to search and access information, as both are now equally important.&lt;br /&gt;&lt;br /&gt;All human knowledge is interconnected and the associations between facts are just as important as the facts themselves.  As we reach that point when a computer not connected to the network is not really a computer, Hannay hopes we will reach a point where a fact not connected to other facts in a meaningful way will hardly be considered a fact. One link, one tag, one ID at a time, we are building a global database.  
This may be vast and messy and confusing, but it will be hugely valuable – like the internet itself as the global computer operating system.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-3594103778929434139?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/3594103778929434139/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-keynote-timo-hannay.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3594103778929434139'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3594103778929434139'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-keynote-timo-hannay.html' title='IDCC 09 Keynote: Timo Hannay'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_rEQRc2BOVZs/Sx55D1f17II/AAAAAAAAAHc/7lGtUWuUzbU/s72-c/IMG_1278.JPG' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2054405149788893944</id><published>2009-12-08T15:10:00.003Z</published><updated>2009-12-08T15:37:31.630Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09: Richard Cable Discusses BBC Lab UK and Citizen Science</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} 
catch(e) {}" href="http://1.bp.blogspot.com/_rEQRc2BOVZs/Sx5yozYZfZI/AAAAAAAAAHU/-aX-B0dBHx0/s1600-h/Richard+Cable.jpg"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 251px; height: 320px;" src="http://1.bp.blogspot.com/_rEQRc2BOVZs/Sx5yozYZfZI/AAAAAAAAAHU/-aX-B0dBHx0/s320/Richard+Cable.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5412889847458790802" /&gt;&lt;/a&gt;Richard Cable, the Editor of BBC Lab UK, opened his talk with the traditional representations of science on television (“Science is Fun” vs “blow things up”) in his presentation about the BBC Lab UK initiative – designed to involve the public with science.&lt;br /&gt;&lt;br /&gt;Cable used this comparison to illustrate that most “citizen science” seems to involve the mass engagement of people with science, whereas Lab UK is aimed at mass participation in science.  The project is about new learning: creating scientifically useful surveys and experiments with the BBC audience, online.  &lt;br /&gt;&lt;br /&gt;He discussed the motives of the audience and how this forms a fundamental part of both the design of the experiment and the types of experiment that they can usefully conduct.  For the audience, the experiment has to be a bit of a “voyage of self-discovery” where they learn something about themselves as well as contributing data to the wider experiment – a more altruistic motive.  Cable emphasised that they work with real scientists, properly designed methodologies, ethics approval and peer-review systems so that the experiments are built on solid science and therefore make a useful contribution to scientific knowledge, rather than just entertainment for the audience.&lt;br /&gt;&lt;br /&gt;To illustrate, Cable took us through the history of BBC online mass participation experiments which have led to the development of the new Lab UK brand.  
This included their &lt;em&gt;Disgust&lt;/em&gt; experiment, which involved showing users images and asking them to judge whether they would touch the item in the image.  This was driven by a television programme, which directed the audience to the website after the show.  He also discussed &lt;em&gt;Sex ID&lt;/em&gt;, which worked the opposite way round – with the results of the experiment feeding the content of the programme.  250,000 participants got involved over 3 months to take a series of short flash tests which identified the sex of their brains.  This exemplified his point about giving the audience a motive – with them learning something about themselves as a result of participation.&lt;br /&gt;&lt;br /&gt;In continuing this back-story, Cable briefly introduced &lt;em&gt;Stress&lt;/em&gt;, the prototype launch for the Lab UK brand itself, which was linked into the BBC's &lt;em&gt;Headroom&lt;/em&gt; initiative.  He noted that the general public would rather have something that gave some lifestyle feedback, rather than just being purely sciency.  This experiment – a series of flash tasks and uncomfortable questions – has since been taken down.&lt;br /&gt;&lt;br /&gt;The more recent &lt;em&gt;Brain Test Britain&lt;/em&gt; was a higher-profile experiment, launched by the programme &lt;em&gt;Bang Goes The Theory&lt;/em&gt;, and the first longitudinal one: the audience were asked to revisit the site over a period of 6 weeks to participate, rather than following the one-off site visit and survey model used in the previous examples.  They were expecting 60,000 participants, given the issue of retention, to help establish whether brain training actually works.  
This was a proper clinical trial with academic sponsors from the Alzheimer's Society – the results of which will be announced in a programme later next year.&lt;br /&gt;&lt;br /&gt;The fourth experiment Cable described was &lt;em&gt;The Big Personality Test&lt;/em&gt;, linked with the &lt;em&gt;Child Of Our Time&lt;/em&gt; series following children born in the year 2000.  They used standard accepted models for measuring personality, to give detailed feedback to participants.  They were seeking to answer the question: “Does personality shape your life or does life shape your personality?”.  They attracted 100,000 participants in 3 days, which was vastly more uptake than expected.  The level of data they have collected already is becoming unmanageable, so this means they are having to re-evaluate the duration of the experiment.  &lt;br /&gt;&lt;br /&gt;In the future, they are hoping to take their experiments social using Facebook and Twitter as part of the method.&lt;br /&gt;&lt;br /&gt;Cable summed up these experiments by highlighting the rules they have found they need to apply when designing such experiments.  These include a low barrier to entry, a clear motive for participation, a genuine mass participation requirement, a sound scientific methodology and an aim that will contribute to new knowledge.  &lt;br /&gt;&lt;br /&gt;Cable went on to discuss the practicalities of how experiments are designed from conception to commissioning.  This involves selecting sponsor scientists, who help to design the experiment and analyse the results.  He explained the selection process, which entails finding respected scientists who are flexible and adaptable to this experiment format.  
The role of this “sponsor academic” is to collaborate on experiment design, advise on the ethics processes, interpret the results and then write the peer-reviewed paper resulting from the data and publish their findings.&lt;br /&gt;&lt;br /&gt;The data collected from these experiments comes in two forms: personally identifiable data and anonymisable data.  This means that the scientist cannot trace individual participants back, but the BBC (or three people within the BBC) can trace people back in the event that they need to manage the database and delete entries if requested.  Cable also explained that the data they ask for is driven by the science, not editorial decisions by television programme makers, using standard measures where possible.&lt;br /&gt;&lt;br /&gt;Finally, Cable discussed the actual data and the curation issues surrounding it.  All the data from Lab UK is stored in one place, connected by the BBC ID system, which enables them to start doing secondary analysis of the data where participants have taken part in multiple experiments.  The sponsor academics have a period of exclusivity before the data becomes available for academic and educational purposes only.  However, they are still grappling with issues of data visualisation so they can make this data comprehensible to the general public, and data storage issues – as the BBC does not do long-term data storage.  There are precedents – including the &lt;em&gt;People's War&lt;/em&gt; project where people's memories of World War II were collected and hosted online.  This data has now been passed to the British Museum and forms part of their collection.  
He also noted that there may be demands from the ethics committee on how long they can keep the data before it has to be destroyed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2054405149788893944?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2054405149788893944/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-richard-cable-discusses-bbc-lab.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2054405149788893944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2054405149788893944'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-richard-cable-discusses-bbc-lab.html' title='IDCC 09: Richard Cable Discusses BBC Lab UK and Citizen Science'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_rEQRc2BOVZs/Sx5yozYZfZI/AAAAAAAAAHU/-aX-B0dBHx0/s72-c/Richard+Cable.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4608991766371927535</id><published>2009-12-08T13:26:00.001Z</published><updated>2009-12-08T13:28:58.727Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Interview: Chris Rusbridge</title><content type='html'>Chris Rusbridge, Director of the DCC, 
gives us his views of IDCC 09 and his thoughts about the future of the DCC as it moves towards Phase 3 in this video interview:&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=8052753&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=8052753&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/8052753"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4608991766371927535?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4608991766371927535/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-interview-chris-rusbridge.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4608991766371927535'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4608991766371927535'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-interview-chris-rusbridge.html' title='IDCC 09 Interview: 
Chris Rusbridge'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-505789644121336064</id><published>2009-12-07T19:03:00.004Z</published><updated>2009-12-07T19:15:01.766Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Opening Keynote Recording Now Available</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_rEQRc2BOVZs/SxfUNeRZPHI/AAAAAAAAAGM/SvsISuvY2hU/s1600-h/DSC_0093.JPG"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 183px;" src="http://4.bp.blogspot.com/_rEQRc2BOVZs/SxfUNeRZPHI/AAAAAAAAAGM/SvsISuvY2hU/s200/DSC_0093.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5411026805238348914" /&gt;&lt;/a&gt;After a few problems with the recording following issues with the wireless network at the IDCC 09 venue, I am pleased to say that Professor Douglas Kell's opening keynote is now available online by &lt;a href="http://vimeo.com/8036092"&gt;clicking here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Complete listings of the recordings from all the main IDCC 09 sessions are available at the &lt;a href="http://www.netvibes.com/idcc2009"&gt;event NetVibes page&lt;/a&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-505789644121336064?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/505789644121336064/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://digitalcuration.blogspot.com/2009/12/idcc-09-opening-keynote-recording-now.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/505789644121336064'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/505789644121336064'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-opening-keynote-recording-now.html' title='IDCC 09 Opening Keynote Recording Now Available'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_rEQRc2BOVZs/SxfUNeRZPHI/AAAAAAAAAGM/SvsISuvY2hU/s72-c/DSC_0093.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1985078223449763541</id><published>2009-12-07T15:15:00.002Z</published><updated>2009-12-07T15:20:08.745Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09: Prof. 
Carole Palmer - "The Data Conservancy: A Digital Resource &amp; Curation Virtual  Organisation"</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_rEQRc2BOVZs/Sx0dA0V1l5I/AAAAAAAAAHM/BKUi7WMFYgQ/s1600-h/IMG_1156.jpg"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_rEQRc2BOVZs/Sx0dA0V1l5I/AAAAAAAAAHM/BKUi7WMFYgQ/s200/IMG_1156.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5412514227056646034" /&gt;&lt;/a&gt;Professor Carole Palmer introduced the Data Conservancy which was “cooked up” at the IDCC when it was held in Glasgow.  The Data Conservancy asserts that Research Libraries are a core part of the emerging distributed network of data collections and services.  &lt;br /&gt;&lt;br /&gt;Palmer noted that there is not really an adequate analogy for data services yet (are data sets the new library stacks? Or the new special collections?) but emphasised that data collections and services are consistent with the research library mission.&lt;br /&gt;&lt;br /&gt;The Data Conservancy is a diverse group of domain and data scientists, enterprise experts, librarians and engineers.  Palmer introduced the range of partners involved in the project, and then moved on to discuss how they look to move forward in a very “non-rigid way”, learning to build principles of navigation and how large the solution space actually is – with technical solutions being only a small part of that.  She also noted how an NSF report discussing how successful infrastructure evolves has inspired their group.&lt;br /&gt;&lt;br /&gt;Their goals align with the original programme call for DataNet.  They are going to collect, organise, validate and preserve data as part of a data curation strategy, as necessary for the call.  
They are also going to examine how to bring data together to address grand research challenges that society is currently facing.  However, the strategy is to connect systems to infrastructure and to be highly informed by user-centred design.  They found it was very important to build on existing exemplar projects and to engage with communities that already have deep involvement with scientists.&lt;br /&gt;&lt;br /&gt;Palmer took us through diagrams showing who they intend to support and the responsibilities of each of the teams within the project.  They are trying to strike a balance between research and implementation – which is a requirement of DataNet.&lt;br /&gt;&lt;br /&gt;The Data Conservancy believes in a flexible architecture, but this has to support the wide range of requirements, data and uses that they have across their constituencies.  As a research library project, they are committed to bringing data in, but Palmer noted that not all research libraries can or should do this.&lt;br /&gt;&lt;br /&gt;A big part of their project has to do with building a data framework, so they are thinking a lot about the notion of the “scientific observation” as a common concept across scientific disciplines.  They will be examining existing models and building on this work.  In particular, Palmer talked us through an ORE resource map, noted the need to link data to literature, and explained that, as libraries, they are well positioned to work in this area and improve upon such models.&lt;br /&gt;&lt;br /&gt;The launch pad for the project is data from astronomy – specifically the Sloan Digital Sky Survey, which is almost 3 times bigger than the total data held at Johns Hopkins University, and so presents a big initial problem in terms of scale.  
They will then be taking what they learn from working with this core community and applying this to other areas, including Life Sciences, Earth Science and Social Science, after a deep study of the history of astronomical research processes.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;As part of her presentation, Palmer gave us an overview of the types of projects they are involved with and how they intend to start interfacing between these projects.  She also explained further about their work at Illinois, as a number of her colleagues from Illinois were present at the conference.  Their work has noted that it is not just big, instrument-driven science that will drive this forward, but also smaller science projects.  They are also working to understand how they can determine, early on, the long-term potential of data.  The IDCC will be hosted at Illinois in 2010, which will be followed by a partner summit for DataNet projects as they move forward.  &lt;br /&gt;&lt;br /&gt;To conclude, Palmer discussed the education element of their work, which includes a data curation specialisation in the Masters of Science, with the third class running this semester involving 31 students.  The Data Conservancy is expected to infuse teaching practices and help to educate a more diverse range of students.  
She showed us a slide demonstrating the strategy for building the new workforce at Illinois, with the Data Conservancy working across the various areas.&lt;br /&gt;&lt;br /&gt;There are lots of connections between the Data Conservancy and research groups, so Palmer is looking forward to sharing results, work practices and ideas as they move forward with DataNet.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1985078223449763541?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1985078223449763541/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-prof-carole-palmer-data.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1985078223449763541'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1985078223449763541'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-prof-carole-palmer-data.html' title='IDCC 09: Prof. 
Carole Palmer - &quot;The Data Conservancy: A Digital Resource &amp; Curation Virtual  Organisation&quot;'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_rEQRc2BOVZs/Sx0dA0V1l5I/AAAAAAAAAHM/BKUi7WMFYgQ/s72-c/IMG_1156.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2355319649005988178</id><published>2009-12-07T12:36:00.005Z</published><updated>2009-12-07T12:49:27.071Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>Catch Up On IDCC 09 Day 2</title><content type='html'>The recordings of the main sessions from Day 2 of IDCC 09 are now available to view on &lt;a href="http://www.vimeo.com/"&gt;Vimeo&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Please select from the links below to watch any sessions of interest, or any that you may have missed from the live stream of the event....&lt;br /&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/7980218"&gt;Best Peer-Reviewed Paper “Multi-Scale Data Sharing in the Life Sciences- Some Lessons for Policy Makers” – Graham Pryor, University of Edinburgh&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/7981903"&gt;Keynote Address by Professor Ed Seidel, Associate Director, Directorate of Mathematical and Physical Sciences, National Science Foundation&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/8029421"&gt;Closing Keynote Address: Timo Hannay – Publishing Director, Nature.com, Nature Publishing Group&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/8030259"&gt;Closing Remarks: Chris Rusbridge - Director of the 
DCC&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_rEQRc2BOVZs/Sxz4palqUBI/AAAAAAAAAHE/gCqrWtssKd0/s1600-h/idccnetvibes.jpg"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 115px;" src="http://2.bp.blogspot.com/_rEQRc2BOVZs/Sxz4palqUBI/AAAAAAAAAHE/gCqrWtssKd0/s200/idccnetvibes.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5412474242588102674" /&gt;&lt;/a&gt;You can also find the complete list of session recordings, together with links, at the &lt;a href="http://www.netvibes.com/idcc2009"&gt;IDCC 09 NetVibes page&lt;/a&gt; where we are still gathering #idcc09 tweets and other feeds about the event.  If you are blogging about the event, please remember to use the #idcc09 tag!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2355319649005988178?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2355319649005988178/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/catch-up-on-idcc-09-day-2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2355319649005988178'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2355319649005988178'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/catch-up-on-idcc-09-day-2.html' title='Catch Up On IDCC 09 Day 2'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_rEQRc2BOVZs/Sxz4palqUBI/AAAAAAAAAHE/gCqrWtssKd0/s72-c/idccnetvibes.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-583295515441552656</id><published>2009-12-05T12:45:00.003Z</published><updated>2009-12-07T12:21:30.871Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Delegate Interview: Kevin Ashley</title><content type='html'>Kevin Ashley, who manages the Digital Archives Department for the University of London Computer Centre, gives us his response to IDCC 09 over lunch on the final day.&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=7982903&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=7982903&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/7982903"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-583295515441552656?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' 
href='http://digitalcuration.blogspot.com/feeds/583295515441552656/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview-kevin-ashley.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/583295515441552656'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/583295515441552656'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview-kevin-ashley.html' title='IDCC 09 Delegate Interview: Kevin Ashley'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4098490742882395253</id><published>2009-12-05T12:42:00.001Z</published><updated>2009-12-05T12:45:27.345Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Poster Presenter Interview: John Kunze</title><content type='html'>Poster presenter John Kunze from California Digital Library discusses his poster and his observations from IDCC 09...&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=7982645&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed 
src="http://vimeo.com/moogaloop.swf?clip_id=7982645&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/7982645"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4098490742882395253?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4098490742882395253/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-poster-presenter-interview-john.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4098490742882395253'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4098490742882395253'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-poster-presenter-interview-john.html' title='IDCC 09 Poster Presenter Interview: John Kunze'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4889469885134819396</id><published>2009-12-05T12:31:00.002Z</published><updated>2009-12-05T12:43:21.957Z</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Demonstrator Interview: Terri Mitton</title><content type='html'>Terri Mitton was one of those demonstrating in the Community Space at IDCC 09.  Here she tells us what she has found interesting about the event...&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=7982507&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=7982507&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/7982507"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4889469885134819396?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4889469885134819396/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/terri-mitton-at-idcc-09.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4889469885134819396'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4889469885134819396'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/terri-mitton-at-idcc-09.html' title='IDCC 09 Demonstrator Interview: Terri Mitton'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7906908584010162254</id><published>2009-12-04T11:34:00.002Z</published><updated>2009-12-04T11:37:48.035Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Best Peer-Reviewed Paper: Graham Pryor</title><content type='html'>Presenter of the Best Peer-Reviewed Paper, Graham Pryor of the University of Edinburgh, gives a context to his paper: “Multi-Scale Data Sharing in the Life Sciences - Some Lessons for Policy Makers” in a video interview prior to his presentation on Day 2 of IDCC 09.&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=7978258&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=7978258&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/7978258"&gt;click here&lt;/a&gt; to view this 
video at &lt;a href="http://vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7906908584010162254?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7906908584010162254/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-best-peer-reviewed-paper-graham.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7906908584010162254'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7906908584010162254'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-best-peer-reviewed-paper-graham.html' title='IDCC 09 Best Peer-Reviewed Paper: Graham Pryor'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-8896227029857285531</id><published>2009-12-04T11:30:00.003Z</published><updated>2009-12-04T11:33:00.361Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Delegate Interview: William Kilbride</title><content type='html'>William Kilbride from the Digital Preservation Coalition tells us what he has found interesting on day one of IDCC 09 in this quick video interview over coffee....&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param 
name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=7978234&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=7978234&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/7978234"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-8896227029857285531?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/8896227029857285531/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview-william.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8896227029857285531'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8896227029857285531'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview-william.html' title='IDCC 09 Delegate Interview: William Kilbride'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' 
src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7973947364179498795</id><published>2009-12-04T11:16:00.002Z</published><updated>2009-12-04T11:19:18.118Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Delegate Interview: Duncan Dickinson</title><content type='html'>Duncan Dickinson from the University of Southern Queensland discusses his experience of IDCC 09 in this video interview.&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=7978204&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=7978204&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/7978204"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7973947364179498795?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7973947364179498795/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview.html#comment-form' 
title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7973947364179498795'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7973947364179498795'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-delegate-interview.html' title='IDCC 09 Delegate Interview: Duncan Dickinson'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5736697930532187328</id><published>2009-12-04T11:12:00.003Z</published><updated>2009-12-04T11:34:30.594Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Demonstrator Interview: Heather Bowden</title><content type='html'>Heather Bowden from the University of North Carolina introduces their &lt;a href="http://www.digitalcurationexchange.org"&gt;Digital Curation Exchange&lt;/a&gt;, which was demonstrated during the Community Space session at IDCC 09&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=7978180&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=7978180&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" 
height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/7978180"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5736697930532187328?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5736697930532187328/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-demonstrator-interview-heather.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5736697930532187328'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5736697930532187328'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-demonstrator-interview-heather.html' title='IDCC 09 Demonstrator Interview: Heather Bowden'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4187656300888734457</id><published>2009-12-04T11:08:00.002Z</published><updated>2009-12-04T11:11:48.143Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Peer-Review Paper: Andrew Treloar</title><content type='html'>Dr Andrew Treloar of the Australian National Data Service discusses his paper: “Designing for Discovery and Re-Use: the 'ANDS Data Sharing Verbs' 
Approach to Service Decomposition” and his tomato plants in this short video interview at IDCC 09.&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=7978155&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=7978155&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/7978155"&gt;click here&lt;/a&gt; to view this video in &lt;a href="http://www.vimeo.com"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4187656300888734457?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4187656300888734457/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-peer-review-paper-andrew.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4187656300888734457'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4187656300888734457'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-peer-review-paper-andrew.html' title='IDCC 09 Peer-Review Paper: Andrew 
Treloar'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2964630496875337462</id><published>2009-12-04T10:52:00.002Z</published><updated>2009-12-04T11:07:37.921Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Peer-Review Paper: Tyler Walters</title><content type='html'>In the first of a series of informal interviews from IDCC 09, Tyler Walters from Georgia Institute of Technology gives us a summary of his peer-review paper "Data Curation Program Development in US Universities: The Georgia Institute of Technology Example", presented during the parallel sessions at the International Digital Curation Conference 2009 on Friday 4th December&lt;br /&gt;&lt;br /&gt;&lt;center&gt;&lt;object width="400" height="300"&gt;&lt;param name="allowfullscreen" value="true" /&gt;&lt;param name="allowscriptaccess" value="always" /&gt;&lt;param name="movie" value="http://vimeo.com/moogaloop.swf?clip_id=7978075&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" /&gt;&lt;embed src="http://vimeo.com/moogaloop.swf?clip_id=7978075&amp;amp;server=vimeo.com&amp;amp;show_title=1&amp;amp;show_byline=1&amp;amp;show_portrait=0&amp;amp;color=&amp;amp;fullscreen=1" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="400" height="300"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;[&lt;a href="http://vimeo.com/7978075"&gt;click here&lt;/a&gt; to view this video at &lt;a href="http://www.vimeo"&gt;Vimeo&lt;/a&gt;]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2964630496875337462?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2964630496875337462/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-peer-review-paper-tyler-walters.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2964630496875337462'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2964630496875337462'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-peer-review-paper-tyler-walters.html' title='IDCC 09 Peer-Review Paper: Tyler Walters'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5707525331012801615</id><published>2009-12-04T08:03:00.003Z</published><updated>2009-12-08T12:40:30.663Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>Catch Up On IDCC 09</title><content type='html'>If you missed any of the International Digital Curation Conference Sessions as they were live streamed yesterday, but have read the blog summary here and wish you had seen the presentation - never fear! The recordings are now available online via the video sharing site &lt;a href="http://www.vimeo.com"&gt;Vimeo&lt;/a&gt;. &lt;br /&gt;&lt;br /&gt; Just click on the link for the relevant session below to be taken direct to the recording.  
Unfortunately, there was a glitch with the recording for Prof. Douglas Kell's opening keynote, but we hope to get this up online in the early part of next week.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;IDCC 09 Day One Plenaries:&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/7965677"&gt;Prof Carole Palmer: The Data Conservancy: A Digital Resource &amp; Curation Virtual Organisation&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/7965857"&gt;Dr William Michener: DataONE: A Virtual Data Center for Biology, Ecology, and the Environmental Sciences&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/7965921"&gt;Prof Anne Trefethen: NeuroHub: The information environment for the neuroscientist&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/7966158"&gt;Mark Birkin: National e-Infrastructure for Social Simulation (NeISS)&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/7966244"&gt;Panel Discussion: UK/US Perspectives&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/8051987"&gt;DCC Symposium: Citizen Science: Data Challenges&lt;br/&gt;&lt;br /&gt;Introduction /Chair: Dr Liz Lyon, Associate Director, DCC&lt;br/&gt;&lt;br /&gt;Presentation: Richard Cable, BBC Lab UK&lt;/a&gt;&lt;br /&gt;&lt;br/&gt;&lt;br/&gt;&lt;br /&gt;&lt;a href="http://vimeo.com/7966466"&gt;Summing Up: Cliff Lynch, Executive Director, Coalition for Networked Information&lt;/a&gt;&lt;br/&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5707525331012801615?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5707525331012801615/comments/default' title='Post Comments'/><link rel='replies' type='text/html' 
href='http://digitalcuration.blogspot.com/2009/12/catch-up-on-idcc-09.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5707525331012801615'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5707525331012801615'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/catch-up-on-idcc-09.html' title='Catch Up On IDCC 09'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1284736309468804953</id><published>2009-12-03T22:02:00.002Z</published><updated>2009-12-03T22:07:30.019Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09: Cliff Lynch Sums Up Day One</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_rEQRc2BOVZs/Sxg2jhBicSI/AAAAAAAAAG0/NF9alJwlsEk/s1600-h/IMG_1215.jpg"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_rEQRc2BOVZs/Sxg2jhBicSI/AAAAAAAAAG0/NF9alJwlsEk/s200/IMG_1215.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5411134936073793826" /&gt;&lt;/a&gt;In summing up day one, Cliff Lynch observed how the focus of the discussion at these conferences has shifted over the last five years, harking back to the first meetings when there was more discussion about preservation rather than curation.  
Lynch noted that we are now beginning to understand that preservation has to be a supporting structure to curation, which is a more complex process – more deeply involved in the research process.   &lt;br /&gt;&lt;br /&gt;One of the other trends he observed emerging is that of “re-use” of data.  We are no longer interested just in preserving data, but also in evaluating the prospects of re-use and improving those prospects, where possible, to derive greater value from our data.  &lt;br /&gt;&lt;br /&gt;Lynch noted that there is a deepening linkage between the tools and workflows that researchers use, so data curation needs to be increasingly integrated, as this will help solve the problems of metadata, provenance and so on to make curation more effective.  &lt;br /&gt;&lt;br /&gt;Lynch was very happy to hear mention of the notion that we need to get the scientific equipment developers and vendors involved.  This could help feed curation into the workflow more effectively – he gave the example of cultural heritage researchers who found their cameras “knew” a lot of the metadata that they had to laboriously enter to fulfil their curation needs, and so could use the equipment to aid in the curation of the data it produced.&lt;br /&gt;&lt;br /&gt;Lynch took a lot of heart from the focus on education to give us a generation of data preservers and data curators.  He was also heartened by comments that funding agencies were taking data curation seriously as part of the grant proposal and review processes.  He also suggested it would be great if we could actually track the progress of this type of cultural shift.&lt;br /&gt;&lt;br /&gt;In concluding, Lynch looked at the more speculative elements of the day's discussion, including the Citizen Science debate – referring to Liz Lyon's paper on the topic.  However, he asked us to recall that there is a whole range of computational and observational citizen science tasks, not just the survey-based BBC Lab UK model.  
He also reminded us that this is not just applicable to science... we are seeing the emergence of Citizen Humanities and amateur study in other areas which has been revitalised by the web.  What we need to look to is building data support for citizen scholarship as a whole.&lt;br /&gt;&lt;br /&gt;Finally, Lynch made a speculation involving the measure “scientific papers per minute”, which underscores how badly out of control scientific communication is and creates a huge problem when propagating and curating knowledge.  It seems to Lynch that one of the things we need to recognise is that many of these papers don't need to be papers, but could instead be database submissions.  This would be a better way to do things if we are going to manage the data – without the emphasis on the traditional individual-voice analysis paper.   So we need to have a hard conversation about traditional forms of scientific communication and data curation to determine how data curation fits into scholarly communication and how scholarly communication may need to change to help us manage the sheer volume of the output.&lt;br /&gt;&lt;br /&gt;I caught up with Cliff just after his summary of day one to ask what he is looking forward to most from day two of IDCC 09...&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;“I am looking forward to hearing from Ed Seidel.  Most of us in the States know that there are three more DataNet awards in their final stages, so we would love to know who has got the inside track on those... although I suspect he will say that he can't comment on that! 
&lt;br /&gt;&lt;br /&gt;Following on from my summary, I would like to know what people think about how we can track the uptake on data curation in funding bids.&lt;br /&gt;&lt;br /&gt;Having been involved in the paper review process, I know that the best peer-review paper is very good, and there are some other great papers being presented tomorrow, so I am very much looking forward to it!”&lt;br /&gt;&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1284736309468804953?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1284736309468804953/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-cliff-lynch-sums-up-day-one.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1284736309468804953'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1284736309468804953'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-cliff-lynch-sums-up-day-one.html' title='IDCC 09: Cliff Lynch Sums Up Day One'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_rEQRc2BOVZs/Sxg2jhBicSI/AAAAAAAAAG0/NF9alJwlsEk/s72-c/IMG_1215.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-3032926396227496671</id><published>2009-12-03T21:55:00.001Z</published><updated>2009-12-03T21:58:54.028Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09: Panel Discussion - UK/US Perspectives</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_rEQRc2BOVZs/Sxg0ephrFyI/AAAAAAAAAGs/o6034MN8L_g/s1600-h/DSC_0144.JPG"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 133px;" src="http://4.bp.blogspot.com/_rEQRc2BOVZs/Sxg0ephrFyI/AAAAAAAAAGs/o6034MN8L_g/s200/DSC_0144.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5411132653433460514" /&gt;&lt;/a&gt;To contrast and conclude the morning plenary sessions, the four speakers formed a panel to accept questions from the audience.&lt;br /&gt;&lt;br /&gt;Q: Anne Trefethen was asked to explain more about Blog3.&lt;br /&gt;A: Anne's colleagues have been using it and it will be launched next week at the All Hands meeting.&lt;br /&gt;&lt;br /&gt;Q: How is user-centric design being used in other areas? (aside from Neuro Hub)&lt;br /&gt;A: Carole Palmer explained that their work does involve requirements based work, whilst William Michener explained that DataONE engaged users from the beginning from different research centers, each of which also does its own work in their centers to establish the needs of their users.  
Mark Birkin explained how they have identified three different types of users – emphasising how diverse the groups of users can be – and highlighted the use of social networking tools to harvest user views directly.&lt;br /&gt;&lt;br /&gt;Q: What perspectives do the panel have as to whether data curation is still a pioneering activity and what level of maturity is there among researchers?&lt;br /&gt;A: Trefethen noted that whilst some researchers are mature in their understanding of data management, there are groups who are surprised by the requirement for curation commitments in the funding bids.  She explained that an understanding needs to be nurtured across disciplines, not just within individual disciplines.  Palmer explained that in terms of preservation, they see people lining up at the door, whilst the data sharing side is not so practised (although there are people very keen philosophically).  Bad experiences have fed into this.  Michener noted that there has been a disservice done by failing to educate young scientists in good data practice as part of doing science, so there is a lot of re-educating to be done.  Birkin has a different perspective, as he is doing secondary analysis of well preserved primary data sources.  However, there is not the same level of practice about the secondary analysis of data in his field, which can lead to researchers having to reinvent methods.&lt;br /&gt;&lt;br /&gt;Q: Are there plans to be able to cite the data that's being used? &lt;br /&gt;A: Michener is looking at a data citation model that will rely on digital object identifiers to give scientists as much credit as possible for not just their publications, but also their data.  It is key to cite the data as a specific object, as the data can lead to multiple publications.  
The other three panellists agreed that this is part of their projects, at different levels of priority.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-3032926396227496671?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/3032926396227496671/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-panel-discussion-ukus.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3032926396227496671'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3032926396227496671'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-panel-discussion-ukus.html' title='IDCC 09: Panel Discussion - UK/US Perspectives'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_rEQRc2BOVZs/Sxg0ephrFyI/AAAAAAAAAGs/o6034MN8L_g/s72-c/DSC_0144.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5574236712838989636</id><published>2009-12-03T21:42:00.002Z</published><updated>2009-12-03T21:48:32.210Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09: Mark Birkin Presents NeISS</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="http://3.bp.blogspot.com/_rEQRc2BOVZs/SxgyHN5o0GI/AAAAAAAAAGk/PQ8LGuF4DjQ/s1600-h/IMG_1193.jpg"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_rEQRc2BOVZs/SxgyHN5o0GI/AAAAAAAAAGk/PQ8LGuF4DjQ/s200/IMG_1193.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5411130051857535074" /&gt;&lt;/a&gt;To open, Birkin gave us an introduction to GIS (Geographic Information Systems) which display data sets as map graphics.  He demonstrated some of the applications of this type of system, which have spawned quite a large industry, with 20,000 people in the US who claim to use GIS each day in their work.&lt;br /&gt;&lt;br /&gt;So they have this transformational technology in geography, which enables one to manage and integrate spatial and attribute data and has widespread applications for demographics, climate research, land-use, health, business, crime etc.  Birkin admitted that his first reaction when introduced to these maps was: “So what?! What can we do with these systems to make decisions or provide insights into the kind of phenomena we are studying?”&lt;br /&gt;&lt;br /&gt;He used the examples of how intelligence can be added to the GIS data through spatial analysis, helping to automatically identify burglary hotspots, which has been used to inform policing decisions, mathematical models drawn from GIS, simulation and dynamic simulation.&lt;br /&gt;&lt;br /&gt;Birkin went on to give us more detail about how he is trying to create a social simulation of the city of Leeds by combining data sources, and how this can inform policy makers.  This includes creating “synthetic individuals” to create a complete model.&lt;br /&gt;&lt;br /&gt;As a researcher looking to create simulations and analyse issues using geographical  information, there are loads of data sources.  You would download this information and go away and create the simulations independently.  
The point of the NeISS project is to create a framework for sharing the value-adding work of creating the simulations from the different data sets.  They started in the spring by building portals to bring together technologies to help create an infrastructure with the capabilities to help add value to all the data that is available.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5574236712838989636?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5574236712838989636/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-mark-birkin-presents-neiss.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5574236712838989636'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5574236712838989636'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-mark-birkin-presents-neiss.html' title='IDCC 09: Mark Birkin Presents NeISS'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_rEQRc2BOVZs/SxgyHN5o0GI/AAAAAAAAAGk/PQ8LGuF4DjQ/s72-c/IMG_1193.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-6568865595392554654</id><published>2009-12-03T21:27:00.002Z</published><updated>2009-12-03T21:42:31.330Z</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09: Anne Trefethen Presents NeuroHub</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_rEQRc2BOVZs/Sxgwp1Hp9WI/AAAAAAAAAGc/VoAdCAnbjtY/s1600-h/DSC_0110.JPG"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 133px;" src="http://3.bp.blogspot.com/_rEQRc2BOVZs/Sxgwp1Hp9WI/AAAAAAAAAGc/VoAdCAnbjtY/s200/DSC_0110.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5411128447477609826" /&gt;&lt;/a&gt;Anne Trefethen introduced NeuroHub, which is a much smaller project than the US projects presented earlier in the day.  Their aim is to work with neuroscientists to enable them to share their data – work which they hope will fit into the wider picture and generate some useful tools.&lt;br /&gt;&lt;br /&gt;The project involves Oxford, Reading and Southampton universities, together with their research council STFC e-Science Group.  She explained how late trains brought two main players in the project together in conversation, which helped to develop the platform for collaboration which evolved between the universities.&lt;br /&gt;&lt;br /&gt;Trefethen then focused in on the science involved.  She noted that their work will only be of value if they help to deliver the science.  She explained the specific projects each university group is working on – including studying the way neural networks of insects work to enable them to move their limbs.  She asked us to take note of the types of images that her slides showed – demonstrating the types of data that the scientists want to share with their colleagues – in raw form.  She drew our attention to the note that some of the diagrams were stored with Spike 2 software – exemplifying the need to be aware of the wide range of tools when storing and sharing data.  Data does not just include images, but also video.  
She explained that one of the apparently small, but significant considerations is that the scientists do not want to have to use a USB key in order to share their data.&lt;br /&gt;&lt;br /&gt;She emphasised that it is very important to identify what the data we are using actually are.  Experiments do not necessarily create metadata to make it easy to find and share the results later.  She also noted the complex range of software products being used in different ways to collect, process and publish the data.&lt;br /&gt;&lt;br /&gt;To overcome this, the NeuroHub project involves embedding the developers in the neuroscience labs in the early stages to gain insight, combined with structured and unstructured interviews to establish how all of these issues mesh together.&lt;br /&gt;&lt;br /&gt;Trefethen then moved on to look at the challenges they are facing – a list that she admitted could have been taken from Douglas Kell's opening keynote.  These include the variety of interdisciplinary teams and their different expectations, cultures, requirements, and understanding of shared terms, all of which can obstruct data sharing.  They have been using an agile development process to try to resolve some of these challenges and ensure that they develop tools that actually work for the scientists in practice.&lt;br /&gt;&lt;br /&gt;She explained that their aim is “jam today, jam tomorrow” i.e. doing simple things that can make a big difference.  This can include things like format conversions and proper annotation to help facilitate data sharing.&lt;br /&gt;&lt;br /&gt;Trefethen then introduced some of the related projects that they are interacting with – including myExperiment (“Facebook for scientists” - socialising the data and providing annotation) and CARMEN, which is larger than NeuroHub, but more focussed on one area and works in the same community – promoting standards.  
There is a lot out there that they can integrate into NeuroHub.&lt;br /&gt;&lt;br /&gt;In explaining the environment architecture of the project, Trefethen emphasised that they did not want to develop a large, monolithic system, but rather something that is in their workspace and creates an environment that empowers the researchers.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-6568865595392554654?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/6568865595392554654/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-anne-trefethen-presents.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6568865595392554654'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6568865595392554654'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-anne-trefethen-presents.html' title='IDCC 09: Anne Trefethen Presents NeuroHub'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_rEQRc2BOVZs/Sxgwp1Hp9WI/AAAAAAAAAGc/VoAdCAnbjtY/s72-c/DSC_0110.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4578743471177120674</id><published>2009-12-03T15:13:00.001Z</published><updated>2009-12-03T15:15:43.725Z</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09: William Michener Presents DataONE</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_rEQRc2BOVZs/SxfV5G7A6CI/AAAAAAAAAGU/0GMj71jsBhI/s1600-h/DSC_0107.JPG"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 133px;" src="http://2.bp.blogspot.com/_rEQRc2BOVZs/SxfV5G7A6CI/AAAAAAAAAGU/0GMj71jsBhI/s200/DSC_0107.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5411028654396336162" /&gt;&lt;/a&gt;Michener began his presentation introducing DataONE by asking “Why?” He explained that his team is focussed on the environmental challenges that are of increasing concern to us all.&lt;br /&gt;&lt;br /&gt;He described their approach to build a knowledge pyramid and the role of citizen science to contribute more site-based data.  Michener also highlighted the problem of data loss not just due to fire and other physical factors created by unstable data storage systems such as tapes, but also an inability to archive all the data – predicting that next year they will lose as much data as gets collected in the course of the year. &lt;br /&gt;&lt;br /&gt;Data that gets collected does lose value over time.  Scientists are more familiar with their data set at the time of publication, so without comprehensive metadata, we lose a lot of the details that make the data re-useable and relevant.  &lt;br /&gt;&lt;br /&gt;DataONE is designed to address some of these challenges.  They aim to provide universal access to data about life on earth.  
They are developing a cyber infrastructure with distributed nodes: member nodes, where data are stored at existing sites; co-ordinating nodes, which do not physically store data but hold the metadata catalogues; and an investigator toolkit for students and scientists to help access and use this data.&lt;br /&gt;&lt;br /&gt;Michener then took us through how they plan to support the data lifecycle – providing examples of the systems and tools to support data contributors and data users.  &lt;br /&gt;&lt;br /&gt;DataONE is initially focussing on biological and environmental data, but they recognise that many of the issues require using other data sources – including social science.  They already have a range of data sources and a diverse array of partners which include libraries, academic institutions and research networks.  They plan to leverage existing structures, but also to expand the network throughout the project.  He also explained the link between DataONE and the Data Conservancy described by Carole Palmer in the previous presentation.&lt;br /&gt;&lt;br /&gt;Michener showed how they have already identified a number of member nodes across the globe representing a number of research networks – but there is no common data management standard between them.  “The problem with standards is that there are so many of them,” he noted, so there is no one-size-fits-all approach.&lt;br /&gt;&lt;br /&gt;He demonstrated some of the investigator tools that they are developing to help search and filter the data available, which also enable you to bring up the metadata as well as the original data.  
Finally, he also introduced the EVA working group, which looks at exploring, visualising and analysing data, and gave a practical example of how they have combined data and created visualisations mapping vegetation and bird migration.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4578743471177120674?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4578743471177120674/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-william-michener-presents.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4578743471177120674'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4578743471177120674'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-william-michener-presents.html' title='IDCC 09: William Michener Presents DataONE'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_rEQRc2BOVZs/SxfV5G7A6CI/AAAAAAAAAGU/0GMj71jsBhI/s72-c/DSC_0107.JPG' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1475708767013462491</id><published>2009-12-03T15:03:00.002Z</published><updated>2009-12-03T15:08:01.942Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Keynote Address: Douglas 
Kell</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_rEQRc2BOVZs/SxfUNeRZPHI/AAAAAAAAAGM/SvsISuvY2hU/s1600-h/DSC_0093.JPG"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 183px;" src="http://4.bp.blogspot.com/_rEQRc2BOVZs/SxfUNeRZPHI/AAAAAAAAAGM/SvsISuvY2hU/s200/DSC_0093.JPG" border="0" alt=""id="BLOGGER_PHOTO_ID_5411026805238348914" /&gt;&lt;/a&gt;Professor Douglas Kell provided us with a BBSRC perspective on digital data curation and his own perspective as an academic. &lt;br /&gt;&lt;br /&gt;He began with the philosophy of data-science, as data curation is not an end in itself, but rather a means to an end.  He showed a diagram that he referred to as an arc of knowledge, which demonstrated that one would normally start with an idea or hypothesis, then do an experiment that would produce some data that would either be consistent with the hypothesis or not.  The other side of the arc addresses where those ideas come from – the data, which is the starting point for many within the audience at IDCC 09.  This is data-driven science, where hypotheses are generated from the data, as opposed to hypothesis-driven science, which he admits the biological community is quite reliant on.  As we move towards more data-driven science we encounter the problem of storing not just the data itself so we can make use of it, but also the knowledge generated by the data – a theme that would be picked up by Mark Birkin in his later talk.&lt;br /&gt;&lt;br /&gt;The digital availability of all sorts of stuff changes the entire epistemology of how we do science, making all sorts of things possible.  &lt;br /&gt;&lt;br /&gt;Historically, physics was seen as the high data science, whilst biology is now being recognised as a high data science.  
Biology is a short, fat data model, with less data, but more people using it, whilst physics could be described as a long, thin data model.  He did note that there is a lot of video and image data related to biological studies that is not being shared because people don't yet know how to handle it.&lt;br /&gt;&lt;br /&gt;Kell made the point that if you can access and re-use scientific data you will gain a huge advantage over those who do not, both academically and potentially commercially, which helps to drive funding for this type of work.  He used the example of genomics to show just one of the fields generating huge data sets, which could be used for data-driven science, but need to be integrated.&lt;br /&gt;&lt;br /&gt;Relative to the cost of creating the data, the cost of storing and curating it is minimal, so it is a good idea to store it effectively.  But the issue is not just storage.  There is also the cost of moving it.  Kell asserted that we need to move to a model where we don't take the data to the computing, but rather take the computing to the data, which will change the way we approach storage and sharing.&lt;br /&gt;&lt;br /&gt;Kell moved on to explain that we will need to have a new breed of curators and tools to deal with the challenges, particularly making that data useful.  Having the data does not always help, as generally biologists do not have the tools to deal with big data.  He expects a type of software to evolve that does not sit locally on the machine but sits somewhere else, where it gets changed and updated, yet is useable by the scientist without specialist computing knowledge.  &lt;br /&gt;&lt;br /&gt;Kell pointed out that things do not evolve just from databases getting bigger, but rather from the tools to deal with the data and the curation methods evolving too.  
To illustrate, he pointed out that five scientific papers per minute are published, so we need a lot of tools to make this vast amount of literature (and the associated data) useful, so it does not just end up in a data tomb.  &lt;br /&gt;&lt;br /&gt;Some areas of science have better strategies than others, but BBSRC are now looking at making data curation and sharing part of the funding processes, while making sure that data-driven projects are not competing with hypothesis-driven bids.  He noted that journalists are keen, funding bodies are keen and the culture will soon change so that NOT managing and sharing data will become distinctly uncool, like smoking.&lt;br /&gt;&lt;br /&gt;Finally, Kell emphasised the need to integrate the data and the metadata.  He noted that the digital availability of all data has the potential to stop the balkanisation of scientific data, and it is the responsibility of people within the room to ensure this.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1475708767013462491?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1475708767013462491/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-keynote-address-douglas-kell.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1475708767013462491'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1475708767013462491'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-keynote-address-douglas-kell.html' title='IDCC 09 Keynote Address: Douglas Kell'/><author><name>Kirsty 
McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_rEQRc2BOVZs/SxfUNeRZPHI/AAAAAAAAAGM/SvsISuvY2hU/s72-c/DSC_0093.JPG' height='72' width='72'/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-6890937630203632322</id><published>2009-12-02T23:08:00.008Z</published><updated>2009-12-02T23:24:02.211Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='#idcc09'/><title type='text'>IDCC 09 Gets Underway...</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_rEQRc2BOVZs/Sxb04Putz5I/AAAAAAAAAF0/Y3fPiX9wT14/s1600-h/IMG_1150.jpg"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_rEQRc2BOVZs/Sxb04Putz5I/AAAAAAAAAF0/Y3fPiX9wT14/s200/IMG_1150.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5410781249464553362" /&gt;&lt;/a&gt;The 5th International Digital Curation Conference got underway stylishly in London this evening as Dr Liz Lyon, Associate Director of the DCC, welcomed delegates to the opening drinks reception at the Natural History Museum, which was specially lit in DCC orange and red for the occasion.  
&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_rEQRc2BOVZs/Sxb1MJfhapI/AAAAAAAAAF8/EFI-diMcjTQ/s1600-h/Liz+Lyons.png"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 200px; height: 112px;" src="http://2.bp.blogspot.com/_rEQRc2BOVZs/Sxb1MJfhapI/AAAAAAAAAF8/EFI-diMcjTQ/s200/Liz+Lyons.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5410781591387597458" /&gt;&lt;/a&gt;Liz gave us some background to the Natural History Museum and some of its exhibits – including tales of “Dippy” (the iconic skeletal Diplodocus) and an 8-foot squid housed in a tank produced by Damien Hirst's suppliers.  These impressive displays aside, Liz noted that the museum has its own range of digital curation challenges, which are explored in an interview with Neil Thompson at the DCC website.  Liz then introduced us to Lee Dirks of Microsoft Research, who kindly sponsored the event.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_rEQRc2BOVZs/Sxb1hDsjaAI/AAAAAAAAAGE/iMWXhlK9V5g/s1600-h/Lee+Dirks.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 200px; height: 112px;" src="http://4.bp.blogspot.com/_rEQRc2BOVZs/Sxb1hDsjaAI/AAAAAAAAAGE/iMWXhlK9V5g/s200/Lee+Dirks.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5410781950608893954" /&gt;&lt;/a&gt;Lee expressed his team's honour at having the opportunity to sponsor the evening drinks reception and noted that the IDCC was one of the more fascinating conferences in the field and the one that he has prioritised getting to for the last three years.  He explained very briefly that Microsoft Research has been doing a lot of work in e-Science and e-Research areas and they feel very strongly that data curation is a critical activity that needs to receive more attention.  
He proposed a toast to Chris Rusbridge and all the organisers for putting on the event and reminded the speakers that those of us in the audience will be looking to them for all the answers over the coming days.&lt;br /&gt;&lt;br /&gt;With that in mind, I will be blogging here over the next two days to summarise all of the hopefully illuminating presentations that are packed into this year's programme, together with a range of interviews with speakers and delegates.  Please feel free to comment, and if you are attending the conference and blogging yourself, please include a link to your post.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-6890937630203632322?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/6890937630203632322/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-gets-underway.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6890937630203632322'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6890937630203632322'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/idcc-09-gets-underway.html' title='IDCC 09 Gets Underway...'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_rEQRc2BOVZs/Sxb04Putz5I/AAAAAAAAAF0/Y3fPiX9wT14/s72-c/IMG_1150.jpg' height='72' 
width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-8865491559934688111</id><published>2009-12-02T16:37:00.005Z</published><updated>2009-12-02T16:58:50.373Z</updated><title type='text'>Five-Minute Interview at IDCC 09: William Michener</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_rEQRc2BOVZs/SxacicL-zdI/AAAAAAAAAFs/px9Q_jI4rxk/s1600-h/Michener.jpg"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 267px; height: 188px;" src="http://4.bp.blogspot.com/_rEQRc2BOVZs/SxacicL-zdI/AAAAAAAAAFs/px9Q_jI4rxk/s320/Michener.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5410684117828160978" /&gt;&lt;/a&gt;In the first of our interviews from IDCC 09, William Michener gives us a sneak preview of his plenary talk, which will be &lt;a href="http://www.netvibes.com/idcc2009#Live"&gt;live streamed&lt;/a&gt; at 10:15 tomorrow.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Introduction:&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;I am William Michener from University of New Mexico and I am a professor with the University Libraries System&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;What will you be talking about in your presentation tomorrow?&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;I am introducing DataONE which stands for Data Observation Network for Earth, which is a virtual data centre for the Biological, Ecological and Environmental Sciences.  It is unique in that it will be highly distributed with world wide presence and it will support the curation and preservation of data from Universities, individual scientists, research networks and other organisations.  In addition, it will host an investigators' tool kit that will provide data exploration, data management and analytical and visualisation tools.  
This will be for scientists, students and citizens.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;What's next for DataONE?&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;This is a long term project to set up, essentially, a virtual data centre that will last decades to centuries so we hope to build new partnerships and expand upon existing partnerships with a large number of European organisations&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;What are you looking for in Phase 3 from the DCC?&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;We hope that we could explore collaborations with respect to developing educational materials as well as providing digital curation and other tools for scientists and students.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-8865491559934688111?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/8865491559934688111/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/five-minute-interview-at-idcc-09.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8865491559934688111'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8865491559934688111'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/five-minute-interview-at-idcc-09.html' title='Five-Minute Interview at IDCC 09: William Michener'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail 
xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_rEQRc2BOVZs/SxacicL-zdI/AAAAAAAAAFs/px9Q_jI4rxk/s72-c/Michener.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4404548780715866300</id><published>2009-12-01T15:57:00.005Z</published><updated>2009-12-01T18:53:21.974Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Live Streaming'/><category scheme='http://www.blogger.com/atom/ns#' term='IDCC09'/><title type='text'>Live Streaming at IDCC 09</title><content type='html'>If you are planning to follow the 5th International Digital Curation Conference online, you can now watch the main sessions broadcast from the event via a live video stream.  &lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.netvibes.com"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 78px; height: 93px;" src="http://www.aecom.org/blog/images/netvibes-logo.png" border="0" alt="" /&gt;&lt;/a&gt;You can see the live stream at the new event NetVibes page at &lt;a href="http://www.netvibes.com/idcc2009#Live"&gt;www.netvibes.com/idcc2009&lt;/a&gt;.  
NetVibes is a tool for collecting resources and feeds from all over the web, which we hope will enable you to get everything you need to stay up to date in one place, if you are participating in the event remotely.&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_rEQRc2BOVZs/SxVEwyBvLPI/AAAAAAAAAFk/wo83zF4ko7s/s1600/idccnetvibes.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 230px;" src="http://2.bp.blogspot.com/_rEQRc2BOVZs/SxVEwyBvLPI/AAAAAAAAAFk/wo83zF4ko7s/s400/idccnetvibes.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5410306132208987378" /&gt;&lt;/a&gt; &lt;center&gt;[screen shot of IDCC 2009 NetVibes page]&lt;/center&gt;&lt;br /&gt;&lt;br /&gt;You will find recordings of any sessions that you may have missed, updates from this blog and the official @idcclive Twitter commentary, and all the opinions and comments tagged with the #idcc09 hash tag - all brought together at one page. &lt;br /&gt;&lt;br /&gt;The main plenary sessions, including the DCC symposium, will be live streamed on Thursday 3rd December and Friday 4th December, subject to consent from the individual speakers. 
Parallel sessions will not be covered.&lt;br /&gt;&lt;br /&gt;If you cannot access NetVibes for any reason, then you can also view the live video stream by &lt;a href="http://www.tconsult-ltd.com/IDCC09.html"&gt;clicking here.&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4404548780715866300?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4404548780715866300/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/live-streaming-at-idcc-09.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4404548780715866300'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4404548780715866300'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/12/live-streaming-at-idcc-09.html' title='Live Streaming at IDCC 09'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_rEQRc2BOVZs/SxVEwyBvLPI/AAAAAAAAAFk/wo83zF4ko7s/s72-c/idccnetvibes.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1448311797701942280</id><published>2009-11-25T17:55:00.007Z</published><updated>2009-11-26T08:38:29.314Z</updated><title type='text'>IDCC 2009 Amplified!</title><content type='html'>&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;As Chris has 
recently announced, the annual International Digital Curation Conference is almost upon us.  This year's event will be amplified using a range of online social media tools to help include those who can't make it to London on 3&lt;/span&gt;&lt;sup&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;rd&lt;/span&gt;&lt;/sup&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt; and 4&lt;/span&gt;&lt;sup&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;th&lt;/span&gt;&lt;/sup&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt; December, and to capture the online conversation surrounding the event for future reference.&lt;br /&gt;&lt;br /&gt;This blog will form the centre point of the coverage.  There will be summaries of each of the sessions, video interviews with speakers and delegates, and much more.  So, if you are reading via the RSS feed, expect a flurry of updates throughout the conference!  If you're not subscribed to the RSS feed, make sure you check back regularly during the event to see what's been covered.&lt;br /&gt;&lt;br /&gt;You will also be able to follow the official live commentary of each of the plenary sessions on Twitter by following @idcclive, and take part in the conversation using the event hash tag #idcc09.  
If you have a question for a speaker, simply tweet your question to @idcclive and it will be relayed to the speaker for you at an appropriate point.&lt;br /&gt;&lt;br /&gt;We look forward to seeing you at IDCC 09 – whether in person or online!&lt;br /&gt;_________________&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 100px; height: 75px;" src="http://4.bp.blogspot.com/_rEQRc2BOVZs/Sw135Q33aEI/AAAAAAAAAFc/iEE9ALInJ2o/s200/kirsty.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5408110553207367746" /&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;br /&gt;The amplification of IDCC 09 will be co-ordinated by Kirsty McGill.  Kirsty is the Creative Director of communications and training firm &lt;/span&gt;&lt;/span&gt;&lt;a href="http://www.tconsult-ltd.com/"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;TConsult Ltd&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1448311797701942280?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1448311797701942280/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/11/as-chris-has-recently-announced-annual.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1448311797701942280'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1448311797701942280'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/11/as-chris-has-recently-announced-annual.html' title='IDCC 2009 Amplified!'/><author><name>Kirsty McGill</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='24' src='http://3.bp.blogspot.com/_rEQRc2BOVZs/SvlT1d02PHI/AAAAAAAAAEo/CblXc9p8rpo/S220/kirsty.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_rEQRc2BOVZs/Sw135Q33aEI/AAAAAAAAAFc/iEE9ALInJ2o/s72-c/kirsty.jpg' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4468098374502296499</id><published>2009-11-18T22:40:00.002Z</published><updated>2009-11-18T22:47:35.313Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Repositories'/><category scheme='http://www.blogger.com/atom/ns#' term='data curation'/><category scheme='http://www.blogger.com/atom/ns#' term='Citation'/><category scheme='http://www.blogger.com/atom/ns#' term='training'/><category scheme='http://www.blogger.com/atom/ns#' term='IDCC09'/><title type='text'>Workshops prior to the International Digital Curation Conference</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span class="Apple-style-span"   style="  color: rgb(38, 38, 38); font-family:ArialMT, serif;font-size:17px;"&gt;Pre-conference workshops can be very useful and interesting; they can be a good part of the justification for attending a conference, giving an extended opportunity to focus on a single topic, followed by a broader (but shallower) look 
at many topics, at the conference itself. This time it is quite frustrating, as I would very much like to go to all the workshops! There is still time to &lt;a href="http://www.dcc.ac.uk/events/dcc-2009/"&gt;register for your choice, and for the IDCC conference&lt;/a&gt; itself.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;b&gt;Disciplinary Dimensions of Digital Curation: New Perspectives on Research Data &lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;Our SCARP Project case studies have explored data curation practice across a variety of clinical, life, social, humanities, physical and engineering research communities. 
This workshop is the final event in SCARP, and will present the reports and synthesis.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;tab-stops:11.0pt 36.0pt;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#2B5ABC;"&gt;&lt;b&gt;&lt;a href="http://www.dcc.ac.uk/events/dcc-2009/programme/SCARP%20Workshop%20Programme.pdf"&gt;&lt;span style="color:#2B5ABC;"&gt;See the full programme&lt;/span&gt;&lt;/a&gt;&lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt; &lt;b&gt;[PDF]&lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;b&gt;Digital Curation 101 Lite Training &lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" 
style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;Research councils and funding bodies are increasingly requiring evidence of adequate and appropriate provisions for data management and curation in new grant funding applications. This one-day training workshop is aimed at researchers and those who support researchers and want to learn more about how to develop sound data management and curation plans. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;tab-stops:11.0pt 36.0pt;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#2B5ABC;"&gt;&lt;b&gt;&lt;a href="http://www.dcc.ac.uk/events/dcc-2009/programme/IDCC%20Digital%20Curation%20101%20Lite.pdf"&gt;&lt;span style="color:#2B5ABC;"&gt;See the full programme&lt;/span&gt;&lt;/a&gt;&lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt; &lt;b&gt;[PDF]&lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;b&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;b&gt;Citability of Research Data&lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"  
  style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;Goal: Handling research datasets as unique, independent, citable research objects offers a wide variety of opportunities.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;The goal of the new DataCite cooperation is to establish a not-for-profit agency that enables organisations to register research datasets and assign persistent identifiers to them.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;Citable datasets are accessible and can be integrated into existing catalogues and infrastructures. A citable dataset furthermore rewards scientists for their extra work in storage and quality control of data by granting scientific reputation through cite-counts. 
The workshop will examine the different methods for the enabling of citable datasets and discuss common best practices and challenges for the future.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left:36.0pt;text-indent:-36.0pt;mso-pagination:none;mso-list:l0 level1 lfo1;tab-stops:11.0pt 36.0pt;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;▪&lt;span style="font:7.0pt &amp;quot;Times New Roman&amp;quot;"&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#2B5ABC;"&gt;&lt;b&gt;&lt;a href="http://www.dcc.ac.uk/events/dcc-2009/programme/Citability%20of%20Research%20Data%20DCC.pdf"&gt;&lt;span style="color:#2B5ABC;"&gt;See the full programme&lt;/span&gt;&lt;/a&gt;&lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt; &lt;b&gt;[PDF]&lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;&lt;b&gt;Repository Preservation Infrastructure (REPRISE)&lt;/b&gt;&lt;/span&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt; (co-organised by the OGF Repositories 
Group, OGF-Europe, D-Grid/WissGrid) &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-bottom:10.0pt;mso-pagination:none;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US"    style="mso-ansi-language:EN-US;font-family:ArialMT;font-size:13.0pt;color:#262626;"&gt;Following on from the successful &lt;a href="http://www.dcc.ac.uk/events/dcc-2008/programme/"&gt;&lt;span style="color:#2B5ABC;"&gt;&lt;b&gt;Repository Curation Service Environments (RECURSE) Workshop&lt;/b&gt;&lt;/span&gt;&lt;/a&gt; at IDCC 2008, this workshop discusses digital repositories and their specific requirements for/as preservation infrastructure, as well as their role within a preservation environment.&lt;/span&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4468098374502296499?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4468098374502296499/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/11/workshops-prior-to-international.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4468098374502296499'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4468098374502296499'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/11/workshops-prior-to-international.html' title='Workshops prior to the International Digital Curation Conference'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-8381619751072147109</id><published>2009-11-18T18:05:00.002Z</published><updated>2009-11-18T18:22:46.719Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data publishing'/><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' term='publishing'/><title type='text'>Data and the journal article</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;I recently had a discussion (billed as a presentation, but it was on such an (ahem) intimate scale that it became a discussion) at &lt;a href="http://www.ithaka.org/"&gt;Ithaka&lt;/a&gt;, the organisation in New York that runs JSTOR, ArtSTOR and Portico. We talked about some of the issues surrounding supporting journal articles better with data. Both research funders and some journals are starting to require researchers/authors to keep and to make available the data that supports the conclusions in their articles. How can they best do this?&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;It seems to me that there are 4 ways of associating data with an article. 
The first is through the time-honoured (but not very satisfactory) Supplementary Materials, the second is through citations and references to external data, the third is through databases that are in some way integrated with the article, and the fourth is through data encoded within the article text.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;My expectation was that most supplementary materials that included data would actually be in Excel spreadsheets, and a few would be in CSV files, while even fewer would be in domain-specific, science-related encodings. I was quite shocked after a little research to find, at least for the Nature journals I looked at, that nearly all supplementary data were in PDF files, while a few were in Word tables. I don't think I found any that were Excel, let alone CSV. This doesn't do much for data re-usability! As things stand currently, data in a PDF document (eg in tables) will probably need to be extracted by hand copy; possibly by cut and paste followed by extensive hand manipulation.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;I would expect that looking away from the generalist journals towards domain-specific titles, would reveal more appropriate formats. 
However, a ridiculously quick check of Chem-Comm, a Royal Society of Chemistry title, showed supplementary data in PDF even for an "electronically enhanced article" (eg &lt;a href="http://www.rsc.org/suppdata/CC/b9/b912733j/b912733j.pdf"&gt;Experimental procedures, spectra and characterization data&lt;/a&gt;, perhaps not openly accessible...).&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;There’s a bit of concern in some quarters about journals managing data, particularly that data would disappear behind the pay wall, limiting opportunities for re-use.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;What would be ideal? I guess data that are encoded in domain-specific, standardised formats (perhaps supported by ontologies, well-known schemas, and/or open software applications) would be pretty useful. I’ve also got a vague sense of unease about the lack of any standardised approach to describing context, experimental conditions, instrument calibrations, or other critical metadata needed to interpret the data properly. This is a tough area, as we want to reduce the disincentives to deposit as well as increase the chances of successful re-use.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Clearly there are many cases where the data are not appropriate for inclusion as supplementary materials, and should be available by external reference. 
Such would be the case for genomics data, for example, which must have been deposited in an appropriate database (the journal should demand deposit/accession data before publication).&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;External data will be fine as long as they are on an accessible (not necessarily open) and reasonably permanent database, data centre or repository somewhere. I do worry that many external datasets will be held on personal web sites. Yes, these can be web-accessible, and Google-indexed, but researchers move, researchers die, and departments reorganise their web presence, which means those links will fail, and the data will disappear (see the nice Book of Trogool article &lt;a href="http://scienceblogs.com/bookoftrogool/2009/11/_and_then_what.php"&gt;"... and then what?"&lt;/a&gt;).&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Sometimes such external data can be simply linked, eg a parenthetical or foot-noted web link, but I would certainly like to encourage increasing use of proper citations for data. Citations are the currency of academia, and the sooner they accrue for good data, the sooner researchers will start to regard their re-usable data as valuable parts of their output! It’s interesting to see the launch of the &lt;a href="http://www.datacite.org/"&gt;DataCite&lt;/a&gt; initiative coming up soon in London.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;There is this interesting idea of the overlay data journal, which rather turns my last paragraph on its head; the data are the focus and the articles describe the data. 
&lt;a href="http://nar.oxfordjournals.org/content/vol37/suppl_1/index.dtl"&gt;Nucleic Acids Research Database Issue&lt;/a&gt; articles would be prime examples here in existing practice, although they tend to describe the dataset as a persistent context, rather than as the focus for some discovery. The &lt;a href="http://proj.badc.rl.ac.uk/ojims"&gt;OJIMS project&lt;/a&gt; described a proposed overlay journal in Meteorology; they produced a sample issue and a business analysis, but I’m not sure what happened then.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The best (and possibly only) example I know of the database-as-integral-part-of-article approach is &lt;a href="http://intarch.ac.uk/"&gt;Internet Archaeology&lt;/a&gt;, set up in 1996 (by the eLib programme!) as an exemplar for true internet-enabled publishing. 13 years later it's still going strong, but has rarely been emulated. Maybe what it provides does not give real advantages? Maybe it's too risky? Maybe it’s too hard to create such articles? Maybe scholarly publishing is just too blindly conservative? I don't know, but it would be good to explore in new areas.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Peter Murray-Rust has argued eloquently about the tragedy of data trapped and rendered useless in the text, tables and figures of articles. We would like to see articles semantically enriched so that these data can be extracted and processed. Encoded data points us to a few examples, such as the Shotton enhanced article described in Shotton et al 2009, and also to the Murray-Rust/Sefton &lt;a href="http://ice.usq.edu.au/introduction/ice_theorem.htm"&gt;TheoREM-ICE&lt;/a&gt; approach (although that was designed for theses, I think). I think the key here is the lack of authoring tools. 
It is still rather difficult to actually do this stuff, eg to write an article that contains meaningful semantic content. The Shotton target article was marked up by hand, with support from one of the authors of the W3C SKOS standard, ie an expert! The chemists have been working on tools for their community, both the ICE example, and also MS Chem4Word, maybe ChemMantis, etc.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;This last paragraph also points us towards the thesis area; I think this is one that Librarians really ought to be interested in tackling. What is the acceptable modern equivalent to the old (but never really acceptable) practice of tucking disks into a pocket inside the back cover of a thesis? Many universities are now accepting theses in digital form; we need some good practice in how to deal with their associated data.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So, we seem to be quite a way from universal good practice in associating data with our research articles.&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;Shotton, D., Portwin, K., Klyne, G., &amp;amp; Miles, A. (2009). Adventures in semantic publishing: exemplar semantic enhancements of a research article. &lt;i&gt;PLoS Computational Biology&lt;/i&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;, &lt;i&gt;5&lt;/i&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;(4). 
doi: &lt;/span&gt;&lt;span lang="EN-US"&gt;&lt;a href="http://www.ploscompbiol.org/doi/pcbi.1000361"&gt;10.1371/journal.pcbi.100036&lt;/a&gt;&lt;/span&gt;&lt;span lang="EN-US" style="mso-ansi-language:EN-US"&gt;1.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-8381619751072147109?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/8381619751072147109/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/11/data-and-journal-article.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8381619751072147109'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8381619751072147109'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/11/data-and-journal-article.html' title='Data and the journal article'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5589523735288675913</id><published>2009-11-13T18:09:00.002Z</published><updated>2009-11-13T18:16:06.666Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='IDCC09'/><title type='text'>5th International Digital Curation Conference : Register Now!</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;Hear ye, hear ye! 
[Shameless promotion here, but with useful information embedded!]&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Time to &lt;a href="http://www.dcc.ac.uk/events/dcc-2009/"&gt;register for this premier curation event&lt;/a&gt;, coming up in London, in the first week in December. We have a great &lt;a href="http://www.dcc.ac.uk/events/dcc-2009/programme/"&gt;programme&lt;/a&gt; this year, with Douglas Kell, head of BBSRC, as the opening keynote, and Timo Hannay of Nature as the closing keynote. In between we have perspectives on scale from US viewpoints, particularly the two large NSF-funded Datanet projects, and from the UK with reports linked to neurosciences and social simulation.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;In the first afternoon we have our popular Minute Madness, followed by the Community Space: part of the conference shaped by you, plus a symposium on citizen science.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The second day has a wide range of &lt;a href="http://www.dcc.ac.uk/events/dcc-2009/programme/accepted-papers"&gt;interesting papers&lt;/a&gt;. Do you want to know how curation is being tackled in some US universities? The implications of Chronopolis or CASPAR? What those Australians are doing in data curation? How to preserve software, or to do emulation a bit better? What metadata might be appropriate for scientific datasets, or how to extract metadata from resources better? What are the information requirements of Life Sciences, or the Arts and Humanities? How to curate a database that’s constantly changing? 
Then come to Kensington in December!&lt;/p&gt;&lt;p class="MsoNormal"&gt;Nearly forgot to mention the pre-conference workshops, some of which deserve blog posts of their own.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5589523735288675913?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5589523735288675913/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/11/5th-international-digital-curation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5589523735288675913'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5589523735288675913'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/11/5th-international-digital-curation.html' title='5th International Digital Curation Conference : Register Now!'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-8291968264902061764</id><published>2009-10-21T17:59:00.005+01:00</published><updated>2009-10-22T08:52:38.671+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='IJDC'/><category scheme='http://www.blogger.com/atom/ns#' term='IDCC09'/><title type='text'>New issue of IJDC</title><content type='html'>The latest issue (volume 4, issue 2) of the &lt;a href="http://www.ijdc.net/index.php/ijdc"&gt;International Journal of Digital Curation&lt;/a&gt; 
is now available. It's a bumper issue, with two letters to the editor (a whiff of controversy there!), 8 peer-reviewed papers (originating from last year's International Digital Curation Conference), and 6 general articles (two of which came from last year's iPres08 conference). I'm really pleased with this issue, which as always is extremely interesting.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This is the last issue to be produced by Richard Waller as Managing Editor, and I'd like to pay tribute to his dedication in making IJDC what it is today. He has sourced most of the general articles himself, and those who have worked with him as authors will know the courteous detail with which he has edited their work. They may not know the sheer blood, sweat and tears that have been involved, nor the extraordinarily long hours that Richard has put in to make IJDC what it is, alongside his "day job" of editing &lt;a href="http://www.ariadne.ac.uk/"&gt;Ariadne&lt;/a&gt;. Thank you so much, Richard.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;We will have a new Production Editor for the next issue, whom I will introduce when that comes out (we hope at about the same time as this year's &lt;a href="http://www.dcc.ac.uk/events/dcc-2009/"&gt;International Digital Curation Conference in London&lt;/a&gt;... have you registered yet?). 
We have some interesting plans to develop IJDC in volume 5, next year.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Update: I thought I should have said a bit more about the contents, so the following is abridged from the Editorial.&lt;/div&gt;&lt;div&gt;&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;tab-stops:28.0pt 56.0pt 84.0pt 112.0pt 140.0pt 168.0pt 196.0pt 224.0pt 252.0pt 280.0pt 308.0pt 336.0pt;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US" style="font-size:11.0pt;font-family:Arial;mso-ansi-language:EN-US"&gt;Two papers are linked by their association with data on the environment. Baker and Yarmey develop their viewpoint with environmental data as background, but their emphasis is more on arrangements for data stewardship&lt;i&gt;.&lt;/i&gt;&lt;/span&gt;&lt;span lang="EN-US" style="font-size:11.0pt;font-family:Arial;mso-ansi-language:EN-US"&gt; Jacobs and Worley report on experiences in NCAR in managing its “small” Research Data Archive (only around 250 TB!).&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;tab-stops:28.0pt 56.0pt 84.0pt 112.0pt 140.0pt 168.0pt 196.0pt 224.0pt 252.0pt 280.0pt 308.0pt 336.0pt;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US" style="font-size:11.0pt;font-family:Arial;mso-ansi-language:EN-US"&gt;Halbert also looks at elements of sustainability, in distributed approaches that are cooperatively maintained by small cultural memory organizations. Naumann, Keitel and Lang report on work developing and establishing a well-thought out preservation repository dedicated to a state archive. Sefton, Barnes, Ward and Downing address metadata, plus embedded semantics; their viewpoint is that of document author. 
Gerber and Hunter similarly address metadata and semantics, this time from the viewpoint of compound document objects.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;tab-stops:28.0pt 56.0pt 84.0pt 112.0pt 140.0pt 168.0pt 196.0pt 224.0pt 252.0pt 280.0pt 308.0pt 336.0pt;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US" style="font-size:11.0pt;font-family:Arial;mso-ansi-language:EN-US"&gt; Finally, we have two papers loosely linked through standards, though from different points on the spectrum of the general to the particular, as it were. At the particular end, Todd describes XAM, a standard API for storing fixed content; while from the more general end, Higgins provides an overview of continuing efforts to develop standards frameworks.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;tab-stops:28.0pt 56.0pt 84.0pt 112.0pt 140.0pt 168.0pt 196.0pt 224.0pt 252.0pt 280.0pt 308.0pt 336.0pt;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US" style="font-size:11.0pt;font-family:Arial;mso-ansi-language:EN-US"&gt;Moving on to general articles, in this case I would like to mention first my colleagues Pryor and Donnelly, who present a white (or possibly green?) paper on developing curation skills in the community.&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="mso-pagination:none;tab-stops:28.0pt 56.0pt 84.0pt 112.0pt 140.0pt 168.0pt 196.0pt 224.0pt 252.0pt 280.0pt 308.0pt 336.0pt;mso-layout-grid-align:none;text-autospace:none"&gt;&lt;span lang="EN-US" style="font-size:11.0pt;font-family:Arial;mso-ansi-language:EN-US"&gt;Next, I would highlight two very interesting articles that originated from iPres 2008. These are Dappert and Farquhar, who look at how explicitly modelling organisational goals can help define the preservation agenda. 
Woods and Brown describe how they have created a prototype virtual collection of 100 or so of the thousands of CD-ROMs published from many sources, including the US Government Printing Office. Shah presents the second part of his interesting independently-submitted work on preserving ephemeral digital videos. Finally, Knight reports from a Planets workshop on its preservation approach, while Guy, Ball and Day report from a UK web archiving workshop. &lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;   &lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-8291968264902061764?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/8291968264902061764/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/new-issue-of-ijdc.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8291968264902061764'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8291968264902061764'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/new-issue-of-ijdc.html' title='New issue of IJDC'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5344689588912140340</id><published>2009-10-19T20:25:00.003+01:00</published><updated>2010-01-05T17:27:59.605Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='PASIG'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><title type='text'>SUN PASIG: October 2009</title><content type='html'>As readers of this blog may have guessed, I was in San Francisco for the iPres 2009 Conference (17 blog posts in 2 days is something of a personal record!). This conference was followed by several others, including the Sun Preservation &amp;amp; Archiving SIG (Sun-PASIG), from Wednesday to Friday. I didn't feel quite so moved to blog the presentations as at iPres (and I was also knackered, not to put too fine a point on it). But I did not want to pass it by completely unremarked, particularly as I really like the event. This is the second Sun-PASIG meeting I've attended, following one in Malta in June of this year (see two previous &lt;a href="http://digitalcuration.blogspot.com/2009/07/rosenthal-at-sun-pasig-in-malta.html"&gt;blog&lt;/a&gt; &lt;a href="http://digitalcuration.blogspot.com/2009/07/my-backup-rant.html"&gt;posts&lt;/a&gt;).&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It's a very different kind of meeting from iPres. The agenda is constructed by a small group, forcefully led by Art Pasquinelli of Sun and Michael Keller of Stanford. The presentations are just that; not papers. This lets them be more playful and pragmatic, also more up-to-date. Of course, there's a price to pay for a vendor-sponsored conference, although I won't reveal here what it is!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Tom Cramer has put up the &lt;a href="http://lib.stanford.edu/pasig"&gt;slides&lt;/a&gt; at Stanford, so you can explore things I was less interested in. In the first session, the presentation that really grabbed me was Mark Leggott from Prince Edward Island (I confess, guiltily, I don't really know where this is) &lt;a href="http://lib.stanford.edu/files/pasig2009sf/Islandora_PASIG_Oct09.pdf"&gt;talking about Islandora&lt;/a&gt;. This is a munge of Fedora and Drupal, with a few added bits and bobs. 
It looked like a fantastic example of what a small, committed group with ideas and some technical capability can do. Nothing else on day 1 caught my imagination quite so strongly, although I enjoyed Neil Jeffries' &lt;a href="http://lib.stanford.edu/files/pasig2009sf/pasig2009sf_oxford_jeffries.pdf"&gt;update&lt;/a&gt; on activities in Oxford Libraries, and Tom Cramer's own &lt;a href="http://lib.stanford.edu/files/pasig2009sf/pasig2009sf_sdr_cramer.pdf"&gt;newly pragmatic take&lt;/a&gt; on a revised version of the Stanford Digital Repository.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;On day 2 there were lots of interesting presentations. Of particular interest perhaps was the &lt;a href="http://lib.stanford.edu/files/pasig2009sf/pasig-2009-pods.pdf"&gt;description&lt;/a&gt; of the University of California Curation Center's new micro-services approach to digital curation infrastructure. I'm not quite sure I get all of this, mainly perhaps as so much was introduced so quickly; however as I read more about each puzzling micro-service, it seems to make more sense. BTW I congratulate the ex-CDL Preservation Group on their new UC3 moniker! 'Tis pity it came the same week as the New York Times moan about overloading the curation word (&lt;a href="http://www.nytimes.com/2009/10/04/fashion/04curate.html?emc=eta1"&gt;here&lt;/a&gt; if you are a registered NYT reader)...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I also very much liked the extraordinary presentation by Dave Tarrant of Southampton and Ben O-Steen of Oxford on their ideas for creating a collaborative Cloud. Just shows what can be done if you don't believe you can't! 
The slides are &lt;a href="http://lib.stanford.edu/files/pasig2009sf/pasif2009sf_tarrant.pdf"&gt;here&lt;/a&gt; but don't give the flavour; you just had to be there.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In a presentation particularly marked by dry style and humour, Keith Webster of UQ &lt;a href="http://lib.stanford.edu/files/pasig2009sf/pasig2009sf_fez.pdf"&gt;talked&lt;/a&gt; about Fez, and shortly after Robin Stanton of ANU &lt;a href="http://lib.stanford.edu/files/pasig2009sf/pasig2009sf_stanton.pdf"&gt;talked&lt;/a&gt; about ANDS; both very interesting. The day ended with a particularly &lt;a href="http://lib.stanford.edu/files/pasig2009sf/pasig2009sf_lesk.pdf"&gt;provocative talk&lt;/a&gt; by Mike Lesk, once at NSF for the Digital Library Initiatives, now at Rutgers. Mike's aim was to provoke us with increasingly outrageous remarks until we reacted; if he failed to get a pronounced reaction, it was more to do with the time of day and the earlier agenda. But this is a great talk, and mostly accessible from the slides.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;On the 3rd day, we had a summing up from Cliff Lynch, interesting as ever, followed by breakouts. I went to the Data Curation group (surprise!), to find a half dozen folk, apparently mostly from IT providers, very concerned about dealing with data at extreme scale. It's a big problem (sorry), but not quite what I'd have put on the agenda. But in a way it typifies Sun-PASIG: never quite what you thought, always challenging and interesting.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Shortly thereafter I had to leave, but in the middle of a fascinating discussion about the future of Sun-PASIG, particularly with the shadow of the Oracle acquisition looming. I certainly believe that the group would be useful to the new organisation, and very much hope that it survives. 
Next year in Europe?&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5344689588912140340?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5344689588912140340/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/sun-pasig-october-2009.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5344689588912140340'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5344689588912140340'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/sun-pasig-october-2009.html' title='SUN PASIG: October 2009'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2386488810852164734</id><published>2009-10-07T00:35:00.000+01:00</published><updated>2009-10-07T00:36:23.835+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Migration'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: van Horik on MIXED framework for curation of file formats</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;Scholars in the Netherlands can deposit or search information in a repository system called DANS EASY, containing about 500,000 files, with a wide diversity of formats. 
How do you deal with a file called cars.DBF, now in an obsolete format? Their system can read such formats and convert them to the XML-based MIXED format, which identifies the data type and contains information on structure and content. So this was a smart conversion from the binary, obsolete dBASE file to a reusable XML file. In the future it can be converted from this format to a current format of choice. This process (allegedly) does not require multiple migrations…&lt;/p&gt;  &lt;p class="MsoNormal"&gt;They have an SDFP community model for spreadsheet and tabular data. They have created some code, available on SourceForge, for the DBF and DataPerfect formats, which they had to reverse engineer; this is a very labour-intensive activity, and really should be a community effort.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question: does reverse engineering expose them to risk? Don’t know…&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2386488810852164734?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2386488810852164734/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-van-horik-on-mixed-framework.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2386488810852164734'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2386488810852164734'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-van-horik-on-mixed-framework.html' title='iPres 2009: van Horik on MIXED framework for curation of file formats'/><author><name>Chris 
Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1263375660700566396</id><published>2009-10-07T00:16:00.000+01:00</published><updated>2009-10-07T00:17:19.661+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Brown on font problems</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;They have a very large collection of documents, some of which used Texas Instruments calculator fonts, which had maths symbols in them, but didn’t always render properly with font substitutions. There were several other examples, including barcode fonts (where font substitution can give the numeric value, not losing information but losing functionality).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The top 10 fonts in a collection tend to be the same; it’s the long tail of up to 3,000 or so that might be the problem. Font names help a bit but there are huge variations in font names, eg 50+ for Arial alone! In fact, it’s quite difficult to get useful matches from font names with fonts in font tables, some of which have very weak information content. Times New Roman satisfies about 38% of documents in their collection; Windows XP + Word satisfies about 80% of the documents in the collection; the large collection of fonts they assembled would satisfy about 95% of the collection; many more would be needed to push that higher.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Worst example was a Cyrillic font, called Glasnost-light but rendered as ASCII; the problem was related to the pre-Unicode code space in some way I didn’t understand. 
A font substitution looked hopeful; it produced Cyrillic, but unfortunately not Russian, as the encoding was different.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Comment: this is a difficult problem much dealt with in the commercial community, who have secret tables. But even Adobe only deals with a couple of thousand fonts.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1263375660700566396?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1263375660700566396/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-brown-on-font-problems.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1263375660700566396'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1263375660700566396'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-brown-on-font-problems.html' title='iPres 2009: Brown on font problems'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-3204977062224590006</id><published>2009-10-06T23:56:00.000+01:00</published><updated>2009-10-06T23:57:11.684+01:00</updated><title type='text'>iPres 2009: Tarrant, the P2 Registry, Where the Semantic Web and Web 2.0 meet format risk management</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;P2-registry is a 
demo of what we can do if we publish in a web 2 fashion. The mainstream here is the web, for the community.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Linked data: every slide has links to where the stuff comes from. See the graph on linked data, let’s get in that graph. Using linked data reduces redundancy, facilitates re-use and maximises discovery. The community is not just consumers, also publishers. Because of links to namespaces, this contributes to building trust.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The main node is DBpedia, which is in fact Wikipedia marked up as RDF. Lots of people reference it and link to it. Give URIs to things: Tarrant has a URI; his home page is not him, so it has a URL that’s not the same (but relates).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;4 rules of linked data: use URIs as the names of things; use HTTP URIs so they can be looked up; when someone looks them up, provide useful information; include links to other useful things.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Here, data are facts, facts are represented as triples, in RDF. OWL &amp;amp; RDFS provide means to represent your RDF model. It’s machine readable and validatable. Importing data from multiple domains, you can use OWL to say a thing in one domain is the same as another thing in a different domain. They used PRONOM and Wikipedia to build a small ontology that describes what can be done by different software. The underlying registry is a triple store, it understands RDF, so 19 possible answers are turned into 70 with some data alignment. 
Then used these data to perform a basic risk analysis on PDF.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Take home message: data hidden in registries is not easily discoverable so is little used, so publish it on the web and it can be much more widely used.&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Trust seems an issue in so many name spaces, but hopefully it all works out….&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-3204977062224590006?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/3204977062224590006/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-tarrant-p2-registry-where.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3204977062224590006'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3204977062224590006'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-tarrant-p2-registry-where.html' title='iPres 2009: Tarrant, the P2 Registry, Where the Semantic Web and Web 2.0 meet format risk management'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2061526140021380480</id><published>2009-10-06T22:58:00.001+01:00</published><updated>2009-10-06T22:59:58.301+01:00</updated><category 
scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Kirschenbaum &amp; Farr on digital materiality: access to the computers</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;This seems to be about the digital equivalent of literary personal papers; an urgency based on the recent deaths of authors like John Updike &amp;amp; others. Based on planning grant funding from NEH, resulting in a deliverable as a White Paper.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Digital objects in this case are artefacts, not just records; both the physical and the virtual require materiality. Some of this is regarding the computers as important parts of the creative context.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Recommendation: keep the hardware and storage media. You can tell things from hand-writing on diskette labels, etc.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Recommendation: image disks (both pictorial images and forensic imaging), see Jeremy Leighton John.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Recommendation: computer forensics (see forthcoming CLIR/Mellon report on Computer Forensics in Cultural Heritage, expected to be available next fall).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Recommendation: document the original environment, eg 360 degree views.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Recommendation: there is value in interviewing the donors themselves.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Recommendation: since they are balancing lots of needs, they need to put careful thought into interface development.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Recommendation: Scholarly Communication Needs; there have to be new tools and methodologies on citation (eg of a tracked change in a Word document), reproduction, copyright and IP issues. 
White paper available at &lt;a href="http://www.neh.gov/ODH"&gt;http://www.neh.gov/ODH&lt;/a&gt; …&lt;/p&gt;  &lt;p class="MsoNormal"&gt;There is a time window open now that may not stay open for long, for computers from the early 1980s!&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2061526140021380480?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2061526140021380480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-kirschenbaum-farr-on-digital.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2061526140021380480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2061526140021380480'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-kirschenbaum-farr-on-digital.html' title='iPres 2009: Kirschenbaum &amp; Farr on digital materiality: access to the computers'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1837659961289461015</id><published>2009-10-06T22:31:00.002+01:00</published><updated>2009-10-06T22:34:50.951+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Archival media'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: 
Guttenbrunner on Digital Archaeology, recovering digital objects from audio waveforms</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;Early home computers often used audio cassettes as data media. Quite a bit of such data still exists on audio tapes in various archives, getting into worse and worse condition. Can they migrate the data without the original system in the future?&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The system they used is the Philips Videopac+ G7400, basically a video game system released in 1983… and another one (!).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Data are encoded in bitstreams, which in turn are encoded in analogue waveforms (via a microphone/headphone socket pair and an audio cassette system!). They worked out how the waveforms responded to changes in the data (basically reverse-engineering the data encodings; would not have been so easy without a working computer). As a result, they were able to write a migration tool from the audio streams to non-obsolete formats.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;It turned out there was already a solution that worked where there was a good signal from the tape, but these were often very old tapes in poor condition, so they implemented a different approach, which worked better.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Using old tapes, the other system found no files. The actual system recovered 6 out of 23. Their new implementation recovered 22 out of 23 files, in some cases with errors. 
They checked by re-encoding the recovered files (on new tapes) and reloading to the actual system; most had minor errors that could be fixed if you knew what you were doing.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;They think their findings are valid for all systems that use audio encodings, although there will be wide variations in encodings and file types, but it’s not extensible to other media types.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1837659961289461015?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1837659961289461015/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-guttenbrunner-on-digital.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1837659961289461015'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1837659961289461015'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-guttenbrunner-on-digital.html' title='iPres 2009: Guttenbrunner on Digital Archaeology, recovering digital objects from audio waveforms'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-894969573474067862</id><published>2009-10-06T22:27:00.001+01:00</published><updated>2009-10-06T22:29:13.795+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category 
scheme='http://www.blogger.com/atom/ns#' term='iPres'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><category scheme='http://www.blogger.com/atom/ns#' term='Blog'/><title type='text'>iPres 2009: Pennock on ArchivePress</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;Blogs are a new medium but an old genre, witness Samuel Pepys’ diaries for instance (now also a blog!). But since they are web based, aren’t they already archived through web archiving? However, simple web archiving treats blogs simply as web pages; pages that change but in a sense stay the same. Web archiving also can’t easily respond to triggers, like RSS feeds relating to new postings. Web archiving approaches are fine, but don’t treat the blogs as first class objects.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;New possibilities can help build new corpora for aggregating blogs to create a preserved set for institutional records and other purposes. ArchivePress is a JISC Rapid Innovation (JISCRI) project, which once completed will be released as open source. The project started with a small 10-question survey, for which the key question was: which parts of blogs should archiving capture? In descending order the answers were posts, comments, tag &amp;amp; category names, embedded objects, and the blog name &amp;amp; URLs. These findings were broadly in agreement with an earlier survey (see paper for reference).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;They set out to find the significant properties of blogs. They see significant properties as being in the eye of the stakeholder. In the first round this includes content (posts, comments, embedded objects), context (including authors &amp;amp; profiles), structure, rendering and behaviour.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;To achieve this, they build on the Feed plugin for WordPress, which gathers the content as long as an RSS or Atom feed is available. 
WordPress is arguably the most widely used blogging platform; it’s open source, it’s GPL-licensed and it has publicly available schemas.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Maureen showed the AP1 demonstrator based on the DCC blogs [disclosure: I’m from the DCC!], including blog posts written today that had already been archived. The AP2 demonstrator (the UKOLN collection) will harvest comments and resolve some rendering and configuration issues from AP1, and will allow administrators to add new categories (tags?).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;It seems to work; there turned out to be more variations in feed content than expected. Configuration is tricky, so they must make it easier.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-894969573474067862?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/894969573474067862/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-pennock-on-archivepress.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/894969573474067862'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/894969573474067862'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-pennock-on-archivepress.html' title='iPres 2009: Pennock on ArchivePress'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-8173036049044003064</id><published>2009-10-06T20:28:00.001+01:00</published><updated>2009-10-06T20:31:00.169+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='Collaboration'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Collaboration</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;iPres 2009: Martha Anderson on Enabling Collaboration for Digital Preservation&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Collaboration is what you do when you can’t solve a problem by yourself. Digital Preservation is such a problem. That was Martha’s summary of her very interesting presentation recapping the NDIIPP so far, and giving some excellent guidelines relating to modes of collaboration. She spoke also about an upcoming National Digital Stewardship collaboration, which if I understood it is based round organisations (government?) taking some shared responsibility for the future of their data.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;o:p&gt;iPres 2009: Panel on challenges on distributed digital preservation&lt;/o:p&gt;&lt;/p&gt;&lt;!--StartFragment--&gt;    &lt;p class="MsoNormal"&gt;All the speakers participate in Private LOCKSS Networks (PLNs), although there are others eg Chronopolis. Meta Archive Cooperative is growing slowly; recent new members include Hull in the UK, and it has a list of up to 40 potential associates. Alabama Digital Preservation Network (ADPnet?) focuses particularly on being simple and cheap. Canadian library consortium (COPL?) has a PLN with 8 members out of 12 in the consortium.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Organisational challenges on starting up, eg creation of Meta Archive West as new startup versus extension of existing. 
Issues are the same: organisational, technology and sustainability, of which the first and last are the parts of the iceberg under the water! Some very interesting points made about many aspects of these networks.&lt;/p&gt;  &lt;!--EndFragment--&gt;     &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-8173036049044003064?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/8173036049044003064/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-collaboration.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8173036049044003064'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8173036049044003064'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-collaboration.html' title='iPres 2009: Collaboration'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2215049589480982430</id><published>2009-10-06T18:22:00.001+01:00</published><updated>2009-10-06T18:25:20.787+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Data'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Micah Altman Keynote on 
Open Data</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;Open Data is at the intersection of scientific practice, technology, and library/archival practice. Claims that data are at the nucleus of scientific collaboration, and data are needed for scientific replication. Science is not just scientific; it becomes science after community acceptance. Without the data, the community can’t work.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Open data also support new forms of science &amp;amp; education: data intensive science, which also promoted inter-disciplinarity. Open data also democratise science: crowd-sourcing, citizen science, developing country re-use, etc. Mentions Open Lab Notebook (Jean-Claude Bradley), Galaxy Zoo etc.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Open data can be scientific insurance; that little extra bit of explanation makes your own data more re-usable, and can give your project extended life after initial funding ends.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Data access is key to understanding social policy. Governments attempt to control data access “to evade accountability”.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Why do we need infrastructure? [Huh?] While many large data sets are in public archives, many datasets are hard to find. Even problems in professional data archives: links, identifiers, access control, etc. So, core requirements…&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Stakeholder incentives&lt;/li&gt;&lt;li&gt;Dissemination inc metadata &amp;amp; documentation&lt;/li&gt;&lt;li&gt;Access control&lt;/li&gt;&lt;li&gt;Provenance: chain of control, verification of metadata &amp;amp; the bits&lt;/li&gt;&lt;li&gt;Persistence&lt;/li&gt;&lt;li&gt;Legal protection&lt;/li&gt;&lt;li&gt;Usability&lt;/li&gt;&lt;li&gt;Business model…&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;                &lt;p class="MsoNormal"&gt;Institutional barriers: no-one (yet?) 
gets tenure for producing large datasets [CR: not sure that’s right, in some fields eg genomics etc data papers are amongst the highest cited]. Discipline versus institutional loyalties for deposit. Funding is always an issue, and potential legal issues raise their heads: copyright, database rights, privacy/confidentiality etc.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Social Science was amongst the first disciplines to establish shared data archives (eg ICPSR, UKDA etc), in the 1960s [CR: I believe as an access mechanism originally: to share decks of cards!]. Mostly traditional data, not far beyond quantitative data. More recently community data collections have been established, eg Genbank etc; success varies greatly from field to field. Institutional repositories mostly preserve outputs rather than data, and most only have comparatively small collections. They provide so far only bit-level preservation, are mostly not designed to capture tacit knowledge, and have limited support for data. More recently still, virtual hosted archives are happening: institutionally supported but depositor-branded (?), eg Dataverse Network at Harvard; Data360, Swivel. Some of these have already gone out of business; what does that do to trust re persistence of service &amp;amp; data; can you self-insure through replication?&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Cloud computing models are interesting, but mostly Beta, and often dead on arrival or soon after. What about storing data in social networks (which are often in/on the cloud)? Mostly they don’t really support data (yet), but they do “leverage” that allegiance to a scientific community.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Altman illustrated a wide range of legal issues affecting data; not just intellectual property, but also open access, confidentiality, privacy, defamation, contract. 
The traditional way of handling some of this was de-identification of data; unfortunately this is working less and less well, with several cases of re-identification published recently (eg Netflix problem, Narayanan et al). [CR: refreshing to hear a discussion that is realistic about the impossibility of complete openness!]&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So instead of de-identifying at the end, we’re going to have to build in confidentiality (of access) from the beginning! Current Open Access licences don’t cover all IP rights (as they vary so widely), don’t protect 3&lt;sup&gt;rd&lt;/sup&gt; party liability, and are often mutually incompatible.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Altman ended on issues at intersections, starting with data citation: “a real mess”. At least there should be some form of persistent identifier. UNF as a robust, coded data integrity check (approximation, normalisation, fingerprinting, representation). Technology can facilitate persistent identifier [CR: not a technology issue!], deep citation (subsets), versioning. Scientific practices evolve: replication standards, scientific publication standards.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;There is a virtuous circle here: publish data, get data cited, which encourages more data publication and citation!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Next BKN, which sounds like a Mendeley/Zotero/Delicious-like system, transforming treatment of bibliographies &amp;amp; structured lists of information.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The Dataverse network: an open source, federated web 2.0 data network, a gateway to >35,000 social science studies. Now being extended towards network data. Has endowed hosting.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;DataPASS, a broad-based collaboration for preservation.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Syndicated Storage Project: replication ameliorates institutional risk to preservation. 
Virtual organisations need policy-based, auditable, asymmetric replication commitments. Formalise these commitments, and layer on top of LOCKSS. Just funded by IMLS to take the prototype, make it easier to configure, open source etc.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Prognostication: archiving workflow must extend backwards to research data collection [CR: Yeah!!!]. Data dissemination &amp;amp; preservation increasingly hybrid approach. Strengthening links from publication to data, makes science more accountable. Effective preservation &amp;amp; dissemination is a co-evolutionary process: technology, institution &amp;amp; practice all change in reaction to each other!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question: what do you mean by extending backwards? Archiving often captured when the research is done; becomes another chore, lose opportunity to capture context. So if the archive can tap into the research grid, the workflow can be captured in the archive.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question (CR): depositor/re-user asymmetry? 
It does exist; data citation can help this!&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2215049589480982430?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2215049589480982430/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-micah-altman-keynote-on-open.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2215049589480982430'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2215049589480982430'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-micah-altman-keynote-on-open.html' title='iPres 2009: Micah Altman Keynote on Open Data'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1146507451324215698</id><published>2009-10-06T00:16:00.001+01:00</published><updated>2009-10-06T00:18:23.536+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='sustainability'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Kejser on Danish cost model on migration</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;[CR: missed the start of this posting the last one!]&lt;/p&gt;&lt;p class="MsoNormal"&gt;Using a cost model for 
digital curation, based on the functional breakdown from OAIS. Break down activities repeatedly until you get to costable components; looks rather frightening. Have use case for digital migration. Cost factors include format interpretation, software provision (development of reader, writer &amp;amp; translator). Interesting data in person weeks for development of migration, eg TIFF to PDF/A as 34.7 person weeks (!!)&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Reporting results of some earlier stuff; A-archives dating 1968-1998; very heterogeneous; B &amp;amp; C archives more recent&lt;span style="mso-spacerun: yes"&gt;  &lt;/span&gt;and more homogeneous. Shows results from model predictions and actual costs, differences mostly because the A archives were so hard. Also, for the better archives, the model did well overall but under-estimated some parts and over-estimated other parts.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Second test case was migration of 6 TB of data in 2000 files (very big ones: 300 MByte each). They bought software; the model over-estimated the “development” time on this basis, but under-estimated the processing, perhaps because of the very big files; throughput was very low.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Overall, they found that detailed cost factors make the model not an accurate predictor (but still useful). Precision an issue; models are inaccurate per se, but sometimes give impression of accuracy.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Searching for studies on format life expectancy and migration frequency [longer and less in my view].&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question: how about software re-use? They cost on a first mover basis. Also, migration tools themselves become obsolete.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question: why did you think migrating from PDF was necessary? Hardly a format at risk. 
Turns out to be a move from proprietary to non-proprietary.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question on scaling: thousands to hundreds of millions of objects; will these apply. Answer was that they will. [CR: doubt this; biggest flaw in LIFE so far has been devastating scaling problems.]&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1146507451324215698?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1146507451324215698/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-kejser-on-danish-cost-model.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1146507451324215698'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1146507451324215698'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-kejser-on-danish-cost-model.html' title='iPres 2009: Kejser on Danish cost model on migration'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1891826460790529577</id><published>2009-10-05T23:52:00.000+01:00</published><updated>2009-10-05T23:53:27.083+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='sustainability'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title 
type='text'>iPres 2009: Wheatley on LIFE3</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;Paul reviews the two earlier phases of LIFE; LIFE3 is UCL, BL &amp;amp; HATII [disclosure: DCC partner; disclosure, I’m also an “expert” on LIFE panels etc] at Glasgow University. Defined a lifecycle approach to costing, creating a generic model of digital preservation lifecycle. LIFE3 is now trying to create a costing tool based on costing models based on stages of the digital lifecycle. Will use previous LIFE data, also Keeping Research Data Safe project.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Tool inputs: content profile, organisational profile and context. Lifecycle stages are creation/purchase, acquisition, ingest, bit-stream preservation, content preservation, and access. Where possible, exploit existing work, eg PLANETS work building on DROID, also FITS tool (?), also looking at DRAMBORA &amp;amp; DAF, plus PLANETS tool PLATO.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;A template approach lowers the barrier for non-digital preservation people.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Context: still very much a hybrid world; analogue as well as digital. Non-digital not dying, but usage increasing. Also greater variety of digital content, eg video etc. Resources are currently 20:1 on preservation of non-digital to digital, but will need to move more towards 1:1. Need to think about the risk elements as well as cost elements.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;LIFE is also expected in the BL to support preservation planning, eg in purchase/acquire/digitise, and in selecting appropriate preservation strategies. Finally, need to budget for resulting costing [CR: the feedback from prediction versus actual could be very interesting!]&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Challenges and request for help: had a simple categorisation of content type &amp;amp; complexity. This has been criticised but without a better example. Help, please. 
Also need more costing data. Finally, will be trialling models, and we’d like to hear from anyone who might want to participate in this.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1891826460790529577?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1891826460790529577/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-wheatley-on-life3.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1891826460790529577'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1891826460790529577'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-wheatley-on-life3.html' title='iPres 2009: Wheatley on LIFE3'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-6699824920297446731</id><published>2009-10-05T22:55:00.001+01:00</published><updated>2009-10-05T22:57:08.762+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Conway on a Preservation Analysis Methodology</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;Based on DCC SCARP [disclosure: I'm at the DCC] &amp;amp; CASPAR projects. 
Need to do preliminary analysis of data holdings, then do a stakeholder and archive analysis. Eg a project started in the 1920s, which started from radio, through radar, later ionosphere studies. Then define a preservation objective, which should be well-defined, actionable, measurable, realistic. Assess this against a particular designated community (DC).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;From this design preservation information flows; there are always important elements beyond the actual data, eg software, documentation, database technologies, etc. Then do a cost/benefit/risk analysis. Interesting issue about the nature of the relationship between archivist and the science community (producing and consuming).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;They seem not to want to define objectives in science discovery terms (eg gravity wave research from wind profile data) but much more specifically in terms of 11 specific parameters. Describes a rather over-the-top AIP including FORTRAN manuals, to read NetCDF files (maybe I misunderstood this bit).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;They then find that this homework makes it easier to interface with DRAMBORA &amp;amp; TRAC for audit &amp;amp; certification, and the PLATTER tool from PLANETS. Work may also help to build business cases for preservation of these data.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question:&lt;span style="mso-spacerun: yes"&gt;  &lt;/span&gt;How well does this archivist/community relationship scale? Does not require that relationship, but exploits it where it exists. Point is to use all the assets you have.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question: Different types of infrastructure, eg computer centres; have any taken initiatives themselves? 
Mostly at present it’s a “found” situation rather than a designed one.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Comment: worth looking at the DRIVER project, with concept of enhanced publication, ie data plus supporting documentation.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-6699824920297446731?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/6699824920297446731/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-conway-on-preservation.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6699824920297446731'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6699824920297446731'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-conway-on-preservation.html' title='iPres 2009: Conway on a Preservation Analysis Methodology'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5850999516387653576</id><published>2009-10-05T22:32:00.001+01:00</published><updated>2009-10-05T22:33:48.692+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Pawletko on TIPR’s progress towards interoperability</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p 
class="MsoNormal"&gt;Motivation is to distribute data not just geographically but also across different technologies. Also preserving through software changes; forward migrate to later versions, or replacements. Also to have a succession plan for the case where the repository fails.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;TIPR is defining a common exchange format. Involves FCLA using DAITSS, Cornell using ADORE but migrating to FEDORA, NYU using DSpace. FCLA have one AIP per intellectual entity, and they retain the first and the latest representation. Cornell hold one AIP for each representation. NYU also has one AIP (didn’t catch how it works).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Format is called the Repository Exchange Package (RXP) based on METS and PREMIS. Need to work with multiple sources, but contain sufficient data for the receiving repository to do what it needs. Minimal structure is 4 files in a directory. A METS document about the source repository, plus provenance and optional rights, plus the actual representations in the package. The second file contains information about provenance. Then two more PREMIS files (?); finally a files manifest (cf BAGIT). [I’m not sure I’m capturing this well, best look at the PPTs later. But why are the slides blue and yellow mixed up???]&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Transfer tests: a broadcast transfer, and a ring transfer. In the latter case, each RXP is ingested, then disseminated and sent on to the next, until it gets back to the first. They have built a lot of stuff, and implemented the broadcast transfer test. Next steps: the ring test, and try different (wacky!) RXPs.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question: why use METS/PREMIS but not RDF &amp;amp; ORE? Familiarity!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question: will this work with Bagit? 
Yes; they use Bagit right now…&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5850999516387653576?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5850999516387653576/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-pawletko-on-tiprs-progress.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5850999516387653576'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5850999516387653576'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-pawletko-on-tiprs-progress.html' title='iPres 2009: Pawletko on TIPR’s progress towards interoperability'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7430236563753579129</id><published>2009-10-05T22:14:00.001+01:00</published><updated>2009-10-05T22:15:34.371+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Schmidt on a framework for distributed preservation workflows</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;Schmidt is associated with the EU PLANETS project, building an integrated system for development &amp;amp; evaluation of preservation strategies. 
Environment based on service-oriented architecture, with platform, language and location independence.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Basic building blocks are preservation interfaces (verbs). Define atomic preservation activities; low level concepts &amp;amp; actions; light-weight &amp;amp; easy to implement. >50 tools wrapped up as the PLANETS service. Plus digital objects (the nouns): generic data abstraction for modelling digital entities. Minimal &amp;amp; generic model for data management, with no serialisation schema, so perhaps create from DC/RDF, serialise with METS etc.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Digital Object Managers map from source (eg OAI-PMH) to PLANETS DOs. There are PLANETS registry services. There is a workflow engine driven by templates. Developers create workflow fragments, experimenters select fragments and assemble, configure &amp;amp; execute them. Workflows implemented by workflow execution engine (WEE: a level 2 abstraction), which looks at access management etc. (?).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;This ends up being, not an out of box solution, but an extensible network of services, but capable of public deployment to allow sharing of resources and results.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question: are services discoverable by format? Use PRONOM as format registry, also building an ontology that defines which property can be preserved.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Question: Does PLANETS assume a particular preservation strategy? Have tools for emulation and migration.&lt;/p&gt;&lt;p class="MsoNormal"&gt;Question: are tools deployed outside the project? 
Not yet but trying to figure out.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7430236563753579129?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7430236563753579129/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-schmidt-on-framework-for.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7430236563753579129'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7430236563753579129'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-schmidt-on-framework-for.html' title='iPres 2009: Schmidt on a framework for distributed preservation workflows'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-8495558380509653210</id><published>2009-10-05T21:51:00.001+01:00</published><updated>2009-10-05T21:53:13.929+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Engineering data'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Wilkes on preservation in product lifecycle management</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt; Wolfgang Wilkes is part of the EU Shaman project, running from 2008 to 2011, working in 
libraries &amp;amp; archives, e-science, and engineering environments; this talk covers the latter. Different phases of a product’s life generate different data; lots of this is required to maintain a product through its life. Important for long-lived products: cars, aeroplanes, process plants. Many jurisdictions have strong legal requirements to keep data; there may also be contractual requirements. There are also economic reasons: a long-lived item needs modifications through its life, that will be helped by such information. &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;However, these data are very complex, structured data, often in tools with strongly proprietary data formats. Also the players in different phases of the lifecycle are very different. So ingest becomes a process, not an event. And close control over access will be essential because of high IP value. &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The project’s focus is to look at the interaction of the Product Lifecycle Management systems and the digital preservation system. So there is a Shaman information lifecycle.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Pre-ingest: creation and assembly, important for capturing metadata and data. May need to transform proprietary information into standards-based (may lose some information, but that’s better than losing all of it!).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Post access: adoption and re-use. May need to transform back from standards to tool-specific formats.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Need to use the PLM system (which captures stuff in its own repository). The preservation system can’t work on its own; needs preservation extensions to the core PLM system.&lt;span style="mso-spacerun: yes"&gt;  &lt;/span&gt;So need additional PLM functions, but also additional DP functions. 
Open research topics: detailed spec of DP service interface, dealing with distributed archives, capturing &amp;amp; generation of metadata, linking to external ontologies, etc.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-8495558380509653210?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/8495558380509653210/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-wilkes-on-preservation-in.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8495558380509653210'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8495558380509653210'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-wilkes-on-preservation-in.html' title='iPres 2009: Wilkes on preservation in product lifecycle management'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2690040699627974389</id><published>2009-10-05T20:07:00.000+01:00</published><updated>2009-10-05T20:09:56.126+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009: Lowood on why Virtual Worlds are History</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt; Starts with a couple of stories about the 
end of virtual worlds, and one about a player whose death was accompanied by outpourings of grief; it later turned out that even the player (and her death) were virtual.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Can replay files help? Can relive the actions of a long-dead game by a long-dead player. But even if we save every such replay, we still don’t save the virtual world. &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Events in the world that start then end can leave no record; they get deleted and are no longer there to preserve. No newspapers will blow in the wind, no records in dusty digital filing cabinets. Context will have gone, even if you manage a perfect replay reconstruction of the game. &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Time to get positive. The “How They Got Game” project at Stanford started with an artefact donation, has now developed (with others, &amp;amp; NDIIPP funding) into the “Preserving Virtual Worlds” project.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Should we try to preserve the game as digital artefact, or the documentation of context? Better not to take this either-or attitude, but it may be forced on us.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Replays can depend on the exact version of game software, which is constantly changing.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Project is taking a multi-stranded approach: saving player movies, crawling sites for documentation, etc. Perhaps can use the facilities of the virtual worlds themselves, eg virtual world coordinates to navigate; can you participate as a player and use this to capture stuff? [I’m not sure I’m getting this right, on the fly!]&lt;/p&gt;  &lt;p class="MsoNormal"&gt;What about access? Suggests it’s not a core concern for preservation [wrong??!!]. 
But they have some techniques that might help here.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Evaluating open world game platform, Sirikata, as a mechanism for preserving some virtual worlds, by moving maps and objects from one game space to another. Exported files from Quake to OpenVRML, then can import into other worlds like Sirikata.&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2690040699627974389?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2690040699627974389/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-lowood-on-why-virtual-worlds.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2690040699627974389'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2690040699627974389'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-lowood-on-why-virtual-worlds.html' title='iPres 2009: Lowood on why Virtual Worlds are History'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7787001270010396558</id><published>2009-10-05T19:36:00.001+01:00</published><updated>2009-10-05T19:39:11.992+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009 Conference 2: 
BRTF-SDPA panel</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;Brian Lavoie introduces the Blue Ribbon Task Force on Sustainable Digital Preservation and Access. Sustainability is much more than a technical issue; very much an economic issue. The Report of the task force (full disclosure: I’m a member of this TF) is due early next year. &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Definition of economic sustainability was in the &lt;a href="http://brtf.sdsc.edu/biblio/BRTF_Interim_Report.pdf"&gt;interim report&lt;/a&gt;, published last year. Requires:&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Recognition of benefits [demand-side]&lt;/li&gt;&lt;li&gt;Incentives to preserve [supply-side]&lt;/li&gt;&lt;li&gt;Selection to match means to ends&lt;/li&gt;&lt;li&gt;Mechanisms to support ongoing allocation of resources&lt;/li&gt;&lt;li&gt;Appropriate organisation &amp;amp; governance&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;          &lt;p class="MsoNormal"&gt; Digital preservation is a process where active intervention/investment is needed to reduce the risks to the assets. Mostly this has been seen in technical terms, but we also need to reduce the economic risks, and ensure active support by decision-makers.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Abby Smith moderating the panel, which includes Martha Anderson, Paul Courant (both on the TF), and Tricia Cruse (not on the TF but here acting as control). Although the TF is anglo-centric (albeit US-focused), we think the problem is universal. Questions first on demand-side.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Tricia Cruse on how CDL Preservation program serves UC &amp;amp; UC Libraries; the Libraries certainly see the value, but they are now taking the issue out to the academics. Phrase from Climate Change debate: challenge to preserve in a dynamic environment. 
Problem is that 3 or 5-year grant doesn’t encourage long-term thinking. Need to re-articulate preservation as a way to help with their current problem, rather than long term.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Martha Anderson on issues that are persuasive to Congress &amp;amp; national government. The most important argument is to demonstrate value to the nation. Again, this is value for now, rather than value for the future. Use for education is often a winning argument. NDIIPP has conditions, one of which is “demonstrate they have used the money well” so gaining trust is an issue. Funding subject to changing administration, which does mean arguments have to be slightly longer term.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Paul Courant on how to deal with 5 years being the new forever. Except for those here present (and following remotely) almost nobody cares! Waxing rhapsodic doesn’t work well. Preservation is what major institutions are about? There’s a need to express non-monetised value. Libraries grew because bigger collections attract better faculty, but that compact is breaking down; I don’t need to be at an institution to use their resources. Plus the preservation was happening almost by accident, because the books tended to live so long. Is the answer, get the first 5-10 years out of the way? Hold on for things for “long enough”, so you can decide later on what is worth while. “Show me the value”. Now for the supply-side.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Tricia: at the administrative level. Can we try scare stories; what is the cost of not preserving? Can tools and services help people to exploit their data better, in ways that bolster reputation? Also, think of data more as publications; could be real incentive to researchers; needs data citation mechanism (eg the DOI for data movement; personally a little unconvinced).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Martha: for cultural and public policy records, what are the barriers? 
Mass and complexity are barriers: the problem is so major, it’s overwhelming. Strategy is to share the work, share the burdens; this needs a public policy framework. Two aspects; one is change in copyright legislation (eg section 108 report) to give more powers for other libraries to preserve. Second is tax incentive etc for individuals, corporations etc to pay attention to preservation, offsetting costs, and enhancing motivations to donate.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Paul: what better incentives are there? Design of mechanisms often difficult. General principle is, demand creates supply. So if we can articulate demand, then supply will follow, but well articulated demand means money attached! NSF conditions on data management plans are OK, but Paul is yet to meet the person who does not get the next grant because of doing a bad job previously, so this is perhaps not yet working. But it’s hard to get current money on many potential future uses, especially scholarly ones. An intermediate case is the notion of handoffs: those with current interest and those with future interest are different people, so can we put mechanisms (eg libraries) to deal with that.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Questions from the floor: what’s the economic argument for preserving open access journals, since there’s no financial interest? The only way those journals will become part of the formal record of scholarship, is if we do preserve them. [Mind you, OA journals are surely low-hanging fruit?]&lt;/p&gt;  &lt;p class="MsoNormal"&gt;How can economies of scale come into the argument? >500K libraries across the world; does this help or hinder? Too many dispersed, closed efforts. Preserving for everyone introduces the free rider problem, which reduces incentives to pay to preserve if you can live off the activities (and costs) of others. Can we build up networks to coordinate better? 
[Mind you, uncoordinated is good; avoids coordinated failure…]&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Focus on 5 years is no use if the record set is going to be closed for 30 years? Perhaps there is a middle way here; ways to study closed portions of collections in privileged ways. Handoffs might help here (with anonymisation in place), but handoffs represent possible single points of failure.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Does winning the argument tip us over an economic cliff? Claim that costs of digital preservation are orders of magnitude higher than physical preservation [which I dispute!]. Paul: keeping print is much more expensive because of the huge space implications, and that’s a real cliff! Archives might be different in that respect.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;To end, Abby asks that each organisation that can answer yes to the 5 implied questions come and see us afterwards!&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7787001270010396558?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7787001270010396558/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-conference-2-brtf-sdpa-panel.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7787001270010396558'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7787001270010396558'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-conference-2-brtf-sdpa-panel.html' title='iPres 2009 Conference 2: BRTF-SDPA panel'/><author><name>Chris 
Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7864259289135757406</id><published>2009-10-05T18:09:00.001+01:00</published><updated>2009-10-05T18:11:55.793+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='iPres09'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><title type='text'>iPres 2009 Keynote: David Kirsch</title><content type='html'>Keynote from David Kirsch: Public Interest in Private Digital Records. The Corporation is an extremely powerful institution in society; we don’t take enough advantage of it. Would it be enough to save personal communications? He doesn’t think it’s enough… Public interest: 17&lt;sup&gt;th&lt;/sup&gt; century jurist Matthew Hale “When private property is affected with a public interest, it ceases to be juris privati only”. Harvard is the longest continuously incorporated institution in the US! Corporations now legal persons. Now people want to be corporations! Where is this going?    &lt;p class="MsoNormal"&gt;Aha! There may be a public interest in their private records, but problem in accessing without infringing private rights. Should corporations have a right to be forgotten (apparently part of EU charter of human rights)? Challenge: the digital record of business is at risk. The major legal power of legal discovery means corporations don’t want to create records (something similar in UK when Freedom of Information came in: shredders were furiously active). IT Knowledge Management makes corporate records more valuable, but lawyers want them to be destroyed ASAP.&lt;/p&gt;    &lt;p class="MsoNormal"&gt;Could corporations see their own self-interest in preserving their records? Can collective action help? 
Eg Chemical Industry Institute for Toxicology set up to research health impacts of formaldehyde… Possible National Venture Archive?&lt;/p&gt;    &lt;p class="MsoNormal"&gt;Possible “stroke of a pen” approach: create a public interest in the private records. Make a national register of Historical Documents. Escrow institutions, make the records “beyond discovery”? Technical redaction or selective invalidation? US taxpayers now own big companies like GM; for $50B, shouldn’t we at least get the records?&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;p class="MsoNormal"&gt;3&lt;sup&gt;rd&lt;/sup&gt; possible mechanism: abandoned interest: failed companies lose power to dispose of records? Would need to revise the social contract of corporations. Working with a Silicon Valley venture capital liquidator. Trying to turn a records warehouse into an archive: which boxes do they want? (Looks like they should have hired an archivist!)&lt;/p&gt;    &lt;p class="MsoNormal"&gt;Otherwise try elsewhere, eg Canada, Finland etc. 
Finally, exploit the general statutes for incorporation, eg use of term NewCo; we need&lt;span style=""&gt;  &lt;/span&gt;a NewCo for preservation.&lt;/p&gt;    &lt;p class="MsoNormal"&gt;Do Something, but Do No Harm…&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7864259289135757406?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7864259289135757406/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-keynote-david-kirsch.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7864259289135757406'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7864259289135757406'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/10/ipres-2009-keynote-david-kirsch.html' title='iPres 2009 Keynote: David Kirsch'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-6924483771162370910</id><published>2009-09-10T16:02:00.002+01:00</published><updated>2009-09-10T16:49:38.116+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data publishing'/><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' term='Data sharing'/><title type='text'>Nature special section on data sharing</title><content type='html'>This week's issue of Nature 
includes a very interesting &lt;a href="http://www.nature.com/news/specials/datasharing/index.html"&gt;special section on data sharing&lt;/a&gt;. There is a (not entirely accurate) editorial, a &lt;a href="http://www.nature.com/news/2009/090909/full/461160a.html"&gt;feature&lt;/a&gt; (on why data archives are so empty), and two opinion pieces.&lt;br /&gt;&lt;br /&gt;The first opinion piece is on &lt;a href="http://www.nature.com/nature/journal/v461/n7261/full/461168a.html"&gt;pre-publication deposit&lt;/a&gt;, linked to a workshop held in Toronto (and continuing the famous Bermuda and Fort Lauderdale workshops on data sharing in genomics) where attendees "[recommended] extending the practice to other biological data sets". The workshop produced the "&lt;a href="http://www.nature.com/nature/journal/v461/n7261/box/461168a_BX1.html"&gt;Toronto Statement&lt;/a&gt;", which has recommendations for funding agencies, data producers, data analysts and users, and scientific journal editors. The statement encourages rapid pre-publication deposit where the datasets are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;"Large scale (requiring significant resources over time)&lt;/li&gt;&lt;li&gt;Broad utility&lt;/li&gt;&lt;li&gt;Creating reference data sets&lt;/li&gt;&lt;li&gt;Associated with community buy-in"&lt;/li&gt;&lt;/ul&gt;The second opinion piece is on &lt;a href="http://www.nature.com/nature/journal/v461/n7261/full/461171a.html"&gt;post-publication deposit&lt;/a&gt;, linked to a meeting of mouse researchers in Rome "[proposing] ways to promote a culture of sharing". It's partly about other kinds of research materials (mice and cell-lines), but mostly about data. A couple of paragraphs are worth quoting (I hope Nature doesn't mind!):&lt;br /&gt;&lt;blockquote&gt;"Where they don't yet exist, clear criteria should be developed for reviewers of grants to help them assess data and material-sharing plans submitted as part of a funding proposal. 
There are already examples of good practice in this regard from the NIH, the Howard Hughes Medical Institute, and several UK funding organizations such as the Wellcome Trust and the Medical Research Council. Data-sharing plans are required in proposals, efforts are made to facilitate sharing, such as putting investigators in touch with repositories and, for some organizations, compliance is an important consideration in funding renewal.&lt;br /&gt;&lt;br /&gt;"Deposition of data and resources into public repositories is important for the validation of published results, as well as facilitating reuse. Although it is usual practice for major public databases to make data freely available to access and use, any restrictions on use should be strongly resisted and we endorse explicit encouragement of open sharing, for example under the newly available CC0 public domain waiver of Creative Commons."&lt;/blockquote&gt;The latter point is particularly interesting; in the UK restrictions on sharing often do seem to exist, even if only required registration to access. Sometimes the argument here is about protecting confidentiality; sometimes it is about ensuring the archives know their customers, for sustainability reasons. So this paragraph is also interesting:&lt;br /&gt;&lt;blockquote&gt;"Many of the major public data repositories have no stable underlying funding and there are data types, particularly new ones, without appropriate public data repositories. We encourage further investment and recommend that public database coverage and stability be looked at in a coordinated way by funding organizations and the community with increased urgency. 
A good model is provided by the UK Biotechnology and Biological Sciences Research Council's Bioinformatics and Biological Resources Fund, which provides dedicated funding for development and sustainability of public resources and informatics tools."&lt;/blockquote&gt;This all appears to be Open Access, and is well worth reading.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-6924483771162370910?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/6924483771162370910/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/09/nature-special-section-on-data-sharing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6924483771162370910'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6924483771162370910'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/09/nature-special-section-on-data-sharing.html' title='Nature special section on data sharing'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-3380811913364152241</id><published>2009-08-18T17:32:00.003+01:00</published><updated>2009-08-18T18:00:27.224+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Security'/><category scheme='http://www.blogger.com/atom/ns#' term='Disaster recovery'/><title type='text'>Off-site disaster recovery and security</title><content type='html'>I 
took an interesting help desk call today, from an IT person supporting a research centre, asking if the DCC offered off-site disaster recovery type services (perhaps this was really business continuity support services, but the answer's the same). The short answer is that we don't, but the longer answer was an interesting discussion about the centre's needs. The most interesting thing to me was the strong message that local University systems were not geared up to handle specific services of this type. I've commented before about the difficulties in getting reasonable backup systems in place (see &lt;a href="http://digitalcuration.blogspot.com/2009/07/my-backup-rant.html"&gt;My Backup Rant&lt;/a&gt;), and this is one step further. I think this person will be very capable of putting a good local system in place, but given the long-term value and sensitive nature of the data, needs a good quality service to provide that bit extra. It's looking as if he might have to go to the private sector to achieve this, although there may be services linked to HE, such as &lt;a href="http://www.aimesgridservices.com/"&gt;AIMES Grid Services&lt;/a&gt; (linked to Liverpool University, I believe). Perhaps the &lt;a href="http://www.e-science.stfc.ac.uk/services/atlas-petabyte-storage/atlas.html"&gt;Atlas Petabyte Store&lt;/a&gt; at STFC is an in-sector service that might do the trick.&lt;br /&gt;&lt;br /&gt;There was a question on whether (in the future) it might be sensible for such a project to attempt to get certified to ISO 27000. That's a big task; my guess is it might be too big a hurdle for a research centre to jump through. However, I believe that taking an approach linked to ISO 27000, without attempting to go as far as certification, can be extremely valuable. 
In particular, the ISO 27000 approach involves doing a security risk analysis, building what's referred to as an Information Security Management System (ISMS) to deal appropriately with the risks, and reviewing that ISMS in a continuous improvement cycle known as Plan-Do-Check-Act (PDCA). From my experience, approaching this systematically can reveal that previously un-considered risks are more significant than obvious headline risks.&lt;br /&gt;&lt;br /&gt;How important is "off-site"? I do think it is important that off-site means not in the same building. But it doesn't necessarily have to mean not in the same city. Here in Edinburgh, I would quite happily regard the main IT services at Kings Buildings, a couple of miles up the road, as off-site. In deciding, you do have to think about the threats; in central London, you might need to think about whether certain types of threats make larger areas of the city inaccessible, in a way that would probably be less likely in Edinburgh. But if you're locked out of your offices and your local IT services have been destroyed, just having an off-site backup may be comforting but is not going to get you up and running in the short term.&lt;br /&gt;&lt;br /&gt;I thought there might be some current JISC information relevant to this, but on a quick scan I wasn't able to identify anything.&lt;br /&gt;&lt;br /&gt;Note this is a much simpler service than data preservation or even data curation; much more of a standard commercial offering.&lt;br /&gt;&lt;br /&gt;But it is good to see research centres taking these issues seriously.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-3380811913364152241?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/3380811913364152241/comments/default' title='Post Comments'/><link rel='replies' 
type='text/html' href='http://digitalcuration.blogspot.com/2009/08/off-site-disaster-recovery-and-security.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3380811913364152241'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3380811913364152241'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/off-site-disaster-recovery-and-security.html' title='Off-site disaster recovery and security'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-5110302633743080261</id><published>2009-08-14T13:25:00.003+01:00</published><updated>2009-08-14T13:34:55.331+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Semantic web'/><category scheme='http://www.blogger.com/atom/ns#' term='Metadata'/><category scheme='http://www.blogger.com/atom/ns#' term='Linked Data'/><category scheme='http://www.blogger.com/atom/ns#' term='DCC'/><title type='text'>DCC web site and Linked Data</title><content type='html'>We at the DCC are in the early stages of refreshing our web site (&lt;a href="http://www.dcc.ac.uk/"&gt;www.dcc.ac.uk&lt;/a&gt;). Nothing you can see yet, but we're talking to a few consultants about what and how we can do better. The ones we have spoken to so far seem pretty clued up on content management systems, and even on web 2.0 approaches. 
But questions about the role of the Semantic Web or Linked Data get blank looks.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now our web site is not and will probably never be a major source of data as facts; rather it should contain resources: often documents, sometimes tools, sometimes sharing opportunities. There definitely are facts of various kinds there (which may not be sufficiently explicit yet), such as staff contact details, document metadata, event locations and times, etc. But these are a comparatively small part of the content. &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Does this (or anything else) justify investment in building a web site that is based on Linked Data/Semantic Web? What advantages could we get in doing so? What advantages could our users get if we did so?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I would really like to get some views on this!&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-5110302633743080261?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/5110302633743080261/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/dcc-web-site-and-linked-data.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5110302633743080261'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/5110302633743080261'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/dcc-web-site-and-linked-data.html' title='DCC web site and Linked Data'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4942716154838074389</id><published>2009-08-10T12:09:00.003+01:00</published><updated>2009-08-10T12:23:49.149+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Digital Curation'/><category scheme='http://www.blogger.com/atom/ns#' term='Fun'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><title type='text'>Forgetting to remember</title><content type='html'>After a Sunday Times article prompted yesterday's piece of whimsy, a Tweet from my standard Twitter search ( (digital OR data) AND (preservation OR curation), since you ask) produced an interesting &lt;a href="http://www.mercurynews.com/ci_12949916?nclick_check=1"&gt;article by Chris O'Brien&lt;/a&gt;, a columnist for MercuryNews.com: "Time to clean up your digital closet". He goes quite nicely through the various ways in which our personal digital content is more at risk than we might think (media degradation, device and format obsolescence, and the sheer anonymity of large quantities of digital stuff). But he has a prescription for dealing with some of it, part of which I reproduce here (I hope fair use covers this, since you'll have to go to the original to read the rest!):&lt;br /&gt;&lt;blockquote&gt;"However, all is not lost. There are some strategies for storing your digital archives. But you'll have to do a lot of work. You will need to start thinking like a librarian and become an active curator of your files. That means relentlessly organizing, labeling and tagging, backing up and deleting.&lt;br /&gt;&lt;br /&gt;The first and most important thing to do is to begin deleting files. Whittle things down to the essentials. What do you really want to maintain and pass along? 
You must be ruthless and vigilant.&lt;br /&gt;&lt;br /&gt;Next, develop a system for organizing files online and offline. If you're going to store stuff on removable media, like DVDs, place them in cases that have extensive labels, and index them. And don't store files like text documents or photos on propriety formats that are not widely adopted. Experts recommend photos in JPG forms and documents in PDF formats or basic text formats.&lt;br /&gt;&lt;br /&gt;Label every file and tag them with as much information as you can. Being obsessive now will pay off in the long run. This is a lot of work, which is why you want to cull your archives as much as possible.&lt;br /&gt;&lt;br /&gt;Once that's done, make multiple copies. You can also explore "cloud" backup services..."&lt;/blockquote&gt;Thinking like a librarian? Being an active curator of your files? Sounds like a good place to start. Interesting that he sees deleting as being an important part of remembering! We probably need better tools for the average person for a lot of this (eg tagging files in a filestore), but I suspect there's enough around for any reasonably competent researcher to use. However, laziness, forgetfulness and sheer pressure of work are our enemies here. 
Will we forget to do the things needed to remember?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4942716154838074389?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4942716154838074389/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/forgetting-to-remember.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4942716154838074389'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4942716154838074389'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/forgetting-to-remember.html' title='Forgetting to remember'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7514086786307026428</id><published>2009-08-09T21:28:00.002+01:00</published><updated>2009-08-09T22:12:33.046+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Fun'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><title type='text'>Remembering to forget</title><content type='html'>Are we getting this digital preservation thing all wrong? An &lt;a href="http://technology.timesonline.co.uk/tol/news/tech_and_web/article6788632.ece"&gt;article&lt;/a&gt; in today's Sunday Times quotes Viktor Mayer-Schonberger that we're creating a "digital memory that vastly exceeds the capacity of our collective human mind". 
James Harkin reports that Viktor wants us to forget more. Mind you, the main (and earliest) example given concerns an unfortunate photo from as recently as 2006. I'm sure the lady concerned would have remembered it only too well, whether or not the Internet example had come to light.&lt;br /&gt;&lt;br /&gt;It's a daft story, but there is an interesting angle. Many preservation "systems" carry the risk that things will be preserved that some would prefer forgotten (eg the famous &lt;a href="http://www.youtube.com/watch?v=CFijzDyJnVE"&gt;Bush speech&lt;/a&gt;). When the powerful want to change the record, the Web both facilitates and resists them. Web sites (and archives) are generally under some kind of centralised control, and subject to pressure, which they may or may not be able to resist. There are rumours of web-based reports being retrospectively "fixed". But once reports have got out into the wild, as it were, it is much harder to "fix" them, as the example above shows.&lt;br /&gt;&lt;br /&gt;This doesn't mean that archives are a bad thing. They bring professional standards to the keeping of history. But perhaps it's a Good Thing that there's an uncontrollable un-system of citizens keeping (probably illegal) copies of some important and uncomfortable records. 
Even if it does mean that the lady's embarrassing photo stays around longer than she would like!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7514086786307026428?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7514086786307026428/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/remembering-to-forget.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7514086786307026428'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7514086786307026428'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/remembering-to-forget.html' title='Remembering to forget'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-3497458860714422275</id><published>2009-08-06T17:41:00.003+01:00</published><updated>2009-08-06T17:46:49.744+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data management'/><title type='text'>JISC Data Management Infrastructure call</title><content type='html'>I gather from a tweet from Simon Hodson (Programme Manager at JISC for the Data Management Infrastructure call, &lt;a href="http://www.jisc.ac.uk/fundingopportunities/funding_calls/2009/05/grant0709.aspx"&gt;07/09&lt;/a&gt;), that 34 bids were received for this call. 
With maybe 8 able to be funded within the funding envelope, this looks like a very strong chance of getting an extremely interesting programme. I guess quite a few of us will be getting a batch of proposals to review in the next few days; despite a small feeling of dread at the workload (mostly evenings), I usually find I quite enjoy this process! There's so much to learn, even if (theoretically) we are supposed to forget it immediately!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-3497458860714422275?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/3497458860714422275/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/jisc-data-management-infrastructure.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3497458860714422275'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3497458860714422275'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/jisc-data-management-infrastructure.html' title='JISC Data Management Infrastructure call'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2223330806253231366</id><published>2009-08-05T17:17:00.004+01:00</published><updated>2009-08-05T17:45:11.721+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Digital Curation'/><category scheme='http://www.blogger.com/atom/ns#' 
term='data curation'/><category scheme='http://www.blogger.com/atom/ns#' term='curated databases'/><title type='text'>Curated databases and data curation</title><content type='html'>I've been working on an article with a colleague, and came across something that's interesting, and that you might be able to help us with. There does appear to be a distinction between the way curation is used in the bio-sciences, and elsewhere. In particular, the term "curated database" tends to mean a manually constructed database that links literature to data, curated by experts who provide authority (eg see the Wikipedia definition of &lt;a href="http://en.wikipedia.org/w/index.php?title=Biocurator&amp;amp;oldid=279524622"&gt;Biocurator&lt;/a&gt;). The earliest mention of the term "curated database" I can find is in the abstract (and only in the abstract) of &lt;a href="http://nar.oxfordjournals.org/cgi/content/abstract/21/13/3021"&gt;Larsen et al&lt;/a&gt; (1993).&lt;br /&gt;&lt;br /&gt;However, the terms "digital curation" and "data curation" tend to mean something different. We in the DCC say "Digital curation is maintaining and adding value to a trusted body of digital information for current and future use; specifically, we mean the active management and appraisal of data over the life-cycle of scholarly and scientific materials". This has a lot more elements of good management about it, and less of the idea of specific curators making judgements based on the literature. It has also been rather conflated with digital preservation.&lt;br /&gt;&lt;br /&gt;The earliest reference to digital curation I can find is a report of an invitational meeting held in October 2001, oddly titled "The Digital Curation: digital archives, libraries and e-science seminar" (&lt;a href="http://www.ariadne.ac.uk/issue30/digital-curation/"&gt;Beagrie and Pothen&lt;/a&gt;, 2001). In the meeting there was some discussion about data curation. 
The earliest more formal reference to data curation I can find is a technical report from &lt;a href="http://arxiv.org/abs/cs.DL/0208012"&gt;Gray, Szalay et al&lt;/a&gt; (2002).&lt;br /&gt;&lt;br /&gt;So my challenge is this: are there earlier references to digital (or data) curation, of the second kind?&lt;br /&gt;&lt;br /&gt;Beagrie, N., &amp;amp; Pothen, P. (2001). The Digital Curation: digital archives, libraries and e-science seminar. Ariadne. http://www.ariadne.ac.uk/issue30/digital-curation/.&lt;br /&gt;&lt;br /&gt;Gray, J., Szalay, A. S., Thakar, A. R., Stoughton, C., &amp;amp; vandenBerg, J. (2002). Online Scientific Data Curation, Publication, and Archiving. Redmond: Microsoft Research. http://arxiv.org/abs/cs.DL/0208012.&lt;br /&gt;&lt;br /&gt;Larsen, N., Olsen, G. J., Maidak, B. L., McCaughey, M. J., Overbeek, R., Macke, T. J., et al. (1993). The ribosomal database project. Nucl. Acids Res., 21(13), 3021-3023. http://nar.oxfordjournals.org/cgi/content/abstract/21/13/3021&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2223330806253231366?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2223330806253231366/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/curated-databases-and-data-curation.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2223330806253231366'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2223330806253231366'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/08/curated-databases-and-data-curation.html' title='Curated databases and data 
curation'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-6241905723609408738</id><published>2009-07-28T20:25:00.003+01:00</published><updated>2009-07-28T21:05:57.590+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Backup'/><title type='text'>My backup rant</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;I did a presentation on Trust and Digital Archives at the PASIG Malta meeting; not a very good presentation, I felt. But somewhat at the last minute I added a scarcely relevant extra slide on my favourite bête noire: Preservation’s dirty little secret, namely backup, or rather the lack of it. Curation and preservation are about doing better research, and reducing risk to the research enterprise. But all is for nothing if those elementary prior precautions of making proper backups are not observed. You can’t curate what isn’t there! Anyway, that part went down very well.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Of course the obvious reaction at this point might be to say, tsk tsk, proper backups, of course, everyone should do that. But I’m willing to bet that the majority of people have quite inadequate backup for their home computer systems; the systems on which, these days, they keep their photos, their emails, their important documents. Or worse, they think that having uploaded their photos to Flickr means they are backed up and even preserved.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;There’s a subtext here. Researchers are bright, they question authority, they are even idiosyncratic. They often travel a lot, away from the office for long periods. They have their own ways of doing things. 
Yes, some labs will be well organised, with good systems in place. But others will leave many things to the “good sense” of the researchers. In my experience, this means a wide variety of equipment, both desktop and laptop, with several flavours of operating system in the one research group. One I saw recently had pretty much the entire gamut: more than one version of MS Windows, more than one version of MacOS/X, and several versions of Linux, all these on laptops. Desktop machines in that group tended to be better protected, with a corporate Desktop, networked drives and organised backup systems. But the laptops, often the researchers’ primary machines, were very exposed. My own project has Windows and Mac systems (at least), and is complicated by being spread across several institutions.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The "good sense" of researchers apparently leaves a lot to be desired, according to a few surveys we've seen over the past couple of years, including the StORe project (mentioned in an &lt;a href="http://digitalcuration.blogspot.com/2008/01/repositories-for-people.html"&gt;earlier blog post&lt;/a&gt;). At a recent meeting, we even heard of examples of IT departments discouraging researchers from keeping their data on the backed-up part of their corporate systems, presumably for reasons of volume, expense, etc.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;To take my own example, I have a self-managed Mac laptop with 110 GB of disk or so. My corporate disk quota has been pushed up to a quite generous 10 GB. The simplest backup strategy for me is to rsync key directories onto a corporate disk drive, and let the corporate systems take the backup from there. 
Someone wrote a tiny script for me that I run in the underlying UNIX system; typically it takes scarcely a couple of minutes to rsync the Desktop, my Documents and Library folders (including email, about 9 GB in all), when in the office. But I keep downloading reports and other documents, and soon I’ll be bumping up against that quota limit again.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;For a while I supplemented this with backup DVDs, and there’s quite a pile of them on my desk. But that already doesn’t work as my needs increase.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Remember, disk is cheap! No-one buys a computer with less than 100 GB these days. My current personal backup solution is to supplement the partial rsync backup with a separate backup (using the excellent Carbon Copy Cloner) to a 500 GB disk kept at home, on a USB 2 port. I back up a bit more (Pictures folders etc), but it’s MUCH slower, taking maybe 12 minutes, most of which seems to be a very laborious trek through the filesystem (rsync clearly does the same task much faster). By the way, that simple, self-powered disk cost less than £100, and a colleague says I should have paid less. I know this doesn't translate easily into budget for corporate systems, but it certainly should.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;But this one-off solution still leaves me unable to answer a simple question: are my project’s data adequately backed up? My solution works for someone with a Mac; the software doesn’t work on Windows. It seems to be everyone for themselves. 
As far as I can see, there is no good, simple, low cost, standardised way to organise backup!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;I looked into Cloud solutions briefly, but was rather put off by the clauses in &lt;a href="http://aws.amazon.com/agreement/"&gt;Amazon’s agreements&lt;/a&gt;, such as 7.2 ("you acknowledge that you bear sole responsibility for adequate security, protection and backup of Your Content"), or 11.5, disclaiming any warranty "THAT THE DATA YOU STORE WITHIN THE SERVICE OFFERINGS WILL BE SECURE OR NOT OTHERWISE LOST OR DAMAGED" (more on this another time). That certainly makes the appealing idea of Cloud-based backups rather less attractive (although you could perhaps negotiate or design around it).&lt;/p&gt;  &lt;p class="MsoNormal"&gt;I think what I need is a service I can subscribe to on behalf of everyone in my project. I want agreed backup policies that allow for the partly disconnected existence experienced by so many these days. I want the server side to be negotiable separately from the client side, and I want clients for all the major OS variants (for us, that’s Windows, Mac OS/X and various flavours of Linux). I think this means an API, leaving room for individuals to specify whether it should be a bootable or partial backup, which parts of the system are included or excluded, and for management to specify the overall backup regime (full and incremental backups, cycles, and any “archiving” of deleted files, etc). I want the system to take account of intermittent and variable quality connectivity; I don’t want a full backup to start when I make a connection over a dodgy external wifi network. I don’t want it to work only in the office. I don’t want it to require a server on-site. 
On the one hand this sounds a lot; on the other, none of this is really difficult (I believe). &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Does such a system exist? If not, is defining a system like this a job for &lt;a href="http://www.snia.org/"&gt;SNIA&lt;/a&gt;? &lt;a href="http://www.jisc.ac.uk/"&gt;JISC&lt;/a&gt;? Anyone? Is there a demand from anyone other than me?&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-6241905723609408738?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/6241905723609408738/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/my-backup-rant.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6241905723609408738'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6241905723609408738'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/my-backup-rant.html' title='My backup rant'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1240227664694331551</id><published>2009-07-28T15:08:00.003+01:00</published><updated>2009-07-28T20:21:29.377+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Social media'/><category 
scheme='http://www.blogger.com/atom/ns#' term='BRTF-SDPA'/><category scheme='http://www.blogger.com/atom/ns#' term='Twitter'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><category scheme='http://www.blogger.com/atom/ns#' term='Blog'/><title type='text'>Turmoil in discourse a long term threat?</title><content type='html'>Lorcan Dempsey &lt;a href="http://orweblog.oclc.org/archives/001989.html"&gt;mentioned&lt;/a&gt; a meeting with Walt Crawford, whom I don't know, in the light of his feeling that "some of the heat had gone out of the blogosphere in general", and reported:&lt;blockquote&gt;&lt;div&gt;'&lt;a href="http://scienceblogs.com/waltatrandom/"&gt;Walt&lt;/a&gt;, whom I was pleased to bump into [...], is probably right to suggest in the comments that some energy around notifications etc has moved to Twitter: "Twitter et al ... have, in a way, strengthened essay-length blogging while weakening short-form blogging (maybe)-and essays have always been harder to do than quick notes"'&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;That ties in to my experience to some extent. I've just published a blog post from Sun-PASIG in Malta, which ended a month ago (not really an essay, but something where it was hard to get the tone just right), and I have a bunch of other posts in the "part-written" pipeline. Tweets are a lot easier.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;But that isn't quite my point here. I'm a little concerned that the new "longevity" threat may not be the encoding of our discourse in obsolete formats, and not even our entrusting it to private providers such as the blog systems (as long as it IS open access, and preferably Creative Commons). 
The threat may be the way new venues for discourse wax and wane with great rapidity. We can learn to deal with blogs, we can even have a debate on whether the twitterverse is worth saving (or how much of it might be). Do we need to worry about other more social media (MySpace, Facebook, Flickr and so many lesser pals; so heavily fractured)? They're not speech, they're not scholarly works, but they have some significance (particularly in documenting significant events) somewhere in between. We could learn to deal with any small set of them, but by the time we work out how they could be preserved, and how parts might be selected, that set would (as is suggested above for blogs) already be "so last year".&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;BTW, part of this space is being addressed by the &lt;a href="http://brtf.sdsc.edu/index.html"&gt;Blue Ribbon Task Force on Sustainable Digital Preservation and Access&lt;/a&gt;. I'm attending one of their meetings over the next two days, on my first visit to Ann Arbor, Michigan. Among the things we're looking at are scenarios that currently include social media. 
I'll try and write a bit more about it, but it's not really the sort of meeting you can blog about freely...&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1240227664694331551?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1240227664694331551/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/turmoil-in-discourse-long-term-threat.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1240227664694331551'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1240227664694331551'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/turmoil-in-discourse-long-term-threat.html' title='Turmoil in discourse a long term threat?'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4771300235438736743</id><published>2009-07-28T14:57:00.004+01:00</published><updated>2009-07-28T15:05:20.595+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' term='Preservation formats'/><category scheme='http://www.blogger.com/atom/ns#' term='Open Source'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun-PASIG'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><title type='text'>Rosenthal at Sun-PASIG in 
Malta</title><content type='html'>&lt;p class="MsoNormal"&gt;I was very pleased to hear David Rosenthal reprise his &lt;a href="http://www.cni.org/tfms/2009a.spring/plenary.html"&gt;CNI keynote on digital preservation&lt;/a&gt; for the Sun-PASIG meeting in Malta, a few weeks ago now. David is a very original thinker and careful speaker. I’ve fallen into the trap before of mis-remembering him, and then arguing from my faulty version. I even noted two tweets, made contemporaneously with his talk, that misquoted him and changed the meaning subtly (see below). Luckily, David has made his CNI presentation available in an &lt;a href="http://blog.dshr.org/2009/04/spring-cni-plenary-remix.html"&gt;annotated version&lt;/a&gt; on his blog, so I hope I don’t make the same mistake again.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;If you were not able to hear this talk, please go read that blog post. David has some important things to say, pretty much all of which I agree strongly with. No real surprise there, as part of the talk at least echoes concerns I expressed in the “&lt;a href="http://www.ariadne.ac.uk/issue46/rusbridge/"&gt;Excuse Me…&lt;/a&gt;” Ariadne article (Rusbridge, 2006), which on reflection was probably influenced by earlier meetings with David among others.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;So here’s the highly condensed version: Jeff Rothenberg was wrong in his famous &lt;a href="http://search.ebscohost.com/login.aspx?direct=true&amp;amp;db=buh&amp;amp;AN=9501173513&amp;amp;site=ehost-live"&gt;1995 Scientific American article&lt;/a&gt; (Rothenberg, 1995). The important digital preservation problems for society are not media degradation or media obsolescence or format obsolescence, because important stuff is online (and more or less independent of media), and widely used formats no longer become obsolete the way they did when Jeff wrote the article. The important issue is money, as collecting all we need will be ruinously expensive. 
Every dollar we spend on non-problems (like protecting against format obsolescence) doesn’t go towards real problems.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;And if you are so imbued with conventional preservation wisdom as to think that summary is nonsense, but you haven’t read the blog post, go read it before making up your mind!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;David concludes:&lt;/p&gt;  &lt;blockquote&gt;&lt;p class="MsoNormal"&gt;&lt;b&gt;"Practical Next Steps&lt;/b&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Everyone - just go collect the bits: Not hard or costly to do a good enough job, Please use Creative Commons licenses&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Preserve Open Source repositories: Easy &amp;amp; vital: no legal, technical or scale barriers&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Support Open Source renderers &amp;amp; emulators&lt;br /&gt;Support research into preservation tech: How to preserve bits adequately &amp;amp; affordably? How to preserve this decade's dynamic web of services? Not just last decade's static web of pages"&lt;/p&gt;&lt;/blockquote&gt;So what are the limitations of this analysis? 
My quick summary from a research data viewpoint:    &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p class="MsoNormal"&gt;Lots of important/valuable stuff is not online&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Quite a lot of this stuff is not readable with common, open-source-compatible software packages&lt;/p&gt;  &lt;p class="MsoNormal"&gt;We need to keep contextual metadata as well as the bits for a lot of this stuff… and yes, we do need to learn how to do this in a scalable way.&lt;/p&gt;&lt;/blockquote&gt;    &lt;p class="MsoNormal"&gt;&lt;!--[if !supportEmptyParas]--&gt;David clearly concentrates on the online world:&lt;/p&gt;    &lt;p class="MsoNormal" style="margin-left: 36pt; text-indent: -36pt; line-height: 16pt;"&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p class="MsoNormal" style="margin-left: 36pt; text-indent: -36pt; line-height: 16pt;"&gt;&lt;span class="Apple-style-span"  style="font-family:'times new roman';"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;“&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="color: rgb(16, 16, 16); "&gt;&lt;span class="Apple-style-span"  style="font-family:'times new roman';"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Now, if it is worth keeping, it is on-line&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;!--StartFragment--&gt;    &lt;span lang="EN-US" style="color: rgb(16, 16, 16); "&gt;&lt;span class="Apple-style-span"  style="font-family:'times new roman';"&gt;&lt;span class="Apple-style-span" style="font-size: large;"&gt;Off-line backups are temporary”&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;!--EndFragment--&gt;    &lt;/blockquote&gt;&lt;p class="MsoNormal" style="margin-left: 36pt; text-indent: -36pt; line-height: 16pt;"&gt;&lt;span style="color: rgb(16, 16, 16);font-family:Verdana;font-size:11pt;"  lang="EN-US" &gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;    &lt;p class="MsoNormal"&gt;&lt;!--[if !supportEmptyParas]--&gt;However, it’s worth 
remembering Raymond Clarke’s point in &lt;a href="http://digitalcuration.blogspot.com/2009/07/online-and-offline-storage-cost-and.html"&gt;my earlier post from PASIG Malta&lt;/a&gt; about the cost advantages of offline storage. Particularly in the research data world, there is a substantial set of content that exists off-line, or perhaps near-line. Some of the Rothenberg risks still apply to such content. Let’s leave aside for the moment the fact that parallels to the scenario Rothenberg envisages continue to exist: scholars’ works encoded in obsolete digital media are starting to be ingested in archives. But more pressingly, some research projects report that their university IT departments discourage them from using enterprise backup systems for research data, for reasons of capacity limitations. So these data often exist in a ragbag collection of scarcely documented offline media (or may even be not backed up at all). In Big Science, data may be better protected, being sometimes held in large hierarchical storage management systems. A concern I have heard from the managers of such large systems is that the time needed to migrate their substantial data holdings from one generation of storage to the next can approximate the life of the system, i.e. several years. And clearly such systems are more exposed to risk.&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;!--[endif]--&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Secondly, David’s comments about format obsolescence apply specifically to common formats. He says “Gratuitous incompatibility is now self-defeating”, and “Open Source renderers [exist] for all major formats” with “Open Source isn't backwards incompatible”. But unfortunately there are examples of valuable resources that remain at risk. There are areas with valuable content not accessible with Open Source renderers (e.g. engineering and architectural design). 
There are many cases in research where critical analysis codes are written by non-experts, with poor version control and poor documentation. And even in the mainstream world, format obsolescence can still occur in minority formats, for all sorts of reasons, including bankruptcy but also sheer bad design of early versions.&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;!--[endif]--&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Finally, I’m sure David didn’t really mean “just keep the bits”. Particularly in research, but in many other areas as well, important contextual data and metadata are needed to understand the preserved data, and to demonstrate its authenticity. The task of capturing and preserving these can be the hardest part of curating and preserving the data, precisely because those directly involved need less of the context.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Oh, that double mis-quote? Talking of the difficulty of engaging with costly lawyers, David said “&lt;span style="" lang="EN-US"&gt;1 hour of 1 lawyer ~ 5TB of disk [-] 10 hours of 1 lawyer could store the academic literature”. One tweet reported this as “&lt;/span&gt;Lawyer effects; cost of 10 lawyer hours could save entire academic literature!” and the other as “10 hours of a lawyer's time could preserve the entire academic literature”. See what I mean? Neither save nor preserve means the same as store!&lt;/p&gt;    &lt;p class="MsoNormal"&gt;&lt;!--[if !supportEmptyParas]--&gt; &lt;!--[endif]--&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;Overall, David does a great job, in his presentation, blog post and other writings, in reminding us not to blindly accept but to challenge preservation orthodoxy. 
Put simply, we have to think for ourselves.&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;!--[endif]--&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left: 36pt; text-indent: -36pt;"&gt;&lt;!--[if supportFields]&gt;&lt;span style="'mso-element:field-begin'"&gt;&lt;/span&gt;&lt;span style="mso-spacerun: yes"&gt; &lt;/span&gt;ADDIN EN.REFLIST &lt;span style="'mso-element:field-separator'"&gt;&lt;/span&gt;&lt;![endif]--&gt;Rothenberg, J. (1995). Ensuring the longevity of digital documents. &lt;i&gt;Scientific American, 272&lt;/i&gt;&lt;span style="font-style: normal;"&gt;(1), 42. &lt;a href="http://search.ebscohost.com/login.aspx?direct=true&amp;amp;db=buh&amp;amp;AN=9501173513&amp;amp;site=ehost-live"&gt;http://search.ebscohost.com/login.aspx?direct=true&amp;amp;db=buh&amp;amp;AN=9501173513&amp;amp;site=ehost-live&lt;/a&gt; &lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left: 36pt; text-indent: -36pt;"&gt;(yes, that URL IS the "permanent URL" according to Ebsco!)&lt;/p&gt;&lt;p class="MsoNormal" style="margin-left: 36pt; text-indent: -36pt;"&gt;Rusbridge, C. (2006). Excuse Me... Some Digital Preservation Fallacies?  
&lt;i&gt;Ariadne&lt;/i&gt;&lt;span style="font-style: normal;"&gt; from &lt;u&gt;&lt;a href="http://www.ariadne.ac.uk/issue46/rusbridge/"&gt;http://www.ariadne.ac.uk/issue46/rusbridge/&lt;/a&gt;&lt;/u&gt;.&lt;/span&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4771300235438736743?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4771300235438736743/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/rosenthal-at-sun-pasig-in-malta.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4771300235438736743'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4771300235438736743'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/rosenthal-at-sun-pasig-in-malta.html' title='Rosenthal at Sun-PASIG in Malta'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-899149572057538944</id><published>2009-07-24T20:45:00.005+01:00</published><updated>2010-01-05T17:32:48.686Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Semantic web'/><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' term='Linked Data'/><category scheme='http://www.blogger.com/atom/ns#' term='RDF'/><title type='text'>Semantic Web of Linked Data for Research?</title><content 
type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;In the beginning was the World Wide Web. Then we were going to have the Semantic Web. (Then we had Web 2.0, but that’s another story.) But maybe the Semantic Web wasn’t semantic enough for some, so they changed the name to Linked Data, and it began to take off a little more. Now there’s an argument about whether all linked data are Linked Data! &lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The debate started with Andy Powell asking on Twitter what name we should use when all the conditions for Linked Data are met except one: the requirement that data be expressed using the standards, specifically RDF (&lt;a href="http://efoundations.typepad.com/efoundations/2009/07/linked-data-vs-web-of-data-vs-.html"&gt;see Andy's summary&lt;/a&gt;). Tim Berners-Lee had &lt;a href="http://www.w3.org/DesignIssues/LinkedData.html"&gt;suggested&lt;/a&gt; there were 4 principles for Linked Data: &lt;/p&gt;&lt;p class="MsoNormal" style="margin-left:0cm;text-indent:0cm;mso-text-indent-alt:0cm;mso-list:l0 level1 lfo1"&gt;&lt;/p&gt;&lt;ol&gt;&lt;li&gt;Use URIs as names for things.&lt;/li&gt;&lt;li&gt;Use HTTP URIs so that people can look up those names.&lt;/li&gt;&lt;li&gt;When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).&lt;/li&gt;&lt;li&gt;Include links to other URIs, 
so that they can discover more things.&lt;/li&gt;&lt;/ol&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left:0cm;text-indent:0cm;mso-text-indent-alt:0cm;mso-list:l0 level1 lfo1"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="margin-left:0cm;text-indent:0cm;mso-text-indent-alt:0cm;mso-list:l0 level1 lfo1"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal" style="text-indent:36.0pt"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; There were quite strong divisions; one group says roughly: “Linked Data is a brand and a definition; live with it”, while the other group says something like “Linked Data can afford to be inclusive, and will benefit from that” (both of these are extreme simplifications). I’ve read all the remarks and they’re pretty convincing; I mostly agree with them (not much help to you, gentle reader!). Paul Walk's &lt;a href="http://blog.paulwalk.net/2009/07/21/no-data-here-just-linked-concepts/"&gt;summary&lt;/a&gt; is quite balanced. However, I particularly liked a comment made on someone else’s blog post by Dan Brickley, who should know about RDF (quoted by Andy in the post mentioned above):&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p class="MsoNormal"&gt;“I have no problem whatsoever with non-RDF forms of data in “the data Web”. This is natural, normal and healthy. Statistical information, geographic information, data-annotated SVG images, audio samples, JSON feeds, Atom, whatever.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;We don’t need all this to be in RDF. Often it’ll be nice to have extracts and summaries in RDF, and we can get that via GRDDL or other methods. 
And we’ll also have metadata about that data, again in RDF; using SKOS for indicating subject areas, FOAF++ for provenance, etc.&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;The non-RDF bits of the data Web are – roughly – going to be the leaves on the tree. The bit that links it all together will be, as you say, the typed links, loose structuring and so on that come with RDF. This is also roughly analogous to the HTML Web: you find JPEGs, WAVs, flash files and so on linked in from the HTML Web, but the thing that hangs it all together isn’t flash or audio files, it’s the linky extensible format: HTML. For data, we’ll see more RDF than HTML (or RDFa bridging the two). But we needn’t panic if people put non-RDF data up online…. it’s still better than nothing. And as the LOD scene has shown, it can often easily be processed and republished by others. People worry too much! :)”&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p class="MsoNormal"&gt; I think this makes lots of sense for research data. I’ve been wondering for some time how RDF fits into the world of research data. I asked the NERC Data Managers at their meeting earlier this year, and the general consensus appeared to be that RDF was good for the metadata, but not the actual research data. This seems reasonable and is consistent with Dan’s view above.&lt;/p&gt;  &lt;p class="MsoNormal"&gt; But it does rather raise the question about exactly what kinds of data RDF IS suitable for. It begins to look as if it is good for isolated facts, simple relationships and descriptive data. 
While RDF probably can encode most things you would put in databases or scientific datasets, generally it would be very difficult to express what those databases and datasets can express, and there would be a massive explosion of triples if one tried.&lt;/p&gt;  &lt;p class="MsoNormal"&gt;To answer Andy’s original question (what name…), although I was taken with the idea of linked data, it’s clearly too easy to confuse with Linked Data. So I think I’d go with Paul Walk’s suggestion of Web of Data, or interchangeably Dan Brickley's data Web. If we can weave research data into a Web of Data, we’ll be doing well!&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-899149572057538944?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/899149572057538944/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/semantic-web-of-linked-data-for.html#comment-form' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/899149572057538944'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/899149572057538944'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/semantic-web-of-linked-data-for.html' title='Semantic Web of Linked Data for Research?'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2102604470822475212</id><published>2009-07-23T15:37:00.005+01:00</published><updated>2009-10-21T18:17:21.892+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='IJDC'/><category scheme='http://www.blogger.com/atom/ns#' term='Citation'/><category scheme='http://www.blogger.com/atom/ns#' term='iPres'/><category scheme='http://www.blogger.com/atom/ns#' term='IDCC09'/><title type='text'>IJDC Volume 4(1) was published</title><content type='html'>That's volume 4, issue 1 of the &lt;a href="http://www.ijdc.net/index.php/ijdc/index"&gt;International Journal of Digital Curation&lt;/a&gt;... and I didn't report it here. My apologies for that. It's our biggest &lt;a href="http://www.ijdc.net/index.php/ijdc/issue/view/7"&gt;issue&lt;/a&gt; yet, with 10 peer-reviewed papers and 4 general articles, plus 2 editorials (a &lt;a href="http://www.ijdc.net/index.php/ijdc/article/view/96/90"&gt;guest editorial&lt;/a&gt; from Malcolm Atkinson, and a normal one from me). There's some really interesting stuff, mostly from the Digital Curation Conference in Edinburgh last year.&lt;br /&gt;&lt;br /&gt;There are still a few papers from last year's conference to come, plus a selection from &lt;a href="http://www.bl.uk/ipres2008/index.html"&gt;iPres 2008&lt;/a&gt; at the BL in London as well. We are also hoping that some papers will emerge from &lt;a href="http://www.cdlib.org/iPres/"&gt;iPres 09&lt;/a&gt;, which has just opened registration, and will shortly be feeding back the results of their selection process to authors. Still time to submit to this year's &lt;a href="http://www.dcc.ac.uk/events/dcc-2009/"&gt;Digital Curation Conference&lt;/a&gt;, guys (submissions close August 7, 2009).&lt;br /&gt;&lt;br /&gt;We have done a couple of interesting analyses on the IJDC. 
One was a "readership analysis" based on web stats, for the period January-June 2009. Eight out of the ten most down-loaded papers in that time were from IJDC 3(2) (the ninth was from 3(1), and the tenth was from IJDC 1). These 10 papers were each down-loaded between 395 and 485 times during that period, an average of just under 440.&lt;br /&gt;&lt;br /&gt;The second was to use Google Scholar to assess citations for the issues up to and including 3(2). Issue 4(1) is too recent. I checked the peer-reviewed papers, which GS suggested had been cited 92 times (maximum 11 times for that most-down-loaded &lt;a href="http://www.ijdc.net/index.php/ijdc/article/view/6"&gt;Beagrie article&lt;/a&gt; from Issue 1), for an average of 3.3 citations per paper. I also checked the articles, although I ignored simple reports, editorials and reviews. Counting peer-reviewed papers and checked general articles, there were 142 citations, for an average of 2.7 citations per item.&lt;br /&gt;&lt;br /&gt;Only one of those eight most down-loaded papers in issue 3(2) had translated those downloads into significant citations: the &lt;a href="http://www.ijdc.net/index.php/ijdc/article/view/84"&gt;Cheung&lt;/a&gt; paper has 6. But we should give them time, I think; citations per checked item per issue are noticeably lower for more recent items, as you might expect.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;4.2 in IJDC 1&lt;/li&gt;&lt;li&gt;3.3 in IJDC 2(1)&lt;/li&gt;&lt;li&gt;4.2 in IJDC 2(2)&lt;/li&gt;&lt;li&gt;2.1 in IJDC 3(1)&lt;/li&gt;&lt;li&gt;1.4 in IJDC 3(2)&lt;/li&gt;&lt;/ul&gt;By the way, we are particularly proud of one citation of an IJDC paper from a paper in Nature's Big Data Issue (&lt;a href="http://www.nature.com.ezproxy.webfeat.lib.ed.ac.uk/nature/journal/v455/n7209/pdf/455047a.pdf"&gt;Howe et al, The future of biocuration&lt;/a&gt;). The citation was of the &lt;a href="http://www.ijdc.net/index.php/ijdc/article/view/42"&gt;Palmer et al&lt;/a&gt; paper in IJDC 2(2)... 
but Google Scholar failed to notice it. So these figures come with a few caveats!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2102604470822475212?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2102604470822475212/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/ijdc-volume-41-was-published.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2102604470822475212'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2102604470822475212'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/ijdc-volume-41-was-published.html' title='IJDC Volume 4(1) was published'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7968587755508468018</id><published>2009-07-23T11:59:00.002+01:00</published><updated>2009-07-23T12:09:15.404+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='UKDA'/><category scheme='http://www.blogger.com/atom/ns#' term='Data management'/><category scheme='http://www.blogger.com/atom/ns#' term='Data sharing'/><title type='text'>Managing and sharing data</title><content type='html'>I really like the UK Data Archive publication "&lt;a href="http://www.data-archive.ac.uk/news/publications/managingsharing.pdf"&gt;Managing and Sharing Data&lt;/a&gt;: a best practice guide for researchers". 
It's been available in print form for a while, although I did hear they had run out and were revising it. Meanwhile, you can get the PDF version from their web site. It is &lt;a href="http://www.data-archive.ac.uk/sharing/sharing.asp"&gt;complemented by sections of their web site&lt;/a&gt; which reflect, and sometimes expand on, the sections in the Guide. The sections are:&lt;br /&gt;&lt;blockquote&gt;Sharing data - why and how?&lt;br /&gt;Consent, confidentiality and ethics&lt;br /&gt;Copyright    &lt;br /&gt;Data documentation and metadata&lt;br /&gt;Data formats and software&lt;br /&gt;Data storage, back-up, and security&lt;br /&gt;&lt;/blockquote&gt;This is an excellent resource!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7968587755508468018?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7968587755508468018/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/managing-and-sharing-data.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7968587755508468018'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7968587755508468018'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/managing-and-sharing-data.html' title='Managing and sharing data'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-6636001830945983619</id><published>2009-07-15T21:45:00.002+01:00</published><updated>2009-10-21T18:17:47.038+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Digital Curation'/><category scheme='http://www.blogger.com/atom/ns#' term='Conference'/><category scheme='http://www.blogger.com/atom/ns#' term='IDCC09'/><title type='text'>Digital Curation Conference deadline extended</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;5th International Digital Curation Conference (IDCC09) &lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;Moving to Multi-Scale Science: Managing Complexity and Diversity. &lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;2 – 4 December 2009, Millennium Gloucester Hotel, London, UK. 
&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;**************************************************************************&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;We are pleased to announce that the Paper Submission date for IDCC09 has been extended by 2 weeks to Friday 7 August 2009: &lt;/span&gt;&lt;/span&gt;&lt;a href="http://www.dcc.ac.uk/events/dcc-2009/call-for-papers/"&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;http://www.dcc.ac.uk/events/dcc-2009/call-for-papers/&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt; Remember that submissions should be in the form of a full or short paper, or a one page abstract for a poster, workshop or demonstration.&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;Presenting at the conference offers you the chance to:- 
&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-family:arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Share good practice, skills and knowledge transfer &lt;/li&gt;&lt;li&gt;Influence and inform future digital curation policy &amp;amp; practice &lt;/li&gt;&lt;li&gt;Test out curation resources and toolkits &lt;/li&gt;&lt;li&gt;Explore collaborative possibilities and partnerships &lt;/li&gt;&lt;li&gt;Engage educators and trainers with regard to developing digital curation skills for the future&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;    &lt;p class="MsoNormal"&gt;&lt;span style="font-family:Arial;font-size:10.0pt;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span style=" ;font-family:Arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;Speakers at the conference will include:-&lt;/span&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span"  style=" ;font-family:Arial, fantasy;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;Timo Hannay – Publishing Director, Nature.com&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"  style=" ;font-family:Arial, fantasy;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;Professor Douglas Kell – Chief Executive of the Biotechnology &amp;amp; Biological Sciences Research Council (BBSRC)&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"  style=" ;font-family:Arial, fantasy;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;Dr. 
Ed Seidel, Director of the National Science Foundation’s Office of Cyberinfrastructure&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;/p&gt;      &lt;p class="MsoNormal"&gt;&lt;span style=" ;font-family:Arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;All papers accepted for the conference will be published in the &lt;/span&gt;&lt;a href="http://www.ijdc.net/index.php/ijdc"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;International Journal of Digital Curation&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;br /&gt;&lt;br /&gt;Sent on behalf of the Programme Committee – &lt;/span&gt; &lt;span class="Apple-style-span"  style="font-size:medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span style=" ;font-family:Arial;"&gt;&lt;span class="Apple-style-span"  style="font-size:medium;"&gt;co-chaired by Chris Rusbridge, Director of the Digital Curation Centre, Liz Lyon, Director of UKOLN and Clifford Lynch, Executive Director of the Coalition for Networked Information.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-6636001830945983619?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/6636001830945983619/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/digital-curation-conference-deadline.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6636001830945983619'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6636001830945983619'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/digital-curation-conference-deadline.html' title='Digital Curation Conference deadline extended'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-681501861355852890</id><published>2009-07-15T21:19:00.003+01:00</published><updated>2009-07-15T21:43:29.577+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Archival media'/><category scheme='http://www.blogger.com/atom/ns#' term='Data'/><category scheme='http://www.blogger.com/atom/ns#' term='Sun-PASIG'/><title type='text'>Online and offline storage: cost and greenness at Sun PASIG</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt; I was at the &lt;a href="http://sun-pasig.ning.com/"&gt;Sun Preservation and Archiving SIG&lt;/a&gt; meeting in Malta a couple of weeks ago, a very interesting meeting indeed. The agenda and presentations are being mounted &lt;a href="http://lib.stanford.edu/pasig"&gt;here&lt;/a&gt;. I’ll try to pick out some points that are worth briefly blogging about, if I can. &lt;/p&gt;&lt;p class="MsoNormal"&gt;Raymond Clarke of Sun and SNIA spoke quite early on about storage (presentation not up yet). You may know that &lt;a href="http://www.sun.com/featured-articles/2009-0325/feature/index.jsp"&gt;Sun now holds the Internet Archive&lt;/a&gt;, apparently in a mobile data centre (basically a standard container in a car park!). The point I was interested in was Raymond’s comments about tape, which he said was the fastest growing market segment. He said the cost ratio of disk:tape storage was 23:1. 
But, noting that storage represents around 40% of the power consumption of data centres, and given our current environmental concerns, it’s notable that the energy ratio for disk:tape is 200:1!&lt;/p&gt;  &lt;p class="MsoNormal"&gt;On risk, a separate &lt;a href="http://lib.stanford.edu/files/moreira_pasig_histor.pdf"&gt;presentation by Moreira &lt;/a&gt;(pdf) also spoke about the green and cost advantages of tape (although he only identified a 3:1 advantage), but I note two tweets from the time:&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;blockquote&gt;&lt;p class="MsoNormal"&gt; &lt;a href="http://twitter.com/dkeats"&gt;dkeats&lt;/a&gt;: Tapes degrade, data loss happens, there is risk, need to measure quality of tapes in real time, and manage risk. #pasig&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;a href="http://twitter.com/cardcc"&gt;cardcc&lt;/a&gt;: #pasig Moreira shows green &amp;amp; cost advantages of tape, but concerns on fragility &amp;amp; lifetime 5-10-30 years, but maybe only days if mistreated&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;p class="MsoNormal"&gt; Now there was an element of standard vendor pitch towards hierarchical storage management systems. But if you have very large volumes, sufficient value but relatively low re-use rate, then despite some of the significant disadvantages of tape, those numbers have to drive you towards a tape solution for preservation and archiving! It certainly will significantly increase the up-front investment cost, the data centre management requirements, and possibly some risk factors, but given sufficient volumes you should make savings overall. And this could apply to quite a bit of research data... 
&lt;/p&gt;&lt;p class="MsoNormal"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-681501861355852890?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/681501861355852890/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/online-and-offline-storage-cost-and.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/681501861355852890'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/681501861355852890'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/online-and-offline-storage-cost-and.html' title='Online and offline storage: cost and greenness at Sun PASIG'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7044342322939597258</id><published>2009-07-14T16:54:00.002+01:00</published><updated>2010-01-05T17:31:26.360Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Rendering'/><title type='text'>To render or to compute?</title><content type='html'>&lt;!--StartFragment--&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;From time to time I hear fruitless discussions about rendering documents 
versus computing data (maybe this post is another one). Person X says something like “in order to be able to preserve these documents, we need to preserve the means to render them.” But person Y says “ah, but for data rendering is not enough. I don’t want to see the data values, I need to be able to feed them into complex computations. That’s quite different.”&lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;I want to yell and shout (I don’t, but I guess my blood pressure goes up). I guess there may be some kind of essential difference here that these discussions are failing to capture. But to me it’s angels dancing on a pin. Rendering a modern web page involves computations unimaginable barely 20 years ago (other than on Evans and Sutherland high end graphics systems). You take a substantial amount of data from perhaps dozens of different sources, in different kinds of loosely standardised formats, and you project it onto a user-defined space, dependent on a wide variety of different settings; this is tough! 
Rendering a web page can also change the state of complex systems (hey, it can move hundreds of pounds from my bank account into an airline’s bank). Rendering a web page is serious computation.&lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;A significant part of rendering (one might think, THE significant feature of rendering, with the exception of the state-changing role above) is to present information for the human consumer. That’s also a key intent of computation on data. I’ll assert that with few exceptions, ALL computation is ultimately intended to render results for human consumption. You may spend a gazillion petaflops sifting through data and silently chucking out the uninteresting stuff. But sooner or later your computation finds some prima facie evidence of a possible Higgs Boson, and sure as eggs is eggs you want it to tell you. 
I reckon you’ve just rendered the result of part of your LHC experiment!&lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;So perhaps rendering involves (potentially) multiple inputs with some state change and significant human-readable output. And computation involves (potentially) multiple inputs and outputs and state changes, with eventually some human-readable output. No, I give up. Anyone got a better idea?&lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;Grump.&lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt; &lt;/span&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;p 
class="MsoNormal"&gt;&lt;span lang="EN-US"  style="font-family:TimesNewRomanPSMT;"&gt;&lt;span class="Apple-style-span" style="font-size: medium;"&gt;PS, apologies for the break in postings. I've been rather busy with a re-funding bid. I hope I can do better now! I'm way behind my targets for the period (:-(.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;  &lt;!--EndFragment--&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7044342322939597258?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7044342322939597258/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/to-render-or-to-compute.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7044342322939597258'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7044342322939597258'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/07/to-render-or-to-compute.html' title='To render or to compute?'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-215094235571873366</id><published>2009-06-10T22:48:00.006+01:00</published><updated>2010-01-05T17:29:19.266Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Digital Curation'/><category scheme='http://www.blogger.com/atom/ns#' term='Social bookmarking'/><category scheme='http://www.blogger.com/atom/ns#' term='DCC'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Web 2.0'/><title type='text'>How can Social Bookmarking tools support community resource building?</title><content type='html'>In the DCC we are trying to work out ways that we can present tools to the community that help you to help us, to help you. The most primitive example of this would be the use of email lists: having identified some issue, we ask a question on a list, and use feedback from list members to develop our response to the issue. But we want to go further; we are just not sure how.&lt;br /&gt;&lt;br /&gt;In this post I want to explore two use cases where social bookmarking tools might be helpful, and to seek advice on how to take these ideas forward. The two cases are:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;getting input from the community on curation tools and resources worth investigating&lt;/li&gt;&lt;li&gt;extending a proposed bibliography on data curation.&lt;/li&gt;&lt;/ul&gt;In the first case, we’re looking for suggestions for quality curation resources. These could be tools of various kinds, guides, policies, templates, even standards (although we have separate ideas on the latter). We currently have a &lt;a href="http://www.dcc.ac.uk/adding/resource_up"&gt;form&lt;/a&gt; on our web site for suggestions, but it’s long-winded and clunky, and we don’t get much input. What if we could use something like &lt;a href="http://delicious.com/"&gt;Delicious&lt;/a&gt;? It’s very easy to bookmark a resource with Delicious, as I’m sure you know. A couple of clicks and a few keystrokes, and you’ve bookmarked and tagged something. But how can we arrange for Delicious bookmarking to feed into a set of resources for us to review? I wondered if asking people to use a tag such as &amp;lt;DCC-suggest&amp;gt; &lt;dcc-suggest&gt; might work?&lt;br /&gt;&lt;br /&gt;In the second case, I have been building an extensive bibliography of books, articles and reports relevant to research data curation, management and preservation. 
We can load such a bibliography onto our web site in a variety of formats, including simple web pages for reading, and downloadable BibTeX, RIS or other formats. But that leaves the bibliography as a static resource, and the responsibility for maintenance and enhancement lies entirely with us. And if someone identifies a good candidate, there’s no easy way to feed it into the bibliography.&lt;br /&gt;&lt;br /&gt;Now there are a few social bookmarking sites that are specifically oriented towards managing references, including &lt;a href="http://www.connotea.org/"&gt;Connotea&lt;/a&gt;. But I can’t work out how to use them in this way. I had a go at using Connotea a year or so ago, but have largely given up because it wasn’t very good at extracting the metadata for the kinds of resources I was bookmarking (so I had to do all the work anyway), and while I could do a download once from Connotea into the reference management tool I was then beginning to use on my desktop (a commercial product I won’t name), I couldn’t work out how to do incremental downloads. I had another poke around today, and while there clearly is some way of sharing, it didn’t feel like the simple act that social networking requires. And I couldn’t see much value in other people’s tags.&lt;br /&gt;&lt;br /&gt;Today after reading an interesting article (&lt;a href="http://dx.doi.org/10.1371%2Fjournal.pcbi.1000204"&gt;Hull, Pettifer, &amp;amp; Kell, 2008&lt;/a&gt;), I experimented with &lt;a href="http://www.mendeley.com/"&gt;Mendeley&lt;/a&gt;, which looked interesting. I’m not sure it works a lot better for me, for various reasons (although the metadata extraction works a bit better), and it was hard to be convinced it would be useful for this use case, given its relatively low usage. 
I also remembered that I played with &lt;a href="http://www.citeulike.org/home"&gt;CiteULike&lt;/a&gt; a while ago; again I couldn’t quite work out how to use it as I want to, either personally, or in this use case.&lt;br /&gt;&lt;br /&gt;I’m hoping that there is some way, with one or other of these tools, to load up the bibliography, maybe tagged in some way such as &lt;data-curation&gt;&amp;lt;data-curation&amp;gt;. That might allow others to find and access these resources, download the bookmarks etc. People could also presumably upload further bookmarks and tag them with the same tag, so that adds to the resources available to others. I’m not sure what can be done in this circumstance to quality-validate these resources, so that the whole bibliography remains of appropriate quality. Any ideas?&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;Hull, D., Pettifer, S. R., &amp;amp; Kell, D. B. (2008). Defrosting the Digital Library: Bibliographic Tools for the Next Generation Web. PLoS Comput Biol, 4(10), e1000204. 
&lt;a href="http://dx.doi.org/10.1371%2Fjournal.pcbi.1000204"&gt;http://dx.doi.org/10.1371%2Fjournal.pcbi.1000204 &lt;/a&gt;&lt;/data-curation&gt;&lt;/dcc-suggest&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-215094235571873366?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/215094235571873366/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/06/how-can-social-bookmarking-tools.html#comment-form' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/215094235571873366'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/215094235571873366'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/06/how-can-social-bookmarking-tools.html' title='How can Social Bookmarking tools support community resource building?'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-8296900898319862367</id><published>2009-06-10T22:07:00.003+01:00</published><updated>2009-06-10T22:16:36.536+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Optical drives'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Re-use'/><title type='text'>Need to read a CT scan from an old Magneto-optical disk</title><content type='html'>Philipp Westebbe from Berlin wrote to me:&lt;br 
/&gt;&lt;blockquote&gt;“I am working at Rathgen Research Laboratory (National Museums in Berlin) in Germany. I have some valuable old data (computer tomography images) on MOD [Magneto-optical disc – CR] format. I am aware this is not the usual topic of the list, but we cannot think of a better place to start with. I am working on a project using an old MOD (Laser Drive Media LM1200-002) for recording computer tomography images. I'm looking for an old PC and the belonging software with which I can read this old MOD from 1992.”&lt;br /&gt;&lt;/blockquote&gt;In a later email, he wrote:&lt;br /&gt;&lt;blockquote&gt;“At the moment I am coordinating the project about the Egyptian Queen Nefertiti bust [see &lt;a href="http://en.wikipedia.org/wiki/Nefertiti"&gt;Wikipedia&lt;/a&gt;- CR]. 1992 the Charite hospital in Berlin made a CT from the bust to analyse the internal structure. The CT images were saved at that time on a MOD LM 1200-002. The data are unfortunately no longer readable, because the Charite hospital doesn't have the specific computer any more. The only things I have in my research lab are the drive and the MOD with Nefertiti's scan on it. I got over long paths to your email. I am looking for someone who has such a computer with which I can read my old MOD from Nefertiti, and for a possibility to copy this old format into a new digital format. Nevertheless, I hope to get a contact, which helps me to read the old MOD from Nefertiti. It would be very nice if the person who reads this mail could help me handling my problem, or could send the email to someone who is still working with such an old CT system. It is quite difficult to get an old MOD from 1992 back into a common digital format.”&lt;br /&gt;&lt;/blockquote&gt;In a third email, he made clear that he has one disk, on which the CT scans are stored, and that the data are regarded as very valuable, as the bust is too fragile to scan again. 
He didn’t say, but perhaps the importance (and hence the value) of these data is increased by a current &lt;a href="http://www.breitbart.com/article.php?id=CNG.257dfff07c3043c229466ad2dffefed6.a21&amp;amp;show_article=1"&gt;controversy&lt;/a&gt; over the authenticity of the bust.&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.archives.gov/preservation/technical/imaging-storage-appendix.html"&gt;Documents&lt;/a&gt; from NARA and other places show that the LM1200 disk is 12” Write-Once, Read-Many (WORM) dual-sided media, capable of holding 1 GByte each side. A &lt;a href="http://www.siemens.dk/ccmi/bu/med/download/pdf/Medicinsk_tilbehor/engelsk/Kap_4_Storage_Media.pdf"&gt;catalogue&lt;/a&gt; from Siemens identifies it as Plasmon (Philips) WORM Media. These media are reputed to have a long life expectancy.&lt;br /&gt;&lt;br /&gt;As far as I can tell, this drive will have been part of an embedded computer in the CT machine, although we don’t at this stage know the brand of the CT machine (or the type of embedded computer; many such machines of that vintage would have been “mini-computers” rather than PCs). The UK-based Plasmon PLC bought Philips LMS, the company that produced Philips 12” optical disks and drives, in 1999. It looks like &lt;a href="http://www.plasmon.com/"&gt;Plasmon&lt;/a&gt; failed at the end of last year, and was in its turn bought by Alliance Storage Technologies in the US.&lt;br /&gt;&lt;br /&gt;Their technical support points out that reading the data off the disk may not be enough, because of the proprietary formats involved at the time. “Your best bet is to contact a data migration company that has experience with the application that wrote the data or the company that wrote the software. If you have trouble reading the media because of a problem with your drive we may be able to help but there are few parts left for the drive… Sorry to provide such a bleak outlook. 
Medical companies tend to maintain legacy equipment the longest so there is some hope.”&lt;br /&gt;&lt;br /&gt;Nevertheless, before there is a chance of interpreting the data, Philipp needs to get it off the disk. Does anyone know of a contact who could read and interpret such a disk?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-8296900898319862367?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/8296900898319862367/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/06/need-to-read-ct-scan-from-old-magneto.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8296900898319862367'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/8296900898319862367'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/06/need-to-read-ct-scan-from-old-magneto.html' title='Need to read a CT scan from an old Magneto-optical disk'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-1215298155646194858</id><published>2009-06-04T19:43:00.004+01:00</published><updated>2009-06-04T19:57:23.772+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data management'/><category scheme='http://www.blogger.com/atom/ns#' term='scale'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><title type='text'>SNIA "Terminology Bridge" 
report</title><content type='html'>Quite a nice report has just been published by the Storage Networking Industry Association's Data Management Forum, called "&lt;a href="http://tinyurl.com/pp9xf2"&gt;Building a Terminology Bridge: Guidelines for Digital Information Retention and Preservation Practices in the Datacenter&lt;/a&gt;". I don't think I'd agree with everything I saw on a quick skim, but overall it looks like a good set of terminology definitions.&lt;br /&gt;&lt;br /&gt;The report identifies "two huge and urgent gaps that need to be solved. First, it is clear that digital information is at risk of being lost as current practices cannot preserve it reliably for the long-term, especially in the datacenter. Second, the explosion of the amount of information and data being kept long-term make the cost and complexity of keeping digital information and periodically migrating it prohibitive." (I'm not sure that I agree with their apocalyptic cost analysis, but it certainly deserves some serious thought!)&lt;br /&gt;&lt;br /&gt;However, while still addressing these large problems, they found that what "began as a paper focused at developing a terminology set to improve communication around the long-term preservation of digital information in the datacenter based on ILM[*]-practices, has now evolved more broadly into explaining terminology and supporting practices aimed at stimulating all information owning and managing departments in the enterprise to communicate with each other about these terms as they begin the process of implementing any governance or service management practices or projects related to retention and preservation."&lt;br /&gt;&lt;br /&gt;It's worth a read!&lt;br /&gt;&lt;br /&gt;(* ILM = Information Lifecycle Management, generally not related to the Curation Lifecycle, but oriented towards management of data on appropriate storage media, eg moving less-used data onto offline tapes, etc.)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' 
height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-1215298155646194858?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/1215298155646194858/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/06/snia-terminology-bridge-report.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1215298155646194858'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/1215298155646194858'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/06/snia-terminology-bridge-report.html' title='SNIA &quot;Terminology Bridge&quot; report'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4569098858349560133</id><published>2009-06-02T17:42:00.004+01:00</published><updated>2009-06-02T18:11:28.933+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Digital Curation'/><category scheme='http://www.blogger.com/atom/ns#' term='Data management'/><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' term='Data sharing'/><title type='text'>New JISC Research Data Management Programme</title><content type='html'>I have not posted for over a month now, due to pressure of work (bid writing, mainly), and it's curiously hard to get back into it. There are plenty of things to write about; too many, really; too hard to pick the right one. 
But now something has arrived that absolutely demands to be blogged: the JISC Research Data Programme, Data Management Infrastructure Call for Projects (&lt;a href="http://www.jisc.ac.uk/fundingopportunities/funding_calls/2009/05/grant0709.aspx"&gt;JISC 07/09&lt;/a&gt;) has been released (closing date 6 August 2009, Community Briefing 6 July 2009)!&lt;br /&gt;&lt;br /&gt;We are very excited about this. The Programme aims for 6-8 projects in English or Welsh (not Scottish or Irish, grrrr) institutions that will &lt;span style="font-weight: bold;"&gt;identify requirements to manage data&lt;/span&gt; created by researchers, and then will &lt;span style="font-weight: bold;"&gt;deploy a pilot data management infrastructure&lt;/span&gt; to address these requirements. Coupled with some projects already funded under the JISC 12/08 call (&lt;a href="http://clarionproject.wordpress.com/"&gt;CLARION&lt;/a&gt; at Cambridge, &lt;a href="http://www.kcl.ac.uk/iss/cerch/projects/portfolio/bril.html"&gt;Bril&lt;/a&gt; at King's, &lt;a href="http://eidcsr.blogspot.com/"&gt;EIDCSR&lt;/a&gt; at Oxford, &lt;a href="http://www.jisc.ac.uk/whatwedo/programmes/inf11/lifespanradar.aspx"&gt;Lifespan RADAR&lt;/a&gt; at Royal Holloway, and the &lt;a href="http://www.materialsdatacentre.com/index.html"&gt;Materials Data Centre&lt;/a&gt; at Southampton), and some work not eligible for this call, such as &lt;a href="http://datashare.edina.ac.uk/dspace/"&gt;Edinburgh DataShare&lt;/a&gt;, these projects will really begin to build experience in managing research data at institutional level (or groups of institutions) in the UK.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://www.dcc.ac.uk/"&gt;DCC&lt;/a&gt; has a key role in the Programme. 
Apart from mentions of the &lt;a href="http://www.dcc.ac.uk/docs/publications/DCCLifecycle.pdf"&gt;Curation Lifecycle&lt;/a&gt; and the &lt;a href="http://www.dcc.ac.uk/tools/daf/"&gt;Data Audit Framework&lt;/a&gt;, paragraph 30 of the Call says&lt;br /&gt;&lt;blockquote&gt;"30. Role of the Digital Curation Centre (DCC): Bidders are invited to consult with the DCC in preparing their bids. The DCC will provide general support for this strand of activities and for the programme more broadly. This will be done by contributions to programme events as well as the current channels of information, and through its principal role as a broker for expertise and advice in the management and curation of data. Projects are encouraged to engage directly with the DCC and its programme of information exchange - for example, by contributing to the Research Data Management Forum (&lt;a href="http://www.dcc.ac.uk/data-forum"&gt;RDMF&lt;/a&gt;)."&lt;br /&gt;&lt;/blockquote&gt;We are preparing to deliver on this role both in the short term (during the bid-writing phase), and later during the Programme execution. I have been very pleased to work with Simon Hodson, the JISC Programme Manager for this Call. 
I'm sure Simon does not yet realise how much influence he will be exerting over the future of research data curation in institutions in the UK!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4569098858349560133?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4569098858349560133/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/06/new-jisc-research-data-management.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4569098858349560133'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4569098858349560133'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/06/new-jisc-research-data-management.html' title='New JISC Research Data Management Programme'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7138402596235953424</id><published>2009-05-14T13:24:00.004+01:00</published><updated>2009-05-14T13:32:44.868+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='OAIS'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><title type='text'>OAIS version for public examination</title><content type='html'>Thanks to David Giaretta for the following information on the state of the revision to &lt;a href="http://public.ccsds.org/publications/archive/650x0b1.pdf"&gt;OAIS&lt;/a&gt; (I have commented &lt;a 
href="http://digitalcuration.blogspot.com/2008/12/comments-on-oais-responses-to-our.html"&gt;earlier&lt;/a&gt; on this process):&lt;br /&gt;&lt;blockquote&gt;&lt;span style="font-weight: bold;"&gt;OAIS version for public examination&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Many comments and ideas for clarifications and improvements for OAIS were received as part of its 5 year review process.&lt;br /&gt;&lt;br /&gt;These suggestions were reviewed and the proposed dispositions sent to their originators for further comment.  This draft version of OAIS contains these and many other improvements and is the candidate for submission to ISO for review. At this stage we are seeking primarily to identify errors rather than further ideas.&lt;br /&gt;&lt;br /&gt;The PDF file is available at &lt;a href="http://cwe.ccsds.org/moims/docs/MOIMS-DAI/Draft%20Documents/OAIS-candidate-V2-markup.pdf"&gt;http://cwe.ccsds.org/moims/docs/MOIMS-DAI/Draft%20Documents/OAIS-candidate-V2-markup.pdf&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Please send corrections to &lt;a href="mailto:oais-support@oais.info"&gt;oais-support@oais.info&lt;/a&gt; by 15 June 2009&lt;br /&gt;&lt;br /&gt;(NB there are some cross-reference errors which will be corrected in the final version)&lt;br /&gt;&lt;br /&gt;Shortly after this date the corrected OAIS update will be sent to ISO and in due course this will be released for international review at which point further comments may be submitted.&lt;br /&gt;&lt;br /&gt;John Garrett (chair)               David Giaretta (deputy-chair)&lt;br /&gt;DAI-WG CCSDS&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7138402596235953424?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7138402596235953424/comments/default' title='Post Comments'/><link rel='replies' 
type='text/html' href='http://digitalcuration.blogspot.com/2009/05/oais-version-for-public-examination.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7138402596235953424'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7138402596235953424'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/05/oais-version-for-public-examination.html' title='OAIS version for public examination'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-6418680687700222972</id><published>2009-05-05T20:26:00.003+01:00</published><updated>2009-05-05T20:34:37.431+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data'/><title type='text'>What are data?</title><content type='html'>Another nice blog &lt;a href="http://wwmm.ch.cam.ac.uk/blogs/murrayrust/?p=1831"&gt;post&lt;/a&gt; from Peter Murray-Rust, in his "thinking out loud before a presentation" series, from which I quote:&lt;br /&gt;&lt;blockquote&gt;"Different people would have cutoffs at different points on this hierarchy [CR: Data → Information → Knowledge → Wisdom] but I think the following are fairly common attributes of data:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;it is [sic] distinct from most prose (although some prose would be better recast as data)&lt;/li&gt;&lt;li&gt;it is generally a component of a larger information or knowledge structure&lt;/li&gt;&lt;li&gt;facts and data are closely related&lt;/li&gt;&lt;li&gt;many data are potentially reproducible or unique observations, are not opinions (though different people may produce different 
data)&lt;/li&gt;&lt;li&gt;data, as facts, are not copyrightable.&lt;/li&gt;&lt;li&gt;Collections of data and annotated data (data + metadata) may have considerably enhanced value over the individual items.&lt;/li&gt;&lt;li&gt;Data can be processed by machine&lt;/li&gt;&lt;/ul&gt;Here are some statements which provide data:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;36 26 38&lt;/li&gt;&lt;li&gt;Melting Point: 300 K&lt;/li&gt;&lt;li&gt;The reaction product was red&lt;/li&gt;&lt;li&gt;my blog page is http://wwmm.ch.cam.ac.uk/blogs/murrayrust&lt;/li&gt;&lt;/ul&gt;and here are some which are not data&lt;br /&gt;&lt;ul&gt;&lt;li&gt;her work is well respected&lt;/li&gt;&lt;li&gt;we thank Dr. XYZZY for the crystals&lt;/li&gt;&lt;li&gt;we find this reaction very difficult to perform&lt;/li&gt;&lt;/ul&gt;&lt;/blockquote&gt;I think I mostly agree with that, although it made me think quite hard, and of course at the extreme anything is data for someone (these words are mere data for Blogger or our RSS aggregator or Google, for example). This can get difficult when your mission is to support research data! 
Our advice from JISC is to recognise the potential ambiguity, be flexible in accepting others, but take our own view in what we create.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-6418680687700222972?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/6418680687700222972/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/05/what-are-data.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6418680687700222972'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/6418680687700222972'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/05/what-are-data.html' title='What are data?'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-2875665397862647930</id><published>2009-05-04T20:27:00.003+01:00</published><updated>2009-05-05T16:08:20.741+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='data curation'/><category scheme='http://www.blogger.com/atom/ns#' term='Nature'/><category scheme='http://www.blogger.com/atom/ns#' term='Science publishing'/><title type='text'>New nature corresponding author policy on data</title><content type='html'>Thanks to &lt;a href="http://www.rin.ac.uk/node/618"&gt;Michael Jubb&lt;/a&gt; for pointing me to the &lt;a 
href="http://www.nature.com/nature/journal/v458/n7242/full/4581078a.html"&gt;editorial&lt;/a&gt; outlining the new Nature policy:&lt;br /&gt;&lt;blockquote&gt;"Accordingly, we have modified the Nature journal policy on authorship, which is detailed on our website (&lt;a href="http://tinyurl.com/dkgbf8"&gt;http://tinyurl.com/dkgbf8&lt;/a&gt;). For papers submitted by collaborations, we now delineate the responsibilities of the senior members of each collaboration group on the paper. Before submitting the paper, at least one senior member from each collaborating group must take responsibility for their group's contribution. Three major responsibilities are covered: preservation of the original data on which the paper is based, verification that the figures and conclusions accurately reflect the data collected and that manipulations to images are in accordance with Nature journal guidelines (&lt;a href="http://tinyurl.com/cmmrp7"&gt;http://tinyurl.com/cmmrp7&lt;/a&gt;), and minimization of obstacles to sharing materials, data and algorithms through appropriate planning."&lt;/blockquote&gt;&lt;br /&gt;This sort of policy from respected journals is seriously good for data curation!&lt;br /&gt;&lt;br /&gt;[Updated to enable the Nature tinyurls.]&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-2875665397862647930?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/2875665397862647930/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/05/new-nature-corresponding-author-policy.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2875665397862647930'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1303975371294158246/posts/default/2875665397862647930'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/05/new-nature-corresponding-author-policy.html' title='New nature corresponding author policy on data'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-7228220780205214046</id><published>2009-04-21T17:57:00.004+01:00</published><updated>2010-01-05T17:36:34.347Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='Backup'/><title type='text'>Lazyweb: Mac backup software associated with Iomega Jaz?</title><content type='html'>In my data recovery sideline, I thought I might tackle some of my own ancient media. I have a number of Mac backups on ancient Jaz 1 GB disks, written in the period 1997-1999. Andrew Treloar of ANDS has recovered the contents of those files onto CD-ROM, so I'm not looking to read the media any more. There are 2 backups on each of 2 Jaz disks, and the CD has 4 files from each Jaz disk. The earliest two have the following names:&lt;br /&gt;&lt;blockquote&gt;FIGIT 2.971113A #001&lt;br /&gt;FIGIT 2.971113A.FULL&lt;br /&gt;&lt;/blockquote&gt;I'm guessing these were written on 13 November, 1997. Definitely from a Mac. Does anyone know what backup software wrote these? I would like to recover the contents if possible!&lt;br /&gt;&lt;br /&gt;The .FULL is about 250 KB, while the #001 appears to be around 300 MB and is presumably the actual backup!&lt;br /&gt;&lt;br /&gt;Any ideas on what the backup software might be? I thought it might be some standard software on the Iomega install. 
The install disks I still have don't read properly, but I managed to get sight of a directory for a French DOS install disk, and couldn't see any file names that looked like backup software!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-7228220780205214046?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/7228220780205214046/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/04/lazyweb-mac-backup-software-associated.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7228220780205214046'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/7228220780205214046'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/04/lazyweb-mac-backup-software-associated.html' title='Lazyweb: Mac backup software associated with Iomega Jaz?'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-3936670217829929774</id><published>2009-04-19T22:17:00.002+01:00</published><updated>2009-04-19T22:31:54.178+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='eScience'/><category scheme='http://www.blogger.com/atom/ns#' term='eResearch'/><category scheme='http://www.blogger.com/atom/ns#' term='Infrastructure'/><title type='text'>Engaging e-Science with Infrastructure</title><content type='html'>Last Friday I was at the National e-Science Centre (NeSC) for 
the second day of a &lt;a href="http://twurl.nl/znseuc"&gt;workshop&lt;/a&gt; on e-Science; unfortunately I wasn’t able to get to the first day. I tweeted most of it with the &lt;a href="http://twitter.com/"&gt;Twitter&lt;/a&gt; hashtag #eresearch, so you might learn all you need to just by searching for that hashtag (although it is used for other things, especially by some of the Australians involved in &lt;a href="http://www.ands.org.au/"&gt;ANDS&lt;/a&gt;). One thing I did like was the use of e-Science quite generally, without a specific focus on any one technology (i.e. it was not GRID-dominated!). However, that does make it a little harder to define the e-Science bit.&lt;br /&gt;&lt;br /&gt;Leif Laaksonen from Helsinki, Chair of the &lt;a href="http://www.e-irg.eu/"&gt;e-Infrastructure Reflection Group&lt;/a&gt;, gave the keynote (not sure if, or when, the slides will be made available). The e-IRG prepares white papers and other documents aiming to realise the “vision for the future […] an open e-Infrastructure enabling flexible cooperation and optimal use of all electronically available resources”. He mentioned the European Strategy Forum on Research Infrastructures (&lt;a href="http://cordis.europa.eu/esfri/"&gt;ESFRI&lt;/a&gt;); this group must be doing something right, as apparently they have 10 billion Euros to play with to build and maintain this infrastructure! He spoke quite a lot about sustainability, but it appeared that this means not paying for infrastructure through a succession of European projects, but rather through sustained funding through national governments. Hmmm. I wondered what happens when a country has a bad budget year and cuts its infrastructure project; how much of the global infrastructure can be damaged by that?&lt;br /&gt;&lt;br /&gt;Laaksonen had a number of interesting diagrams in his presentation, which can be seen from the web site. One that alarmed me (slide 17?) 
had an un-differentiated layer of “data” just above the network layer. I worry that is a dangerously simplistic summary; the data layer is far more fractured than that, with different disciplinary, sub-disciplinary, even sub-group approaches to curation.&lt;br /&gt;&lt;br /&gt;Laaksonen did show a great slide, dating from 2003, showing routes of innovation from academic research to industrial acceptance. Not a monotonic progress!&lt;br /&gt;&lt;br /&gt;This was followed by 3 quick presentations, one on the &lt;a href="http://www.grid-support.ac.uk/"&gt;National Grid Service&lt;/a&gt; (not the one that supplies electricity and gas, but the one that supplies compute and data storage resources, using, yes, GRID technologies). The second was on &lt;a href="http://www.omii.ac.uk/"&gt;OMII-UK&lt;/a&gt;, aiming to sustain community software developments, but itself, like the DCC, perhaps facing its own sustainability crisis? Finally, I gave a short presentation on the &lt;a href="http://www.dcc.ac.uk/"&gt;DCC&lt;/a&gt;, based on the one I gave to the environmental data managers a month or so ago. Then Bruce Beckles from Cambridge gave a wonderfully enthusiastic talk on being a one-man support service for e-Science within the Cambridge IT infrastructure.&lt;br /&gt;&lt;br /&gt;After an hour-long panel session (no notes as I was on the panel; there was quite a focus on data, and on education &amp;amp; training), and lunch, there was an interesting demo session in the afternoon. This was organised so that each demo was given twice, and there were 3 sessions of 20 minutes each, so you could see quite a lot. 
I took in a &lt;a href="http://taverna.sourceforge.net/"&gt;Taverna&lt;/a&gt; demo (workflows, which I've wanted to understand better for some time), found I was the sole person in the second demo on &lt;a href="http://www.nactem.ac.uk/"&gt;NaCTeM&lt;/a&gt; (which meant I could ask all the text mining questions I wanted), and then saw the &lt;a href="http://twurl.nl/km484b"&gt;e-Science Central&lt;/a&gt; demo. Disappointing that the latter invented their own workflow system, although they claim there were good reasons, and they are hoping to backstitch Taverna in later (@lescarr tweeted back to me that their site doesn't use UTF-encodings, so if you hover over their rather nice cartoon images, the captions come up all wonky!).&lt;br /&gt;&lt;br /&gt;Finally the wrap-up session, chaired by David de Roure (not as advertised). We were meant to find “just 3 things” that needed to be done to move e-Science more firmly into the national infrastructure. But of course our enthusiasm got the better of us, and we couldn’t stop. I think David and Malcolm Atkinson between them have the job of winnowing it down to the 3 top priorities. 
Altogether, a very interesting day; it’s good to see data becoming a real priority in e-Science!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-3936670217829929774?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/3936670217829929774/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/04/engaging-e-science-with-infrastructure.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3936670217829929774'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/3936670217829929774'/><link rel='alternate' type='text/html' href='http://digitalcuration.blogspot.com/2009/04/engaging-e-science-with-infrastructure.html' title='Engaging e-Science with Infrastructure'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-4727943344762596925</id><published>2009-04-19T20:43:00.011+01:00</published><updated>2010-01-05T17:26:41.532Z</updated><category scheme='http://www.blogger.com/atom/ns#' term='Data Recovery'/><category scheme='http://www.blogger.com/atom/ns#' term='Digital Preservation'/><title type='text'>Amiga disk data recovery: progress and limitations</title><content type='html'>You may remember that I have been attempting to recover files and content from various sources from 10 or more years ago. One of these was an Amiga disk. 
On the label is the note: "Dissertation 17/4/96, CV.asc, CV, 29 September 1996".&lt;br /&gt;&lt;br /&gt;I’ve described &lt;a href="http://digitalcuration.blogspot.com/2009/04/update-on-my-data-recovery-efforts.html"&gt;earlier&lt;/a&gt; some attempts to get the Catweasel controller to read the disks. After eventually figuring out how to configure the disk-reading program ImageTool3 for the Catweasel, I tried the Amiga disk. It worked fine, with, as far as I can see, zero errors. From a cursory scan of Google, I reckon this raw disk format is known as ADF, so I renamed the image file XXXAmiga.adf (.adf was one of the candidate extension names under the selected "Plain" category for the ImageTool3 program).&lt;br /&gt;&lt;br /&gt;Now, of course, we have to work out first how to extract files from the disk image, and then how to convert each particular file format into a modern-day format.&lt;br /&gt;&lt;br /&gt;Simply reading the raw disk image with Notepad on Windows or TextEdit (on my Mac) shows that there is real text there that made sense to my colleague (see below)!&lt;br /&gt;&lt;br /&gt;A comment from “Euan” on my earlier post suggested that we try the WinUAE Amiga emulator, and my colleague did that. He reported:&lt;blockquote&gt;“Success.  
I've not only got the WinUAE Amiga emulator working, but managed to find a copy of the application that I wrote my CV and dissertation in (Final Writer 5) and have been able to read the files off the disk image you sent and display them (screenshots attached).&lt;br /&gt;&lt;br /&gt;Not having any luck reading the individual files directly [CR: from his Windows system], though -- other than the odd word related to fonts and colours -- but then they are in native FW5 format.”&lt;/blockquote&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_9rv9tNJgLmI/Se2WFuvGsgI/AAAAAAAAABY/JO8nAy6CuwE/s1600-h/Dissertation_in_Final_Writer.gif"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 349px;" src="http://4.bp.blogspot.com/_9rv9tNJgLmI/Se2WFuvGsgI/AAAAAAAAABY/JO8nAy6CuwE/s400/Dissertation_in_Final_Writer.gif" alt="" id="BLOGGER_PHOTO_ID_5327078959438279170" border="0" /&gt;&lt;/a&gt;&lt;span style="font-size:78%;"&gt;Image from the dissertation seen in Final Writer&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;span style="font-size:78%;"&gt;&lt;/span&gt;&lt;div style="text-align: left;"&gt;I asked if he was able to do any "save as" operations in his emulated Final Writer program, to move files from the disk image into the Windows file store. He reported:&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;blockquote&gt;“I've tried re-saving my files in another format, if that's what you mean, but the program doesn't do anything -- I can select Save &gt; Save As... from the menu but nothing happens.  
However, I can see all my individual dissertation files from my PC as the file system is mapped onto a directory.”&lt;br /&gt;&lt;/blockquote&gt;&lt;div style="text-align: center;"&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_9rv9tNJgLmI/Se2dFih0fCI/AAAAAAAAABg/kiX4yxRRexU/s1600-h/text-abstract.gif"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 373px;" src="http://2.bp.blogspot.com/_9rv9tNJgLmI/Se2dFih0fCI/AAAAAAAAABg/kiX4yxRRexU/s400/text-abstract.gif" alt="" id="BLOGGER_PHOTO_ID_5327086652742728738" border="0" /&gt;&lt;/a&gt;&lt;span style="font-size:78%;"&gt;Raw image from the same part of the dissertation grabbed from Textedit on the Mac&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;It was remarkable how much could be read directly in the disk image!&lt;br /&gt;&lt;br /&gt;Now my colleague was able to read his CV.asc file with Notepad on Windows, but so far we have not been able to convert the dissertation to a modern format, nor to connect the Final Writer program inside the emulator to a printer. Frustratingly close, but still not quite where we would like to be. I did find a demo copy of Final Writer for Windows 95 on the WayBack machine (earliest lift of the site, also 1996), but unfortunately this wouldn't open the existing image unless we upgrade to the full-featured version... 
but the company appears to have gone bust in 1996-7 or thereabouts!&lt;br /&gt;&lt;br /&gt;So what have we learned from this?&lt;br /&gt;&lt;ul&gt;&lt;li&gt;It is possible to read a 13-year-old floppy disk from an obsolete machine with an apparently incompatible disk format, kept under conditions of less than benign neglect, using cheap hardware on a recent Windows PC.&lt;/li&gt;&lt;li&gt;It is possible to access the files from the obsolete operating system using an emulator that appears to have been written by spare-time volunteers.&lt;/li&gt;&lt;li&gt;It is possible to run the original application that created some of these files, under the emulator, and to read and process them (but not, so far, to save in another format).&lt;/li&gt;&lt;li&gt;Using the emulator is valuable, but constraining (in being unfamiliar technology, with few manuals etc) and limiting (in not, so far, being able to do much more with the files). We would now like to migrate them to a modern environment; for my colleague, this means Windows or Linux.&lt;/li&gt;&lt;/ul&gt;Fascinating!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1303975371294158246-4727943344762596925?l=digitalcuration.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://digitalcuration.blogspot.com/feeds/4727943344762596925/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://digitalcuration.blogspot.com/2009/04/amiga-disk-data-recovery-progress-and.html#comment-form' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4727943344762596925'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1303975371294158246/posts/default/4727943344762596925'/><link rel='alternate' type='text/html' 
href='http://digitalcuration.blogspot.com/2009/04/amiga-disk-data-recovery-progress-and.html' title='Amiga disk data recovery: progress and limitations'/><author><name>Chris Rusbridge</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_9rv9tNJgLmI/Se2WFuvGsgI/AAAAAAAAABY/JO8nAy6CuwE/s72-c/Dissertation_in_Final_Writer.gif' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1303975371294158246.post-8399195618070707553</id><published>2009-04-15T15:36:00.003+01:00</published><updated>2009-04-15T15:44:18.504+01:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Digital Curation'/><category scheme='http://www.blogger.com/atom/ns#' term='Research data'/><category scheme='http://www.blogger.com/atom/ns#' term='IJDC'/><category scheme='http://www.blogger.com/atom/ns#' term='Conference'/><title type='text'>5th International Digital Curation Conference: Call for Papers</title><content type='html'>The &lt;a href="http://www.dcc.ac.uk/events/dcc-2009/call-for-papers/"&gt;Call for Papers&lt;/a&gt; for the 5th International Digital Curation Conference has just been published. With the title "Moving to Multi-Scale Science: Managing Complexity and Diversity", the conference will be held in London from 2-4 December, 2009. I believe this is THE conference for papers on advances in digital and data curation! 
The text of the call follows:&lt;br /&gt;&lt;blockquote&gt;We invite submission of full papers, posters, workshops and demos and welcome contributions and participation from individuals, organisations and institutions across all disciplines and domains that are engaged in the creation, use and management of digital data, especially those involved in the challenge of curating data for e-science and e-research.&lt;br /&gt;&lt;br /&gt;Proposals will be considered for short (up to 6 pages) or long (up to 12 pages) papers and also for demonstrations, workshops and posters. The full text of papers will be peer-reviewed; abstracts for all posters, workshops and demos will be reviewed by the co-chairs. Final copy of accepted contributions will be made available to conference delegates, and papers will be published in our International Journal of Digital Curation [external]. Accordingly, we recommend that you download our template and read the advice on its use.&lt;br /&gt;&lt;br /&gt;Papers should be original and innovative, probably analytical in approach, and should present or reference significant evidence (whether experimental, observational or textual) to support their conclusions.&lt;br /&gt;&lt;br /&gt;Subject matter could be policy, strategic, operational, experimental, infrastructural, tool-based, and so on, in nature, but the key elements are originality and evidence. Layout and structure should be appropriate for the disciplinary area. Papers should not have been published in their current or a very similar form before, other than as a pre-print in a repository.&lt;br /&gt;&lt;br /&gt;We seek papers that respond to the main themes of the conference: &lt;span style="font-style: italic;"&gt;multi-scale, multi-discipline, multi-skill and multi-sector&lt;/span&gt;, and that relate to the creation, curation, management and re-use of research data. 
Research data should be interpreted broadly to include the digital subjects of all types of research and scholarship (including Arts and Humanities, and all the Sciences). Papers may cover:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Curation practice and data management at the extremes of scale (e.g. interactions between small science and big science, or extremes of object size, numbers of objects, rates of deposit and use)&lt;/li&gt;&lt;li&gt;Challenging content: (e.g. addressing issues of data complexity, diversity and granularity)&lt;/li&gt;&lt;li&gt;Curation and e-research, including contextual, provenance, authenticity and
