Friday, 14 August 2009

DCC web site and Linked Data

We at the DCC are in the early stages of refreshing our web site. Nothing you can see yet, but we're talking to a few consultants about what we can do better, and how. The ones we have spoken to so far seem pretty clued up on content management systems, and even on web 2.0 approaches. But questions about the role of the Semantic Web or Linked Data get blank looks.

Now our web site is not and will probably never be a major source of data as facts; rather it should contain resources: often documents, sometimes tools, sometimes sharing opportunities. There definitely are facts of various kinds there (which may not be sufficiently explicit yet), such as staff contact details, document metadata, event locations and times, etc. But these are a comparatively small part of the content.

Does this (or anything else) justify investment in building a web site that is based on Linked Data/Semantic Web? What advantages could we get in doing so? What advantages could our users get if we did so?

I would really like to get some views on this!


  1. As a linked data/semweb enthusiast, of course I think it would be great if you could include some RDF on the DCC site!

    The things you identify above are indeed the obvious targets for adding some RDF metadata. In addition, the list of Digital Curation tools might be a nice test case.

    I think the best approach would be to add some RDFa markup to the existing pages (or the pages you would have anyway in your new website). So you don't need to base your site on Linked Data, you can just add some in where it's appropriate. This may well involve some hand-editing of HTML, though namespace declarations etc could be built into page templates via a CMS. I'm not aware of any good HTML+RDFa editors (though there may be some) but I believe an RDFa module for Drupal is on the way.

    As you say, sites with a greater emphasis on data and facts possibly stand to gain more benefits, but nonetheless adding RDF to your site would show commitment of the DCC to best practice use of information standards ("dogfooding") and would make your material findable by RDFa search engines such as Sindice and to a certain extent Yahoo and Google - as well as of course all the unknown and various applications that re-users of the data might choose to create.

    Even adding some basic Dublin Core metadata to documents on the site, a bit of FOAF for people, and times and locations for events, would be a small but useful enhancement.

    I'll be in Edinburgh a few times over the next couple of weeks - if you are around and have time for a cup of coffee and a chat, I would be glad to meet up and talk further. (Mail me if you like.)


    Bill Roberts

  2. Re. Linked Data and semantic capabilities - as part of a local exercise I came across a number of ostensibly free utilities which offer the possibility of semantifying web-content, namely:

    Calais - ‘toolkit of capabilities that allow you to readily incorporate state-of-the-art semantic functionality within your blog, content management system, website or application.’

    OpenPublish - ‘Developed by Phase2 Technology with the support of Thomson Reuters, OpenPublish is designed to leverage the power of Drupal [Content Management System] as a social publishing platform, integrate semantic web technologies, and incorporate best practices from other publishing sites. OpenPublish features a semantic metadata engine that uses Thomson Reuter's Calais Web Service to provide contextual metadata.’

    LODr - ‘a RDF-based (re-)tagging service, that allows people to weave their Web 2.0 tagged data into the Linked Data Web and provides a dedicated browsing interface’. Other similar tools include Apture and Zemanta; they don’t add tags to the content itself, but they do extract concepts and add highly relevant related links and media.

    Triplify - a lightweight plugin for Web applications, ‘which reveals the semantic structures encoded in relational databases by making database content available as RDF, JSON or Linked Data’

    I have not used the aforementioned utilities in anger, and there will no doubt be other tools out there offering similar functionality. However, I’d be interested to hear from anyone as to their effectiveness and suitability as 'off-the-shelf' solutions.

    Stuart Macdonald

  3. Stuart - I haven't used those tools in anger either, though a lot of blogs I read now seem to be using Zemanta quite effectively to suggest relevant links.

    However, I think you can't look at solutions until you decide what it is you want to achieve, and that will involve looking at the data that the DCC wants to communicate.

  4. Bill – totally agree that decisions have to be made about the type of content that DCC wants to exploit through RDF/FOAF etc. What would be interesting (and you referred to this in your earlier posting) would be if an organisation such as DCC could make some headway in this area and feed back its experience to the wider community (e.g. technical expertise required for implementation, infrastructure, man-hour resources, community feedback and exposure). There must be a long queue of potential toe-dippers into the semantic pool looking for authoritative guidance and direction.
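
To make the suggestion in the first comment a little more concrete, here is a minimal sketch of what "adding some RDFa to page templates" might look like. It assumes a template layer that can call a small Python helper; the function names, the example.org URIs and the sample person and document are all hypothetical, and it uses RDFa 1.0 style `xmlns:` namespace declarations with the FOAF and Dublin Core Terms vocabularies. It is an illustration only, not a description of how the DCC site or any particular CMS does it.

```python
from html import escape

# Namespace URIs for the two vocabularies mentioned in the comments.
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/terms/"


def person_rdfa(uri, name, email):
    """Return an HTML fragment describing a person with FOAF properties.

    All arguments are escaped so they are safe inside attributes and text.
    """
    return (
        f'<div xmlns:foaf="{FOAF}" about="{escape(uri)}" typeof="foaf:Person">\n'
        f'  <span property="foaf:name">{escape(name)}</span>\n'
        f'  <a rel="foaf:mbox" href="mailto:{escape(email)}">{escape(email)}</a>\n'
        f'</div>'
    )


def document_rdfa(uri, title, created):
    """Return an HTML fragment carrying basic Dublin Core document metadata."""
    return (
        f'<div xmlns:dc="{DC}" about="{escape(uri)}">\n'
        f'  <span property="dc:title">{escape(title)}</span>\n'
        f'  <span property="dc:created" content="{escape(created)}">'
        f'{escape(created)}</span>\n'
        f'</div>'
    )


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    print(person_rdfa("http://example.org/staff/jane#me",
                      "Jane Example", "jane@example.org"))
    print(document_rdfa("http://example.org/docs/briefing-1",
                        "An Example Briefing Paper", "2009-08-14"))
```

The visible page content stays the same; the extra attributes are what an RDFa-aware crawler would pick up, which is why this kind of markup can be added page by page rather than requiring the whole site to be rebuilt around Linked Data.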


Please note that this blog has a Creative Commons Attribution licence, and that by posting a comment you agree to your comment being published under this licence. You must be registered to comment, but I'm turning off moderation as an experiment.