Sunday, 19 April 2009

Engaging e-Science with Infrastructure

Last Friday I was at the National e-Science Centre (NeSC) for the second day of a workshop on e-Science; unfortunately I wasn’t able to get to the first day. I tweeted most of it with the Twitter hashtag #eresearch, so you might learn all you need to just by searching for that hashtag (although it is used for other things, especially by some of the Australians involved in ANDS). One thing I did like was the use of e-Science quite generally, without a specific focus on any one technology (i.e. it was not GRID-dominated!). However, that does make it a little harder to define the e-Science bit.

Leif Laaksonen from Helsinki, Chair of the e-Infrastructure Reflection Group (e-IRG), gave the keynote (not sure if, or when, the slides will be made available). The e-IRG prepares white papers and other documents aiming to realise the “vision for the future […] an open e-Infrastructure enabling flexible cooperation and optimal use of all electronically available resources”. He mentioned the European Strategy Forum on Research Infrastructures (ESFRI); this group must be doing something right, as apparently they have 10 billion Euros to play with to build and maintain this infrastructure! He spoke quite a lot about sustainability, but it appeared that this means paying for infrastructure not through a succession of European projects, but rather through sustained funding from national governments. Hmmm. I wondered what happens when a country has a bad budget year and cuts its infrastructure project; how much of the global infrastructure can be damaged by that?

Laaksonen had a number of interesting diagrams in his presentation, which can be seen from the web site. One that alarmed me (slide 17?) had an undifferentiated layer of “data” just above the network layer. I worry that this is a dangerously simplistic summary; the data layer is far more fractured than that, with disciplinary, sub-disciplinary, even sub-group differences in approach to curation.

Laaksonen did show a great slide, dating from 2003, showing routes of innovation from academic research to industrial acceptance. Not a monotonic progress!

This was followed by 3 quick presentations. The first was on the National Grid Service (not the one that supplies electricity and gas, but the one that supplies compute and data storage resources, using, yes, GRID technologies). The second was on OMII-UK, aiming to sustain community software developments, but itself, like the DCC, perhaps facing its own sustainability crisis? Finally, I gave a short presentation on the DCC, based on the one I gave to the environmental data managers a month or so ago. Then Bruce Beckles from Cambridge gave a wonderfully enthusiastic talk on being a one-man support service for e-Science within the Cambridge IT infrastructure.

After an hour-long panel session (no notes as I was on the panel; there was quite a focus on data, and on education & training), and lunch, there was an interesting demo session in the afternoon. This was organised so that each demo was given twice, and there were 3 sessions of 20 minutes each, so you could see quite a lot. I took in a Taverna demo (workflows, which I've wanted to understand better for some time), found I was the sole person in the second demo on NaCTeM (which meant I could ask all the text mining questions I wanted), and then saw the e-Science Central demo. Disappointing that the latter invented their own workflow system, although they claim there were good reasons, and they are hoping to backstitch Taverna in later (@lescarr tweeted back to me that their site doesn't use UTF-encodings, so if you hover over their rather nice cartoon images, the captions come up all wonky!).

Finally the wrap-up session, chaired by David de Roure (not as advertised). We were meant to find “just 3 things” that needed to be done to move e-Science more firmly into the national infrastructure. But of course our enthusiasm got the better of us, and we couldn’t stop. I think David and Malcolm Atkinson between them have the job of winnowing it down to the 3 top priorities. Altogether, a very interesting day; it’s good to see data becoming a real priority in e-Science!



Please note that this blog has a Creative Commons Attribution licence, and that by posting a comment you agree to your comment being published under this licence. You must be registered to comment, but I'm turning off moderation as an experiment.