"Different people would have cutoffs at different points on this hierarchy [CR: Data → Information → Knowledge → Wisdom] but I think the following are fairly common attributes of data:I think I mostly agree with that, although it made me think quite hard, and of course at the extreme anything is data for someone (these words are mere data for Blogger or our RSS aggregator or Google, for example). this can get difficult when your mission is to support research data! Our advice from JISC is to recognise the potential ambiguity, be flexible in accepting others, but take our own view in what we create.Here are some statements which provide data:
- it is [sic] distinct from most prose (although some prose would be better recast as data)
- it is generally a component of a larger information or knowledge structure
- facts and data are closely related
- many data are potentially reproducible or unique observations, are not opinions (though different people may produce different data)
- data, as facts, are not copyrightable.
- Collections of data and annotated data (data + metadata) may have considerably enhanced value over the individual items.
- Data can be processed by machine
and here are some which are not data
- 36 26 38
- Melting Point: 300 K
- The reaction product was red
- my blog page is http://wwmm.ch.cam.ac.uk/blogs/murrayrust
- her work is well respected
- we thank Dr. XYZZY for the crystals
- we find this reaction very difficult to perform
Tuesday, 5 May 2009
What are data?
Another nice blog post from Peter Murray-Rust, in his "thinking out loud before a presentation" series, from which I quote:
I agree; while the DIKW framework sounds simple at first, its application can be tendentious. Of course, it's easy to get caught up arguing edge cases, but even the basic assumptions of the model are open to question.
I don't know if you read the recent article in the Journal of Info Science on this topic, The Knowledge Pyramid: A critique of the DIKW hierarchy (Fricke, 2009), but it's a nicely related and thought-provoking piece (if not one I quite agree with).
You make a good point about how the categorization of data/info/etc. is inherently subjective, and I think you draw the right moral: the system should work for the application, not the other way around.