I have to confess that I was rather surprised when I checked the joint program of the two conferences (the 5th International Symposium on Information Management in a Changing World and the 10th International Conference on Knowledge Management) before travelling to the venue (Antalya, Turkey). Every third presentation had something about ‘data’ in its title, and even if the speakers are currently concentrating on the ‘research data’ context, the academic/librarian approach may also become highly relevant for business in the immediate future. Furthermore, don’t forget that managing scientific data is itself a sizeable market for various business players. (This was demonstrated on the spot, in a special way, by an exhibition of individually purchased and printed posters from a collection of map-like scientific infographics – Science Maps Online, Spaces&Places.)
It was a real novelty for me to encounter and understand the concerted efforts to create a new, common culture of data citation. Enormous datasets are being generated day by day in different research projects, and we should cite data in just the same way as we cite conference papers, monographs, articles, books and other sources of information (see more here). DataCite, a long-running international collaboration, was established to support the idea that citable research data can help a lot by
- enabling easy identification, reuse and verification of data
- allowing the impact of data to be tracked
- creating a scholarly structure that recognises and rewards data producers
There is no question that the ultimate, most important reason behind data citation is the pressing need for data reuse: since every piece of data has more than one affordance (a possible relation to different meaning structures and other data clusters), the unstoppably growing common/open data assets are gold mines for future data hunters of any kind.
If you are interested in existing practices, make an excursion to the Digital Curation Centre (DCC), which supports the entire UK higher education research community with research data management, or check DataONE, the most ambitious data-driven environmental research program of the US National Science Foundation (NSF). This innovative Data Observation Network for Earth (DataONE) will “ensure the preservation, access, use and reuse of multi-scale, multi-discipline, and multi-national science data”, providing a sustainable cyberinfrastructure and a distributed framework (open, persistent, robust, secure access) for all the Earth observational data collected so far.
How close we are to the universe of Linked Data, the vision of Tim Berners-Lee, the father of the World Wide Web! (Don’t miss his TED talk on it.) A standardized data citation with a DOI (Digital Object Identifier) for datasets is almost the same as Berners-Lee’s idea of giving a URI (Uniform Resource Identifier) and an HTTP name to every piece of data.
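To make the DOI/URI kinship tangible, here is a minimal sketch in Python. It assumes the public doi.org resolver, and it loosely follows the creator, year, title, publisher, identifier pattern that is commonly recommended for data citations. The `Dataset` class, the function names and the example DOI `10.1234/example-dataset` are all hypothetical, invented purely for illustration; this is not an official DataCite tool.

```python
from dataclasses import dataclass
from urllib.parse import quote

# Public DOI resolver: prefixing a DOI with it yields an HTTP URI.
DOI_RESOLVER = "https://doi.org/"


@dataclass
class Dataset:
    """Hypothetical minimal metadata record for a citable dataset."""
    creator: str
    year: int
    title: str
    publisher: str
    doi: str


def doi_to_uri(doi: str) -> str:
    """Turn a bare DOI into a resolvable HTTP URI (a Linked Data style name)."""
    # Percent-encode unusual characters but keep the prefix/suffix slash.
    return DOI_RESOLVER + quote(doi, safe="/")


def cite(ds: Dataset) -> str:
    """Format a simple data citation: Creator (Year). Title. Publisher. URI"""
    return f"{ds.creator} ({ds.year}). {ds.title}. {ds.publisher}. {doi_to_uri(ds.doi)}"


if __name__ == "__main__":
    # Entirely made-up example record.
    example = Dataset(
        creator="Doe, J.",
        year=2014,
        title="Example Observations",
        publisher="Example Data Repository",
        doi="10.1234/example-dataset",
    )
    print(cite(example))
```

The point of the sketch is simply that the identifier at the end of the citation is already a web address: the same string serves the human reader as a reference and the machine as a link.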
Do you believe it? Are you sceptical? Since the data domain is becoming more and more complex and challenging, it’s time to discuss the frontiers of a new Data Science. Gobinda Chowdhury provoked the conference with a simple question: is it a new discipline or simply an agglomeration of skills? Is there any need or reason to define a university curriculum for future data scientists? Anyway, does such a thing as Data Science exist?
My answer to this question is an emphatic yes. Ibn Khaldún, the brilliant and original thinker of the 14th century (historian, political theorist and social scientist of Classical Islam), opens his principal work with this sentence: “every subject, which has a history, has its own discipline”. But it is not simply a terminology issue: as we sense the data-shaped future, we also have to learn more and more about the 30,000-year-old history of Data Culture to support our understanding. Be ready, now and then, for short data archaeology excursions …