The thesaurus project of the Geological Survey of Austria (GBA) is motivated by the growing need for a uniform description (i.e. the semantics behind the language) of our geospatial data products, which should increase their value and reusability for our stakeholders. The main driving force behind this project is the INSPIRE Directive of the European Parliament and of the Council, which aims to establish an infrastructure for spatial information in Europe to support Community environmental policies, and policies or activities which may have an impact on the environment. As a public authority, the Geological Survey of Austria is legally obliged to implement this directive for the data themes geology and mineral resources. To implement it, we need an agreed standard to build upon, both in terms of technical interoperability and as a semantic framework for a knowledge organisation system. For users at the institutional level, the most important aspect is probably that the project produces an agreed controlled vocabulary that can be used as a reference work. The same probably holds for external users, who ask for a standard controlled vocabulary that they can refer to. In addition, our data should then be readable by non-experts too.
After the public launch of our thesaurus project, we are now building use cases for applications with both in-house and external users, e.g. for semantic web map applications. Our aim is to encourage professionals and other interested parties to use our vocabulary as a chance to encode, define and interlink geoscientific information. One of our new projects based on the vocabulary hosted in our thesaurus is the Geology Data Viewer. In the future, this data viewer will provide visual access to all semantically harmonized geodata of Austria at a scale of 1:50 000.
Christine Hörfarter studied earth sciences at the University of Vienna and obtained her Master’s degree in petrology. She joined the Geological Survey of Austria (GBA) in March 2010 and is responsible for the coordination and modeling of geoscientific data. Christine is the co-editor of the GBA-Thesaurus and the contact person for INSPIRE implementation.
It is widely accepted that controlling metadata makes it easier to publish high-quality data on the web. Metadata, in the context of Linked Data, refers to the vocabularies and ontologies used for describing data. As more and more data is published on the web, reusing controlled taxonomies and vocabularies is becoming a necessity. Catalogues of vocabularies are generally the starting point for finding vocabularies based on search terms. Some recent studies recommend reusing terms from "popular" vocabularies. However, there is not yet an agreement on what makes a vocabulary popular, since this depends on diverse criteria such as the number of properties, the number of datasets using part or all of the vocabulary, etc. In this paper, we propose a method for ranking vocabularies based on an information content metric which combines three features: (i) the datasets using the vocabulary, (ii) the outlinks from the vocabulary and (iii) the inlinks to the vocabulary. We applied this method to 366 vocabularies described in the LOV catalogue. The results are then compared with other catalogues which provide alternative rankings.
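The abstract does not state the metric itself, so the following is only a minimal sketch of how three such counts could be combined into one ranking. It assumes a stand-in score, the weighted sum of log2(1 + count) over the three features, which is not the information content metric of the paper. The names `VocabStats`, `score` and `rank`, the weights, and the sample counts are all hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabStats:
    """Per-vocabulary counts; all values used below are made-up examples."""
    prefix: str    # hypothetical vocabulary identifier
    datasets: int  # (i) number of datasets using the vocabulary
    outlinks: int  # (ii) number of links from the vocabulary to others
    inlinks: int   # (iii) number of links from other vocabularies to it


def score(v, weights=(1.0, 1.0, 1.0)):
    """Stand-in score (an assumption, not the paper's formula):
    weighted sum of log2(1 + count) over the three features."""
    counts = (v.datasets, v.outlinks, v.inlinks)
    return sum(w * math.log2(1 + c) for w, c in zip(weights, counts))


def rank(vocabs, weights=(1.0, 1.0, 1.0)):
    """Return the vocabularies sorted from highest to lowest score."""
    return sorted(vocabs, key=lambda v: score(v, weights), reverse=True)


# Hypothetical sample data, not taken from the LOV catalogue.
sample = [
    VocabStats("vocab-a", datasets=7, outlinks=3, inlinks=15),
    VocabStats("vocab-b", datasets=1, outlinks=0, inlinks=1),
    VocabStats("vocab-c", datasets=3, outlinks=7, inlinks=0),
]

ranking = [v.prefix for v in rank(sample)]
print(ranking)
```

The logarithm dampens very large counts so that no single feature dominates, and the weights make the relative importance of the three features explicit. Both are design choices of this sketch rather than something the abstract specifies.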
EURECOM