Presentation

A study of multilingual semantic data integration

Abstract

The availability of various forms of open data today offers great opportunities for meta-level research that draws on combinations of data previously considered only in isolation. There are also great challenges to be overcome: datasets may have different schemas, may employ different terminology or languages, and some data may be represented only by textual reports. Metadata and vocabularies of different kinds have the potential to help address many of these issues. Previous work explored semantic integration of English language archaeological datasets and reports (Binding et al., 2015; Tudhope et al., 2011). This presentation reflects on initial experience from a semantic integration exercise involving archaeological datasets and reports in different languages. Different forms of Knowledge Organization Systems (KOS) were key to the exercise. The Getty Art and Architecture Thesaurus (AAT) was used as the underlying value vocabulary and the CIDOC CRM ontology as the metadata element set (Isaac et al., 2011) for the semantic integration. Linked data expressions of the vocabularies formed part of an integration dataset (RDF) extracted from the source data, together with subject metadata automatically generated from the reports via Natural Language Processing (NLP) techniques. The data was selected following a broad theme of wooden material, objects and samples dated via dendrochronological analysis. The investigation was conducted as an advanced data integration case study for the ARIADNE FP7 archaeological infrastructure project (ARIADNE, 2017), with the datasets and reports provided by Dutch, English and Swedish ARIADNE project partners.
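The role of a shared value vocabulary in this kind of integration can be illustrated with a minimal sketch. It is not the project's pipeline: the records, the `example.org` URIs, the term mapping table and the helper function names are all hypothetical, the AAT identifier used for wood is an assumption that should be checked against the Getty vocabulary, and the CIDOC CRM class and property names follow a common namespace form that may differ between CRM versions. The idea shown is only that material terms in English, Dutch and Swedish are mapped to one AAT concept, so a single query over the integrated triples retrieves records from all three sources.

```python
# Minimal sketch: multilingual records integrated through one shared concept URI.
# All records, URIs under example.org, and the mapping table are hypothetical.

CRM = "http://www.cidoc-crm.org/cidoc-crm/"   # assumed CIDOC CRM namespace form
AAT = "http://vocab.getty.edu/aat/"           # Getty AAT linked data namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Assumption: this identifier denotes the AAT concept for wood; verify before use.
AAT_WOOD = AAT + "300011914"

# Local-language material terms mapped to the shared AAT concept.
TERM_TO_AAT = {
    ("en", "wood"): AAT_WOOD,
    ("nl", "hout"): AAT_WOOD,
    ("sv", "trä"): AAT_WOOD,
}

# Invented source records standing in for three partner datasets.
RECORDS = [
    {"id": "http://example.org/en/find/1", "lang": "en", "material": "wood"},
    {"id": "http://example.org/nl/vondst/7", "lang": "nl", "material": "hout"},
    {"id": "http://example.org/sv/fynd/3", "lang": "sv", "material": "trä"},
    {"id": "http://example.org/en/find/2", "lang": "en", "material": "iron"},
]


def to_triples(record):
    """Express one source record as (subject, predicate, object) triples."""
    triples = [(record["id"], RDF_TYPE, CRM + "E22_Man-Made_Object")]
    concept = TERM_TO_AAT.get((record["lang"], record["material"].lower()))
    if concept is not None:
        # Only terms with a known mapping receive a material statement.
        triples.append((record["id"], CRM + "P45_consists_of", concept))
    return triples


def integrate(records):
    """Merge the triples of all records into one set (the integration dataset)."""
    graph = set()
    for record in records:
        graph.update(to_triples(record))
    return graph


def subjects_made_of(graph, concept):
    """Return all subjects linked to the given material concept, sorted."""
    predicate = CRM + "P45_consists_of"
    return sorted(s for s, p, o in graph if p == predicate and o == concept)


graph = integrate(RECORDS)
wooden = subjects_made_of(graph, AAT_WOOD)
print(wooden)
```

In this sketch the unmapped term "iron" yields only a type statement, so the query on the wood concept returns the English, Dutch and Swedish wooden records and nothing else. A real integration would use an RDF library and the vocabularies' own linked data rather than hand-built tuples.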

Author information

Douglas Tudhope
Hypermedia Research Group, University of South Wales, United Kingdom
Ceri Binding
Hypermedia Research Group, University of South Wales, United Kingdom

Cite this article

Tudhope, D., & Binding, C. (2018). A study of multilingual semantic data integration. International Conference on Dublin Core and Metadata Applications, 2018. https://doi.org/10.23106/dcmi.952139116

DOI: 10.23106/dcmi.952139116

The metadata and citations of this article are published under the Creative Commons Zero Universal Public Domain Dedication (CC0), allowing unrestricted reuse. Anyone can freely use the metadata from DCPapers articles for any purpose without limitations.
The full text of this article is published under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license allows use, sharing, adaptation, distribution, and reproduction in any medium or format, provided that appropriate credit is given to the original author(s) and the source is cited.