The SPARQL usage for mapping maintenance and reuse methodology
Citation:
MEEHAN, ALAN, The SPARQL usage for mapping maintenance and reuse methodology, Trinity College Dublin. School of Computer Science & Statistics. COMPUTER SYSTEMS, 2017

Abstract:
This thesis presents the SPARQL Usage for Mapping Maintenance and Reuse (SUMMR) methodology, which supports the maintenance and reuse of Linked Data mappings.
Providers of Linked Data datasets have made a great effort in recent years to publish more Linked Data on the web. These datasets cover a wide range of knowledge domains, and for some datasets these domains overlap. Because the datasets are developed, owned and managed in a decentralised way, they use heterogeneous vocabularies. This leads to diverse representations of overlapping vocabulary terms and data instances. Mappings can be used to overcome these heterogeneities, but creating new mappings can be a difficult and time-consuming task. Reusing existing mappings to aid in the creation of new ones is therefore appealing to dataset maintainers. In addition, datasets can change over time. These changes can affect existing mappings that reference the changed dataset, causing the mappings to become invalid and no longer produce correct results. Mapping maintenance is concerned with the discovery and repair of invalid mappings, and this is also a difficult and time-consuming task.
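As a rough illustration of how a dataset change can invalidate a mapping, the following minimal Python sketch checks a list of interlinks against a target dataset before and after a change. It is not taken from the thesis: the `Interlink` type, the `find_invalid` function, the example.org IRIs and the validity rule itself (an interlink is treated as invalid when its target instance no longer exists) are all assumptions made for this example.

```python
# Illustrative only: the names and the validity rule below are assumptions,
# not the thesis's own templates or mapping representation.
from typing import NamedTuple


class Interlink(NamedTuple):
    source: str  # IRI of an instance in the source dataset
    target: str  # IRI of an instance in the target dataset


def find_invalid(interlinks, target_iris):
    """Return the interlinks whose target IRI is absent from the target dataset."""
    return [link for link in interlinks if link.target not in target_iris]


links = [
    Interlink("http://example.org/a/Dublin", "http://example.org/b/Dublin"),
    Interlink("http://example.org/a/Cork", "http://example.org/b/Cork"),
]

# The target dataset before and after a hypothetical change that removes Cork.
before = {"http://example.org/b/Dublin", "http://example.org/b/Cork"}
after = {"http://example.org/b/Dublin"}

invalid_before = find_invalid(links, before)
invalid_after = find_invalid(links, after)
print(invalid_before)
print(invalid_after)
```

In this sketch both interlinks are valid against the original dataset, and the second one is flagged once its target has been removed, which is the kind of breakage that mapping maintenance aims to discover and repair.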
Through a systematic literature review, modelling and evaluation, the SUMMR methodology unifies mapping maintenance and mapping reuse by identifying use cases from both and combining them, along with specific tasks, into a common methodology. SUMMR provides standard SPARQL query templates for performing maintenance and reuse use cases and tasks over RDF-based mapping representations and datasets. SUMMR also provides a specialised mapping representation, named the SPARQL Centric Mapping Representation (SCMR). SCMR is designed to represent two categories of ontology mappings: (i) those used to transform data described in one Linked Data vocabulary into another, and (ii) interlinks between semantically similar instances in Linked Data datasets. The SCMR represents these categories of mappings in sufficient detail to maximise support for SUMMR templates. However, SUMMR can also be used with alternative RDF-based mapping representations.
The SUMMR methodology has been evaluated through three lab-based experiments and a case study which involved maintenance of interlink category mappings in the DBpedia dataset. The lab-based experiments provide evidence that SUMMR templates can perform all mapping maintenance and reuse tasks over mappings represented in SCMR. An additional lab-based experiment provides evidence that the SCMR is sufficiently expressive for representing ontology mappings which are concerned with the transformation of data from one vocabulary to another. The case study was performed to evaluate the usefulness of SUMMR in a real-world Linked Data dataset management situation. SUMMR was applied to the interlink management process of the DBpedia dataset through an open-source software tool, named the SUMMR Interlink Validation Tool, which provides SUMMR-based detection of invalid interlink category mappings. The tool was used to validate 1,679,634 interlink category mappings that were to be published in the v.2015-10 DBpedia dataset release, and it discovered that 53,418 of these were invalid.
The research in this thesis has yielded one major and two minor contributions. The major contribution is the design, development and evaluation of the SPARQL Usage for Mapping Maintenance and Reuse (SUMMR) methodology, which supports the maintenance and reuse of vocabulary transformation and interlink category mappings. The minor contributions are the SPARQL Centric Mapping Representation (SCMR), developed as part of SUMMR, and the SUMMR Interlink Validation Tool, an open-source software tool that implements SUMMR-based detection of invalid interlink category mappings.
Sponsor:
Science Foundation Ireland (SFI)
Author's Homepage:
http://people.tcd.ie/meehanal
Description:
APPROVED
Author: MEEHAN, ALAN
Advisor:
O'Sullivan, Declan
Publisher:
Trinity College Dublin. School of Computer Science & Statistics. Discipline of Computer Science
Type of material:
Thesis
Collections:
Availability:
Full text available
Keywords:
Mapping Reuse, Mapping Maintenance, Linked Data, SPARQL
Licences: