Search CORE

3,383 research outputs found

Mapping Large Scale Research Metadata to Linked Data: A Performance Comparison of HBase, CSV and XML

Author: Huang Jyun-Yao
Karim Farah
Lange Christoph
Vahdati Sahar
Publication venue
Publication date: 01/01/2015
Field of study

OpenAIRE, the Open Access Infrastructure for Research in Europe, comprises a database of all EC FP7 and H2020 funded research projects, including metadata of their results (publications and datasets). These data are stored in an HBase NoSQL database, post-processed, and exposed as HTML for human consumption, and as XML through a web service interface. As an intermediate format to facilitate statistical computations, CSV is generated internally. To interlink the OpenAIRE data with related data on the Web, we aim at exporting them as Linked Open Data (LOD). The LOD export is required to integrate into the overall data processing workflow, where derived data are regenerated from the base data every day. We thus faced the challenge of identifying the best-performing conversion approach.We evaluated the performances of creating LOD by a MapReduce job on top of HBase, by mapping the intermediate CSV files, and by mapping the XML output.Comment: Accepted in 0th Metadata and Semantics Research Conferenc

arXiv.org e-Print Archive

Fraunhofer-ePrints

The DIGMAP geo-temporal web gazetteer service

Author: Borbinha José
Manguinhas H.
Martins Bruno
Siabato Vaca Willington Libardo
Publication venue: E.T.S.I. en Topografía, Geodesia y Cartografía (UPM)
Publication date: 01/01/2009
Field of study

This paper presents the DIGMAP geo-temporal Web gazetteer service, a system providing access to names of places, historical periods, and associated geo-temporal information. Within the DIGMAP project, this gazetteer serves as the unified repository of geographic and temporal information, assisting in the recognition and disambiguation of geo-temporal expressions over text, as well as in resource searching and indexing. We describe the data integration methodology, the handling of temporal information and some of the applications that use the gazetteer. Initial evaluation results show that the proposed system can adequately support several tasks related to geo-temporal information extraction and retrieval

Archivo Digital UPM

Metamodel Instance Generation: A systematic literature review

Author: Monahan Rosemary
Power James F.
Wu Hao
Publication venue
Publication date: 01/01/2012
Field of study

Modelling and thus metamodelling have become increasingly important in Software Engineering through the use of Model Driven Engineering. In this paper we present a systematic literature review of instance generation techniques for metamodels, i.e. the process of automatically generating models from a given metamodel. We start by presenting a set of research questions that our review is intended to answer. We then identify the main topics that are related to metamodel instance generation techniques, and use these to initiate our literature search. This search resulted in the identification of 34 key papers in the area, and each of these is reviewed here and discussed in detail. The outcome is that we are able to identify a knowledge gap in this field, and we offer suggestions as to some potential directions for future research.Comment: 25 page

arXiv.org e-Print Archive

CiteSeerX

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

XML Matchers: approaches and challenges

Author: Agreste Santa
De Meo Pasquale
Ferrara Emilio
Ursino Domenico
Publication venue: 'Elsevier BV'
Publication date: 10/07/2014
Field of study

Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research areas for many years. In the past, it was largely investigated especially for classical database models (e.g., E/R schemas, relational databases, etc.). However, in the latest years, the widespread adoption of XML in the most disparate application fields pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them on DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs impact on the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear as unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.Comment: 34 pages, 8 tables, 7 figure

arXiv.org e-Print Archive

IRIS UniversitÃ Politecnica delle Marche

Bridging the XML-relational divide with LegoDB: a demonstration

Author: Bohannon Philip
Freire Juliana
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2003
Field of study

Journal ArticleWe present LegoDB, a cost-based XML storage mapping engine that automatically explores a space of possible XML-to-relational mappings and selects an efficient mapping for a given application

The University of Utah: J. Willard Marriott Digital Library

Terminology server for improved resource discovery: analysis of model and functions

Author: Macgregor G.
McCulloch E.
Nicholson D.
Publication venue
Publication date: 12/10/2007
Field of study

This paper considers the potential to improve distributed information retrieval via a terminologies server. The restriction upon effective resource discovery caused by the use of disparate terminologies across services and collections is outlined, before considering a DDC spine based approach involving inter-scheme mapping as a possible solution. The developing HILT model is discussed alongside other existing models and alternative approaches to solving the terminologies problem. Results from the current HILT pilot are presented to illustrate functionality and suggestions are made for further research and development

University of Strathclyde Institutional Repository