
    XML Matchers: approaches and challenges

    Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research for many years. In the past, it was largely investigated for classical database models (e.g., E/R schemas, relational databases, etc.). However, in recent years, the widespread adoption of XML in the most disparate application fields has pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matches between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them to DTDs/XSDs; they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs impact the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their roles, and their behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers. Comment: 34 pages, 8 tables, 7 figures
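
    The survey does not prescribe a single algorithm, but a minimal element-name matcher over two XSDs illustrates the kind of component an XML Matcher Template would describe. The sketch below is hypothetical: it uses Python's standard xml.etree.ElementTree to collect element names from two XSD files (file names are placeholders) and pairs them by a simple string-similarity threshold; real XML Matchers also exploit the hierarchical structure mentioned in the abstract.

```python
# Hypothetical sketch: name-based matching of element declarations in two XSDs.
# Real XML Matchers combine this with structural (hierarchy-aware) similarity.
import xml.etree.ElementTree as ET
from difflib import SequenceMatcher

XSD_NS = "{http://www.w3.org/2001/XMLSchema}"

def element_names(xsd_path):
    """Collect the names of all xs:element declarations in an XSD file."""
    tree = ET.parse(xsd_path)
    return {el.get("name")
            for el in tree.iter(f"{XSD_NS}element")
            if el.get("name")}

def match_schemas(xsd_a, xsd_b, threshold=0.8):
    """Return (name_a, name_b, score) pairs whose lexical similarity exceeds threshold."""
    names_a, names_b = element_names(xsd_a), element_names(xsd_b)
    matches = []
    for a in names_a:
        for b in names_b:
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score >= threshold:
                matches.append((a, b, round(score, 2)))
    return sorted(matches, key=lambda m: -m[2])

# Example (file names are placeholders):
# print(match_schemas("orders_v1.xsd", "orders_v2.xsd"))
```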

    A Semantic Approach to Integrating XML Schemas Using Domain Ontologies

    XML documents often conform to different schemas, even within the same application domain. To support interoperability among different IT systems, this paper proposes a method for integrating XML schemas. The proposed method determines synonym, hypernym, and holonym relationships among XML elements and attributes by using domain ontologies as well as general dictionaries. The method also takes the structural information of elements and attributes into account, and the conciseness of the integrated schema is considered as well. Experimental results with a variety of schemas show that using a domain ontology together with structural information improves the performance of schema integration
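
    The abstract does not name the dictionaries or ontologies it uses; as an illustrative stand-in, the sketch below uses WordNet (via NLTK, assuming the WordNet corpus is downloaded) to test whether two XML element names are synonyms, or stand in a hypernym or holonym relationship. The choice of WordNet and the simple lookup strategy are assumptions for illustration, not the paper's actual ontology machinery.

```python
# Illustrative sketch (assumes NLTK with the WordNet corpus downloaded):
# classify the lexical relationship between two XML element/attribute names.
from nltk.corpus import wordnet as wn

def lexical_relationship(name_a, name_b):
    """Return 'synonym', 'hypernym', 'holonym', or None for two element names."""
    syns_a, syns_b = wn.synsets(name_a), wn.synsets(name_b)
    set_b = set(syns_b)
    for sa in syns_a:
        if sa in set_b:
            return "synonym"                      # shared synset
        if set_b & set(sa.hypernyms()):
            return "hypernym"                     # name_b generalizes name_a
        holonyms = sa.part_holonyms() + sa.member_holonyms()
        if set_b & set(holonyms):
            return "holonym"                      # name_a is part/member of name_b
    return None

# Example: lexical_relationship("author", "writer") -> "synonym"
```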

    Multimodality Data Integration in Epilepsy

    An important goal of software development in the medical field is the design of methods that can integrate information obtained from various imaging and nonimaging modalities into a cohesive framework, so that the results of qualitatively different measurements can be understood in a larger context. Moreover, it is essential to assess the various features of the data quantitatively so that relationships between complementary modalities in the anatomical and functional domains can be expressed mathematically. This paper presents a clinically feasible software environment for the quantitative assessment of the relationship between biochemical function, as assessed by PET imaging, and electrophysiological parameters derived from intracranial EEG. Based on the developed software tools, quantitative results obtained from individual modalities can be merged into a data structure that provides a consistent framework for advanced data mining techniques and 3D visualization. Moreover, an effort was made to derive quantitative variables (such as the spatial proximity index, SPI) characterizing the relationship between complementary modalities at a more generic level as a prerequisite for efficient data mining strategies. We describe the implementation of this software environment in twelve children (mean age 5.2 ± 4.3 years) with medically intractable partial epilepsy who underwent both high-resolution structural MR and functional PET imaging. Our experiments demonstrate that our approach will lead to a better understanding of the mechanisms of epileptogenesis and might ultimately have an impact on treatment. Moreover, our software environment holds promise to be useful in many other neurological disorders where integration of multimodality data is crucial for a better understanding of the underlying disease mechanisms
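
    The abstract mentions a spatial proximity index (SPI) only by name; the sketch below is a hypothetical, simplified reading of such an index, computed as the mean distance from each PET-defined abnormality coordinate to its nearest intracranial EEG electrode, with all coordinates assumed to be co-registered in millimetres. The paper's actual SPI definition may differ.

```python
# Hypothetical simplification of a spatial proximity index (SPI):
# for each PET-defined abnormality location, distance to the nearest iEEG electrode.
import numpy as np

def spatial_proximity_index(pet_coords, eeg_coords):
    """pet_coords: (N, 3) and eeg_coords: (M, 3) arrays of co-registered mm coordinates.
    Returns the mean nearest-electrode distance over all PET abnormality points."""
    pet = np.asarray(pet_coords, dtype=float)
    eeg = np.asarray(eeg_coords, dtype=float)
    # Pairwise distances between every PET point and every electrode.
    d = np.linalg.norm(pet[:, None, :] - eeg[None, :, :], axis=-1)
    return float(d.min(axis=1).mean())

# Example with made-up coordinates (mm):
# spi = spatial_proximity_index([[10, 20, 30]], [[12, 21, 33], [40, 40, 40]])
```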

    A decision support system for corporations cyber security risk management

    This thesis presents a decision-aiding system named C3-SEC (Context-aware Corporative Cyber Security), developed in the context of a master program at the Polytechnic Institute of Leiria, Portugal. The research dimension and the corresponding software development process are presented and validated with an application scenario and a case study performed at Universidad de las Fuerzas Armadas ESPE, Ecuador. C3-SEC is decision-aiding software intended to support the analysis of cyber risks and cyber threats to a corporate information and communications technology infrastructure. The resulting software product will help corporations' Chief Information Security Officers (CISOs) with cyber security risk analysis, decision-making, and prevention measures for the protection of infrastructure and information assets. The work initially focuses on the evaluation of the most popular and relevant tools available for risk assessment and decision-making in the cyber security domain. Their properties, metrics, and strategies are studied, and their support for cyber security risk analysis, decision-making, and prevention is assessed with respect to the protection of an organization's information assets. A contribution to the decision support of cyber security experts is then proposed by means of the reuse and integration of existing tools with the C3-SEC software. C3-SEC extends existing tools from the data collection and data analysis (perception) level to a full context-aware reference model. The software makes use of semantic-level, ontology-based knowledge representation and inference supported by widely adopted standards, including cyber security standards (CVE, CPE, CVSS, etc.) and cyber security information sources made available by international authorities, to share and exchange information in this domain. C3-SEC development follows a context-aware systems reference model addressing the perception, comprehension, projection, and decision/action layers to create corporate-scale cyber security situation awareness
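
    The thesis mentions CVSS scores and CVE/CPE identifiers only at a high level; the short sketch below is an assumed illustration of how asset-level risk could be aggregated from published CVSS base scores. The asset names, criticality weights, and the aggregation rule are all hypothetical, not C3-SEC's actual model.

```python
# Hypothetical aggregation of asset risk from CVSS v3 base scores (0.0-10.0).
# Asset criticality weights and the aggregation rule are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Asset:
    name: str
    criticality: float          # 0.0 (low) .. 1.0 (mission critical)
    cvss_scores: list           # base scores of CVEs matched to this asset (via CPE)

def asset_risk(asset: Asset) -> float:
    """Blend the worst and the average vulnerability severity, scaled by criticality."""
    if not asset.cvss_scores:
        return 0.0
    worst = max(asset.cvss_scores)
    avg = sum(asset.cvss_scores) / len(asset.cvss_scores)
    return round(asset.criticality * (0.7 * worst + 0.3 * avg), 2)

web_server = Asset("public-web-01", criticality=0.9, cvss_scores=[9.8, 7.5, 5.3])
print(asset_risk(web_server))   # 8.21 on a 0-10 scale
```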

    Validation Framework for RDF-based Constraint Languages

    In this thesis, a validation framework is introduced that enables the consistent execution of RDF-based constraint languages on RDF data and the formulation of constraints of any type. The framework reduces the representation of constraints to the absolute minimum, is based on formal logic, and consists of a small, lightweight vocabulary; it ensures consistent validation results and enables constraint transformations for each constraint type across RDF-based constraint languages
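
    The thesis defines its own framework rather than a specific tool; purely as a general illustration of validating RDF data against declaratively expressed constraints, the sketch below uses SHACL through the pySHACL library. The shapes, the sample data, and the choice of SHACL are assumptions for illustration, not the framework described in the thesis.

```python
# Illustration only: SHACL validation of RDF data with pySHACL,
# standing in for the general idea of executing constraints on RDF data.
from rdflib import Graph
from pyshacl import validate

data_ttl = """
@prefix ex: <http://example.org/> .
ex:book1 ex:title "Validation Framework" .
ex:book2 ex:pages 220 .
"""

shapes_ttl = """
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/> .
ex:BookShape a sh:NodeShape ;
    sh:targetSubjectsOf ex:pages ;
    sh:property [ sh:path ex:title ; sh:minCount 1 ] .
"""

data_graph = Graph().parse(data=data_ttl, format="turtle")
shapes_graph = Graph().parse(data=shapes_ttl, format="turtle")

conforms, _, report_text = validate(data_graph, shacl_graph=shapes_graph)
print(conforms)        # False: ex:book2 has pages but no title
print(report_text)     # human-readable validation report
```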

    Building an XML document warehouse

    Data Warehouses and OLAP (On-Line Analytical Processing) technologies are dedicated to analyzing structured data issued from organizations' OLTP (On-Line Transaction Processing) systems. Furthermore, in order to enhance their decision support systems, these organizations need to explore XML (eXtensible Markup Language) documents as an additional and important source of unstructured data. In this context, this paper addresses the warehousing of document-centric XML documents. More specifically, we propose a two-method approach to building Document Warehouse conceptual schemas. The first method is for the unification of XML document structures; it aims to elaborate a global and generic view for a set of XML documents belonging to the same domain. The second method is for designing multidimensional galaxy schemas for Document Warehouses
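
    The unification method itself is not detailed in the abstract; as a hypothetical first step, the sketch below merges the element paths observed in several document-centric XML files into one summary structure, a crude stand-in for the "global and generic view" the paper builds. The file names and the path-union strategy are assumptions.

```python
# Hypothetical sketch: union the element paths of several XML documents
# into a single generic structure (a crude "global view" of the collection).
import xml.etree.ElementTree as ET
from collections import defaultdict

def element_paths(xml_path):
    """Return every root-to-element path (e.g. 'article/section/title') in a document."""
    def walk(node, prefix):
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        yield path
        for child in node:
            yield from walk(child, path)
    return set(walk(ET.parse(xml_path).getroot(), ""))

def unified_structure(xml_paths):
    """Count in how many documents each element path occurs."""
    counts = defaultdict(int)
    for p in xml_paths:
        for path in element_paths(p):
            counts[path] += 1
    return dict(sorted(counts.items()))

# Example (placeholder file names):
# print(unified_structure(["paper1.xml", "paper2.xml", "report3.xml"]))
```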

    BlogForever D3.2: Interoperability Prospects

    This report evaluates the interoperability prospects of the BlogForever platform. To this end, existing interoperability models are reviewed, a Delphi study to identify crucial aspects for the interoperability of web archives and digital libraries is conducted, technical interoperability standards and protocols are reviewed regarding their relevance for BlogForever, a simple approach to considering interoperability in specific usage scenarios is proposed, and a tangible approach to developing a succession plan that would allow a reliable transfer of content from the current digital archive to other digital repositories is presented

    A geo-database for potentially polluting marine sites and associated risk index

    The increasing availability of geospatial marine data provides an opportunity for hydrographic offices to contribute to the identification of Potentially Polluting Marine Sites (PPMS). To adequately manage these sites, a PPMS Geospatial Database (GeoDB) application was developed to collect and store relevant information suitable for site inventory and geo-spatial analysis. The benefits of structuring the data to conform to the Universal Hydrographic Data Model (IHO S-100) and of using the Geography Markup Language (GML) for encoding are presented. A storage solution is proposed using a GML-enabled spatial relational database management system (RDBMS). In addition, an example of a risk index methodology is provided based on the defined data structure. This example was implemented as scripts containing SQL statements, executed by a cross-platform C++ application, called PPMS GeoDB Manager, built on open-source libraries
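
    The paper's risk index formula and database schema are not given in the abstract; the sketch below is an assumed toy version in which per-site risk is a weighted sum of hazard and exposure factors computed with a SQL statement. The table name, columns, and weights are all hypothetical.

```python
# Hypothetical toy risk index over a site inventory table, executed as SQL.
# Table name, columns, and the weighting are illustrative assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ppms_site (
        site_id TEXT PRIMARY KEY,
        cargo_hazard REAL,      -- 0..1, e.g. fuel oil remaining
        hull_condition REAL,    -- 0..1, 1 = severely degraded
        env_sensitivity REAL    -- 0..1, sensitivity of surrounding waters
    )""")
conn.executemany(
    "INSERT INTO ppms_site VALUES (?, ?, ?, ?)",
    [("wreck-001", 0.8, 0.6, 0.9), ("wreck-002", 0.2, 0.3, 0.4)],
)

rows = conn.execute("""
    SELECT site_id,
           ROUND(0.5 * cargo_hazard + 0.3 * hull_condition + 0.2 * env_sensitivity, 2)
               AS risk_index
    FROM ppms_site
    ORDER BY risk_index DESC
""").fetchall()
print(rows)   # [('wreck-001', 0.76), ('wreck-002', 0.27)]
```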

    A conceptual framework and a risk management approach for interoperability between geospatial datacubes

    Today, we observe wide use of geospatial databases that are implemented in many forms (e.g., transactional centralized systems, distributed databases, multidimensional datacubes). Among those possibilities, the multidimensional datacube is more appropriate to support interactive analysis and to guide an organization's strategic decisions, especially when different epochs and levels of information granularity are involved. However, one may need to use several geospatial multidimensional datacubes, which may be semantically heterogeneous and have different degrees of appropriateness to the context of use. Overcoming the semantic problems related to this heterogeneity and to the difference in appropriateness to the context of use, in a manner that is transparent to users, has been the principal aim of interoperability for the last fifteen years. However, in spite of successful initiatives, today's solutions have evolved in a non-systematic way. Moreover, no solution has been found to address the specific semantic problems related to interoperability between geospatial datacubes. In this thesis, we suppose that it is possible to define an approach that addresses these semantic problems to support interoperability between geospatial datacubes. To that end, we first define interoperability between geospatial datacubes. Then, we define and categorize the semantic heterogeneity problems that may occur during the interoperability process of different geospatial datacubes. In order to resolve semantic heterogeneity between geospatial datacubes, we propose a conceptual framework that is essentially based on human communication. In this framework, software agents representing the geospatial datacubes involved in the interoperability process communicate with each other. Such communication aims at exchanging information about the content of the geospatial datacubes. Then, in order to help the agents make appropriate decisions during the interoperability process, we evaluate a set of indicators of the external quality (fitness-for-use) of geospatial datacube schemas and of the production context (e.g., metadata). Finally, we implement the proposed approach to show its feasibility
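
    The thesis's actual quality indicators are not enumerated in the abstract; the sketch below is an assumed illustration of how an agent might score a datacube's fitness for use from a few metadata-derived indicators before deciding whether to proceed with interoperability. The indicator names, weights, and threshold are hypothetical.

```python
# Hypothetical fitness-for-use score an agent could compute from datacube metadata.
# Indicator names, weights, and the acceptance threshold are illustrative assumptions.

INDICATOR_WEIGHTS = {
    "metadata_completeness": 0.4,   # share of required metadata fields present
    "temporal_coverage": 0.3,       # overlap with the epochs needed by the user
    "granularity_match": 0.3,       # how well dimension levels match the analysis
}

def fitness_for_use(indicators, weights=INDICATOR_WEIGHTS):
    """Weighted average of indicator values in [0, 1]; higher means more appropriate."""
    return round(sum(weights[k] * indicators.get(k, 0.0) for k in weights), 2)

def should_interoperate(indicators, threshold=0.6):
    """Decision an agent might take before exchanging schema content."""
    return fitness_for_use(indicators) >= threshold

candidate_cube = {"metadata_completeness": 0.9, "temporal_coverage": 0.5, "granularity_match": 0.7}
print(fitness_for_use(candidate_cube), should_interoperate(candidate_cube))  # 0.72 True
```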