15 research outputs found

    Supporting Tools for Automated Generation and Visual Editing of Relational-to-Ontology Mappings

    Integrating data from heterogeneous formats and domains with Semantic Web technologies makes it possible to resolve their structural and semantic heterogeneity. Ontology-based data access (OBDA) is a comprehensive solution that relies on ontologies as mediator schemas and on relational-to-ontology mappings to facilitate data source querying. However, one of the greatest obstacles to the adoption of OBDA is the lack of tools to support the creation of mappings between physically stored data and ontologies. The objective of this research has been to develop new tools that allow non-ontology experts to create relational-to-ontology mappings. For this purpose, two lines of work have been carried out: the automated generation of relational-to-ontology mappings, and visual support for mapping editing. The tools currently available to automate the generation of mappings are far from providing a complete solution, since they rely on relational schemas and barely take into account the contents of the relational data source and the features of the ontology. However, the data may contain hidden relationships that can help in the process of mapping generation. To overcome this limitation, we have developed AutoMap4OBDA, a system that automatically generates R2RML mappings by analysing the contents of the relational source while taking into account the characteristics of the ontology. The system employs an ontology learning technique to infer class hierarchies, selects the string similarity metric based on the labels of the ontology, and analyses graph structures to generate the mappings from the structure of the ontology. Visual representation through intuitive interfaces can help non-technical users to establish mappings between a relational source and an ontology. However, existing tools for the visual editing of mappings show some limitations. In particular, their visual representation of mappings does not cover the structure of the relational source and the structure of the ontology at the same time. To overcome this problem, we have developed Map-On, a visual web environment for the manual editing of mappings. AutoMap4OBDA has been shown to outperform existing solutions in the generation of mappings. Map-On has been applied in research projects to verify its effectiveness in managing mappings.
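The abstract above mentions matching on ontology labels with a string similarity metric. The following is a minimal sketch of that general idea only, not AutoMap4OBDA's actual procedure: it pairs column names with ontology labels using the ratio from Python's standard `difflib.SequenceMatcher`. The column names, labels, the helper names `normalize`, `similarity`, and `best_matches`, and the 0.6 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lower-case a name and treat underscores and hyphens as spaces."""
    return name.replace("_", " ").replace("-", " ").lower().strip()


def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] from difflib; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_matches(columns, labels, threshold=0.6):
    """Map each column to its most similar ontology label, if above threshold."""
    matches = {}
    for column in columns:
        score, label = max((similarity(column, label), label) for label in labels)
        if score >= threshold:
            matches[column] = (label, round(score, 2))
    return matches


# Hypothetical column names and ontology labels, for illustration only.
columns = ["first_name", "birth_date", "dept_id"]
labels = ["first name", "date of birth", "department", "salary"]
print(best_matches(columns, labels))
```

A real mapping generator would combine such scores with evidence from the data and the ontology structure, as the abstract describes; a fixed threshold on one metric is only a starting point.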

    Data Integration Driven Ontology Design, Case Study Smart City

    Methods for designing formal ontologies have been a focus of research since the early nineties, when their importance and potential practical application in the engineering sciences were first understood. However, generic methodologies often require significant customization when they are applied in concrete scenarios. In this paper, we present a methodology for ontology design developed in the context of data integration. In this scenario, a target ontology is applied as a mediator for the distinct schemas of individual data sources and, furthermore, as a reference schema for federated data queries. The methodology has been used and evaluated in a case study aimed at integrating data related to buildings' energy use and carbon emissions. We claim that we have made the design process much more efficient and that there is a high potential to reuse the methodology.

    Relational Database information availability to Semantic Web technologies

    The thesis examines possible ways to connect Semantic Web technologies and relational databases. The RDB2OWL language is developed, which enables mappings between elements of an OWL ontology and a relational database schema to be defined in a concise and human-readable form. The language is aimed at obtaining ontology class instances and property values from relational database data, i.e., at generating RDF triples from relational table rows. This work has been supported by the European Social Fund within the project «Support for Doctoral Studies at University of Latvia».
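To make the goal of generating RDF triples from table rows concrete, here is a minimal sketch in Python using only the standard `sqlite3` module. It is not the RDB2OWL language or its implementation: the `http://example.org/` namespace, the `person` table, the column-to-property dictionary, and the function `rows_to_ntriples` are hypothetical names chosen for illustration.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for this sketch
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rows_to_ntriples(conn, table, key, class_iri, column_to_property):
    """Yield N-Triples lines: one rdf:type triple per row, one literal per non-NULL mapped column."""
    columns = [key] + list(column_to_property)
    # Table and column names are interpolated directly, so they must come from a trusted mapping.
    query = f"SELECT {', '.join(columns)} FROM {table} ORDER BY {key}"
    for row in conn.execute(query):
        subject = f"<{BASE}{table}/{row[0]}>"
        yield f"{subject} <{RDF_TYPE}> <{class_iri}> ."
        for value, prop in zip(row[1:], column_to_property.values()):
            if value is not None:
                # Escape backslashes, quotes, and newlines for an N-Triples literal.
                literal = (str(value).replace("\\", "\\\\")
                           .replace('"', '\\"').replace("\n", "\\n"))
                yield f'{subject} <{prop}> "{literal}" .'


# Hypothetical in-memory data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany("INSERT INTO person VALUES (?, ?, ?)",
                 [(1, "Anna", "Riga"), (2, "Janis", None)])
mapping = {"name": BASE + "name", "city": BASE + "city"}
for line in rows_to_ntriples(conn, "person", "id", BASE + "Person", mapping):
    print(line)
```

The sketch hard-codes one table-to-class correspondence; a mapping language such as the one described in the abstract exists precisely so that such correspondences are declared rather than programmed.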

    On the Foundations of Data Interoperability and Semantic Search on the Web

    This dissertation studies the problem of facilitating semantic search across disparate ontologies that are developed by different organizations. There is tremendous potential in enabling users to search independent ontologies and discover knowledge in a serendipitous fashion, i.e., often completely unintended by the developers of the ontologies. The main difficulty with such search is that users generally do not have any control over the naming conventions and content of the ontologies. Thus terms must be appropriately mapped across ontologies based on their meaning. The meaning-based search of data is referred to as semantic search, and its facilitation (aka semantic interoperability) then requires mapping between ontologies. In relational databases, searching across organizational boundaries currently involves the difficult task of setting up a rigid information integration system. Linked Data representations more flexibly tackle the problem of searching across organizational boundaries on the Web. However, there exists no consensus on how ontology mapping should be performed for this scenario, and the problem is open. We lay out the foundations of semantic search on the Web of Data by comparing it to keyword search in the relational model and by providing effective mechanisms to facilitate data interoperability across organizational boundaries. We identify two sharply distinct goals for ontology mapping based on real-world use cases. These goals are: (i) ontology development, and (ii) facilitating interoperability. We systematically analyze these goals, side-by-side, and contrast them. Our analysis demonstrates the implications of the goals on how to perform ontology mapping and how to represent the mappings. We rigorously compare facilitating interoperability between ontologies to information integration in databases. Based on the comparison, class matching is emphasized as a critical part of facilitating interoperability. 
For class matching, various class similarity metrics are formalized and an algorithm that utilizes these metrics is designed. We also experimentally evaluate the effectiveness of the class similarity metrics on real-world ontologies. In order to encode the correspondences between ontologies for interoperability, we develop a novel W3C-compliant representation, named skeleton.
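The abstract does not say which class similarity metrics the dissertation formalizes. As a stand-in, the sketch below uses Jaccard similarity over the instance sets of two classes, a common instance-based measure, to show what a class matching step can look like. The class names, instance sets, the 0.5 threshold, and the functions `jaccard` and `match_classes` are assumptions for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection over size of the union; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_classes(source, target, threshold=0.5):
    """Pair each source class with the target class of highest Jaccard score, if above threshold."""
    pairs = []
    for s_name, s_instances in source.items():
        best = max(target, key=lambda t_name: jaccard(s_instances, target[t_name]))
        score = jaccard(s_instances, target[best])
        if score >= threshold:
            pairs.append((s_name, best, score))
    return pairs


# Hypothetical classes with their instance identifiers, for illustration only.
source = {"Author": {"alice", "bob", "carol"}, "Venue": {"vldb"}}
target = {"Writer": {"alice", "bob", "dave"}, "Conference": {"vldb", "sigmod"}}
print(match_classes(source, target))
```

Instance overlap presupposes that the two ontologies share identifiers for at least some individuals; where they do not, label-based or structural metrics would be needed instead.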

    Relāciju datu bāzu informācijas pieejamība semantiskā tīmekļa tehnoloģijām (Relational Database information availability to Semantic Web technologies)

    Thesis "Relational Database information availability to Semantic Web technologies". Annotation. Keywords: relational databases, RDF, OWL, ontologies, mapping. The purpose of the Semantic Web is to convert the web from a huge collection of documents into a rich data source from which computer programs can obtain information automatically. Most existing data are stored in relational databases and are not available to the Semantic Web. The purpose of this work is to find a solution for connecting relational databases and OWL ontologies. In the thesis, 1) existing RDB-to-RDF/OWL mapping solutions are investigated; 2) an RDB-to-RDF/OWL mapping specification language is designed that is oriented towards readability, conciseness, and the use of high-level constructs; 3) an efficient mapping implementation suitable for practical use cases is created, using a relational database for mapping data storage and RDF triple generation; the implementation was applied to the semantic re-engineering of 6 Latvian medical registries.

    Knowledge hypergraph based-approach for multi-source data integration and querying : Application for Earth Observation domain

    The need for early warning of natural disasters, to save lives and reduce damage, has drawn increasing interest in developing systems that observe, monitor, and assess changes in the environment. Over the last years, numerous environmental monitoring systems and Earth Observation (EO) programs have been implemented. Nevertheless, these systems generate a large amount of EO data while using different vocabularies and different conceptual schemas. Accordingly, data reside in many siloed systems and are largely untapped for integrated operations, insights, and decision making. To overcome the insufficient exploitation of EO data, a data integration system is crucial to break down data silos and create a common information space where data are semantically linked. Within this context, we propose a semantic data integration and querying approach, which aims to semantically integrate EO data and provide enhanced query processing in terms of accuracy, completeness, and semantic richness of the response. To do so, we defined three main objectives. The first objective is to capture the knowledge of the environmental monitoring domain. For this, we propose MEMOn, a domain ontology that provides a common vocabulary of the environmental monitoring domain in order to support the semantic interoperability of heterogeneous EO data. While creating MEMOn, we adopted a development methodology built on three fundamental principles. First, we used a modularization approach. The idea is to create separate modules, one for each context of the environment domain, in order to keep the structure of the global ontology clear and to guarantee that each module can be reused separately. Second, we used the upper-level ontology Basic Formal Ontology and the mid-level Common Core Ontologies to facilitate the integration of the ontological modules into the global ontology. Third, we reused existing domain ontologies such as ENVO and SSN to avoid creating the ontology from scratch, which can improve its quality since the reused components have already been evaluated. MEMOn is then evaluated using real use case studies, according to the requirements of the Sahara and Sahel Observatory experts. The second objective of this work is to break down the data silos and provide a common environmental information space. Accordingly, we propose a knowledge hypergraph-based data integration approach to provide experts and software agents with a virtual, integrated, and linked view of data. This approach generates RML mappings between the developed ontology and metadata and then creates a knowledge hypergraph that semantically links these mappings to identify more complex relationships across data sources. One of the strengths of the proposed approach is that it goes beyond combining data retrieved from multiple independent sources and allows virtual data integration in a highly semantic and expressive way, using hypergraphs. The third objective of this thesis concerns the enhancement of query processing in terms of accuracy, completeness, and semantic richness of the response, in order to make the returned results more relevant and richer in terms of relationships. Accordingly, we propose a knowledge hypergraph-based query processing approach that improves the selection of sources contributing to the final result of an input query. Indeed, the proposed approach moves beyond the discovery of simple one-to-one equivalence matches and relies on the identification of more complex relationships across data sources by referring to the knowledge hypergraph. This enhancement significantly increases answer completeness and semantic richness. The proposed approach was implemented in an open-source tool and has proved its effectiveness through a real use case in the environmental monitoring domain.

    A Knowledge Graph Based Integration Approach for Industry 4.0

    The fourth industrial revolution, Industry 4.0 (I40), aims at creating smart factories employing, among others, Cyber-Physical Systems (CPS), the Internet of Things (IoT), and Artificial Intelligence (AI). Realizing smart factories according to the I40 vision requires intelligent human-to-machine and machine-to-machine communication. To achieve this communication, CPS along with their data need to be described, and interoperability conflicts arising from various representations need to be resolved. For establishing interoperability, industry communities have created standards and standardization frameworks. Standards describe the main properties of entities, systems, and processes, as well as the interactions among them. Standardization frameworks classify, align, and integrate industrial standards according to their purposes and features. Despite being published by official international organizations, different standards may contain divergent definitions for similar entities. Further, when the same standard is used for the design of a CPS, different views can generate interoperability conflicts. Albeit expressive, standardization frameworks may represent divergent categorizations of the same standard. To some extent, interoperability conflicts need to be resolved to support effective and efficient communication in smart factories. To achieve interoperability, data need to be semantically integrated and existing conflicts reconciled. This problem has been extensively studied in the literature, and the results obtained can be applied to general integration problems. However, current approaches fail to consider specific interoperability conflicts that occur between entities in I40 scenarios. In this thesis, we tackle the problem of semantic data integration in I40 scenarios. A knowledge graph-based approach allowing for the integration of entities in I40 while considering their semantics is presented. To achieve this integration, there are challenges to be addressed on different conceptual levels: firstly, defining mappings between standards and standardization frameworks; secondly, representing knowledge of entities in I40 scenarios described by standards; thirdly, integrating perspectives of CPS design while solving semantic heterogeneity issues; and finally, determining real industry applications for the presented approach. We first devise a knowledge-driven approach allowing for the integration of standards and standardization frameworks into an Industry 4.0 knowledge graph (I40KG). The standards ontology is used for representing the main properties of standards and standardization frameworks, as well as the relationships among them. The I40KG permits the integration of standards and standardization frameworks while solving specific semantic heterogeneity conflicts in the domain. Further, we semantically describe standards in knowledge graphs. To this end, standards of core importance for I40 scenarios are considered, i.e., the Reference Architectural Model for I40 (RAMI4.0), AutomationML, and the Supply Chain Operation Reference Model (SCOR). In addition, different perspectives of entities describing CPS are integrated into the knowledge graphs. To evaluate the proposed methods, we rely on empirical evaluations as well as on the development of concrete use cases. The attained results provide evidence that a knowledge graph approach enables the effective data integration of entities in I40 scenarios while solving semantic interoperability conflicts, thus empowering communication in smart factories.

    Federated Query Processing over Heterogeneous Data Sources in a Semantic Data Lake

    Data provides the basis for emerging scientific and interdisciplinary data-centric applications with the potential of improving the quality of life for citizens. Big Data plays an important role in promoting both manufacturing and scientific development through industrial digitization and emerging interdisciplinary research. Open data initiatives have encouraged the publication of Big Data by exploiting the decentralized nature of the Web, allowing for the availability of heterogeneous data generated and maintained by autonomous data providers. Consequently, the growing volume of data consumed by different applications raises the need for effective data integration approaches able to process a large volume of data that is represented in different formats, schemas, and models, and that may also include sensitive data, e.g., financial transactions, medical procedures, or personal data. Data Lakes are composed of heterogeneous data sources in their original format, which reduces the overhead of materialized data integration. Query processing over Data Lakes requires the semantic description of data collected from heterogeneous data sources. A Data Lake with such semantic annotations is referred to as a Semantic Data Lake. Transforming Big Data into actionable knowledge demands novel and scalable techniques for enabling not only Big Data ingestion and curation into the Semantic Data Lake, but also efficient large-scale semantic data integration, exploration, and discovery. Federated query processing techniques utilize source descriptions to find relevant data sources and to find efficient execution plans that minimize the total execution time and maximize the completeness of answers. Existing federated query processing engines employ a coarse-grained description model where the semantics encoded in data sources are ignored. Such descriptions may lead to the erroneous selection of data sources for a query and unnecessary retrieval of data, thus affecting the performance of the query processing engine. In this thesis, we address the problem of federated query processing against heterogeneous data sources in a Semantic Data Lake. First, we tackle the challenge of knowledge representation and propose a novel source description model, RDF Molecule Templates, that describes the knowledge available in a Semantic Data Lake. RDF Molecule Templates (RDF-MTs) describe data sources in terms of an abstract description of entities belonging to the same semantic concept. Then, we propose a technique for data source selection and query decomposition, the MULDER approach, and query planning and optimization techniques, Ontario, that exploit the characteristics of heterogeneous data sources described using RDF-MTs and provide uniform access to heterogeneous data sources. We then address the challenge of enforcing privacy and access control requirements imposed by data providers. We introduce a privacy-aware federated query technique, BOUNCER, able to enforce privacy and access control regulations during query processing over data sources in a Semantic Data Lake. In particular, BOUNCER exploits RDF-MT-based source descriptions in order to express privacy and access control policies, as well as to enforce them automatically during source selection, query decomposition, and planning. Furthermore, BOUNCER implements query decomposition and optimization techniques able to identify query plans over data sources that not only contain the relevant entities to answer a query, but also are regulated by policies that allow for accessing these relevant entities. Finally, we tackle the problem of interest-based update propagation and co-evolution of data sources. We present a novel approach for interest-based RDF update propagation that consistently maintains a full or partial replication of large datasets and deals with co-evolution.
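The role of source descriptions in source selection can be sketched as follows. This is a deliberately simplified illustration in Python, not the RDF-MT model or the MULDER algorithm as defined in the thesis: each description records a source, a class, and the predicates available for that class, and a star-shaped subquery (a set of predicates around one subject) is routed only to sources whose description covers all of its predicates. The `Template` type, the function `select_sources`, and the source, class, and predicate names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    """Simplified source description: a class, its predicates, and the source holding it."""
    source: str
    rdf_class: str
    predicates: frozenset


def select_sources(templates, star_predicates):
    """Return the sources with a template covering every predicate of a star-shaped subquery."""
    needed = frozenset(star_predicates)
    return sorted({t.source for t in templates if needed <= t.predicates})


# Hypothetical descriptions of two sources, for illustration only.
templates = [
    Template("source_a", "Drug", frozenset({"name", "target"})),
    Template("source_b", "Drug", frozenset({"name", "label"})),
    Template("source_b", "Disease", frozenset({"label"})),
]
print(select_sources(templates, {"name", "target"}))
print(select_sources(templates, {"name"}))
```

Compared with a coarse description that only lists which predicates occur somewhere in a source, grouping predicates by class lets the selector skip sources that hold each predicate separately but never on the same kind of entity.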

    Linked Open Data - Creating Knowledge Out of Interlinked Data: Results of the LOD2 Project

    Database Management; Artificial Intelligence (incl. Robotics); Information Systems and Communication Servic

    IDEAS-1997-2021-Final-Programs

    This document records the final program for each of the 26 meetings of the International Database and Engineering Application Symposium from 1997 through 2021. These meetings were organized in various locations on three continents. Most of the papers published during these years are in the digital libraries of IEEE (1997-2007) or ACM (2008-2021).