4,861 research outputs found

    Automatic Table Extension with Open Data

    Get PDF
    With thousands of data sources available on the web as well as within organisations, data scientists increasingly spend more time searching for data than analysing it. To ease the task of find and integrating relevant data for data mining projects, this dissertation presents two new methods for automatic table extension. Automatic table extension systems take over the task of tata discovery and data integration by adding new columns with new information (new attributes) to any table. The data values in the new columns are extracted from a given corpus of tables

    A universal ontology-based approach to data integration

    Get PDF
    One of the main problems in building data integration systems is that of semantic integration. It has been acknowledged that the problem would not exist if all systems were developed using the same global schema, but so far, such global schema has been considered unfeasible in practice. However, in our previous work, we have argued that given the current state-of-the-art, a global schema may be feasible now, and we have put forward a vision of a Universal Ontology (UO) that may be desirable, feasible, and viable. One of the reasons why the UO may be desirable is that it might solve the semantic integration problem. The objective of this paper is to show that indeed the UO could solve, or at least greatly alleviate, the semantic integration problem. We do so by presenting an approach to semantic integration based on the UO that requires much less effort than other approaches.Peer ReviewedPostprint (published version

    On the Foundations of Data Interoperability and Semantic Search on the Web

    Get PDF
    This dissertation studies the problem of facilitating semantic search across disparate ontologies that are developed by different organizations. There is tremendous potential in enabling users to search independent ontologies and discover knowledge in a serendipitous fashion, i.e., often completely unintended by the developers of the ontologies. The main difficulty with such search is that users generally do not have any control over the naming conventions and content of the ontologies. Thus terms must be appropriately mapped across ontologies based on their meaning. The meaning-based search of data is referred to as semantic search, and its facilitation (aka semantic interoperability) then requires mapping between ontologies. In relational databases, searching across organizational boundaries currently involves the difficult task of setting up a rigid information integration system. Linked Data representations more flexibly tackle the problem of searching across organizational boundaries on the Web. However, there exists no consensus on how ontology mapping should be performed for this scenario, and the problem is open. We lay out the foundations of semantic search on the Web of Data by comparing it to keyword search in the relational model and by providing effective mechanisms to facilitate data interoperability across organizational boundaries. We identify two sharply distinct goals for ontology mapping based on real-world use cases. These goals are: (i) ontology development, and (ii) facilitating interoperability. We systematically analyze these goals, side-by-side, and contrast them. Our analysis demonstrates the implications of the goals on how to perform ontology mapping and how to represent the mappings. We rigorously compare facilitating interoperability between ontologies to information integration in databases. Based on the comparison, class matching is emphasized as a critical part of facilitating interoperability. For class matching, various class similarity metrics are formalized and an algorithm that utilizes these metrics is designed. We also experimentally evaluate the effectiveness of the class similarity metrics on real-world ontologies. In order to encode the correspondences between ontologies for interoperability, we develop a novel W3C-compliant representation, named skeleton

    Survey on Techniques for Ontology Interoperability in Semantic Web

    Get PDF
    Ontology is a shared conceptualization of knowledge representation of particular domain. These are used for the enhancement of semantic information explicitly. It is considered as a key element in semantic web development. Creation of global web data sources is impossible because of the dynamic nature of the web. Ontology Interoperability provides the reusability of ontologies. Different domain experts and ontology engineers create different ontologies for the same or similar domain depending on their data modeling requirements. These cause ontology heterogeneity and inconsistency problems. For more better and precise results ontology mapping is the solution. As their use has increased, providing means of resolving semantic differences has also become very important. Papers on ontology interoperability report the results on different frameworks and this makes their comparison almost impossible. Therefore, the main focus of this paper will be on providing some basics of ontology interoperability and briefly introducing its different approaches. In this paper we survey the approaches that have been proposed for providing interoperability among domain ontologies and its related techniques and tools

    Overcoming database heterogeneity to facilitate social networks: the Colombian displaced population as a case study

    Get PDF
    In this paper we describe a two-step approach for the publication of data about displaced people in Colombia, whose lack of homogeneity represents a major barrier for the application of adequate policies. This data is available in heterogeneous data sources, mainly relational, and is not connected to social networking sites. Our approach consists in a first step where ontologies are automatically derived from existing relational databases, exploiting the semantics underlying the SQL-DDL schema description, and a second step where these ontologies are aligned with existing ontologies (FOAF in our example), facilitating a better integration of data coming from multiple sources

    The mediated data integration (MeDInt) : An approach to the integration of database and legacy systems

    Get PDF
    The information required for decision making by executives in organizations is normally scattered across disparate data sources including databases and legacy systems. To gain a competitive advantage, it is extremely important for executives to be able to obtain one unique view of information in an accurate and timely manner. To do this, it is necessary to interoperate multiple data sources, which differ structurally and semantically. Particular problems occur when applying traditional integration approaches, for example, the global schema needs to be recreated when the component schema has been modified. This research investigates the following heterogeneities between heterogeneous data sources: Data Model Heterogeneities, Schematic Heterogeneities and Semantic Heterogeneities. The problems of existing integration approaches are reviewed and solved by introducing and designing a new integration approach to logically interoperate heterogeneous data sources and to resolve three previously classified heterogeneities. The research attempts to reduce the complexity of the integration process by maximising the degree of automation. Mediation and wrapping techniques are employed in this research. The Mediated Data Integration (MeDint) architecture has been introduced to integrate heterogeneous data sources. Three major elements, the MeDint Mediator, wrappers, and the Mediated Data Model (MDM) play important roles in the integration of heterogeneous data sources. The MeDint Mediator acts as an intermediate layer transforming queries to sub-queries, resolving conflicts, and consolidating conflict-resolved results. Wrappers serve as translators between the MeDint Mediator and data sources. Both the mediator and wrappers arc well-supported by MDM, a semantically-rich data model which can describe or represent heterogeneous data schematically and semantically. Some organisational information systems have been tested and evaluated using the MeDint architecture. The results have addressed all the research questions regarding the interoperability of heterogeneous data sources. In addition, the results also confirm that the Me Dint architecture is able to provide integration that is transparent to users and that the schema evolution does not affect the integration

    A contextual semantic mediator for a distributed cooperative maintenance platform.

    No full text
    International audiencePlatforms expand maintenance systems from centralized systems into e-maintenance platforms integrating various cooperative distributed systems and maintenance applications. This phenomenon allowed an evolution in services offered to maintenance actors by integrating more intelligent applications, providing decision support and facilitating the access to needed data. To manage this evolution, e-maintenance platforms must respond to a great challenge which is ensuring an interoperable communication between its integrated systems. By combining different techniques used in previous works, we propose in this work a semantic mediator system ensuring a high level of interoperability between systems in the maintenance platform
    • …
    corecore