7 research outputs found

    Probabilistic Message Passing in Peer Data Management Systems

    Get PDF
    Until recently, most data integration techniques involved central components, e.g., global schemas, to enable transparent access to heterogeneous databases. Today, however, with the democratization of tools facilitating knowledge elicitation in machine-processable formats, one cannot rely on global, centralized schemas anymore as knowledge creation and consumption are getting more and more dynamic and decentralized. Peer Data Management Systems (PDMS) provide an answer to this problem by eliminating the central semantic component and considering instead compositions of local, pair-wise mappings to propagate queries from one database to the others. PDMS approaches proposed so far make the implicit assumption that all mappings used in this way are correct. This obviously cannot be taken as granted in typical PDMS settings where mappings can be created (semi) automatically by independent parties. In this work, we propose a totally decentralized, efficient message passing scheme to automatically detect erroneous mappings in PDMS. Our scheme is based on a probabilistic model where we take advantage of transitive closures of mapping operations to confront local belief on the correctness of a mapping against evidences gathered around the network. We show that our scheme can be efficiently embedded in any PDMS and provide a preliminary evaluation of our techniques on sets of both automatically-generated and real-world schemas

    Viewpoints on emergent semantics

    Get PDF
    Authors include:Philippe CudrÂŽe-Mauroux, and Karl Aberer (editors), Alia I. Abdelmoty, Tiziana Catarci, Ernesto Damiani, Arantxa Illaramendi, Robert Meersman, Erich J. Neuhold, Christine Parent, Kai-Uwe Sattler, Monica Scannapieco, Stefano Spaccapietra, Peter Spyns, and Guy De TrÂŽeWe introduce a novel view on how to deal with the problems of semantic interoperability in distributed systems. This view is based on the concept of emergent semantics, which sees both the representation of semantics and the discovery of the proper interpretation of symbols as the result of a self-organizing process performed by distributed agents exchanging symbols and having utilities dependent on the proper interpretation of the symbols. This is a complex systems perspective on the problem of dealing with semantics. We highlight some of the distinctive features of our vision and point out preliminary examples of its applicatio

    idMesh: graph-based disambiguation of linked data

    Get PDF
    We tackle the problem of disambiguating entities on the Web. We propose a user-driven scheme where graphs of entities -- represented by globally identifiable declarative artifacts -- self-organize in a dynamic and probabilistic manner. Our solution has the following two desirable properties: i) it lets end-users freely define associations between arbitrary entities and ii) it probabilistically infers entity relationships based on uncertain links using constraint-satisfaction mechanisms. We outline the interface between our scheme and the current data Web, and show how higher-layer applications can take advantage of our approach to enhance search and update of information relating to online entities. We describe a decentralized infrastructure supporting efficient and scalable entity disambiguation and demonstrate the practicability of our approach in a deployment over several hundreds of machines

    Reconciling Matching Networks of Conceptual Models

    Get PDF
    Conceptual models such as database schemas, ontologies or process models have been established as a means for effective engineering of information systems. Yet, for complex systems, conceptual models are created by a variety of stakeholders, which calls for techniques to manage consistency among the different views on a system. Techniques for model matching generate correspondences between the elements of conceptual models, thereby supporting effective model creation, utilization, and evolution. Although various automatic matching tools have been developed for different types of conceptual models, their results are often incomplete or erroneous. Automatically generated correspondences, therefore, need to be reconciled, i.e., validated by a human expert. We analyze the reconciliation process in a network setting, where a large number of conceptual models need to be matched. Then, the network induced by the generated correspondences shall meet consistency expectations in terms of mutual reinforcing relations between the correspondences. We develop a probabilistic model to identify the most uncertain correspondences in order to guide the expert's validation work. We also show how to construct a set of high-quality correspondences, even if the expert does not validate all generated correspondences. We demonstrate the efficiency of our techniques for real-world datasets in the domains of schema matching and ontology alignment

    Probabilistic Message Passing in Peer Data Management Systems ∗

    No full text
    Until recently, most data integration techniques involved central components, e.g., global schemas, to enable transparent access to heterogeneous databases. Today, however, with the democratization of tools facilitating knowledge elicitation in machine-processable formats, one cannot rely on global, centralized schemas anymore as knowledge creation and consumption are getting more and more dynamic and decentralized. Peer Data Management Systems (PDMS) provide an answer to this problem by eliminating the central semantic component and considering instead compositions of local, pair-wise mappings to propagate queries from one database to the others. PDMS approaches proposed so far make the implicit assumption that all mappings used in this way are correct. This obviously cannot be taken as granted in typical PDMS settings where mappings can be created (semi) automatically by independent parties. In this work, we propose a totally decentralized, efficient message passing scheme to automatically detect erroneous mappings in PDMS. Our scheme is based on a probabilistic model where we take advantage of transitive closures of mapping operations to confront local belief on the correctness of a mapping against evidences gathered around the network. We show that our scheme can be efficiently embedded in any PDMS and provide a preliminary evaluation of our techniques on sets of both automatically-generated and real-world schemas.
    corecore