220 research outputs found

    XML-based approaches for the integration of heterogeneous bio-molecular data

    Background: Today's public database infrastructure spans a very large collection of heterogeneous biological data, opening new opportunities for molecular biology, biomedical and bioinformatics research, but also raising new problems for their integration and computational processing. Results: In this paper we survey the most interesting and novel approaches for the representation, integration and management of different kinds of biological data by exploiting XML and the related recommendations and approaches. Moreover, we present new cutting-edge approaches for the appropriate management of heterogeneous biological data represented through XML. Conclusion: XML has succeeded in the integration of heterogeneous biomolecular information, and has established itself as the syntactic glue for biological data sources. Nevertheless, a large variety of XML-based data formats have been proposed, which makes effective integration of bioinformatics data schemes difficult. The adoption of a few semantically rich standard formats is urgently needed to achieve a seamless integration of the current biological resources.

    Viewpoints on emergent semantics

    Authors include: Philippe Cudré-Mauroux and Karl Aberer (editors), Alia I. Abdelmoty, Tiziana Catarci, Ernesto Damiani, Arantxa Illaramendi, Robert Meersman, Erich J. Neuhold, Christine Parent, Kai-Uwe Sattler, Monica Scannapieco, Stefano Spaccapietra, Peter Spyns, and Guy De Tré.
    We introduce a novel view on how to deal with the problems of semantic interoperability in distributed systems. This view is based on the concept of emergent semantics, which sees both the representation of semantics and the discovery of the proper interpretation of symbols as the result of a self-organizing process performed by distributed agents exchanging symbols and having utilities dependent on the proper interpretation of the symbols. This is a complex systems perspective on the problem of dealing with semantics. We highlight some of the distinctive features of our vision and point out preliminary examples of its application.

    A framework for information integration using ontological foundations

    With the increasing amount of data, the ability to integrate information has always been a competitive advantage in information management. Semantic heterogeneity reconciliation is an important challenge in many information interoperability applications such as data exchange and data integration. In spite of a large amount of research in this area, the lack of theoretical foundations behind semantic heterogeneity reconciliation techniques has resulted in many ad hoc approaches. In this thesis, I address this issue by providing ontological foundations for semantic heterogeneity reconciliation in information integration. In particular, I investigate fundamental semantic relations between properties from an ontological point of view and show how one of the basic and natural relations between properties, inferring implicit properties from existing properties, can be used to enhance information integration. These ontological foundations have been exploited in four aspects of information integration. First, I propose novel algorithms for semantic enrichment of schema mappings. Second, using correspondences between similar properties at different levels of abstraction, I propose a configurable data integration system in which query rewriting techniques allow a tradeoff between accuracy and completeness in query answering. Third, to preserve semantics in data exchange, I propose an entity-preserving data exchange approach that reflects source entities in the target independently of the classification of entities. Finally, to improve the efficiency of the data exchange approach proposed in this thesis, I propose an extension of the column-store model called sliced column store. Working prototypes of the techniques proposed in this thesis are implemented to show the feasibility of realizing these techniques. Experiments performed using various datasets show that the techniques proposed in this thesis outperform many existing techniques in terms of the ability to handle semantic heterogeneities and the performance of information exchange.

    Model morphisms (MoMo) to enable language independent information models and interoperable business networks

    MSc. Dissertation presented at Faculdade de Ciências e Tecnologia of Universidade Nova de Lisboa to obtain the Master degree in Electrical and Computer Engineering.
    With the advent of globalisation, the opportunities for collaboration became more evident, with the effect of enlarging business networks. In such conditions, a key to enterprise success is reliable communication with all the partners. Therefore, organisations have been searching for flexible integrated environments to better manage their services and product life cycle, where their software applications could be easily integrated independently of the platform in use. However, with so many different information models and implementation standards being used, interoperability problems arise. Moreover, organisations are themselves at different technological maturity levels, and the solution that might be good for one can be too advanced for another, or vice versa. This dissertation responds to the above needs, proposing a high-level meta-model to be used across the entire business network, making it possible to abstract individual models from their specificities and increasing language independence and interoperability, while keeping the integrity of all the enterprise legacy software intact. The strategy presented allows an incremental mapping construction, to achieve a gradual integration. To accomplish this, the author proposes Model Driven Architecture (MDA) based technologies for the development of traceable transformations and the execution of automatic Model Morphisms.

    Intuitionistic fuzzy XML query matching and rewriting

    With the emergence of XML as a standard for data representation, particularly on the web, the need for intelligent query languages that can operate on XML documents with structural heterogeneity has recently grown. Traditional Information Retrieval and Database approaches have limitations when dealing with such scenarios. Therefore, fuzzy (flexible) approaches have become predominant. In this thesis, we propose a new approach for approximate XML query matching and rewriting which aims at achieving soft matching of XML queries with XML data sources following different schemas. Unlike traditional querying approaches, which require exact matching, the proposed approach makes use of Intuitionistic Fuzzy Trees to achieve approximate (soft) query matching. Through this new approach, not only the exact answer to a query but also approximate answers are retrieved. Furthermore, partial results can be obtained from multiple data sources and merged together to produce a single answer to a query. The proposed approach introduces a new tree similarity measure that considers the minimum and maximum degrees of similarity/inclusion of trees, based on arc matching. New techniques for soft node and arc matching are presented for matching queries against data sources with highly varied structures. A prototype was developed to test the proposed ideas, and it demonstrated the ability to achieve approximate matching for pattern queries with a number of XML schemas and to rewrite the original query so that it obtains results from the underlying data sources. This has been achieved through several novel algorithms, which were tested and showed efficiency and low CPU/memory cost even for a large number of data sources.

    Data Sharing in P2P Systems

    To appear in Springer's "Handbook of P2P Networking".
    In this chapter, we survey P2P data sharing systems. Throughout, we focus on the evolution from simple file-sharing systems, with limited functionalities, to Peer Data Management Systems (PDMS) that support advanced applications with more sophisticated data management techniques. Advanced P2P applications deal with semantically rich data (e.g. XML documents, relational tables), using a high-level SQL-like query language. We start our survey with an overview of the existing P2P network architectures and the associated routing protocols. Then, we discuss data indexing techniques based on their distribution degree and the semantics they can capture from the underlying data. We also discuss schema management techniques which allow integrating heterogeneous data. We conclude by discussing the techniques proposed for processing complex queries (e.g. range and join queries). Complex query facilities are necessary for advanced applications which require a high level of search expressiveness. This last part shows the lack of querying techniques that allow for approximate query answering.