    XML-Based Heterogeneous Database Integration For Data Warehouse Creation

    An Ontology Based Approach To The Integration Of Heterogeneous Information Systems Supporting Integrated Provincial Administration In Khon Kaen, Thailand

    Information systems are a necessity for the administration of organizations. In a recent reform of the Thai administration, the governor of each province is entrusted with full responsibility for the strategic planning and execution of the Integrated Provincial Administration (IPA). This presents a big challenge and many difficult problems for a province such as Khon Kaen, which may grow quickly both economically and demographically. To provide the administrator of the province with reliable and up-to-date information, the Provincial Operation Centre (POC) has been set up and assigned the task of collecting all required information from disparate information systems, many of which are legacy systems. This information lacks interoperability and integration because of the many different structures and the semantic heterogeneity found across these systems. This research is part of a collaborative data sources community development project. It attempts to aid high-level decision makers by using ontology to resolve heterogeneities among many disparate data sources. After relevant data sources are identified, they are analysed to reveal important and corresponding concepts, attributes and relations. These are then used to create ontologies that resolve schematic and semantic conflicts in the data sources. The integration of many heterogeneous information systems will provide a unified view of information, supporting the provincial administrator in decision making.

    Database independent Migration of Objects into an Object-Relational Database

    This paper reports on the CERN-based WISDOM project which is studying the serialisation and deserialisation of data to/from an object database (Objectivity) and ORACLE 9i. Comment: 26 pages, 18 figures; CMS CERN Conference Report cr02_01

    Discovering and reconciling conflicts for data integration

    Title from cover. "January, 1998." Includes bibliographical references (p. 24-25). Hongjun Lu ... [et al.

    Resolution of Semantic Heterogeneity in Database Schema Integration Using Formal Ontologies

    This paper addresses the problem of handling semantic heterogeneity during database schema integration. We focus on the semantics of terms used as identifiers in schema definitions. Our solution does not rely on the names of the schema elements or the structure of the schemas. Instead, we utilize formal ontologies consisting of intensional definitions of terms represented in a logical language. The approach is based on similarity relations between intensional definitions in different ontologies. We present the definitions of similarity relations based on intensional definitions in formal ontologies. The extensional consequences of intensional relations are addressed. The paper shows how similarity relations are discovered by a reasoning system using a higher-level ontology. These similarity relations are then used to derive an integrated schema in two steps. First, we show how to use similarity relations to generate the class hierarchy of the global schema. Second, we explain how to enhance the class definitions with attributes. This approach reduces the cost of generating or re-generating global schemas for tightly-coupled federated database

    Design of petroleum company's metadata and an effective knowledge mapping methodology

    Success of information flow depends on intelligent data storage and its management in a multi-disciplinary environment. Multi-dimensional data entities, data types and ambiguous semantics, often pose uncertainty and inconsistency in data retrieval from volumes of petroleum data sources. In our approach, conceptual schemas and sub-schemas have been described based on various operational functions of the petroleum industry. These schemas are integrated, to ensure their consistency and validity, so that the information retrieved from an integrated metadata (in the form of a data warehouse) structure derives its authenticity from its implementation. The data integration process validating the petroleum metadata has been demonstrated for one of the Gulf offshore basins for an effective knowledge mapping and interpreting it successfully for the derivation of useful geological knowledge. Warehoused data are used for mining data patterns, trends and correlations among knowledge-base data attributes that led to interpretation of interesting geological features. These technologies appear to be more amenable for exploration of more petroleum resources in the mature gulf basins

    BIOZON: a system for unification, management and analysis of heterogeneous biological data

    BACKGROUND: Integration of heterogeneous data types is a challenging problem, especially in biology, where the number of databases and data types increase rapidly. Amongst the problems that one has to face are integrity, consistency, redundancy, connectivity, expressiveness and updatability. DESCRIPTION: Here we present a system (Biozon) that addresses these problems, and offers biologists a new knowledge resource to navigate through and explore. Biozon unifies multiple biological databases consisting of a variety of data types (such as DNA sequences, proteins, interactions and cellular pathways). It is fundamentally different from previous efforts as it uses a single extensive and tightly connected graph schema wrapped with hierarchical ontology of documents and relations. Beyond warehousing existing data, Biozon computes and stores novel derived data, such as similarity relationships and functional predictions. The integration of similarity data allows propagation of knowledge through inference and fuzzy searches. Sophisticated methods of query that span multiple data types were implemented and first-of-a-kind biological ranking systems were explored and integrated. CONCLUSION: The Biozon system is an extensive knowledge resource of heterogeneous biological data. Currently, it holds more than 100 million biological documents and 6.5 billion relations between them. The database is accessible through an advanced web interface that supports complex queries, "fuzzy" searches, data materialization and more, online at

    Semantic Web Based Relational Database Access With Conflict Resolution

    This thesis focuses on (1) accessing relational databases through Semantic Web technologies and (2) resolving conflicts that usually arise when integrating data from heterogeneous source schemas and/or instances. In the first part of the thesis, we present an approach to access relational databases using Semantic Web technologies. Our approach is built on top of the Ontop framework for Ontology Based Data Access. It extracts both Ontop mappings and an equivalent OWL ontology from an existing database schema. End users can then access the underlying data source through SPARQL queries. The proposed approach takes into consideration the different relationships between the entities of the database schema when it extracts the mapping and the equivalent ontology. Instead of extracting a flat ontology that is an exact copy of the database schema, it extracts a rich ontology. The extracted ontology can also be used as an intermediary between a domain ontology and the underlying database schema. Our approach covers independent or master entities that do not have foreign references, dependent or detailed entities that have some foreign keys that reference other entities, recursive entities that contain some self references, binary join entities that relate two entities together, and n-ary join entities that map two or more entities in an n-ary relation. The implementation results indicate that the extracted Ontop mappings and ontology are accurate, i.e., end users can query all data (using SPARQL) from the underlying database source in the same way as if they had written SQL queries. In the second part, we present an overview of the conflict resolution approaches in both conventional data integration systems and collaborative data sharing communities. We focus on the latter as it supports the needs of scientific communities for data sharing and collaboration. We first introduce the purpose of the study, and present a brief overview of data integration. Next, we discuss the problem of inconsistent data in conventional integration systems, and we summarize the conflict handling strategies used to handle such inconsistent data. Then we focus on the problem of conflict resolution in collaborative data sharing communities. A collaborative data sharing community is a group of users who agree to share a common database instance, such that all users have access to the shared instance and they can add to, update, and extend this shared instance. We discuss related works that adopt different conflict resolution strategies in the area of collaborative data sharing, and we provide a comparison between them. We find that a Collaborative Data Sharing System (CDSS) can best support the needs of certain communities such as scientific communities. We then discuss some open research opportunities to improve the efficiency and performance of the CDSS. Finally, we summarize our work so far towards achieving these open research directions
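
The entity categories named in the abstract above (independent, dependent, recursive, binary join, n-ary join) can be illustrated with a small classifier over foreign-key metadata. This is only a sketch under stated assumptions, not the thesis's extraction algorithm and not part of the Ontop API: the `Table` class and `classify` function are hypothetical, each foreign key is simplified to a single column, a join entity is assumed to be a table whose primary key consists entirely of foreign-key columns, and exactly two such columns is treated as binary while three or more is treated as n-ary.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Table:
    """Hypothetical, simplified description of one relational table."""
    name: str
    primary_key: Tuple[str, ...]
    # Maps a foreign-key column to the name of the table it references.
    foreign_keys: Dict[str, str] = field(default_factory=dict)


def classify(table: Table) -> str:
    """Assign one of the five entity categories listed in the abstract."""
    fks = table.foreign_keys
    if not fks:
        # No foreign references at all: an independent (master) entity.
        return "independent"
    if any(target == table.name for target in fks.values()):
        # At least one foreign key points back at the table itself.
        return "recursive"
    pk = table.primary_key
    # Assumption: a join entity's primary key is made up only of foreign keys.
    if len(pk) >= 2 and all(column in fks for column in pk):
        return "binary join" if len(pk) == 2 else "n-ary join"
    # Has foreign keys but its own identity: a dependent (detail) entity.
    return "dependent"


if __name__ == "__main__":
    enrolment = Table(
        name="enrolment",
        primary_key=("student_id", "course_id"),
        foreign_keys={"student_id": "student", "course_id": "course"},
    )
    print(enrolment.name, "->", classify(enrolment))
```

A real extractor would read this metadata from the database catalogue and would also need to handle composite foreign keys, which the sketch deliberately leaves out.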