8,278 research outputs found

    UK utility data integration: overcoming schematic heterogeneity

    Get PDF
    In this paper we discuss syntactic, semantic and schematic issues which inhibit the integration of utility data in the UK. We then focus on the techniques employed within the VISTA project to overcome schematic heterogeneity. A Global Schema based architecture is employed. Although automated approaches to Global Schema definition were attempted the heterogeneities of the sector were too great. A manual approach to Global Schema definition was employed. The techniques used to define and subsequently map source utility data models to this schema are discussed in detail. In order to ensure a coherent integrated model, sub and cross domain validation issues are then highlighted. Finally the proposed framework and data flow for schematic integration is introduced

    Comparing human and automatic thesaurus mapping approaches in the agricultural domain

    Get PDF
    Knowledge organization systems (KOS), like thesauri and other controlled vocabularies, are used to provide subject access to information systems across the web. Due to the heterogeneity of these systems, mapping between vocabularies becomes crucial for retrieving relevant information. However, mapping thesauri is a laborious task, and thus big efforts are being made to automate the mapping process. This paper examines two mapping approaches involving the agricultural thesaurus AGROVOC, one machine-created and one human created. We are addressing the basic question "What are the pros and cons of human and automatic mapping and how can they complement each other?" By pointing out the difficulties in specific cases or groups of cases and grouping the sample into simple and difficult types of mappings, we show the limitations of current automatic methods and come up with some basic recommendations on what approach to use when.Comment: 10 pages, Int'l Conf. on Dublin Core and Metadata Applications 200

    Ontology-assisted database integration to support natural language processing and biomedical data-mining

    Get PDF
    Successful biomedical data mining and information extraction require a complete picture of biological phenomena such as genes, biological processes, and diseases; as these exist on different levels of granularity. To realize this goal, several freely available heterogeneous databases as well as proprietary structured datasets have to be integrated into a single global customizable scheme. We will present a tool to integrate different biological data sources by mapping them to a proprietary biomedical ontology that has been developed for the purposes of making computers understand medical natural language

    Towards automated knowledge-based mapping between individual conceptualisations to empower personalisation of Geospatial Semantic Web

    No full text
    Geospatial domain is characterised by vagueness, especially in the semantic disambiguation of the concepts in the domain, which makes defining universally accepted geo- ontology an onerous task. This is compounded by the lack of appropriate methods and techniques where the individual semantic conceptualisations can be captured and compared to each other. With multiple user conceptualisations, efforts towards a reliable Geospatial Semantic Web, therefore, require personalisation where user diversity can be incorporated. The work presented in this paper is part of our ongoing research on applying commonsense reasoning to elicit and maintain models that represent users' conceptualisations. Such user models will enable taking into account the users' perspective of the real world and will empower personalisation algorithms for the Semantic Web. Intelligent information processing over the Semantic Web can be achieved if different conceptualisations can be integrated in a semantic environment and mismatches between different conceptualisations can be outlined. In this paper, a formal approach for detecting mismatches between a user's and an expert's conceptual model is outlined. The formalisation is used as the basis to develop algorithms to compare models defined in OWL. The algorithms are illustrated in a geographical domain using concepts from the SPACE ontology developed as part of the SWEET suite of ontologies for the Semantic Web by NASA, and are evaluated by comparing test cases of possible user misconceptions

    Image databases: Problems and perspectives

    Get PDF
    With the increasing number of computer graphics, image processing, and pattern recognition applications, economical storage, efficient representation and manipulation, and powerful and flexible query languages for retrieval of image data are of paramount importance. These and related issues pertinent to image data bases are examined

    Heterogeneous biomedical database integration using a hybrid strategy: a p53 cancer research database.

    Get PDF
    Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and a small drug-like molecule with the ability to restore native function to cancerous p53 mutants is a long-held medical goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)

    A Framework for Semantic Interoperability for Distributed Geospatial Repositories

    Get PDF
    Interoperable access of geospatial information across disparate geospatial applications has become essential. Geospatial data are highly heterogeneous -- the heterogeneity arises both at the syntactic and semantic levels. Finding and accessing appropriate data in such a distributed environment is an important research issue. The paper proposes a methodology for interoperable access of geospatial information based on Open Geospatial Consortium (OGC) specified standards. An architecture for integrating diverse geospatial data repositories has been proposed using service-based methodology. The semantic issues for discovery and retrieval of geospatial data over distributed geospatial services have also been proposed in the paper. The proposed architecture utilizes the ontological concepts for service description and subsequent discovery of services. An approach for semantic similarity assessment of geospatial services has been discussed
    • …
    corecore