8,278 research outputs found

UK utility data integration: overcoming schematic heterogeneity

Author: Beck A.R.
Bennett B.
Cohn AG
Fu G.
Ramage S.
Sanderson M.
Stell J.G.
Tagg C
Publication venue: SPIE-Intl Soc Optical Eng
Publication date: 31/10/2008

In this paper we discuss syntactic, semantic and schematic issues which inhibit the integration of utility data in the UK. We then focus on the techniques employed within the VISTA project to overcome schematic heterogeneity. A Global Schema-based architecture is employed. Although automated approaches to Global Schema definition were attempted, the heterogeneities of the sector were too great, so a manual approach to Global Schema definition was employed. The techniques used to define and subsequently map source utility data models to this schema are discussed in detail. In order to ensure a coherent integrated model, sub- and cross-domain validation issues are then highlighted. Finally, the proposed framework and data flow for schematic integration are introduced.

Crossref

White Rose Research Online

Comparing human and automatic thesaurus mapping approaches in the agricultural domain

Author: Caracciolo Caterina
Johannsen Gudrun
Keizer Johannes
Lauser Boris
Mayr Philipp
van Hage Willem Robert
Publication date: 01/01/2008

Knowledge organization systems (KOS), like thesauri and other controlled vocabularies, are used to provide subject access to information systems across the web. Due to the heterogeneity of these systems, mapping between vocabularies becomes crucial for retrieving relevant information. However, mapping thesauri is a laborious task, and thus big efforts are being made to automate the mapping process. This paper examines two mapping approaches involving the agricultural thesaurus AGROVOC, one machine-created and one human-created. We address the basic question "What are the pros and cons of human and automatic mapping and how can they complement each other?" By pointing out the difficulties in specific cases or groups of cases and grouping the sample into simple and difficult types of mappings, we show the limitations of current automatic methods and come up with some basic recommendations on what approach to use when. Comment: 10 pages, Int'l Conf. on Dublin Core and Metadata Applications 200

arXiv.org e-Print Archive

Proceedings of the International Conference on Dublin Core and Metadata Applications (DCMI)

CiteSeerX

E-LIS

VU Research Portal

SSOAR - Social Science Open Access Repository

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin

Ontology-assisted database integration to support natural language processing and biomedical data-mining

Author: Ceusters Werner
Deray Tom
Santos Marianna C.
Smith Barry
Verschelde Jean-Luc
Publication date: 01/01/2004

Successful biomedical data mining and information extraction require a complete picture of biological phenomena such as genes, biological processes, and diseases, as these exist at different levels of granularity. To realize this goal, several freely available heterogeneous databases as well as proprietary structured datasets have to be integrated into a single global customizable scheme. We will present a tool to integrate different biological data sources by mapping them to a proprietary biomedical ontology that has been developed for the purpose of making computers understand medical natural language.

PhilPapers

Crossref

Towards automated knowledge-based mapping between individual conceptualisations to empower personalisation of Geospatial Semantic Web

Author: Agarwal Pragya
Dimitrova Vania
Huang Yongjian
Publication date: 01/01/2005

The geospatial domain is characterised by vagueness, especially in the semantic disambiguation of the concepts in the domain, which makes defining a universally accepted geo-ontology an onerous task. This is compounded by the lack of appropriate methods and techniques with which individual semantic conceptualisations can be captured and compared to each other. With multiple user conceptualisations, efforts towards a reliable Geospatial Semantic Web therefore require personalisation in which user diversity can be incorporated. The work presented in this paper is part of our ongoing research on applying commonsense reasoning to elicit and maintain models that represent users' conceptualisations. Such user models will make it possible to take into account the users' perspective of the real world and will empower personalisation algorithms for the Semantic Web. Intelligent information processing over the Semantic Web can be achieved if different conceptualisations can be integrated in a semantic environment and mismatches between different conceptualisations can be outlined. In this paper, a formal approach for detecting mismatches between a user's and an expert's conceptual model is outlined. The formalisation is used as the basis to develop algorithms to compare models defined in OWL. The algorithms are illustrated in a geographical domain using concepts from the SPACE ontology developed as part of the SWEET suite of ontologies for the Semantic Web by NASA, and are evaluated by comparing test cases of possible user misconceptions.

Southampton (e-Prints Soton)

Image databases: Problems and perspectives

Author: Gudivada V. Naidu

With the increasing number of computer graphics, image processing, and pattern recognition applications, economical storage, efficient representation and manipulation, and powerful and flexible query languages for retrieval of image data are of paramount importance. These and related issues pertinent to image databases are examined.

NASA Technical Reports Server

Heterogeneous biomedical database integration using a hybrid strategy: a p53 cancer research database.

Author: Bichutskiy Vadim Y
Brachmann Rainer K
Colman Richard
Lathrop Richard H
Publication venue: eScholarship, University of California
Publication date: 01/01/2006

Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and a small drug-like molecule with the ability to restore native function to cancerous p53 mutants is a long-held medical goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)

Directory of Open Access Journals

eScholarship - University of California

Modeling Spatial and Temporal Semantics in a Large Heterogeneous GIS Database Environment

Author: Park Jinsoo
Ram Sudha
Publication venue: AIS Electronic Library (AISeL)
Publication date: 16/08/1996

AIS Electronic Library (AISeL)

A Framework for Semantic Interoperability for Distributed Geospatial Repositories

Author: Ghosh Soumya Kanti
Paul Manoj
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 27/01/2012

Interoperable access to geospatial information across disparate geospatial applications has become essential. Geospatial data are highly heterogeneous: the heterogeneity arises at both the syntactic and semantic levels. Finding and accessing appropriate data in such a distributed environment is an important research issue. The paper proposes a methodology for interoperable access to geospatial information based on standards specified by the Open Geospatial Consortium (OGC). An architecture for integrating diverse geospatial data repositories is proposed using a service-based methodology. The semantic issues in the discovery and retrieval of geospatial data over distributed geospatial services are also addressed. The proposed architecture utilizes ontological concepts for service description and subsequent discovery of services. An approach for semantic similarity assessment of geospatial services is discussed.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)