2,656 research outputs found
Heterogeneous biomedical database integration using a hybrid strategy: a p53 cancer research database.
Complex problems in life science research give rise to multidisciplinary collaboration, and hence, to the need for heterogeneous database integration. The tumor suppressor p53 is mutated in close to 50% of human cancers, and a small drug-like molecule with the ability to restore native function to cancerous p53 mutants is a long-held medical goal of cancer treatment. The Cancer Research DataBase (CRDB) was designed in support of a project to find such small molecules. As a cancer informatics project, the CRDB involved small molecule data, computational docking results, functional assays, and protein structure data. As an example of the hybrid strategy for data integration, it combined the mediation and data warehousing approaches. This paper uses the CRDB to illustrate the hybrid strategy as a viable approach to heterogeneous data integration in biomedicine, and provides a design method for those considering similar systems. More efficient data sharing implies increased productivity, and, hopefully, improved chances of success in cancer research. (Code and database schemas are freely downloadable, http://www.igb.uci.edu/research/research.html.)
Automated syntactic mediation for Web service integration
As the Web Services and Grid communities adopt Semantic Web technology, we observe a shift towards higher-level workflow composition and service discovery practices. While this provides excellent functionality to non-expert users, more sophisticated middleware is required to hide the details of service invocation and service integration. An investigation of a common Bioinformatics use case reveals that the execution of high-level workflow designs requires additional processing to harmonise syntactically incompatible service interfaces. In this paper, we present an architecture to support the automatic reconciliation of data formats in such Web Service workflows. The mediation of data is driven by ontologies that encapsulate the information contained in heterogeneous data structures, supplying a common, conceptual data representation. Data conversion is carried out by a Configurable Mediator component, consuming mappings between XML schemas and OWL ontologies. We describe our system and give examples of our mapping language against the background of a Bioinformatics use case
Bioinformatics service reconciliation by heterogeneous schema transformation
This paper focuses on the problem of bioinformatics service reconciliation in a generic and scalable manner so as to enhance interoperability in a highly evolving field. Using XML as a common representation format, but also supporting existing flat-file representation formats, we propose an approach for the scalable semi-automatic reconciliation of services, possibly invoked from within a scientific workflows tool. Service reconciliation may use the AutoMed heterogeneous data integration system as an intermediary service, or may use AutoMed to produce services that mediate between services. We discuss the application of our approach for the reconciliation of services in an example bioinformatics workflow. The main contribution of this research is an architecture for the scalable reconciliation of bioinformatics services
XML for Domain Viewpoints
Within research institutions like CERN (European Organization for Nuclear Research) there are often disparate databases (different in format, type and structure) that users need to access in a domain-specific manner. Users may want to access a simple unit of information without having to understand the details of the underlying schema, or they may want to access the same information from several different sources. It is neither desirable nor feasible to require users to have knowledge of these schemas. Instead it would be advantageous if a user could query these sources using his or her own domain models and abstractions of the data. This paper describes the basis of an XML (eXtensible Markup Language) framework that provides this functionality and is currently being developed at CERN. The goal of the first prototype was to explore the possibilities of XML for data integration and model management. It shows how XML can be used to integrate data sources. The framework is applicable not only to CERN data sources but to other environments too.
Comment: 9 pages, 6 figures, conference report from SCI'2001 Multiconference on Systemics & Informatics, Florid
Data integration through service-based mediation for web-enabled information systems
The Web and its underlying platform technologies have often been used to integrate existing software and information systems. Traditional techniques for data representation and transformations between documents are not sufficient to support a flexible and maintainable data integration solution that meets the requirements of modern complex Web-enabled software and information systems. The difficulty arises from the high degree of complexity of data structures, for example in business and technology applications, and from the constant change of data and its representation. In the Web context, where the Web platform is used to integrate different organisations or software systems, additionally the problem of heterogeneity arises. We introduce a specific data integration solution for Web applications such as Web-enabled information systems. Our contribution is an integration technology framework for Web-enabled information systems comprising, firstly, a data integration technique based on the declarative specification of transformation rules and the construction of connectors that handle the integration and, secondly, a mediator architecture based on information services and the constructed connectors to handle the integration process
Mediated data integration and transformation for web service-based software architectures
Service-oriented architecture using XML-based web services has been widely accepted by many organisations as the standard infrastructure to integrate heterogeneous and autonomous data sources. As a result, many Web service providers are built up on top of the data sources to share the data by supporting provided and required interfaces and methods of data access in a unified manner. In the context of data integration, problems arise when Web services are assembled to deliver an integrated view of data, adaptable to the specific needs of individual clients and providers. Traditional approaches of data integration and transformation are not suitable to automate the construction of connectors dedicated to connect selected Web services to render integrated and tailored views of data. We propose a declarative approach that addresses the often-neglected data integration and adaptivity aspects of service-oriented architecture
A linked data-driven & service-oriented architecture for sharing educational resources
The two fundamental aims of managing educational resources are to enable resources to be reusable and interoperable and to enable Web-scale sharing of resources across learning communities. Currently, a variety of approaches have been proposed to expose and manage educational resources and their metadata on the Web. These are usually based on heterogeneous metadata standards and schemas, such as IEEE LOM or ADL SCORM, and diverse repository interfaces such as OAI-PMH or SQI. Also, there is still a lack of usage of controlled vocabularies and available data sets that could replace the widespread use of unstructured text for describing resources. On the other hand, the Linked Data approach has proven that it offers a set of successful principles that have the potential to alleviate the aforementioned issues. In this paper, we introduce an architecture and prototype which is fundamentally based on (a) Linked Data principles and (b) Service-orientation to resolve the integration issues for sharing educational resources
An Ontology Based Method to Solve Query Identifier Heterogeneity in Post-Genomic Clinical Trials
The increasing amount of information available for biomedical research has led to issues related to knowledge discovery in large collections of data. Moreover, Information Retrieval techniques must consider heterogeneities present in databases that initially belonged to different domains, e.g. clinical and genetic data. One of the goals, among others, of the ACGT European project is to provide seamless and homogeneous access to integrated databases. In this work, we describe an approach to overcome heterogeneities in identifiers inside queries. We present an ontology classifying the most common identifier semantic heterogeneities, and a service that makes use of it to cope with the problem using the described approach. Finally, we illustrate the solution by analysing a set of real queries