IMPrECISE: Good-is-good-enough data integration
IMPrECISE is an XQuery module that adds probabilistic XML functionality to an existing XML DBMS, in our case MonetDB/XQuery. We demonstrate the probabilistic XML and data integration functionality of IMPrECISE. The prototype is configurable with domain knowledge such that the amount of uncertainty arising during data integration is reduced to an acceptable level, thus obtaining a "good is good enough" data integration with minimal human effort.
Duplicate Detection in Probabilistic Data
Collected data often contains uncertainties. Probabilistic databases have been proposed to manage uncertain data. To combine data from multiple autonomous probabilistic databases, an integration of probabilistic data has to be performed. Until now, however, data integration approaches have focused on the integration of certain source data (relational or XML). There is no work on the integration of uncertain (esp. probabilistic) source data so far. In this paper, we present a first step towards a concise consolidation of probabilistic data. We focus on duplicate detection as a representative and essential step in an integration process. We present techniques for identifying multiple probabilistic representations of the same real-world entities. Furthermore, to increase the efficiency of the duplicate detection process, we introduce search space reduction methods adapted to probabilistic data.
XML in Motion from Genome to Drug
Information technology (IT) has emerged as central to the solution of contemporary genomics and drug discovery problems. Researchers involved in genomics, proteomics, transcriptional profiling, high throughput structure determination, and other sub-disciplines of bioinformatics have a direct impact on this IT revolution. As the full genome sequences of many species and data from structural genomics, micro-arrays, and proteomics have become available, integrating these data into a common platform requires sophisticated bioinformatics tools. Organizing these data into knowledgeable databases and developing appropriate software tools for analyzing them are going to be major challenges. XML (eXtensible Markup Language) forms the backbone of biological data representation and exchange over the internet, enabling researchers to aggregate data from various heterogeneous data resources. The present article gives a comprehensive overview of the integration of XML with particular types of biological databases, mainly those dealing with sequence-structure-function relationships, and its application to drug discovery. This e-medical science approach should be applied to other scientific domains, and the latest trend in semantic web applications is also highlighted.
Towards a novel framework for the assessment of enterprise application integration packages
In addressing enterprise integration problems, a diversity
of technologies such as CORBA and XML were
promoted, yet no single integration technology solves all
integration problems. As a result, a new generation of
software called Enterprise Application Integration (EAI)
is emerging to address many integration problems by
combining a diversity of integration technologies (e.g.
message brokers, adapters, XML). Since EAI is a new
research area, there is an absence of literature discussing
issues like its adoption, evaluation and implementation.
This paper examines the application of two frameworks
for the evaluation of EAI packages in the practical arena.
In doing so, the authors use case study strategy to
investigate integration issues. Empirical data derived
from the case study suggest additions to the two
evaluation frameworks. Therefore, the authors revise
and extend previous work by proposing a novel
evaluation framework for the assessment of EAI
packages. The proposed framework makes a novel
contribution at two levels. First, at the conceptual level,
it incorporates criteria identified separately in previous
studies as evaluation criteria. The proposed framework
can be used as a decision-making tool and supports
management when taking decisions regarding the
adoption of EAI. Additionally, it can be used by
researchers to analyse and understand the capabilities o
XML for Domain Viewpoints
Within research institutions like CERN (European Organization for Nuclear
Research) there are often disparate databases (different in format, type and
structure) that users need to access in a domain-specific manner. Users may
want to access a simple unit of information without having to understand the details
of the underlying schema or they may want to access the same information from
several different sources. It is neither desirable nor feasible to require
users to have knowledge of these schemas. Instead it would be advantageous if a
user could query these sources using his or her own domain models and
abstractions of the data. This paper describes the basis of an XML (eXtensible
Markup Language) framework that provides this functionality and is currently
being developed at CERN. The goal of the first prototype was to explore the
possibilities of XML for data integration and model management. It shows how
XML can be used to integrate data sources. The framework is applicable not only
to CERN data sources but also to other environments.
Comment: 9 pages, 6 figures, conference report from SCI'2001 Multiconference
on Systemics & Informatics, Florid
Conceptual Workflow for Complex Data Integration using AXML
Relevant data for decision support systems are available everywhere and in various formats. Such data must be integrated into a unified format. Traditional data integration approaches are not adapted to handle complex data. Thus, we exploit the Active XML language for integrating complex data. Its XML part makes it possible to unify, model and store complex data. Moreover, its services part addresses the distribution of data sources. Accordingly, different integration tasks are proposed as services. These services are managed via a set of active rules that are built upon metadata and events of the integration system. In this paper, we design an architecture for integrating complex data autonomously. We have also designed the workflow for data integration tasks.