

K2/Kleisli and GUS: Experiments in Integrated Access to Genomic Data Sources

Author: Brunk Brian P.
Crabtree Jonathan
Davidson Susan B
Overton Chris
Schug Jonathan
Stoeckert Christian J.
Tannen Val
Publication venue: ScholarlyCommons
Publication date: 01/01/2000

The integration of heterogeneous data sources and software systems is a major issue in the biomedical community and several approaches have been explored: linking databases, on-the-fly integration through views, and integration through warehousing. In this paper we report on our experiences with two systems that were developed at the University of Pennsylvania: an integration system called K2, which has primarily been used to provide views over multiple external data sources and software systems; and a data warehouse called GUS, which downloads, cleans, integrates and annotates data from multiple external data sources. Although the view and warehouse approaches each have their advantages, there is no clear winner. Therefore, users must consider how the data is to be used, what the performance guarantees must be, and how much programmer time and expertise is available to choose the best strategy for a particular application.

CiteSeerX

ScholarlyCommons@Penn

A abordagem POESIA para a integração de dados e serviços na Web semantica

Author: Fileto Renato
Publication venue: [s.n.]
Publication date: 03/08/2018

Advisor: Claudia Bauzer Medeiros. Thesis (doctorate), Universidade Estadual de Campinas, Instituto de Computação. Abstract: POESIA (Processes for Open-Ended Systems for Information Analysis), the approach proposed in this work, supports the construction of complex processes that involve the integration and analysis of data from several sources, particularly in scientific applications. This approach is centered on two types of semantic Web mechanisms: scientific workflows, to specify and compose Web services; and domain ontologies, to enable semantic interoperability and management of data and processes. The main contributions of this thesis are: (i) a theoretical framework to describe, discover and compose data and services on the Web, including rules to check the semantic consistency of resource compositions; (ii) ontology-based methods to help data integration and estimate data provenance in cooperative processes on the Web; (iii) partial implementation and validation of the proposal, in a real application for the domain of agricultural planning, analyzing the benefits and scalability problems of the current semantic Web technology when faced with large volumes of data. Doctorate in Computer Science.

Repositorio da Producao Cientifica e Intelectual da Unicamp

Effectively Maintaining Single View Consistency in Web Warehouses

Author: Qin Xiangdong
Zhang Yan
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2005

A web warehouse provides high availability and efficiency by using materialized webviews, which must be refreshed promptly to stay fresh. During refreshing, the consistency between a webview and its base data, formally named single view consistency (SVC), must be guaranteed. Because base data changes in a web warehousing environment do not propagate from data sources to information consumers, unlike in traditional data warehouses, new maintenance methods are needed. In this paper we first define SVC, and then present an algorithm, RCA, to maintain SVC, as well as an effective base data change detection method, SAA. We show that RCA and SAA can guarantee SVC and that they are effective in the web environment. © 2005 IEEE.

Crossref

Enterprise Information Integration Using a Peer to Peer Approach

Author: Kokemueller Jochen
Weisbecker Anette
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2010

The integration of enterprise information systems has unique requirements and frequently poses problems to business partners. We discuss specific integration issues for micro-sized enterprises on the special case of independent sales agencies and their suppliers. We argue that the enterprise information systems of those independent enterprises are technically best represented by equal peers. Therefore, we have designed the Peer-To-Peer (P2P) integration architecture VIANA for the integration of enterprise information systems. Its architecture provides materializing P2P integration using optimistic replication. It is applicable to inter- and intraorganizational integration scenarios. It is accomplished by the propagation of write operations between peers. We argue that this type of integration can be realized with no alteration of the participating information systems.

Fraunhofer-ePrints

AIS Electronic Library (AISeL)

Estocada: Stockage Hybride et Ré-écriture sous Contraintes d'Intégrité

Author: Al-Otaibi Rana B.
Bugiotti Francesca
Bursztyn Damian
Deutsch Alin
Manolescu Ioana
Zampetakis Stamatis
Publication venue: HAL CCSD
Publication date: 15/11/2016

National audience. The growing production of digital data has led to the emergence of a wide variety of data management systems (DMSs). In this context, data-intensive applications need (i) to access large heterogeneous data ("Big Data") with a potentially complex structure, and (ii) to manipulate data efficiently in order to guarantee good application performance. Because these systems are specialized for certain operations but perform less well on others, it can be essential for an application to use several DMSs at the same time. In this context we present Estocada, an application that makes it possible to take advantage of several DMSs simultaneously and enables efficient, automatic manipulation of large heterogeneous data, thus offering better support to data-intensive applications. In Estocada, data is split into several fragments that are stored in different DMSs. To answer a query from these fragments, Estocada relies on query rewriting under constraints; the constraints are used to represent the different data models and the distribution of fragments across the different DMSs.

HAL-CentraleSupelec

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

Peer-to-peer systems for simple and flexible information sharing

Author: Pather Suhendran
Publication venue: Department of Computer Science
Publication date: 01/01/2009

Includes abstract. Includes bibliographical references (leaves 76-80). Peer to peer computing (P2P) is an architecture that enables applications to access shared resources, with peers having similar capabilities and responsibilities. The ubiquity of P2P computing and its increasing adoption for a decentralized data sharing mechanism have fueled my research interests. P2P networks are useful for sharing content files containing audio, video, and data. This research aims to address the problem of simple and flexible access to data from a variety of data sources across peers with different operating systems, databases and hardware. The proposed architecture makes use of SQL queries, web services, heterogeneous database servers and XML data transformation for the peer to peer data sharing prototype. SQL queries and web services provide a data sharing mechanism that allows both simple and flexible data access.

Cape Town University OpenUCT

Incremental maintenance of materialized xquery views

Author: Maged F. El-sayed
Publication venue
Publication date: 01/01/2005

Keeping views fresh by maintaining the consistency between materialized views and their base data in the presence of base updates is a critical problem for many applications, including data warehousing and data integration. While heavily studied for traditional databases, the maintenance of XML views remains largely unexplored. Maintaining XML views is complex due to the richness of the XML data model and the powerful capabilities of XML query languages, such as XQuery. This dissertation proposes a comprehensive solution for the general problem of maintaining materialized XQuery views. Our solution is the first to enable the maintenance of a large class of XQuery views including XPath expressions, FLWOR expressions, and Element Constructors. These views may contain arbitrary result construction and arbitrary grouping and join operations. Our solution also supports the unique order requirements of XQuery including source document order and query order. Th

CiteSeerX

DigitalCommons@WPI

The multi-agent system architecture in SEWASIE

Author: Fillottrani Pablo Rubén
Publication venue
Publication date: 01/12/2005

We describe the design, implementation and deployment of the multi-level agent-based system architecture developed for the SEWASIE project. The aim of the system is to help the user in querying heterogeneous data sources which are integrated by means of ontologies. The agent architecture is based on a two-level data integration scheme supported by mediators and brokers, connected by a peer-to-peer mechanism. Implementation is done on top of the JADE system, a modular and scalable platform that satisfies FIPA standards. Facultad de Informátic

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

A Service Late Binding Enabled Solution for Data Integration from Autonomous and Evolving Databases

Author: Wang Chong
Publication venue
Publication date: 01/01/2010

Integrating data from autonomous, distributed and heterogeneous data sources to provide a unified view is a common demand for many businesses. Since the data sources may evolve frequently to satisfy their own independent business needs, solutions which use hard-coded queries to integrate participating databases may cause high maintenance costs when evolution occurs. Thus a new solution which can handle database evolution with lower maintenance effort is required. This thesis presents a new solution, Service Late binding Enabled Data Integration (SLEDI), which is set into a framework modeling the essential processes of the data integration activity. It integrates schematically heterogeneous relational databases with decreased maintenance costs for handling database evolution. An algorithm named Information Provision Unit Describing (IPUD) is designed to describe each database as a set of Information Provision Units (IPUs). The IPUs are represented as Directed Acyclic Graph (DAG) structured data instead of hard-coded queries, and further realized as data services. Hence the data integration is achieved through service invocations. Furthermore, a set of processes is defined to handle database evolution by automatically identifying and modifying the IPUs which are affected by the evolution. An extensive evaluation based on a case study is presented. The result shows that the schematic heterogeneities defined in this thesis can be solved by IPUD, except the relation isomorphism discrepancy. Ten out of thirteen types of schematic database evolution can be automatically handled by the evolution handling processes as long as the evolution is represented by the designed data model. The computational costs of the automatic evolution handling show a slow linear growth with the number of participating databases. Other characteristics addressed include SLEDI's scalability and its independence of application domain and database model.
The descriptive comparison with other data integration approaches shows that although the Data as a Service approach may result in lower performance under some circumstances, it supports better flexibility for integrating data from autonomous and evolving data sources.

Durham e-Theses