50 research outputs found

Data Management in the APPA System

Author: Akbarinia Reza
Martins Vidal
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2007

International audience

Combining Grid and P2P technologies can be exploited to provide high-level data sharing in large-scale distributed environments. However, this combination must deal with two hard problems: the scale of the network and the dynamic behavior of the nodes. In this paper, we present our solution in APPA (Atlas Peer-to-Peer Architecture), a data management system with high-level services for building large-scale distributed applications. We focus on data availability and data discovery, which are two main requirements for implementing large-scale Grids. We have validated APPA's services through a combination of experimentation over Grid5000, which is a very large Grid experimental platform, and simulation using SimJava. The results show very good performance in terms of communication cost and response time.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Summary Management in P2P Systems

Author: Hayek Rabab
Mouaddib Noureddine
Raschia Guillaume
Valduriez Patrick
Publication venue: HAL CCSD
Publication date: 01/03/2008

International audience

Sharing huge, massively distributed databases in P2P systems is inherently difficult. As the amount of stored data increases, data localization techniques are no longer sufficient. A practical approach is to rely on compact database summaries rather than raw database records, whose access is costly in large P2P systems. In this paper, we consider summaries that are synthetic, multidimensional views with two main virtues. First, they can be directly queried and used to approximately answer a query without exploring the original data. Second, as semantic indexes, they support locating relevant nodes based on data content. Our main contribution is to define a summary model for P2P systems, and the appropriate algorithms for summary management. Our performance evaluation shows that the cost of query routing is minimized, while incurring a low cost of summary maintenance.
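
The two uses of a summary named in this abstract (approximate query answering and semantic indexing) can be illustrated with a minimal sketch. This is not the authors' summary model: the `label` bucketing, the peer names, and the `age` attribute are all hypothetical, and the summary here is a simple per-label count rather than a multidimensional view.

```python
from collections import Counter

# Hypothetical coarse descriptors; the summaries in the paper are richer views.
def label(age):
    if age < 30:
        return "young"
    if age < 60:
        return "adult"
    return "old"

def summarize(records):
    """Build a compact summary for one peer: label -> number of records."""
    return Counter(label(r["age"]) for r in records)

def route(summaries, wanted):
    """Semantic index: return the peers whose summary reports matching data."""
    return sorted(peer for peer, s in summaries.items() if s[wanted] > 0)

def approximate_count(summaries, wanted):
    """Answer a count query from the summaries alone, without raw records."""
    return sum(s[wanted] for s in summaries.values())

# Illustrative data only (assumed, not from the paper).
peers = {
    "p1": [{"age": 25}, {"age": 41}],
    "p2": [{"age": 67}],
    "p3": [{"age": 19}, {"age": 22}, {"age": 35}],
}
summaries = {peer: summarize(recs) for peer, recs in peers.items()}
relevant_peers = route(summaries, "young")
young_estimate = approximate_count(summaries, "young")
```

In this sketch a query for "young" records is forwarded only to `relevant_peers`, and `young_estimate` is obtained without contacting any peer's raw data. The answer is approximate in the sense that it is expressed at the granularity of the labels, not of the original values.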

Design of PeerSum: a Summary Service for P2P Applications

Author: Hayek Rabab
Mouaddib Noureddine
Raschia Guillaume
Valduriez Patrick
Publication venue: HAL CCSD
Publication date: 02/04/2007

International audience

Sharing huge databases in distributed systems is inherently difficult. As the amount of stored data increases, data localization techniques are no longer sufficient. A more efficient approach is to rely on compact database summaries rather than raw database records, whose access is costly in large distributed systems. In this paper, we propose PeerSum, a new service for managing summaries over shared data in large P2P and Grid applications. Our summaries are synthetic, multidimensional views with two main virtues. First, they can be directly queried and used to approximately answer a query without exploring the original data. Second, as semantic indexes, they support locating relevant nodes based on data content. Our main contribution is to define a summary model for P2P systems, and the algorithms for summary management. Our performance evaluation shows that the cost of query routing is minimized, while incurring a low cost of summary maintenance.

INRIA a CCSD electronic archive server

Grid Data Management: Open Problems and New Issues

Author: Mattoso Marta
Pacitti Esther
Valduriez Patrick
Publication venue: Springer Verlag
Publication date: 01/01/2007

International audience

Initially developed for the scientific community, Grid computing is now gaining much interest in important areas such as enterprise information systems. This makes data management critical since the techniques must scale up while addressing the autonomy, dynamicity and heterogeneity of the data sources. In this paper, we discuss the main open problems and new issues related to Grid data management. We first recall the main principles behind data management in distributed systems and the basic techniques. Then we make precise the requirements for Grid data management. Finally, we introduce the main techniques needed to address these requirements. This implies revisiting distributed database techniques in major ways, in particular, using P2P techniques.

CiteSeerX

INRIA a CCSD electronic archive server

Gestion de résumés de données dans les systèmes pair–pair

Author: Hayek Rabab
Mouaddib Noureddine
Raschia Guillaume
Valduriez Patrick
Publication venue: Lavoisier
Publication date: 01/01/2008

International audience

In this paper, we propose managing data summaries in unstructured P2P systems. Our summaries are intelligible views with two main virtues. First, they can be directly queried and used to approximately answer a query. Second, as semantic indexes, they support locating relevant nodes based on data content. The performance evaluation of our proposal shows that the cost of query routing is minimized, while incurring a low cost of summary maintenance.

In this work, we propose maintaining data summaries in unstructured P2P systems. Our summaries are intelligible views with a twofold benefit for query processing: they can either answer a query approximately or guide its propagation toward the relevant peers based on data content. The performance evaluation of our proposal showed that query cost is greatly reduced without incurring high summary maintenance costs.

INRIA a CCSD electronic archive server

Query processing in P2P systems

Author: Akbarinia Reza
Pacitti Esther
Valduriez Patrick
Publication venue: HAL CCSD
Publication date: 01/01/2007

Peer-to-peer (P2P) computing offers new opportunities for building highly distributed data systems. Unlike client-server computing, P2P is a very dynamic environment where peers can join and leave the network at any time. This yields important advantages such as operation without central coordination, peer autonomy, and scaling up to a large number of peers. However, providing high-level data management services is difficult. Most techniques designed for distributed database systems, which statically exploit schema and network information, no longer apply. New techniques are needed which should be decentralized, dynamic and self-adaptive. In this paper, we survey the techniques which have been developed for query processing in P2P systems. We first give an overview of the existing P2P networks, and compare their properties from the perspective of data management. Then, we discuss the approaches which are used for schema mapping. Then, we describe the algorithms which have been proposed for query routing. In particular, we focus on query routing in unstructured networks and DHTs. Finally, we present the techniques which have been proposed for processing complex queries, e.g. top-k queries, in P2P systems, in particular in DHTs.
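
As background for the DHT query routing mentioned in this abstract, the following is a minimal sketch of the key-to-node mapping used by ring-based DHTs in the style of Chord: a key is assigned to the first node whose identifier is greater than or equal to the key's identifier, wrapping around the ring. It is a generic illustration, not an algorithm taken from the survey. The peer names, the key name, and the 16-bit identifier space are assumptions, and a real DHT resolves this lookup in a decentralized way over several hops rather than from a global sorted list.

```python
import hashlib
from bisect import bisect_left

def ring_id(name, bits=16):
    """Hash a node or key name onto a ring of 2**bits identifiers."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)

def successor(node_ids, key_id):
    """Return the first node id >= key_id in the sorted list, wrapping around."""
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]

# Hypothetical peers and key, for illustration only.
node_ids = sorted(ring_id(f"peer-{n}") for n in range(8))
owner = successor(node_ids, ring_id("some-key"))
```

Because node and key identifiers come from the same hash function, any peer can compute which identifier is responsible for a key. The routing algorithms covered by the survey address how to reach that node efficiently when no peer knows the whole ring.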

INRIA a CCSD electronic archive server

Peersum : Gestion des résumés de données dans les systèmes P2P

Author: Hayek Rabab
Mouaddib Noureddine
Raschia Guillaume
Valduriez Patrick
Publication venue: HAL CCSD
Publication date: 01/11/2007

Base de Données Avancées (BDA)
National audience

Sharing huge, massively distributed databases in P2P systems is inherently difficult. As the amount of stored data increases, data localization techniques are no longer sufficient. A practical approach is to rely on compact database summaries rather than raw database records, whose access is costly in large P2P systems. In this paper, we consider summaries that are synthetic, multidimensional views with two main virtues. First, they can be directly queried and used to approximately answer a query without exploring the original data. Second, as semantic indexes, they support locating relevant nodes based on data content. The main contribution of this paper is to define an efficient algorithm for partitioning an unstructured P2P network into domains, in order to optimally distribute summaries in the network. Then, we propose a distributed algorithm for maintaining a summary in a given domain. Our performance evaluation shows that the cost of query routing is minimized, while incurring a low cost of summary maintenance.

INRIA a CCSD electronic archive server

Managing Linguistic Data Summaries in Advanced P2P Applications

Author: D. Comer
E.H. Ruspini
J. Han
J.L. Bentley
L.A. Zadeh
W.H. Press
Publication venue: Springer Science and Business Media LLC
Publication date: 01/03/2010

As the amount of stored data increases, data localization techniques are no longer sufficient in P2P systems. A practical approach is to rely on compact database summaries rather than raw database records, whose access is costly in large P2P systems. In this chapter, we describe a solution for managing linguistic data summaries in advanced P2P applications that deal with semantically rich data. The produced summaries are synthetic, multidimensional views over relational tables. The novelty of this proposal lies in the dual exploitation of summaries in distributed P2P systems. First, as semantic indexes, they support locating relevant nodes based on their data descriptions. Second, due to their intelligibility, these summaries can be directly queried and thus approximately answer a query without the need to explore the original data. The proposed solution first defines a summary model for hierarchical P2P systems. Second, appropriate algorithms for summary creation and maintenance are presented. A query processing mechanism, which relies on summary querying, is then proposed to demonstrate the benefits that might be obtained from summary exploitation.

Crossref

INRIA a CCSD electronic archive server