An Efficient Architecture for Information Retrieval in P2P Context Using Hypergraph
Peer-to-peer (P2P) data-sharing systems now generate a significant portion of
Internet traffic. P2P systems have emerged as an accepted way to share enormous
volumes of data. The need for widely distributed information systems supporting
virtual organizations has given rise to a new category of P2P systems called
schema-based. In such systems each peer is a database management system in
itself, exposing its own schema. In such a setting, the main objective is the
efficient search across peer databases by processing each incoming query
without overly consuming bandwidth. The usability of these systems depends on
successful techniques to find and retrieve data; however, efficient and
effective routing of content-based queries is an emerging problem in P2P
networks. This work is an attempt to show that using mining algorithms in the
P2P context can significantly improve the efficiency of such methods. Our
proposed method is based on a combination of clustering and hypergraphs. We use
ECCLAT to build an approximate clustering and discover meaningful clusters with
slight overlap. We use the MTMINER algorithm to extract all minimal
transversals of a hypergraph (the clusters) for query routing. The set of
clusters improves the robustness of the query routing mechanism and the
scalability of the P2P network. We compare the performance of our method with
a baseline on the query routing problem. Our experimental results show that
the proposed method achieves impressive levels of performance and scalability
with respect to important criteria such as response time, precision and recall.
Comment: 20 pages, 8 figures
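The abstract above relies on the minimal transversals of a hypergraph. As a rough illustration of that notion only, the following Python sketch enumerates them by brute force. It is not MTMINER, and it does not reproduce the authors' routing method. Reading the hyperedges as clusters of peers, and the peer names `p1` to `p4`, are assumptions made for this example.

```python
from itertools import combinations


def minimal_transversals(hyperedges):
    """Return all minimal transversals (minimal hitting sets) of a hypergraph.

    Brute-force sketch for small inputs only; this is NOT the MTMINER algorithm.
    A transversal is a vertex set that intersects every hyperedge; it is minimal
    if no proper subset is also a transversal.
    """
    edges = [frozenset(e) for e in hyperedges]
    vertices = sorted(set().union(*edges)) if edges else []
    found = []
    # Enumerate candidates by increasing size so that every minimal
    # transversal is discovered before any of its supersets.
    for size in range(1, len(vertices) + 1):
        for candidate in combinations(vertices, size):
            cand = frozenset(candidate)
            # Skip supersets of an already found minimal transversal.
            if any(t <= cand for t in found):
                continue
            # Keep the candidate if it intersects every hyperedge.
            if all(cand & e for e in edges):
                found.append(cand)
    return found


# Hypothetical example: each hyperedge is a cluster of peers (assumed names).
clusters = [{"p1", "p2"}, {"p2", "p3"}, {"p3", "p4"}]
for t in minimal_transversals(clusters):
    print(sorted(t))
```

Under this reading, each minimal transversal is a smallest-by-inclusion set of peers that touches every cluster. The enumeration is exponential in the number of vertices, so it is suitable only for illustrating the definition.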
Design of PeerSum: a Summary Service for P2P Applications
Sharing huge databases in distributed systems is inherently difficult. As the amount of stored data increases, data localization techniques are no longer sufficient. A more efficient approach is to rely on compact database summaries rather than raw database records, whose access is costly in large distributed systems. In this paper, we propose PeerSum, a new service for managing summaries over shared data in large P2P and Grid applications. Our summaries are synthetic, multidimensional views with two main virtues. First, they can be directly queried and used to approximately answer a query without exploring the original data. Second, as semantic indexes, they support locating relevant nodes based on data content. Our main contribution is to define a summary model for P2P systems and the algorithms for summary management. Our performance evaluation shows that the cost of query routing is minimized, while incurring a low cost of summary maintenance.
Summary Management in P2P Systems
Sharing huge, massively distributed databases in P2P systems is inherently difficult. As the amount of stored data increases, data localization techniques are no longer sufficient. A practical approach is to rely on compact database summaries rather than raw database records, whose access is costly in large P2P systems. In this paper, we consider summaries that are synthetic, multidimensional views with two main virtues. First, they can be directly queried and used to approximately answer a query without exploring the original data. Second, as semantic indexes, they support locating relevant nodes based on data content. Our main contribution is to define a summary model for P2P systems and the appropriate algorithms for summary management. Our performance evaluation shows that the cost of query routing is minimized, while incurring a low cost of summary maintenance.
PeerSum: a Summary Service for P2P Applications
Sharing huge databases in distributed systems is inherently difficult. As the amount of stored data increases, data localization techniques are no longer sufficient. A practical approach is to rely on compact database summaries rather than raw database records, whose access is costly in large distributed systems. In this paper, we propose PeerSum, a new service for managing summaries over shared data in large P2P and Grid applications. Our summaries are synthetic, multidimensional views with two main virtues. First, they can be directly queried and used to approximately answer a query without exploring the original data. Second, as semantic indexes, they support locating relevant nodes based on data content. Our main contribution is to define a summary model for P2P systems and the algorithms for summary management. Our performance evaluation shows that the cost of query routing is minimized, while incurring a low cost of summary maintenance.