28 research outputs found

GridVine: an Infrastructure for Peer Information Management

Author: Aberer Karl
Agarwal Suchit
Cudré-Mauroux Philippe
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/10/2007

GridVine is a semantic overlay infrastructure based on a peer-to-peer (P2P) access structure. Built following the principle of data independence, it separates a logical layer, in which data, schemas, and schema mappings are managed, from a physical layer consisting of a structured P2P network supporting decentralized indexing, key load-balancing, and efficient routing. The system is decentralized, yet fosters semantic interoperability through pair-wise schema mappings and query reformulation. GridVine’s heterogeneous but semantically related information sources can be queried transparently using iterative query reformulation. The authors discuss a reference implementation of the system and several mechanisms for resolving queries collaboratively.

Infoscience - École polytechnique fédérale de Lausanne

PicShark: mitigating metadata scarcity through large-scale P2P collaboration

Author: Aberer Karl
Budura Adriana
Cudré-Mauroux Philippe
Hauswirth Manfred
Publication date: 18/06/2018

With the commoditization of digital devices, personal information and media sharing is becoming a key application on the pervasive Web. In such a context, data annotation rather than data production is the main bottleneck. Metadata scarcity represents a major obstacle preventing efficient information processing in large and heterogeneous communities. However, social communities also open the door to new possibilities for addressing local metadata scarcity by taking advantage of global collections of resources. We propose to tackle the lack of metadata in large-scale distributed systems through a collaborative process leveraging both content and metadata. We develop a community-based and self-organizing system called PicShark in which information entropy, in terms of missing metadata, is gradually alleviated through decentralized instance and schema matching. Our approach focuses on semi-structured metadata and confines computationally expensive operations to the edge of the network, while keeping distributed operations as simple as possible to ensure scalability. PicShark builds on structured Peer-to-Peer networks for distributed look-up operations, but extends the application of self-organization principles to the propagation of metadata and the creation of schema mappings. We demonstrate the practical applicability of our method in an image sharing scenario and provide experimental evidence illustrating the validity of our approach.

Irish Universities

RERO DOC Digital Library

PicShark: Mitigating Metadata Scarcity Through Large-Scale P2P Collaboration

Author: Aberer Karl
Budura Adriana
Cudré-Mauroux Philippe
Hauswirth Manfred
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/06/2008

With the commoditization of digital devices, personal information and media sharing is becoming a key application on the pervasive Web. In such a context, data annotation rather than data production is the main bottleneck. Metadata scarcity represents a major obstacle preventing efficient information processing in large and heterogeneous communities. However, social communities also open the door to new possibilities for addressing local metadata scarcity by taking advantage of global collections of resources. We propose to tackle the lack of metadata in large-scale distributed systems through a collaborative process leveraging both content and metadata. We develop a community-based and self-organizing system called PicShark in which information entropy, in terms of missing metadata, is gradually alleviated through decentralized instance and schema matching. Our approach focuses on semi-structured metadata and confines computationally expensive operations to the edge of the network, while keeping distributed operations as simple as possible to ensure scalability. PicShark builds on structured Peer-to-Peer networks for distributed look-up operations, but extends the application of self-organization principles to the propagation of metadata and the creation of schema mappings. We demonstrate the practical applicability of our method in an image sharing scenario and provide experimental evidence illustrating the validity of our approach.

Infoscience - École polytechnique fédérale de Lausanne

Access to Research at National University of Ireland, Galway

An Efficient Architecture for Information Retrieval in P2P Context Using Hypergraph

Author: Durand Nicolas
Hajjar Mohammad
Ismail Anis
Nachouki Gilles
Quafafou Mohamed
Publication date: 01/01/2011

Peer-to-peer (P2P) data-sharing systems now generate a significant portion of Internet traffic. P2P systems have emerged as an accepted way to share enormous volumes of data. The need for widely distributed information systems supporting virtual organizations has given rise to a new category of P2P systems called schema-based. In such systems each peer is a database management system in itself, exposing its own schema. In such a setting, the main objective is efficient search across peer databases, processing each incoming query without overly consuming bandwidth. The usability of these systems depends on successful techniques to find and retrieve data; however, efficient and effective routing of content-based queries is an emerging problem in P2P networks. This work is an attempt to show that the use of mining algorithms in the P2P context may significantly improve the efficiency of such methods. Our proposed method is based on a combination of clustering with hypergraphs. We use ECCLAT to build an approximate clustering and discover meaningful clusters with slight overlapping. We use the MTMINER algorithm to extract all minimal transversals of a hypergraph (clusters) for query routing. The set of clusters improves the robustness of the query routing mechanism and the scalability of the P2P network. We compare the performance of our method with a baseline on the query routing problem. Our experimental results show that the proposed methods achieve impressive levels of performance and scalability with respect to important criteria such as response time, precision and recall.

Comment: 20 pages, 8 figures

arXiv.org e-Print Archive

HAL AMU

Evaluating Conjunctive Triple Pattern Queries over Large Structured Overlay Networks

Author: Idreos S. (Stratos)
Koubarakis M. (Manolis)
Liarou E. (Erietta)
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2006

We study the problem of evaluating conjunctive queries composed of triple patterns over RDF data stored in distributed hash tables. Our goal is to develop algorithms that scale to large amounts of RDF data, distribute the query processing load evenly and incur little network traffic. We present and evaluate two novel query processing algorithms with these possibly conflicting goals in mind. We discuss the various tradeoffs that occur in our setting through a detailed experimental evaluation of the proposed algorithms.

CWI's Institutional Repository

RDF Data Indexing and Retrieval: A survey of Peer-to-Peer based solutions

Author: Baude Françoise
Bongiovanni Francesco
Filali Imen
Huet Fabrice
Publication venue: HAL CCSD
Publication date: 26/11/2010

The Semantic Web makes it possible to model, create and query resources found on the Web. Enabling the full potential of its technologies at the Internet level requires infrastructures that can cope with scalability challenges and support various types of queries. The attractive features of the Peer-to-Peer (P2P) communication model, such as decentralization, scalability and fault tolerance, make it a natural solution to these challenges. Consequently, the combination of the Semantic Web and the P2P model can be a highly innovative attempt to harness the strengths of both technologies and come up with a scalable infrastructure for RDF data storage and retrieval. In this respect, this survey details the research works that adopt this combination and gives insight into how to deal with RDF data at the indexing and querying levels.

French abstract, translated: The Semantic Web makes it possible to model, create and query the resources available on the Web. For its technologies to reach their potential at Internet scale, they must rely on infrastructures that can scale and meet the expressiveness requirements of the query types they offer. The good properties of the latest generations of peer-to-peer systems in terms of decentralization, fault tolerance and scalability make them promising candidates. The combination of the peer-to-peer model and Semantic Web technologies is an innovative attempt to provide a scalable infrastructure able to store and retrieve RDF data. In this context, this report presents a state of the art and discusses in detail work on peer-to-peer systems that handle RDF data at large scale. We detail their data indexing mechanisms and the processing of the various query types they offer.

HAL-UNICE

INRIA a CCSD electronic archive server

A schema-based peer-to-peer infrastructure for digital library networks

Author: Siberski Wolf
Publication venue: Hannover : Gottfried Wilhelm Leibniz Universität Hannover
Publication date: 01/01/2006

[no abstract]

Institutionelles Repositorium der Leibniz Universität Hannover