274 research outputs found

    Query processing in peer-to-peer based data management system

    No full text
    Ph.DDOCTOR OF PHILOSOPH

    The essence of P2P: A reference architecture for overlay networks

    Get PDF
    The success of the P2P idea has created a huge diversity of approaches, among which overlay networks, for example, Gnutella, Kazaa, Chord, Pastry, Tapestry, P-Grid, or DKS, have received specific attention from both developers and researchers. A wide variety of algorithms, data structures, and architectures have been proposed. The terminologies and abstractions used, however, have become quite inconsistent since the P2P paradigm has attracted people from many different communities, e.g., networking, databases, distributed systems, graph theory, complexity theory, biology, etc. In this paper we propose a reference model for overlay networks which is capable of modeling different approaches in this domain in a generic manner. It is intended to allow researchers and users to assess the properties of concrete systems, to establish a common vocabulary for scientific discussion, to facilitate the qualitative comparison of the systems, and to serve as the basis for defining a standardized API to make overlay networks interoperable

    UniStore: Querying a DHT-based Universal Storage

    Get PDF
    In recent time, the idea of collecting and combining large public data sets and services became more and more popular. The special characteristics of such systems and the requirements of the participants demand for strictly decentralized solutions. However, this comes along with several ambitious challenges a corresponding system has to overcome. In this demonstration paper, we present a light-weight distributed universal storage capable of dealing with those challenges, and providing a powerful and flexible way of building Internet-scale public data management systems. We introduce our approach based on a triple storage on top of a DHT overlay system, based on the ideas of a universal relation model and RDF, outline solved challenges and open issues, and present usage as well as demonstration aspects of the platform

    A web-based approach to engineering adaptive collaborative applications

    Get PDF
    Current methods employed to develop collaborative applications have to make decisions and speculate about the environment in which the application will operate within, the network infrastructure that will be used and the device type the application will operate on. These decisions and assumptions about the environment in which collaborative applications were designed to work are not ideal. These methods produce collaborative applications that are characterised as being inflexible, working on homogeneous networks and single platforms, requiring pre-existing knowledge of the data and information types they need to use and having a rigid choice of architecture. On the other hand, future collaborative applications are required to be flexible; to work in highly heterogeneous environments; be adaptable to work on different networks and on a range of device types. This research investigates the role that the Web and its various pervasive technologies along with a component-based Grid middleware can play to address these concerns. The aim is to develop an approach to building adaptive collaborative applications that can operate on heterogeneous and changing environments. This work proposes a four-layer model that developers can use to build adaptive collaborative applications. The four-layer model is populated with Web technologies such as Scalable Vector Graphics (SVG), the Resource Description Framework (RDF), Protocol and RDF Query Language (SPARQL) and Gridkit, a middleware infrastructure, based on the Open Overlays concept. The Middleware layer (the first layer of the four-layer model) addresses network and operating system heterogeneity, the Group Communication layer enables collaboration and data sharing, while the Knowledge Representation layer proposes an interoperable RDF data modelling language and a flexible storage facility with an adaptive architecture for heterogeneous data storage. And finally there is the Presentation and Interaction layer which proposes a framework (Oea) for scalable and adaptive user interfaces. The four layer model has been successfully used to build a collaborative application, called Wildfurt that overcomes challenges facing collaborative applications. This research has demonstrated new applications for cutting-edge Web technologies in the area of building collaborative applications. SVG has been used for developing superior adaptive and scalable user interfaces that can operate on different device types. RDF and RDFS, have also been used to design and model collaborative applications providing a mechanism to define classes and properties and the relationships between them. A flexible and adaptable storage facility that is able to change its architecture based on the surrounding environments and requirements has also been achieved by combining the RDF technology with the Open Overlays middleware, Gridkit

    Research issues in peer-to-peer data management

    Get PDF
    Data management in Peer-to-Peer (P2P) systems is a complicated and challenging issue due to the scale of the network and highly transient population of peers. In this paper, we identify important research problems in P2P data management, and describe briefly some methods that have appeared in the literature addressing those problems. We also discuss some open research issues and directions regarding data management in P2P systems. ©2007 IEEE

    Modular P2P-Based Approach for RDF Data Storage and Retrieval

    Get PDF
    International audienceOne of the key elements of the Semantic Web is the Resource Description Framework (RDF). Efficient storage and retrieval of RDF data in large scale settings is still challenging and existing solutions are monolithic and thus not very flexible from a software engineering point of view. In this paper, we propose a modular system, based on the scalable Content-Addressable Network (CAN), which gives the possibility to store and retrieve RDF data in large scale settings. We identified and isolated key components forming such system in our design architecture. We have evaluated our system using the Grid'5000 testbed over 300 peers on 75 machines and the outcome of these micro-benchmarks show interesting results in terms of scalability and concurrent queries

    Data Sharing in P2P Systems

    Get PDF
    To appear in Springer's "Handbook of P2P Networking"In this chapter, we survey P2P data sharing systems. All along, we focus on the evolution from simple file-sharing systems, with limited functionalities, to Peer Data Management Systems (PDMS) that support advanced applications with more sophisticated data management techniques. Advanced P2P applications are dealing with semantically rich data (e.g. XML documents, relational tables), using a high-level SQL-like query language. We start our survey with an overview over the existing P2P network architectures, and the associated routing protocols. Then, we discuss data indexing techniques based on their distribution degree and the semantics they can capture from the underlying data. We also discuss schema management techniques which allow integrating heterogeneous data. We conclude by discussing the techniques proposed for processing complex queries (e.g. range and join queries). Complex query facilities are necessary for advanced applications which require a high level of search expressiveness. This last part shows the lack of querying techniques that allow for an approximate query answering

    Data Management in the APPA System

    Get PDF
    International audienceCombining Grid and P2P technologies can be exploited to provide high-level data sharing in large-scale distributed environments. However, this combination must deal with two hard problems: the scale of the network and the dynamic behavior of the nodes. In this paper, we present our solution in APPA (Atlas Peer-to-Peer Architecture), a data management system with high-level services for building large-scale distributed applications. We focus on data availability and data discovery which are two main requirements for implementing large-scale Grids. We have validated APPA's services through a combination of experimentation over Grid5000, which is a very large Grid experimental platform, and simulation using SimJava. The results show very good performance in terms of communication cost and response time
    corecore