11 research outputs found

    Scalable storage for a DBMS using transparent distribution

    Scalable Distributed Data Structures (SDDSs) provide self-managing and self-organizing data storage of potentially unbounded size. This stands in contrast to the common distribution schemas deployed in conventional distributed DBMSs. SDDSs, however, have mostly been used in synthetic scenarios to investigate their properties. In this paper we concentrate on the integration of the LH* SDDS into our efficient and extensible DBMS, called Monet. We show that this merge permits processing very large sets of distributed data. In our implementation we extended the relational algebra interpreter in such a way that access to data, whether it is distributed or locally stored, is transparent to the user. The on-the-fly optimization of operations, which is heavily used in Monet to deploy different strategies and scenarios inside the primary operators associated with an SDDS, adds self-adaptiveness to the query system: it dynamically adapts itself to unforeseen situations. We illustrate the performance efficiency by experiments on a network of workstations. The transparent integration of SDDSs opens new perspectives for very large self-managing database systems.
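
The abstract above names LH* but does not spell out its addressing. As a rough illustration only, the sketch below shows the classic linear-hashing address rule that LH* builds on: hash with the current level, and rehash with the next level if the bucket has already been split in this round. The function and parameter names (`lh_address`, `level`, `split_pointer`, `initial_buckets`) are hypothetical and are not taken from the paper or from Monet.

```python
def lh_address(key: int, level: int, split_pointer: int, initial_buckets: int = 1) -> int:
    """Return the bucket address of `key` in a linear-hashing file.

    Assumed file state (illustrative names, not the paper's):
      level          -- current file level i
      split_pointer  -- index n of the next bucket to be split
    """
    # h_i(key) = key mod (N * 2^i)
    address = key % (initial_buckets * 2 ** level)
    if address < split_pointer:
        # This bucket was already split in the current round,
        # so the key is placed by the next-level hash h_{i+1}.
        address = key % (initial_buckets * 2 ** (level + 1))
    return address


if __name__ == "__main__":
    # File at level 2 with bucket 0 already split (split pointer at 1).
    for key in (5, 8, 12):
        print(key, "->", lh_address(key, level=2, split_pointer=1))
```

In a distributed setting each client would presumably apply this rule to its own, possibly outdated, view of the level and split pointer, with servers forwarding misdirected requests; that part is not shown here.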

    Distributed Search Trees: Fault Tolerance in an Asynchronous Environment

    We propose a distributed dictionary that allows insert and search operations and that tolerates arbitrary single server crashes. The distinctive feature of our model is that the crash of a server cannot be detected. This is in contrast to all other proposals of distributed fault-tolerant search structures presented thus far. It reflects the real situation on the Internet more accurately, and is in general better suited to complex overall conditions. This makes our solution fundamentally different from all previous ones, but also more complicated. We present in detail the algorithms for searching, insertion, and graceful recovery of crashed servers.

    Distributed search trees: Fault tolerance in an asynchronous environment

    ISSN: 1432-4350; ISSN: 1433-049

    A self-organizing access structure for P2P information systems

    Peer-To-Peer systems are driving a major paradigm shift in the era of genuinely distributed computing. Gnutella is a good example of a Peer-To-Peer success story: rather simple software enables Internet users to freely exchange files, such as MP3 music files. But it also exposes some of the limitations of current P2P information systems with respect to their ability to manage data efficiently. In this paper we introduce P-Grid, a scalable access structure that is specifically designed for Peer-To-Peer information systems. P-Grids are constructed and maintained using randomized algorithms based strictly on local interactions, provide reliable data access even with unreliable peers, and scale gracefully in both storage and communication cost. Keywords: Peer-To-Peer computing, Distributed Indexing, Distributed Databases, Randomized Algorithms

    Distributing a search tree among a growing number of processors

    No full text

    Distributed data management with the rule language Webdamlog (Gestion des données distribuées avec le langage de règles Webdamlog)

    Our goal is to enable a Web user to easily specify distributed data management tasks in place, i.e., without centralizing the data with a single provider. Our system is therefore not a replacement for Facebook, or any centralized system, but an alternative that allows users to launch their own peers on their machines, processing their own local personal data and possibly collaborating with Web services. We introduce Webdamlog, a datalog-style language for managing distributed data and knowledge. The language extends datalog in a number of ways, notably with a novel feature, delegation, which allows peers to exchange not only facts but also rules. We present a user study that demonstrates the usability of the language. We describe a Webdamlog engine that extends a distributed datalog engine, Bud, with support for delegation and for a number of other novelties of Webdamlog, such as the possibility of having variables denoting peers or relations. We mention novel optimization techniques, notably one based on the provenance of facts and rules. We present experiments demonstrating that the rich features of Webdamlog can be supported at reasonable cost and that the engine scales to large volumes of data. Finally, we discuss the implementation of a Webdamlog peer system that provides an environment for the engine. In particular, a peer supports wrappers to exchange Webdamlog data with non-Webdamlog peers. We illustrate these peers with a photo-sharing application for a social network, written in Webdamlog, that we used for demonstration purposes.

    A Content-Addressable Network for Similarity Search in Metric Spaces

    Because of the ongoing digital data explosion, search paradigms more advanced than traditional exact match are needed for content-based retrieval in huge and ever-growing collections of data produced in application areas such as multimedia, molecular biology, marketing, computer-aided design, and purchasing assistance. As the variety of data types rapidly moves toward databases used directly by people, computer systems must be able to model fundamental human reasoning paradigms, which are naturally based on similarity. The ability to perceive similarities is crucial for recognition, classification, and learning, and it plays an important role in scientific discovery and creativity. Recently, the mathematical notion of metric space has become a useful abstraction of similarity, and many similarity search indexes have been developed. In this thesis, we accept the metric space similarity paradigm and concentrate on the scalability issues. By exploiting computer networks and applying Peer-to-Peer communication paradigms, we build a structured network of computers able to process similarity queries in parallel. Since no centralized entities are used, such architectures are fully scalable. Specifically, we propose a Peer-to-Peer system for similarity search in metric spaces called Metric Content-Addressable Network (MCAN), which is an extension of the well-known Content-Addressable Network (CAN) used for hash lookup. A prototype implementation of MCAN was tested on real-life datasets of image features, protein symbols, and text; the observed results are reported. We also compared the performance of MCAN with three other recently proposed distributed data structures for similarity search in metric spaces.
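
To make the metric space abstraction mentioned above concrete, here is a minimal sketch of a similarity range query: given a query object and a radius, return every object whose distance to the query is within that radius. It is a centralized linear scan, not MCAN or any of the indexes from the thesis, and the choice of edit distance and the names `edit_distance` and `range_query` are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, a standard example of a metric on strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def range_query(data: Iterable[T], query: T, radius: float,
                distance: Callable[[T, T], float]) -> List[T]:
    """Return all objects within `radius` of `query` (naive linear scan)."""
    return [obj for obj in data if distance(query, obj) <= radius]


if __name__ == "__main__":
    words = ["kitten", "sitting", "mitten", "bitten", "dog"]
    print(range_query(words, "kitten", 1, edit_distance))
```

Because only a distance function is required, the same `range_query` would work for image feature vectors or protein sequences given a suitable metric; a distributed index would aim to answer the same query while contacting only the peers whose region can intersect the query ball.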