5 research outputs found

    Efficient Processing of Continuous Join Queries using Distributed Hash Tables

    This paper addresses the problem of computing approximate answers to continuous join queries. We present a new method, called DHTJoin, which combines hash-based placement of tuples in a Distributed Hash Table (DHT) with dissemination of queries using a gossip-style protocol. We provide a performance evaluation of DHTJoin which shows that it can achieve significant performance gains in terms of network traffic.
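
The hash-based placement mentioned in this abstract can be illustrated with a minimal sketch. It is not the DHTJoin implementation: the names `node_for` and `ToyHashJoin` are hypothetical, a plain modulo over a fixed node count stands in for real DHT key-space routing, and query dissemination and approximate answers are left out. It only shows why hashing on the join attribute lets matching tuples meet at a single node.

```python
import hashlib
from collections import defaultdict


def node_for(value, num_nodes):
    """Map a join-attribute value to a node index using a stable hash.

    Assumption: modulo placement over a fixed node count, standing in
    for the key-space routing a real DHT would perform.
    """
    digest = hashlib.sha1(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class ToyHashJoin:
    """Equi-join of two streams, R and S, over simulated nodes."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        # One local store per simulated node: relation -> key -> tuples.
        self.stores = [
            {"R": defaultdict(list), "S": defaultdict(list)}
            for _ in range(num_nodes)
        ]

    def insert(self, relation, key, tup):
        """Place a tuple by its join key and return the new join results."""
        store = self.stores[node_for(key, self.num_nodes)]
        other = "S" if relation == "R" else "R"
        # Tuples of the other relation with the same key live on this node.
        matches = list(store[other].get(key, []))
        store[relation][key].append(tup)
        if relation == "R":
            return [(tup, m) for m in matches]
        return [(m, tup) for m in matches]
```

Because both relations are hashed on the same attribute, an arriving tuple only needs to probe the local store of the node it is routed to.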

    Continuous Multi-Way Joins over Distributed Hash Tables

    This paper studies the problem of evaluating continuous multi-way joins on top of Distributed Hash Tables (DHTs). We present a novel algorithm, called recursive join (RJoin), that takes into account various parameters crucial in a distributed setting, such as network traffic, query processing load distribution, and storage load distribution. The key idea of RJoin is incremental evaluation: as relevant tuples arrive, a given multi-way join is continuously rewritten into a join with fewer join operators and assigned to different nodes of the network. In this way, RJoin distributes the responsibility of evaluating a continuous multi-way join across many network nodes by assigning parts of the evaluation of each binary join to a different node, depending on the values of the join attributes. RJoin chooses the nodes involved dynamically, taking into account the rate of incoming tuples whose values equal the values of the joined attributes. RJoin also supports sliding window joins, which is a crucial feature, especially for long join paths, since it provides a mechanism to reduce the query processing state and thus keep the cost of handling incoming tuples stable. In addition, RJoin is able to handle message delays due to heavy network traffic. We present a detailed mathematical and experimental analysis of RJoin and study the performance tradeoffs that occur.
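
The point about sliding windows bounding the query processing state can be sketched as follows. This is a single-node, two-stream illustration under stated assumptions, not RJoin itself: the class `SlidingWindowJoin` and its methods are hypothetical, timestamps are assumed to arrive in non-decreasing order (so the message delays the abstract mentions are not modeled), and the multi-way rewriting step is omitted.

```python
from collections import deque


class SlidingWindowJoin:
    """Two-stream equi-join that keeps only tuples inside a time window."""

    def __init__(self, window):
        self.window = window
        # Per relation: a queue of (timestamp, key, payload), oldest first.
        self.buffers = {"R": deque(), "S": deque()}

    def _evict(self, now):
        """Drop tuples whose timestamp is at or before now - window."""
        for buf in self.buffers.values():
            while buf and buf[0][0] <= now - self.window:
                buf.popleft()

    def insert(self, relation, timestamp, key, payload):
        """Add a tuple and return its matches within the current window."""
        self._evict(timestamp)
        other = "S" if relation == "R" else "R"
        matches = [p for (_, k, p) in self.buffers[other] if k == key]
        self.buffers[relation].append((timestamp, key, payload))
        if relation == "R":
            return [(payload, m) for m in matches]
        return [(m, payload) for m in matches]

    def state_size(self):
        """Number of tuples currently retained across both buffers."""
        return sum(len(buf) for buf in self.buffers.values())
```

Eviction runs before each probe, so the retained state depends on the window length and the arrival rate rather than on the total number of tuples seen so far.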

    DHTJoin: Processing Continuous Join Queries Using DHT Networks

    Continuous query processing in data stream management systems (DSMS) has received considerable attention recently. Many applications share the same need for processing data streams in a continuous fashion. For most distributed streaming applications, the centralized processing of continuous queries over distributed data is simply not viable. This paper addresses the problem of computing approximate answers to continuous join queries over distributed data streams. We present a new method, called DHTJoin, which combines hash-based placement of tuples in a Distributed Hash Table (DHT) with dissemination of queries by exploiting the embedded trees in the underlying DHT, thereby incurring little overhead. DHTJoin also deals with join attribute value skew, which may hurt load balancing and result completeness. We provide a performance evaluation of DHTJoin which shows that it can achieve significant performance gains in terms of network traffic.

    Processing Continuous Queries on Semantic Data over Distributed Hash Tables

    In this work I present the implementation of a system for processing continuous queries on semantic data over distributed hash tables. The system runs on a distributed network and implements the RJoin algorithm for indexing incoming queries to nodes, while also using the CQC and CSBV algorithms for comparison in terms of speed and number of messages. Communication and data transfer between nodes are provided by the distributed hash table system Pastry.