
    Distributed GraphLab: A Framework for Machine Learning in the Cloud

    While high-level data-parallel frameworks, like MapReduce, simplify the design and implementation of large-scale data processing systems, they do not naturally or efficiently support many important data mining and machine learning algorithms and can lead to inefficient learning systems. To help fill this critical void, we introduced the GraphLab abstraction, which naturally expresses asynchronous, dynamic, graph-parallel computation while ensuring data consistency and achieving a high degree of parallel performance in the shared-memory setting. In this paper, we extend the GraphLab framework to the substantially more challenging distributed setting while preserving strong data consistency guarantees. We develop graph-based extensions to pipelined locking and data versioning to reduce network congestion and mitigate the effect of network latency. We also introduce fault tolerance to the GraphLab abstraction using the classic Chandy-Lamport snapshot algorithm and demonstrate how it can be easily implemented by exploiting the GraphLab abstraction itself. Finally, we evaluate our distributed implementation of the GraphLab abstraction on a large Amazon EC2 deployment and show 1-2 orders of magnitude performance gains over Hadoop-based implementations.
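    The fault-tolerance point above relies on the Chandy-Lamport marker rule: a node records its own state when it receives the first marker, then logs any messages still in flight on each inbound channel until that channel's marker arrives. The plain-Java sketch below illustrates only that rule; the class and channel names are hypothetical and this is not the GraphLab implementation.

```java
import java.util.*;

// Minimal sketch of one node's role in a Chandy-Lamport snapshot.
public class SnapshotNode {
    private final String id;
    private int state = 0;                                  // local state being snapshotted
    private Integer recordedState = null;                   // captured on the first marker
    private final Map<String, List<Integer>> channelLog = new HashMap<>(); // in-flight messages
    private final Set<String> markerSeenFrom = new HashSet<>();
    private final Set<String> inboundChannels;

    SnapshotNode(String id, Set<String> inboundChannels) {
        this.id = id;
        this.inboundChannels = inboundChannels;
        inboundChannels.forEach(c -> channelLog.put(c, new ArrayList<>()));
    }

    void onMessage(String fromChannel, int payload) {
        state += payload;                                    // apply the message to local state
        // Messages arriving after we snapshotted, but before that channel's marker,
        // belong to the channel state of the snapshot.
        if (recordedState != null && !markerSeenFrom.contains(fromChannel)) {
            channelLog.get(fromChannel).add(payload);
        }
    }

    void onMarker(String fromChannel) {
        if (recordedState == null) {
            recordedState = state;                           // first marker: record local state
            // a real node would now forward markers on all outbound channels
        }
        markerSeenFrom.add(fromChannel);
    }

    boolean snapshotComplete() {
        return recordedState != null && markerSeenFrom.containsAll(inboundChannels);
    }

    public static void main(String[] args) {
        SnapshotNode n = new SnapshotNode("v1", Set.of("v0->v1", "v2->v1"));
        n.onMessage("v0->v1", 5);
        n.onMarker("v0->v1");                                // snapshot starts: recordedState = 5
        n.onMessage("v2->v1", 3);                            // in flight on v2->v1, gets logged
        n.onMarker("v2->v1");
        System.out.println(n.id + " recorded=" + n.recordedState
                + " channels=" + n.channelLog + " complete=" + n.snapshotComplete());
    }
}
```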

    On Improving Distributed Pregel-like Graph Processing Systems

    The considerable interest in distributed systems that can execute algorithms to process large graphs has led to the creation of many graph processing systems. However, existing systems suffer from two major issues: (1) poor performance due to frequent global synchronization barriers and limited scalability; and (2) lack of support for graph algorithms that require serializability, the guarantee that parallel executions of an algorithm produce the same results as some serial execution of that algorithm. Many graph processing systems use the bulk synchronous parallel (BSP) model, which allows graph algorithms to be easily implemented and reasoned about. However, BSP suffers from poor performance due to stale messages and frequent global synchronization barriers. While asynchronous models have been proposed to alleviate these overheads, existing systems that implement such models have limited scalability or retain frequent global barriers and do not always support graph mutations or algorithms with multiple computation phases. We propose barrierless asynchronous parallel (BAP), a new computation model that overcomes the limitations of existing asynchronous models by reducing both message staleness and global synchronization while retaining support for graph mutations and algorithms with multiple computation phases. We present GiraphUC, which implements our BAP model in the open source distributed graph processing system Giraph, and evaluate it at scale to demonstrate that BAP provides efficient and transparent asynchronous execution of algorithms that are programmed synchronously. Secondly, very few systems provide serializability, despite the fact that many graph algorithms require it for accuracy, correctness, or termination. To address this deficiency, we provide a complete solution that can be implemented on top of existing graph processing systems to provide serializability. Our solution formalizes the notion of serializability and the conditions under which it can be provided for graph processing systems. We propose a partition-based synchronization technique that enforces these conditions efficiently to provide serializability. We implement this technique into Giraph and GiraphUC to demonstrate that it is configurable, transparent to algorithm developers, and more performant than existing techniques.
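    For contrast with the BAP model described above, the self-contained Java sketch below shows the synchronous BSP pattern that Pregel-like systems follow: vertices exchange messages, and a global barrier separates supersteps. It is an illustrative toy PageRank loop under assumed parameters (ten supersteps, 0.85 damping), not Giraph's actual vertex API.

```java
import java.util.*;

// Toy BSP-style PageRank: message exchange, then a global barrier, per superstep.
public class BspPageRank {
    public static void main(String[] args) {
        // tiny directed graph: vertex -> out-neighbours
        Map<Integer, List<Integer>> edges = Map.of(
                0, List.of(1, 2),
                1, List.of(2),
                2, List.of(0));
        Map<Integer, Double> rank = new HashMap<>();
        edges.keySet().forEach(v -> rank.put(v, 1.0 / edges.size()));

        for (int superstep = 0; superstep < 10; superstep++) {
            // Superstep i: every vertex sends rank/outdegree to its out-neighbours.
            Map<Integer, Double> inbox = new HashMap<>();
            for (var e : edges.entrySet()) {
                double share = rank.get(e.getKey()) / e.getValue().size();
                for (int dst : e.getValue()) inbox.merge(dst, share, Double::sum);
            }
            // Global barrier: no vertex starts superstep i+1 until all messages from
            // superstep i are delivered -- the synchronization cost BAP tries to avoid.
            for (int v : edges.keySet()) {
                rank.put(v, 0.15 / edges.size() + 0.85 * inbox.getOrDefault(v, 0.0));
            }
        }
        System.out.println(rank);
    }
}
```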

    Methods to Improve Applicability and Efficiency of Distributed Data-Centric Compute Frameworks

    The success of modern applications depends on the insights they collect from their data repositories. Data repositories for such applications currently exceed exabytes and are rapidly increasing in size, as they collect data from varied sources: web applications, mobile phones, sensors, and other connected devices. Distributed storage and data-centric compute frameworks have been invented to store and analyze these large datasets. This dissertation focuses on extending the applicability and improving the efficiency of distributed data-centric compute frameworks.

    Dynamic re-optimization techniques for stream processing engines and object stores

    Large scale data storage and processing systems are strongly motivated by the need to store and analyze massive datasets. The complexity of a large class of these systems is rooted in their distributed nature, extreme scale, need for real-time response, and streaming nature. The use of these systems in multi-tenant cloud environments with potential resource interference necessitates fine-grained monitoring and control. In this dissertation, we present efficient, dynamic techniques for re-optimizing stream-processing systems and transactional object-storage systems. In the context of stream-processing systems, we present VAYU, a per-topology controller. VAYU uses novel methods and protocols for dynamic, network-aware tuple routing in the dataflow. We show that the feedback-driven controller in VAYU helps achieve high pipeline throughput over long execution periods, as it dynamically detects and diagnoses any pipeline bottlenecks. We present novel heuristics to optimize overlays for group communication operations in the streaming model. In the context of object-storage systems, we present M-Lock, a novel lock-localization service for distributed transaction protocols on scale-out object stores to increase transaction throughput. Lock localization refers to the dynamic migration and partitioning of locks across nodes in the scale-out store to reduce cross-partition acquisition of locks. The service leverages the observed object-access patterns to achieve lock clustering and deliver high performance. We also present TransMR, a framework that uses distributed, transactional object stores to orchestrate and execute asynchronous components in amorphous data-parallel applications on scale-out architectures.
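    As a rough illustration of the lock-localization idea above (migrating locks so that a transaction's footprint lands on fewer partitions), here is a hedged Java sketch. The class and method names are hypothetical and this is not the M-Lock implementation; it simply moves all locks in an observed transaction footprint to the node that already holds most of them.

```java
import java.util.*;

// Illustrative lock localization: cluster co-accessed locks onto one node.
public class LockLocalizer {
    // current lock placement: objectId -> nodeId
    private final Map<String, Integer> lockHome = new HashMap<>();

    LockLocalizer(Map<String, Integer> initialPlacement) {
        lockHome.putAll(initialPlacement);
    }

    /** Migrate all locks in one observed transaction footprint to the node
     *  that already holds the majority of them. */
    void relocalize(Set<String> transactionFootprint) {
        Map<Integer, Long> votes = new HashMap<>();
        for (String obj : transactionFootprint) {
            votes.merge(lockHome.getOrDefault(obj, 0), 1L, Long::sum);
        }
        int target = votes.entrySet().stream()
                .max(Map.Entry.comparingByValue()).orElseThrow().getKey();
        transactionFootprint.forEach(obj -> lockHome.put(obj, target));   // cluster the locks
    }

    int nodesTouched(Set<String> footprint) {
        return (int) footprint.stream().map(lockHome::get).distinct().count();
    }

    public static void main(String[] args) {
        LockLocalizer l = new LockLocalizer(Map.of("a", 1, "b", 2, "c", 1));
        Set<String> txn = Set.of("a", "b", "c");
        System.out.println("partitions before: " + l.nodesTouched(txn));  // 2
        l.relocalize(txn);
        System.out.println("partitions after:  " + l.nodesTouched(txn));  // 1
    }
}
```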

    Building Scalable and Consistent Distributed Databases Under Conflicts

    Distributed databases, which rely on redundant and distributed storage across multiple servers, are able to provide mission-critical data management services at large scale. Parallelism is the key to the scalability of distributed databases, but concurrent queries with conflicts may block or abort each other when strong consistency is enforced using rigorous concurrency control protocols. This thesis studies techniques for building scalable distributed databases under strong consistency guarantees, even in the face of high-contention workloads. The techniques proposed in this thesis share a common idea, conflict mitigation: mitigating conflicts by rescheduling operations in the concurrency control in the first place instead of resolving contending conflicts. Using this idea, concurrent queries under conflicts can be executed with high parallelism. This thesis explores this idea both on databases that support serializable ACID (atomicity, consistency, isolation, durability) transactions and on eventually consistent NoSQL systems. First, the epoch-based concurrency control (ECC) technique is proposed in ALOHA-KV, a new distributed key-value store that supports high-performance read-only and write-only distributed transactions. ECC demonstrates that concurrent serializable distributed transactions can be processed in parallel with low overhead even under high contention. With ECC, a new atomic commitment protocol is developed that requires only an amortized one round trip for a distributed write-only transaction to commit in the absence of failures. Second, a novel paradigm of serializable distributed transaction processing is developed to extend ECC with read-write transaction processing support. This paradigm uses a newly proposed database operator, the functor, a placeholder for the value of a key that can be computed asynchronously in parallel with other functor computations of the same or other transactions. Functor-enabled ECC achieves more fine-grained concurrency control than transaction-level concurrency control, and it never aborts transactions due to read-write or write-write conflicts, but allows transactions to fail due to logic errors or constraint violations while guaranteeing serializability. Lastly, this thesis explores consistency in the eventually consistent system Apache Cassandra, investigating consistency violations referred to as "consistency spikes". This investigation shows that the consistency spikes exhibited by Cassandra are strongly correlated with garbage collection, particularly the "stop-the-world" phase in the Java virtual machine. Thus, delaying read operations artificially at servers immediately after a garbage collection pause can virtually eliminate these spikes. Altogether, these techniques allow distributed databases to provide a scalable and consistent storage service.
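    To make the epoch-based idea concrete, the hedged Java sketch below buffers write-only transactions and applies them together at an epoch boundary, so writers never block one another and readers observe only closed epochs. All names are hypothetical; this is not the ALOHA-KV implementation.

```java
import java.util.*;
import java.util.concurrent.*;

// Illustrative epoch-based handling of write-only transactions.
public class EpochStore {
    private final Map<String, String> committed = new ConcurrentHashMap<>();
    private final Queue<Map<String, String>> pendingWrites = new ConcurrentLinkedQueue<>();

    /** Buffer a write-only transaction; it becomes visible at the next epoch switch. */
    void submitWriteTxn(Map<String, String> writes) {
        pendingWrites.add(writes);
    }

    /** Epoch boundary: drain and apply every buffered write-only transaction. */
    synchronized void advanceEpoch() {
        Map<String, String> txn;
        while ((txn = pendingWrites.poll()) != null) {
            committed.putAll(txn);
        }
    }

    /** Read-only transactions see only writes from already-closed epochs. */
    String read(String key) {
        return committed.get(key);
    }

    public static void main(String[] args) {
        EpochStore store = new EpochStore();
        store.submitWriteTxn(Map.of("x", "1"));
        store.submitWriteTxn(Map.of("y", "2"));
        System.out.println(store.read("x"));                       // null: epoch not closed yet
        store.advanceEpoch();
        System.out.println(store.read("x") + " " + store.read("y")); // 1 2
    }
}
```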

    A Framework for Parallel Programming of Graph Applications Using Optimistic Concurrency Control

    In this thesis, we present GOPPE, a framework that allows users to develop parallel graph applications by implementing executable tasks as Java classes. These tasks can manipulate graphs of any type through a provided storage API implemented using Redis [1]. Tasks can generate additional subtasks upon completion, which the framework automatically submits for parallel execution. A task can also produce a result object, and the user can define a task output generation procedure that combines this result with the subtasks' outputs to produce a final output. The framework internally uses optimistic concurrency control [2] to execute these tasks, ensuring that data integrity is never violated. Moreover, the storage API is designed to make concurrency control transparent to the user, who appears to have full access to the underlying storage system. A dedicated server, the manager server, orchestrates the parallel execution of tasks in a distributed manner by delegating requests to other registered servers, called node servers. The manager server can accept requests to register or unregister node servers during runtime. All servers were developed in Java 11 as independent Spring Boot [3] applications and were built using Docker [4] for ease of deployment. External application programs can submit requests to the manager server to execute tasks and receive their output, or to generate tasks for all nodes in multiple graphs. Each node server implements a service that executes tasks and generates task outputs using multiple threads. Node servers execute tasks as independent transactions and use a dedicated service to validate each transaction and commit it if validation succeeds. When transaction conflicts are detected, the transaction is aborted and the task is executed again in a new transaction. Finally, we present example use cases and evaluate their performance through a set of experiments. We conclude that the framework provides considerable performance improvements as we add node servers, even when data contention is high. The total execution time is eventually dominated by time spent in the transaction validation and commit phases, since they are executed by one node server at a time.
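    The validate-and-commit cycle described above follows the standard optimistic concurrency control pattern: a task runs against the store while recording its read and write sets, and at commit time the read versions are re-checked; on conflict the task is aborted and re-run. The hedged Java sketch below illustrates that pattern only; the names are hypothetical and it is not GOPPE's Redis-backed storage API.

```java
import java.util.*;

// Illustrative optimistic concurrency control with read-set validation at commit.
public class OccStore {
    private final Map<String, String> values = new HashMap<>();
    private final Map<String, Long> versions = new HashMap<>();

    static class Txn {
        final Map<String, Long> readSet = new HashMap<>();      // key -> version read
        final Map<String, String> writeSet = new HashMap<>();   // buffered writes
    }

    synchronized String read(Txn t, String key) {
        t.readSet.put(key, versions.getOrDefault(key, 0L));
        return t.writeSet.getOrDefault(key, values.get(key));   // read-your-own-writes
    }

    void write(Txn t, String key, String value) {
        t.writeSet.put(key, value);                              // buffered until commit
    }

    /** Validate-and-commit: succeeds only if nothing read has changed since it was read. */
    synchronized boolean commit(Txn t) {
        for (var r : t.readSet.entrySet()) {
            if (!versions.getOrDefault(r.getKey(), 0L).equals(r.getValue())) {
                return false;                                    // conflict: caller re-runs the task
            }
        }
        for (var w : t.writeSet.entrySet()) {
            values.put(w.getKey(), w.getValue());
            versions.merge(w.getKey(), 1L, Long::sum);
        }
        return true;
    }

    public static void main(String[] args) {
        OccStore store = new OccStore();
        Txn a = new Txn(); Txn b = new Txn();
        store.read(a, "counter"); store.write(a, "counter", "1");
        store.read(b, "counter"); store.write(b, "counter", "2");
        System.out.println(store.commit(a));   // true
        System.out.println(store.commit(b));   // false: b read a stale version and would be retried
    }
}
```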