12 research outputs found

    Improving the Performance of Big Data Analytics Platforms by Task and I/O Granularity Adjustment

    Get PDF
    Department of Computer Science and EngineeringWith the massive increase in the amount of semi-structured and unstructured web data, big data analytics platforms have emerged and started to evolve rapidly. Apache Hadoop has been developed for batch processing on a large dataset, and systems for interactive and general purpose applications have been developed alongside NoSQL databases. Numerous efforts have been made to improve the performance of Hadoop and NoSQL databases, including utilizing a new device called NVMM for NoSQL databases. Nonetheless, their performance is still far from satisfactory due to inadequate granularity for tasks and I/O. In this dissertation, we present novel techniques to improve the performance of Apache Hadoop and NVMM-based LSM-tree by adjusting task and I/O granularity. First, we analyze YARN container overhead and present dynamic input split size adjustment scheme, which can logically combine multiple HDFS blocks and increase the input size of each container, thereby enabling a single map wave and reducing the number of containers and their initialization overhead. Experimental results shows that we can avoid recurring container overhead by selecting the right size for input splits and reducing the number of containers. Second, we present a novel HDFS block coalescing scheme that mitigates the YARN con tainer overhead. Our assorted block coalescing scheme combines multiple HDFS blocks and creates large input splits of various sizes, reducing the number of containers and their initializa tion overhead. Our experimental study shows the block coalescing scheme significantly reduces the container overhead while it achieves good load balancing and job scheduling fairness without impairing the degree of overlap between map phase and reduce phase. Third, we discuss design choice of using NVMM for indexing structure in NoSQL databases and present ZipperDB, a key-value store that redesigns LSM-tree for byte-addressable persistent memory. To benefit from the byte-addressability of persistent memory, ZipperDB employs byte addressable persistent SkipLists and performs Zipper Compaction, a novel in-place compaction algorithm that merges two adjacent persistent SkipLists without compromising the failure atomicity. The byte-addressable compaction helps mitigate the write amplification problem, which is known to be the root cause of the write stall problem in LSM-tree. Finally, we present ListDB, a write-optimized key-value store for NVMM to overcome the gap between DRAM and NVMM write latencies and thereby, resolve the write stall problem. ListDB consists of three novel techniques: (i) byte-addressable Index-Unified Logging, which incrementally converts write-ahead logs into SkipLists, (ii) Braided SkipList, a simple NUMA aware SkipList that effectively reduces the NUMA effects of NVMM, and (iii) NUMA-aware Zipper Compaction. Using the three techniques, ListDB makes background flush and com paction fast enough to resolve the infamous write stall problem and shows 1.6x and 25x higher write throughputs than PACTree and Intel Pmem-RocksDB, respectively.ope

    Efficient data reconfiguration for today's cloud systems

    Get PDF
    Performance of big data systems largely relies on efficient data reconfiguration techniques. Data reconfiguration operations deal with changing configuration parameters that affect data layout in a system. They could be user-initiated like changing shard key, block size in NoSQL databases, or system-initiated like changing replication in distributed interactive analytics engine. Current data reconfiguration schemes are heuristics at best and often do not scale well as data volume grows. As a result, system performance suffers. In this thesis, we show that {\it data reconfiguration mechanisms can be done in the background by using new optimal or near-optimal algorithms coupling them with performant system designs}. We explore four different data reconfiguration operations affecting three popular types of systems -- storage, real-time analytics and batch analytics. In NoSQL databases (storage), we explore new strategies for changing table-level configuration and for compaction as they improve read/write latencies. In distributed interactive analytics engines, a good replication algorithm can save costs by judiciously using memory that is sufficient to provide the highest throughput and low latency for queries. Finally, in batch processing systems, we explore prefetching and caching strategies that can improve the number of production jobs meeting their SLOs. All these operations happen in the background without affecting the fast path. Our contributions in each of the problems are two-fold -- 1) we model the problem and design algorithms inspired from well-known theoretical abstractions, 2) we design and build a system on top of popular open source systems used in companies today. Finally, using real-life workloads, we evaluate the efficacy of our solutions. Morphus and Parqua provide several 9s of availability while changing table level configuration parameters in databases. By halving memory usage in distributed interactive analytics engine, Getafix reduces cost of deploying the system by 10 million dollars annually and improves query throughput. We are the first to model the problem of compaction and provide formal bounds on their runtime. Finally, NetCachier helps 30\% more production jobs to meet their SLOs compared to existing state-of-the-art

    Multi-constraint scheduling of MapReduce workloads

    Get PDF
    In recent years there has been an extraordinary growth of large-scale data processing and related technologies in both, industry and academic communities. This trend is mostly driven by the need to explore the increasingly large amounts of information that global companies and communities are able to gather, and has lead the introduction of new tools and models, most of which are designed around the idea of handling huge amounts of data. A good example of this trend towards improved large-scale data processing is MapReduce, a programming model intended to ease the development of massively parallel applications, and which has been widely adopted to process large datasets thanks to its simplicity. While the MapReduce model was originally used primarily for batch data processing in large static clusters, nowadays it is mostly deployed along with other kinds of workloads in shared environments in which multiple users may be submitting concurrent jobs with completely different priorities and needs: from small, almost interactive, executions, to very long applications that take hours to complete. Scheduling and selecting tasks for execution is extremely relevant in MapReduce environments since it governs a job's opportunity to make progress and determines its performance. However, only basic primitives to prioritize between jobs are available at the moment, constantly causing either under or over-provisioning, as the amount of resources needed to complete a particular job are not obvious a priori. This thesis aims to address both, the lack of management capabilities and the increased complexity of the environments in which MapReduce is executed. To that end, new models and techniques are introduced in order to improve the scheduling of MapReduce in the presence of different constraints found in real-world scenarios, such as completion time goals, data locality, hardware heterogeneity, or availability of resources. The focus is on improving the integration of MapReduce with the computing infrastructures in which it usually runs, allowing alternative techniques for dynamic management and provisioning of resources. More specifically, it is focused in three scenarios that are incremental in its scope. First, it studies the prospects of using high-level performance criteria to manage and drive the performance of MapReduce applications, taking advantage of the fact that MapReduce is executed in controlled environments in which the status of the cluster is known. Second, it examines the feasibility and benefits of making the MapReduce runtime more aware of the underlying hardware and the characteristics of applications. And finally, it also considers the interaction between MapReduce and other kinds of workloads, proposing new techniques to handle these increasingly complex environments. Following these three items described above, this thesis contributes to the management of MapReduce workloads by 1) proposing a performance model for MapReduce workloads and a scheduling algorithm that leverages the proposed model and is able to adapt depending on the various needs of its users in the presence of completion time constraints; 2) proposing a new resource model for MapReduce and a placement algorithm aware of the underlying hardware as well as the characteristics of the applications, capable of improving cluster utilization while still being guided by job performance metrics; and 3) proposing a model for shared environments in which MapReduce is executed along with other kinds of workloads such as transactional applications, and a scheduler aware of these workloads and its expected demand of resources, capable of improving resource utilization across machines while observing completion time goals

    Updatable Learned Indexes Meet Disk-Resident DBMS -- From Evaluations to Design Choices

    Full text link
    Although many updatable learned indexes have been proposed in recent years, whether they can outperform traditional approaches on disk remains unknown. In this study, we revisit and implement four state-of-the-art updatable learned indexes on disk, and compare them against the B+-tree under a wide range of settings. Through our evaluation, we make some key observations: 1) Overall, the B+-tree performs well across a range of workload types and datasets. 2) A learned index could outperform B+-tree or other learned indexes on disk for a specific workload. For example, PGM achieves the best performance in write-only workloads while LIPP significantly outperforms others in lookup-only workloads. We further conduct a detailed performance analysis to reveal the strengths and weaknesses of these learned indexes on disk. Moreover, we summarize the observed common shortcomings in five categories and propose four design principles to guide future design of on-disk, updatable learned indexes: (1) reducing the index's tree height, (2) better data structures to lower operation overheads, (3) improving the efficiency of scan operations, and (4) more efficient storage layout.Comment: 22 page

    Towards Lifelong Reasoning with Sparse and Compressive Memory Systems

    Get PDF
    Humans have a remarkable ability to remember information over long time horizons. When reading a book, we build up a compressed representation of the past narrative, such as the characters and events that have built up the story so far. We can do this even if they are separated by thousands of words from the current text, or long stretches of time between readings. During our life, we build up and retain memories that tell us where we live, what we have experienced, and who we are. Adding memory to artificial neural networks has been transformative in machine learning, allowing models to extract structure from temporal data, and more accurately model the future. However the capacity for long-range reasoning in current memory-augmented neural networks is considerably limited, in comparison to humans, despite the access to powerful modern computers. This thesis explores two prominent approaches towards scaling artificial memories to lifelong capacity: sparse access and compressive memory structures. With sparse access, the inspection, retrieval, and updating of only a very small subset of pertinent memory is considered. It is found that sparse memory access is beneficial for learning, allowing for improved data-efficiency and improved generalisation. From a computational perspective - sparsity allows scaling to memories with millions of entities on a simple CPU-based machine. It is shown that memory systems that compress the past to a smaller set of representations reduce redundancy and can speed up the learning of rare classes and improve upon classical data-structures in database systems. Compressive memory architectures are also devised for sequence prediction tasks and are observed to significantly increase the state-of-the-art in modelling natural language

    Semantic stream computing for large data-set analytics

    Get PDF
    One of the most interesting research fields in IT industries is the quest of software able to deal efficiently with Big Data. In this thesis, we have studied Big Data especially about their spread in healthcare, where data are generated at a high rate and the use of computer science can save lives

    Robust Stream Indexing

    Get PDF
    Kontinuierliche Datenströme stehen im Zentrum von vielen anspruchsvollen und komplexen Anwendungen. Neben der Online-Verarbeitung durch ein Datenstromsystem mĂŒssen Datenströme auch langfristig in einer Datenbank gespeichert werden. Moderne Hardware kann Datenströme meist mit sehr hohem Durchsatz und geringer Latenz persistieren. Allerdings mĂŒssen Teile des Datenstroms auch effizient abgerufen werden können, um Wissen aus den Daten zu extrahieren. Jahrzehntelange Forschung hat zu einer unglaublichen Vielfalt an Indexstrukturen gefĂŒhrt, die fĂŒr viele spezifische Anwendungen Anfragekosten reduzieren können. Obwohl die Effizienz von Datenstrom-Indexstrukturen erheblich verbessert wurde, ist die Steigerung ihrer Robustheit nach wie vor eine große Herausforderung, da das kontinuierliche Eintreffen von Daten eine stĂ€ndige Wartung von Indexstrukturen zur Folge hat. Diese Wartung verbraucht Ressourcen, was zu einer geringeren oder schwankenden Leistung von regulĂ€ren EinfĂŒge- und Anfrageoperationen fĂŒhrt. Eine Steigerung der Robustheit kann die Betriebskosten erheblich senken und die Benutzbarkeit verbessern. Das Hauptziel dieser Arbeit ist daher, die Robustheit von Datenstrom-Indexierung zu verbessern. B-BĂ€ume sind gut erforschte und weit verbreitete Indexstrukturen. Da sie ein zentraler Bestandteil vieler Datenbanksysteme sind, hat die Verbesserung der Robustheit von B-BĂ€umen eine weitreichende Wirkung. Wenn kontinuierlich neue Daten in B-BĂ€ume eingefĂŒgt werden, kommt es zur Aufspaltung von Knoten. FĂŒr einen durch Bulk-Loading neu erstellten B-Baum treten diese Aufspaltung in Wellen auf, welche sich auf EinfĂŒgeoperation und Anfragen auswirken. In dieser Arbeit wird gezeigt, dass durch Anpassungen an Bulk-Loading-Algorithmen diese Wellen reduziert oder eliminiert werden können. Auf Datenströme optimierte Indexstrukturen, wie Log-Structured Merge-Trees, vermeiden Wellen von Knotenaufspaltungen, die in B-BĂ€umen auftreten. Da diese Indexstrukturen jedoch aus mehreren Komponenten bestehen, mĂŒssen die Komponenten durch eine Merge-Operation zusammengefĂŒhrt werden, um Anfragekosten gering zu halten. Dies fĂŒhrt zu periodisch auftretender ReorganisationsaktivitĂ€t. Als Alternative wird in dieser Arbeit Continuous Merging vorgestellt. Die Hauptidee ist ein kontinuierlicher Mergesort-Algorithmus, der zu einer robusteren Leistung von Datenstrom-Indexierung fĂŒhrt. Datenstrom-Indexstrukturen sind oft Teil eines komplexeren Datenbanksystems. ChronicleDB ist ein Ereignisdatenbanksystem, welches fĂŒr das Schreiben von zeitlichen Datenströmen optimiert ist. Die Verbesserungen an B-BĂ€umen und Continuous Merging werden mit dem Gesamtdesign von ChronicleDB in Verbindung gebracht. DarĂŒber hinaus werden in dieser Arbeit allgemeine Verbesserungen an ChronicleDB vorgenommen, welche die Besonderheiten von zeitlichen Daten ausnutzen. Die Ergebnisse fĂŒhren zu einem robusteren Ereignisdatenbanksystem

    Mechanisms of 3-dimensional organisation of the human genome

    Get PDF
    corecore