11 research outputs found

    Enhancing Data Processing on Clouds with Hadoop/HBase

    In the current information age, large amounts of data are being generated and accumulated rapidly in various industrial and scientific domains. This imposes important demands on data processing capabilities that can extract sensible and valuable information from the large amount of data in a timely manner. Hadoop, the open source implementation of Google's data processing framework (MapReduce, Google File System and BigTable), is becoming increasingly popular and is being used to solve data processing problems in various application scenarios. However, because it was originally designed for handling very large data sets that can be divided easily into parts and processed independently with limited inter-task communication, Hadoop is poorly suited to a wider range of use cases. As a result, many projects are under way to enhance Hadoop for different application needs, such as data warehousing, machine learning and data mining. This thesis is one such research effort. The goal of the thesis research is to design novel tools and techniques to extend and enhance the large-scale data processing capability of Hadoop/HBase on clouds, and to evaluate their effectiveness in performance tests on prototype implementations. Two main research contributions are described. The first contribution is a light-weight computational workflow system called "CloudWF" for Hadoop. The second contribution is a client library called "HBaseSI" supporting transactional snapshot isolation (SI) in HBase, Hadoop's database component. CloudWF addresses the problem of automating the execution of scientific workflows composed of both MapReduce and legacy applications on clouds with Hadoop/HBase. CloudWF is the first computational workflow system built directly using Hadoop/HBase. It uses novel methods for decomposing workflow directed acyclic graphs, storing and querying dependencies in HBase sparse tables, transparent file staging, and decentralized workflow execution management that relies on the MapReduce framework for task scheduling and fault tolerance. HBaseSI addresses the problem of maintaining strong transactional data consistency in HBase tables. This is the first SI mechanism developed for HBase. HBaseSI uses novel methods that let individual clients handle distributed transaction management autonomously. These methods greatly simplify the design of HBaseSI and can be generalized to other column-oriented stores with an architecture similar to that of HBase. As a result of the simplicity of its design, HBaseSI adds low overhead to HBase performance and directly inherits many desirable properties of HBase. HBaseSI is non-intrusive to existing HBase installations and user data, and is designed to work with a large cloud in terms of data size and the number of nodes in the cloud.

    Enabling distributed key-value stores with low latency-impact snapshot support

    Current distributed key-value stores generally provide greater scalability at the expense of weaker consistency and isolation. However, additional isolation support is becoming increasingly important in the environments in which these stores are deployed, where different kinds of applications with different needs are executed, from transactional workloads to data analytics. While fully-fledged ACID support may not be feasible, it is still possible to take advantage of the design of these data stores, which often include the notion of multiversion concurrency control, to add features to them at a much lower performance cost while maintaining their scalability and availability. In this paper we explore the effects that additional consistency guarantees and isolation capabilities may have on a state-of-the-art key-value store: Apache Cassandra. We propose and implement a new multiversioned isolation level that provides stronger guarantees without compromising Cassandra's scalability and availability. As shown in our experiments, our version of Cassandra allows Snapshot Isolation-like transactions, preserving the overall performance and scalability of the system. This work is partially supported by the Ministry of Science and Technology of Spain and the European Union’s FEDER funds (TIN2007-60625, TIN2012-34557), by the Generalitat de Catalunya (2009-SGR-980), by the BSC-CNS Severo Ochoa program (SEV-2011-00067), by the HiPEAC European Network of Excellence (IST-004408, FP7-ICT-217068, FP7-ICT-287759), and by IBM through the 2008 and 2010 IBM Faculty Award program. Peer Reviewed. Postprint (author’s final draft).

    Multi-Master Replication for Snapshot Isolation Databases

    Lazy replication with snapshot isolation (SI) has emerged as a popular choice for distributed databases. However, lazy replication requires the execution of update transactions at one (master) site so that it is relatively easy for a total SI order to be determined for consistent installation of updates in the lazily replicated system. We propose a set of techniques that support update transaction execution over multiple partitioned sites, thereby allowing the master to scale. Our techniques determine a total SI order for update transactions over multiple master sites without requiring global coordination in the distributed system, and ensure that updates are installed in this order at all sites to provide consistent and scalable replication with SI. We have built our techniques into PostgreSQL and demonstrate their effectiveness through experimental evaluation.

    Consistency Models in Distributed Systems with Physical Clocks

    Most existing distributed systems use logical clocks to order events in the implementation of various consistency models. Although logical clocks are straightforward to implement and maintain, they may affect the scalability, availability, and latency of the system when used to totally order events in strong consistency models. They can also incur considerable overhead when used to track and check the causal relationships among events in some weak consistency models. In this thesis we explore how to efficiently implement different consistency models using loosely synchronized physical clocks. Unlike logical clocks, physical clocks move forward at approximately the same speed and can be loosely synchronized with well-known standard protocols. Hence a group of physical clocks located at different servers can be used to order events in a distributed system at very low cost. We first describe Clock-SI, a fully distributed implementation of snapshot isolation for partitioned data stores. It uses the local physical clock at each partition to assign snapshot and commit timestamps to transactions. By avoiding a centralized service for timestamp management, Clock-SI improves the throughput, latency, and availability of the system. We then introduce Clock-RSM, a low-latency state machine replication protocol that provides linearizability. It totally orders state machine commands by assigning them physical timestamps obtained from the local replica. By eliminating the message step for command ordering in existing solutions, Clock-RSM reduces the latency of consistent geo-replication across multiple data centers. Finally, we present Orbe, which provides an efficient and scalable implementation of causal consistency for both partitioned and replicated data stores. Orbe builds an explicit total order, consistent with causality, among all operations using physical timestamps. It reduces the number of dependencies that have to be carried in update replication messages and checked on installation of replicated updates. As a result, Orbe improves the throughput of the system.

    Multi-constraint scheduling of MapReduce workloads

    In recent years there has been an extraordinary growth of large-scale data processing and related technologies in both industry and academia. This trend is mostly driven by the need to explore the increasingly large amounts of information that global companies and communities are able to gather, and has led to the introduction of new tools and models, most of which are designed around the idea of handling huge amounts of data. A good example of this trend towards improved large-scale data processing is MapReduce, a programming model intended to ease the development of massively parallel applications, which has been widely adopted to process large datasets thanks to its simplicity. While the MapReduce model was originally used primarily for batch data processing in large static clusters, nowadays it is mostly deployed along with other kinds of workloads in shared environments in which multiple users may be submitting concurrent jobs with completely different priorities and needs: from small, almost interactive executions to very long applications that take hours to complete. Scheduling and selecting tasks for execution is extremely relevant in MapReduce environments since it governs a job's opportunity to make progress and determines its performance. However, only basic primitives to prioritize between jobs are available at the moment, constantly causing either under- or over-provisioning, as the amount of resources needed to complete a particular job is not obvious a priori. This thesis aims to address both the lack of management capabilities and the increased complexity of the environments in which MapReduce is executed. To that end, new models and techniques are introduced in order to improve the scheduling of MapReduce in the presence of different constraints found in real-world scenarios, such as completion time goals, data locality, hardware heterogeneity, or availability of resources. The focus is on improving the integration of MapReduce with the computing infrastructures in which it usually runs, allowing alternative techniques for dynamic management and provisioning of resources. More specifically, the thesis focuses on three scenarios of incrementally broader scope. First, it studies the prospects of using high-level performance criteria to manage and drive the performance of MapReduce applications, taking advantage of the fact that MapReduce is executed in controlled environments in which the status of the cluster is known. Second, it examines the feasibility and benefits of making the MapReduce runtime more aware of the underlying hardware and the characteristics of applications. Finally, it considers the interaction between MapReduce and other kinds of workloads, proposing new techniques to handle these increasingly complex environments. Following the three items described above, this thesis contributes to the management of MapReduce workloads by 1) proposing a performance model for MapReduce workloads and a scheduling algorithm that leverages the proposed model and is able to adapt to the various needs of its users in the presence of completion time constraints; 2) proposing a new resource model for MapReduce and a placement algorithm aware of the underlying hardware as well as the characteristics of the applications, capable of improving cluster utilization while still being guided by job performance metrics; and 3) proposing a model for shared environments in which MapReduce is executed along with other kinds of workloads such as transactional applications, and a scheduler aware of these workloads and their expected demand for resources, capable of improving resource utilization across machines while observing completion time goals.

    Migração de aplicações legadas para bases de dados NOSQL (Migrating legacy applications to NoSQL databases)

    Master's dissertation in Informatics Engineering. As a result of the current exponential growth of the Web and its associated data and services, we are witnessing a profound revolution in database management systems. The demand for scalable persistence of large volumes of data has led to the recent emergence of systems such as Dynamo and Cassandra, which are characterized by their consistency and usage models. Based on architectures designed to withstand failure scenarios and on novel data models built to deal with the dynamic nature of Web data, these new systems are today seen as a viable alternative to traditional relational databases. Nonetheless, the change to these new systems does not come without a cost. Their novel data models and APIs demand a radical change of mindset from developers who come from the relational model. They also hand the developer new responsibilities: the absence of transactional guarantees, the loss of relations explicitly expressed in the data, which must now be maintained on the client side, and the control of parameters such as per-operation consistency settings. This dissertation therefore aims to evaluate and propose a solution to the gap between the relational and non-relational paradigms, observing through a concrete case which changes are required in the model and operations. Starting from a relational solution, we assess how the change to a non-relational product affects the development of an actual use case in terms of the model used, implementation complexity and performance. Based on this assessment, we then present an object mapping solution that, while abstracting the underlying data layer, offers the developer an easier way to build scalable applications. Through its development we expect to combine the scalability of a non-relational database with the simple programming interface of object-relational mapping solutions.

    Raphtory: Modelling, Maintenance and Analysis of Distributed Temporal Graphs.

    PhD Theses. Temporal graphs capture the development of relationships within data throughout time. This model fits naturally within a streaming architecture, where new events can be inserted directly into the graph upon arrival from a data source and be compared to related entities or historical state. However, the majority of graph processing systems only consider traditional graph analysis on static data, whilst those which do expand past this often only support batched updating and delta analysis across graph snapshots. In this work we define a temporal property graph model and the semantics for updating it in both a distributed and non-distributed context. We have built Raphtory, a distributed temporal graph analytics platform which maintains the full graph history in memory, leveraging the defined update semantics to insert streamed events directly into the model without batching or centralised ordering. In parallel with the ingestion, traditional and time-aware analytics may be performed on the most up-to-date version of the graph, as well as any point throughout its history. The depth of history viewed from the perspective of a time point may also be varied to explore both short and long term patterns within the data. Through this we extract novel insights over a variety of use cases, including phenomena never seen before in social networks. Finally, we demonstrate Raphtory's ability to scale both vertically and horizontally, handling consistent throughput in excess of 100,000 updates a second alongside the ingestion and maintenance of graphs built from billions of events.