2,489 research outputs found

    A Survey on the Evolution of Stream Processing Systems

    Full text link
    Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. This survey provides a comprehensive overview of fundamental aspects of stream processing systems and their evolution in the functional areas of out-of-order data management, state management, fault tolerance, high availability, load management, elasticity, and reconfiguration. We review noteworthy past research findings, outline the similarities and differences between early ('00-'10) and modern ('11-'18) streaming systems, and discuss recent trends and open problems.Comment: 34 pages, 15 figures, 5 table

    The End of Slow Networks: It's Time for a Redesign

    Full text link
    Next generation high-performance RDMA-capable networks will require a fundamental rethinking of the design and architecture of modern distributed DBMSs. These systems are commonly designed and optimized under the assumption that the network is the bottleneck: the network is slow and "thin", and thus needs to be avoided as much as possible. Yet this assumption no longer holds true. With InfiniBand FDR 4x, the bandwidth available to transfer data across network is in the same ballpark as the bandwidth of one memory channel, and it increases even further with the most recent EDR standard. Moreover, with the increasing advances of RDMA, the latency improves similarly fast. In this paper, we first argue that the "old" distributed database design is not capable of taking full advantage of the network. Second, we propose architectural redesigns for OLTP, OLAP and advanced analytical frameworks to take better advantage of the improved bandwidth, latency and RDMA capabilities. Finally, for each of the workload categories, we show that remarkable performance improvements can be achieved

    Byzantine state machine replication for the masses

    Get PDF
    Tese de doutoramento, Informática (Ciência da Computação), Universidade de Lisboa, Faculdade de Ciências, 2018The state machine replication technique is a popular approach for building Byzantine fault-tolerant services. However, despite the widespread adoption of this paradigm for crash fault-tolerant systems, there are still few examples of this paradigm for real Byzantine fault-tolerant systems. Our view of this situation is that there is a lack of robust implementations of Byzantine fault-tolerant state machine replication middleware, and that the performance penalty is too high, specially for geo-replication. These hindrances are tightly coupled to the distributed protocols used for enforcing such resilience. This thesis has the objective of finding methodologies for enhancing robustness and performance of state machine replication systems. The first contribution is Mod-SMaRt, a modular protocol that preserves optimal latency in terms of the communications steps exchanged among processes. By being a modular protocol, it becomes simpler to validate and implement, thus resulting in greater robustness; by also preserving optimal message-exchanges among processes, the protocol is capable of delivering desirable performance. The second contribution is concerned with implementing Mod-SMaRt into BFTSMART, a reliable and high-performance codebase that was maintained and improved over the entire course of the PhD that offers multicore-awareness, reconfiguration support, and a flexible API. The third contribution presents WHEAT, a protocol derived from Mod-SMaRt that uses optimizations shown to be effective in reducing latency via a practical evaluation conducted in a geo distributed environment. We additionally conducted an evaluation of both BFT-SMART and WHEAT applied to a relational database middleware and an ordering service for a permissioned blockchain platform. These evaluations revealed encouraging results for both systems and validated our work conducted in the geo-distributed context.A técnica de replicação máquina de estados é um paradigma popular usado em vários sistemas distribuídos modernos. No entanto, apesar da adoção deste paradigma em sistemas reais tolerantes a faltas por paragem, ainda existem poucos exemplos de sistemas reais tolerantes a faltas bizantinas. Segundo a nossa experiência nesta área de investigação, isto deve-se ao fato de existirem poucas concretizações robustas para replicação máquina de estados tolerante a faltas bizantinas, assim como uma perda de desempenho demasiado elevada, especialmente em ambientes geo-replicados. A razão fundamental para a existência destes obstáculos vem dos protocolos distribuídos necessários para assegurar este tipo de resiliência. Esta tese tem como objetivo explorar metodologias para a robustez e eficiência da replicação máquina de estados. A primeira contribuição da tese é o algoritmo Mod-SMaRt, um protocolo modular que preserva latência ótima em termos de passos de comunicação executados pelos processos. Sendo um protocolo modular, torna-se mais simples de validar e concretizar, o que resulta em maior robustez; ao preservar troca de mensagens ótima entre processos, também é capaz de entregar um desempenho desejável. A segunda contribuição consiste em concretizar o protocolo Mod SMaRt na ferramenta BFT-SMART, uma biblioteca fiável de alto desempenho, mantida e melhorada ao longo de todo o período correspondente ao doutoramento, capaz de suportar arquiteturas multi-núcleo, reconfiguração do grupo de réplicas, e uma API de programação flexível. A terceira contribuição consiste em um protocolo derivado do Mod-SMaRt designado WHEAT, que usa otimizações que demostraram serem eficientes na redução da latência segundo uma avaliação prática em ambiente geo-replicado. Adicionalmente, foram também realizadas avaliações de ambos os protocolos quando aplicados num middleware para base de dados relacionais, e num serviço de ordenação para uma plataforma blockchain. Ambas as avaliações revelam resultados encorajadores para ambos os sistemas e validam o trabalho realizado em contexto geo-distribuído.Projeto IRCoC (PTDC/EEI-SCR/6970/2014); Comissão Europeia, FP7 (Seventh Framework Programme for Research and Technological Development), projetos FP7/2007-2013, ICT-25724

    Master of Science

    Get PDF
    thesisEfficient movement of massive amounts of data over high-speed networks at high throughput is essential for a modern-day in-memory storage system. In response to the growing needs of throughput and latency demands at scale, a new class of database systems was developed in recent years. The development of these systems was guided by increased access to high throughput, low latency network fabrics, and declining cost of Dynamic Random Access Memory (DRAM). These systems were designed with On-Line Transactional Processing (OLTP) workloads in mind, and, as a result, are optimized for fast dispatch and perform well under small request-response scenarios. However, massive server responses such as those for range queries and data migration for load balancing poses challenges for this design. This thesis analyzes the effects of large transfers on scale-out systems through the lens of a modern Network Interface Card (NIC). The present-day NIC offers new and exciting opportunities and challenges for large transfers, but using them efficiently requires smart data layout and concurrency control. We evaluated the impact of modern NICs in designing data layout by measuring transmit performance and full system impact by observing the effects of Direct Memory Access (DMA), Remote Direct Memory Access (RDMA), and caching improvements such as Intel® Data Direct I/O (DDIO). We discovered that use of techniques such as Zero Copy yield around 25% savings in CPU cycles and a 50% reduction in the memory bandwidth utilization on a server by using a client-assisted design with records that are not updated in place. We also set up experiments that underlined the bottlenecks in the current approach to data migration in RAMCloud and propose guidelines for a fast and efficient migration protocol for RAMCloud

    SoK: Consensus in the Age of Blockchains

    Get PDF
    The core technical component of blockchains is consensus: how to reach agreement among a distributed network of nodes. A plethora of blockchain consensus protocols have been proposed---ranging from new designs, to novel modifications and extensions of consensus protocols from the classical distributed systems literature. The inherent complexity of consensus protocols and their rapid and dramatic evolution makes it hard to contextualize the design landscape. We address this challenge by conducting a systematization of knowledge of blockchain consensus protocols. After first discussing key themes in classical consensus protocols, we describe: (i) protocols based on proof-of-work; (ii) proof-of-X protocols that replace proof-of-work with more energy-efficient alternatives; and (iii) hybrid protocols that are compositions or variations of classical consensus protocols. This survey is guided by a systematization framework we develop, to highlight the various building blocks of blockchain consensus design, along with a discussion on their security and performance properties. We identify research gaps and insights for the community to consider in future research endeavours

    Acquisition of computer research equipment

    Get PDF
    Issued as Final report, Project no. G-36-61
    corecore