4 research outputs found

    An Extensible Technique for High-Precision Testing of Recovery Code

    Get PDF
    Thorough testing of software systems requires ways to productively employ fault injection. We describe a technique for automatically identifying the errors exposed by shared libraries, finding good injection targets in program binaries, and producing corresponding injection scenarios. We present a framework for writing precise custom triggers that inject the desired faults--in the form of error return codes and corresponding side effects--at the boundary between shared libraries and applications. We incorporated these ideas in the LFI tool chain. With no developer assistance and no access to source code, this new version of LFI found 11 serious, previously unreported bugs in the BIND name server, the Git version control system, the MySQL database server, and the PBFT replication system. LFI achieved entirely automatically 35%-60% improvement in recovery-code coverage, without requiring any new tests. LFI can be downloaded from http://lfi.epfl.ch

    Uma abordagem para o teste de dependabilidade de sistemas MapReduce com base em casos de falha representativos

    Get PDF
    Resumo: Os sistemas MapReduce facilitam a utilização de um grande número de máquinas para processar uma grande quantidade de dados, e têm sido utilizados por diversas aplicações, que incluem desde ferramentas de pesquisa até sistemas comerciais e financeiros. Uma das principais características dos sistemas MapReduce é abstrair problemas relacionados ao ambiente distribuído, tais como a distribuição do processamento e a tolerância a falhas. Com isso, torna-se imprescindível garantir a dependabilidade dos sistemas MapReduce, ou seja, garantir que esses sistemas funcionem corretamente mesmo na presença de falhas. Por outro lado, a falta de determinismo de um ambiente distribuído e a falta de confiabilidade do ambiente físico, podem gerar erros nos sistemas MapReduce que sejam difíceis de serem encontrados, entendidos e corrigidos. Esta tese apresenta a primeira abordagem conhecida para o teste de dependabilidade para sistemas MapReduce. Este trabalho apresenta uma definição para o teste de dependabilidade, uma modelagem do mecanismo de tolerância a falhas do MapReduce, um processo para gerar casos de falha representativos a partir de um modelo, e uma plataforma de teste para automatizar a execução de casos de falha em um ambiente distribuído. Este trabalho ainda apresenta uma nova abordagem para modelar componentes distribuídos usando redes de Petri. Essa nova abordagem permite representar a dinâmica dos componentes e a independência de suas ações e estados. Resultados experimentais são apresentados e mostram que os casos de falha gerados a partir do modelo são representativos para o teste do sistema Hadoop, principal implementação de código aberto do MapReduce. Através dos experimentos, diversos erros são encontrados no Hadoop, e os resultados também comprovam que a plataforma de teste automatiza a execução dos casos de falha representativos. Além disso, a plataforma apresenta as propriedades requeridas para uma plataforma de teste, que são a controlabilidade, medição temporal, não-intrusividade, repetibilidade, e a eficácia na identificação de sistemas com erros

    Execution Synthesis: A Technique for Automating the Debugging of Software

    Get PDF
    Debugging real systems is hard, requires deep knowledge of the target code, and is time-consuming. Bug reports rarely provide sufficient information for debugging, thus forcing developers to turn into detectives searching for an explanation of how the program could have arrived at the reported failure state. This thesis introduces execution synthesis, a technique for automating this detective work: given a program and a bug report, execution synthesis automatically produces an execution of the program that leads to the reported bug symptoms. Using a combination of static analysis and symbolic execution, the technique “synthesizes” a thread schedule and various required program inputs that cause the bug to manifest. The synthesized execution can be played back deterministically in a regular debugger, like gdb. This is particularly useful in debugging concurrency bugs, because it transforms otherwise non-deterministic bugs into bugs that can be deterministically observed in a debugger. Execution synthesis requires no runtime recording, and no program or hardware modifications, thus incurring no runtime overhead. This makes it practical for use in production systems. This thesis includes a theoretical analysis of execution synthesis as well as empirical evidence that execution synthesis is successful in starting from mere bug reports and reproducing on its own concurrency and memory safety bugs in real systems, taking on the order of minutes. This thesis also introduces reverse execution synthesis, an automated debugging technique that takes a coredump obtained after a failure and automatically computes the suffix of an execution that leads to that coredump. Reverse execution synthesis generates the necessary information to then play back this suffix in a debugger deterministically as many times as needed to complete the debugging process. Since it synthesizes an execution suffix instead of the entire execution, reverse execution is particularly well suited for arbitrarily long executions in which the failure and its root cause occur within a short time span, so developers can use a short execution suffix to debug the problem. The thesis also shows how execution synthesis can be combined with recording techniques in order to automatically classify data races and to efficiently debug deadlock bugs

    Techniques for Identifying Elusive Corner-Case Bugs in Systems Software

    Get PDF
    Modern software is plagued by elusive corner-case bugs (e.g., security bugs). Because there are no scalable, automated ways of finding them, such bugs can remain hidden until software is deployed in production. This thesis proposes approaches to solve this problem. First, we present black-box and white-box fault injection mechanisms, which allow developers to test the behavior of their own code in the presence of failures in external components, e.g., in libraries, in the kernel, or in remote nodes of a distributed system. We describe how to make black-box fault injection more efficient, by prioritizing tests based on their estimated impact. For white-box testing, we proposed and implemented a technique to find Trojan messages in distributed systems, i.e., messages that are accepted as valid by receiver nodes, yet cannot be sent by any correct sender node. We show that Trojan messages can lead to subtle semantic bugs. We used fault injection techniques to find new bugs in systems such as the MySQL database, the Apache HTTP server, the FSP file service protocol suite, and the PBFT Byzantine-fault-tolerant replication library. Testing can find bugs and build confidence in the correctness of a system. However, exhaustive testing is often unfeasible, and therefore testing may not discover all bugs before a system is deployed. In the second part of this thesis, we describe how to automatically harden production systems, reducing the impact of any corner-case bugs missed by testing. We present a framework that reduces the overhead cost of instrumentation tools such as memory error detectors. Lowering the cost enables system developers to use such tools in production to harden their systems, reducing the impact of any remaining corner-case bugs. We used our framework to generate a version of the Linux kernel hardened with Address Sanitizer. Our hardened kernel has most of the benefit of full instrumentation: it detects the same vulnerabilities as full instrumentation (7 out of 11 privilege escalation exploits from 2013-2014 can be detected using instrumentation tools). Yet, it obtains these benefits at only a quarter of the overhead
    corecore