642 research outputs found

    Aikido: Accelerating shared data dynamic analyses

    Get PDF
    Despite a burgeoning demand for parallel programs, the tools available to developers working on shared-memory multicore processors have lagged behind. One reason for this is the lack of hardware support for inspecting the complex behavior of these parallel programs. Inter-thread communication, which must be instrumented for many types of analyses, may occur with any memory operation. To detect such thread communication in software, many existing tools require the instrumentation of all memory operations, which leads to significant performance overheads. To reduce this overhead, some existing tools resort to random sampling of memory operations, which introduces false negatives. Unfortunately, neither of these approaches provide the speed and accuracy programmers have traditionally expected from their tools. In this work, we present Aikido, a new system and framework that enables the development of efficient and transparent analyses that operate on shared data. Aikido uses a hybrid of existing hardware features and dynamic binary rewriting to detect thread communication with low overhead. Aikido runs a custom hypervisor below the operating system, which exposes per-thread hardware protection mechanisms not available in any widely used operating system. This hybrid approach allows us to benefit from the low cost of detecting memory accesses with hardware, while maintaining the word-level accuracy of a software-only approach. To evaluate our framework, we have implemented an Aikido-enabled vector clock race detector. Our results show that the Aikido enabled race-detector outperforms existing techniques that provide similar accuracy by up to 6.0x, and 76% on average, on the PARSEC benchmark suite.National Science Foundation (U.S.) (NSF grant CCF-0832997)National Science Foundation (U.S.) (DOE SC0005288)United States. Defense Advanced Research Projects Agency (DARPA HR0011-10- 9-0009

    ScALPEL: A Scalable Adaptive Lightweight Performance Evaluation Library for application performance monitoring

    Get PDF
    As supercomputers continue to grow in scale and capabilities, it is becoming increasingly difficult to isolate processor and system level causes of performance degradation. Over the last several years, a significant number of performance analysis and monitoring tools have been built/proposed. However, these tools suffer from several important shortcomings, particularly in distributed environments. In this paper we present ScALPEL, a Scalable Adaptive Lightweight Performance Evaluation Library for application performance monitoring at the functional level. Our approach provides several distinct advantages. First, ScALPEL is portable across a wide variety of architectures, and its ability to selectively monitor functions presents low run-time overhead, enabling its use for large-scale production applications. Second, it is run-time configurable, enabling both dynamic selection of functions to profile as well as events of interest on a per function basis. Third, our approach is transparent in that it requires no source code modifications. Finally, ScALPEL is implemented as a pluggable unit by reusing existing performance monitoring frameworks such as Perfmon and PAPI and extending them to support both sequential and MPI applications.Comment: 10 pages, 4 figures, 2 table

    Source level debugging of dynamically translated programs

    Get PDF
    The capability to debug a program at the source level is useful and often indispensable. Debuggers usesophisticated techniques to provide a source view of a program, even though what is executing on the hard-ware is machine code. Debugging techniques evolve with significant changes in programming languagesand execution environments. Recently, software dynamic translation (SDT) has emerged as a new execu-tion mechanism. SDT inserts a run-time software layer between the program and the host machine, provid-ing flexibility in execution and program monitoring. Increasingly popular technologies that use thismechanism include dynamic optimization, dynamic instrumentation, security checking, binary translation,and host machine virtualization. However, the run-time program modifications in a SDT environment posesignificant challenges to a source level debugger. Currently debugging techniques do not exist for softwaredynamic translators. This thesis is the first to provide techniques for source level debugging of dynamically translatedprograms. The thesis proposes a novel debugging framework, called Tdb, that addresses the difficult chal-lenge of maintaining and providing source level information for programs whose binary code changes asthe program executes. The proposed framework has a number of important features. First, it does notrequire or induce changes in the program being debugged. In other words, programs are debugged is theirdeployment environment. Second, the framework is portable and can be applied to virtually any SDT sys-tem. The framework requires minimal changes to an SDT implementation, usually just a few lines of code.Third, the framework can be integrated with existing debuggers, such as Gdb, and does not require changesto these debuggers. This improves usability and adoption, eliminating the learning curve associated with anew debugging environment. Finally, the proposed techniques are efficient. The runtime overhead of thedebugged programs is low and comparable to that of existing debuggers. Tdb's techniques have been implemented for three different dynamic translators, on two differenthardware platforms. The experimental results demonstrate that source level debugging of dynamicallytranslated programs is feasible, and our implemented systems are portable, usable, and efficient

    Using Execution Transactions To Recover From Buffer Overflow Attacks

    Get PDF
    We examine the problem of containing buffer overflow attacks in a safe and efficient manner. Briefly, we automatically augment source code to dynamically catch stack and heap-based buffer overflow and underflow attacks, and recover from them by allowing the program to continue execution. Our hypothesis is that we can treat each code function as a transaction that can be aborted when an attack is detected, without affecting the application's ability to correctly execute. Our approach allows us to selectively enable or disable components of this defensive mechanism in response to external events, allowing for a direct tradeoff between security and performance. We combine our defensive mechanism with a honeypot-like configuration to detect previously unknown attacks and automatically adapt an application's defensive posture at a negligible performance cost, as well as help determine a worm's signature. The main benefits of our scheme are its low impact on application performance, its ability to respond to attacks without human intervention, its capacity to handle previously unknown vulnerabilities, and the preservation of service availability. We implemented a stand-alone tool, DYBOC, which we use to instrument a number of vulnerable applications. Our performance benchmarks indicate a slow-down of 20% for Apache in full-protection mode, and 1.2% with partial protection. We validate our transactional hypothesis via two experiments: first, by applying our scheme to 17 vulnerable applications, successfully fixing 14 of them; second, by examining the behavior of Apache when each of 154 potentially vulnerable routines are made to fail, resulting in correct behavior in 139 of cases

    Handling Multi-Versioning in LLVM: Code Tracking and Cloning

    Get PDF
    International audienceInstrumentation by sampling, adaptive computing and dynamic optimization can be efficiently implemented using multiple versions of a code region. Ideally, compilers should automatically handle the generation of such multiple versions. In this work we discuss the problem of multi-versioning in the situation where each version requires a different intermediate representation. We expose the limits of nowadays compilers regarding these aspects and provide our solutions to overcome them, using the LLVM compiler as our research platform. The paper is focused on three main aspects: tracking code in LLVM IR, cloning, and communication between low-level and high-level representations. Aiming at performance and minimal impact on the behavior of the original code, we describe our strategies to guide the interaction between the newly inserted code and the optimization passes, from annotating code using metadata to inlining assembly code in LLVM IR. Our target is performing code instrumentation and optimization, with an interest in loops. We build a version in x86_64 assembly code to acquire low-level information, and multiple versions in LLVM IR for performing high-level code transformations. The selection mechanism consists in callbacks to a generic runtime system. Preliminary results on the SPEC CPU 2006 and the Pointer Intensive Benchmark suite show that our framework has a negligible overhead in most cases, when instrumenting the most time consuming loop nests
    • …
    corecore