100 research outputs found

    Detecting scale-induced overflow bugs in production HPC codes

    Get PDF

    Understanding Persistent-Memory Related Issues in the Linux Kernel

    Full text link
    Persistent memory (PM) technologies have inspired a wide range of PM-based system optimizations. However, building correct PM-based systems is difficult due to the unique characteristics of PM hardware. To better understand the challenges as well as the opportunities to address them, this paper presents a comprehensive study of PM-related issues in the Linux kernel. By analyzing 1,553 PM-related kernel patches in-depth and conducting experiments on reproducibility and tool extension, we derive multiple insights in terms of PM patch categories, PM bug patterns, consequences, fix strategies, triggering conditions, and remedy solutions. We hope our results could contribute to the development of robust PM-based storage systemsComment: ACM TRANSACTIONS ON STORAGE(TOS'23

    Partial replay of long-running applications

    Get PDF
    Bugs in deployed software can be extremely difficult to track down. Invasive logging techniques, such as logging all non-deterministic inputs, can incur substantial runtime overheads. This paper shows how symbolic analysis can be used to re-create path equivalent executions for very long running programs such as databases and web servers. The goal is to help developers debug such long-running programs by allowing them to walk through an execution of the last few requests or transactions leading up to an error. The challenge is to provide this functionality without the high runtime overheads associated with traditional replay techniques based on input logging or memory snapshots. Our approach achieves this by recording a small amount of information about program execution, such as the direction of branches taken, and then using symbolic analysis to reconstruct the execution of the last few inputs processed by the application, as well as the state of memory before these inputs were executed. We implemented our technique in a new tool called bbr. In this paper, we show that it can be used to replay bugs in long-running single-threaded programs starting from the middle of an execution. We show that bbr incurs low recording overhead (avg. of 10%) during program execution, which is much less than existing replay schemes. We also show that it can reproduce real bugs from web servers, database systems, and other common utilities

    Adonis: Practical and Efficient Control Flow Recovery through OS-Level Traces

    Get PDF
    Control flow recovery is critical to promise the software quality, especially for large-scale software in production environment. However, the efficiency of most current control flow recovery techniques is compromised due to their runtime overheads along with deployment and development costs. To tackle this problem, we propose a novel solution, Adonis, which harnesses OS-level traces, such as dynamic library calls and system call traces, to efficiently and safely recover control flows in practice. Adonis operates in two steps: it first identifies the call-sites of trace entries, then it executes a pair-wise symbolic execution to recover valid execution paths. This technique has several advantages. First, Adonis does not require the insertion of any probes into existing applications, thereby minimizing runtime cost. Second, given that OS-level traces are hardware-independent, Adonis can be implemented across various hardware configurations without the need for hardware-specific engineering efforts, thus reducing deployment cost. Third, as Adonis is fully automated and does not depend on manually created logs, it circumvents additional development cost. We conducted an evaluation of Adonis on representative desktop applications and real-world IoT applications. Adonis can faithfully recover the control flow with 86.8% recall and 81.7% precision. Compared to the state-of-the-art log-based approach, Adonis can not only cover all the execution paths recovered, but also recover 74.9% of statements that cannot be covered. In addition, the runtime cost of Adonis is 18.3× lower than the instrument-based approach; the analysis time and storage cost (indicative of the deployment cost) of Adonis is 50× smaller and 443× smaller than the hardware-based approach, respectively. To facilitate future replication and extension of this work, we have made the code and data publicly available

    DeepWukong: Statically Detecting Software Vulnerabilities Using Deep Graph Neural Network

    Full text link
    Static bug detection has shown its effectiveness in detecting well-defined memory errors, e.g., memory leaks, buffer overflows, and null dereference. However, modern software systems have a wide variety of vulnerabilities. These vulnerabilities are extremely complicated with sophisticated programming logic, and these bugs are often caused by different bad programming practices, challenging existing bug detection solutions. It is hard and labor-intensive to develop precise and efficient static analysis solutions for different types of vulnerabilities, particularly for those that may not have a clear specification as the traditional well-defined vulnerabilities. This article presents DeepWukong, a new deep-learning-based embedding approach to static detection of software vulnerabilities for C/C++ programs. Our approach makes a new attempt by leveraging advanced recent graph neural networks to embed code fragments in a compact and low-dimensional representation, producing a new code representation that preserves high-level programming logic (in the form of control-and data-flows) together with the natural language information of a program. Our evaluation studies the top 10 most common C/C++ vulnerabilities during the past 3 years. We have conducted our experiments using 105,428 real-world programs by comparing our approach with four well-known traditional static vulnerability detectors and three state-of-the-art deep-learning-based approaches. The experimental results demonstrate the effectiveness of our research and have shed light on the promising direction of combining program analysis with deep learning techniques to address the general static code analysis challenges

    Understanding Performance Inefficiencies In Native And Managed Languages

    Get PDF
    Production software packages have become increasingly complex with millions of lines of code, sophisticated control and data flow, and references to a hierarchy of external libraries. This complexity often introduces performance inefficiencies across software stacks, making it practically impossible for users to pinpoint them manually. Performance profiling tools (a.k.a. profilers) abound in the tools community to aid software developers in understanding program behavior. Classical profiling techniques focus on identifying hotspots. The hotspot analysis is indispensable; however, it can hardly diagnose whether a resource is being used in a productive manner that contributes to the overall efficiency of a program. Consequently, a significant burden is on developers to make a judgment call on whether there is scope to optimize a hotspot. Derived metrics, e.g., cache miss ratio, offer slightly better intuition into hotspots but are still not panaceas. Hence, there is a need for profilers that investigate resource wastage instead of usage. To overcome the critical missing pieces in prior work and complement existing profilers, we propose novel fine- and coarse-grained profilers to pinpoint varieties of performance inefficiencies and provide optimization guidance for a wide range of software covering benchmarks, enterprise applications, and large-scale parallel applications running on supercomputers and data centers. Fine-grained profilers are indispensable to understand performance inefficiencies comprehensively. We propose a whole-program profiler called LoadSpy, which works on binary executables to detect and quantify wasteful memory operations in their context and scope. Our observation, which is justified by myriad case studies, is that wasteful memory operations are often an indicator of various forms of performance inefficiencies, such as suboptimal choices of algorithms or data structures, missed compiler optimizations, and developers’ inattention to performance. Guided by LoadSpy, we are able to optimize a large number of well-known benchmarks and real-world applications, yielding significant speedups. Despite deep performance insights offered by fine-grained profilers, the high overhead keeps them away from widespread adoption, particularly in production. By contrast, coarse-grained profilers introduce low overhead at the cost of poor performance insights. Hence, another research topic is how we benefit from both, that is, the combination of deep insights of fine-grained profilers and low overhead of coarse-grained ones. The first effort to do so is proposing a lightweight profiler called JXPerf. It abandons heavyweight instrumentation by combining hardware performance monitoring units and debug registers available in commodity CPUs to detect wasteful memory operations. Compared with LoadSpy, JXPerf reduces the runtime overhead from 10x to 7% on average. The lightweight nature makes it useful in production. Another effort is proposing a lightweight profiler called FVSampler, the first nonintrusive profiler to study function execution variance

    Pinpointing Software Inefficiencies With Profiling

    Get PDF
    Complex codebases with several layers of abstractions have abundant inefficiencies that affect the performance. These inefficiencies arise due to various causes such as developers\u27 inattention to performance, inappropriate choice of algorithms and inefficient code generation among others. To eliminate the redundancies, lots of work has been done during the compiling phase. However, not all redundancies can be easily detected or eliminated with compiler optimization passes due to aliasing, limited optimization scopes, and insensitivity to input and execution contexts act as severe deterrents to static program analysis. There are also profiling tools which can reveal how resources are used. However, they can hard to distinguish whether the resource is worth fully used. More profiling tools are in needed to diagnose resource wastage and pinpoint inefficiencies. We have developed three tools to pinpoint different types of inefficiencies in different granularity. We build Runtime Value Numbering (RVN), a dynamic fine-grained profiler to pinpoint and quantify redundant computations in an execution. It is based on the classical value numbering technique but works at runtime instead of compile-time. We developed RedSpy, a fine-grained profiler to pinpoint and quantify value redundancies in program executions. Value redundancy may happen overtime at the same locations or in adjacent locations, and thus it has temporal and spatial locality. RedSpy identifies both temporal and spatial value locality. Furthermore, RedSpy is capable of identifying values that are approximately the same, enabling optimization opportunities in HPC codes that often use floating-point computations. RVN and RedSpy are both instrumentation based tools. They provide comprehensive result while introducing high space and time overhead. Our lightweight framework, Witch, samples consecutive accesses to the same memory location by exploiting two ubiquitous hardware features: the performance monitoring units (PMU) and debug registers. Witch performs no instrumentation. Hence, witchcraft - tools built atop Witch - can detect a variety of software inefficiencies while introducing negligible slowdown and insignificant memory consumption and yet maintaining accuracy comparable to exhaustive instrumentation tools. Witch allowed us to scale our analysis to a large number of codebases. All the tools work on fully optimized binary executable and provide insightful optimization guidance by apportioning redundancies to their provenance - source lines and full calling contexts. We apply RVN, RedSpy, and Witch on programs that were optimization targets for decades and guided by the tools, we were able to eliminate redundancies that resulted in significant speedups

    Hardware-Assisted Processor Tracing for Automated Bug Finding and Exploit Prevention

    Get PDF
    The proliferation of binary-only program analysis techniques like fuzz testing and symbolic analysis have lead to an acceleration in the number of publicly disclosed vulnerabilities. Unfortunately, while bug finding has benefited from recent advances in automation and a decreasing barrier to entry, bug remediation has received less attention. Consequently, analysts are publicly disclosing bugs faster than developers and system administrators can mitigate them. Hardware-supported processor tracing within commodity processors opens new doors to observing low-level behaviors with efficiency, transparency, and integrity that can close this automation gap. Unfortunately, several trade-offs in its design raise serious technical challenges that have limited widespread adoption. Specifically, modern processor traces only capture control flow behavior, yield high volumes of data that can incur overhead to sift through, and generally introduce a semantic gap between low-level behavior and security relevant events. To solve the above challenges, I propose control-oriented record and replay, which combines concrete traces with symbolic analysis to uncover vulnerabilities and exploits. To demonstrate the efficacy and versatility of my approach, I first present a system called ARCUS, which is capable of analyzing processor traces flagged by host-based monitors to detect, localize, and provide preliminary patches to developers for memory corruption vulnerabilities. ARCUS has detected 27 previously known vulnerabilities alongside 4 novel cases, leading to the issuance of several advisories and official developer patches. Next, I present MARSARA, a system that protects the integrity of execution unit partitioning in data provenance-based forensic analysis. MARSARA prevents several expertly crafted exploits from corrupting partitioned provenance graphs while incurring little overhead compared to prior work. Finally, I present Bunkerbuster, which extends the ideas from ARCUS and MARSARA into a system capable of proactively hunting for bugs across multiple end-hosts simultaneously, resulting in the discovery and patching of 4 more novel bugs.Ph.D
    • …
    corecore