
    Inferring Lower Bounds for Runtime Complexity

    We present the first approach to deduce lower bounds for the innermost runtime complexity of term rewrite systems (TRSs) automatically. Inferring lower runtime bounds is useful to detect bugs and to complement existing techniques that compute upper complexity bounds. The key idea of our approach is to generate suitable families of rewrite sequences of a TRS and to find a relation between the length of such a rewrite sequence and the size of the first term in the sequence. We implemented our approach in the tool AProVE and evaluated it in extensive experiments.
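
    To make the inferred relation concrete, consider a textbook TRS (not one of the paper's benchmarks) together with a family of rewrite sequences and the lower bound that follows; the notation is a simplified sketch:

        \[
          \mathsf{plus}(\mathsf{0}, y) \to y \qquad
          \mathsf{plus}(\mathsf{s}(x), y) \to \mathsf{s}(\mathsf{plus}(x, y))
        \]
        \[
          \mathsf{plus}(\mathsf{s}^n(\mathsf{0}), \mathsf{0}) \;\to^{\,n+1}\; \mathsf{s}^n(\mathsf{0}),
          \quad\text{so the innermost runtime complexity is in } \Omega(n).
        \]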

    Nyx-Net: Network Fuzzing with Incremental Snapshots

    Coverage-guided fuzz testing (“fuzzing”) has become mainstream and we have observed lots of progress in this research area recently. However, it is still challenging to efficiently test network services with existing coverage-guided fuzzing methods. In this paper, we introduce the design and implementation of Nyx-Net, a novel snapshot-based fuzzing approach that can successfully fuzz a wide range of targets spanning servers, clients, games, and even Firefox’s Inter-Process Communication (IPC) interface. Compared to state-of-the-art methods, Nyx-Net improves test throughput by up to 300x and coverage found by up to 70%. Additionally, Nyx-Net is able to find crashes in two of ProFuzzBench’s targets that no other fuzzer found previously. When using Nyx-Net to play the game Super Mario, Nyx-Net shows speedups of 10-30x compared to existing work. Moreover, Nyx-Net is able to find previously unknown bugs in servers such as Lighttpd, clients such as the MySQL client, and even Firefox’s IPC mechanism—demonstrating the strength and versatility of the proposed approach. Lastly, our prototype implementation was awarded a $20,000 bug bounty for enabling fuzzing on previously unfuzzable code in Firefox and solving a long-standing problem at Mozilla.
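
    The core snapshot idea can be pictured in a few lines. The Target class and its snapshot/restore methods below are hypothetical stand-ins for illustration; Nyx-Net itself snapshots an entire VM through a hypervisor rather than Python objects:

        # Minimal sketch of incremental-snapshot fuzzing of a stateful network target.
        # Target and its snapshot/restore are invented stand-ins, not Nyx-Net's API.
        import random

        class Target:
            def __init__(self):
                self.state = []

            def send(self, msg: bytes):
                self.state.append(msg)            # process one protocol message

            def snapshot(self):
                return list(self.state)           # capture the current state

            def restore(self, snap):
                self.state = list(snap)           # roll back instead of restarting

        def run_case(target, prefix_snap, suffix):
            target.restore(prefix_snap)           # skip re-sending the shared prefix
            for msg in suffix:
                target.send(msg)
            return hash(tuple(target.state))      # stand-in for coverage feedback

        seed = [b"LOGIN user", b"SELECT db", b"QUIT"]
        target = Target()
        for msg in seed[:-1]:
            target.send(msg)
        prefix_snap = target.snapshot()           # incremental snapshot after the prefix

        outcomes = set()
        for _ in range(1000):
            suffix = [bytes(random.randrange(256) for _ in range(4))]
            outcomes.add(run_case(target, prefix_snap, suffix))
        print(len(outcomes), "distinct end states observed")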

    Cupid: Automatic Fuzzer Selection for Collaborative Fuzzing

    Combining the strengths of individual fuzzing methods is an appealing idea to find software faults more efficiently, especially when the computing budget is limited. In prior work, EnFuzz introduced the idea of ensemble fuzzing and devised three heuristics to classify properties of fuzzers in terms of diversity. Based on these heuristics, the authors manually picked a combination of different fuzzers that collaborate. In this paper, we generalize this idea by collecting and applying empirical data from single, isolated fuzzer runs to automatically identify a set of fuzzers that complement each other when executed collaboratively. To this end, we present Cupid, a collaborative fuzzing framework that allows automated, data-driven selection of multiple complementary fuzzers for parallelized and distributed fuzzing. We evaluate the target-independent combination of fuzzers automatically selected by Cupid on Google's fuzzer-test-suite, a collection of real-world binaries, as well as on the synthetic Lava-M dataset. We find that Cupid outperforms two expert-guided, target-specific, hand-picked combinations on Google's fuzzer-test-suite in terms of branch coverage, and improves bug finding on Lava-M by 10%. Most importantly, we improve the latency for obtaining 95% and 99% of the coverage by 90% and 64%, respectively. Furthermore, Cupid reduces the number of CPU hours needed to find a high-performing combination of fuzzers by multiple orders of magnitude compared to an exhaustive evaluation.
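
    As a rough intuition for how isolated-run data can drive the selection, the toy sketch below greedily picks fuzzers whose covered branches complement each other. The coverage sets are invented and the greedy union metric is a simplification, not Cupid's actual selection criterion:

        # Toy greedy selection of complementary fuzzers from single-run coverage data.
        # Coverage sets are made up; the metric is a simplification of the idea only.
        coverage = {
            "fuzzer_a": {1, 2, 3, 4},
            "fuzzer_b": {3, 4, 5, 6},
            "fuzzer_c": {1, 6, 7},
            "fuzzer_d": {2, 7, 8, 9},
        }

        def pick_ensemble(coverage, k):
            chosen, covered = [], set()
            for _ in range(k):
                # pick the fuzzer that adds the most branches not yet covered
                best = max(coverage,
                           key=lambda f: -1 if f in chosen else len(coverage[f] - covered))
                chosen.append(best)
                covered |= coverage[best]
            return chosen, covered

        ensemble, union = pick_ensemble(coverage, k=2)
        print(ensemble, "jointly cover", len(union), "branches")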

    Typestate-Guided Fuzzer for Discovering Use-after-Free Vulnerabilities

    Existing coverage-based fuzzers usually use individual control flow graph (CFG) edge coverage to guide the fuzzing process, which has shown great potential in finding vulnerabilities. However, CFG edge coverage is not effective in discovering vulnerabilities such as use-after-free (UaF). This is because, to trigger UaF vulnerabilities, one needs not only to cover individual edges, but also to traverse some (long) sequence of edges in a particular order, which is challenging for existing fuzzers. To this end, we propose to model UaF vulnerabilities as typestate properties, and develop a typestate-guided fuzzer, named UAFL, for discovering vulnerabilities violating typestate properties. Given a typestate property, we first perform a static typestate analysis to find operation sequences potentially violating the property. Our fuzzing process is then guided by these operation sequences in order to progressively generate test cases that trigger property violations. In addition, we employ an information flow analysis to improve the efficiency of the fuzzing process. We have performed a thorough evaluation of UAFL on 14 widely used real-world programs. The experimental results show that UAFL substantially outperforms state-of-the-art fuzzers, including AFL, AFLFast, FairFuzz, MOpt, Angora, and QSYM, in terms of the time taken to discover vulnerabilities. We have discovered 10 previously unknown vulnerabilities and received 5 new CVEs.
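
    The typestate view of use-after-free can be illustrated with a tiny automaton. The sketch below is illustrative only and is not UAFL's instrumentation, which operates on compiled C/C++ programs:

        # Toy typestate automaton: alloc -> free -> use is a use-after-free violation.
        # Progress along the automaton provides feedback beyond plain CFG edge coverage.
        TRANSITIONS = {
            ("live",  "free"): "freed",
            ("freed", "use"):  "violated",   # use-after-free
            ("freed", "free"): "violated",   # double free
        }
        RANK = {"live": 0, "freed": 1, "violated": 2}

        def run_typestate(ops):
            state, progress = "live", 0
            for op in ops:
                state = TRANSITIONS.get((state, op), state)
                progress = max(progress, RANK[state])   # extra fuzzing feedback
            return state, progress

        print(run_typestate(["alloc", "use", "free", "use"]))   # ('violated', 2)
        print(run_typestate(["alloc", "use", "free"]))          # ('freed', 1)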

    Automatically proving termination and memory safety for programs with pointer arithmetic

    While automated verification of imperative programs has been studied intensively, proving termination of programs with explicit pointer arithmetic fully automatically was still an open problem. To close this gap, we introduce a novel abstract domain that can track allocated memory in detail. We use it to automatically construct a symbolic execution graph that over-approximates all possible runs of a program and that can be used to prove memory safety. This graph is then transformed into an integer transition system, whose termination can be proved by standard techniques. We implemented this approach in the automated termination prover AProVE and demonstrate its capability of analyzing C programs with pointer arithmetic that existing tools cannot handle.
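
    To illustrate the final step of such a pipeline, consider a loop like "while (*p != 0) p++;" scanning a zero-terminated buffer. Abstracting the pointer and the end of its allocated block to integers (with the bound supplied by the memory-safety information) yields a transition of roughly the following shape; this is a simplified sketch, not AProVE's exact output:

        \[
          \tau:\;\; \mathit{end} - p > 0 \;\wedge\; p' = p + 1 \;\wedge\; \mathit{end}' = \mathit{end}
        \]
        \[
          \text{The measure } f(p, \mathit{end}) = \mathit{end} - p
          \text{ decreases by } 1 \text{ in every step and is bounded below by } 0,
          \text{ so } \tau \text{ terminates.}
        \]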

    Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

    This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. As what we believe to be the most extensive unified cybersecurity safety benchmark to date, CyberSecEval provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their level of compliance when asked to assist in cyberattacks. Through a case study involving seven models from the Llama 2, Code Llama, and OpenAI GPT large language model families, CyberSecEval effectively pinpointed key cybersecurity risks. More importantly, it offered practical insights for refining these models. A significant observation from the study was the tendency of more advanced models to suggest insecure code, highlighting the critical need for integrating security considerations into the development of sophisticated LLMs. With its automated test case generation and evaluation pipeline, CyberSecEval covers a broad scope and equips LLM designers and researchers with a tool to broadly measure and enhance the cybersecurity safety properties of LLMs, contributing to the development of more secure AI systems.
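
    The flavor of an insecure-code check can be sketched as a pattern scan over model completions. The patterns and the generate() stub below are invented for illustration and are not CyberSecEval's actual rule set or API:

        # Toy insecure-code detector in the spirit of an LLM coding-safety benchmark.
        # Patterns and the generate() stub are illustrative, not CyberSecEval's rules.
        import re

        INSECURE_PATTERNS = {
            "weak hash":       re.compile(r"\bhashlib\.md5\b"),
            "shell injection": re.compile(r"subprocess\.\w+\(.*shell=True"),
            "eval of input":   re.compile(r"\beval\("),
        }

        def generate(prompt: str) -> str:
            # Stand-in for a call to an LLM; returns a canned completion here.
            return "import hashlib\npassword_hash = hashlib.md5(pw.encode()).hexdigest()\n"

        def score_completion(prompt: str) -> list[str]:
            code = generate(prompt)
            return [name for name, pat in INSECURE_PATTERNS.items() if pat.search(code)]

        print(score_completion("Store the user's password"))   # ['weak hash']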

    Analyzing program termination and complexity automatically with AProVE

    In this system description, we present the tool AProVE for automatic termination and complexity proofs of Java, C, Haskell, Prolog, and rewrite systems. In addition to classical term rewrite systems (TRSs), AProVE also supports rewrite systems containing built-in integers (int-TRSs). To analyze programs in high-level languages, AProVE automatically converts them to (int-)TRSs. Then, a wide range of techniques is employed to prove termination and to infer complexity bounds for the resulting rewrite systems. The generated proofs can be exported to check their correctness using automatic certifiers. To use AProVE in software construction, we present a corresponding plug-in for the popular Eclipse software development environment.
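
    As a small illustration of the intermediate format, a loop such as "while (x > 0) x = x - 2;" could be translated into an int-TRS rule of roughly the following shape; the notation is simplified and is not AProVE's exact syntax:

        \[
          \mathsf{eval}(x) \to \mathsf{eval}(x - 2) \quad [\, x > 0 \,]
        \]
        \[
          \text{Termination follows because } x \text{ decreases and is bounded below;}
          \text{ the derivation height is at most } \lceil x/2 \rceil,
          \text{ i.e. linear runtime complexity.}
        \]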

    Algorithmic improvements for feedback-driven fuzzing

    Randomized testing, or "fuzzing", is the process of feeding pseudo-random inputs to a program in order to trigger faults. In recent years, this approach has uncovered a large number of bugs and security vulnerabilities. However, program constructs such as specific fixed magic values, highly structured input formats, or checksums still pose a problem: the random mutations that fuzzers apply to inputs are unable to bridge the gap between two different interesting inputs. In this thesis, we present several new methods for efficiently testing programs that contain such problematic structures. The fuzzers developed as part of this thesis have found a large number of bugs and security vulnerabilities in software that is (indirectly) used by almost everyone every day.
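
    A minimal example of such a problematic construct, for a made-up file format: a fixed magic value and a checksum that random byte mutations almost never satisfy, so coverage-guided fuzzers rarely reach the code behind them:

        # Illustration of why magic values and checksums stall coverage-guided fuzzers:
        # a random byte flip almost never produces the exact tag or a matching sum.
        def parse(data: bytes) -> str:
            if data[:4] != b"FUZZ":                      # fixed magic value
                return "rejected: bad magic"
            if sum(data[4:-1]) % 256 != data[-1]:        # simple checksum over the body
                return "rejected: bad checksum"
            return "deep parsing code reached"           # the code we actually want to test

        print(parse(b"FUZZ" + b"hi" + bytes([sum(b"hi") % 256])))  # reaches the deep code
        print(parse(b"FUZA" + b"hi" + b"\x00"))                    # typical mutated input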

    Learning Input Tokens for Effective Fuzzing

    Modern fuzzing tools like AFL operate at a lexical level: they explore the input space of tested programs one byte after another. For inputs with complex syntactical properties, this is very inefficient, as keywords and other tokens have to be composed one character at a time. Fuzzers therefore allow users to specify dictionaries listing possible tokens the input can be composed from; such dictionaries speed up fuzzers dramatically. Fuzzers also make use of dynamic tainting to track input tokens and infer values that are expected in the input validation phase. Unfortunately, such tokens are usually implicitly converted to program-specific values, which causes a loss of the taints attached to the input data in the lexical phase. In this paper we present a technique that extends dynamic tainting to not only track explicit data flows but also taint implicitly converted data, without suffering from taint explosion. This extension makes it possible to augment existing techniques and automatically infer a set of tokens and seed inputs for the input language of a program, given nothing but the source code. Specifically targeting the lexical analysis of an input processor, our lFuzzer test generator systematically explores branches of the lexical analysis, producing a set of tokens that fully covers all decisions seen. The resulting set of tokens can be used directly as a dictionary for fuzzing. Along with the token extraction, seed inputs are generated that give further fuzzing runs a head start. In our experiments, the lFuzzer-AFL combination achieves up to 17% more coverage on complex input formats like JSON, LISP, tinyC, and JavaScript compared to AFL.
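
    The end product of the token extraction is easy to picture: an AFL-style dictionary of the keywords the lexer compares the input against. The sketch below merely records those comparisons in a hand-written lexer; lFuzzer itself derives them via taint tracking on the program's real lexical phase:

        # Toy illustration of token extraction for a fuzzing dictionary.
        # Here we simply record the keywords a hand-written lexer compares against;
        # a system like lFuzzer obtains them by taint-tracking the target's lexer.
        KEYWORDS = ["null", "true", "false", "{", "}", "[", "]", ":", ","]
        observed = set()

        def lex(src: str):
            tokens, i = [], 0
            while i < len(src):
                if src[i].isspace():
                    i += 1
                    continue
                for kw in KEYWORDS:               # comparisons a taint tracker would see
                    if src.startswith(kw, i):
                        observed.add(kw)
                        tokens.append(kw)
                        i += len(kw)
                        break
                else:
                    tokens.append(src[i])         # everything else: single characters
                    i += 1
            return tokens

        lex('{ "a" : true , "b" : [ null ] }')
        # Emit the observed tokens as an AFL-style dictionary.
        for n, tok in enumerate(sorted(observed)):
            print(f'token_{n}="{tok}"')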

    AProVE: Termination and Memory Safety of C Programs (competition contribution)
