125 research outputs found

    Practical memory leak detector based on parameterized procedural summaries

    We present a static analyzer that detects memory leaks in C programs. It achieves relatively high accuracy at a relatively low cost on the SPEC2000 benchmarks and several open-source software packages, demonstrating its practicality and competitive edge against other reported analyzers: for a set of benchmarks totaling 1,777 KLOC, it found 332 bugs with 47 additional false positives (a 12.4% false-positive ratio), at an average analysis speed of 720 LOC/sec. We separately analyze each procedure's memory behavior into a summary that is used in analyzing its call sites. Each procedural summary is parameterized by the procedure's call context so that it can be instantiated at different call sites. The information captured in each procedural summary has been carefully tuned so that the summary does not lose any common memory-leak-related behaviors of real-world C programs. Because each procedure is summarized by conventional fixpoint iteration over the abstract semantics (à la abstract interpretation), the analyzer naturally handles arbitrary call cycles arising from direct or indirect recursive calls.
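    The parameterized-summary idea described above can be sketched roughly as follows; the names (`Summary`, `instantiate`) and the three effect sets are illustrative assumptions, not the paper's actual abstraction:

    ```python
    # Hypothetical sketch of a parameterized procedural summary for leak analysis.
    # A summary records a procedure's memory effects in terms of its formals and
    # is instantiated at each call site by binding formals to actuals.
    from dataclasses import dataclass, field

    @dataclass
    class Summary:
        """Memory behavior of one procedure, parameterized by its formals."""
        allocates: set = field(default_factory=set)  # formals aliasing fresh memory
        frees: set = field(default_factory=set)      # formals whose memory is freed
        escapes: set = field(default_factory=set)    # formals stored beyond the caller's view

    def instantiate(summary, binding):
        """Rewrite a summary in terms of the caller's actual arguments."""
        return Summary(
            allocates={binding[f] for f in summary.allocates if f in binding},
            frees={binding[f] for f in summary.frees if f in binding},
            escapes={binding[f] for f in summary.escapes if f in binding},
        )

    # A procedure `dispose` frees its formal "p"; at a call site dispose(buf),
    # the instantiated summary tells the caller that `buf` is now freed.
    dispose = Summary(frees={"p"})
    at_call = instantiate(dispose, {"p": "buf"})
    print(at_call.frees)  # {'buf'}
    ```

    Because the same summary can be re-instantiated under any binding, one fixpoint computation per procedure serves every call site, which is the cost advantage the abstract claims.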

    Evidence-enabled verification for the Linux kernel

    Formal verification of large software has been an elusive target, riddled with problems of low accuracy and high computational complexity. With growing dependence on software in embedded and cyber-physical systems, where vulnerabilities and malware can lead to disasters, efficient and accurate verification has become a crucial need. The verification should be rigorous, computationally efficient, and automated enough to keep the human effort within reasonable limits, but it does not have to be completely automated. The automation should actually enable and simplify human cross-checking, which is especially important when the stakes are high. Unfortunately, formal verification methods work mostly as automated black boxes with very little support for cross-checking. This thesis is about a different way to approach the software verification problem: creating a powerful fusion of automation and human intelligence by incorporating algorithmic innovations that advance the state of the art for accurate and scalable software verification where complete automation has remained intractable. The key is a mathematically rigorous notion of verification-critical evidence that the machine abstracts from software to empower humans to reason with. The algorithmic innovation is to discover the patterns developers have applied to manage complexity and to leverage them; pattern-based verification is crucial because the problem is otherwise intractable. We call the overall approach Evidence-Enabled Verification (EEV). This thesis presents EEV with two challenging applications: (1) EEV for Lock/Unlock Pairing, to verify the correct pairing of mutex locks and spin locks with their corresponding unlocks on all feasible execution paths, and (2) EEV for Allocation/Deallocation Pairing, to verify the correct pairing of memory allocations with their corresponding deallocations on all feasible execution paths.
    We applied the EEV approach to verify recent versions of the Linux kernel. The results include a comparison with the state-of-the-art Linux Driver Verification (LDV) tool, the effectiveness of the proposed visual models as verification-critical evidence, representative examples of verification, the discovered bugs, and limitations of the proposed approach.

    Efficient and linear static approach for finding the memory leak in C

    Code analysis has shown that memory leaks are common in the C programming language. In the literature, various approaches exist for statically analyzing and detecting memory leaks. The complexity and diversity of memory leaks make it difficult to find an approach that is both effective and simple. In embedded systems, costly resources like memory become limited as the system's size diminishes; as a result, memory must be handled both effectively and efficiently. To obtain a precise analysis, we propose a novel approach that works in a phase-wise manner. Instead of examining all possible paths to find memory leaks, we use program slicing to check for a potential memory leak. We introduce a source-sink flow graph (SSFG) based on the source-sink properties of memory allocation and deallocation within C code. To keep the analysis simple, we also reduce its complexity to linear time. In addition, we utilize a constraint solver to improve the effectiveness of the approach. To evaluate the approach, we perform manual scanning on various test cases: linked-list applications, the Juliet test cases, and Common Vulnerabilities and Exposures reported in 2021. The results show the efficiency of the proposed approach, which prepares the SSFG with linear complexity.
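    A minimal sketch of the source-sink intuition behind such an approach (the graph encoding, node names, and reachability check are illustrative assumptions, not the paper's algorithm): allocation sites are sources, deallocation sites are sinks, and a source with no path to any sink is a leak candidate.

    ```python
    # Toy source-sink leak check over a flow graph given as an adjacency dict.
    # A real SSFG would be built from sliced C code; here the graph is hand-made.

    def leak_candidates(edges, sources, sinks):
        """Return the sources that cannot reach any sink along `edges`."""
        leaks = []
        for src in sources:
            seen, stack = set(), [src]
            while stack:  # simple DFS reachability
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                stack.extend(edges.get(node, []))
            if not seen & set(sinks):
                leaks.append(src)
        return leaks

    # malloc at n1 flows to a free at n3; malloc at n4 never reaches a free.
    edges = {"n1": ["n2"], "n2": ["n3"], "n4": ["n5"]}
    print(leak_candidates(edges, sources=["n1", "n4"], sinks=["n3"]))  # ['n4']
    ```

    Each source is checked independently with one linear graph traversal, which is consistent with the linear-complexity claim in the abstract.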

    Empirical study of inter-procedural data flow (IDF) patterns for memory leak analysis in Linux

    Analysis of inter-procedural data flow (IDF) is a commonly encountered challenge in verifying safety and security properties of large software. A pragmatic way to address this challenge is to identify IDF patterns that are known to occur in practice and to develop algorithms that detect and handle those patterns correctly. We perform an empirical study to gather the IDF patterns in Linux, which is essential to support such a pragmatic approach. In our study, we first analyzed the Linux code to study how a reference to dynamically allocated memory in a function flows out of that function; we analyzed instances of memory allocation and identified 6 IDF patterns. Second, we mined and analyzed memory leak bug fixes from the Linux git repository. Third, we surveyed the literature for static analysis tools that can detect memory leaks. Based on these studies, we found that the IDF patterns associated with the memory leak bug fixes in Linux, as well as those detectable by current static analysis tools, form a subset of the 6 IDF patterns we identified.
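    To illustrate the kind of inter-procedural escape such a study classifies, here is a toy matcher over a simplified statement form. The three pattern names are generic examples of how an allocated reference can flow out of a function; they are not the paper's actual 6-pattern taxonomy:

    ```python
    # Illustrative classifier: given a simplified statement list for one function,
    # report how the freshly allocated variable escapes the function.

    def escape_patterns(stmts, alloc_var):
        """stmts: ("return", v) | ("store_global", g, v) | ("store_param", p, v)."""
        patterns = set()
        for s in stmts:
            if s[0] == "return" and s[1] == alloc_var:
                patterns.add("returned to caller")
            elif s[0] == "store_global" and s[2] == alloc_var:
                patterns.add("stored in global")
            elif s[0] == "store_param" and s[2] == alloc_var:
                patterns.add("stored via output parameter")
        return patterns

    # `buf` is written through an output parameter and also returned.
    stmts = [("store_param", "out", "buf"), ("return", "buf")]
    print(sorted(escape_patterns(stmts, "buf")))
    # ['returned to caller', 'stored via output parameter']
    ```

    A leak detector that only tracks the "returned to caller" pattern would miss the output-parameter flow here, which is the kind of coverage gap the empirical study measures.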

    Human-centric verification for software safety and security

    Software forms a critical part of our lives today. Verifying software to avoid violations of safety and security properties is a necessary task. It is also imperative to have assurance that the verification process was correct. We propose a human-centric approach to software verification, which involves enabling human-machine collaboration to detect vulnerabilities and to prove the correctness of the verification. We discuss two classes of vulnerabilities. The first class is Algorithmic Complexity Vulnerabilities (ACVs), a class of software security vulnerabilities that cause denial-of-service attacks. The description of an ACV is not known a priori; the problem is equivalent to searching for a needle in a haystack without knowing what the needle looks like. We present a novel approach to detect ACVs in web applications, illustrated with a case study audit from DARPA's Space/Time Analysis for Cybersecurity (STAC) program. The second class of vulnerabilities is memory leaks. Although the description of the memory leak (ML) problem is known, a proof of the correctness of the verification is needed to establish trust in the results. We present an approach, inspired by the works of Alan Perlis, that computes evidence of the verification which a human can scrutinize to prove the verification correct. We present a novel abstraction, the Evidence Graph, that succinctly captures the verification evidence, and we show how to compute it. We evaluate our approach against ML instances in the Linux kernel and report improvement over state-of-the-art results. We also present two case studies illustrating how the Evidence Graph can be used to prove the correctness of the verification.
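    A rough sketch of the evidence idea for allocation/deallocation pairing (the graph encoding and function names are illustrative assumptions, not the thesis's Evidence Graph definition): slice the control flow down to the nodes relevant to one allocation, and record any exit reachable without passing a free, so a human can audit the verdict.

    ```python
    # Toy evidence computation over an acyclic CFG given as an adjacency dict.

    def evidence(cfg, alloc, frees, exits):
        """Collect the nodes reachable from `alloc` (the evidence slice) and the
        exits reachable without passing through a free (the leak evidence)."""
        relevant, leaky_exits = set(), set()

        def walk(node, freed):
            relevant.add(node)
            if node in exits and not freed:
                leaky_exits.add(node)
            for nxt in cfg.get(node, []):  # assumes an acyclic CFG
                walk(nxt, freed or nxt in frees)

        walk(alloc, False)
        return relevant, leaky_exits

    # The success path frees the allocation; the error path exits without freeing.
    cfg = {"alloc": ["err", "ok"], "ok": ["free"], "free": ["ret"], "err": ["ret"]}
    rel, leaks = evidence(cfg, "alloc", frees={"free"}, exits={"ret"})
    print(leaks)  # {'ret'} -- reached unfree'd via the error path
    ```

    The point of the slice is auditability: a reviewer sees only the five nodes that matter to this allocation, not the whole kernel CFG.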

    Man-machine partial program analysis for malware detection

    With the meteoric rise in popularity of the Android platform, there is an urgent need to combat the accompanying proliferation of malware. Existing work addresses consumer malware detection, but cannot detect novel, sophisticated, domain-specific malware that targets one specific aspect of an organization (e.g., ground operations of the US military). Adversaries can exploit domain knowledge to camouflage malice within the legitimate behaviors of an app and behind a domain-specific trigger, rendering traditional approaches such as signature matching, machine learning, and dynamic monitoring ineffective. Manual code inspections are also inadequate, scaling poorly and introducing human error. Yet there is a dire need to detect this kind of malware before it causes catastrophic loss of life and property. This dissertation presents the Security Toolbox, our novel solution to this challenging new problem posed by DARPA's Automated Program Analysis for Cybersecurity (APAC) program. We employ a human-in-the-loop approach to amplify the natural intelligence of our analysts. Our automation detects interesting program behaviors and exposes them in an analysis Dashboard, allowing the analyst to brainstorm flaw hypotheses and ask new questions, which in turn can be answered by our automated analysis primitives. The Security Toolbox is built on top of Atlas, a novel program analysis platform made by EnSoft. Atlas uses a graph-based mathematical abstraction of software to produce a unified property multigraph, exposes a powerful API for writing analyzers using graph traversals, and provides both automated and interactive capabilities to facilitate program comprehension. The Security Toolbox is also powered by FlowMiner, a novel solution for mining fine-grained, compact data-flow summaries of Java libraries.
    FlowMiner allows the Security Toolbox to perform a scalable and accurate partial program analysis of an application without including all of the libraries it uses (e.g., Android). This dissertation presents the Security Toolbox, Atlas, and FlowMiner. We provide empirical evidence of the effectiveness of the Security Toolbox for detecting novel, sophisticated, domain-specific Android malware, demonstrating that our approach outperforms other cutting-edge research tools and state-of-the-art commercial programs in both time and accuracy metrics. We also evaluate the effectiveness of Atlas as a program analysis platform and FlowMiner as a library summary tool.

    Computing homomorphic program invariants

    Program invariants are properties that are true at a particular program point or points. They are often undocumented assertions made by a programmer that hold the key to reasoning correctly about a software verification task. Unlike contemporary research, in which program invariants are defined to hold for all control flow paths, we propose homomorphic program invariants, which hold with respect to a relevant equivalence class of control flow paths. For a problem-specific task, homomorphic program invariants can form stricter assertions. This work demonstrates that computing homomorphic program invariants is both useful and practical. Toward our goal of computing homomorphic program invariants, we deal with the challenge of the astronomical number of paths in programs. Since reasoning about a class of program paths must be efficient in order to scale to real-world programs, we extend prior work to efficiently divide program paths into equivalence classes with respect to control flow events of interest. Our technique reasons about inter-procedural paths, which we then use to determine how to modify a program binary to abort execution at the start of an irrelevant program path. With off-the-shelf components, we employ state-of-the-art fuzzing and dynamic invariant detection tools to mine homomorphic program invariants. To aid the task of identifying likely software anomalies, we develop human-in-the-loop analysis methodologies and a toolbox of human-centric static analysis tools. We present work on performing a statically-informed dynamic analysis to transition efficiently from static analysis to dynamic analysis and leverage the strengths of each approach. To evaluate our approach, we apply our techniques to three case study audits of challenge applications from DARPA's Space/Time Analysis for Cybersecurity (STAC) program.
    In the final case study, we discover an unintentional vulnerability that causes a denial of service (DoS) in space and time, despite the challenge application having been hardened against static and dynamic analysis techniques.
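    The path-equivalence idea above can be sketched as a simple projection: paths that agree on the control-flow events of interest fall into the same class, so a homomorphic invariant need only hold per class. The event names and encoding here are illustrative, not the dissertation's representation:

    ```python
    # Toy partition of execution paths into equivalence classes with respect
    # to a set of relevant control-flow events.
    from collections import defaultdict

    def partition_paths(paths, relevant_events):
        """Group paths by their subsequence of relevant events."""
        classes = defaultdict(list)
        for path in paths:
            key = tuple(e for e in path if e in relevant_events)
            classes[key].append(path)
        return dict(classes)

    paths = [
        ("entry", "lock", "work", "unlock", "exit"),
        ("entry", "log", "lock", "unlock", "exit"),
        ("entry", "work", "exit"),
    ]
    # For a locking property, only lock/unlock events matter: the first two
    # paths are equivalent, and the third forms the "irrelevant" class.
    classes = partition_paths(paths, {"lock", "unlock"})
    print(sorted(classes))  # [(), ('lock', 'unlock')]
    ```

    An analysis can then abort or skip the paths in the empty-projection class, which mirrors the abstract's step of modifying the binary to cut off irrelevant paths.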

    Selectively Sensitive Static Analysis by Impact Pre-analysis and Machine Learning

    Doctoral dissertation, Department of Electrical and Computer Engineering, Seoul National University, August 2017. Advisor: Kwangkeun Yi. This dissertation presents methods for achieving, as far as possible, the three axes that determine the performance of a static analysis: soundness, precision, and scalability. Static analysis offers a variety of precision-improving techniques, but applying them indiscriminately makes the analysis severely slow or causes it to miss too much of the real execution semantics. The core of this thesis is a technique for selecting only those places where such precise-but-costly techniques are truly needed. First, we present an impact pre-analysis, another static analysis that predicts which parts of a program need a precision-improving technique; guided by the pre-analysis results, the main analysis applies the technique selectively and therefore efficiently. Second, we present a machine-learning approach that learns from past analysis results to make this selection even more efficient; the training data is obtained automatically from the results of running the pre-analysis and the main analysis on a set of training programs in advance. We applied the proposed methods to a static analyzer for real C source code and experimentally demonstrated their effectiveness.
    Contents: 1. Introduction (Goal; Solution; Outline). 2. Preliminaries (Program; Collecting Semantics; Abstract Semantics). 3. Selectively X-sensitive Analysis by Impact Pre-Analysis (Informal Description; Program Representation; Selective Context-Sensitive Analysis with Context-Sensitivity Parameter K; Impact Pre-Analysis for Finding K; Application to Selective Relational Analysis; Experiments; Summary). 4. Selectively X-sensitive Analysis by Learning Data Generated by Impact Pre-Analysis (Octagon Analysis with Variable Clustering; Learning a Strategy for Clustering Variables; Experiments; Summary). 5. Selectively Unsound Analysis by Machine Learning (Parameterized Static Analysis; Learning a Classifier; Instance Analyses: Interval Analysis and Taint Analysis; Experiments; Summary). 6. Related Work (Parametric Static Analysis; Goal-directed Static Analysis; Data-driven Static Analysis; Context-sensitivity and Relational Analysis; Unsoundness in Static Analysis). 7. Conclusion.
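    The selective-sensitivity idea can be caricatured in a few lines: a cheap pre-analysis guesses which call sites could actually benefit from context sensitivity, and only those are analyzed context-sensitively. This toy heuristic (flag callees invoked with differing abstract arguments) is an illustrative stand-in for the dissertation's impact pre-analysis, not its actual algorithm:

    ```python
    # Toy pre-analysis: a call site is selected for context sensitivity only if
    # its callee is invoked with differing abstract arguments across sites;
    # otherwise separating contexts cannot change the callee's abstract input.

    def pre_analysis(call_sites):
        """call_sites: site -> (callee, abstract_argument). Return selected sites."""
        by_callee = {}
        for _site, (callee, arg) in call_sites.items():
            by_callee.setdefault(callee, set()).add(arg)
        return {site for site, (callee, _arg) in call_sites.items()
                if len(by_callee[callee]) > 1}

    call_sites = {
        "s1": ("id", "interval_0_0"),
        "s2": ("id", "interval_1_9"),  # same callee, different argument: worth splitting
        "s3": ("log", "top"),          # single calling pattern: context-insensitive is enough
    }
    print(sorted(pre_analysis(call_sites)))  # ['s1', 's2']
    ```

    The main analysis would then distinguish calling contexts only at the selected sites, spending the cost of precision where the pre-analysis predicts an impact.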

    Tools and Algorithms for the Construction and Analysis of Systems

    This book is Open Access under a CC BY licence. The LNCS 11427 and 11428 proceedings set constitutes the proceedings of the 25th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2019, which took place in Prague, Czech Republic, in April 2019, held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019. The 42 full papers and 8 short tool demo papers presented in these volumes were carefully reviewed and selected from 164 submissions. The papers are organized in the following topical sections. Part I: SAT and SMT, SAT solving and theorem proving; verification and analysis; model checking; tool demos; and machine learning. Part II: concurrent and distributed systems; monitoring and runtime verification; hybrid and stochastic systems; synthesis; symbolic verification; and safety and fault-tolerant systems.