5 research outputs found

    Practical memory leak detector based on parameterized procedural summaries

    We present a static analyzer that detects memory leaks in C programs. It achieves relatively high accuracy at a relatively low cost on SPEC2000 benchmarks and several open-source software packages, demonstrating its practicality and competitive edge against other reported analyzers: for a set of benchmarks totaling 1,777 KLOC, it found 332 bugs with 47 additional false positives (a 12.4% false-positive ratio), and the average analysis speed was 720 LOC/sec. We separately analyze each procedure's memory behavior into a summary that is used in analyzing its call sites. Each procedural summary is parameterized by the procedure's call context so that it can be instantiated at different call sites. What information to capture in each procedural summary has been carefully tuned so that the summary does not lose any common memory-leak-related behaviors in real-world C programs. Because each procedure is summarized by conventional fixpoint iteration over the abstract semantics (à la abstract interpretation), the analyzer naturally handles arbitrary call cycles arising from direct or indirect recursive calls.
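    The summary-based scheme can be sketched in miniature. The code below is a hypothetical toy, not the paper's analyzer: each procedure's memory effect is recorded against symbolic slots ("ret", "arg0", ...) and instantiated at each call site, which is the sense in which a summary is parameterized by its call context. All names (xmalloc, release, ident) are illustrative.

```python
# Minimal sketch (hypothetical, not the paper's implementation): each
# procedure is summarized by the allocation effect on its parameters and
# return value; the summary is instantiated at every call site.
#
# A summary maps symbolic slots ("ret", "arg0", ...) to one of:
#   "alloc" - the slot receives newly allocated memory the caller must free
#   "free"  - the memory bound to the slot is released
#   "none"  - no memory effect
SUMMARIES = {
    "xmalloc": {"ret": "alloc"},
    "release": {"arg0": "free"},
    "ident":   {"ret": "none"},
}

def apply_summary(callee, args, result, live_allocs):
    """Instantiate the callee's parameterized summary at a call site.

    args        : caller-side variable names bound to arg0, arg1, ...
    result      : caller-side variable receiving the return value (or None)
    live_allocs : set of caller variables holding unfreed memory
    """
    summary = SUMMARIES.get(callee, {})
    for slot, effect in summary.items():
        var = result if slot == "ret" else args[int(slot[3:])]
        if effect == "alloc":
            live_allocs.add(var)
        elif effect == "free":
            live_allocs.discard(var)
    return live_allocs

# A caller that allocates through xmalloc but never calls release ends
# the analysis with a non-empty live set, i.e. a reported leak.
leaks = set()
apply_summary("xmalloc", [], "p", leaks)   # p <- xmalloc()
apply_summary("ident", ["p"], "q", leaks)  # q <- ident(p)
print(sorted(leaks))                       # ['p']
```

    In the real analyzer the summary is computed by fixpoint iteration over the abstract semantics, so recursive call cycles need no special casing; the toy above only shows the instantiation step.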

    Efficient and linear static approach for finding the memory leak in C

    Code analysis has shown that memory leaks are common in C programs. The literature offers various approaches for statically analyzing and detecting memory leaks, but the complexity and diversity of leaks make it difficult to find an approach that is both effective and simple. In embedded systems, costly resources such as memory become scarcer as the system's size diminishes, so memory must be handled effectively and efficiently. To obtain a precise analysis, we propose a novel approach that works in a phase-wise manner. Instead of examining all possible paths, we use program slicing to check for potential memory leaks. We introduce a source-sink flow graph (SSFG) based on the source-sink properties of memory allocation and deallocation in C code. To keep the analysis simple, we reduce its complexity to linear time. In addition, we utilize a constraint solver to improve the effectiveness of our approach. To evaluate the approach, we perform manual scanning on various test cases: linked-list applications, Juliet test cases, and common vulnerabilities and exposures reported in 2021. The results show the efficiency of the proposed approach, which prepares the SSFG with linear complexity.
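    The source-sink idea can be illustrated with a small sketch. This is a hypothetical stand-in, not the paper's SSFG construction: statements of the sliced program become graph nodes, malloc sites are marked as sources and free sites as sinks, and any source with no path to a sink is reported as a potential leak. Node names below are illustrative.

```python
# Hypothetical sketch of a source-sink flow check: report every
# allocation site (source) from which no deallocation site (sink) is
# reachable. Each per-source traversal is linear in the graph size.
from collections import deque

def leaks_in_ssfg(edges, sources, sinks):
    """Return sources from which no sink is reachable (potential leaks)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
    leaking = []
    for src in sources:
        seen, work = {src}, deque([src])
        reached = False
        while work:
            n = work.popleft()
            if n in sinks:
                reached = True
                break
            for m in adj.get(n, []):
                if m not in seen:
                    seen.add(m)
                    work.append(m)
        if not reached:
            leaking.append(src)
    return leaking

# p = malloc(...) -> use(p) -> return   (never freed: potential leak)
# q = malloc(...) -> free(q)            (paired: ok)
edges = [("malloc_p", "use_p"), ("use_p", "return"),
         ("malloc_q", "free_q")]
print(leaks_in_ssfg(edges, ["malloc_p", "malloc_q"], {"free_q"}))
# -> ['malloc_p']
```

    Slicing keeps only the statements that affect the allocated pointer, which is what keeps the graph, and hence the traversal, small.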

    Empirical study of inter-procedural data flow (IDF) patterns for memory leak analysis in Linux

    Analysis of inter-procedural data flow (IDF) is a commonly encountered challenge in verifying safety and security properties of large software. A pragmatic way to address this challenge is to identify IDF patterns that are known to occur in practice and develop algorithms to detect and handle those patterns correctly. We perform an empirical study to gather the IDF patterns in Linux, which is essential to support such a pragmatic approach. In our study, we first analyzed the Linux code to study how references to dynamically allocated memory in a function flow out of the function. We analyzed instances of memory allocation and identified 6 IDF patterns. Second, we mined and analyzed memory leak bug fixes from the Linux git repository. Third, we surveyed the literature for static analysis tools that can detect memory leaks. Based on these studies, we found that both the set of IDF patterns associated with the memory leak bug fixes in Linux and the set detectable by current static analysis tools are subsets of the 6 IDF patterns we identified.

    Evidence-enabled verification for the Linux kernel

    Formal verification of large software has been an elusive target, riddled with problems of low accuracy and high computational complexity. With growing dependence on software in embedded and cyber-physical systems, where vulnerabilities and malware can lead to disasters, efficient and accurate verification has become a crucial need. The verification should be rigorous, computationally efficient, and automated enough to keep the human effort within reasonable limits, but it does not have to be completely automated. The automation should actually enable and simplify human cross-checking, which is especially important when the stakes are high. Unfortunately, formal verification methods work mostly as automated black boxes with very little support for cross-checking. This thesis is about a different way to approach the software verification problem. It is about creating a powerful fusion of automation and human intelligence, incorporating algorithmic innovations to address the major challenges and advance the state of the art for accurate and scalable software verification where complete automation has remained intractable. The key is a mathematically rigorous notion of verification-critical evidence that the machine abstracts from software to empower humans to reason with it. The algorithmic innovation is to discover the patterns the developers have applied to manage complexity and to leverage them; pattern-based verification is crucial because the problem is intractable otherwise. We call the overall approach Evidence-Enabled Verification (EEV). This thesis presents EEV with two challenging applications: (1) EEV for Lock/Unlock Pairing, to verify the correct pairing of mutex locks and spin locks with their corresponding unlocks on all feasible execution paths, and (2) EEV for Allocation/Deallocation Pairing, to verify the correct pairing of memory allocations with their corresponding deallocations on all feasible execution paths.
We applied the EEV approach to verify recent versions of the Linux kernel. The results include a comparison with the state-of-the-art Linux Driver Verification (LDV) tool, the effectiveness of the proposed visual models as verification-critical evidence, representative examples of verification, the discovered bugs, and the limitations of the proposed approach.
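    The "pairing on all feasible execution paths" property can be shown on a toy control-flow graph. This sketch is illustrative only, not the EEV tool: it enumerates entry-to-exit paths of an acyclic CFG and reports any path on which an allocation is not followed by a deallocation. All node names are hypothetical.

```python
# Illustrative sketch (not the EEV implementation): check that every
# path from entry to exit pairs an allocation with a deallocation, the
# property the Allocation/Deallocation Pairing instance verifies.
def unpaired_paths(cfg, node, labels, path=()):
    """Yield entry-to-exit paths on which 'alloc' is not followed by 'free'."""
    path = path + (node,)
    if not cfg.get(node):                       # no successors: exit node
        ops = [labels.get(n) for n in path]
        if "alloc" in ops and "free" not in ops[ops.index("alloc"):]:
            yield path
        return
    for succ in cfg[node]:
        yield from unpaired_paths(cfg, succ, labels, path)

# entry -> alloc -> branch: one arm frees, the error arm returns early.
cfg = {"entry": ["alloc"], "alloc": ["ok", "err"],
       "ok": ["free"], "free": ["exit"], "err": ["exit"],
       "exit": []}
labels = {"alloc": "alloc", "free": "free"}
bad = list(unpaired_paths(cfg, "entry", labels))
print(bad)   # the path through 'err' leaks
```

    Path enumeration is exponential in general, which is one reason the thesis leans on developer patterns and human-checkable evidence rather than brute-force exploration.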

    Selectively Sensitive Static Analysis by Impact Pre-analysis and Machine Learning

    Doctoral dissertation, Department of Electrical and Computer Engineering, College of Engineering, Seoul National University, August 2017. Advisor: Kwangkeun Yi. This dissertation presents techniques for achieving, as far as possible, the three axes that determine static analysis performance: soundness, precision, and scalability. Many precision-improving techniques exist for static analysis, but applied indiscriminately they make the analysis severely slow or cause it to miss far too much of the real execution behavior. The core of this dissertation is a technique for selecting only those program parts where such precise but costly analysis techniques are really needed. First, we present an impact pre-analysis, another static analysis that predicts where a precision-improving technique is needed. Guided by the pre-analysis results, the main analysis applies the precision-improving technique selectively and thus runs efficiently. In addition, using machine learning over past analysis results, we present a technique that makes this selection even more efficient: the training data is obtained automatically by running the pre-analysis and the main analysis on a set of training programs in advance. The proposed methods were applied to a static analyzer for real C source code, and their effectiveness was demonstrated experimentally.
    Contents: 1. Introduction (1.1 Goal; 1.2 Solution; 1.3 Outline). 2. Preliminaries (2.1 Program; 2.2 Collecting Semantics; 2.3 Abstract Semantics). 3. Selectively X-sensitive Analysis by Impact Pre-Analysis (3.1 Introduction; 3.2 Informal Description; 3.3 Program Representation; 3.4 Selective Context-Sensitive Analysis with Context-Sensitivity Parameter K; 3.5 Impact Pre-Analysis for Finding K: 3.5.1 Designing an Impact Pre-Analysis, 3.5.2 Use of the Pre-Analysis Results; 3.6 Application to Selective Relational Analysis; 3.7 Experiments; 3.8 Summary). 4. Selectively X-sensitive Analysis by Learning Data Generated by Impact Pre-Analysis (4.1 Introduction; 4.2 Informal Explanation: 4.2.1 Octagon Analysis with Variable Clustering, 4.2.2 Automatic Learning of a Variable-Clustering Strategy; 4.3 Octagon Analysis with Variable Clustering: 4.3.1 Programs, 4.3.2 Octagon Analysis, 4.3.3 Variable Clustering and Partial Octagon Analysis; 4.4 Learning a Strategy for Clustering Variables: 4.4.1 Automatic Generation of Labeled Data, 4.4.2 Features and Classifier, 4.4.3 Strategy for Clustering Variables; 4.5 Experiments: 4.5.1 Effectiveness, 4.5.2 Generalization, 4.5.3 Feature Design, 4.5.4 Choice of an Off-the-shelf Classification Algorithm; 4.6 Summary). 5. Selectively Unsound Analysis by Machine Learning (5.1 Introduction; 5.2 Overview: 5.2.1 Uniformly Unsound Analysis, 5.2.2 Uniformly Sound Analysis, 5.2.3 Selectively Unsound Analysis, 5.2.4 Our Learning Approach; 5.3 Our Technique: 5.3.1 Parameterized Static Analysis, 5.3.2 Learning a Classifier; 5.4 Instance Analyses: 5.4.1 A Generic, Selectively Unsound Static Analysis, 5.4.2 Instantiation 1: Interval Analysis, 5.4.3 Instantiation 2: Taint Analysis; 5.5 Experiments: 5.5.1 Setting, 5.5.2 Effectiveness of Our Approach, 5.5.3 Efficacy of OC-SVM, 5.5.4 Feature Design, 5.5.5 Time Cost, 5.5.6 Discussion; 5.6 Summary). 6. Related Work (6.1 Parametric Static Analysis; 6.2 Goal-directed Static Analysis; 6.3 Data-driven Static Analysis; 6.4 Context-sensitivity and Relational Analysis; 6.5 Unsoundness in Static Analysis). 7. Conclusion.
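    The selective-sensitivity idea admits a tiny sketch. The pre-analysis below is a toy stand-in for the dissertation's impact pre-analysis, and all names are hypothetical: a cheap pass over-approximates where context sensitivity could change the result, and the main analysis pays for sensitivity only at those call sites.

```python
# Hedged sketch of selective context sensitivity: a cheap pre-analysis
# picks the callees worth analyzing context-sensitively (here, simply
# those called with differing arguments), so the main analysis applies
# the costly precision technique only where it can pay off.
def pre_analysis(call_sites):
    """Flag callees whose argument values differ across call sites."""
    selected = set()
    for callee, calls in call_sites.items():
        if len({args for _, args in calls}) > 1:
            selected.add(callee)
    return selected

# inc is called with different arguments (sensitivity can pay off);
# zero is always called the same way (sensitivity would be wasted cost).
call_sites = {
    "inc":  [("main", (1,)), ("loop", (2,))],
    "zero": [("main", (0,)), ("loop", (0,))],
}
print(sorted(pre_analysis(call_sites)))   # ['inc']
```

    The machine-learning refinement in the dissertation replaces a hand-written heuristic like this with a classifier trained on pre-analysis results from a corpus of programs.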