5 research outputs found

    Stream-based dynamic data race detection

    Detecting data races in modern code executing on multicore processors is challenging. Instrumentation-based techniques for race detection not only have a high performance impact but are also unlikely to be certified for safety-critical systems. This paper presents a data race detector based on the well-known lockset algorithm, written in the runtime verification language TeSSLa as a stream-based specification that uses dynamic data structures to record lock operations and memory accesses. Such a specification can then be instantiated with particular parameters to make it suitable for the more limited planned monitoring using field-programmable gate arrays.
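
    The lockset idea named in this abstract can be illustrated with a minimal Python sketch. It is not the paper's TeSSLa specification: the class `LocksetDetector` and its method names are hypothetical, and the sketch implements only the basic candidate-set refinement, without the state machine that the original Eraser-style algorithm uses to suppress warnings during initialization. Each shared address keeps a candidate set of locks, which is intersected with the locks held at every access; an empty set means no single lock consistently protects that address.

```python
from collections import defaultdict


class LocksetDetector:
    """Basic lockset refinement over a stream of lock and access events."""

    def __init__(self):
        self.held = defaultdict(set)   # thread id -> locks currently held
        self.candidates = {}           # address -> candidate lockset
        self.races = []                # (thread, address) pairs flagged

    def acquire(self, thread, lock):
        self.held[thread].add(lock)

    def release(self, thread, lock):
        self.held[thread].discard(lock)

    def access(self, thread, addr):
        held = self.held[thread]
        if addr not in self.candidates:
            # First access: every lock held right now is a candidate.
            self.candidates[addr] = set(held)
        else:
            # Later accesses: keep only locks held on every access so far.
            self.candidates[addr] &= held
        if not self.candidates[addr]:
            # No common lock protects this address: potential data race.
            self.races.append((thread, addr))


detector = LocksetDetector()
# Two threads access "x" while both hold lock "m": no warning.
detector.acquire("t1", "m"); detector.access("t1", "x"); detector.release("t1", "m")
detector.acquire("t2", "m"); detector.access("t2", "x"); detector.release("t2", "m")
protected_ok = list(detector.races)
# A third access without any lock empties the candidate set.
detector.access("t2", "x")
```

    The two dictionaries play the role of the dynamic data structures the abstract mentions, recording lock operations and memory accesses as events arrive.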

    Static detection of control-flow-related vulnerabilities using graph embedding

    © 2019 IEEE. Static vulnerability detection has shown its effectiveness in detecting well-defined low-level memory errors. However, high-level control-flow-related (CFR) vulnerabilities, such as insufficient control flow management (CWE-691), business logic errors (CWE-840), and program behavioral problems (CWE-438), which are often caused by a wide variety of bad programming practices, pose a great challenge for existing general static analysis solutions. This paper presents a new deep-learning-based graph embedding approach for the accurate detection of CFR vulnerabilities. Our approach applies a recent graph convolutional network to embed code fragments in a compact, low-dimensional representation that preserves the high-level control-flow information of a vulnerable program. We conducted experiments on 8,368 real-world vulnerable programs, comparing our approach with several traditional static vulnerability detectors and state-of-the-art machine-learning-based approaches. The experimental results show the effectiveness of our approach in terms of both accuracy and recall. Our research sheds light on the promising direction of combining program analysis with deep learning techniques to address general static analysis challenges.

    A Debugging Technique for Static Analyzer Using Trace History of Abstract Values

    Master's thesis, Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, February 2019. 이광근 (Kwangkeun Yi). In this paper, we propose a debugging method for static analyzer developers that provides useful information and may reduce debugging effort by analyzing how abstract values are generated. The method consists of three steps: (i) extend an existing abstract domain to log the process by which abstract values are created; (ii) when the developer asks about a program point where a wrong abstract value is computed, show the program points where that value was created, based on the analysis result; (iii) if there are multiple program points to show, score them by how likely the bug is to occur there and show the highest-scoring points first. To demonstrate the usefulness of this method, we implemented the static analyzer debugging tool Footprint. We tested whether the additional value-flow information is useful for debugging a total of 57 bugs that occurred during the implementation of Sparrow, and the tool was helpful for 34 of them. For two of these bugs, Footprint helped locate the erroneous program points in the analyzer's code even when the bugs occurred while analyzing a real program of about 8,000 lines.
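
    Step (i), extending an abstract domain so that each value remembers how it was produced, can be sketched as follows. This is an illustration under stated assumptions, not Footprint's implementation: the interval domain, the class `TracedInterval`, the helper `const`, and the string labels used for program points are all hypothetical, and the scoring of step (iii) is omitted because the abstract does not describe it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracedInterval:
    """Interval abstract value plus the program points that produced it."""
    lo: float
    hi: float
    trace: frozenset = frozenset()

    def join(self, other, point):
        # Least upper bound of two intervals, logging the join point.
        return TracedInterval(min(self.lo, other.lo),
                              max(self.hi, other.hi),
                              self.trace | other.trace | {point})

    def add(self, other, point):
        # Abstract addition, logging the point where it happened.
        return TracedInterval(self.lo + other.lo,
                              self.hi + other.hi,
                              self.trace | other.trace | {point})


def const(n, point):
    """Abstract value of a constant introduced at a program point."""
    return TracedInterval(n, n, frozenset({point}))


# Hypothetical analysis run: two branches join, then a constant is added.
x = const(1, "p1")
y = const(5, "p2")
joined = x.join(y, "p3")
result = joined.add(const(10, "p4"), "p5")
# Step (ii): points to show the developer when `result` looks wrong.
history = sorted(result.trace)
```

    Carrying the trace inside the value keeps the ordinary interval operations unchanged, which matches the abstract's description of extending an existing domain rather than replacing it.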

    Machine Learning for Actionable Warning Identification: A Comprehensive Survey

    Actionable Warning Identification (AWI) plays a crucial role in improving the usability of static code analyzers. With recent advances in Machine Learning (ML), various approaches have been proposed to incorporate ML techniques into AWI. These ML-based AWI approaches, benefiting from ML's strong ability to learn subtle and previously unseen patterns from historical data, have demonstrated superior performance. However, a comprehensive overview of these approaches is missing, which could hinder researchers and practitioners from understanding the current process and discovering potential for future improvement in the ML-based AWI community. In this paper, we systematically review the state-of-the-art ML-based AWI approaches. First, we employ a meticulous survey methodology and gather 50 primary studies from 2000/01/01 to 2023/09/01. Then, we outline the typical ML-based AWI workflow, including warning dataset preparation, preprocessing, AWI model construction, and evaluation stages. Within this workflow, we categorize ML-based AWI approaches based on the warning output format. In addition, we analyze the techniques used in each stage, along with their strengths, weaknesses, and distribution. Finally, we provide practical research directions for future ML-based AWI approaches, focusing on aspects like data improvement (e.g., enhancing the warning labeling strategy) and model exploration (e.g., exploring large language models for AWI).

    Machine-Learning-Guided Selectively Unsound Static Analysis

    We present a machine-learning-based technique for selectively applying unsoundness in static analysis. Existing bug-finding static analyzers are unsound in order to be precise and scalable in practice. However, they are uniformly unsound and hence risk missing a large number of real bugs. A sound analyzer can detect more bugs, but it often suffers from a large number of false alarms. Our approach aims to strike a balance between these two extremes by allowing unsoundness only when it is likely to reduce false alarms while retaining true alarms. We use an anomaly-detection technique to learn such harmless unsoundness. We implemented our technique in two static analyzers for full C: a taint analysis for detecting format-string vulnerabilities and an interval analysis for buffer-overflow detection. The experimental results show that our approach significantly improves the recall of the original unsound analysis without sacrificing precision. © 2017 IEEE.