
5 research outputs found

Stream-based dynamic data race detection

Authors: Jakšić Svetlana; Li Dan; Pun Ka I; Stolz Volker
Publication venue: NIKT Foundation
Publication date: 08/08/2018

Detecting data races in modern code executing on multicore processors is challenging. Instrumentation-based techniques for race detection not only have a high performance impact, but are also unlikely to be certified for safety-critical systems. This paper presents a data race detector based on the well-known lockset algorithm, written in the runtime verification language TeSSLa as a stream-based specification that uses dynamic data structures to record lock operations and memory accesses. Such a specification can then be instantiated with particular parameters to make it suitable for the more limited planned monitoring using field-programmable gate arrays.
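The lockset idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's TeSSLa specification: it is a simplified Eraser-style check in Python, and the event format, the class name `LocksetDetector`, and the helper `run` are assumptions made for illustration. Each shared variable keeps the set of locks held on every access so far; a race is reported once that set is empty and more than one thread has touched the variable.

```python
from collections import defaultdict


class LocksetDetector:
    """Simplified lockset check over a stream of lock and access events.

    Hypothetical sketch: it omits Eraser's initialization and
    read-sharing states and does not model TeSSLa streams.
    """

    def __init__(self):
        self.held = defaultdict(set)      # thread -> locks currently held
        self.candidates = {}              # variable -> locks held on all accesses
        self.accessors = defaultdict(set) # variable -> threads that accessed it
        self.races = []                   # reported (thread, variable) pairs

    def acquire(self, thread, lock):
        self.held[thread].add(lock)

    def release(self, thread, lock):
        self.held[thread].discard(lock)

    def access(self, thread, var):
        held = set(self.held[thread])
        if var in self.candidates:
            # Refine: keep only locks that protected every access so far.
            self.candidates[var] &= held
        else:
            self.candidates[var] = held
        self.accessors[var].add(thread)
        # No common lock and at least two threads: potential data race.
        if not self.candidates[var] and len(self.accessors[var]) > 1:
            self.races.append((thread, var))


def run(events):
    """Feed (kind, thread, target) events to a fresh detector."""
    detector = LocksetDetector()
    for kind, thread, target in events:
        getattr(detector, kind)(thread, target)
    return detector.races


unprotected = [
    ("acquire", "t1", "m"), ("access", "t1", "x"), ("release", "t1", "m"),
    ("access", "t2", "x"),  # t2 touches x without holding m
]
protected = [
    ("acquire", "t1", "m"), ("access", "t1", "x"), ("release", "t1", "m"),
    ("acquire", "t2", "m"), ("access", "t2", "x"), ("release", "t2", "m"),
]
print(run(unprotected))
print(run(protected))
```

Like any lockset check, this reports potential races based on locking discipline and can flag accesses that never actually overlap in time.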

BIBSYS: Open Journals Systems

Static detection of control-flow-related vulnerabilities using graph embedding

Authors: Cheng X; Hua J; Sui Y; Wang H; Xu G; Yi L; Zhang M
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/11/2019

© 2019 IEEE. Static vulnerability detection has shown its effectiveness in detecting well-defined low-level memory errors. However, high-level control-flow-related (CFR) vulnerabilities, such as insufficient control flow management (CWE-691), business logic errors (CWE-840), and program behavioral problems (CWE-438), which are often caused by a wide variety of bad programming practices, pose a great challenge for existing general static analysis solutions. This paper presents a new deep-learning-based graph embedding approach for accurate detection of CFR vulnerabilities. Our approach applies a recent graph convolutional network to embed code fragments in a compact, low-dimensional representation that preserves the high-level control-flow information of a vulnerable program. We conducted experiments on 8,368 real-world vulnerable programs, comparing our approach with several traditional static vulnerability detectors and state-of-the-art machine-learning-based approaches. The experimental results show the effectiveness of our approach in terms of both accuracy and recall. Our research sheds light on the promising direction of combining program analysis with deep learning techniques to address general static analysis challenges.
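To make the graph-embedding step concrete, here is a minimal sketch of one graph convolution layer followed by mean pooling, written in plain Python. It is not the paper's model: the symmetric normalization with self-loops is the commonly used formulation, the control-flow graph is treated as undirected, and the function names `gcn_layer` and `mean_pool`, the toy graph, and the weights are all assumptions for illustration.

```python
import math


def gcn_layer(adj, feats, weight):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 . H . W).

    adj    -- n x n adjacency matrix (assumed undirected, 0/1 entries)
    feats  -- n x d node feature matrix H
    weight -- d x k weight matrix W (learned in practice, fixed here)
    """
    n = len(adj)
    out_dim = len(weight[0])
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalization.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Linear transform of the node features.
    hw = [[sum(feats[i][k] * weight[k][c] for k in range(len(weight)))
           for c in range(out_dim)] for i in range(n)]
    # Aggregate over neighbours, then apply ReLU.
    return [[max(0.0, sum(norm[i][j] * hw[j][c] for j in range(n)))
             for c in range(out_dim)] for i in range(n)]


def mean_pool(node_vecs):
    """Average node vectors into a single fixed-size graph embedding."""
    n = len(node_vecs)
    return [sum(v[c] for v in node_vecs) / n for c in range(len(node_vecs[0]))]


# Toy control-flow graph: two basic blocks joined by one edge.
adj = [[0, 1], [1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, -1.0], [0.0, 2.0]]
embedding = mean_pool(gcn_layer(adj, feats, weight))
print(embedding)
```

A classifier trained on such embeddings would then label a code fragment as vulnerable or not; that training step is outside this sketch.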

Crossref

OPUS - University of Technology Sydney

A Debugging Technique for Static Analyzer Using Trace History of Abstract Values

Author: 배요한
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/02/2019

Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 2. 이광근.

In this thesis, we propose a debugging method that helps static analyzer developers by also analyzing how abstract values are generated. The method consists of three steps. (i) Extend the analyzer's existing abstract domain to log the process by which abstract values are created; this makes it possible to identify the analyzer program points involved in computing a value. (ii) When the developer asks where a suspicious abstract value was calculated, the debugging tool shows the program points where the value was created, based on the analysis result. (iii) If there are multiple program points to show, score each one by how likely it is to contain the bug and show the highest-scoring point first. To demonstrate the usefulness of this method, we implemented the static analyzer debugging tool Footprint. We tested whether the additional value-flow information is useful for debugging a total of 57 bugs that occurred during the implementation of Sparrow, and found that it was helpful for 34 of them. For two of these bugs, Footprint was useful in showing erroneous program points in the analyzer's code even when the bug occurred while analyzing a real program of 8,000 lines.

Contents: Abstract (Korean); Table of Contents; List of Figures; List of Tables; Chapter 1 Introduction; 1.1 Motivation and Problem; 1.2 Solution: Riding on an Existing Analyzer to Also Analyze the Analysis Process; 1.3 Experimental Method and Results; 1.4 Thesis Organization; Chapter 2 Background; 2.1 Programs; 2.2 Analyzer Design; 2.2.1 Collecting Semantics; 2.2.2 Abstract Semantic Domain; 2.2.3 Abstract Semantic Function; 2.2.4 Fixpoint Computation Algorithm; 2.3 Analysis Example; Chapter 3 Design and Implementation; 3.1 The Big Picture; 3.2 Debugging Tool Design; 3.2.1 Extending the Abstract Domain; 3.2.2 Collecting Value Information; 3.2.3 Computing the Risk Score of Values; 3.2.4 Sorting Value Information; 3.3 Debugging Implementation; Chapter 4 Experimental Results; Chapter 5 Discussion and Conclusion; 5.1 Discussion; 5.2 Conclusion; References; Abstract
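The three steps in this abstract can be sketched on a toy interval domain. This is not Footprint's implementation: the `TracedInterval` type, the analyzer point labels, and the suspicion scores passed to `rank` are all hypothetical, chosen only to show how an abstract value might carry the analyzer points that produced it and how those points could be ordered for the developer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracedInterval:
    """Interval [lo, hi] extended with the analyzer points that built it."""
    lo: int
    hi: int
    trace: frozenset = frozenset()


def const(n, point):
    """Abstract a constant; `point` names the analyzer code that did so."""
    return TracedInterval(n, n, frozenset({point}))


def join(a, b, point):
    """Least upper bound of two intervals, merging both histories."""
    return TracedInterval(min(a.lo, b.lo), max(a.hi, b.hi),
                          a.trace | b.trace | {point})


def add(a, b, point):
    """Abstract addition, merging both histories."""
    return TracedInterval(a.lo + b.lo, a.hi + b.hi,
                          a.trace | b.trace | {point})


def rank(value, suspicion):
    """Order contributing points by a (hypothetical) suspicion score.

    Highest score first; ties are broken alphabetically so the
    ordering is deterministic.
    """
    return sorted(value.trace, key=lambda p: (-suspicion.get(p, 0), p))


x = const(1, "transfer_const")
y = const(5, "transfer_const2")
z = join(x, y, "join_branch")
w = add(z, x, "transfer_add")
ranking = rank(w, {"join_branch": 3, "transfer_add": 1})
print((w.lo, w.hi), ranking)
```

If `w` looked wrong to the developer, the ranking would point first at the join, then at the addition, then at the two constant transfers.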

SNU Open Repository and Archive

Machine Learning for Actionable Warning Identification: A Comprehensive Survey

Authors: Chen Zhenyu; Fang Chunrong; Ge Xiuting; Li Xuanye; Lin Shangwei; Liu Yang; Sun Weisong; Wu Daoyuan; Zhai Juan; Zhao Zhihong
Publication date: 30/11/2023

Actionable Warning Identification (AWI) plays a crucial role in improving the usability of static code analyzers. With recent advances in Machine Learning (ML), various approaches have been proposed to incorporate ML techniques into AWI. These ML-based AWI approaches, benefiting from ML's strong ability to learn subtle and previously unseen patterns from historical data, have demonstrated superior performance. However, a comprehensive overview of these approaches is missing, which could hinder researchers/practitioners from understanding the current process and discovering potential for future improvement in the ML-based AWI community. In this paper, we systematically review the state-of-the-art ML-based AWI approaches. First, we employ a meticulous survey methodology and gather 50 primary studies from 2000/01/01 to 2023/09/01. Then, we outline the typical ML-based AWI workflow, including warning dataset preparation, preprocessing, AWI model construction, and evaluation stages. In such a workflow, we categorize ML-based AWI approaches based on the warning output format. Besides, we analyze the techniques used in each stage, along with their strengths, weaknesses, and distribution. Finally, we provide practical research directions for future ML-based AWI approaches, focusing on aspects like data improvement (e.g., enhancing the warning labeling strategy) and model exploration (e.g., exploring large language models for AWI)

arXiv.org e-Print Archive

Machine-Learning-Guided Selectively Unsound Static Analysis

Authors: Heo K.; Oh H.; Yi K.
Publication venue: Institute of Electrical and Electronics Engineers Inc.
Publication date: 01/01/2017

We present a machine-learning-based technique for selectively applying unsoundness in static analysis. Existing bug-finding static analyzers are unsound in order to be precise and scalable in practice. However, they are uniformly unsound and hence at risk of missing a large number of real bugs. A sound analyzer can detect more bugs, but it often suffers from a large number of false alarms. Our approach aims to strike a balance between these two extremes by selectively allowing unsoundness only when it is likely to reduce false alarms while retaining true alarms. We use an anomaly-detection technique to learn such harmless unsoundness. We implemented our technique in two static analyzers for full C: one is a taint analysis for detecting format-string vulnerabilities, and the other is an interval analysis for buffer-overflow detection. The experimental results show that our approach significantly improves the recall of the original unsound analysis without sacrificing precision. © 2017 IEEE.

Crossref

SNU Open Repository and Archive