Combining Static and Dynamic Analysis for Vulnerability Detection
In this paper, we present a hybrid approach for buffer overflow detection in
C code. The approach makes use of static and dynamic analysis of the
application under investigation. The static part consists in calculating taint
dependency sequences (TDS) between user controlled inputs and vulnerable
statements. This process is akin to computing a program slice of interest: it
yields the tainted data- and control-flow paths that exhibit the dependence between
tainted program inputs and vulnerable statements in the code. The dynamic part
consists of executing the program along TDSs to trigger the vulnerability by
generating suitable inputs. We use a genetic algorithm to generate inputs. We
propose a fitness function that approximates the program behavior (control
flow) based on the frequencies of the statements along TDSs. This runtime
aspect makes the approach faster and more accurate. We provide experimental results
on the Verisec benchmark to validate our approach.
Comment: There are 15 pages with 1 figur
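As a rough illustration of the dynamic part described in this abstract, the following is a minimal sketch of a genetic search whose fitness rewards inputs that execute more statements of a taint dependency sequence. The toy target program, the statement labels, the alphabet, and all parameters are hypothetical. The fitness here is plain TDS coverage, a simplification of the frequency-based fitness the abstract mentions, and is not the authors' implementation.

```python
import random

# Hypothetical TDS: labels of statements on the path from the tainted input
# to the vulnerable statement ("copy" stands in for an unchecked write).
TDS = ["read", "len_ok", "tag_ok", "copy"]
ALPHABET = b"ABCD"  # small alphabet so the toy search converges quickly


def run_instrumented(data: bytes) -> list:
    """Toy stand-in for an instrumented program: returns executed TDS labels."""
    trace = ["read"]
    if len(data) > 8:
        trace.append("len_ok")
        if data[:1] == b"A":
            trace.append("tag_ok")
            trace.append("copy")
    return trace


def fitness(data: bytes) -> float:
    """Fraction of TDS statements executed by this input (coverage only)."""
    trace = run_instrumented(data)
    return sum(1 for label in TDS if label in trace) / len(TDS)


def crossover(a: bytes, b: bytes, rng: random.Random) -> bytes:
    """One-point crossover: prefix of a joined with the suffix of b."""
    cut = rng.randint(0, min(len(a), len(b)))
    return a[:cut] + b[cut:]


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Append, overwrite, or drop one byte."""
    buf = bytearray(data)
    choice = rng.random()
    if choice < 0.4 or not buf:
        buf.append(rng.choice(ALPHABET))
    elif choice < 0.8:
        buf[rng.randrange(len(buf))] = rng.choice(ALPHABET)
    else:
        buf.pop()
    return bytes(buf)


def evolve(generations: int = 200, pop_size: int = 20, seed: int = 0) -> bytes:
    """Evolve inputs; the best half always survives, so fitness never drops."""
    rng = random.Random(seed)
    population = [b""] * pop_size
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == 1.0:
            break  # the vulnerable statement was reached
        survivors = population[: pop_size // 2]
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in survivors
        ]
        population = survivors + children
    return max(population, key=fitness)
```

Calling `evolve()` returns the fittest input found. In a real setting, `run_instrumented` would be replaced by an execution of the instrumented C program under test.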
Reproducing Failures in Fault Signatures
Software often fails in the field; however, reproducing and debugging field
failures is very challenging: the failure-inducing input may be missing, and
the program setup can be complicated and hard to reproduce by the developers.
In this paper, we propose to generate fault signatures from the failure
locations and the original source code to reproduce the faults in small
executable programs. We say that a fault signature reproduces the fault in the
original program if the two fail at the same location and trigger the same
error conditions after executing the same selective sequences of
failure-inducing statements. A fault signature aims to contain only the
statements sufficient to reproduce the fault. That way, it provides some context to
show how a fault develops while avoiding unnecessary complexity and
setup that may block fault diagnosis. To compute fault signatures from the
failures, we applied a path-sensitive static analysis tool to generate a path
that leads to the fault, and then applied an existing syntactic patching tool
to convert the path into an executable program. Our evaluation on real-world
bugs from Corebench, BugBench, and Manybugs shows that fault signatures can
reproduce the fault for the original programs. Because fault signatures are
less complex, automatic test input generation tools generated failure-inducing
inputs that could not be generated by using the entire programs. Some
failure-inducing inputs can be directly transferred to the original programs.
Our experimental data are publicly available at
https://doi.org/10.5281/zenodo.5430155
Verification and falsification of programs with loops using predicate abstraction
Predicate abstraction is a major abstraction technique for the verification of software. Data is abstracted by means of Boolean variables, which keep track of predicates over the data. In many cases, predicate abstraction suffers from the need for at least one predicate for each iteration of a loop construct in the program. We propose to extract looping counterexamples from the abstract model, and to parametrise the simulation instance in the number of loop iterations. We present a novel technique that speeds up the detection of long counterexamples as well as the verification of programs with loop
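To make the abstraction step in this abstract concrete, here is a small sketch that maps the concrete states of a toy counting loop to tuples of Boolean predicate values. The loop, the two predicates, and all names are hypothetical choices for illustration. The sketch shows only the state abstraction, not the paper's technique for looping counterexamples.

```python
from typing import Callable, Dict, Set, Tuple

State = Dict[str, int]
Predicate = Callable[[State], bool]

# Hypothetical predicates chosen for the toy loop below.
PREDICATES: Tuple[Predicate, ...] = (
    lambda s: s["i"] < s["n"],   # loop guard still holds
    lambda s: s["i"] == s["n"],  # loop has terminated exactly at n
)


def abstract(state: State) -> Tuple[bool, ...]:
    """Map a concrete state to the truth values of the tracked predicates."""
    return tuple(p(state) for p in PREDICATES)


def abstract_states_of_loop(n: int) -> Set[Tuple[bool, ...]]:
    """Run `i = 0; while i < n: i += 1` and collect the abstract states seen."""
    state = {"i": 0, "n": n}
    seen = {abstract(state)}
    while state["i"] < state["n"]:
        state["i"] += 1
        seen.add(abstract(state))
    return seen
```

For this particular loop, the two predicates give the same two abstract states whether `n` is 3 or 1000. The abstract above concerns the harder cases where an abstraction would otherwise need a fresh predicate for each iteration.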
On the Feasibility of Malware Authorship Attribution
There are many occasions on which the security community is interested in
discovering the authorship of malware binaries, either for digital forensics
analysis of malware corpora or for thwarting live threats of malware invasion.
Such a discovery of authorship might be possible due to stylistic features
inherent to software code written by human programmers. Existing studies of
authorship attribution of general purpose software mainly focus on source code,
which is typically based on the style of programs and environment. However,
those features critically depend on the availability of the program source
code, which is usually not the case when dealing with malware binaries. Such
program binaries often do not retain many semantic or stylistic features due to
the compilation process. Therefore, authorship attribution in the domain of
malware binaries based on features and styles that will survive the compilation
process is challenging. This paper surveys the state of the art in this
literature. Further, we analyze the features involved in those techniques. By
using a case study, we identify features that can survive the compilation
process. Finally, we analyze existing works on binary authorship attribution
and study their applicability to real malware binaries.
Comment: FPS 201
Buffer Overflow Vulnerability Diagnosis For Commodity Software
Buffer overflow attacks have been a computer security threat in software-based systems and applications for decades. The existence of buffer overflow vulnerabilities makes the system susceptible to Internet worms and distributed denial-of-service (DDoS) attacks, which can cause huge social and financial impacts. Due to its importance, the buffer overflow problem has been intensively studied. Researchers have proposed different techniques to defend against unknown buffer overflow attacks. They have also investigated various solutions, including automatic signature generation and automatic patch generation, to automatically protect computer systems with known vulnerabilities. The effectiveness and efficiency of automatic signature generation and automatic patch generation both depend on an accurate understanding of the vulnerabilities, that is, on buffer overflow vulnerability diagnosis (BOVD). Currently, the results of automatic signature generation and automatic patch generation are far from satisfactory because research on automatic BOVD is insufficient. This thesis defines the automatic buffer overflow vulnerability diagnosis (BOVD) problem and provides solutions towards automatic BOVD for commodity software. It targets commodity software for which source code and symbol tables are not available. The solutions combine dynamic and static analysis techniques to achieve this goal. They build on the observation that a buffer overflow through a data copy procedure happens when the size of the destination buffer is smaller than the total number of writes performed by the copy. The diagnosis therefore returns the size of the destination buffer, the total number of writes of the data copy procedure, and how the user inputs are related to them. These are obtained through bound analysis, loop analysis, and input analysis, respectively.
We demonstrate the effectiveness of the approach using real-world vulnerable applications, including the buffer overflow vulnerabilities attacked by the record-setting Slammer and Blaster worms. This thesis also presents a complete case study of buffer overflow vulnerabilities, which may be of independent interest to researchers. Our case study results can help other researchers design more effective defense systems and debugging tools against buffer overflow attacks.
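The overflow condition stated in this abstract (destination buffer size smaller than the total number of writes of a data copy) can be sketched as a small check. The record type, the function names, and the strcpy-like write model are assumptions made for illustration; the thesis derives these quantities from binaries through bound, loop, and input analysis, which is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class CopyDiagnosis:
    """Hypothetical diagnosis record for one data copy procedure."""
    dest_size: int     # size of the destination buffer (from bound analysis)
    total_writes: int  # bytes written by the copy loop (from loop analysis)

    @property
    def overflows(self) -> bool:
        # Overflow when the copy writes more bytes than the buffer holds.
        return self.total_writes > self.dest_size


def diagnose_copy(dest_size: int, user_input: bytes) -> CopyDiagnosis:
    """Toy input analysis for a strcpy-like copy: one write per input byte
    plus one write for the terminating NUL."""
    return CopyDiagnosis(dest_size=dest_size, total_writes=len(user_input) + 1)
```

Under this model, an 8-byte buffer holds a 7-byte input with its terminator, while an 8-byte input causes a one-byte overflow.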
Are Code Examples on an Online Q&A Forum Reliable?
Programmers often consult an online Q&A forum such as Stack Overflow to learn new APIs. This paper presents an empirical study on the prevalence and severity of API misuse on Stack Overflow. To reduce manual assessment effort, we design ExampleCheck, an API usage mining framework that extracts patterns from over 380K Java repositories on GitHub and subsequently reports potential API usage violations in Stack Overflow posts. We analyze 217,818 Stack Overflow posts using ExampleCheck and find that 31% may have potential API usage violations that could produce unexpected behavior such as program crashes and resource leaks. Such API misuse has three main causes: missing control constructs, missing or incorrectly ordered API calls, and incorrect guard conditions. Even the posts that are accepted as correct answers or upvoted by other programmers are not necessarily more reliable than other posts in terms of API misuse. This result calls for a new approach to augment Stack Overflow with alternative API usage details that are not typically shown in curated examples.