340 research outputs found
Automatically Finding Bugs in Open Source Programs
We consider properties desirable for static analysis tools that target bug finding in real open source code, and review tools based on various approaches to defect detection. We describe a static analysis tool that includes a framework for flow-sensitive interprocedural dataflow analysis and scales to the analysis of large programs. The framework enables the implementation of multiple checkers that search for specific bugs, such as null pointer dereference and buffer overflow, while hiding details such as alias analysis from the checkers.
Existence of Dependency-Based Attacks in NodeJS Environment
Node.js is an open source server-side runtime platform for JavaScript applications. Node.js applications depend on several, sometimes hundreds of, packages, which in turn have many dependencies of their own. There is always a risk that malicious code is hidden in one of these dependencies.
This work analyzes vulnerabilities found in Node.js-based applications, discusses basic types of attacks, and reports on an assessment of five frequently used Node.js packages.
Static Analysis for Discovering Security Vulnerabilities in Web Applications on the Asp.Net Platform
This bachelor's thesis describes the theoretical foundations and the construction of a static taint analyser based on the .NET Framework and the analysis services provided by the .NET Compiler Platform. The analyser detects SQL injection vulnerabilities on the ASP.NET MVC platform. First, it constructs control flow graphs as an abstract representation of the analysed program. Then, it uses static taint analysis to track potentially untrusted, tainted data values. Finally, the analysis results are presented to the user.
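The taint-tracking idea in this abstract can be illustrated with a minimal sketch. It is not the thesis's analyser: it works on a straight-line list of hypothetical three-address statements rather than on control flow graphs, and the names `read_request_param`, `escape_sql`, `execute_sql`, and `concat` are assumptions made up for this illustration.

```python
# Minimal taint-propagation sketch over straight-line code.
# All operation names below are hypothetical, chosen only for illustration.
SOURCES = {"read_request_param"}   # assumed taint source
SANITIZERS = {"escape_sql"}        # assumed sanitizer
SINKS = {"execute_sql"}            # assumed SQL sink


def find_tainted_sinks(statements):
    """Return the indices of sink calls that receive a tainted argument.

    Each statement is a tuple (target, operation, argument_names).
    """
    tainted = set()
    findings = []
    for index, (target, op, args) in enumerate(statements):
        if op in SINKS:
            if any(arg in tainted for arg in args):
                findings.append(index)
            continue
        if op in SOURCES:
            is_tainted = True
        elif op in SANITIZERS:
            is_tainted = False
        else:
            # Any other operation passes taint from its arguments to its result.
            is_tainted = any(arg in tainted for arg in args)
        if target is not None:
            if is_tainted:
                tainted.add(target)
            else:
                tainted.discard(target)
    return findings


program = [
    ("name", "read_request_param", []),
    ("query", "concat", ["prefix", "name"]),
    (None, "execute_sql", ["query"]),        # tainted data reaches the sink
    ("safe", "escape_sql", ["name"]),
    ("query2", "concat", ["prefix", "safe"]),
    (None, "execute_sql", ["query2"]),       # sanitized, not reported
]

findings = find_tainted_sinks(program)
```

Here only the first `execute_sql` call is reported, because the second query is built from the sanitized value. A real analyser would additionally need to merge taint states at control-flow joins and follow calls between methods.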
Generating Predicate Callback Summaries for the Android Framework
One of the challenges of analyzing, testing and debugging Android apps is
that the potential execution orders of callbacks are missing from the apps'
source code. However, bugs, vulnerabilities and refactoring transformations
have been found to be related to callback sequences. Existing work on control
flow analysis of Android apps has mainly focused on analyzing GUI events. GUI
events, although being a key part of determining control flow of Android apps,
do not offer a complete picture. Our observation is that orthogonal to GUI
events, the Android API calls also play an important role in determining the
order of callbacks. In the past, such control flow information has been modeled
manually. This paper presents a complementary solution for constructing program
paths for Android apps. We propose a specification technique, called Predicate
Callback Summary (PCS), that represents the callback control flow information
(including callback sequences as well as the conditions under which the
callbacks are invoked) in Android API methods, and we develop static analysis
techniques to automatically compute and apply such summaries to construct apps'
callback sequences. Our experiments show that by applying PCSs, we are able to
construct Android apps' control flow graphs, including inter-callback
relations, and also to detect infeasible paths involving multiple callbacks.
Such control flow information can help program analysis and testing tools to
report more precise results. Our detailed experimental data is available at:
http://goo.gl/NBPrKs
Comment: 11 page
Symbol-Specific Sparsification of Interprocedural Distributive Environment Problems
Previous work has shown that one can often greatly speed up static analysis
by computing data flows not for every edge in the program's control-flow graph
but instead only along definition-use chains. This yields a so-called sparse
static analysis. Recent work on SparseDroid has shown that specifically taint
analysis can be "sparsified" with extraordinary effectiveness because the taint
state of one variable does not depend on those of others. This allows one to
soundly omit more flow-function computations than in the general case.
In this work, we now assess whether this result carries over to the more
generic setting of so-called Interprocedural Distributive Environment (IDE)
problems. In contrast to taint analysis, IDE comprises distributive problems with
large or even infinitely broad domains, such as typestate analysis or linear
constant propagation. Specifically, this paper presents Sparse IDE, a framework
that realizes sparsification for any static analysis that fits the IDE
framework.
We implement Sparse IDE in SparseHeros, as an extension to the popular Heros
IDE solver, and evaluate its performance on real-world Java libraries by
comparing it to the baseline IDE algorithm. To this end, we design, implement
and evaluate a linear constant propagation analysis client on top of
SparseHeros. Our experiments show that, although IDE analyses can only be
sparsified with respect to symbols and not (numeric) values, Sparse IDE can
nonetheless yield significantly lower runtimes and often also lower memory
consumption compared to the original IDE.
Comment: To be published in ICSE 202
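As a rough illustration of linear constant propagation along a definition-use chain, the sketch below composes edge functions of the form v -> a * v + b and applies the result once. It is not the Sparse IDE algorithm or the Heros API; the `Linear` class, the `propagate_along_chain` helper, and the three-statement program are assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Linear:
    """Edge function v -> a * v + b, the shape used in linear constant propagation."""
    a: int
    b: int

    def apply(self, value):
        return self.a * value + self.b

    def then(self, other):
        """Compose two edge functions: apply self first, then other."""
        return Linear(other.a * self.a, other.a * self.b + other.b)


def propagate_along_chain(initial, chain):
    """Compose the edge functions on one definition-use chain, then apply once.

    Statements that do not touch the tracked symbol contribute no edge,
    which is the intuition behind skipping them in a sparse analysis.
    """
    composed = Linear(1, 0)  # identity function
    for edge in chain:
        composed = composed.then(edge)
    return composed, composed.apply(initial)


# Hypothetical program: x = 3; y = 2 * x + 1; z = y + 4
chain = [Linear(2, 1), Linear(1, 4)]
summary, z_value = propagate_along_chain(3, chain)
```

For this chain the composed function is v -> 2 * v + 5, so `z_value` is 11. Because the composition stays within the same linear form, a whole chain can be summarized by a single pair (a, b) regardless of its length.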
Lossless, Persisted Summarization of Static Callgraph, Points-To and Data-Flow Analysis
Static analysis is used to automatically detect bugs and security breaches, and aids compiler optimization. Whole-program analysis (WPA) can yield high precision, but it causes long analysis times and thus does not match common software-development workflows, which often makes it impractical for large, real-world applications.
This paper thus presents the design and implementation of ModAlyzer, a novel static-analysis approach that aims at accelerating whole-program analysis by making the analysis modular and compositional. It shows how to compute lossless, persisted summaries for callgraph, points-to and data-flow information, and it reports under which circumstances this function-level compositional analysis outperforms WPA.
We implemented ModAlyzer as an extension to LLVM and PhASAR, and applied it to 12 real-world C and C++ applications. At analysis time, ModAlyzer modularly and losslessly summarizes the analysis effect of the library code those applications share, hence avoiding its repeated re-analysis. The experimental results show that the reuse of these summaries can save, on average, 72% of analysis time over WPA. Moreover, because it is lossless, the module-wise analysis fully retains precision and recall. Surprisingly, as our results show, it sometimes even yields precision superior to WPA. The initial summary generation, on average, takes about 3.67 times as long as WPA.
Enabling Additional Parallelism in Asynchronous JavaScript Applications
JavaScript is a single-threaded programming language, so asynchronous programming is practiced out of necessity to ensure that applications remain responsive in the presence of user input or interactions with file systems and networks. However, many JavaScript applications execute in environments that do exhibit concurrency by, e.g., interacting with multiple or concurrent servers, or by using file systems managed by operating systems that support concurrent I/O. In this paper, we demonstrate that JavaScript programmers often schedule asynchronous I/O operations suboptimally, and that reordering such operations may yield significant performance benefits. Concretely, we define a static side-effect analysis that can be used to determine how asynchronous I/O operations can be refactored so that asynchronous I/O-related requests are made as early as possible, and so that the results of these requests are awaited as late as possible. While our static analysis is potentially unsound, we have not encountered any situations where it suggested reorderings that change program behavior. We evaluate the refactoring on 20 applications that perform file- or network-related I/O. For these applications, we observe average speedups ranging between 0.99% and 53.6% for the tests that execute refactored code (8.1% on average).
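The "request early, await late" pattern can be sketched as follows. The paper targets JavaScript; this sketch uses Python's `asyncio` only to show the same idea, and `fake_io` with its labels and delays is a hypothetical stand-in for a file or network request, not anything from the paper.

```python
import asyncio


async def fake_io(label, delay):
    """Hypothetical stand-in for an independent file or network request."""
    await asyncio.sleep(delay)
    return label


async def sequential():
    # Each request starts only after the previous one has finished.
    config = await fake_io("config", 0.05)
    data = await fake_io("data", 0.05)
    return [config, data]


async def reordered():
    # Issue both independent requests as early as possible...
    config_task = asyncio.create_task(fake_io("config", 0.05))
    data_task = asyncio.create_task(fake_io("data", 0.05))
    # ...and await their results as late as possible.
    return [await config_task, await data_task]
```

Both coroutines return the same result, but in `reordered` the two waits overlap instead of adding up. The rewrite is only safe when the operations have no conflicting side effects, which is the property the paper's static side-effect analysis is meant to establish.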
Parameterized Algorithms for Scalable Interprocedural Data-flow Analysis
Data-flow analysis is a general technique used to compute information of
interest at different points of a program and is considered to be a cornerstone
of static analysis. In this thesis, we consider interprocedural data-flow
analysis as formalized by the standard IFDS framework, which can express many
widely-used static analyses such as reaching definitions, live variables, and
null-pointer analysis. We focus on the well-studied on-demand setting in which queries
arrive one-by-one in a stream and each query should be answered as fast as
possible. While the classical IFDS algorithm provides a polynomial-time
solution to this problem, it is not scalable in practice. Specifically, it
either requires a quadratic-time preprocessing phase or takes linear time per
query, both of which are untenable for modern huge codebases with hundreds of
thousands of lines. Previous works have already shown that parameterizing the
problem by the treewidth of the program's control-flow graph is promising and
can lead to significant gains in efficiency. Unfortunately, these results were
only applicable to the limited special case of same-context queries.
In this work, we obtain significant speedups for the general case of
on-demand IFDS with queries that are not necessarily same-context. This is
achieved by exploiting a new graph sparsity parameter, namely the treedepth of
the program's call graph. Our approach is the first to exploit the sparsity of
control-flow graphs and call graphs at the same time and parameterize by both
treewidth and treedepth. We obtain an algorithm with a linear preprocessing
phase that can answer each query in constant time with respect to the input
size. Finally, we show experimental results demonstrating that our approach
significantly outperforms the classical IFDS and its on-demand variant.
- …