Interprocedural Data Flow Analysis in Soot using Value Contexts
An interprocedural analysis is precise if it is flow-sensitive and fully
context-sensitive even in the presence of recursion. Many methods of
interprocedural analysis sacrifice precision for scalability while some are
precise but limited to only a certain class of problems.
Soot currently supports interprocedural analysis of Java programs using graph
reachability. However, this approach is restricted to IFDS/IDE problems, and is
not suitable for general data flow frameworks such as heap reference analysis
and points-to analysis which have non-distributive flow functions.
We describe a general-purpose interprocedural analysis framework for Soot
using data flow values for context-sensitivity. This framework is not
restricted to problems with distributive flow functions, although the lattice
must be finite. It combines the key ideas of the tabulation method of the
functional approach and the technique of value-based termination of call string
construction.
The efficiency and precision of interprocedural analyses are heavily affected
by the precision of the underlying call graph. This is especially important for
object-oriented languages like Java where virtual method invocations cause an
explosion of spurious call edges if the call graph is constructed naively. We
have instantiated our framework with a flow- and context-sensitive points-to
analysis in Soot, which enables the construction of call graphs that are far
more precise than those constructed by Soot's SPARK engine.
Comment: SOAP 2013 Final Version
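The idea of using data flow values as contexts can be illustrated with a small sketch. This is not the Soot framework described above: the names `analyze`, `dec`, and `down` are hypothetical, the lattice is a toy sign domain, and a naive round-robin iteration stands in for a real worklist. It only shows that a context is a (procedure, entry value) pair, that calls with the same entry value reuse one context, and that a finite lattice bounds the number of contexts even under recursion.

```python
BOTTOM = frozenset()  # least element of the toy sign lattice (sets of "-", "0", "+")

def dec(signs):
    """Abstract effect of x - 1 on a set of possible signs."""
    out = set()
    if "+" in signs:
        out |= {"0", "+"}
    if "0" in signs or "-" in signs:
        out.add("-")
    return frozenset(out)

def down(x, call):
    """Abstract summary of: def down(x): return down(x - 1) if x > 0 else x"""
    result = set()
    if "+" in x:
        result |= call("down", dec(frozenset({"+"})))  # recursive call site
    result |= x & {"0", "-"}                           # non-positive x is returned as-is
    return frozenset(result)

def analyze(procedures, main, entry):
    """Return a table mapping each value context to its exit value."""
    exits = {(main, entry): BOTTOM}
    changed = True

    def call(name, value):
        nonlocal changed
        ctx = (name, value)
        if ctx not in exits:       # unseen entry value: create a new context
            exits[ctx] = BOTTOM    # provisional result, refined by iteration
            changed = True
        return exits[ctx]          # same entry value: reuse the context

    while changed:                 # iterate to a fixed point
        changed = False
        for name, value in list(exits):
            new = exits[(name, value)] | procedures[name](value, call)
            if new != exits[(name, value)]:
                exits[(name, value)] = new
                changed = True
    return exits

table = analyze({"down": down}, "down", frozenset({"+"}))
```

Here the recursive procedure yields only two contexts, one for entry value {+} and one for {0, +}, because every deeper recursive call arrives with an entry value that has already been seen.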
Can Large Language Models Write Good Property-Based Tests?
Property-based testing (PBT), while an established technique in the software
testing research community, is still relatively underused in real-world
software. Pain points in writing property-based tests include implementing
diverse random input generators and thinking of meaningful properties to test.
Developers, however, are more amenable to writing documentation; plenty of
library API documentation is available and can be used as natural language
specifications for property-based tests. As large language models (LLMs) have
recently shown promise in a variety of coding tasks, we explore the potential
of using LLMs to synthesize property-based tests. We call our approach PBT-GPT,
and propose three different strategies of prompting the LLM for PBT. We
characterize various failure modes of PBT-GPT and detail an evaluation
methodology for automatically synthesized property-based tests. PBT-GPT
achieves promising results in our preliminary studies on sample Python library
APIs in , , and
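For readers unfamiliar with the two ingredients named above, a random input generator and a property, the following minimal sketch shows both using only the Python standard library. It is not output from PBT-GPT and does not use any particular PBT library; `gen_int_list`, `make_sort_property`, and `check` are hypothetical names chosen for illustration.

```python
import random
from collections import Counter

def gen_int_list(rng, max_len=20):
    """Random input generator: a possibly empty list of small integers."""
    return [rng.randint(-100, 100) for _ in range(rng.randint(0, max_len))]

def make_sort_property(sort_fn):
    """Property: sort_fn returns an ordered permutation of its input."""
    def prop(xs):
        out = sort_fn(xs)
        ordered = all(a <= b for a, b in zip(out, out[1:]))
        return ordered and Counter(out) == Counter(xs)
    return prop

def check(prop, gen, trials=200, seed=0):
    """Run prop on random inputs; return a counterexample, or None if all pass."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = gen(rng)
        if not prop(x):
            return x
    return None

counterexample = check(make_sort_property(sorted), gen_int_list)
```

A property that a correct implementation satisfies should never produce a counterexample, while a faulty implementation (for example, one that returns its input unchanged) fails the property on any unsorted input.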
Distributed Execution Indexing
This work-in-progress report presents both the design and partial evaluation
of distributed execution indexing, a technique for microservice applications
that precisely identifies dynamic instances of inter-service remote procedure
calls (RPCs). Such an indexing scheme is critical for request-level fault
injection techniques, which aim to automatically find failure-handling bugs in
microservice applications. Distributed execution indexes enable granular
specification of request-level faults, while also establishing a correspondence
between inter-service RPCs across multiple executions, as is required to
perform a systematic search of the fault space. In this paper, we formally
define the general concept of a distributed execution index, which can be
parameterized on different ways of identifying an RPC in a single service. We
identify an instantiation that maintains precision in the presence of a variety
of program structure complexities such as loops, function indirection, and
concurrency with scheduling nondeterminism. We demonstrate that this particular
instantiation addresses gaps in the state-of-the-art in request-level fault
injection and show that they are all special cases of distributed execution
indexing. We discuss the implementation challenges and provide an
implementation of distributed execution indexing as an extension of
Filibuster, a resilience testing tool for microservice applications for the
Java programming language, which supports fault injection for gRPC and HTTP
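To make the notion of an index for a dynamic RPC instance concrete, here is a deliberately simplified sketch. It is an assumption for illustration, not the instantiation defined in the report and not Filibuster's implementation: the class `IndexedService`, the call-site strings, and the target names are all hypothetical. Each RPC is identified by its call site, its target, and an invocation count (so repeated calls in a loop stay distinct), and the callee extends the caller's index so the identifier spans services.

```python
from collections import Counter

class IndexedService:
    """Tracks the RPCs one service issues while handling one incoming request."""

    def __init__(self, incoming_index=()):
        self.incoming_index = tuple(incoming_index)  # index of the RPC that invoked us
        self.seen = Counter()                        # invocation count per (callsite, target)

    def index_for_call(self, callsite, target):
        """Return the distributed index for the next outgoing RPC."""
        key = (callsite, target)
        self.seen[key] += 1
        return self.incoming_index + ((callsite, target, self.seen[key]),)

def run_once():
    """One simulated execution: a frontend calls a cart service twice in a loop."""
    frontend = IndexedService()
    first = frontend.index_for_call("Frontend.checkout:12", "cart/Get")
    second = frontend.index_for_call("Frontend.checkout:12", "cart/Get")
    cart = IndexedService(incoming_index=first)      # cart handles the first request
    nested = cart.index_for_call("Cart.get:7", "db/Read")
    return first, second, nested
```

Because the index depends only on the call path and counts, two executions of `run_once()` produce identical indexes, which is the kind of cross-execution correspondence a fault-space search relies on. This simple counting scheme does not address concurrency with scheduling nondeterminism.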
Abstractions and Algorithms for Specializing Dynamic Program Analysis and Random Fuzz Testing
Software bugs affect the security, performance, and reliability of critical systems that much of our society depends on. In practice, the predominant method of ensuring software quality is via extensive testing. Software developers have considerable domain expertise about their own software, and are adept at writing functional tests. However, handcrafted tests often fail to catch corner cases. Further, it is far less common to find software projects that ship with handwritten tests that target non-functional software issues such as performance, concurrency, security, and privacy.
Dynamic program analysis techniques can be used to find potential software bugs by observing program execution. Such techniques are limited by the availability of quality inputs with which to execute the program. For example, although profilers can be used to diagnose performance issues when good stress tests are available, they are not very useful when provided with only small functional test cases. Researchers have also developed various algorithms to automatically generate test inputs. Techniques such as random fuzzing are a promising approach for discovering unexpected inputs in a scalable manner. Coverage-guided fuzzing (CGF) tools that evolve a corpus of test inputs via random mutations and guided by test-execution feedback have recently become popular due to their success in crashing programs that process binary data. However, by relying solely on hard-coded heuristics, their effectiveness as push-button tools is limited when the test program, the input format, or the testing objective becomes complex. This dissertation presents new abstractions and algorithms that empower software developers to specialize automated testing tools using their domain expertise.
First, we present two techniques to find algorithmic performance issues, such as accidentally sub-optimal worst-case complexity, using only developer-provided functional tests: (1) Travioli performs dynamic analysis of unit test executions to precisely identify program functions that perform redundant data-structure traversals; (2) PerfFuzz employs a novel algorithm based on CGF to automatically generate inputs that exercise worst-case complexity. These techniques have helped discover previously unknown asymptotic performance bugs in real-world software including the D3 visualization toolkit, the ExpressJS web server, and the Google Closure Compiler.
Second, we present Zest+JQF, a technique and framework respectively to find semantic bugs in programs that process complex structured inputs in a multi-stage pipeline, such as compilers. This approach leverages domain knowledge about a program under test by allowing users to provide: (1) simple generator functions that sample syntactically valid inputs, and (2) predicate functions that determine whether a sampled input is also semantically valid. Zest automatically guides the user-provided generator functions towards producing inputs that are likely to be semantically valid and also increase code coverage in the program under test. JQF allows researchers to plug in custom algorithms for guiding such generators. Together, Zest+JQF have enabled the discovery of 42 previously unknown software bugs in widely used Java projects such as OpenJDK, Apache Commons, Maven, Ant, and the Google Closure Compiler. Many of these bugs are far beyond the reach of conventional CGF or generator-based testing tools.
Finally, we present FuzzFactory, a framework for rapidly prototyping and composing domain-specific fuzzing applications. With FuzzFactory, new fuzzing applications can be created by defining a strategy for selecting which mutated inputs should be saved as the basis for subsequent mutations; such inputs are called waypoints.
FuzzFactory provides a lightweight API for instrumenting programs such that they provide custom feedback during test execution; this feedback is used to determine if the corresponding test input should be considered a waypoint. We describe six domain-specific fuzzing applications created with FuzzFactory. We also show how two of these applications can be composed together to create a fuzzer that performs better than the sum of its parts.
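The coverage-guided loop and the notion of a waypoint described in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the API of FuzzFactory, JQF, or any real fuzzer: `program_under_test`, `mutate`, and `fuzz` are hypothetical names, the target reports its own covered branches instead of being instrumented, and the waypoint rule shown is plain "new coverage". The seed corpus is assumed to be non-empty.

```python
import random

def program_under_test(data: bytes) -> set:
    """Toy target: returns the set of branch IDs executed for this input."""
    covered = {"entry"}
    if len(data) > 0 and data[0] == ord("F"):
        covered.add("b1")
        if len(data) > 1 and data[1] == ord("U"):
            covered.add("b2")
    return covered

def mutate(data: bytes, rng) -> bytes:
    """Overwrite one random byte, or append one if the input is empty."""
    buf = bytearray(data)
    if not buf:
        buf.append(rng.randrange(256))
    else:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seed_inputs, trials=5000, seed=0):
    """Evolve a corpus by keeping mutated inputs that reach new branches."""
    rng = random.Random(seed)
    corpus = list(seed_inputs)
    total = set()
    for s in corpus:
        total |= program_under_test(s)
    for _ in range(trials):
        child = mutate(rng.choice(corpus), rng)
        cov = program_under_test(child)
        if not cov <= total:       # waypoint rule: the child covered something new
            corpus.append(child)   # saved as a basis for later mutations
            total |= cov
    return corpus, total

corpus, total = fuzz([b"AA"])
```

Swapping the condition `not cov <= total` for a different predicate over custom feedback is where a domain-specific waypoint strategy would plug in.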
Dyson: An Architecture for Extensible Wireless LANs
As wireless local area networks (WLANs) continue to evolve, the fundamental division of responsibility between the access point (AP) and the client has remained unchanged. In most cases, clients make independent decisions about associations and packet transmissions, using only locally available information. Furthermore, the IEEE 802.11 standard defines a very limited interface for transferring information between the APs and the clients. These factors impede customization of WLANs to meet site-specific challenges, and in a more general sense, impede rapid innovation to face challenges posed by new applications such as VoIP. This paper describes Dyson, an extensible architecture for WLANs, targeted primarily at enterprise scenarios. Our architecture is based on centralized, global management of channel resources. To provide extensibility, the interface between the infrastructure and clients is simple and relatively low-level, and can be controlled through a programmatic interface. Clients provide primitives that allow the central controller to control many aspects of client behavior. The controller can also instruct clients to gather and report information about channel conditions. We show that using these simple primitives, and by leveraging historical information, the network designer can easily customize many aspects of the WLAN behavior. We have built a prototype implementation of Dyson, which currently runs on a 23-node testbed distributed across one floor of a typical academic building. Using this testbed, we examine various aspects of the architecture in detail, including a range of policies for improving client-AP associations, providing user-specific airtime reservations, mitigating the effects of interference, and improving mobile handoffs. We show that Dyson is effective at providing greater efficiency while opening up the network to site-specific customizations.
Engineering and Applied Science