A Brief Survey on Oracle-based Test Adequacy Metrics
Even though code coverage is a widespread and popular test adequacy metric,
it has several limitations. One of the major limitations is that code coverage
does not satisfy the necessary conditions for effective fault detection, as it
only measures whether different parts of a program are executed. Studies have shown that
code coverage is a poor indicator of the quality of a test suite as a test adequacy metric
because it does not consider the quality of test oracles. To address this
limitation, researchers have proposed extensions to traditional code coverage
metrics that explicitly account for test oracle quality. We refer to these metrics as
\textit{oracle-based code coverage}. This survey discusses the oracle-based
coverage techniques published since 2007.
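To make the limitation concrete, the following minimal sketch (with a hypothetical function and tests, not drawn from the survey) shows two tests that both achieve full statement coverage, yet only one carries an oracle strong enough to detect a seeded fault:

# A minimal sketch of why coverage alone can mislead: both tests below execute
# every statement of `absolute`, yet only the second contains an oracle strong
# enough to catch the seeded fault. Function and test names are hypothetical.

def absolute(x):
    if x < 0:
        return x   # seeded fault: should be -x
    return x

def test_weak_oracle():
    # 100% statement coverage, but the oracle checks nothing useful:
    # the fault goes undetected.
    absolute(-5)
    absolute(5)

def test_strong_oracle():
    # Same coverage, but the assertions act as a real oracle and
    # expose the seeded fault.
    assert absolute(-5) == 5
    assert absolute(5) == 5

if __name__ == "__main__":
    test_weak_oracle()          # passes despite the fault
    try:
        test_strong_oracle()    # fails, revealing the fault
    except AssertionError:
        print("strong oracle detected the seeded fault")

An oracle-based coverage metric would score these two suites differently, even though plain statement coverage rates them identically.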
Generalized Abstract Symbolic Summaries
Current techniques for validating and verifying program changes often consider the entire program, even for small changes, leading to enormous V&V costs over a program's lifetime. This is due, in large part, to the use of syntactic differencing techniques, which are necessarily imprecise. Building on recent advances in symbolic execution of heap-manipulating programs, in this paper we develop techniques for performing abstract semantic differencing of program behaviors that offer the potential for improved precision.
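As a rough illustration of semantic (as opposed to syntactic) differencing, the sketch below uses the z3 SMT solver to check whether two syntactically different integer-valued program versions can disagree on any input; the paper's abstract summaries for heap-manipulating programs are not modeled here, only the underlying idea:

# A minimal sketch of semantic differencing with the z3 SMT solver.
# A syntactic diff flags these two versions as changed; a semantic check
# proves that no input can distinguish them.

from z3 import Int, If, Solver, sat

x = Int("x")

# Version 1: original
def v1(x):
    return If(x < 0, -x, x)

# Version 2: syntactically different, semantically equivalent rewrite
def v2(x):
    return If(x >= 0, x, -x)

s = Solver()
s.add(v1(x) != v2(x))   # ask for an input on which the two versions disagree

if s.check() == sat:
    print("behavioral difference on input:", s.model()[x])
else:
    print("no behavioral difference: the change is semantics-preserving")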
Runtime Verification in Context: Can Optimizing Error Detection Improve Fault Diagnosis?
Runtime verification has primarily been developed and evaluated as a means of enriching the software testing process. While many researchers have pointed to its potential applicability in online approaches to software fault tolerance, there has been a dearth of work exploring the details of how that might be accomplished. In this paper, we describe how a component-oriented approach to software health management exposes the connections between program execution, error detection, fault diagnosis, and recovery. We identify both research challenges and opportunities in exploiting those connections. Specifically, we describe how recent approaches to reducing the overhead of runtime monitoring aimed at error detection might be adapted to reduce the overhead and improve the effectiveness of fault diagnosis.
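For readers unfamiliar with runtime verification, a minimal sketch of a monitor used for error detection is shown below; the property, event names, and class are hypothetical and chosen only to illustrate the kind of observation stream a health-management layer could route into fault diagnosis:

# A minimal sketch of runtime monitoring for error detection: a small state
# machine checks the (hypothetical) property "a connection must be opened
# before use and not used after close" over an event stream emitted by the
# running program.

class ConnectionMonitor:
    def __init__(self):
        self.open = False
        self.violations = []

    def observe(self, event):
        # Each observed event either updates the monitor state or records a
        # property violation that a diagnosis engine could later consume.
        if event == "open":
            self.open = True
        elif event == "close":
            self.open = False
        elif event == "use" and not self.open:
            self.violations.append("use before open / after close")

if __name__ == "__main__":
    monitor = ConnectionMonitor()
    for e in ["open", "use", "close", "use"]:   # the last event violates the property
        monitor.observe(e)
    print(monitor.violations)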
Harnessing Neuron Stability to Improve DNN Verification
Deep Neural Networks (DNNs) have emerged as an effective approach to tackling
real-world problems. However, like human-written software, DNNs are susceptible
to bugs and attacks. This has generated significant interest in developing
effective and scalable DNN verification techniques and tools. In this paper, we
present VeriStable, a novel extension of the recently proposed DPLL-based
constraint DNN verification approach. VeriStable leverages the insight that
while neuron behavior may be non-linear across the entire DNN input space, at
intermediate states computed during verification many neurons may be
constrained to have linear behavior; these neurons are stable. Efficiently
detecting stable neurons reduces combinatorial complexity without compromising
the precision of abstractions. Moreover, the structure of clauses arising in
DNN verification problems shares important characteristics with industrial SAT
benchmarks. We adapt and incorporate multi-threading and restart optimizations
targeting those characteristics to further optimize DPLL-based DNN
verification. We evaluate the effectiveness of VeriStable across a range of
challenging benchmarks, including fully connected feedforward networks (FNNs),
convolutional neural networks (CNNs), and residual networks (ResNets) applied to
the standard MNIST and CIFAR datasets. Preliminary results show that VeriStable
is competitive and outperforms state-of-the-art DNN verification tools,
including α,β-CROWN and MN-BaB, the first- and second-place performers in
VNN-COMP, respectively. VeriStable and the experimental data are available at:
https://github.com/veristable/veristabl
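A minimal sketch of the neuron-stability notion for ReLU networks is given below; the interval bound propagation and the example weights are illustrative stand-ins, not VeriStable's actual bounding or DPLL machinery:

# Given interval bounds on a neuron's pre-activation value at some point
# during verification, a ReLU neuron is "stable" if its sign is fixed, so the
# ReLU behaves linearly and introduces no case split. The bound computation
# here is plain interval arithmetic for a single affine layer.

import numpy as np

def preactivation_bounds(W, b, lo, hi):
    """Interval-propagate the input box [lo, hi] through the affine layer W x + b."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    lower = W_pos @ lo + W_neg @ hi + b
    upper = W_pos @ hi + W_neg @ lo + b
    return lower, upper

def classify_neurons(lower, upper):
    """A ReLU neuron is stable when its pre-activation cannot change sign."""
    stably_active = lower >= 0          # ReLU acts as the identity
    stably_inactive = upper <= 0        # ReLU acts as the constant 0
    unstable = ~(stably_active | stably_inactive)
    return stably_active, stably_inactive, unstable

if __name__ == "__main__":
    W = np.array([[1.0, -2.0], [0.5, 0.5], [-1.0, -1.0]])
    b = np.array([0.1, -3.0, 0.0])
    lo, hi = np.array([0.0, 0.0]), np.array([1.0, 1.0])
    lower, upper = preactivation_bounds(W, b, lo, hi)
    active, inactive, unstable = classify_neurons(lower, upper)
    print("stably active:", active)
    print("stably inactive:", inactive)
    print("unstable:", unstable)

Only the unstable neurons force the verifier to branch, which is why cheaply identifying the stable ones reduces combinatorial complexity without losing precision.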
Development Context Driven Change Awareness and Analysis Framework
Recent work on workspace monitoring allows conflict prediction early in the development process; however, these approaches mostly use syntactic differencing techniques to compare different program versions. In contrast, traditional change-impact analysis techniques analyze related versions of the program only after the code has been checked into the master repository. We propose a novel approach, DeCAF (Development Context Analysis Framework), that leverages the development context to scope a change impact analysis technique. The goal is to characterize the impact of each developer on other developers in the team. Various client applications, such as task prioritization, early conflict detection, and advice on testing, can benefit from such a characterization. The DeCAF framework leverages information from the development context to bound the iDiSE change impact analysis technique to analyze only the parts of the code base that are of interest. Bounding the analysis enables DeCAF to efficiently compute the impact of changes using a combination of program dependence and symbolic execution based approaches.
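The bounding idea can be sketched as reverse reachability over a program dependence graph restricted to the current development context; the graph, entity names, and context set below are hypothetical, and iDiSE's symbolic-execution refinement of the impact set is not modeled:

# A minimal sketch of context-bounded change impact analysis: compute the
# entities potentially impacted by a change set, but never explore outside the
# development context (e.g., the code teammates are currently working on).

from collections import deque

def bounded_impact(dependents, changed, context):
    """dependents maps each entity to the entities that depend on it;
    the traversal never leaves `context`, which bounds the analysis."""
    impacted = set(changed) & context
    frontier = deque(impacted)
    while frontier:
        entity = frontier.popleft()
        for dep in dependents.get(entity, ()):
            if dep in context and dep not in impacted:
                impacted.add(dep)
                frontier.append(dep)
    return impacted

if __name__ == "__main__":
    dependents = {
        "parse": {"compile", "lint"},
        "compile": {"build"},
        "lint": {"ci_report"},
        "build": {"deploy"},
    }
    context = {"parse", "compile", "build"}     # teammates' working sets
    print(bounded_impact(dependents, {"parse"}, context))
    # -> {'parse', 'compile', 'build'}; 'lint', 'ci_report', 'deploy' are out of scope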
Exact and Approximate Probabilistic Symbolic Execution
Probabilistic software analysis seeks to quantify the likelihood of reaching a target event under uncertain environments. Recent approaches compute probabilities of execution paths using symbolic execution, but do not support nondeterminism. Nondeterminism arises naturally when no suitable probabilistic model can capture a program behavior, e.g., for multithreading or distributed systems. In this work, we propose a technique, based on symbolic execution, to synthesize schedulers that resolve nondeterminism to maximize the probability of reaching a target event. To scale to large systems, we also introduce approximate algorithms to search for good schedulers, speeding up established random sampling and reinforcement learning results through the quantification of path probabilities based on symbolic execution. We implemented the techniques in Symbolic PathFinder and evaluated them on nondeterministic Java programs. We show that our algorithms significantly improve upon a state-of-the-art statistical model checking algorithm, originally developed for Markov Decision Processes.
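A minimal sketch of exact scheduler synthesis over an already-explored symbolic execution tree is shown below; the tree encoding and node labels are hypothetical, with path probabilities assumed to have been computed from path conditions beforehand:

# Probabilistic nodes weight their children by path probabilities, while
# nondeterministic nodes are resolved by choosing the child that maximizes the
# probability of reaching the target event; the chosen branches form the
# synthesized scheduler.

def max_reach_probability(node, scheduler):
    """Maximal probability of reaching a target leaf under the best resolution
    of nondeterminism, recording the choices in `scheduler`."""
    kind = node["kind"]
    if kind == "leaf":
        return 1.0 if node["target"] else 0.0
    if kind == "prob":
        # probabilistic branching: probability-weighted sum over children
        return sum(p * max_reach_probability(child, scheduler)
                   for p, child in node["children"])
    # nondeterministic branching: pick the maximizing child
    best_label, best_value = None, -1.0
    for label, child in node["children"]:
        value = max_reach_probability(child, scheduler)
        if value > best_value:
            best_label, best_value = label, value
    scheduler[node["id"]] = best_label
    return best_value

if __name__ == "__main__":
    tree = {"kind": "nondet", "id": "s0", "children": [
        ("a", {"kind": "prob", "children": [
            (0.3, {"kind": "leaf", "target": True}),
            (0.7, {"kind": "leaf", "target": False})]}),
        ("b", {"kind": "prob", "children": [
            (0.6, {"kind": "leaf", "target": True}),
            (0.4, {"kind": "leaf", "target": False})]}),
    ]}
    scheduler = {}
    print(max_reach_probability(tree, scheduler), scheduler)  # 0.6 {'s0': 'b'}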