Your Proof Fails? Testing Helps to Find the Reason
Applying deductive verification to formally prove that a program respects its
formal specification is a complex and time-consuming task, in particular
because of the lack of feedback in case of proof failures. Besides a
non-compliance between the code and its specification (due to an error in at
least one of them), possible reasons for a proof failure include a missing or
too-weak specification for a called function or a loop, and a timeout or the
simple incapacity of the prover to finish a particular proof. This work
proposes a new methodology in which test generation helps to identify the
reason for a proof failure and to exhibit a counter-example that clearly
illustrates the issue.
issue. We describe how to transform an annotated C program into C code suitable
for testing and illustrate the benefits of the method on comprehensive
examples. The method has been implemented in STADY, a plugin of the software
analysis platform FRAMA-C. Initial experiments show that detecting
non-compliances and contract weaknesses makes it possible to precisely
diagnose most proof failures.
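To make the idea concrete, here is a minimal Python sketch of the
spec-to-test translation (the actual tool works on ACSL-annotated C via
Frama-C; the function names and the toy bug below are illustrative, not
taken from the paper): the contract becomes an executable check, and test
generation searches for an input that shows why the proof failed.

```python
import random

def mult(a: int, b: int) -> int:
    """Multiply by repeated addition; buggy for negative b."""
    acc = 0
    for _ in range(b):            # range(b) is empty when b < 0
        acc += a
    return acc

def mult_checked(a: int, b: int) -> int:
    # Executable rendering of the postcondition from the specification.
    result = mult(a, b)
    assert result == a * b, f"non-compliance at a={a}, b={b}: got {result}"
    return result

# Random generation stands in here for the structural test generator:
random.seed(0)
for _ in range(1000):
    a, b = random.randint(-10, 10), random.randint(-10, 10)
    try:
        mult_checked(a, b)
    except AssertionError as exc:
        print("counter-example:", exc)
        break
```

A counter-example violating the translated contract signals a
non-compliance; under the methodology sketched above, a failure against a
callee's contract would instead point to a contract weakness.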
Stateful Testing: Finding More Errors in Code and Contracts
Automated random testing has been shown to be an effective approach to finding
faults but still faces a major unsolved issue: how to generate test inputs
diverse enough to find many faults and find them quickly. Stateful testing, the
automated testing technique introduced in this article, generates new test
cases that improve an existing test suite. The generated test cases are
designed to violate the dynamically inferred contracts (invariants)
characterizing the existing test suite. As a consequence, they are in a good
position to detect new errors, and also to improve the accuracy of the inferred
contracts by discovering those that are unsound. Experiments on 13 data
structure classes totalling over 28,000 lines of code demonstrate the
effectiveness of stateful testing in improving over the results of long
sessions of random testing: stateful testing found 68.4% new errors and
improved the accuracy of automatically inferred contracts to over 99%, with
just a 7% time overhead.
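The two phases can be sketched in a few lines of Python (the Counter class
and the single invariant template below are our illustration, not the
article's subject programs): infer candidate contracts from an existing
suite, then generate call sequences aimed at violating them.

```python
import random

class Counter:
    def __init__(self): self.n = 0
    def inc(self): self.n += 1
    def dec(self): self.n -= 1            # unguarded: n may go negative

# Phase 1: the existing suite never drives n below zero, so dynamic
# inference would propose the (unsound) candidate invariant n >= 0.
def existing_suite():
    c = Counter()
    for _ in range(5):
        c.inc()
    c.dec()
    assert c.n == 4

existing_suite()
candidate_invariant = lambda c: c.n >= 0

# Phase 2: generate call sequences that try to violate the candidate.
def stateful_testing(trials=100):
    for trial in range(trials):
        c = Counter()
        for _ in range(random.randint(1, 10)):
            random.choice([c.inc, c.dec])()
            if not candidate_invariant(c):
                return f"trial {trial}: n = {c.n} refutes the inferred n >= 0"
    return "no violation found"

random.seed(1)
print(stateful_testing())
```

A violating sequence either exposes a new error or proves the inferred
contract unsound, which is exactly the dual benefit the article reports.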
Exploring the link between test suite quality and automatic specification inference
While no one doubts the importance of correct and complete specifications, many industrial systems
still do not have formal specifications written out, and even when they do, it is hard to check their
correctness and completeness. This work explores the possibility of using an invariant extraction tool
such as Daikon to automatically infer specifications from available test suites, with the idea of helping
software engineers improve their specifications by giving them another version to compare against. Given that
our initial experiments did not produce satisfactory results, in this paper we explore which test suite
attributes influence the quality of the inferred specification. Following further study, we found that
instruction, branch and method coverage are correlated with high recall values, reaching up to 97.93%.
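As a rough illustration of the inference step (Daikon itself instruments
Java programs and has a far richer grammar of invariant templates; this toy
checker and its three templates are ours):

```python
def infer_invariants(traces):
    """traces: list of dicts mapping variable name -> observed value."""
    names = sorted(traces[0])
    invs = []
    for v in names:
        vals = [t[v] for t in traces]
        if all(x >= 0 for x in vals):
            invs.append(f"{v} >= 0")
        if len(set(vals)) == 1:
            invs.append(f"{v} == {vals[0]}")
    for a in names:
        for b in names:
            if a != b and all(t[a] <= t[b] for t in traces):
                invs.append(f"{a} <= {b}")
    return invs

# A weak suite yields over-fitted "specifications"; more varied traces
# prune them -- the coverage-to-quality link this paper investigates.
weak = [{"lo": 0, "hi": 10}, {"lo": 2, "hi": 10}]
rich = weak + [{"lo": -5, "hi": -1}]
print(infer_invariants(weak))   # spurious: 'hi == 10', 'lo >= 0', ...
print(infer_invariants(rich))   # only 'lo <= hi' survives
```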
Inferring Concise Specifications of APIs
Modern software relies on libraries and uses them via application programming
interfaces (APIs). Correct API usage as well as many software engineering tasks
are enabled when APIs have formal specifications. In this work, we analyze the
implementation of each method in an API to infer a formal postcondition.
Conventional wisdom is that, if one has preconditions, then one can use the
strongest postcondition predicate transformer (SP) to infer postconditions.
However, SP yields postconditions that are exponentially large, which makes
them difficult to use, either by humans or by tools. Our key idea is an
algorithm that converts such exponentially large specifications into a form
that is more concise and thus more usable. This is done by leveraging the
structure of the specifications that result from the use of SP. We applied our
technique to infer postconditions for over 2,300 methods in seven popular Java
libraries. Our technique was able to infer specifications for 75.7% of these
methods, each of which was verified using an Extended Static Checker. We also
found that 84.6% of resulting specifications were less than 1/4 page (20 lines)
in length. Our technique was able to reduce the length of SMT proofs needed for
verifying implementations by 76.7% and reduced prover execution time by 26.7%
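To see where the exponential size comes from, recall the standard
strongest-postcondition rule for conditionals (a textbook identity, not a
formula quoted from this paper):

```latex
SP(\mathtt{if}\ C\ \mathtt{then}\ S_1\ \mathtt{else}\ S_2,\; P)
  \;=\; SP(S_1,\, P \land C) \;\lor\; SP(S_2,\, P \land \lnot C)
```

Each conditional duplicates the incoming predicate P into both branches, so
a sequence of n conditionals can produce on the order of 2^n disjuncts; the
algorithm described above exploits the regular structure of these disjuncts
to fold them back into a concise, usable postcondition.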
Test generation for high coverage with abstraction refinement and coarsening (ARC)
Testing is the main approach used in the software industry to expose failures. Producing thorough test suites is an expensive and error-prone task that can greatly benefit from automation. Two challenging problems in test automation are generating test inputs and evaluating the adequacy of test suites: the first amounts to producing a set of test cases that accurately represent the software behavior, the second requires defining appropriate metrics to evaluate the thoroughness of the testing activities. Structural testing addresses these problems by measuring the number of code elements that are executed by a test suite. The code elements that are not covered by any execution are natural candidates for generating further test cases, and the measured coverage rate can be used to estimate the thoroughness of the test suite. Several empirical studies show that test suites achieving high coverage rates exhibit a high failure detection ability. However, producing highly covering test suites automatically is hard, as certain code elements are executed only under complex conditions while others might not be reachable at all.

In this thesis we propose Abstraction Refinement and Coarsening (ARC), a goal-oriented technique that combines static and dynamic software analysis to automatically generate test suites with high code coverage. At the core of our approach is an abstract program model that enables the synergistic application of the different analysis components. In ARC we integrate Dynamic Symbolic Execution (DSE) and abstraction refinement to precisely direct test generation towards the coverage goals and to detect infeasible elements. ARC includes a novel coarsening algorithm for improved scalability.

We implemented ARC-B, a prototype tool that analyses C programs and produces test suites achieving high branch coverage. Our experiments show that the approach effectively exploits the synergy between symbolic testing and reachability analysis, outperforming state-of-the-art test generation approaches. We evaluated ARC-B on industry-relevant software and exposed previously unknown failures in a safety-critical software component.
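One step of such goal-directed generation can be sketched as follows (a
minimal Python sketch assuming the z3-solver package; ARC-B itself analyses
C programs and adds the abstraction refinement and coarsening this toy
omits): the path condition of an uncovered branch is handed to a constraint
solver, and unsatisfiability marks the element as infeasible.

```python
from z3 import And, Int, Solver, sat

x, y = Int("x"), Int("y")

# Branch condition of a hypothetical target:  if (x*x + y > 100 and x < 0)
uncovered_goal = And(x * x + y > 100, x < 0)   # rarely hit by random inputs

s = Solver()
s.add(uncovered_goal)
if s.check() == sat:
    m = s.model()
    print("input reaching the branch:", m[x], m[y])  # feed back as a test
else:
    # An unsatisfiable path condition proves the element infeasible, so
    # the coverage goal can be discharged instead of chased forever.
    print("branch is infeasible")
```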
Building Program Vector Representations for Deep Learning
Deep learning has made significant breakthroughs in various fields of
artificial intelligence. Advantages of deep learning include the ability to
capture highly complicated features, weak involvement of human engineering,
etc. However, it is still virtually impossible to use deep learning to analyze
programs, since deep architectures cannot be trained effectively with pure
backpropagation. In this pioneering paper, we propose the "coding criterion" to
build program vector representations, which are the premise of deep learning
for program analysis. Our representation learning approach directly makes deep
learning a reality in this new field. We evaluate the learned vector
representations both qualitatively and quantitatively. We conclude, based on
the experiments, that the coding criterion is successful in building program
representations. To evaluate whether deep learning is beneficial for program
analysis, we feed the representations to deep neural networks, and achieve
higher accuracy in the program classification task than "shallow" methods, such
as logistic regression and the support vector machine. This result confirms the
feasibility of deep learning to analyze programs. It also gives primary
evidence of its success in this new field. We believe deep learning will become
an outstanding technique for program analysis in the near future.
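A toy numpy rendering of the idea as we read it from the abstract (the real
criterion and its training details are in the paper; the node types,
dimensions, and mean-of-children combination below are our simplifications):
learn node vectors such that a parent's vector is predictable from its
children's.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
vec = {t: rng.normal(size=DIM)
       for t in ["If", "While", "Assign", "Expr", "Block"]}
W = rng.normal(size=(DIM, DIM)) * 0.1
b = np.zeros(DIM)

# (parent, children) pairs as they would be harvested from ASTs:
samples = [("If", ["Expr", "Block"]), ("Block", ["Assign", "Assign"])]

lr = 0.01
for _ in range(500):
    for parent, kids in samples:
        kid_mean = np.mean([vec[k] for k in kids], axis=0)
        pred = np.tanh(W @ kid_mean + b)
        err = pred - vec[parent]            # "coding error" to minimize
        grad = (1.0 - pred ** 2) * err      # backprop through tanh
        W -= lr * np.outer(grad, kid_mean)
        b -= lr * grad
        vec[parent] += lr * err             # pull the parent toward pred

loss = sum(float(np.sum((np.tanh(W @ np.mean([vec[k] for k in kids], axis=0)
                                 + b) - vec[p]) ** 2))
           for p, kids in samples)
print(f"reconstruction error after training: {loss:.4f}")
```

The trained vectors, not this toy optimizer, are what get fed to deeper
networks for tasks such as program classification.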
A Brief Survey on Oracle-based Test Adequacy Metrics
Even though code coverage is a widespread and popular test adequacy metric,
it has several limitations. One of the major limitations is that code coverage
does not satisfy the necessary conditions for effective fault detection, as it
only cares about executing different parts of a program. Studies showed that
code coverage as a test adequacy metric is a poor indicator of the quality of a
test suite as it does not consider the quality of test oracles. To address this
limitation, researchers proposed extensions to traditional code coverage
metrics that explicitly consider test oracle quality. We call these metrics
oracle-based code coverage. This survey discusses all oracle-based
techniques published since 2007.
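The limitation is easy to demonstrate in a few lines (the toy metric below
is loosely inspired by oracle-aware metrics such as checked coverage; the
"influence" bookkeeping is hand-written for the example, not a real
analysis):

```python
def program(x):
    a = x + 1          # influences the assertion below
    b = x * 1000       # executed, but no oracle ever checks it
    return a, b

def test_weak_oracle():
    a, b = program(3)
    assert a == 4      # the oracle only constrains 'a'

test_weak_oracle()
executed = {"a = x + 1", "b = x * 1000"}   # both statements ran
checked = {"a = x + 1"}                    # only this one reaches an oracle
print(f"statement coverage:    {len(executed) / 2:.0%}")  # 100%
print(f"oracle-based coverage: {len(checked) / 2:.0%}")   # 50%
```

Plain statement coverage reports a perfect score even though half the
program's behavior is never checked, which is the gap these metrics expose.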
Recurrence-based time series analysis by means of complex network methods
Complex networks are an important paradigm of modern complex systems sciences
which allows quantitatively assessing the structural properties of systems
composed of different interacting entities. In recent years, intensive
efforts have been devoted to applying network-based concepts also to the
analysis of dynamically relevant higher-order statistical properties of time
series. Notably, many corresponding approaches are closely related to the
concept of recurrence in phase space. In this paper, we review recent
methodological advances in time series analysis based on complex networks, with
a special emphasis on methods founded on recurrence plots. The potentials and
limitations of the individual methods are discussed and illustrated for
paradigmatic examples of dynamical systems as well as for real-world time
series. Complex network measures are shown to provide information about
structural features of dynamical systems that are complementary to those
characterized by other methods of time series analysis and, hence,
substantially enrich the knowledge gathered from other existing (linear as well
as nonlinear) approaches.
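The basic construction the review centers on fits in a short script (a
sketch assuming numpy and networkx; the logistic-map series, embedding
parameters and threshold are arbitrary choices for illustration):

```python
import numpy as np
import networkx as nx

# Sample a trajectory of the logistic map as a stand-in time series.
x = np.empty(500); x[0] = 0.4
for i in range(499):
    x[i + 1] = 4.0 * x[i] * (1.0 - x[i])

# Time-delay embedding (dimension 2, delay 1) to reconstruct phase space.
pts = np.column_stack([x[:-1], x[1:]])

# Recurrence matrix R_ij = Theta(eps - ||p_i - p_j||), diagonal removed.
eps = 0.05
dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
R = (dist < eps).astype(int)
np.fill_diagonal(R, 0)

# Read the recurrence matrix as the adjacency matrix of a network.
G = nx.from_numpy_array(R)
print("edge density:", nx.density(G))
print("transitivity:", nx.transitivity(G))  # a recurrence-network measure
```

Network measures computed on G then characterize geometric properties of
the underlying attractor, complementing classical linear and nonlinear time
series statistics.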
Developing a distributed electronic health-record store for India
The DIGHT project addresses the problem of building a scalable and highly available information store for the Electronic Health Records (EHRs) of the over one billion citizens of India.