9,367 research outputs found
Predicting SMT solver performance for software verification
The approach Why3 takes to interfacing with a wide variety of interactive
and automatic theorem provers works well: it is designed to overcome
limitations on what can be proved by a system which relies on a single
tightly-integrated solver. In common with other systems, however, the degree
to which proof obligations (or “goals”) are proved depends as much on
the SMT solver as on the properties of the goal itself. In this work, we present a
method to use syntactic analysis to characterise goals and predict the most
appropriate solver via machine-learning techniques.
Combining solvers in this way (a portfolio-solving approach) maximises
the number of goals which can be proved. The driver-based architecture of
Why3 presents a unique opportunity to use a portfolio of SMT solvers for
software verification. The intelligent scheduling of solvers minimises the
time it takes to prove these goals by avoiding solvers which return Timeout
and Unknown responses. We assess the suitability of a number of
machine-learning algorithms for this scheduling task.
The performance of our tool Where4 is evaluated on a dataset of proof
obligations. We compare Where4 to a range of SMT solvers and theoretical
scheduling strategies. We find that Where4 can out-perform individual
solvers by proving a greater number of goals in a shorter average time.
Furthermore, Where4 can integrate into a Why3 user’s normal workflow,
simplifying and automating the non-expert use of SMT solvers for software
verification.
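The scheduling idea described above can be sketched in a few lines: rank the portfolio by predicted runtime for a goal and skip any solver predicted to exceed the timeout. The solver names, the single "size" feature, and the linear cost model below are illustrative assumptions for the sketch, not Where4's actual learned predictor:

```python
# Minimal sketch of portfolio scheduling: order solvers by predicted
# runtime on a goal and drop those predicted to time out.
# Solver weights and the goal feature are made up for illustration.

TIMEOUT = 5.0  # seconds


def predict_runtime(solver, features):
    # Stand-in for a trained model mapping (solver, goal features)
    # to an estimated runtime in seconds.
    weights = {"Alt-Ergo": 0.8, "CVC4": 1.1, "Z3": 0.6}
    return weights[solver] * features["size"] / 10.0


def schedule(features, solvers=("Alt-Ergo", "CVC4", "Z3")):
    # Predicted-fastest first; skip solvers expected to hit the timeout,
    # avoiding the Timeout/Unknown responses mentioned in the abstract.
    ranked = sorted(solvers, key=lambda s: predict_runtime(s, features))
    return [s for s in ranked if predict_runtime(s, features) <= TIMEOUT]


order = schedule({"size": 30})
print(order)  # solvers in predicted-fastest-first order
```

In the real tool the prediction step would come from a model trained on syntactic goal features; here a fixed weight table stands in for it.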
The limits of human predictions of recidivism.
Dressel and Farid recently found that laypeople were as accurate as statistical algorithms in predicting whether a defendant would reoffend, casting doubt on the value of risk assessment tools in the criminal justice system. We report the results of a replication and extension of Dressel and Farid's experiment. Under conditions similar to the original study, we found nearly identical results, with humans and algorithms performing comparably. However, algorithms beat humans in the three other datasets we examined. The performance gap between humans and algorithms was particularly pronounced when, in a departure from the original study, participants were not provided with immediate feedback on the accuracy of their responses. Algorithms also outperformed humans when the information provided for predictions included an enriched (versus restricted) set of risk factors. These results suggest that algorithms can outperform human predictions of recidivism in ecologically valid settings.
Large Area Crop Inventory Experiment (LACIE). Experiment plan for evaluation of LANDSAT agronomic variables using wheat intensive test sites
There are no author-identified significant results in this report.
The Configurable SAT Solver Challenge (CSSC)
It is well known that different solution strategies work well for different
types of instances of hard combinatorial problems. As a consequence, most
solvers for the propositional satisfiability problem (SAT) expose parameters
that allow them to be customized to a particular family of instances. In the
international SAT competition series, these parameters are ignored: solvers are
run using a single default parameter setting (supplied by the authors) for all
benchmark instances in a given track. While this competition format rewards
solvers with robust default settings, it does not reflect the situation faced
by a practitioner who only cares about performance on one particular
application and can invest some time into tuning solver parameters for this
application. The new Configurable SAT Solver Competition (CSSC) compares
solvers in this latter setting, scoring each solver by the performance it
achieved after a fully automated configuration step. This article describes the
CSSC in more detail, and reports the results obtained in its two instantiations
so far, CSSC 2013 and 2014.
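The configuration step the CSSC scores can be illustrated with a toy search over a parameter space, picking the setting with the lowest mean runtime on training instances. The parameter names, the stand-in cost function, and the exhaustive grid search below are assumptions for the sketch; the actual competition used automated configurators rather than a grid:

```python
# Toy illustration of automated solver configuration: exhaustively try
# parameter combinations and keep the one with the lowest mean "runtime".
# Parameter names and the cost model are invented stand-ins, not a real
# SAT solver interface.
import itertools

PARAM_SPACE = {"restart_interval": [50, 100, 200], "phase_saving": [0, 1, 2]}


def run_solver(config, instance):
    # Pretend runtime: larger restart intervals help, nonzero
    # phase saving adds a small constant cost in this toy model.
    return instance / config["restart_interval"] + config["phase_saving"] * 0.1


def configure(instances):
    keys = list(PARAM_SPACE)
    best_cfg, best_cost = None, float("inf")
    for values in itertools.product(*(PARAM_SPACE[k] for k in keys)):
        cfg = dict(zip(keys, values))
        cost = sum(run_solver(cfg, i) for i in instances) / len(instances)
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost
    return best_cfg


best = configure([100, 200, 400])
print(best)
```

This mirrors the competition's setting only in shape: each solver is scored by the performance its best-found configuration achieves, rather than by a single author-supplied default.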