Formal Verification of Input-Output Mappings of Tree Ensembles
Recent advances in machine learning and artificial intelligence are now being
considered in safety-critical autonomous systems where software defects may
cause severe harm to humans and the environment. Design organizations in these
domains are currently unable to provide convincing arguments that their systems
are safe to operate when machine learning algorithms are used to implement
their software.
In this paper, we present an efficient method to extract equivalence classes
from decision trees and tree ensembles, and to formally verify that their
input-output mappings comply with requirements. The idea is that, given that
safety requirements can be traced to desirable properties on system
input-output patterns, we can use positive verification outcomes in safety
arguments. This paper presents the implementation of the method in the tool
VoTE (Verifier of Tree Ensembles), and evaluates its scalability on two case
studies presented in current literature.
We demonstrate that our method is practical for tree ensembles trained on
low-dimensional data with up to 25 decision trees and tree depths of up to 20.
Our work also studies the limitations of the method with high-dimensional data
and preliminarily investigates the trade-off between a large number of trees and
the time taken for verification.
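The idea of extracting equivalence classes can be illustrated on a single decision tree: every leaf corresponds to an axis-aligned box of inputs that all map to the same output, so a property only needs to be checked once per box. The following is a minimal sketch under stated assumptions, not the VoTE implementation; the `Node` class, `equivalence_classes`, `verify`, and the example tree are hypothetical names introduced for illustration, and ensembles are not covered.

```python
import math

class Node:
    """Hypothetical tree node: internal nodes split on x[feature] <= threshold."""
    def __init__(self, feature=-1, threshold=0.0, left=None, right=None, value=None):
        self.feature = feature
        self.threshold = threshold
        self.left = left      # branch taken when x[feature] <= threshold
        self.right = right    # branch taken when x[feature] > threshold
        self.value = value    # set on leaves only

def equivalence_classes(node, box=None):
    """Yield (box, output) pairs; box maps feature -> half-open interval (low, high]."""
    box = {} if box is None else box
    if node.value is not None:
        yield dict(box), node.value
        return
    lo, hi = box.get(node.feature, (-math.inf, math.inf))
    if lo < node.threshold:   # left branch is reachable inside the current box
        yield from equivalence_classes(
            node.left, {**box, node.feature: (lo, min(hi, node.threshold))})
    if hi > node.threshold:   # right branch is reachable inside the current box
        yield from equivalence_classes(
            node.right, {**box, node.feature: (max(lo, node.threshold), hi)})

def verify(tree, prop):
    """Check a property once per equivalence class instead of once per input."""
    return all(prop(box, out) for box, out in equivalence_classes(tree))

# Assumed example tree with three leaves.
tree = Node(feature=0, threshold=0.5,
            left=Node(value=0.0),
            right=Node(feature=1, threshold=2.0,
                       left=Node(value=1.0), right=Node(value=3.0)))

classes = list(equivalence_classes(tree))
in_range = verify(tree, lambda box, out: 0.0 <= out <= 3.0)
below_two = verify(tree, lambda box, out: out < 2.0)
```

Here `in_range` holds for every class, while `below_two` fails because one leaf outputs 3.0.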
Breaking Instance-Independent Symmetries In Exact Graph Coloring
Code optimization and high level synthesis can be posed as constraint
satisfaction and optimization problems, such as graph coloring used in register
allocation. Graph coloring is also used to model more traditional CSPs relevant
to AI, such as planning, time-tabling and scheduling. Provably optimal
solutions may be desirable for commercial and defense applications.
Additionally, for applications such as register allocation and code
optimization, naturally-occurring instances of graph coloring are often small
and can be solved optimally. A recent wave of improvements in algorithms for
Boolean satisfiability (SAT) and 0-1 Integer Linear Programming (ILP) suggests
generic problem-reduction methods, rather than problem-specific heuristics,
because (1) heuristics may be upset by new constraints, (2) heuristics tend to
ignore structure, and (3) many relevant problems are provably inapproximable.
Problem reductions often lead to highly symmetric SAT instances, and
symmetries are known to slow down SAT solvers. In this work, we compare several
avenues for symmetry breaking, in particular when certain kinds of symmetry are
present in all generated instances. Our focus on reducing CSPs to SAT allows us
to leverage recent dramatic improvement in SAT solvers and automatically
benefit from future progress. We can use a variety of black-box SAT solvers
without modifying their source code because our symmetry-breaking techniques
are static, i.e., we detect symmetries and add symmetry breaking predicates
(SBPs) during pre-processing.
An important result of our work is that among the types of
instance-independent SBPs we studied and their combinations, the simplest and
least complete constructions are the most effective. Our experiments also
clearly indicate that instance-independent symmetries should mostly be
processed together with instance-specific symmetries rather than at the
specification level, contrary to what has been suggested in the literature.
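To make the notion of a static, instance-independent symmetry-breaking predicate concrete, the sketch below reduces k-coloring to CNF and appends one very simple SBP during pre-processing. It is an illustration under stated assumptions, not one of the constructions evaluated in the paper: the SBP chosen here (pin vertex 0 to color 0) and the helper names `var`, `coloring_cnf`, and `count_models` are hypothetical, and a brute-force model counter stands in for a black-box SAT solver.

```python
from itertools import product

def var(v, c, k):
    """DIMACS-style variable index for 'vertex v has color c'."""
    return v * k + c + 1

def coloring_cnf(n_vertices, edges, k):
    """Reduce k-coloring to CNF clauses (lists of signed integers)."""
    clauses = []
    for v in range(n_vertices):
        clauses.append([var(v, c, k) for c in range(k)])           # at least one color
        for c1 in range(k):
            for c2 in range(c1 + 1, k):
                clauses.append([-var(v, c1, k), -var(v, c2, k)])   # at most one color
    for u, v in edges:
        for c in range(k):
            clauses.append([-var(u, c, k), -var(v, c, k)])         # endpoints differ
    return clauses

def count_models(clauses, n_vars):
    """Brute-force model counter, only suitable for tiny instances."""
    count = 0
    for bits in product((False, True), repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses):
            count += 1
    return count

triangle = [(0, 1), (1, 2), (0, 2)]
k = 3
cnf = coloring_cnf(3, triangle, k)

# Assumed SBP: colors are interchangeable in every instance, so vertex 0 can be
# fixed to color 0 without losing satisfiability.
sbp = [[var(0, 0, k)]]

plain = count_models(cnf, 3 * k)          # all symmetric colorings are counted
broken = count_models(cnf + sbp, 3 * k)   # fewer models, still satisfiable
```

The SBP is added as extra clauses only, so any SAT solver could consume the result without modification.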
Boosting Answer Set Optimization with Weighted Comparator Networks
Answer set programming (ASP) is a paradigm for modeling knowledge intensive
domains and solving challenging reasoning problems. In ASP solving, a typical
strategy is to preprocess problem instances by rewriting complex rules into
simpler ones. Normalization is a rewriting process that removes extended rule
types altogether in favor of normal rules. Recently, such techniques led to
optimization rewriting in ASP, where the goal is to boost answer set
optimization by refactoring the optimization criteria of interest. In this
paper, we present a novel, general, and effective technique for optimization
rewriting based on comparator networks, which are specific kinds of circuits
for reordering the elements of vectors. The idea is to connect an ASP encoding
of a comparator network to the literals being optimized and to redistribute the
weights of these literals over the structure of the network. The encoding
captures information about the weight of an answer set in auxiliary atoms in a
structured way that is proven to yield exponential improvements during
branch-and-bound optimization on an infinite family of example programs. The
comparator network used can be tuned freely, e.g., to find the best size for a
given benchmark class. Experiments show accelerated optimization performance on
several benchmark problems.
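A comparator network reorders the elements of a vector using a fixed sequence of compare-and-swap gates. On Boolean inputs each comparator reduces to an OR and an AND, and a sorting network produces a unary count of the true inputs. The sketch below is a plain Python illustration of that property using an odd-even transposition network; it is not the ASP encoding from the paper and does not model the weight redistribution. The names `transposition_network` and `apply_network` are hypothetical.

```python
from itertools import product

def transposition_network(n):
    """Odd-even transposition sorting network on n wires, as (i, j) comparators."""
    return [(i, i + 1) for r in range(n) for i in range(r % 2, n - 1, 2)]

def apply_network(bits, network):
    """Run Boolean inputs through the network; true values move to the front."""
    wires = list(bits)
    for i, j in network:
        a, b = wires[i], wires[j]
        wires[i], wires[j] = a or b, a and b   # Boolean max on wire i, min on wire j
    return wires

network = transposition_network(4)

# After sorting, output k is true exactly when at least k + 1 inputs are true.
all_sorted = all(
    apply_network(bits, network) == sorted(bits, reverse=True)
    for bits in product((False, True), repeat=4)
)
example = apply_network([False, True, False, True], network)
```

Other networks with fewer comparators could be substituted for `transposition_network` without changing `apply_network`.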
Boosting Haplotype Inference with Local Search
A very challenging problem in the genetics domain is to infer haplotypes from genotypes. This process is expected to identify genes affecting health, disease and response to drugs. One of the approaches to haplotype inference aims to minimise the number of different haplotypes used, and is known as haplotype inference by pure parsimony (HIPP). The HIPP problem is computationally difficult, being NP-hard. Recently, a SAT-based method (SHIPs) has been proposed to solve the HIPP problem. This method iteratively considers an increasing number of haplotypes, starting from an initial lower bound. Hence, one important aspect of SHIPs is the lower bounding procedure, which reduces the number of iterations of the basic algorithm, and also indirectly simplifies the resulting SAT model. This paper describes the use of local search to improve existing lower bounding procedures. The new lower bounding procedure is guaranteed to be as tight as the existing procedures. In practice the new procedure is in most cases considerably tighter, allowing significant improvement of performance on challenging problem instances.
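The iterative scheme described above, which tries an increasing number of haplotypes starting from a lower bound, can be sketched as follows. This is an illustration under stated assumptions rather than the SHIPs algorithm itself: a brute-force search replaces the SAT model, genotype sites are assumed to be coded 0 and 1 for homozygous and 2 for heterozygous, and the names `explains`, `feasible`, and `hipp` are hypothetical.

```python
from itertools import combinations, product

def explains(h1, h2, g):
    """A haplotype pair explains a genotype if it agrees on homozygous sites
    (0 or 1) and differs on heterozygous sites (coded 2)."""
    return all((a != b) if s == 2 else (a == b == s) for a, b, s in zip(h1, h2, g))

def feasible(genotypes, r):
    """Brute-force stand-in for the SAT call: look for r haplotypes explaining all genotypes."""
    m = len(genotypes[0])
    for haps in combinations(product((0, 1), repeat=m), r):
        if all(any(explains(a, b, g) for a in haps for b in haps) for g in genotypes):
            return haps
    return None

def hipp(genotypes, lower_bound=1):
    """Increase r from the lower bound until a feasible haplotype set exists."""
    r, iterations = lower_bound, 0
    while True:
        iterations += 1
        solution = feasible(genotypes, r)
        if solution is not None:
            return r, iterations
        r += 1

genotypes = [(2, 2), (0, 2), (1, 1)]
loose = hipp(genotypes, lower_bound=1)   # weak bound, more iterations
tight = hipp(genotypes, lower_bound=3)   # tight bound, single iteration
```

Both calls return the same number of haplotypes, but the tighter bound reaches it in fewer iterations, which is the effect a better lower bounding procedure aims for.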
Enabling Incrementality in the Implicit Hitting Set Approach to MaxSAT Under Changing Weights
Recent advances in solvers for the Boolean satisfiability (SAT) based optimization paradigm of maximum satisfiability (MaxSAT) have turned MaxSAT into a viable approach to finding provably optimal solutions for various types of hard optimization problems. In various types of real-world problem settings, a sequence of related optimization problems needs to be solved. This calls for studying ways of enabling incremental computations in MaxSAT, with the hope of speeding up the overall computation times. However, current state-of-the-art MaxSAT solvers offer no or limited forms of incrementality. In this work, we study ways of enabling incremental computations in the context of the implicit hitting set (IHS) approach to MaxSAT solving, which is both one of the key MaxSAT solving approaches today and a relatively well-suited candidate for extending to incremental computations. In particular, motivated by several recent applications of MaxSAT in the context of interpretability in machine learning calling for this type of incrementality, we focus on enabling incrementality in IHS under changes to the objective function coefficients (i.e., to the weights of soft clauses). To this end, we explain to what extent different search techniques applied in IHS-based MaxSAT solving can and cannot be adapted to this incremental setting. As a practical result, we develop an incremental version of an IHS MaxSAT solver, and show that it provides significant runtime improvements in recent application settings which can benefit from incrementality but in which MaxSAT solvers have so far been applied only non-incrementally, i.e., by calling a MaxSAT solver from scratch after each change to the problem instance at hand.
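One reason IHS lends itself to weight changes is that an unsatisfiable core depends only on the clauses, not on their weights, so cores found under one objective remain valid under another. The sketch below illustrates this with a toy IHS loop. It is an assumption-laden illustration, not the solver developed in the paper: brute force replaces both the SAT solver and the hitting-set optimizer, cores are not minimized, and the names `satisfiable`, `min_cost_hitting_set`, and `ihs` are hypothetical.

```python
from itertools import combinations, product

def satisfiable(clauses, n_vars):
    """Brute-force SAT check standing in for a real SAT solver."""
    return any(
        all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
        for bits in product((False, True), repeat=n_vars)
    )

def min_cost_hitting_set(cores, weights):
    """Brute-force minimum-cost set of soft-clause indices hitting every core."""
    best = None
    for size in range(len(weights) + 1):
        for hs in combinations(range(len(weights)), size):
            if all(core & set(hs) for core in cores):
                cost = sum(weights[i] for i in hs)
                if best is None or cost < best[0]:
                    best = (cost, set(hs))
    return best

def ihs(hard, soft, weights, n_vars, cores):
    """Toy IHS loop; `cores` is extended in place so it can be reused later."""
    sat_calls = 0
    while True:
        cost, hs = min_cost_hitting_set(cores, weights)
        kept = [i for i in range(len(soft)) if i not in hs]
        sat_calls += 1
        if satisfiable(hard + [soft[i] for i in kept], n_vars):
            return cost, sat_calls
        cores.append(frozenset(kept))   # the kept soft clauses form an (unminimized) core

soft = [[1], [-1], [2], [-2]]   # assumed toy instance over two variables
cores = []
first = ihs([], soft, [3, 1, 2, 5], 2, cores)

# Weights change: previously found cores are still valid and are reused.
warm = ihs([], soft, [1, 3, 5, 2], 2, cores)
cold = ihs([], soft, [1, 3, 5, 2], 2, [])
```

On this instance the warm start reaches the same optimum as the cold start with fewer SAT calls; which other IHS techniques survive weight changes is the question the abstract refers to.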
Deep Graph Laplacian Regularization for Robust Denoising of Real Images
Recent developments in deep learning have revolutionized the paradigm of
image restoration. However, its applications on real image denoising are still
limited, due to its sensitivity to training data and the complex nature of real
image noise. In this work, we combine the robustness merit of model-based
approaches and the learning power of data-driven approaches for real image
denoising. Specifically, by integrating graph Laplacian regularization as a
trainable module into a deep learning framework, we are less susceptible to
overfitting than pure CNN-based approaches, achieving higher robustness to
small datasets and cross-domain denoising. First, a sparse neighborhood graph
is built from the output of a convolutional neural network (CNN). Then the
image is restored by solving an unconstrained quadratic programming problem,
using a corresponding graph Laplacian regularizer as a prior term. The proposed
restoration pipeline is fully differentiable and hence can be end-to-end
trained. Experimental results demonstrate that our work is less prone to
overfitting given small training data. It is also endowed with strong
cross-domain generalization power, outperforming the state-of-the-art
approaches by a remarkable margin.
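The restoration step can be made concrete with the standard form of graph Laplacian regularization. The derivation below is a sketch under stated assumptions: the symbols y (noisy observation), x (restored image or patch), L (Laplacian of the learned neighborhood graph), and the weight \mu are notation introduced here, and the exact objective used in the paper may differ.

```latex
% Assumed objective: data fidelity plus graph Laplacian prior
\hat{x} = \arg\min_{x} \; \lVert y - x \rVert_2^2 + \mu \, x^{\top} L x

% Setting the gradient to zero:
2(\hat{x} - y) + 2\mu L \hat{x} = 0
\quad\Longrightarrow\quad
(I + \mu L)\, \hat{x} = y

% L is positive semidefinite, so I + \mu L is positive definite for \mu \ge 0
% and the solution is unique:
\hat{x} = (I + \mu L)^{-1} y
```

Because the solution is obtained from a linear system whose coefficients depend smoothly on the graph weights, gradients can flow back to the CNN that produces the graph, which is consistent with the pipeline being trained end-to-end.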