Improve SAT-solving with Machine Learning
In this project, we aimed to improve the runtime of Minisat, a
Conflict-Driven Clause Learning (CDCL) solver that solves the Propositional
Boolean Satisfiability (SAT) problem. We first used a logistic regression model
to predict the satisfiability of propositional boolean formulae after fixing
the values of a certain fraction of the variables in each formula. We then
applied the logistic model and added a preprocessing period to Minisat to
determine the preferable initial value (either true or false) of each boolean
variable using a Monte-Carlo approach. Concretely, for each Monte-Carlo trial,
we fixed the values of a certain ratio of randomly selected variables, and
calculated the confidence that the resulting sub-formula is satisfiable with
our logistic regression model. The initial value of each variable was set based
on the mean confidence scores of the trials that started from the literals of
that variable. We were particularly interested in correctly setting the initial
values of the backbone variables, which are the variables that take the same
value in all solutions of a SAT formula. Our Monte-Carlo method set 78%
of the backbones correctly. Excluding the preprocessing time, compared with the
default setting of Minisat, the runtime of Minisat for satisfiable formulae
decreased by 23%. However, once preprocessing time is included, our method did
not outperform vanilla Minisat in total runtime, as the reduction in conflicts
was outweighed by the long runtime of the preprocessing period.
Comment: 2 pages, SIGCSE SRC 201
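The Monte-Carlo polarity selection described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`simplify`, `toy_confidence`, `preferred_polarities`), the default `ratio` and `trials` values, and the DIMACS-style clause encoding are all assumptions made here, and `toy_confidence` is a hypothetical stand-in for the logistic regression model, which the abstract does not specify.

```python
import random
from typing import Callable, Dict, List, Optional

Clause = List[int]  # DIMACS-style literals: 3 means x3, -3 means NOT x3


def simplify(clauses: List[Clause], assignment: Dict[int, bool]) -> Optional[List[Clause]]:
    """Apply a partial assignment; return residual clauses, or None if a clause is falsified."""
    residual = []
    for clause in clauses:
        kept = []
        satisfied = False
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                if assignment[var] == (lit > 0):
                    satisfied = True
                    break
            else:
                kept.append(lit)
        if satisfied:
            continue
        if not kept:
            return None  # every literal was assigned false
        residual.append(kept)
    return residual


def toy_confidence(residual: Optional[List[Clause]]) -> float:
    """Hypothetical stand-in for the logistic model (assumption, not from the source).

    Scores a sub-formula by the chance that a uniformly random assignment
    satisfies every clause, treating clauses as independent.
    """
    if residual is None:
        return 0.0
    score = 1.0
    for clause in residual:
        score *= 1.0 - 0.5 ** len(clause)
    return score


def preferred_polarities(
    clauses: List[Clause],
    num_vars: int,
    ratio: float = 0.3,
    trials: int = 2000,
    confidence: Callable[[Optional[List[Clause]]], float] = toy_confidence,
    seed: int = 0,
) -> Dict[int, bool]:
    """Pick an initial value per variable from mean confidence over random trials."""
    rng = random.Random(seed)
    k = max(1, int(ratio * num_vars))
    keys = [(v, b) for v in range(1, num_vars + 1) for b in (True, False)]
    totals = {key: 0.0 for key in keys}
    counts = {key: 0 for key in keys}
    for _ in range(trials):
        # Fix a fraction of randomly selected variables to random values.
        chosen = rng.sample(range(1, num_vars + 1), k)
        assignment = {v: rng.random() < 0.5 for v in chosen}
        score = confidence(simplify(clauses, assignment))
        # Credit the score to every literal this trial started from.
        for v, b in assignment.items():
            totals[(v, b)] += score
            counts[(v, b)] += 1
    result = {}
    for v in range(1, num_vars + 1):
        mean_true = totals[(v, True)] / counts[(v, True)] if counts[(v, True)] else 0.0
        mean_false = totals[(v, False)] / counts[(v, False)] if counts[(v, False)] else 0.0
        result[v] = mean_true >= mean_false
    return result
```

For example, calling `preferred_polarities([[1], [1, 2], [-2, 3]], num_vars=3)` prefers `True` for variable 1, which is a backbone variable of that small formula because the unit clause forces it. In a real solver the resulting values would be passed in as initial phase hints; how that is wired into Minisat is not described in the abstract.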
VeriX: Towards Verified Explainability of Deep Neural Networks
We present VeriX, a system for producing optimal robust explanations and
generating counterfactuals along decision boundaries of machine learning
models. We build such explanations and counterfactuals iteratively using
constraint solving techniques and a heuristic based on feature-level
sensitivity ranking. We evaluate our method on image recognition benchmarks and
a real-world scenario of autonomous aircraft taxiing.
Lemur: Integrating Large Language Models in Automated Program Verification
The demonstrated code-understanding capability of LLMs raises the question of
whether they can be used for automated program verification, a task that often
demands high-level abstract reasoning about program properties, which is
challenging for verification tools. We propose a general methodology to combine
the power of LLMs and automated reasoners for automated program verification.
We formally describe this methodology as a set of derivation rules and prove
its soundness. We instantiate the calculus as a sound automated verification
procedure, which led to practical improvements on a set of synthetic and
competition benchmarks.
Comment: Under submission
Soy: An Efficient MILP Solver for Piecewise-Affine Systems
Piecewise-affine (PWA) systems are widely used for modeling and control of
robotics problems including modeling contact dynamics. A common approach is to
encode the control problem of the PWA system as a Mixed-Integer Convex Program
(MICP), which can be solved by general-purpose off-the-shelf MICP solvers. To
mitigate the scalability challenge of solving these MICP problems, existing
work focuses on devising efficient and strong formulations of the problems,
while less effort has been spent on exploiting their specific structure to
develop specialized solvers. The latter is the theme of our work. We focus on
efficiently handling one-hot constraints, which are particularly relevant when
encoding PWA dynamics. We have implemented our techniques in a tool, Soy, which
organically integrates logical reasoning, arithmetic reasoning, and stochastic
local search. For a set of PWA control benchmarks, Soy solves more problems,
faster, than two state-of-the-art MICP solvers.
Comment: Under submission
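For readers unfamiliar with the term, a one-hot constraint over binary indicator variables is commonly written as below. The symbols $\delta_i$ and $m$ are notation assumed here for illustration and do not come from the abstract; in a PWA encoding, one would typically read $\delta_i = 1$ as "affine mode $i$ is active", with one such constraint per time step.

```latex
\sum_{i=1}^{m} \delta_i = 1, \qquad \delta_i \in \{0, 1\} \quad \text{for } i = 1, \dots, m
```

Exactly one indicator is 1 and the rest are 0, so the constraint selects a single mode out of $m$ candidates.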
Scalable Verification of GNN-based Job Schedulers
Recently, Graph Neural Networks (GNNs) have been applied for scheduling jobs
over clusters, achieving better performance than hand-crafted heuristics.
Despite their impressive performance, concerns remain over whether these
GNN-based job schedulers meet users' expectations about other important
properties, such as strategy-proofness, sharing incentive, and stability. In
this work, we consider formal verification of GNN-based job schedulers. We
address several domain-specific challenges such as networks that are deeper and
specifications that are richer than those encountered when verifying image and
NLP classifiers. We develop vegas, the first general framework for verifying
both single-step and multi-step properties of these schedulers based on
carefully designed algorithms that combine abstractions, refinements, solvers,
and proof transfer. Our experimental results show that vegas achieves
significant speed-up when verifying important properties of a state-of-the-art
GNN-based scheduler compared to previous methods.
Comment: Condensed version published at OOPSLA'2