26 research outputs found
Understanding Boolean Function Learnability on Deep Neural Networks
Computational learning theory states that many classes of boolean formulas
are learnable in polynomial time. This paper addresses the understudied subject
of how, in practice, such formulas can be learned by deep neural networks.
Specifically, we analyse boolean formulas associated with the decision version
of combinatorial optimisation problems, model sampling benchmarks, and random
3-CNFs with varying degrees of constrainedness. Our extensive experiments
indicate that: (i) regardless of the combinatorial optimisation problem,
relatively small and shallow neural networks are very good approximators of the
associated formulas; (ii) smaller formulas seem harder to learn, possibly due
to the fewer positive (satisfying) examples available; and (iii) interestingly,
underconstrained 3-CNF formulas are more challenging to learn than
overconstrained ones. Source code and relevant datasets are publicly available
(https://github.com/machine-reasoning-ufrgs/mlbf)
Fault Trees from Data: Efficient Learning with an Evolutionary Algorithm
Cyber-physical systems come with increasingly complex architectures and
failure modes, which complicates the task of obtaining accurate system
reliability models. At the same time, with the emergence of the (industrial)
Internet-of-Things, systems are more and more often being monitored via
advanced sensor systems. These sensors produce large amounts of data about the
components' failure behaviour, and can, therefore, be fruitfully exploited to
learn reliability models automatically. This paper presents an effective
algorithm for learning a prominent class of reliability models, namely fault
trees, from observational data. Our algorithm is evolutionary in nature; i.e.,
is an iterative, population-based, randomized search method among fault-tree
structures that are increasingly more consistent with the observational data.
We have evaluated our method on a large number of case studies, both on
synthetic data, and industrial data. Our experiments show that our algorithm
outperforms other methods and provides near-optimal results.Comment: This paper is an extended version of the SETTA 2019 paper,
Springer-Verla
Invariant Synthesis for Incomplete Verification Engines
We propose a framework for synthesizing inductive invariants for incomplete
verification engines, which soundly reduce logical problems in undecidable
theories to decidable theories. Our framework is based on the counter-example
guided inductive synthesis principle (CEGIS) and allows verification engines to
communicate non-provability information to guide invariant synthesis. We show
precisely how the verification engine can compute such non-provability
information and how to build effective learning algorithms when invariants are
expressed as Boolean combinations of a fixed set of predicates. Moreover, we
evaluate our framework in two verification settings, one in which verification
engines need to handle quantified formulas and one in which verification
engines have to reason about heap properties expressed in an expressive but
undecidable separation logic. Our experiments show that our invariant synthesis
framework based on non-provability information can both effectively synthesize
inductive invariants and adequately strengthen contracts across a large suite
of programs
Applying MDL to Learning Best Model Granularity
The Minimum Description Length (MDL) principle is solidly based on a provably
ideal method of inference using Kolmogorov complexity. We test how the theory
behaves in practice on a general problem in model selection: that of learning
the best model granularity. The performance of a model depends critically on
the granularity, for example the choice of precision of the parameters. Too
high precision generally involves modeling of accidental noise and too low
precision may lead to confusion of models that should be distinguished. This
precision is often determined ad hoc. In MDL the best model is the one that
most compresses a two-part code of the data set: this embodies ``Occam's
Razor.'' In two quite different experimental settings the theoretical value
determined using MDL coincides with the best value found experimentally. In the
first experiment the task is to recognize isolated handwritten characters in
one subject's handwriting, irrespective of size and orientation. Based on a new
modification of elastic matching, using multiple prototypes per character, the
optimal prediction rate is predicted for the learned parameter (length of
sampling interval) considered most likely by MDL, which is shown to coincide
with the best value found experimentally. In the second experiment the task is
to model a robot arm with two degrees of freedom using a three layer
feed-forward neural network where we need to determine the number of nodes in
the hidden layer giving best modeling performance. The optimal model (the one
that extrapolizes best on unseen examples) is predicted for the number of nodes
in the hidden layer considered most likely by MDL, which again is found to
coincide with the best value found experimentally.Comment: LaTeX, 32 pages, 5 figures. Artificial Intelligence journal, To
appea
A Complete Characterization of Statistical Query Learning with Applications to Evolvability
Statistical query (SQ) learning model of Kearns (1993) is a natural
restriction of the PAC learning model in which a learning algorithm is allowed
to obtain estimates of statistical properties of the examples but cannot see
the examples themselves. We describe a new and simple characterization of the
query complexity of learning in the SQ learning model. Unlike the previously
known bounds on SQ learning our characterization preserves the accuracy and the
efficiency of learning. The preservation of accuracy implies that that our
characterization gives the first characterization of SQ learning in the
agnostic learning framework. The preservation of efficiency is achieved using a
new boosting technique and allows us to derive a new approach to the design of
evolutionary algorithms in Valiant's (2006) model of evolvability. We use this
approach to demonstrate the existence of a large class of monotone evolutionary
learning algorithms based on square loss performance estimation. These results
differ significantly from the few known evolutionary algorithms and give
evidence that evolvability in Valiant's model is a more versatile phenomenon
than there had been previous reason to suspect.Comment: Simplified Lemma 3.8 and it's application
Learning Coverage Functions and Private Release of Marginals
We study the problem of approximating and learning coverage functions. A
function is a coverage function, if
there exists a universe with non-negative weights for each
and subsets of such that . Alternatively, coverage functions can be described
as non-negative linear combinations of monotone disjunctions. They are a
natural subclass of submodular functions and arise in a number of applications.
We give an algorithm that for any , given random and uniform
examples of an unknown coverage function , finds a function that
approximates within factor on all but -fraction of the
points in time . This is the first fully-polynomial
algorithm for learning an interesting class of functions in the demanding PMAC
model of Balcan and Harvey (2011). Our algorithms are based on several new
structural properties of coverage functions. Using the results in (Feldman and
Kothari, 2014), we also show that coverage functions are learnable agnostically
with excess -error over all product and symmetric
distributions in time . In contrast, we show that,
without assumptions on the distribution, learning coverage functions is at
least as hard as learning polynomial-size disjoint DNF formulas, a class of
functions for which the best known algorithm runs in time
(Klivans and Servedio, 2004).
As an application of our learning results, we give simple
differentially-private algorithms for releasing monotone conjunction counting
queries with low average error. In particular, for any , we obtain
private release of -way marginals with average error in time
Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations
Consider the following heuristic for building a decision tree for a function
. Place the most influential variable of
at the root, and recurse on the subfunctions and on the
left and right subtrees respectively; terminate once the tree is an
-approximation of . We analyze the quality of this heuristic,
obtaining near-matching upper and lower bounds:
Upper bound: For every with decision tree size and every
, this heuristic builds a decision tree of size
at most .
Lower bound: For every and , there is an with decision tree size such that
this heuristic builds a decision tree of size .
We also obtain upper and lower bounds for monotone functions:
and
respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004)
and Lee (2009).
Our upper bounds yield new algorithms for properly learning decision trees
under the uniform distribution. We show that these algorithms---which are
motivated by widely employed and empirically successful top-down decision tree
learning heuristics such as ID3, C4.5, and CART---achieve provable guarantees
that compare favorably with those of the current fastest algorithm (Ehrenfeucht
and Haussler, 1989). Our lower bounds shed new light on the limitations of
these heuristics.
Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend
it to give the first uniform-distribution proper learning algorithm that
achieves polynomial sample and memory complexity, while matching its
state-of-the-art quasipolynomial runtime