Learning pseudo-Boolean k-DNF and Submodular Functions
We prove that any submodular function f: {0,1}^n -> {0,1,...,k} can be
represented as a pseudo-Boolean 2k-DNF formula. Pseudo-Boolean DNFs are a
natural generalization of DNF representation for functions with integer range.
Each term in such a formula has an associated integral constant. We show that
an analog of Håstad's switching lemma holds for pseudo-Boolean k-DNFs if all
constants associated with the terms of the formula are bounded.
This allows us to generalize Mansour's PAC-learning algorithm for k-DNFs to
pseudo-Boolean k-DNFs, and hence gives a PAC-learning algorithm with membership
queries under the uniform distribution for submodular functions of the form
f:{0,1}^n -> {0,1,...,k}. Our algorithm runs in time polynomial in n, k^{O(k
\log k / \epsilon)}, 1/\epsilon and log(1/\delta) and works even in the
agnostic setting. The line of previous work on learning submodular functions
[Balcan, Harvey (STOC '11), Gupta, Hardt, Roth, Ullman (STOC '11), Cheraghchi,
Klivans, Kothari, Lee (SODA '12)] implies only n^{O(k)} query complexity for
learning submodular functions in this setting, for fixed epsilon and delta.
Our learning algorithm implies a property tester for submodularity of
functions f:{0,1}^n -> {0, ..., k} with query complexity polynomial in n for
k=O((\log n/ \log\log n)^{1/2}) and constant proximity parameter \epsilon.
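The submodularity condition that the abstract above relies on can be checked by brute force for small n. The following is a minimal sketch, not the paper's learning or testing algorithm: it uses the standard lattice definition f(x OR y) + f(x AND y) <= f(x) + f(y), and the names `is_submodular`, `capped_count`, and `product_of_two` are hypothetical examples chosen for illustration.

```python
from itertools import product

def is_submodular(f, n):
    """Brute-force check of f(x OR y) + f(x AND y) <= f(x) + f(y) over {0,1}^n."""
    points = list(product((0, 1), repeat=n))
    for x in points:
        for y in points:
            join = tuple(a | b for a, b in zip(x, y))  # coordinate-wise OR
            meet = tuple(a & b for a, b in zip(x, y))  # coordinate-wise AND
            if f(join) + f(meet) > f(x) + f(y):
                return False
    return True

def capped_count(x, k=2):
    # min(|x|, k): a concave function of the Hamming weight, with range {0,...,k}
    return min(sum(x), k)

def product_of_two(x):
    # x_0 * x_1 violates the inequality at x=(1,0), y=(0,1)
    return x[0] * x[1]

print(is_submodular(capped_count, 3))
print(is_submodular(product_of_two, 2))
```

The check enumerates all 4^n pairs, so it is only meant to make the definition concrete for toy inputs.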
Learning circuits with few negations
Monotone Boolean functions, and the monotone Boolean circuits that compute
them, have been intensively studied in complexity theory. In this paper we
study the structure of Boolean functions in terms of the minimum number of
negations in any circuit computing them, a complexity measure that interpolates
between monotone functions and the class of all functions. We study this
generalization of monotonicity from the vantage point of learning theory,
giving near-matching upper and lower bounds on the uniform-distribution
learnability of circuits in terms of the number of negations they contain. Our
upper bounds are based on a new structural characterization of negation-limited
circuits that extends a classical result of A. A. Markov. Our lower bounds,
which employ Fourier-analytic tools from hardness amplification, give new
results even for circuits with no negations (i.e., monotone functions).
The intersection of two halfspaces has high threshold degree
The threshold degree of a Boolean function f:{0,1}^n->{-1,+1} is the least
degree of a real polynomial p such that f(x)=sgn p(x). We construct two
halfspaces on {0,1}^n whose intersection has threshold degree Theta(sqrt n), an
exponential improvement on previous lower bounds. This solves an open problem
due to Klivans (2002) and rules out the use of perceptron-based techniques for
PAC learning the intersection of two halfspaces, a central unresolved challenge
in computational learning. We also prove that the intersection of two majority
functions has threshold degree Omega(log n), which is tight and settles a
conjecture of O'Donnell and Servedio (2003).
Our proof consists of two parts. First, we show that for any nonconstant
Boolean functions f and g, the intersection f(x) \wedge g(y) has threshold degree O(d)
if and only if \|f-F\|_\infty + \|g-G\|_\infty < 1 for some rational functions F,
G of degree O(d). Second, we settle the least degree required for approximating
a halfspace and a majority function to any given accuracy by rational
functions.
Our technique further allows us to make progress on Aaronson's challenge
(2008) and contribute strong direct product theorems for polynomial
representations of composed Boolean functions of the form F(f_1,...,f_n). In
particular, we give an improved lower bound on the approximate degree of the
AND-OR tree. Comment: Full version of the FOCS'09 paper.
Learning, Generalization, and Functional Entropy in Random Automata Networks
It has been shown \citep{broeck90:physicalreview,patarnello87:europhys} that
feedforward Boolean networks can learn to perform specific simple tasks and
generalize well if only a subset of the learning examples is provided for
learning. Here, we extend this body of work and show experimentally that random
Boolean networks (RBNs), where both the interconnections and the Boolean
transfer functions are chosen at random initially, can be evolved by using a
state-topology evolution to solve simple tasks. We measure the learning and
generalization performance, investigate the influence of the average node
connectivity and the system size, and introduce a new measure that better
describes the network's learning and generalization behavior. We show
that the connectivity of the maximum entropy networks scales as a power-law of
the system size. Our results show that networks with higher average
connectivity (supercritical) achieve higher memorization and partial
generalization. However, near critical connectivity, the networks show a higher
perfect generalization on the even-odd task.
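To make the notion of a random Boolean network concrete, here is a minimal sketch of a classical synchronous RBN in which each node reads a fixed number of randomly chosen nodes and applies a randomly chosen truth table. This is an assumption-laden illustration: the abstract does not specify the network model in detail or the evolutionary procedure, and the fixed in-degree `k`, the synchronous update, and the names `make_rbn` and `step` are all hypothetical simplifications.

```python
import random

def make_rbn(n, k, rng):
    """Give each of n nodes k random input nodes and a random truth table of size 2**k."""
    inputs = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from the current global state."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for src in node_inputs:
            index = (index << 1) | state[src]  # pack input bits into a table index
        new_state.append(table[index])
    return new_state

rng = random.Random(0)  # fixed seed so the run is reproducible
inputs, tables = make_rbn(n=8, k=2, rng=rng)
state = [rng.randint(0, 1) for _ in range(8)]
trajectory = [state]
for _ in range(5):
    trajectory.append(step(trajectory[-1], inputs, tables))
```

Evolving such a network toward a task would additionally require a fitness measure and a mutation scheme over `inputs` and `tables`, which this sketch deliberately leaves out.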
Approximation Algorithms for Stochastic Boolean Function Evaluation and Stochastic Submodular Set Cover
Stochastic Boolean Function Evaluation is the problem of determining the
value of a given Boolean function f on an unknown input x, when each bit x_i
of x can only be determined by paying an associated cost c_i. The assumption is
that x is drawn from a given product distribution, and the goal is to minimize
the expected cost. This problem has been studied in Operations Research, where
it is known as "sequential testing" of Boolean functions. It has also been
studied in learning theory in the context of learning with attribute costs. We
consider the general problem of developing approximation algorithms for
Stochastic Boolean Function Evaluation. We give a 3-approximation algorithm for
evaluating Boolean linear threshold formulas. We also present an approximation
algorithm for evaluating CDNF formulas (and decision trees) achieving a factor
of O(log kd), where k is the number of terms in the DNF formula, and d is the
number of clauses in the CNF formula. In addition, we present approximation
algorithms for simultaneous evaluation of linear threshold functions, and for
ranking of linear functions.
Our function evaluation algorithms are based on reductions to the Stochastic
Submodular Set Cover (SSSC) problem. This problem was introduced by Golovin and
Krause. They presented an approximation algorithm for the problem, called
Adaptive Greedy. Our main technical contribution is a new approximation
algorithm for the SSSC problem, which we call Adaptive Dual Greedy. It is an
extension of the Dual Greedy algorithm for Submodular Set Cover due to Fujito,
which is a generalization of Hochbaum's algorithm for the classical Set Cover
Problem. We also give a new bound on the approximation achieved by the Adaptive
Greedy algorithm of Golovin and Krause.
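The cost model in the abstract above can be illustrated on the simplest case, the OR function, where testing stops as soon as some bit equals 1. The sketch below is not Adaptive Dual Greedy or any algorithm from the paper; it only computes the expected cost of a fixed testing order under a product distribution and compares all orders by brute force. The helper name `expected_cost_or` and the sample costs and probabilities are assumptions made for illustration.

```python
from itertools import permutations

def expected_cost_or(order, costs, probs):
    """Expected cost of testing bits in `order` until a 1 is found (which fixes OR = 1)."""
    total = 0.0
    prob_all_zero_so_far = 1.0
    for i in order:
        # Bit i is paid for only if every earlier bit turned out to be 0.
        total += prob_all_zero_so_far * costs[i]
        prob_all_zero_so_far *= 1.0 - probs[i]
    return total

costs = [4.0, 1.0, 3.0]   # c_i: cost of reading bit i
probs = [0.8, 0.1, 0.5]   # probs[i] = Pr[x_i = 1], independent across bits

# Brute-force search over all testing orders.
best = min(permutations(range(3)), key=lambda o: expected_cost_or(o, costs, probs))
# Order by increasing cost-to-probability ratio c_i / p_i.
ratio_order = tuple(sorted(range(3), key=lambda i: costs[i] / probs[i]))
print(best, ratio_order, expected_cost_or(best, costs, probs))
```

On this instance the brute-force optimum coincides with ordering bits by c_i / p_i. Richer function classes such as linear threshold or CDNF formulas need the adaptive strategies the abstract describes.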
Approximate resilience, monotonicity, and the complexity of agnostic learning
A function f is d-resilient if all its Fourier coefficients of degree at
most d are zero, i.e., f is uncorrelated with all low-degree parities. We
study the notion of approximate resilience of Boolean
functions, where we say that f is \alpha-approximately d-resilient if
f is \alpha-close to a [-1,1]-valued d-resilient function in \ell_1
distance. We show that approximate resilience essentially characterizes the
complexity of agnostic learning of a concept class C over the uniform
distribution. Roughly speaking, if all functions in a class C are far from
being d-resilient then C can be learned agnostically in time n^{O(d)} and
conversely, if C contains a function close to being d-resilient then
agnostic learning of C in the statistical query (SQ) framework of Kearns has
complexity of at least n^{\Omega(d)}. This characterization is based on the
duality between \ell_1 approximation by degree-d polynomials and
approximate d-resilience that we establish. In particular, it implies that
\ell_1 approximation by low-degree polynomials, known to be sufficient for
agnostic learning over product distributions, is in fact necessary.
Focusing on monotone Boolean functions, we exhibit the existence of
near-optimal \alpha-approximately
\tilde{\Omega}(\alpha\sqrt{n})-resilient monotone functions for all
\alpha>0. Prior to our work, it was conceivable even that every monotone
function is \Omega(1)-far from any 1-resilient function. Furthermore, we
construct simple, explicit monotone functions based on Tribes and CycleRun
that are close to highly resilient functions. Our constructions are
based on a fairly general resilience analysis and amplification. These
structural results, together with the characterization, imply nearly optimal
lower bounds for agnostic learning of monotone juntas.
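The exact (non-approximate) version of resilience can be checked directly from the definition for small n. The following is a minimal sketch that computes Fourier coefficients by enumeration over {-1,+1}^n and tests whether all coefficients of degree at most d vanish. It does not address approximate resilience or the paper's constructions, and the names `fourier_coefficient`, `is_resilient`, `parity`, and `majority3` are hypothetical.

```python
from itertools import combinations, product

def fourier_coefficient(f, n, subset):
    """Compute E_x[f(x) * chi_S(x)] for x uniform over {-1,+1}^n, with S = subset."""
    total = 0
    for x in product((-1, 1), repeat=n):
        chi = 1
        for i in subset:
            chi *= x[i]  # chi_S(x) is the product of the coordinates in S
        total += f(x) * chi
    return total / 2 ** n

def is_resilient(f, n, d):
    """True if every Fourier coefficient of degree at most d is zero."""
    return all(
        fourier_coefficient(f, n, s) == 0
        for size in range(d + 1)
        for s in combinations(range(n), size)
    )

def parity(x):
    result = 1
    for xi in x:
        result *= xi
    return result

def majority3(x):
    return 1 if sum(x) > 0 else -1

print(is_resilient(parity, 3, 2))     # parity has only its top-degree coefficient
print(is_resilient(majority3, 3, 1))  # majority correlates with single bits
```

Parity on three bits is 2-resilient, while majority on three bits is balanced (0-resilient) but not 1-resilient, which matches the intuition that monotone functions correlate with low-degree parities.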
MCMC Learning
The theory of learning under the uniform distribution is rich and deep, with
connections to cryptography, computational complexity, and the analysis of
Boolean functions, to name a few areas. This theory, however, is very limited
because the uniform distribution and the corresponding Fourier basis
are rarely encountered as a statistical model.
A family of distributions that vastly generalizes the uniform distribution on
the Boolean cube is that of distributions represented by Markov Random Fields
(MRF). Markov Random Fields are one of the main tools for modeling
high-dimensional data in many areas of statistics and machine learning.
In this paper we initiate the investigation of extending central ideas,
methods and algorithms from the theory of learning under the uniform
distribution to the setup of learning concepts given examples from MRF
distributions. In particular, our results establish a novel connection between
properties of MCMC sampling of MRFs and learning under the MRF distribution. Comment: 28 pages, 1 figure.