299 research outputs found
Learning pseudo-Boolean k-DNF and Submodular Functions
We prove that any submodular function f: {0,1}^n -> {0,1,...,k} can be
represented as a pseudo-Boolean 2k-DNF formula. Pseudo-Boolean DNFs are a
natural generalization of DNF representation for functions with integer range.
Each term in such a formula has an associated integral constant. We show that
an analog of Hastad's switching lemma holds for pseudo-Boolean k-DNFs if all
constants associated with the terms of the formula are bounded.
This allows us to generalize Mansour's PAC-learning algorithm for k-DNFs to
pseudo-Boolean k-DNFs, and hence gives a PAC-learning algorithm with membership
queries under the uniform distribution for submodular functions of the form
f:{0,1}^n -> {0,1,...,k}. Our algorithm runs in time polynomial in n, k^{O(k
\log k / \epsilon)}, 1/\epsilon and log(1/\delta) and works even in the
agnostic setting. The line of previous work on learning submodular functions
[Balcan, Harvey (STOC '11), Gupta, Hardt, Roth, Ullman (STOC '11), Cheraghchi,
Klivans, Kothari, Lee (SODA '12)] implies only n^{O(k)} query complexity for
learning submodular functions in this setting, for fixed epsilon and delta.
Our learning algorithm implies a property tester for submodularity of
functions f:{0,1}^n -> {0, ..., k} with query complexity polynomial in n for
k=O((\log n/ \loglog n)^{1/2}) and constant proximity parameter \epsilon
DNF Sparsification and a Faster Deterministic Counting Algorithm
Given a DNF formula on n variables, the two natural size measures are the
number of terms or size s(f), and the maximum width of a term w(f). It is
folklore that short DNF formulas can be made narrow. We prove a converse,
showing that narrow formulas can be sparsified. More precisely, any width w DNF
irrespective of its size can be -approximated by a width DNF with
at most terms.
We combine our sparsification result with the work of Luby and Velikovic to
give a faster deterministic algorithm for approximately counting the number of
satisfying solutions to a DNF. Given a formula on n variables with poly(n)
terms, we give a deterministic time algorithm
that computes an additive approximation to the fraction of
satisfying assignments of f for \epsilon = 1/\poly(\log n). The previous best
result due to Luby and Velickovic from nearly two decades ago had a run-time of
.Comment: To appear in the IEEE Conference on Computational Complexity, 201
Learning Coverage Functions and Private Release of Marginals
We study the problem of approximating and learning coverage functions. A
function is a coverage function, if
there exists a universe with non-negative weights for each
and subsets of such that . Alternatively, coverage functions can be described
as non-negative linear combinations of monotone disjunctions. They are a
natural subclass of submodular functions and arise in a number of applications.
We give an algorithm that for any , given random and uniform
examples of an unknown coverage function , finds a function that
approximates within factor on all but -fraction of the
points in time . This is the first fully-polynomial
algorithm for learning an interesting class of functions in the demanding PMAC
model of Balcan and Harvey (2011). Our algorithms are based on several new
structural properties of coverage functions. Using the results in (Feldman and
Kothari, 2014), we also show that coverage functions are learnable agnostically
with excess -error over all product and symmetric
distributions in time . In contrast, we show that,
without assumptions on the distribution, learning coverage functions is at
least as hard as learning polynomial-size disjoint DNF formulas, a class of
functions for which the best known algorithm runs in time
(Klivans and Servedio, 2004).
As an application of our learning results, we give simple
differentially-private algorithms for releasing monotone conjunction counting
queries with low average error. In particular, for any , we obtain
private release of -way marginals with average error in time
Learning DNF Expressions from Fourier Spectrum
Since its introduction by Valiant in 1984, PAC learning of DNF expressions
remains one of the central problems in learning theory. We consider this
problem in the setting where the underlying distribution is uniform, or more
generally, a product distribution. Kalai, Samorodnitsky and Teng (2009) showed
that in this setting a DNF expression can be efficiently approximated from its
"heavy" low-degree Fourier coefficients alone. This is in contrast to previous
approaches where boosting was used and thus Fourier coefficients of the target
function modified by various distributions were needed. This property is
crucial for learning of DNF expressions over smoothed product distributions, a
learning model introduced by Kalai et al. (2009) and inspired by the seminal
smoothed analysis model of Spielman and Teng (2001).
We introduce a new approach to learning (or approximating) a polynomial
threshold functions which is based on creating a function with range [-1,1]
that approximately agrees with the unknown function on low-degree Fourier
coefficients. We then describe conditions under which this is sufficient for
learning polynomial threshold functions. Our approach yields a new, simple
algorithm for approximating any polynomial-size DNF expression from its "heavy"
low-degree Fourier coefficients alone. Our algorithm greatly simplifies the
proof of learnability of DNF expressions over smoothed product distributions.
We also describe an application of our algorithm to learning monotone DNF
expressions over product distributions. Building on the work of Servedio
(2001), we give an algorithm that runs in time \poly((s \cdot
\log{(s/\eps)})^{\log{(s/\eps)}}, n), where is the size of the target DNF
expression and \eps is the accuracy. This improves on \poly((s \cdot
\log{(ns/\eps)})^{\log{(s/\eps)} \cdot \log{(1/\eps)}}, n) bound of Servedio
(2001).Comment: Appears in Conference on Learning Theory (COLT) 201
Achieving New Upper Bounds for the Hypergraph Duality Problem through Logic
The hypergraph duality problem DUAL is defined as follows: given two simple
hypergraphs and , decide whether
consists precisely of all minimal transversals of (in which case
we say that is the dual of ). This problem is
equivalent to deciding whether two given non-redundant monotone DNFs are dual.
It is known that non-DUAL, the complementary problem to DUAL, is in
, where
denotes the complexity class of all problems that after a nondeterministic
guess of bits can be decided (checked) within complexity class
. It was conjectured that non-DUAL is in . In this paper we prove this conjecture and actually
place the non-DUAL problem into the complexity class which is a subclass of . We here refer to the logtime-uniform version of
, which corresponds to , i.e., first order
logic augmented by counting quantifiers. We achieve the latter bound in two
steps. First, based on existing problem decomposition methods, we develop a new
nondeterministic algorithm for non-DUAL that requires to guess
bits. We then proceed by a logical analysis of this algorithm, allowing us to
formulate its deterministic part in . From this result, by
the well known inclusion , it follows
that DUAL belongs also to . Finally, by exploiting
the principles on which the proposed nondeterministic algorithm is based, we
devise a deterministic algorithm that, given two hypergraphs and
, computes in quadratic logspace a transversal of
missing in .Comment: Restructured the presentation in order to be the extended version of
a paper that will shortly appear in SIAM Journal on Computin
- …