10,919 research outputs found
Learning Coverage Functions and Private Release of Marginals
We study the problem of approximating and learning coverage functions. A
function is a coverage function, if
there exists a universe with non-negative weights for each
and subsets of such that . Alternatively, coverage functions can be described
as non-negative linear combinations of monotone disjunctions. They are a
natural subclass of submodular functions and arise in a number of applications.
We give an algorithm that for any , given random and uniform
examples of an unknown coverage function , finds a function that
approximates within factor on all but -fraction of the
points in time . This is the first fully-polynomial
algorithm for learning an interesting class of functions in the demanding PMAC
model of Balcan and Harvey (2011). Our algorithms are based on several new
structural properties of coverage functions. Using the results in (Feldman and
Kothari, 2014), we also show that coverage functions are learnable agnostically
with excess -error over all product and symmetric
distributions in time . In contrast, we show that,
without assumptions on the distribution, learning coverage functions is at
least as hard as learning polynomial-size disjoint DNF formulas, a class of
functions for which the best known algorithm runs in time
(Klivans and Servedio, 2004).
As an application of our learning results, we give simple
differentially-private algorithms for releasing monotone conjunction counting
queries with low average error. In particular, for any , we obtain
private release of -way marginals with average error in time
Resolution over Linear Equations and Multilinear Proofs
We develop and study the complexity of propositional proof systems of varying
strength extending resolution by allowing it to operate with disjunctions of
linear equations instead of clauses. We demonstrate polynomial-size refutations
for hard tautologies like the pigeonhole principle, Tseitin graph tautologies
and the clique-coloring tautologies in these proof systems. Using the
(monotone) interpolation by a communication game technique we establish an
exponential-size lower bound on refutations in a certain, considerably strong,
fragment of resolution over linear equations, as well as a general polynomial
upper bound on (non-monotone) interpolants in this fragment.
We then apply these results to extend and improve previous results on
multilinear proofs (over fields of characteristic 0), as studied in
[RazTzameret06]. Specifically, we show the following:
1. Proofs operating with depth-3 multilinear formulas polynomially simulate a
certain, considerably strong, fragment of resolution over linear equations.
2. Proofs operating with depth-3 multilinear formulas admit polynomial-size
refutations of the pigeonhole principle and Tseitin graph tautologies. The
former improve over a previous result that established small multilinear proofs
only for the \emph{functional} pigeonhole principle. The latter are different
than previous proofs, and apply to multilinear proofs of Tseitin mod p graph
tautologies over any field of characteristic 0.
We conclude by connecting resolution over linear equations with extensions of
the cutting planes proof system.Comment: 44 page
The Bayesian sampler : generic Bayesian inference causes incoherence in human probability
Human probability judgments are systematically biased, in apparent tension with Bayesian models of cognition. But perhaps the brain does not represent probabilities explicitly, but approximates probabilistic calculations through a process of sampling, as used in computational probabilistic models in statistics. Naïve probability estimates can be obtained by calculating the relative frequency of an event within a sample, but these estimates tend to be extreme when the sample size is small. We propose instead that people use a generic prior to improve the accuracy of their probability estimates based on samples, and we call this model the Bayesian sampler. The Bayesian sampler trades off the coherence of probabilistic judgments for improved accuracy, and provides a single framework for explaining phenomena associated with diverse biases and heuristics such as conservatism and the conjunction fallacy. The approach turns out to provide a rational reinterpretation of “noise” in an important recent model of probability judgment, the probability theory plus noise model (Costello & Watts, 2014, 2016a, 2017; Costello & Watts, 2019; Costello, Watts, & Fisher, 2018), making equivalent average predictions for simple events, conjunctions, and disjunctions. The Bayesian sampler does, however, make distinct predictions for conditional probabilities and distributions of probability estimates. We show in 2 new experiments that this model better captures these mean judgments both qualitatively and quantitatively; which model best fits individual distributions of responses depends on the assumed size of the cognitive sample
Agnostic Learning of Disjunctions on Symmetric Distributions
We consider the problem of approximating and learning disjunctions (or
equivalently, conjunctions) on symmetric distributions over .
Symmetric distributions are distributions whose PDF is invariant under any
permutation of the variables. We give a simple proof that for every symmetric
distribution , there exists a set of
functions , such that for every disjunction , there is function
, expressible as a linear combination of functions in , such
that -approximates in distance on or
. This directly
gives an agnostic learning algorithm for disjunctions on symmetric
distributions that runs in time . The best known
previous bound is and follows from approximation of the
more general class of halfspaces (Wimmer, 2010). We also show that there exists
a symmetric distribution , such that the minimum degree of a
polynomial that -approximates the disjunction of all variables is
distance on is . Therefore the
learning result above cannot be achieved via -regression with a
polynomial basis used in most other agnostic learning algorithms.
Our technique also gives a simple proof that for any product distribution
and every disjunction , there exists a polynomial of
degree such that -approximates in
distance on . This was first proved by Blais et al.
(2008) via a more involved argument
A novel Boolean kernels family for categorical data
Kernel based classifiers, such as SVM, are considered state-of-the-art algorithms and are widely used on many classification tasks. However, this kind of methods are hardly interpretable and for this reason they are often considered as black-box models. In this paper, we propose a new family of Boolean kernels for categorical data where features correspond to propositional formulas applied to the input variables. The idea is to create human-readable features to ease the extraction of interpretation rules directly from the embedding space. Experiments on artificial and benchmark datasets show the effectiveness of the proposed family of kernels with respect to established ones, such as RBF, in terms of classification accuracy
Privately Releasing Conjunctions and the Statistical Query Barrier
Suppose we would like to know all answers to a set of statistical queries C
on a data set up to small error, but we can only access the data itself using
statistical queries. A trivial solution is to exhaustively ask all queries in
C. Can we do any better?
+ We show that the number of statistical queries necessary and sufficient for
this task is---up to polynomial factors---equal to the agnostic learning
complexity of C in Kearns' statistical query (SQ) model. This gives a complete
answer to the question when running time is not a concern.
+ We then show that the problem can be solved efficiently (allowing arbitrary
error on a small fraction of queries) whenever the answers to C can be
described by a submodular function. This includes many natural concept classes,
such as graph cuts and Boolean disjunctions and conjunctions.
While interesting from a learning theoretic point of view, our main
applications are in privacy-preserving data analysis:
Here, our second result leads to the first algorithm that efficiently
releases differentially private answers to of all Boolean conjunctions with 1%
average error. This presents significant progress on a key open problem in
privacy-preserving data analysis.
Our first result on the other hand gives unconditional lower bounds on any
differentially private algorithm that admits a (potentially
non-privacy-preserving) implementation using only statistical queries. Not only
our algorithms, but also most known private algorithms can be implemented using
only statistical queries, and hence are constrained by these lower bounds. Our
result therefore isolates the complexity of agnostic learning in the SQ-model
as a new barrier in the design of differentially private algorithms
Approximate F_2-Sketching of Valuation Functions
We study the problem of constructing a linear sketch of minimum dimension that allows approximation of a given real-valued function f : F_2^n - > R with small expected squared error. We develop a general theory of linear sketching for such functions through which we analyze their dimension for most commonly studied types of valuation functions: additive, budget-additive, coverage, alpha-Lipschitz submodular and matroid rank functions. This gives a characterization of how many bits of information have to be stored about the input x so that one can compute f under additive updates to its coordinates.
Our results are tight in most cases and we also give extensions to the distributional version of the problem where the input x in F_2^n is generated uniformly at random. Using known connections with dynamic streaming algorithms, both upper and lower bounds on dimension obtained in our work extend to the space complexity of algorithms evaluating f(x) under long sequences of additive updates to the input x presented as a stream. Similar results hold for simultaneous communication in a distributed setting
- …