On the Distribution of the Fourier Spectrum of Halfspaces
Bourgain showed that any noise stable Boolean function f can be
well-approximated by a junta. In this note we give an exponential sharpening of
the parameters of Bourgain's result under the additional assumption that f is
a halfspace.
MCMC Learning
The theory of learning under the uniform distribution is rich and deep, with
connections to cryptography, computational complexity, and the analysis of
Boolean functions, to name a few areas. This theory, however, is very limited
because the uniform distribution and the corresponding Fourier basis
are rarely encountered as a statistical model.
A family of distributions that vastly generalizes the uniform distribution on
the Boolean cube is that of distributions represented by Markov Random Fields
(MRF). Markov Random Fields are one of the main tools for modeling high
dimensional data in many areas of statistics and machine learning.
In this paper we initiate the investigation of extending central ideas,
methods and algorithms from the theory of learning under the uniform
distribution to the setup of learning concepts given examples from MRF
distributions. In particular, our results establish a novel connection between
properties of MCMC sampling of MRFs and learning under the MRF distribution.
Comment: 28 pages, 1 figure
The Fuglede conjecture for convex domains is true in all dimensions
A set Ω ⊂ R^d is said to be spectral if the space L^2(Ω)
has an orthogonal basis of exponential functions. A conjecture
due to Fuglede (1974) stated that Ω is a spectral set if and only if it
can tile the space by translations. While this conjecture was disproved for
general sets, it has long been known that for a convex body Ω ⊂ R^d the
"tiling implies spectral" part of the conjecture is in fact true.
In contrast, the "spectral implies tiling" direction of the conjecture
for convex bodies was proved only in R^2, and also in R^3
under the a priori assumption that Ω is a convex polytope. In higher
dimensions, this direction of the conjecture remained completely open (even in
the case when Ω is a polytope) and could not be treated using the
previously developed techniques.
In this paper we fully settle Fuglede's conjecture for convex bodies
affirmatively in all dimensions, i.e. we prove that if a convex body Ω ⊂ R^d
is a spectral set then Ω is a convex polytope
which can tile the space by translations. To prove this we introduce a new
technique, involving a construction from crystallographic diffraction theory,
which allows us to establish a geometric "weak tiling" condition necessary for
a set to be spectral.
Comment: To appear in Acta Mathematica
Approximate resilience, monotonicity, and the complexity of agnostic learning
A function f is d-resilient if all its Fourier coefficients of degree at
most d are zero, i.e., f is uncorrelated with all low-degree parities. We
study the notion of approximate resilience of Boolean
functions, where we say that f is alpha-approximately d-resilient if
f is alpha-close to a [-1,1]-valued d-resilient function in L_1
distance. We show that approximate resilience essentially characterizes the
complexity of agnostic learning of a concept class C over the uniform
distribution. Roughly speaking, if all functions in a class C are far from
being d-resilient then C can be learned agnostically in time n^{O(d)} and
conversely, if C contains a function close to being d-resilient then
agnostic learning of C in the statistical query (SQ) framework of Kearns has
complexity of at least n^{Omega(d)}. This characterization is based on the
duality between L_1 approximation by degree-d polynomials and
approximate d-resilience that we establish. In particular, it implies that
L_1 approximation by low-degree polynomials, known to be sufficient for
agnostic learning over product distributions, is in fact necessary.
Focusing on monotone Boolean functions, we exhibit the existence of
near-optimal alpha-approximately
\tilde{Omega}(alpha \sqrt{n})-resilient monotone functions for all
alpha > 0. Prior to our work, it was conceivable even that every monotone
function is Omega(1)-far from any 1-resilient function. Furthermore, we
construct simple, explicit monotone functions based on Tribes and CycleRun
that are close to highly resilient functions. Our constructions are
based on a fairly general resilience analysis and amplification. These
structural results, together with the characterization, imply nearly optimal
lower bounds for agnostic learning of monotone juntas.
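The definition of d-resilience in the abstract above can be checked by brute force on small examples. The following is a minimal Python sketch, not taken from the paper: the helper names `fourier_coefficient` and `is_resilient` are hypothetical, inputs are assumed to live in {-1, 1}^n, and the exhaustive enumeration is only feasible for very small n.

```python
from itertools import combinations, product


def fourier_coefficient(f, n, subset):
    """Return E_x[f(x) * chi_S(x)] for uniform x in {-1, 1}^n (exhaustive)."""
    total = 0
    for x in product((-1, 1), repeat=n):
        chi = 1
        for i in subset:
            chi *= x[i]
        total += f(x) * chi
    return total / 2 ** n


def is_resilient(f, n, d, tol=1e-12):
    """True if every Fourier coefficient of degree at most d vanishes."""
    for k in range(d + 1):
        for subset in combinations(range(n), k):
            if abs(fourier_coefficient(f, n, subset)) > tol:
                return False
    return True


def parity(x):
    """Product of all coordinates: its only nonzero coefficient has degree n."""
    value = 1
    for v in x:
        value *= v
    return value


def majority(x):
    """Majority vote on an odd number of +/-1 coordinates (a monotone function)."""
    return 1 if sum(x) > 0 else -1
```

On three variables, parity passes the check for d = 2, while majority is balanced (0-resilient) but already fails for d = 1 because each of its degree-1 coefficients equals 1/2. This illustrates why resilient monotone functions are not immediate to come by.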
Embedding Hard Learning Problems Into Gaussian Space
We give the first representation-independent hardness result for agnostically learning halfspaces with respect to the Gaussian distribution. We reduce from the problem of learning sparse parities with noise with respect to the uniform distribution on the hypercube (sparse LPN), a notoriously hard problem in theoretical computer science, and show that any algorithm for agnostically learning halfspaces requires n^Omega(log(1/epsilon)) time under the assumption that k-sparse LPN requires n^Omega(k) time, ruling out a polynomial time algorithm for the problem. As far as we are aware, this is the first representation-independent hardness result for supervised learning when the underlying distribution is restricted to be a Gaussian.
We also show that the problem of agnostically learning sparse polynomials with respect to the Gaussian distribution in polynomial time is as hard as PAC learning DNFs on the uniform distribution in polynomial time. This complements the surprising result of Andoni et al. (2013), who show that sparse polynomials are learnable under random Gaussian noise in polynomial time.
Taken together, these results show the inherent difficulty of designing supervised learning algorithms in Euclidean space even in the presence of strong distributional assumptions. Our results use a novel embedding of random labeled examples from the uniform distribution on the Boolean hypercube into random labeled examples from the Gaussian distribution that allows us to relate the hardness of learning problems on two different domains and distributions.
Agnostically Learning Halfspaces
We consider the problem of learning a halfspace in the agnostic framework of Kearns et al., where a learner is given access to a distribution on labelled examples but the labelling may be arbitrary. The learner's goal is to output a hypothesis which performs almost as well as the optimal halfspace with respect to future draws from this distribution. Although the agnostic learning framework does not explicitly deal with noise, it is closely related to learning in worst-case noise models such as malicious noise. We give the first polynomial-time algorithm for agnostically learning halfspaces with respect to several distributions, such as the uniform distribution over the n-dimensional Boolean cube {0,1}^n or the unit sphere in n-dimensional Euclidean space, as well as any log-concave distribution in n-dimensional Euclidean space. Given any constant additive factor epsilon > 0, our algorithm runs in poly(n) time and constructs a hypothesis whose error rate is within an additive epsilon of the optimal halfspace. We also show this algorithm agnostically learns Boolean disjunctions in time roughly 2^{\sqrt{n}} with respect to any distribution; this is the first subexponential-time algorithm for this problem. Finally, we obtain a new algorithm for PAC learning halfspaces under the uniform distribution on the unit sphere which can tolerate the highest level of malicious noise of any algorithm to date. Our main tool is a polynomial regression algorithm which finds a polynomial that best fits a set of points with respect to a particular metric. We show that, in fact, this algorithm is an arbitrary-distribution generalization of the well-known "low-degree" Fourier algorithm of Linial, Mansour, and Nisan and has excellent noise tolerance properties when minimizing with respect to the L_1 norm.
We apply this algorithm in conjunction with a non-standard Fourier transform (which does not use the traditional parity basis) for learning halfspaces over the uniform distribution on the unit sphere; we believe this technique is of independent interest.
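For orientation, the "low-degree" Fourier algorithm that the abstract above generalizes can be sketched in a few lines of Python. This is an illustrative sketch of the classical uniform-distribution algorithm only, not the paper's L_1 polynomial regression: the function names are hypothetical, examples are assumed to be pairs (x, y) with x in {-1, 1}^n and y in {-1, 1}, and the full cube is used as the sample so that the result is deterministic (in practice the examples would be random draws).

```python
from itertools import combinations, product


def chi(subset, x):
    """Parity of the coordinates of x indexed by subset."""
    value = 1
    for i in subset:
        value *= x[i]
    return value


def low_degree_fit(samples, n, degree):
    """Estimate every Fourier coefficient of degree <= `degree` by an average."""
    coeffs = {}
    for k in range(degree + 1):
        for subset in combinations(range(n), k):
            total = sum(y * chi(subset, x) for x, y in samples)
            coeffs[subset] = total / len(samples)
    return coeffs


def predict(coeffs, x):
    """Hypothesis: the sign of the estimated low-degree polynomial."""
    value = sum(c * chi(subset, x) for subset, c in coeffs.items())
    return 1 if value >= 0 else -1


def majority(x):
    """A simple halfspace: sign of the coordinate sum (n odd)."""
    return 1 if sum(x) > 0 else -1


# Fit a degree-1 polynomial to the majority halfspace on 5 variables,
# using all 32 points of the cube as the labelled sample.
cube = list(product((-1, 1), repeat=5))
exact = low_degree_fit([(x, majority(x)) for x in cube], 5, 1)
errors = sum(predict(exact, x) != majority(x) for x in cube)
```

Here the degree-1 fit recovers majority exactly, since the estimated polynomial is a positive multiple of the coordinate sum. With arbitrary (agnostic) labels such a coefficient-averaging fit carries no comparable guarantee, which is the gap the L_1 regression described in the abstract is meant to address.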
Quantum algorithms for highly non-linear Boolean functions
Attempts to separate the power of classical and quantum models of computation
have a long history. The ultimate goal is to find exponential separations for
computational problems. However, such separations do not come a dime a dozen:
while there were some early successes in the form of hidden subgroup problems
for abelian groups--which generalize Shor's factoring algorithm perhaps most
faithfully--efficient quantum algorithms have been found for only a handful
of non-abelian groups. Recently, problems that seek to identify hidden
sub-structures of combinatorial and algebraic objects other than groups have
received increased attention. In this paper we provide new examples of exponential
separations by considering hidden shift problems that are defined for several
classes of highly non-linear Boolean functions. These so-called bent functions
arise in cryptography, where their property of having perfectly flat Fourier
spectra on the Boolean hypercube gives them resilience against certain types of
attack. We present new quantum algorithms that solve the hidden shift problems
for several well-known classes of bent functions in polynomial time and with a
constant number of queries, while the classical query complexity is shown to be
exponential. Our approach uses a technique that exploits the duality between
bent functions and their Fourier transforms.
Comment: 15 pages, 1 figure, to appear in Proceedings of the 21st Annual
ACM-SIAM Symposium on Discrete Algorithms (SODA'10). This updated version of
the paper contains a new exponential separation between classical and quantum
query complexity.
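The "perfectly flat Fourier spectra" property of bent functions mentioned in the abstract above can be verified directly on a small example. The sketch below is illustrative and not from the paper: it assumes the standard inner-product function on 4 bits as the bent function, and the helper names are hypothetical.

```python
from itertools import product


def inner_product_bent(z):
    """(-1)^(x . y), where z = (x, y) is split into two equal halves of bits."""
    half = len(z) // 2
    dot = sum(z[i] * z[half + i] for i in range(half))
    return -1 if dot % 2 else 1


def walsh_spectrum(f, n):
    """Return all coefficients 2^-n * sum_z f(z) * (-1)^(w . z), indexed by w."""
    points = list(product((0, 1), repeat=n))
    spectrum = {}
    for w in points:
        total = 0
        for z in points:
            sign = -1 if sum(a * b for a, b in zip(w, z)) % 2 else 1
            total += f(z) * sign
        spectrum[w] = total / 2 ** n
    return spectrum


spectrum = walsh_spectrum(inner_product_bent, 4)
magnitudes = {abs(c) for c in spectrum.values()}
```

For n = 4 every one of the 16 coefficients has magnitude 2^(-n/2) = 1/4, so `magnitudes` contains a single value. The signs of the coefficients again form a +/-1-valued function, which is the kind of duality between a bent function and its Fourier transform that the abstract refers to.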