Nearly optimal solutions for the Chow Parameters Problem and low-weight approximation of halfspaces
The \emph{Chow parameters} of a Boolean function $f: \{-1,1\}^n \to \{-1,1\}$
are its $n+1$ degree-0 and degree-1 Fourier coefficients. It has been known
since 1961 (Chow, Tannenbaum) that the (exact values of the) Chow parameters of
any linear threshold function $f$ uniquely specify $f$ within the space of all
Boolean functions, but until recently (O'Donnell and Servedio) nothing was
known about efficient algorithms for \emph{reconstructing} $f$ (exactly or
approximately) from exact or approximate values of its Chow parameters. We
refer to this reconstruction problem as the \emph{Chow Parameters Problem.}
Our main result is a new algorithm for the Chow Parameters Problem which,
given (sufficiently accurate approximations to) the Chow parameters of any
linear threshold function $f$, runs in time $\tilde{O}(n^2)\cdot
(1/\eps)^{O(\log^2(1/\eps))}$ and with high probability outputs a
representation of an LTF $f'$ that is $\eps$-close to $f$. The only previous
algorithm (O'Donnell and Servedio) had running time $\poly(n) \cdot
2^{2^{\tilde{O}(1/\eps^2)}}$.
As a byproduct of our approach, we show that for any linear threshold
function $f$ over $\{-1,1\}^n$, there is a linear threshold function $f'$ which
is $\eps$-close to $f$ and has all weights that are integers at most $\sqrt{n}
\cdot (1/\eps)^{O(\log^2(1/\eps))}$. This significantly improves the best
previous result of Diakonikolas and Servedio which gave a $\poly(n) \cdot
2^{\tilde{O}(1/\eps^{2/3})}$ weight bound, and is close to the known lower
bound of $\max\{\sqrt{n}, (1/\eps)^{\Omega(\log \log (1/\eps))}\}$ (Goldberg,
Servedio). Our techniques also yield improved algorithms for related problems
in learning theory.
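As a small illustration of the objects in this abstract, the sketch below computes the Chow parameters of a linear threshold function by brute-force enumeration over $\{-1,1\}^n$. It is not the reconstruction algorithm of the paper; the helper names `ltf` and `chow_parameters` are hypothetical, and taking sign(0) = +1 is an assumption made only for this example.

```python
from itertools import product

def ltf(weights, theta):
    """Return f(x) = sign(w.x - theta) with values in {-1, +1}.
    Assumption for this sketch: sign(0) is taken to be +1."""
    def f(x):
        s = sum(w * xi for w, xi in zip(weights, x)) - theta
        return 1 if s >= 0 else -1
    return f

def chow_parameters(f, n):
    """Exact degree-0 and degree-1 Fourier coefficients of f under the
    uniform distribution on {-1, 1}^n (exponential time, for tiny n only)."""
    points = list(product((-1, 1), repeat=n))
    total = len(points)
    deg0 = sum(f(x) for x in points) / total                      # E[f(x)]
    deg1 = [sum(f(x) * x[i] for x in points) / total for i in range(n)]  # E[f(x) x_i]
    return [deg0] + deg1

# Majority on three bits: a balanced LTF with equal degree-1 coefficients.
majority = ltf([1, 1, 1], 0)
chow = chow_parameters(majority, 3)
print(chow)
```

Enumeration costs $2^n$ evaluations, so this only serves to make the definition concrete for very small $n$.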
Learning Geometric Concepts with Nasty Noise
We study the efficient learnability of geometric concept classes -
specifically, low-degree polynomial threshold functions (PTFs) and
intersections of halfspaces - when a fraction of the data is adversarially
corrupted. We give the first polynomial-time PAC learning algorithms for these
concept classes with dimension-independent error guarantees in the presence of
nasty noise under the Gaussian distribution. In the nasty noise model, an
omniscient adversary can arbitrarily corrupt a small fraction of both the
unlabeled data points and their labels. This model generalizes well-studied
noise models, including the malicious noise model and the agnostic (adversarial
label noise) model. Prior to our work, the only concept class for which
efficient malicious learning algorithms were known was the class of
origin-centered halfspaces.
Specifically, our robust learning algorithm for low-degree PTFs succeeds
under a number of tame distributions -- including the Gaussian distribution
and, more generally, any log-concave distribution with (approximately) known
low-degree moments. For LTFs under the Gaussian distribution, we give a
polynomial-time algorithm that achieves error $O(\epsilon)$, where $\epsilon$
is the noise rate. At the core of our PAC learning results is an efficient
algorithm to approximate the low-degree Chow-parameters of any bounded function
in the presence of nasty noise. To achieve this, we employ an iterative
spectral method for outlier detection and removal, inspired by recent work in
robust unsupervised learning. Our aforementioned algorithm succeeds for a range
of distributions satisfying mild concentration bounds and moment assumptions.
The correctness of our robust learning algorithm for intersections of
halfspaces makes essential use of a novel robust inverse independence lemma
that may be of broader interest.
Nearly Tight Bounds for Robust Proper Learning of Halfspaces with a Margin
We study the problem of {\em properly} learning large margin halfspaces in
the agnostic PAC model. In more detail, we study the complexity of properly
learning $d$-dimensional halfspaces on the unit ball within misclassification
error $\alpha \cdot \mathrm{OPT}_{\gamma} + \epsilon$, where
$\mathrm{OPT}_{\gamma}$ is the optimal $\gamma$-margin error rate and
$\alpha \geq 1$ is the approximation ratio. We give learning algorithms and
computational hardness results for this problem, for all values of the
approximation ratio $\alpha \geq 1$, that are nearly-matching for a range of
parameters. Specifically, for the natural setting that $\alpha$ is any constant
bigger than one, we provide an essentially tight complexity characterization.
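On the positive side, we give an $\alpha = 1.01$-approximate proper learner
that uses $O(1/(\epsilon^2\gamma^2))$ samples (which is optimal) and runs in
time $\mathrm{poly}(d/\epsilon) \cdot 2^{\tilde{O}(1/\gamma^2)}$. On the
negative side, we show that {\em any} constant factor approximate proper
learner has runtime $\mathrm{poly}(d/\epsilon) \cdot 2^{(1/\gamma)^{2-o(1)}}$,
assuming the Exponential Time Hypothesis.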
Public projects, Boolean functions and the borders of Border's theorem
Border's theorem gives an intuitive linear characterization of the feasible
interim allocation rules of a Bayesian single-item environment, and it has
several applications in economic and algorithmic mechanism design. All known
generalizations of Border's theorem either restrict attention to relatively
simple settings, or resort to approximation. This paper identifies a
complexity-theoretic barrier that indicates, assuming standard complexity class
separations, that Border's theorem cannot be extended significantly beyond the
state-of-the-art. We also identify a surprisingly tight connection between
Myerson's optimal auction theory, when applied to public project settings, and
some fundamental results in the analysis of Boolean functions. Comment: Accepted to ACM EC 201
Ready for the design of voting rules?
The design of fair voting rules has been addressed quite often in the
literature. Still, the so-called inverse problem is not entirely resolved. We
summarize some achievements in this direction and formulate explicit open
questions and conjectures. Comment: 10 pages
Halfway to Halfspace Testing
In this thesis I study the problem of testing halfspaces under arbitrary probability distributions, using only random samples. A halfspace, or linear threshold function, is a boolean function f : Rⁿ → {±1} defined as the sign of a linear function; that is,
f(x) = sign(Σᵢ wᵢxᵢ - θ)
where we refer to w ∈ Rⁿ as a weight vector and θ ∈ R as a threshold. These functions have been studied intensively since the middle of the 20th century; they appear in many places, including social choice theory (the theory of voting rules), circuit complexity theory, machine learning theory, hardness of approximation, and the analysis of boolean functions.
The problem of testing halfspaces, in the sense of property testing, is to design an algorithm that, with high probability, decides whether an unknown function f is a halfspace or far from a halfspace, using as few examples of labelled points (x, f(x)) as possible. In this work I focus on the problem of testing halfspaces using only random examples drawn from an arbitrary distribution; the algorithm cannot choose the points it receives. This is in contrast with previous work on the problem, where the algorithm can query points of its choice, and the distribution was assumed to be uniform over the boolean hypercube.
Towards a solution to this problem I present an algorithm that works for rotationally invariant probability distributions (under reasonable conditions), using roughly O(√n) random examples, which is close to the known lower bound of Ω(√n / √log n). I further develop the algorithm to work for mixtures of two such rotationally invariant distributions and provide a partial analysis. I also survey related machine learning results, and conclude with a survey of the theory of halfspaces over the boolean hypercube, which has recently received much attention.
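To make the sample-only access model concrete, the following sketch draws labelled examples (x, f(x)) for a halfspace f(x) = sign(Σᵢ wᵢxᵢ - θ) from a standard Gaussian, which is one rotationally invariant distribution. It illustrates only the input a tester receives, not the testing algorithm of the thesis; the names `halfspace` and `draw_examples`, the fixed seed, and the convention sign(0) = +1 are assumptions of this example.

```python
import random

def halfspace(w, theta):
    """f(x) = sign(sum_i w_i x_i - theta), with sign(0) taken as +1 (assumption)."""
    def f(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) - theta >= 0 else -1
    return f

def draw_examples(f, n, m, seed=0):
    """Draw m labelled examples (x, f(x)) with x a standard Gaussian vector in R^n.
    The caller cannot choose the points; it only sees the random sample."""
    rng = random.Random(seed)
    examples = []
    for _ in range(m):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        examples.append((x, f(x)))
    return examples

f = halfspace([1.0, -2.0, 0.5], 0.0)
sample = draw_examples(f, n=3, m=5)
for x, label in sample:
    print([round(v, 3) for v in x], label)
```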
Learning DNF Expressions from Fourier Spectrum
Since its introduction by Valiant in 1984, PAC learning of DNF expressions
has remained one of the central problems in learning theory. We consider this
problem in the setting where the underlying distribution is uniform, or more
generally, a product distribution. Kalai, Samorodnitsky and Teng (2009) showed
that in this setting a DNF expression can be efficiently approximated from its
"heavy" low-degree Fourier coefficients alone. This is in contrast to previous
approaches where boosting was used and thus Fourier coefficients of the target
function modified by various distributions were needed. This property is
crucial for learning of DNF expressions over smoothed product distributions, a
learning model introduced by Kalai et al. (2009) and inspired by the seminal
smoothed analysis model of Spielman and Teng (2001).
We introduce a new approach to learning (or approximating) polynomial
threshold functions which is based on creating a function with range [-1,1]
that approximately agrees with the unknown function on low-degree Fourier
coefficients. We then describe conditions under which this is sufficient for
learning polynomial threshold functions. Our approach yields a new, simple
algorithm for approximating any polynomial-size DNF expression from its "heavy"
low-degree Fourier coefficients alone. Our algorithm greatly simplifies the
proof of learnability of DNF expressions over smoothed product distributions.
We also describe an application of our algorithm to learning monotone DNF
expressions over product distributions. Building on the work of Servedio
(2001), we give an algorithm that runs in time $\poly((s \cdot
\log{(s/\eps)})^{\log{(s/\eps)}}, n)$, where $s$ is the size of the target DNF
expression and $\eps$ is the accuracy. This improves on the $\poly((s \cdot
\log{(ns/\eps)})^{\log{(s/\eps)} \cdot \log{(1/\eps)}}, n)$ bound of Servedio
(2001). Comment: Appears in Conference on Learning Theory (COLT) 201
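For readers unfamiliar with "heavy" low-degree Fourier coefficients, the sketch below lists them for a tiny DNF by exhaustive enumeration under the uniform distribution. It is not the learning algorithm of the paper; the function names, the encoding of terms as dictionaries, and the convention that the DNF outputs +1 when satisfied are all assumptions made for illustration.

```python
from itertools import combinations, product
from math import prod

def dnf(terms):
    """terms: list of dicts {variable index: required value in {-1, +1}}.
    Returns f with f(x) = +1 if some term is satisfied, else -1 (assumed convention)."""
    def f(x):
        return 1 if any(all(x[i] == v for i, v in t.items()) for t in terms) else -1
    return f

def heavy_low_degree_coefficients(f, n, max_degree, threshold):
    """Fourier coefficients E[f(x) * prod_{i in S} x_i] over uniform {-1,1}^n
    for all |S| <= max_degree, keeping those with magnitude >= threshold."""
    points = list(product((-1, 1), repeat=n))
    heavy = {}
    for d in range(max_degree + 1):
        for S in combinations(range(n), d):
            coeff = sum(f(x) * prod(x[i] for i in S) for x in points) / len(points)
            if abs(coeff) >= threshold:
                heavy[S] = coeff
    return heavy

# f = (x0 AND x1) OR x2 on three variables.
f = dnf([{0: 1, 1: 1}, {2: 1}])
heavy = heavy_low_degree_coefficients(f, n=3, max_degree=1, threshold=0.5)
print(heavy)
```

Lowering `threshold` to 0.2 also keeps the constant coefficient and the two degree-1 coefficients on x0 and x1, each equal to 0.25.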
On the Complexity of the Inverse Semivalue Problem for Weighted Voting Games
Weighted voting games are a family of cooperative games, typically used to
model voting situations where a number of agents (players) vote against or for
a proposal. In such games, a proposal is accepted if an appropriately weighted
sum of the votes exceeds a prespecified threshold. As the influence of a player
over the voting outcome is not in general proportional to her assigned weight,
various power indices have been proposed to measure each player's influence.
The inverse power index problem is the problem of designing a weighted voting
game that achieves a set of target influences according to a predefined power
index. In this work, we study the computational complexity of the inverse
problem when the power index belongs to the class of semivalues. We prove that
the inverse problem is computationally intractable for a broad family of
semivalues, including all regular semivalues. As a special case of our general
result, we establish computational hardness of the inverse problem for the
Banzhaf indices and the Shapley values, arguably the most popular power
indices. Comment: To appear in AAAI 201
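The forward direction of the problem (weights to power indices) can be sketched by enumeration. The code below computes Banzhaf indices of a small weighted voting game, following the abstract's rule that a proposal passes when the weighted sum of votes exceeds the threshold. The function name and the normalization by $2^{n-1}$ are assumptions of this example; the inverse problem studied in the paper asks for weights that achieve given target indices and is not solved here.

```python
from itertools import product

def banzhaf_indices(weights, threshold):
    """Banzhaf index of each player: the fraction of coalitions of the other
    players that lose without the player and win with the player.
    A coalition wins when its total weight strictly exceeds `threshold`."""
    n = len(weights)
    indices = []
    for i in range(n):
        others = [w for j, w in enumerate(weights) if j != i]
        swings = 0
        for votes in product((0, 1), repeat=n - 1):
            total = sum(w for w, v in zip(others, votes) if v)
            # Player i is pivotal: losing without i, winning with i.
            if total <= threshold < total + weights[i]:
                swings += 1
        indices.append(swings / 2 ** (n - 1))
    return indices

power = banzhaf_indices([2, 1, 1], threshold=2)
print(power)
```

In this game the first player holds half of the total weight but is pivotal three times as often as each of the others, which illustrates the abstract's point that influence is not proportional to weight.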