Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness
Polynomial approximations to boolean functions have led to many positive
results in computer science. In particular, polynomial approximations to the
sign function underlie algorithms for agnostically learning halfspaces, as well
as pseudorandom generators for halfspaces. In this work, we investigate the
limits of these techniques by proving inapproximability results for the sign
function.
Firstly, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput.
2008) shows that halfspaces can be learned with respect to log-concave
distributions on $\mathbb{R}^n$ in the challenging agnostic learning model. The
power of this algorithm relies on the fact that under log-concave
distributions, halfspaces can be approximated arbitrarily well by low-degree
polynomials. We ask whether this technique can be extended beyond log-concave
distributions, and establish a negative result. We show that polynomials of any
degree cannot approximate the sign function to within arbitrarily low error for
a large class of non-log-concave distributions on the real line, including
those with densities proportional to $e^{-|x|^{0.99}}$.
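(In symbols, and paraphrasing the result: for such a distribution $D$ there is
a constant $\varepsilon_D > 0$ with $\inf_{\deg p \leq d} \mathbb{E}_{x \sim D}[\,|p(x) - \mathrm{sgn}(x)|\,] \geq \varepsilon_D$
for every degree $d$, so the approximation error does not vanish as $d \to \infty$.)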
Secondly, we investigate the derandomization of Chernoff-type concentration
inequalities. Chernoff-type tail bounds on sums of independent random variables
have pervasive applications in theoretical computer science. Schmidt et al.
(SIAM J. Discrete Math. 1995) showed that these inequalities can be established
for sums of random variables with only $O(\log(1/\delta))$-wise independence,
for a tail probability of $\delta$. We show that their results are tight up to
constant factors.
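(Concretely, in one standard formulation rather than the paper's exact wording:
for $X = X_1 + \dots + X_n$ with $X_i \in [0,1]$ and $\mu = \mathbb{E}[X]$, the Chernoff-type bound
$\Pr[|X - \mu| \geq \varepsilon\mu] \leq \delta$ with $\delta = e^{-\Theta(\varepsilon^2\mu)}$ survives, up to constants in the
exponent, when full independence is weakened to $k$-wise independence with
$k = O(\log(1/\delta))$; the tightness result says $k = \Omega(\log(1/\delta))$ is also necessary.)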
These results rely on techniques from weighted approximation theory, which
studies how well functions on the real line can be approximated by polynomials
under various distributions. We believe that these techniques will have further
applications in other areas of computer science.
Comment: 22 pages
Algorithms and lower bounds for de Morgan formulas of low-communication leaf gates
The class $\mathrm{FORMULA}[s] \circ \mathcal{G}$ consists of Boolean functions
computable by size-$s$ de Morgan formulas whose leaves are any Boolean
functions from a class $\mathcal{G}$. We give lower bounds and (SAT, Learning,
and PRG) algorithms for $\mathrm{FORMULA}[n^{1.99}] \circ \mathcal{G}$, for classes
$\mathcal{G}$ of functions with low communication complexity. Let
$R^{(k)}(\mathcal{G})$ be the maximum $k$-party number-on-forehead (NOF)
randomized communication complexity of $\mathcal{G}$. We show:
(1) The Generalized Inner Product function $\mathrm{GIP}^k_n$ cannot be computed
in $\mathrm{FORMULA}[s] \circ \mathcal{G}$ on more than a $1/2 + \varepsilon$ fraction of inputs
for $s = o\left(n^2 / \left(k \cdot 4^k \cdot R^{(k)}(\mathcal{G}) \cdot \log(n/\varepsilon) \cdot \log(1/\varepsilon)\right)^2\right)$.
As a corollary, we get an average-case lower bound for $\mathrm{GIP}^k_n$
against $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{PTF}^{k-1}$, i.e., sub-quadratic-size
de Morgan formulas with degree-$(k-1)$ PTF (polynomial threshold function)
gates at the bottom.
(2) There is a PRG of seed length
$n/2 + O\left(\sqrt{s} \cdot R^{(2)}(\mathcal{G}) \cdot \log(s/\varepsilon) \cdot \log(1/\varepsilon)\right)$
that $\varepsilon$-fools $\mathrm{FORMULA}[s] \circ \mathcal{G}$. For
$\mathrm{FORMULA}[s] \circ \mathrm{LTF}$, i.e., formulas with linear threshold
functions at the leaves, we get the better seed length
$O\left(n^{1/2} \cdot s^{1/4} \cdot \log(n) \cdot \log(n/\varepsilon)\right)$. This gives the first
non-trivial PRG (with seed length $o(n)$) for intersections of half-spaces
in the regime where $\varepsilon \leq 1/n$; a concrete instantiation is
sketched after this abstract.
(3) There is a randomized $2^{n-t}$-time #SAT algorithm for
$\mathrm{FORMULA}[s] \circ \mathcal{G}$, where
$t = \Omega\left(n / \left(\sqrt{s} \cdot \log^2(s) \cdot R^{(2)}(\mathcal{G})\right)\right)^{1/2}$.
In particular, this implies a nontrivial
#SAT algorithm for $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{LTF}$.
(4) The Minimum Circuit Size Problem is not in $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{XOR}$.
On the algorithmic side, we show that $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{XOR}$ can be
PAC-learned in time $2^{O(n/\log n)}$.
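To make item (2) concrete (our arithmetic, not a statement from the paper):
an intersection of $n$ half-spaces is an AND of $n$ LTF leaves, hence lies in
$\mathrm{FORMULA}[n] \circ \mathrm{LTF}$, and the seed length becomes
$O\left(n^{1/2} \cdot n^{1/4} \cdot \log(n) \cdot \log(n/\varepsilon)\right) = O\left(n^{3/4} \log(n) \log(n/\varepsilon)\right)$,
which is $o(n)$ whenever $\log(1/\varepsilon) = o\left(n^{1/4}/\log n\right)$, e.g., for
$\varepsilon = 1/\mathrm{poly}(n)$.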
Pre-Reduction Graph Products: Hardnesses of Properly Learning DFAs and Approximating EDP on DAGs
The study of graph products is a major research topic and typically concerns
the term $f(G \ast H)$, e.g., to show that $f(G \ast H) = f(G) \cdot f(H)$. In this paper, we
study graph products in a non-standard form $f(R[G \ast H])$, where $R$ is a
"reduction", a transformation of any graph into an instance of an intended
optimization problem. We resolve some open problems as applications.
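(A classical instance of such a product theorem, given here only as an
illustration: for the lexicographic product $G[H]$, the independence number
satisfies $\alpha(G[H]) = \alpha(G) \cdot \alpha(H)$, and iterating products of this kind is a
standard way to amplify inapproximability gaps.)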
(1) A tight $n^{1-\epsilon}$-approximation hardness for the minimum
consistent deterministic finite automaton (DFA) problem, where $n$ is the
sample size. Due to Board and Pitt [Theoretical Computer Science 1992], this
implies the hardness of properly learning DFAs assuming $\mathrm{NP} \neq \mathrm{RP}$ (the
weakest possible assumption).
(2) A tight $n^{1/2-\epsilon}$ hardness for the edge-disjoint paths (EDP)
problem on directed acyclic graphs (DAGs), where $n$ denotes the number of
vertices.
(3) A tight hardness of packing vertex-disjoint $k$-cycles for large $k$.
(4) An alternative (and perhaps simpler) proof for the hardness of properly
learning DNF, CNF and intersection of halfspaces [Alekhnovich et al., FOCS 2004
and J. Comput. Syst. Sci. 2008].
Who Should Predict? Exact Algorithms For Learning to Defer to Humans
Automated AI classifiers should be able to defer the prediction to a human
decision maker to ensure more accurate predictions. In this work, we jointly
train a classifier with a rejector, which decides on each data point whether
the classifier or the human should predict. We show that prior approaches can
fail to find a human-AI system with low misclassification error even when there
exists a linear classifier and rejector that have zero error (the realizable
setting). We prove that obtaining a linear pair with low error is NP-hard even
when the problem is realizable. To complement this negative result, we give a
mixed-integer-linear-programming (MILP) formulation that can optimally solve
the problem in the linear setting. However, the MILP only scales to
moderately sized problems. Therefore, we provide a novel surrogate loss
function that is realizable-consistent and performs well empirically. We test
our approaches on a comprehensive set of datasets and compare to a wide range
of baselines.
Comment: AISTATS 2023
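To illustrate the kind of MILP the abstract refers to, here is a minimal
sketch (our own construction under stated assumptions, not the authors'
formulation) that jointly fits a linear classifier $w$ and a linear rejector
$v$ on a sample with recorded human predictions, using the PuLP solver; the
margin eps, the big-M constant M, and the weight bound B are hypothetical
modeling choices, and M must dominate $|w \cdot x|$ and $|v \cdot x|$ over the data.

    # Joint MILP for a linear classifier w and linear rejector v:
    # defer to the human on point x exactly when v @ x > 0.
    import pulp

    def learn_to_defer_milp(X, y, h, M=100.0, eps=1e-3, B=10.0):
        """X: feature vectors; y, h: true labels and human predictions in {-1,+1}."""
        n, d = len(X), len(X[0])
        prob = pulp.LpProblem("learning_to_defer", pulp.LpMinimize)
        w = [pulp.LpVariable(f"w{j}", -B, B) for j in range(d)]   # classifier
        v = [pulp.LpVariable(f"v{j}", -B, B) for j in range(d)]   # rejector
        defer = [pulp.LpVariable(f"d{i}", cat="Binary") for i in range(n)]
        mist = [pulp.LpVariable(f"m{i}", cat="Binary") for i in range(n)]
        z = [pulp.LpVariable(f"z{i}", 0, 1) for i in range(n)]    # counted error

        for i in range(n):
            wx = pulp.lpSum(w[j] * X[i][j] for j in range(d))
            vx = pulp.lpSum(v[j] * X[i][j] for j in range(d))
            # Big-M link: defer[i] = 1 iff the rejector fires (v @ x_i >= eps).
            prob += vx >= eps - M * (1 - defer[i])
            prob += vx <= -eps + M * defer[i]
            # If mist[i] = 0, the classifier must be correct on x_i with margin eps.
            prob += y[i] * wx >= eps - M * mist[i]
            # z[i] counts a classifier mistake only on non-deferred points.
            prob += z[i] >= mist[i] - defer[i]

        human_err = [1 if h[i] != y[i] else 0 for i in range(n)]
        # Total error = human error on deferred points + classifier error elsewhere.
        prob += pulp.lpSum(human_err[i] * defer[i] + z[i] for i in range(n))
        prob.solve(pulp.PULP_CBC_CMD(msg=False))
        return [var.value() for var in w], [var.value() for var in v]

In the realizable setting described above, the optimum value is zero; the
binary defer/mistake variables are also what makes the exact formulation scale
poorly, in line with the NP-hardness result.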
- …