    Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness

    Polynomial approximations to Boolean functions have led to many positive results in computer science. In particular, polynomial approximations to the sign function underlie algorithms for agnostically learning halfspaces, as well as pseudorandom generators for halfspaces. In this work, we investigate the limits of these techniques by proving inapproximability results for the sign function. First, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput. 2008) shows that halfspaces can be learned with respect to log-concave distributions on $\mathbb{R}^n$ in the challenging agnostic learning model. The power of this algorithm relies on the fact that, under log-concave distributions, halfspaces can be approximated arbitrarily well by low-degree polynomials. We ask whether this technique can be extended beyond log-concave distributions, and establish a negative result: polynomials of any degree cannot approximate the sign function to within arbitrarily low error for a large class of non-log-concave distributions on the real line, including those with densities proportional to $\exp(-|x|^{0.99})$. Second, we investigate the derandomization of Chernoff-type concentration inequalities. Chernoff-type tail bounds on sums of independent random variables have pervasive applications in theoretical computer science. Schmidt et al. (SIAM J. Discrete Math. 1995) showed that these inequalities can be established for sums of random variables with only $O(\log(1/\delta))$-wise independence, for a tail probability of $\delta$. We show that their results are tight up to constant factors. These results rely on techniques from weighted approximation theory, which studies how well functions on the real line can be approximated by polynomials under various distributions. We believe that these techniques will have further applications in other areas of computer science.
    Comment: 22 pages
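
    To make the polynomial-regression connection concrete, the following is a minimal illustrative sketch (not the paper's construction, and in one dimension only): fit a low-degree polynomial to the ±1 labels and classify by its sign, which works well exactly when the sign function admits a good low-degree approximation under the data distribution. The Gaussian marginal, noise rate, and degree below are assumptions chosen for illustration.

```python
import numpy as np

def fit_poly_sign_classifier(x, y, degree):
    """Least-squares polynomial regression on +/-1 labels, classified by sign.
    (Kalai et al. analyze L1 regression; least squares keeps the sketch short.)"""
    coeffs = np.polyfit(x, y, deg=degree)              # degree-d least-squares fit
    return lambda t: np.sign(np.polyval(coeffs, t))    # threshold the polynomial at 0

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Log-concave (Gaussian) marginal with 10% random label noise.
    x = rng.normal(size=2000)
    y = np.sign(x - 0.3)
    y[rng.random(2000) < 0.1] *= -1

    h = fit_poly_sign_classifier(x, y, degree=8)
    print("error vs. true halfspace:", np.mean(h(x) != np.sign(x - 0.3)))
```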

    Algorithms and lower bounds for de Morgan formulas of low-communication leaf gates

    The class $\mathrm{FORMULA}[s] \circ \mathcal{G}$ consists of Boolean functions computable by size-$s$ de Morgan formulas whose leaves are any Boolean functions from a class $\mathcal{G}$. We give lower bounds and (SAT, Learning, and PRG) algorithms for $\mathrm{FORMULA}[n^{1.99}] \circ \mathcal{G}$, for classes $\mathcal{G}$ of functions with low communication complexity. Let $R^{(k)}(\mathcal{G})$ be the maximum $k$-party number-on-forehead (NOF) randomized communication complexity of $\mathcal{G}$. We show: (1) The Generalized Inner Product function $\mathrm{GIP}^k_n$ cannot be computed in $\mathrm{FORMULA}[s] \circ \mathcal{G}$ on more than a $1/2+\varepsilon$ fraction of inputs for $s = o\!\left(\frac{n^2}{\left(k \cdot 4^k \cdot R^{(k)}(\mathcal{G}) \cdot \log(n/\varepsilon) \cdot \log(1/\varepsilon)\right)^{2}}\right)$. As a corollary, we get an average-case lower bound for $\mathrm{GIP}^k_n$ against $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{PTF}^{k-1}$. (2) There is a PRG of seed length $n/2 + O\left(\sqrt{s} \cdot R^{(2)}(\mathcal{G}) \cdot \log(s/\varepsilon) \cdot \log(1/\varepsilon)\right)$ that $\varepsilon$-fools $\mathrm{FORMULA}[s] \circ \mathcal{G}$. For $\mathrm{FORMULA}[s] \circ \mathrm{LTF}$, we get the better seed length $O\left(n^{1/2} \cdot s^{1/4} \cdot \log(n) \cdot \log(n/\varepsilon)\right)$. This gives the first non-trivial PRG (with seed length $o(n)$) for intersections of $n$ halfspaces in the regime where $\varepsilon \leq 1/n$. (3) There is a randomized $2^{n-t}$-time #SAT algorithm for $\mathrm{FORMULA}[s] \circ \mathcal{G}$, where $t = \Omega\left(\frac{n}{\sqrt{s} \cdot \log^2(s) \cdot R^{(2)}(\mathcal{G})}\right)^{1/2}$. In particular, this implies a nontrivial #SAT algorithm for $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{LTF}$. (4) The Minimum Circuit Size Problem is not in $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{XOR}$. On the algorithmic side, we show that $\mathrm{FORMULA}[n^{1.99}] \circ \mathrm{XOR}$ can be PAC-learned in time $2^{O(n/\log n)}$.
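
    For concreteness, the Generalized Inner Product function $\mathrm{GIP}^k_n$ from item (1) is the standard hard function for $k$-party NOF communication: the XOR, over the $n$ coordinates, of the AND of the $k$ players' bits in that coordinate. A short sketch of this definition follows; representing the input as a $k \times n$ 0/1 matrix with one row per player is an assumed convention.

```python
import numpy as np

def gip(x: np.ndarray) -> int:
    """Generalized Inner Product GIP^k_n: x is a k x n 0/1 matrix (one row per
    NOF player). Returns the XOR over the n columns of the AND of each column."""
    column_ands = np.all(x == 1, axis=0)   # AND of the k bits in each column
    return int(np.sum(column_ands) % 2)    # parity (XOR) of the column ANDs

# k = 3 players, n = 4 coordinates: only the first column is all ones,
# so exactly one column contributes and GIP evaluates to 1.
example = np.array([[1, 0, 1, 1],
                    [1, 1, 0, 1],
                    [1, 1, 1, 0]])
assert gip(example) == 1
```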

    Pre-Reduction Graph Products: Hardnesses of Properly Learning DFAs and Approximating EDP on DAGs

    The study of graph products is a major research topic and typically concerns the term $f(G*H)$, e.g., showing that $f(G*H) = f(G)f(H)$. In this paper, we study graph products in a non-standard form $f(R[G*H])$, where $R$ is a "reduction", i.e., a transformation of any graph into an instance of an intended optimization problem. We resolve some open problems as applications. (1) A tight $n^{1-\epsilon}$ approximation hardness for the minimum consistent deterministic finite automaton (DFA) problem, where $n$ is the sample size. By a result of Board and Pitt [Theoretical Computer Science 1992], this implies the hardness of properly learning DFAs assuming $NP \neq RP$ (the weakest possible assumption). (2) A tight $n^{1/2-\epsilon}$ hardness for the edge-disjoint paths (EDP) problem on directed acyclic graphs (DAGs), where $n$ denotes the number of vertices. (3) A tight hardness of packing vertex-disjoint $k$-cycles for large $k$. (4) An alternative (and perhaps simpler) proof of the hardness of properly learning DNF, CNF, and intersections of halfspaces [Alekhnovich et al., FOCS 2004 and J. Comput. Syst. Sci. 2008].
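
    The abstract does not fix which product $*$ denotes, but a graph product generally combines $G$ and $H$ into a graph on the vertex set $V(G) \times V(H)$, with the edge rule depending on the product. As one standard instance chosen purely for illustration, here is a sketch of the tensor (categorical) product; the reduction $R$ from the paper is not modeled here.

```python
from itertools import product

def tensor_product(G, H):
    """Tensor (categorical) product. A graph is (vertices, edges), with edges
    given as a set of frozensets {u, v}. In G x H, (u1, v1) ~ (u2, v2) iff
    u1 ~ u2 in G and v1 ~ v2 in H."""
    (VG, EG), (VH, EH) = G, H
    V = set(product(VG, VH))
    E = set()
    for u1, v1 in V:
        for u2, v2 in V:
            if frozenset((u1, u2)) in EG and frozenset((v1, v2)) in EH:
                E.add(frozenset(((u1, v1), (u2, v2))))
    return V, E

# Example: K2 x K2 is a perfect matching on 4 vertices (2 edges), not a 4-cycle.
K2 = ({0, 1}, {frozenset((0, 1))})
V, E = tensor_product(K2, K2)
assert len(V) == 4 and len(E) == 2
```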

    Who Should Predict? Exact Algorithms For Learning to Defer to Humans

    Automated AI classifiers should be able to defer the prediction to a human decision maker to ensure more accurate predictions. In this work, we jointly train a classifier with a rejector, which decides on each data point whether the classifier or the human should predict. We show that prior approaches can fail to find a human-AI system with low misclassification error even when there exists a linear classifier and rejector that have zero error (the realizable setting). We prove that obtaining a linear pair with low error is NP-hard even when the problem is realizable. To complement this negative result, we give a mixed-integer linear programming (MILP) formulation that can optimally solve the problem in the linear setting. However, the MILP only scales to moderately sized problems. Therefore, we provide a novel surrogate loss function that is realizable-consistent and performs well empirically. We test our approaches on a comprehensive set of datasets and compare to a wide range of baselines.
    Comment: AISTATS 202
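
    As a minimal sketch of the object being optimized (not the paper's MILP or surrogate loss), the linear classifier-rejector pair and the misclassification error of the resulting human-AI system can be written as below; the synthetic data and the human-error model are assumptions for illustration only.

```python
import numpy as np

def system_error(w_clf, w_rej, X, y, human_pred):
    """Error of a linear human-AI team: the rejector defers to the human where
    w_rej . x >= 0; otherwise the classifier predicts sign(w_clf . x)."""
    defer = X @ w_rej >= 0
    final = np.where(defer, human_pred, np.sign(X @ w_clf))
    return np.mean(final != y)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])
# Assumed human model: correct exactly where the second feature is positive.
human_pred = np.where(X[:, 1] > 0, y, -y)

w_clf = np.array([1.0, 0.0])        # a deliberately imperfect classifier
rej_bad = np.array([0.0, -1.0])     # defers where the human is wrong
rej_good = np.array([0.0, 1.0])     # defers where the human is right
print(system_error(w_clf, rej_bad, X, y, human_pred))
print(system_error(w_clf, rej_good, X, y, human_pred))
```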