
    Oracles Are Subtle But Not Malicious

    Theoretical computer scientists have been debating the role of oracles since the 1970s. This paper illustrates both that oracles can give us nontrivial insights about the barrier problems in circuit complexity, and that they need not prevent us from trying to solve those problems. First, we give an oracle relative to which PP has linear-sized circuits, by proving a new lower bound for perceptrons and low-degree threshold polynomials. This oracle settles a longstanding open question, and generalizes earlier results due to Beigel and to Buhrman, Fortnow, and Thierauf. More importantly, it implies the first nonrelativizing separation of "traditional" complexity classes, as opposed to interactive proof classes such as MIP and MA-EXP. For Vinodchandran showed, by a nonrelativizing argument, that PP does not have circuits of size n^k for any fixed k. We present an alternative proof of this fact, which shows that PP does not even have quantum circuits of size n^k with quantum advice. To our knowledge, this is the first nontrivial lower bound on quantum circuit size. Second, we study a beautiful algorithm of Bshouty et al. for learning Boolean circuits in ZPP^NP. We show that the NP queries in this algorithm cannot be parallelized by any relativizing technique, by giving an oracle relative to which ZPP^||NP and even BPP^||NP have linear-size circuits. On the other hand, we also show that the NP queries could be parallelized if P=NP. Thus, classes such as ZPP^||NP inhabit a "twilight zone," where we need to distinguish between relativizing and black-box techniques. Our results on this subject have implications for computational learning theory as well as for the circuit minimization problem. Comment: 20 pages, 1 figure
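    For readers unfamiliar with the model: a "perceptron" in this line of work is, roughly, a threshold gate over ANDs of inputs, i.e., the sign of a sparse low-degree polynomial over {0,1}^n. The following minimal sketch (all terms, weights, and the threshold are hypothetical, chosen only for illustration) just evaluates such a device; it is not the paper's construction.

```python
# A minimal sketch of a "perceptron" in the sense used in this line of work:
# a threshold gate over AND terms, i.e., sign of a sparse polynomial.

def perceptron(x, and_terms, weights, threshold):
    """and_terms: list of index tuples; a term fires iff all its bits are 1.
    Output is 1 iff the weighted sum of fired terms reaches the threshold."""
    s = sum(w for S, w in zip(and_terms, weights) if all(x[i] for i in S))
    return int(s >= threshold)

x = [1, 0, 1, 1]
# Terms (0, 2) and (2,) fire, contributing 2 + 1 = 3 >= 2, so output is 1.
print(perceptron(x, and_terms=[(0, 2), (1, 3), (2,)],
                 weights=[2, -1, 1], threshold=2))
```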

    Immunity and Simplicity for Exact Counting and Other Counting Classes

    Ko [RAIRO 24, 1990] and Bruschi [TCS 102, 1992] showed that in some relativized world, PSPACE (in fact, ParityP) contains a set that is immune to the polynomial hierarchy (PH). In this paper, we study and settle the question of (relativized) separations with immunity for PH and the counting classes PP, C_{=}P, and ParityP in all possible pairwise combinations. Our main result is that there is an oracle A relative to which C_{=}P contains a set that is immune to BPP^{ParityP}. In particular, this C_{=}P^A set is immune to PH^A and ParityP^A. Strengthening results of Torán [J. ACM 38, 1991] and Green [IPL 37, 1991], we also show that, in suitable relativizations, NP contains a C_{=}P-immune set, and ParityP contains a PP^{PH}-immune set. This implies the existence of a C_{=}P^B-simple set for some oracle B, which extends results of Balcázar et al. [SIAM J. Comp. 14, 1985; RAIRO 22, 1988] and provides the first example of a simple set in a class not known to be contained in PH. Our proof technique requires a circuit lower bound for "exact counting" that is derived from Razborov's [Mat. Zametki 41, 1987] lower bound for majority. Comment: 20 pages
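    For reference, the standard (textbook) definitions behind the abstract's terminology, stated here because the abstract assumes them: a set is immune to a class if it is infinite yet contains no infinite subset from that class, and a set is simple for a class if it lies in the class and its complement is immune to it.

```latex
% Textbook definitions of immunity and simplicity relative to a class C:
\begin{align*}
  L \text{ is } \mathcal{C}\text{-immune} &\iff
    L \text{ is infinite and no infinite } L' \subseteq L
    \text{ satisfies } L' \in \mathcal{C},\\
  S \text{ is } \mathcal{C}\text{-simple} &\iff
    S \in \mathcal{C} \text{ and } \overline{S}
    \text{ is } \mathcal{C}\text{-immune}.
\end{align*}
```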

    A note on quantum algorithms and the minimal degree of epsilon-error polynomials for symmetric functions

    The degrees of polynomials representing or approximating Boolean functions are a prominent tool in various branches of complexity theory. Sherstov recently characterized the minimal degree deg_{\eps}(f) among all polynomials (over the reals) that approximate a symmetric function f:{0,1}^n-->{0,1} up to worst-case error \eps: deg_{\eps}(f) = ~\Theta(deg_{1/3}(f) + \sqrt{n\log(1/\eps)}). In this note we show how a tighter version (without the log-factors hidden in the ~\Theta-notation), can be derived quite easily using the close connection between polynomials and quantum algorithms.Comment: 7 pages LaTeX. 2nd version: corrected a few small inaccuracie
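    A quick numerical sketch of how the two terms of the characterization trade off (constants are suppressed, so the values are only indicative). The example uses OR on n bits, for which $\deg_{1/3}(\mathrm{OR}_n) = \Theta(\sqrt{n})$ is a classical fact not stated in the abstract.

```python
import math

def sherstov_bound(deg_one_third: float, n: int, eps: float) -> float:
    """Evaluate deg_eps(f) ~ deg_{1/3}(f) + sqrt(n * log(1/eps)) from the
    abstract, with hidden constants suppressed."""
    return deg_one_third + math.sqrt(n * math.log(1.0 / eps))

n = 1024
for eps in (1/3, 1e-3, 1e-9):
    # The sqrt(n log(1/eps)) term dominates as the target error shrinks.
    print(f"eps = {eps:g}: degree ~ {sherstov_bound(math.sqrt(n), n, eps):.1f}")
```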

    Model Interpretability through the Lens of Computational Complexity

    In spite of several claims stating that some models are more interpretable than others -- e.g., "linear models are more interpretable than deep neural networks" -- we still lack a principled notion of interpretability to formally compare among different classes of models. We make a step towards such a notion by studying whether folklore interpretability claims have a correlate in terms of computational complexity theory. We focus on local post-hoc explainability queries that, intuitively, attempt to answer why individual inputs are classified in a certain way by a given model. In a nutshell, we say that a class $\mathcal{C}_1$ of models is more interpretable than another class $\mathcal{C}_2$ if the computational complexity of answering post-hoc queries for models in $\mathcal{C}_2$ is higher than for those in $\mathcal{C}_1$. We prove that this notion provides a good theoretical counterpart to current beliefs on the interpretability of models; in particular, we show that under our definition and assuming standard complexity-theoretic assumptions (such as P $\neq$ NP), both linear and tree-based models are strictly more interpretable than neural networks. Our complexity analysis, however, does not provide a clear-cut difference between linear and tree-based models, as we obtain different results depending on the particular post-hoc explanations considered. Finally, by applying a finer complexity analysis based on parameterized complexity, we are able to prove a theoretical result suggesting that shallow neural networks are more interpretable than deeper ones. Comment: 36 pages, including 9 pages of main text. This is the arXiv version of the NeurIPS 2020 paper. Apart from minor differences that could be introduced by the publisher, the only difference should be the addition of the appendix, which contains all the proofs that do not appear in the main text
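    To make the flavor of "local post-hoc query" concrete: one natural query of this kind asks for the minimum number of feature changes needed to flip a model's prediction. The sketch below (a hypothetical illustration, not the paper's exact query set or algorithm) shows why such a query is easy for linear models over binary features: flips affect the score independently, so a greedy choice of the most helpful flips is optimal; for neural networks no such structure is available.

```python
import numpy as np

def min_change_counterfactual(w, b, x):
    """Minimum-Hamming-change counterfactual for a linear classifier
    f(x) = [w @ x >= b] over binary features, solved greedily.
    Returns the number of flips needed, or None if no flip set works."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    score = float(w @ x - b)
    label = score >= 0
    # Flipping bit i changes the score by +w[i] if x[i] == 0, else -w[i].
    deltas = np.where(x == 0, w, -w)
    # Most helpful flips first: most negative deltas to leave the positive
    # class, most positive deltas to enter it.
    order = np.sort(deltas) if label else np.sort(deltas)[::-1]
    flips = 0
    for d in order:
        score += d
        flips += 1
        if (score >= 0) != label:
            return flips
    return None

w = [3.0, -2.0, 1.0, 0.5]
print(min_change_counterfactual(w, b=1.0, x=[1, 0, 0, 0]))  # -> 1
```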

    Polynomials that Sign Represent Parity and Descartes' Rule of Signs

    A real polynomial $P(X_1,\ldots,X_n)$ sign represents $f: A^n \to \{0,1\}$ if for every $(a_1,\ldots,a_n) \in A^n$, the sign of $P(a_1,\ldots,a_n)$ equals $(-1)^{f(a_1,\ldots,a_n)}$. Such sign representations are well-studied in computer science and have applications to computational complexity and computational learning theory. In this work, we present a systematic study of tradeoffs between degree and sparsity of sign representations through the lens of the parity function. We attempt to prove bounds that hold for any choice of set $A$. We show that sign representing parity over $\{0,\ldots,m-1\}^n$ with the degree in each variable at most $m-1$ requires sparsity at least $m^n$. We show that a tradeoff exists between sparsity and degree, by exhibiting a sign representation that has higher degree but lower sparsity. We show a lower bound of $n(m-2)+1$ on the sparsity of polynomials of any degree representing parity over $\{0,\ldots,m-1\}^n$. We prove exact bounds on the sparsity of such polynomials for any two-element subset $A$. The main tool used is Descartes' Rule of Signs, a classical result in algebra relating the sparsity of a polynomial to its number of real roots. As an application, we use bounds on sparsity to derive circuit lower bounds for depth-two AND-OR-NOT circuits with a threshold gate at the top. We use this to give a simple proof that such circuits need size $1.5^n$ to compute parity, which improves the previous bound of $(4/3)^{n/2}$ due to Goldmann (1997). We show a tight lower bound of $2^n$ for the inner product function over $\{0,1\}^n \times \{0,1\}^n$. Comment: To appear in Computational Complexity
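    A classical worked example of the definition for the $A = \{0,1\}$ case (standard folklore, not taken from the abstract): $P(x) = \prod_i (1 - 2x_i)$ sign represents parity, and when expanded into monomials it has sparsity $2^n$, matching the $m^n$ lower bound above with $m = 2$. The check below verifies this by brute force.

```python
from itertools import product

def P(a):
    """P(a) = prod_i (1 - 2*a_i); each factor is +1 or -1 on {0,1} inputs,
    so the product equals (-1)^{parity(a)} exactly."""
    prod = 1
    for ai in a:
        prod *= (1 - 2 * ai)
    return prod

n = 4
for a in product([0, 1], repeat=n):
    parity = sum(a) % 2
    # sign(P(a)) = (-1)^{f(a)} with f = parity, per the definition above.
    assert (P(a) > 0) == (parity == 0)
print("sign representation verified for n =", n)
```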

    The intersection of two halfspaces has high threshold degree

    The threshold degree of a Boolean function $f:\{0,1\}^n \to \{-1,+1\}$ is the least degree of a real polynomial $p$ such that $f(x) = \mathrm{sgn}\, p(x)$. We construct two halfspaces on $\{0,1\}^n$ whose intersection has threshold degree $\Theta(\sqrt{n})$, an exponential improvement on previous lower bounds. This solves an open problem due to Klivans (2002) and rules out the use of perceptron-based techniques for PAC learning the intersection of two halfspaces, a central unresolved challenge in computational learning. We also prove that the intersection of two majority functions has threshold degree $\Omega(\log n)$, which is tight and settles a conjecture of O'Donnell and Servedio (2003). Our proof consists of two parts. First, we show that for any nonconstant Boolean functions $f$ and $g$, the intersection $f(x) \wedge g(y)$ has threshold degree $O(d)$ if and only if $\|f-F\|_\infty + \|g-G\|_\infty < 1$ for some rational functions $F$, $G$ of degree $O(d)$. Second, we settle the least degree required for approximating a halfspace and a majority function to any given accuracy by rational functions. Our technique further allows us to make progress on Aaronson's challenge (2008) and contribute strong direct product theorems for polynomial representations of composed Boolean functions of the form $F(f_1,\ldots,f_n)$. In particular, we give an improved lower bound on the approximate degree of the AND-OR tree. Comment: Full version of the FOCS'09 paper
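    The threshold degree of a concrete small function can be computed directly from the definition: for each candidate degree $d$, the sign constraints form a linear feasibility problem in the monomial coefficients. The sketch below is a generic brute-force illustration of the definition (only practical for tiny $n$), not the paper's proof technique.

```python
import numpy as np
from itertools import combinations, product
from scipy.optimize import linprog

def threshold_degree(f, n):
    """Smallest d for which some degree-d multilinear polynomial p has
    sign(p(x)) = f(x) on all of {0,1}^n, via LP feasibility.
    f maps tuples in {0,1}^n to -1/+1. Exponential in n; demo only."""
    xs = list(product([0, 1], repeat=n))
    fv = np.array([f(x) for x in xs], dtype=float)
    for d in range(n + 1):
        monos = [S for k in range(d + 1) for S in combinations(range(n), k)]
        # M[i][j] = value of monomial j on input i (empty product = 1).
        M = np.array([[np.prod([x[i] for i in S]) if S else 1.0
                       for S in monos] for x in xs])
        # By scale invariance, sign(p) = f everywhere iff f(x)*p(x) >= 1
        # is feasible; rewrite as -(fv * (M @ c)) <= -1 for linprog.
        res = linprog(c=np.zeros(len(monos)),
                      A_ub=-(fv[:, None] * M), b_ub=-np.ones(len(xs)),
                      bounds=[(None, None)] * len(monos))
        if res.success:
            return d
    return n

# Sanity check: parity on n bits has full threshold degree n (Minsky-Papert).
parity = lambda x: -1 if sum(x) % 2 else 1
print(threshold_degree(parity, 3))  # -> 3
```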

    Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: a Review

    The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning. Deep convolutional networks are a special case of these conditions, though weight sharing is not the main reason for their exponential advantage.

    Weighted Polynomial Approximations: Limits for Learning and Pseudorandomness

    Polynomial approximations to Boolean functions have led to many positive results in computer science. In particular, polynomial approximations to the sign function underlie algorithms for agnostically learning halfspaces, as well as pseudorandom generators for halfspaces. In this work, we investigate the limits of these techniques by proving inapproximability results for the sign function. First, the polynomial regression algorithm of Kalai et al. (SIAM J. Comput. 2008) shows that halfspaces can be learned with respect to log-concave distributions on $\mathbb{R}^n$ in the challenging agnostic learning model. The power of this algorithm relies on the fact that under log-concave distributions, halfspaces can be approximated arbitrarily well by low-degree polynomials. We ask whether this technique can be extended beyond log-concave distributions, and establish a negative result. We show that polynomials of any degree cannot approximate the sign function to within arbitrarily low error for a large class of non-log-concave distributions on the real line, including those with densities proportional to $\exp(-|x|^{0.99})$. Second, we investigate the derandomization of Chernoff-type concentration inequalities. Chernoff-type tail bounds on sums of independent random variables have pervasive applications in theoretical computer science. Schmidt et al. (SIAM J. Discrete Math. 1995) showed that these inequalities can be established for sums of random variables with only $O(\log(1/\delta))$-wise independence, for a tail probability of $\delta$. We show that their results are tight up to constant factors. These results rely on techniques from weighted approximation theory, which studies how well functions on the real line can be approximated by polynomials under various distributions. We believe that these techniques will have further applications in other areas of computer science. Comment: 22 pages
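    A small numerical illustration (not from the paper, and not a proof) of the log-concave phenomenon the abstract builds on: under a Gaussian, the error of low-degree polynomial approximations to sign(x) keeps shrinking as the degree grows. An L2 least-squares fit is used here as a rough proxy for the best approximating polynomial.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)  # log-concave (Gaussian) sample
y = np.sign(x)

for d in (1, 3, 7, 11):
    # Least-squares polynomial fit of degree d under the empirical
    # Gaussian measure; the L1 error of this fit is a proxy for the
    # true approximation error, which decreases with degree.
    coeffs = np.polynomial.polynomial.polyfit(x, y, d)
    err = np.mean(np.abs(np.polynomial.polynomial.polyval(x, coeffs) - y))
    print(f"degree {d:2d}: empirical L1 error ~ {err:.3f}")
```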