8,930 research outputs found

    Progress on Polynomial Identity Testing - II

    Full text link
    We survey the area of algebraic complexity theory; with the focus being on the problem of polynomial identity testing (PIT). We discuss the key ideas that have gone into the results of the last few years.Comment: 17 pages, 1 figure, surve

    Strong ETH Breaks With Merlin and Arthur: Short Non-Interactive Proofs of Batch Evaluation

    Get PDF
    We present an efficient proof system for Multipoint Arithmetic Circuit Evaluation: for every arithmetic circuit C(x1,
,xn)C(x_1,\ldots,x_n) of size ss and degree dd over a field F{\mathbb F}, and any inputs a1,
,aK∈Fna_1,\ldots,a_K \in {\mathbb F}^n, ∙\bullet the Prover sends the Verifier the values C(a1),
,C(aK)∈FC(a_1), \ldots, C(a_K) \in {\mathbb F} and a proof of O~(K⋅d)\tilde{O}(K \cdot d) length, and ∙\bullet the Verifier tosses poly(log⁥(dK∣F∣/Δ))\textrm{poly}(\log(dK|{\mathbb F}|/\varepsilon)) coins and can check the proof in about O~(K⋅(n+d)+s)\tilde{O}(K \cdot(n + d) + s) time, with probability of error less than Δ\varepsilon. For small degree dd, this "Merlin-Arthur" proof system (a.k.a. MA-proof system) runs in nearly-linear time, and has many applications. For example, we obtain MA-proof systems that run in cnc^{n} time (for various c<2c < 2) for the Permanent, #\#Circuit-SAT for all sublinear-depth circuits, counting Hamiltonian cycles, and infeasibility of 00-11 linear programs. In general, the value of any polynomial in Valiant's class VP{\sf VP} can be certified faster than "exhaustive summation" over all possible assignments. These results strongly refute a Merlin-Arthur Strong ETH and Arthur-Merlin Strong ETH posed by Russell Impagliazzo and others. We also give a three-round (AMA) proof system for quantified Boolean formulas running in 22n/3+o(n)2^{2n/3+o(n)} time, nearly-linear time MA-proof systems for counting orthogonal vectors in a collection and finding Closest Pairs in the Hamming metric, and a MA-proof system running in nk/2+O(1)n^{k/2+O(1)}-time for counting kk-cliques in graphs. We point to some potential future directions for refuting the Nondeterministic Strong ETH.Comment: 17 page

    Proof Complexity of Systems of (Non-Deterministic) Decision Trees and Branching Programs

    Get PDF
    This paper studies propositional proof systems in which lines are sequents of decision trees or branching programs, deterministic or non-deterministic. Decision trees (DTs) are represented by a natural term syntax, inducing the system LDT, and non-determinism is modelled by including disjunction, ?, as primitive (system LNDT). Branching programs generalise DTs to dag-like structures and are duly handled by extension variables in our setting, as is common in proof complexity (systems eLDT and eLNDT). Deterministic and non-deterministic branching programs are natural nonuniform analogues of log-space (L) and nondeterministic log-space (NL), respectively. Thus eLDT and eLNDT serve as natural systems of reasoning corresponding to L and NL, respectively. The main results of the paper are simulation and non-simulation results for tree-like and dag-like proofs in LDT, LNDT, eLDT and eLNDT. We also compare them with Frege systems, constant-depth Frege systems and extended Frege systems

    Arithmetic on a Distributed-Memory Quantum Multicomputer

    Full text link
    We evaluate the performance of quantum arithmetic algorithms run on a distributed quantum computer (a quantum multicomputer). We vary the node capacity and I/O capabilities, and the network topology. The tradeoff of choosing between gates executed remotely, through ``teleported gates'' on entangled pairs of qubits (telegate), versus exchanging the relevant qubits via quantum teleportation, then executing the algorithm using local gates (teledata), is examined. We show that the teledata approach performs better, and that carry-ripple adders perform well when the teleportation block is decomposed so that the key quantum operations can be parallelized. A node size of only a few logical qubits performs adequately provided that the nodes have two transceiver qubits. A linear network topology performs acceptably for a broad range of system sizes and performance parameters. We therefore recommend pursuing small, high-I/O bandwidth nodes and a simple network. Such a machine will run Shor's algorithm for factoring large numbers efficiently.Comment: 24 pages, 10 figures, ACM transactions format. Extended version of Int. Symp. on Comp. Architecture (ISCA) paper; v2, correct one circuit error, numerous small changes for clarity, add reference

    Vector processing-aware advanced clock-gating techniques for low-power fused multiply-add

    Get PDF
    The need for power efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a retailoring for the mobile market that they are entering now. Floating-point (FP) fused multiply-add (FMA), being a functional unit with high power consumption, deserves special attention. Although clock gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector FMA units (VFUs). These techniques ensure power savings without jeopardizing the timing. We evaluate the proposed techniques using both synthetic and “real-world” application-based benchmarking. Using vector masking and vector multilane-aware clock gating, we report power reductions of up to 52%, assuming active VFU operating at the peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector FP instructions. Finally, when evaluating all techniques together, using “real-world” benchmarking, the power reductions are up to 80%. Additionally, in accordance with processor design trends, we perform this research in a fully parameterizable and automated fashion.The research leading to these results has received funding from the RoMoL ERC Advanced Grant GA 321253 and is supported in part by the European Union (FEDER funds) under contract TTIN2015-65316-P. The work of I. Ratkovic was supported by a FPU research grant from the Spanish MECD.Peer ReviewedPostprint (author's final draft

    The Road to Quantum Computational Supremacy

    Full text link
    We present an idiosyncratic view of the race for quantum computational supremacy. Google's approach and IBM challenge are examined. An unexpected side-effect of the race is the significant progress in designing fast classical algorithms. Quantum supremacy, if achieved, won't make classical computing obsolete.Comment: 15 pages, 1 figur

    Classical simulation of commuting quantum computations implies collapse of the polynomial hierarchy

    Full text link
    We consider quantum computations comprising only commuting gates, known as IQP computations, and provide compelling evidence that the task of sampling their output probability distributions is unlikely to be achievable by any efficient classical means. More specifically we introduce the class post-IQP of languages decided with bounded error by uniform families of IQP circuits with post-selection, and prove first that post-IQP equals the classical class PP. Using this result we show that if the output distributions of uniform IQP circuit families could be classically efficiently sampled, even up to 41% multiplicative error in the probabilities, then the infinite tower of classical complexity classes known as the polynomial hierarchy, would collapse to its third level. We mention some further results on the classical simulation properties of IQP circuit families, in particular showing that if the output distribution results from measurements on only O(log n) lines then it may in fact be classically efficiently sampled.Comment: 13 page

    A Near-Optimal Depth-Hierarchy Theorem for Small-Depth Multilinear Circuits

    Full text link
    We study the size blow-up that is necessary to convert an algebraic circuit of product-depth Δ+1\Delta+1 to one of product-depth Δ\Delta in the multilinear setting. We show that for every positive Δ=Δ(n)=o(log⁥n/log⁥log⁥n),\Delta = \Delta(n) = o(\log n/\log \log n), there is an explicit multilinear polynomial P(Δ)P^{(\Delta)} on nn variables that can be computed by a multilinear formula of product-depth Δ+1\Delta+1 and size O(n)O(n), but not by any multilinear circuit of product-depth Δ\Delta and size less than exp⁥(nΩ(1/Δ))\exp(n^{\Omega(1/\Delta)}). This result is tight up to the constant implicit in the double exponent for all Δ=o(log⁥n/log⁥log⁥n).\Delta = o(\log n/\log \log n). This strengthens a result of Raz and Yehudayoff (Computational Complexity 2009) who prove a quasipolynomial separation for constant-depth multilinear circuits, and a result of Kayal, Nair and Saha (STACS 2016) who give an exponential separation in the case Δ=1.\Delta = 1. Our separating examples may be viewed as algebraic analogues of variants of the Graph Reachability problem studied by Chen, Oliveira, Servedio and Tan (STOC 2016), who used them to prove lower bounds for constant-depth Boolean circuits
    • 

    corecore