Progress on Polynomial Identity Testing - II
We survey the area of algebraic complexity theory, focusing on
the problem of polynomial identity testing (PIT). We discuss the key ideas that
have gone into the results of the last few years. Comment: 17 pages, 1 figure, surve
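For readers new to PIT, the classical randomized test can be sketched as follows. This is a minimal illustration based on the Schwartz-Zippel lemma, not an algorithm taken from the survey. The modulus `P`, the function name `is_probably_zero`, and the two example polynomials are assumptions made for the sketch.

```python
import random

# Assumed prime modulus; any field much larger than the degree bound works.
P = 2**61 - 1

def is_probably_zero(poly, num_vars, degree, trials=20, rng=None):
    """Randomized identity test for a black-box polynomial over Z_P.

    poly     : callable taking a list of num_vars field elements.
    degree   : upper bound on the total degree (only enters the error bound).
    A nonzero polynomial vanishes at a uniformly random point with
    probability at most degree / P (Schwartz-Zippel lemma).
    """
    rng = rng or random.Random()
    for _ in range(trials):
        point = [rng.randrange(P) for _ in range(num_vars)]
        if poly(point) % P != 0:
            return False  # witness found: certainly not the zero polynomial
    return True  # zero at every sample: identically zero with high probability

def identity(v):
    # (x + y)^2 - (x^2 + 2xy + y^2), which is identically zero
    x, y = v
    return ((x + y) ** 2 - (x * x + 2 * x * y + y * y)) % P

def non_identity(v):
    # (x + y)^2 - (x^2 + y^2) = 2xy, which is not identically zero
    x, y = v
    return ((x + y) ** 2 - (x * x + y * y)) % P
```

The test has one-sided error: a `False` answer is always correct, while a `True` answer is wrong with probability at most `(degree / P) ** trials`. The derandomization of tests like this one is the central open question in PIT.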
Strong ETH Breaks With Merlin and Arthur: Short Non-Interactive Proofs of Batch Evaluation
We present an efficient proof system for Multipoint Arithmetic Circuit
Evaluation: for every arithmetic circuit of size and
degree over a field , and any inputs ,
the Prover sends the Verifier the values and a proof of length, and
the Verifier tosses coins and can check the proof in about time, with probability of error less than .
For small degree , this "Merlin-Arthur" proof system (a.k.a. MA-proof
system) runs in nearly-linear time, and has many applications. For example, we
obtain MA-proof systems that run in time (for various ) for the
Permanent, Circuit-SAT for all sublinear-depth circuits, counting
Hamiltonian cycles, and infeasibility of - linear programs. In general,
the value of any polynomial in Valiant's class can be certified
faster than "exhaustive summation" over all possible assignments. These results
strongly refute a Merlin-Arthur Strong ETH and Arthur-Merlin Strong ETH posed
by Russell Impagliazzo and others.
We also give a three-round (AMA) proof system for quantified Boolean formulas
running in time, nearly-linear time MA-proof systems for
counting orthogonal vectors in a collection and finding Closest Pairs in the
Hamming metric, and a MA-proof system running in -time for
counting -cliques in graphs.
We point to some potential future directions for refuting the
Nondeterministic Strong ETH. Comment: 17 page
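The flavour of a batch-evaluation proof can be sketched with a standard curve-restriction check. This is an illustrative toy under stated assumptions, not necessarily the paper's construction. The field modulus `P`, the helper names, and the choice to send the proof as evaluations rather than coefficients are all assumptions. The naive quadratic-time interpolation used here does not reach the nearly-linear running times claimed in the abstract.

```python
import random

P = 2**31 - 1  # assumed prime field size

def lagrange_eval(xs, ys, t):
    """Evaluate at t the unique polynomial of degree < len(xs) through (xs[i], ys[i]) mod P."""
    total = 0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for j, xj in enumerate(xs):
            if j != i:
                num = num * (t - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def curve(inputs, t):
    """Point at parameter t on the low-degree curve that passes through inputs[i] at t = i."""
    xs = list(range(len(inputs)))
    n = len(inputs[0])
    return [lagrange_eval(xs, [a[k] for a in inputs], t) for k in range(n)]

def prover(circuit, inputs, degree):
    """Honest prover: R(t) = circuit(curve(t)) at enough points to determine R."""
    m = degree * (len(inputs) - 1) + 1  # deg R <= degree * (K - 1)
    return [circuit(curve(inputs, t)) % P for t in range(m)]

def verifier(circuit, inputs, proof, rng):
    """Spot-check the claimed R at one random point, then read off the K values."""
    ts = list(range(len(proof)))
    r = rng.randrange(P)
    if lagrange_eval(ts, proof, r) != circuit(curve(inputs, r)) % P:
        return None  # reject the proof
    return proof[:len(inputs)]  # R(i) = circuit(inputs[i]) for i < K

def example_circuit(v):
    # Hypothetical degree-2 circuit: x0 * x1 + x2
    return (v[0] * v[1] + v[2]) % P
```

The verifier evaluates the circuit only once, at a random point on the curve. A cheating proof encodes a different polynomial of degree below `len(proof)`, which agrees with the true one at fewer than `len(proof)` points, so it is caught except with probability below `len(proof) / P`.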
Proof Complexity of Systems of (Non-Deterministic) Decision Trees and Branching Programs
This paper studies propositional proof systems in which lines are sequents of decision trees or branching programs, deterministic or non-deterministic. Decision trees (DTs) are represented by a natural term syntax, inducing the system LDT, and non-determinism is modelled by including disjunction, ∨, as primitive (system LNDT). Branching programs generalise DTs to dag-like structures and are duly handled by extension variables in our setting, as is common in proof complexity (systems eLDT and eLNDT).
Deterministic and non-deterministic branching programs are natural nonuniform analogues of log-space (L) and nondeterministic log-space (NL), respectively. Thus eLDT and eLNDT serve as natural systems of reasoning corresponding to L and NL, respectively.
The main results of the paper are simulation and non-simulation results for tree-like and dag-like proofs in LDT, LNDT, eLDT and eLNDT. We also compare them with Frege systems, constant-depth Frege systems and extended Frege systems.
Arithmetic on a Distributed-Memory Quantum Multicomputer
We evaluate the performance of quantum arithmetic algorithms run on a
distributed quantum computer (a quantum multicomputer). We vary the node
capacity and I/O capabilities, and the network topology. The tradeoff of
choosing between gates executed remotely, through "teleported gates" on
entangled pairs of qubits (telegate), versus exchanging the relevant qubits via
quantum teleportation, then executing the algorithm using local gates
(teledata), is examined. We show that the teledata approach performs better,
and that carry-ripple adders perform well when the teleportation block is
decomposed so that the key quantum operations can be parallelized. A node size
of only a few logical qubits performs adequately provided that the nodes have
two transceiver qubits. A linear network topology performs acceptably for a
broad range of system sizes and performance parameters. We therefore recommend
pursuing small, high-I/O bandwidth nodes and a simple network. Such a machine
will run Shor's algorithm for factoring large numbers efficiently. Comment: 24 pages, 10 figures, ACM transactions format. Extended version of
Int. Symp. on Comp. Architecture (ISCA) paper; v2, correct one circuit error,
numerous small changes for clarity, add reference
Vector processing-aware advanced clock-gating techniques for low-power fused multiply-add
The need for power efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a retailoring for the mobile market that they are entering now. Floating-point (FP) fused multiply-add (FMA), being a functional unit with high power consumption, deserves special attention. Although clock gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector FMA units (VFUs). These techniques ensure power savings without jeopardizing the timing. We evaluate the proposed techniques using both synthetic and "real-world" application-based benchmarking. Using vector masking and vector multilane-aware clock gating, we report power reductions of up to 52%, assuming active VFU operating at the peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector FP instructions. Finally, when evaluating all techniques together, using "real-world" benchmarking, the power reductions are up to 80%. Additionally, in accordance with processor design trends, we perform this research in a fully parameterizable and automated fashion. The research leading to these results has received funding from the RoMoL ERC Advanced Grant GA 321253 and is supported in part by the European Union (FEDER funds) under contract TTIN2015-65316-P.
The work of I. Ratkovic was supported by an FPU research grant from the Spanish MECD. Peer Reviewed. Postprint (author's final draft
The Road to Quantum Computational Supremacy
We present an idiosyncratic view of the race for quantum computational
supremacy. Google's approach and IBM challenge are examined. An unexpected
side-effect of the race is the significant progress in designing fast classical
algorithms. Quantum supremacy, if achieved, won't make classical computing
obsolete. Comment: 15 pages, 1 figur
Classical simulation of commuting quantum computations implies collapse of the polynomial hierarchy
We consider quantum computations comprising only commuting gates, known as
IQP computations, and provide compelling evidence that the task of sampling
their output probability distributions is unlikely to be achievable by any
efficient classical means. More specifically we introduce the class post-IQP of
languages decided with bounded error by uniform families of IQP circuits with
post-selection, and prove first that post-IQP equals the classical class PP.
Using this result we show that if the output distributions of uniform IQP
circuit families could be classically efficiently sampled, even up to 41%
multiplicative error in the probabilities, then the infinite tower of classical
complexity classes known as the polynomial hierarchy would collapse to its
third level. We mention some further results on the classical simulation
properties of IQP circuit families, in particular showing that if the output
distribution results from measurements on only O(log n) lines then it may in
fact be classically efficiently sampled. Comment: 13 page
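To make the object of study concrete, the output distribution of a small IQP circuit can be computed by brute force. The sketch below assumes the common presentation of an IQP circuit as a layer of Hadamards, a diagonal unitary D with D|x> = exp(i*phase(x))|x>, and a second layer of Hadamards, applied to |0...0>. The function names and the example phase function are hypothetical. The running time is exponential in the number of qubits, so this says nothing against the hardness result above.

```python
import cmath
import math
from itertools import product

def iqp_distribution(n, phase):
    """Output probabilities of H^n D H^n |0...0>, where D|x> = exp(i*phase(x))|x>.

    The amplitude of outcome y is 2^{-n} * sum_x (-1)^{x.y} * exp(i*phase(x)).
    """
    bitstrings = list(product((0, 1), repeat=n))
    dist = {}
    for y in bitstrings:
        amp = 0j
        for x in bitstrings:
            parity = sum(a & b for a, b in zip(x, y)) % 2
            amp += (-1 if parity else 1) * cmath.exp(1j * phase(x))
        amp /= 2 ** n
        dist[y] = abs(amp) ** 2
    return dist

def example_phase(x):
    # Hypothetical choice: pi/4 on each pair of qubits, pi/8 on each single qubit.
    pairs = x[0] * x[1] + x[1] * x[2] + x[0] * x[2]
    return math.pi / 4 * pairs + math.pi / 8 * sum(x)

def trivial_phase(x):
    # D is the identity, so the two Hadamard layers cancel.
    return 0.0
```

With `trivial_phase` all probability mass sits on the all-zeros outcome, which gives a quick sanity check. For any phase function the probabilities sum to 1 because the circuit is unitary.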
A Near-Optimal Depth-Hierarchy Theorem for Small-Depth Multilinear Circuits
We study the size blow-up that is necessary to convert an algebraic circuit
of product-depth to one of product-depth in the multilinear
setting.
We show that for every positive
there is an explicit multilinear polynomial on variables
that can be computed by a multilinear formula of product-depth and
size , but not by any multilinear circuit of product-depth and
size less than . This result is tight up to the
constant implicit in the double exponent for all
This strengthens a result of Raz and Yehudayoff (Computational Complexity
2009) who prove a quasipolynomial separation for constant-depth multilinear
circuits, and a result of Kayal, Nair and Saha (STACS 2016) who give an
exponential separation in the case
Our separating examples may be viewed as algebraic analogues of variants of
the Graph Reachability problem studied by Chen, Oliveira, Servedio and Tan
(STOC 2016), who used them to prove lower bounds for constant-depth Boolean
circuits.