5,172 research outputs found

    On fast multiplication of a matrix by its transpose

    Get PDF
    We present a non-commutative algorithm for the multiplication of a 2x2-block-matrix by its transpose using 5 block products (3 recursive calls and 2 general products) over C or any finite field.We use geometric considerations on the space of bilinear forms describing 2x2 matrix products to obtain this algorithm and we show how to reduce the number of involved additions.The resulting algorithm for arbitrary dimensions is a reduction of multiplication of a matrix by its transpose to general matrix product, improving by a constant factor previously known reductions.Finally we propose schedules with low memory footprint that support a fast and memory efficient practical implementation over a finite field.To conclude, we show how to use our result in LDLT factorization.Comment: ISSAC 2020, Jul 2020, Kalamata, Greec

    Quantum Speedup by Quantum Annealing

    Full text link
    We study the glued-trees problem of Childs et. al. in the adiabatic model of quantum computing and provide an annealing schedule to solve an oracular problem exponentially faster than classically possible. The Hamiltonians involved in the quantum annealing do not suffer from the so-called sign problem. Unlike the typical scenario, our schedule is efficient even though the minimum energy gap of the Hamiltonians is exponentially small in the problem size. We discuss generalizations based on initial-state randomization to avoid some slowdowns in adiabatic quantum computing due to small gaps.Comment: 7 page

    Scalable Emulation of Sign-Problem−-Free Hamiltonians with Room Temperature p-bits

    Full text link
    The growing field of quantum computing is based on the concept of a q-bit which is a delicate superposition of 0 and 1, requiring cryogenic temperatures for its physical realization along with challenging coherent coupling techniques for entangling them. By contrast, a probabilistic bit or a p-bit is a robust classical entity that fluctuates between 0 and 1, and can be implemented at room temperature using present-day technology. Here, we show that a probabilistic coprocessor built out of room temperature p-bits can be used to accelerate simulations of a special class of quantum many-body systems that are sign-problem−-free or stoquastic, leveraging the well-known Suzuki-Trotter decomposition that maps a dd-dimensional quantum many body Hamiltonian to a dd+1-dimensional classical Hamiltonian. This mapping allows an efficient emulation of a quantum system by classical computers and is commonly used in software to perform Quantum Monte Carlo (QMC) algorithms. By contrast, we show that a compact, embedded MTJ-based coprocessor can serve as a highly efficient hardware-accelerator for such QMC algorithms providing several orders of magnitude improvement in speed compared to optimized CPU implementations. Using realistic device-level SPICE simulations we demonstrate that the correct quantum correlations can be obtained using a classical p-circuit built with existing technology and operating at room temperature. The proposed coprocessor can serve as a tool to study stoquastic quantum many-body systems, overcoming challenges associated with physical quantum annealers.Comment: Fixed minor typos and expanded Appendi

    On fast multiplication of a matrix by its transpose

    Get PDF
    We present a non-commutative algorithm for the multiplication of a block-matrix by its transpose over C or any finite field using 5 recursive products. We use geometric considerations on the space of bilinear forms describing 2×2 matrix products to obtain this algorithm and we show how to reduce the number of involved additions. The resulting algorithm for arbitrary dimensions is a reduction of multiplication of a matrix by its transpose to general matrix product, improving by a constant factor previously known reductions. Finally we propose space and time efficient schedules that enable us to provide fast practical implementations for higher-dimensional matrix products

    0.5 Petabyte Simulation of a 45-Qubit Quantum Circuit

    Full text link
    Near-term quantum computers will soon reach sizes that are challenging to directly simulate, even when employing the most powerful supercomputers. Yet, the ability to simulate these early devices using classical computers is crucial for calibration, validation, and benchmarking. In order to make use of the full potential of systems featuring multi- and many-core processors, we use automatic code generation and optimization of compute kernels, which also enables performance portability. We apply a scheduling algorithm to quantum supremacy circuits in order to reduce the required communication and simulate a 45-qubit circuit on the Cori II supercomputer using 8,192 nodes and 0.5 petabytes of memory. To our knowledge, this constitutes the largest quantum circuit simulation to this date. Our highly-tuned kernels in combination with the reduced communication requirements allow an improvement in time-to-solution over state-of-the-art simulations by more than an order of magnitude at every scale

    Quantum Algorithms for Finding Constant-sized Sub-hypergraphs

    Full text link
    We develop a general framework to construct quantum algorithms that detect if a 33-uniform hypergraph given as input contains a sub-hypergraph isomorphic to a prespecified constant-sized hypergraph. This framework is based on the concept of nested quantum walks recently proposed by Jeffery, Kothari and Magniez [SODA'13], and extends the methodology designed by Lee, Magniez and Santha [SODA'13] for similar problems over graphs. As applications, we obtain a quantum algorithm for finding a 44-clique in a 33-uniform hypergraph on nn vertices with query complexity O(n1.883)O(n^{1.883}), and a quantum algorithm for determining if a ternary operator over a set of size nn is associative with query complexity O(n2.113)O(n^{2.113}).Comment: 18 pages; v2: changed title, added more backgrounds to the introduction, added another applicatio
    • …
    corecore