    Labyrinth: Compiling Imperative Control Flow to Parallel Dataflows

    Parallel dataflow systems have become a standard technology for large-scale data analytics. Complex data analysis programs in areas such as machine learning and graph analytics often involve control flow, i.e., iterations and branching. Therefore, systems for advanced analytics should include control flow constructs that are efficient and easy to use. A natural approach is to provide imperative control flow constructs similar to those of mainstream programming languages: while-loops, if-statements, and mutable variables, whose values can change between iteration steps. However, current parallel dataflow systems execute programs written using imperative control flow constructs by launching a separate dataflow job after every control flow decision (e.g., for every step of a loop). The performance of this approach is suboptimal, because (a) launching a dataflow job incurs scheduling overhead; and (b) it prevents certain optimizations across iteration steps. In this paper, we introduce Labyrinth, a method to compile programs written using imperative control flow constructs to a single dataflow job, which executes the whole program, including all iteration steps. This way, we achieve both efficiency and ease of use. We also conduct an experimental evaluation, which shows that Labyrinth has orders of magnitude smaller per-iteration-step overhead than launching new dataflow jobs, and also allows for significant optimizations across iteration steps.

    Compiling universal quantum circuits

    We propose a method of compiling that makes it possible to identify quantum circuits able to simulate arbitrary n-qubit unitary operations via the adjustment of angles in the single-qubit gates therein. The method of compiling itself extends older quantum control techniques and stays computationally tractable for several qubits. As an application, we identify compiling universal circuits for 3, 4 and 5 qubits consisting of 16, 64 and 256 CNOTs respectively. Comment: 5 pages, 4 figures. The application to QFT circuits has been eliminated, short circuits composed by CNOTs are presented instead.

    CodeTrolley: Hardware-Assisted Control Flow Obfuscation

    Many cybersecurity attacks rely on analyzing a binary executable to find exploitable sections of code. Code obfuscation is used to prevent attackers from reverse engineering these executables. In this work, we focus on control flow obfuscation - a technique that prevents attackers from statically determining which code segments are original, and which segments are added in to confuse attackers. We propose a RISC-V-based hardware-assisted deobfuscation technique that deobfuscates code at runtime based on a secret safely stored in hardware, along with an LLVM compiler extension for obfuscating binaries. Unlike conventional tools, our work does not rely on compiling hard-to-reverse-engineer code, but on securing a secret key. As such, it can be seen as a lightweight alternative to on-the-fly binary decryption. Comment: 2019 Boston Area Architecture Workshop (BARC'19)

    Automatic Full Compilation of Julia Programs and ML Models to Cloud TPUs

    Google's Cloud TPUs are a promising new hardware architecture for machine learning workloads. They have powered many of Google's milestone machine learning achievements in recent years. Google has now made TPUs available for general use on their cloud platform and as of very recently has opened them up further to allow use by non-TensorFlow frontends. We describe a method and implementation for offloading suitable sections of Julia programs to TPUs via this new API and the Google XLA compiler. Our method is able to completely fuse the forward pass of a VGG19 model expressed as a Julia program into a single TPU executable to be offloaded to the device. Our method composes well with existing compiler-based automatic differentiation techniques on Julia code, and we are thus able to also automatically obtain the VGG19 backwards pass and similarly offload it to the TPU. Targeting TPUs using our compiler, we are able to evaluate the VGG19 forward pass on a batch of 100 images in 0.23s, which compares favorably to the 52.4s required for the original model on the CPU. Our implementation is less than 1000 lines of Julia, with no TPU-specific changes made to the core Julia compiler or any other Julia packages. Comment: Submitted to SysML 201

    Lowering IrGL to CUDA

    The IrGL intermediate representation is an explicitly parallel representation for irregular programs that targets GPUs. In this report, we describe IrGL constructs, examples of their use, and how IrGL is compiled to CUDA by the Galois GPU compiler.

    Quantum Compiling with Approximation of Multiplexors

    A quantum compiling algorithm is an algorithm for decomposing ("compiling") an arbitrary unitary matrix into a sequence of elementary operations (SEO). Suppose $U_{in}$ is an $N_B$-bit unstructured unitary matrix (a unitary matrix with no special symmetries) that we wish to compile. For $N_B>10$, expressing $U_{in}$ as a SEO requires more than a million CNOTs. This calls for a method for finding a unitary matrix that: (1) approximates $U_{in}$ well, and (2) is expressible with fewer CNOTs than $U_{in}$. The purpose of this paper is to propose one such approximation method. Various quantum compiling algorithms have been proposed in the literature that decompose an arbitrary unitary matrix into a sequence of U(2)-multiplexors, each of which is then decomposed into a SEO. Our strategy for approximating $U_{in}$ is to approximate these intermediate U(2)-multiplexors. In this paper, we will show how one can approximate a U(2)-multiplexor by another U(2)-multiplexor that is expressible with fewer CNOTs. Comment: Ver1: 18 pages (files: 1 .tex, 1 .sty, 7 .eps); Ver2: 26 pages (files: 1 .tex, 1 .sty, 7 .eps, 7 .m). Ver2 = Ver1 + new material, including 7 Octave/Matlab m-files
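
    The "more than a million CNOTs" figure can be sanity-checked with a short calculation. The sketch below assumes the parameter-counting lower bound from the circuit-synthesis literature, ceil((4^n - 3n - 1) / 4) CNOTs for a generic n-qubit unitary; the abstract itself does not say which count it relies on, and the function name is purely illustrative.

```python
def cnot_lower_bound(n_qubits: int) -> int:
    """Parameter-counting lower bound on CNOTs for a generic n-qubit unitary.

    Assumed formula (not taken from the abstract): ceil((4^n - 3n - 1) / 4).
    Integer arithmetic is used so that large n stays exact.
    """
    numerator = 4 ** n_qubits - 3 * n_qubits - 1
    return -(-numerator // 4)  # ceiling division on integers


if __name__ == "__main__":
    # Print the bound on both sides of the n = 10 threshold mentioned above.
    for n in (2, 3, 10, 11, 12):
        print(f"n = {n:2d}: at least {cnot_lower_bound(n):,} CNOTs")
```

    Under this assumption the bound first exceeds one million at 11 qubits, which is consistent with the threshold quoted in the abstract.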

    Two Procedures for Compiling Influence Diagrams

    Two algorithms are presented for "compiling" influence diagrams into a set of simple decision rules. These decision rules define simple-to-execute, complete, consistent, and near-optimal decision procedures. These compilation algorithms can be used to derive decision procedures for human teams solving time-constrained decision problems. Comment: Appears in Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence (UAI1993)

    Noise tailoring for scalable quantum computation via randomized compiling

    Quantum computers are poised to radically outperform their classical counterparts by manipulating coherent quantum systems. A realistic quantum computer will experience errors due to the environment and imperfect control. When these errors are even partially coherent, they present a major obstacle to achieving robust computation. Here, we propose a method for introducing independent random single-qubit gates into the logical circuit in such a way that the effective logical circuit remains unchanged. We prove that this randomization tailors the noise into stochastic Pauli errors, leading to dramatic reductions in worst-case and cumulative error rates, while introducing little or no experimental overhead. Moreover, we prove that our technique is robust to variation in the errors over the gate sets and numerically illustrate the dramatic reductions in worst-case error that are achievable. Given such tailored noise, gates with significantly lower fidelity are sufficient to achieve fault-tolerant quantum computation, and, importantly, the worst-case error rate of the tailored noise can be directly and efficiently measured through randomized benchmarking experiments. Remarkably, our method enables the realization of fault-tolerant quantum computation under the error rates observed in recent experiments. Comment: 7+6 pages, comments welcome
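
    The idea of inserting random single-qubit gates without changing the logical circuit can be illustrated on a single gate. The sketch below is a simplified assumption, not the paper's full protocol: it picks a random two-qubit Pauli P before a CNOT and computes the correction C = CNOT · P† · CNOT† to apply afterwards. Because CNOT is a Clifford gate, C is again a tensor product of single-qubit Paulis (up to a phase), and C · CNOT · P equals CNOT. All function names are illustrative, and matrices are plain nested lists so that no external library is needed.

```python
import random
from itertools import product

# Single-qubit Pauli matrices as nested lists.
PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    # Rows are indexed by (i, k), columns by (j, l).
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def dagger(m):
    return [[complex(m[j][i]).conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def find_pauli(m):
    """Return (phase, label) with m == phase * (P_a tensor P_b), or None."""
    for a, b in product("IXYZ", repeat=2):
        candidate = kron(PAULIS[a], PAULIS[b])
        for phase in (1, -1, 1j, -1j):
            if close(m, [[phase * x for x in row] for row in candidate]):
                return phase, a + b
    return None


def twirl_cnot(label):
    """Twirl CNOT with the Pauli named by `label` (e.g. "XZ").

    Returns (correction, compiled), where compiled = correction . CNOT . P.
    """
    p = kron(PAULIS[label[0]], PAULIS[label[1]])
    correction = matmul(matmul(CNOT, dagger(p)), dagger(CNOT))
    compiled = matmul(correction, matmul(CNOT, p))
    return correction, compiled


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(4):
        label = rng.choice("IXYZ") + rng.choice("IXYZ")
        correction, compiled = twirl_cnot(label)
        phase, corr_label = find_pauli(correction)
        print(f"before: {label}  after: {corr_label} (phase {phase})  "
              f"circuit unchanged: {close(compiled, CNOT)}")
```

    Each run of the loop corresponds to one randomization of the same logical gate; averaging over many such randomizations is what turns coherent errors into stochastic Pauli errors in the setting the abstract describes.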

    Towards a Study of Meta-Predicate Semantics

    We describe and compare design choices for meta-predicate semantics, as found in representative Prolog module systems and in Logtalk. We look at the consequences of these design choices from a pragmatic perspective, discussing explicit qualification semantics, computational reflection support, expressiveness of meta-predicate declarations, safety of meta-predicate definitions, portability of meta-predicate definitions, and meta-predicate performance. Our aim is to provide useful insight for debating meta-predicate semantics and portability issues based on actual implementations and common usage patterns. Comment: Online proceedings of the Joint Workshop on Implementation of Constraint Logic Programming Systems and Logic-based Methods in Programming Environments (CICLOPS-WLPE 2010), Edinburgh, Scotland, U.K., July 15, 201

    Knuth-Bendix Completion Algorithm and Shuffle Algebras For Compiling NISQ Circuits

    Compiling quantum circuits lends itself to an elegant formulation in the language of rewriting systems on noncommutative polynomial algebras $\mathbb{Q}\langle X\rangle$. The alphabet $X$ is the set of the allowed hardware 2-qubit gates. The set of gates that we wish to implement from $X$ are elements of a free monoid $X^*$ (obtained by concatenating the letters of $X$). In this setting, compiling an idealized gate is equivalent to computing its unique normal form with respect to the rewriting system $\mathcal{R}\subset\mathbb{Q}\langle X\rangle$ that encodes the hardware constraints and capabilities. This system $\mathcal{R}$ is generated using two different mechanisms: 1) using the Knuth-Bendix completion algorithm on the algebra $\mathbb{Q}\langle X\rangle$, and 2) using the Buchberger algorithm on the shuffle algebra $\mathbb{Q}[L]$ where $L$ is the set of Lyndon words on $X$. Comment: Key words: Quantum circuit compilation, NISQ computers, rewriting systems, Knuth-Bendix, Shuffle algebra, Lyndon words, Buchberger algorithm
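
    Computing a normal form with respect to a rewriting system can be illustrated with a toy example. The sketch below is a deliberate simplification: it rewrites words in a free monoid rather than polynomials in the algebra, and it starts from a hand-written rule set that is already terminating and confluent instead of running Knuth-Bendix completion. The alphabet is hypothetical: "a" and "b" stand for two controlled-Z gates acting on qubit pairs (0,1) and (1,2), which are self-inverse and commute with each other.

```python
# Hypothetical rewriting rules for two commuting, self-inverse gates.
# "a" = CZ on qubits (0, 1), "b" = CZ on qubits (1, 2)  (illustrative choice).
RULES = [
    ("aa", ""),    # a is its own inverse
    ("bb", ""),    # b is its own inverse
    ("ba", "ab"),  # a and b commute; order letters alphabetically
]


def normal_form(word: str, rules=RULES) -> str:
    """Rewrite `word` until no rule applies.

    Every rule replaces a word by one that is shorter, or of equal length
    and alphabetically smaller, so the loop always terminates.
    """
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in word:
                word = word.replace(lhs, rhs, 1)  # rewrite one occurrence
                changed = True
                break
    return word


if __name__ == "__main__":
    for circuit in ("abba", "babab", "ba", "abab"):
        print(f"{circuit!r} -> {normal_form(circuit)!r}")
```

    For this rule set the only normal forms are the empty word, "a", "b" and "ab", so two gate sequences implement the same operation exactly when they reduce to the same word. Completion procedures such as Knuth-Bendix are what produce a rule set with this uniqueness property when the starting rules lack it.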