91,872 research outputs found
Labyrinth: Compiling Imperative Control Flow to Parallel Dataflows
Parallel dataflow systems have become a standard technology for large-scale
data analytics. Complex data analysis programs in areas such as machine
learning and graph analytics often involve control flow, i.e., iterations and
branching. Therefore, systems for advanced analytics should include control
flow constructs that are efficient and easy to use. A natural approach is to
provide imperative control flow constructs similar to those of mainstream
programming languages: while-loops, if-statements, and mutable variables, whose
values can change between iteration steps.
However, current parallel dataflow systems execute programs written using
imperative control flow constructs by launching a separate dataflow job after
every control flow decision (e.g., for every step of a loop). The performance
of this approach is suboptimal, because (a) launching a dataflow job incurs
scheduling overhead; and (b) it prevents certain optimizations across iteration
steps.
In this paper, we introduce Labyrinth, a method to compile programs written
using imperative control flow constructs to a single dataflow job, which
executes the whole program, including all iteration steps. This way, we achieve
both efficiency and ease of use. We also conduct an experimental evaluation,
which shows that Labyrinth has orders of magnitude smaller per-iteration-step
overhead than launching new dataflow jobs, and also allows for significant
optimizations across iteration steps.
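The overhead argument in the abstract above can be illustrated with a toy model. This is a minimal sketch only, not Labyrinth's compiler or API: the `Scheduler` class and the functions `step`, `run_per_step`, and `run_single_job` are hypothetical names invented here. It counts how many jobs are launched when each loop step is a separate job versus when the whole loop runs inside one job.

```python
# Toy model only: every name here is an assumption made for illustration,
# not part of Labyrinth or of any real dataflow system.

class Scheduler:
    """Counts how many dataflow jobs are launched."""

    def __init__(self):
        self.launches = 0

    def launch(self, job, *args):
        self.launches += 1  # each launch would incur scheduling overhead
        return job(*args)


def step(values):
    """One iteration step: a stand-in for a parallel map over the data."""
    return [v // 2 for v in values]


def run_per_step(scheduler, values):
    """Driver-side loop: every control flow decision launches a new job."""
    while max(values) > 1:
        values = scheduler.launch(step, values)
    return values


def run_single_job(scheduler, values):
    """The whole loop, including its exit condition, runs inside one job."""
    def whole_program(vals):
        while max(vals) > 1:
            vals = step(vals)
        return vals
    return scheduler.launch(whole_program, values)


per_step, single = Scheduler(), Scheduler()
result_a = run_per_step(per_step, [64, 10, 3])
result_b = run_single_job(single, [64, 10, 3])
print(per_step.launches, single.launches)
```

Both variants compute the same result; only the number of launches differs, which is the cost the abstract attributes to the job-per-step approach.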
Compiling universal quantum circuits
We propose a compilation method that makes it possible to identify quantum circuits
able to simulate arbitrary -qubit unitary operations via the adjustment of
angles in single-qubit gates therein. The method of compiling itself extends
older quantum control techniques and stays computationally tractable for
several qubits. As an application we identify compiling universal circuits for
, and qubits consisting of , and CNOTs respectively.
Comment: 5 pages, 4 figures. The application to QFT circuits has been
eliminated; short circuits composed of CNOTs are presented instead
CodeTrolley: Hardware-Assisted Control Flow Obfuscation
Many cybersecurity attacks rely on analyzing a binary executable to find
exploitable sections of code. Code obfuscation is used to prevent attackers
from reverse engineering these executables. In this work, we focus on control
flow obfuscation - a technique that prevents attackers from statically
determining which code segments are original, and which segments are added in
to confuse attackers. We propose a RISC-V-based hardware-assisted deobfuscation
technique that deobfuscates code at runtime based on a secret safely stored in
hardware, along with an LLVM compiler extension for obfuscating binaries.
Unlike conventional tools, our work does not rely on compiling
hard-to-reverse-engineer code, but on securing a secret key. As such, it can be
seen as a lightweight alternative to on-the-fly binary decryption.
Comment: 2019 Boston Area Architecture Workshop (BARC'19)
Automatic Full Compilation of Julia Programs and ML Models to Cloud TPUs
Google's Cloud TPUs are a promising new hardware architecture for machine
learning workloads. They have powered many of Google's milestone machine
learning achievements in recent years. Google has now made TPUs available for
general use on its cloud platform and has very recently opened them up
further to allow use by non-TensorFlow frontends. We describe a method and
implementation for offloading suitable sections of Julia programs to TPUs via
this new API and the Google XLA compiler. Our method is able to completely fuse
the forward pass of a VGG19 model expressed as a Julia program into a single
TPU executable to be offloaded to the device. Our method composes well with
existing compiler-based automatic differentiation techniques on Julia code, and
we are thus able to also automatically obtain the VGG19 backward pass and
similarly offload it to the TPU. Targeting TPUs using our compiler, we are able
to evaluate the VGG19 forward pass on a batch of 100 images in 0.23s, which
compares favorably to the 52.4s required for the original model on the CPU. Our
implementation is less than 1000 lines of Julia, with no TPU specific changes
made to the core Julia compiler or any other Julia packages.
Comment: Submitted to SysML 201
Lowering IrGL to CUDA
The IrGL intermediate representation is an explicitly parallel representation
for irregular programs that targets GPUs. In this report, we describe IrGL
constructs, examples of their use and how IrGL is compiled to CUDA by the
Galois GPU compiler.
Quantum Compiling with Approximation of Multiplexors
A quantum compiling algorithm is an algorithm for decomposing ("compiling")
an arbitrary unitary matrix into a sequence of elementary operations (SEO).
Suppose is an \nb-bit unstructured unitary matrix (a unitary matrix
with no special symmetries) that we wish to compile. For \nb>10, expressing
as a SEO requires more than a million CNOTs. This calls for a method
for finding a unitary matrix that: (1) approximates well, and (2) is
expressible with fewer CNOTs than . The purpose of this paper is to
propose one such approximation method. Various quantum compiling algorithms
have been proposed in the literature that decompose an arbitrary unitary matrix
into a sequence of U(2)-multiplexors, each of which is then decomposed into a
SEO. Our strategy for approximating is to approximate these
intermediate U(2)-multiplexors. In this paper, we will show how one can
approximate a U(2)-multiplexor by another U(2)-multiplexor that is expressible
with fewer CNOTs.
Comment: Ver1: 18 pages (files: 1 .tex, 1 .sty, 7 .eps); Ver2: 26 pages (files:
1 .tex, 1 .sty, 7 .eps, 7 .m). Ver2 = Ver1 + new material, including 7
Octave/Matlab m-files
Two Procedures for Compiling Influence Diagrams
Two algorithms are presented for "compiling" influence diagrams into a set of
simple decision rules. These decision rules define simple-to-execute, complete,
consistent, and near-optimal decision procedures. These compilation algorithms
can be used to derive decision procedures for human teams solving
time-constrained decision problems.
Comment: Appears in Proceedings of the Ninth Conference on Uncertainty in
Artificial Intelligence (UAI1993)
Noise tailoring for scalable quantum computation via randomized compiling
Quantum computers are poised to radically outperform their classical
counterparts by manipulating coherent quantum systems. A realistic quantum
computer will experience errors due to the environment and imperfect control.
When these errors are even partially coherent, they present a major obstacle to
achieving robust computation. Here, we propose a method for introducing
independent random single-qubit gates into the logical circuit in such a way
that the effective logical circuit remains unchanged. We prove that this
randomization tailors the noise into stochastic Pauli errors, leading to
dramatic reductions in worst-case and cumulative error rates, while introducing
little or no experimental overhead. Moreover, we prove that our technique is
robust to variation in the errors over the gate sets and numerically illustrate
the dramatic reductions in worst-case error that are achievable. Given such
tailored noise, gates with significantly lower fidelity are sufficient to
achieve fault-tolerant quantum computation, and, importantly, the worst-case
error rate of the tailored noise can be directly and efficiently measured
through randomized benchmarking experiments. Remarkably, our method enables the
realization of fault-tolerant quantum computation under the error rates
observed in recent experiments.
Comment: 7+6 pages, comments welcome
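The claim above that random single-qubit gates can be inserted while "the effective logical circuit remains unchanged" can be checked on a single CNOT. The sketch below is a minimal illustration under stated assumptions, not the paper's full protocol: it models no noise, it does not compile the corrections into neighbouring gates, and all helper names (`matmul`, `kron`, `dagger`, `close`, `is_pauli_up_to_phase`, `twirl_cnot`) are hypothetical. It draws a random two-qubit Pauli T, computes the correction C = CNOT · T† · CNOT†, and verifies that C · CNOT · T equals CNOT and that C is itself a Pauli up to a phase.

```python
import random

# Single-qubit Paulis as 2x2 matrices (plain lists, no external libraries).
I = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
PAULIS = [I, X, Y, Z]

CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]


def matmul(a, b):
    """Matrix product of two square or compatible matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker (tensor) product of two matrices."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-9):
    """Entry-wise comparison with a small tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def is_pauli_up_to_phase(m):
    """True if m equals a two-qubit Pauli times one of 1, -1, i, -i."""
    for p in PAULIS:
        for q in PAULIS:
            pq = kron(p, q)
            for phase in (1, -1, 1j, -1j):
                if close(m, [[phase * x for x in row] for row in pq]):
                    return True
    return False


def twirl_cnot(rng):
    """Pick a random Pauli twirl and the correction that undoes it after CNOT."""
    twirl = kron(rng.choice(PAULIS), rng.choice(PAULIS))
    correction = matmul(matmul(CNOT, dagger(twirl)), dagger(CNOT))
    return twirl, correction


rng = random.Random(0)
twirl, correction = twirl_cnot(rng)
effective = matmul(correction, matmul(CNOT, twirl))
print(close(effective, CNOT), is_pauli_up_to_phase(correction))
```

Because CNOT is a Clifford gate, the correction is always another Pauli, so it costs no more than the gate it replaces; this is one way to read the abstract's "little or no experimental overhead".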
Towards a Study of Meta-Predicate Semantics
We describe and compare design choices for meta-predicate semantics, as found
in representative Prolog module systems and in Logtalk. We look at the
consequences of these design choices from a pragmatic perspective, discussing
explicit qualification semantics, computational reflection support,
expressiveness of meta-predicate declarations, safety of meta-predicate
definitions, portability of meta-predicate definitions, and meta-predicate
performance. Our aim is to provide useful insight for debating meta-predicate
semantics and portability issues based on actual implementations and common
usage patterns.
Comment: Online proceedings of the Joint Workshop on Implementation of
Constraint Logic Programming Systems and Logic-based Methods in Programming
Environments (CICLOPS-WLPE 2010), Edinburgh, Scotland, U.K., July 15, 2010
Knuth-Bendix Completion Algorithm and Shuffle Algebras For Compiling NISQ Circuits
Compiling quantum circuits lends itself to an elegant formulation in the
language of rewriting systems on noncommutative polynomial algebras . The alphabet is the set of the allowed hardware 2-qubit
gates. The set of gates that we wish to implement from are elements of a
free monoid (obtained by concatenating the letters of ). In this
setting, compiling an idealized gate is equivalent to computing its unique
normal form with respect to the rewriting system that encodes the hardware constraints and capabilities. This
system is generated using two different mechanisms: 1) using the
Knuth-Bendix completion algorithm on the algebra ,
and 2) using the Buchberger algorithm on the shuffle algebra
where is the set of Lyndon words on .
Comment: Key words: Quantum circuit compilation, NISQ computers, rewriting
systems, Knuth-Bendix, Shuffle algebra, Lyndon words, Buchberger algorithm
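The idea of "computing a normal form with respect to a rewriting system" can be sketched on plain strings. The example below is a toy under loud assumptions: the letters H, X, Z and the rules in `RULES` are hand-picked single-qubit identities chosen here for illustration, not the paper's 2-qubit hardware alphabet or its generated system, and `normal_form` is a hypothetical helper. Every rule shortens the word, so the reduction always terminates.

```python
# Hand-picked, length-reducing rewrite rules (assumption for illustration):
# H, X and Z are self-inverse, and conjugating by H swaps X and Z.
RULES = [
    ("HH", ""),
    ("XX", ""),
    ("ZZ", ""),
    ("HXH", "Z"),
    ("HZH", "X"),
]


def normal_form(word, rules=RULES):
    """Rewrite `word` until no rule applies.

    Strategy: scan the rules in order and rewrite the leftmost match of the
    first rule that applies, then start over.
    """
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            idx = word.find(lhs)
            if idx != -1:
                word = word[:idx] + rhs + word[idx + len(lhs):]
                changed = True
                break
    return word


print(normal_form("XHHX"))  # reduces to the empty word
print(normal_form("HXHZ"))  # reduces to the empty word
print(normal_form("HHXH"))  # reduces to "XH" under this strategy
```

This toy rule set is not confluent: "HHXH" reduces to "XH" if "HH" is rewritten first, but to "HZ" if "HXH" is rewritten first, and neither result can be reduced further. Resolving such ambiguities by adding oriented rules is the job of a completion procedure such as Knuth-Bendix, which is what makes the normal form unique.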