2,118 research outputs found
Dynamic Control Flow in Large-Scale Machine Learning
Many recent machine learning models rely on fine-grained dynamic control flow
for training and inference. In particular, models based on recurrent neural
networks and on reinforcement learning depend on recurrence relations,
data-dependent conditional execution, and other features that call for dynamic
control flow. These applications benefit from the ability to make rapid
control-flow decisions across a set of computing devices in a distributed
system. For performance, scalability, and expressiveness, a machine learning
system must support dynamic control flow in distributed and heterogeneous
environments.
This paper presents a programming model for distributed machine learning that
supports dynamic control flow. We describe the design of the programming model,
and its implementation in TensorFlow, a distributed machine learning system.
Our approach extends the use of dataflow graphs to represent machine learning
models, offering several distinctive features. First, the branches of
conditionals and bodies of loops can be partitioned across many machines to run
on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs.
Second, programs written in our model support automatic differentiation and
distributed gradient computations, which are necessary for training machine
learning models that use control flow. Third, our choice of non-strict
semantics enables multiple loop iterations to execute in parallel across
machines, and to overlap compute and I/O operations.
We have done our work in the context of TensorFlow, and it has been used
extensively in research and production. We evaluate it using several real-world
applications, and demonstrate its performance and scalability.Comment: Appeared in EuroSys 2018. 14 pages, 16 figure
A formally verified compiler back-end
This article describes the development and formal verification (proof of
semantic preservation) of a compiler back-end from Cminor (a simple imperative
intermediate language) to PowerPC assembly code, using the Coq proof assistant
both for programming the compiler and for proving its correctness. Such a
verified compiler is useful in the context of formal methods applied to the
certification of critical software: the verification of the compiler guarantees
that the safety properties proved on the source code hold for the executable
compiled code as well
Verifying the Safety of a Flight-Critical System
This paper describes our work on demonstrating verification technologies on a
flight-critical system of realistic functionality, size, and complexity. Our
work targeted a commercial aircraft control system named Transport Class Model
(TCM), and involved several stages: formalizing and disambiguating requirements
in collaboration with do- main experts; processing models for their use by
formal verification tools; applying compositional techniques at the
architectural and component level to scale verification. Performed in the
context of a major NASA milestone, this study of formal verification in
practice is one of the most challenging that our group has performed, and it
took several person months to complete it. This paper describes the methodology
that we followed and the lessons that we learned.Comment: 17 pages, 5 figure
Highly parallel computation
Highly parallel computing architectures are the only means to achieve the computation rates demanded by advanced scientific problems. A decade of research has demonstrated the feasibility of such machines and current research focuses on which architectures designated as multiple instruction multiple datastream (MIMD) and single instruction multiple datastream (SIMD) have produced the best results to date; neither shows a decisive advantage for most near-homogeneous scientific problems. For scientific problems with many dissimilar parts, more speculative architectures such as neural networks or data flow may be needed
Beyond Dataflow
This paper presents some recent advanced dataflow architectures. While the dataflow concept offers the potential of high performance, the performance of an actual dataflow implementation can be restricted by a limited number of functional units, limited memory bandwidth, and the need to associatively match pending operations with available functional units. Since the early 1970s, there have been significant developments in both fundamental research and practical realizations of dataflow models of computation. In particular, there has been active research and development in multithreaded architectures that evolved from the dataflow model. Also some other techniques for combining control-flow and dataflow emerged, such as coarse-grain dataflow, dataflow with complex machine operations, RISC dataflow, and micro dataflow. These developments have also had certain impact on the conception of highperformance superscalar processors in the “post-RISC” era
System level synthesis of dataflow programs: HEVC decoder case study
International audienceWhile dealing with increasing complexity of signal processing algorithms, the primary motivation for the development of High-Level Synthesis (HLS) tools for the automatic generation of Register Transfer Level (RTL) description from high-level description language is the reduction of time-to-market. However, most existing HLS tools operate at the component level, thus the entire system is not taken into consideration. We provide an original technique that raises the level of abstraction to the system level in order to obtain RTL description from a dataflow description. First, we design image processing algorithms using an actor oriented language under the Reconfigurable Video Coding (RVC) standard. Once the design is achieved, we use a dataflow compilation infrastructure called Open RVC-CAL Compiler (Orcc) to generate a C-based code. Afterward, a Xilinx HLS tool called Vivado is used for an automatic generation of synthesizable hardware implementation. In this paper, we show that a simulated hardware code generation of High Efficiency Video Coding (HEVC) under the RVC specifications is rapidly obtained with promising preliminary results
- …