17,199 research outputs found

AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture

Author: Cong Jason
Wei Peng
Yu Cody Hao
Zhang Peng
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/07/2018

CPU-FPGA heterogeneous architectures are attracting ever-increasing attention in an attempt to advance computational capabilities and energy efficiency in today's datacenters. These architectures provide programmers with the ability to reprogram the FPGAs for flexible acceleration of many workloads. Nonetheless, this advantage is often overshadowed by the poor programmability of FPGAs, whose programming is conventionally an RTL design practice. Although recent advances in high-level synthesis (HLS) significantly improve FPGA programmability, they still leave programmers facing the challenge of identifying the optimal design configuration in a tremendous design space. This paper aims to address this challenge and pave the path from software programs towards high-quality FPGA accelerators. Specifically, we first propose the composable, parallel and pipeline (CPP) microarchitecture as a template for accelerator designs. Such a well-defined template is able to support efficient accelerator designs for a broad class of computation kernels and, more importantly, drastically reduces the design space. We also introduce an analytical model to capture the performance and resource trade-offs among different design configurations of the CPP microarchitecture, which lays the foundation for fast design space exploration. On top of the CPP microarchitecture and its analytical model, we develop the AutoAccel framework to automate the entire accelerator generation process. AutoAccel accepts a software program as input and performs a series of code transformations, based on the result of the analytical-model-based design space exploration, to construct the desired CPP microarchitecture. Our experiments show that the AutoAccel-generated accelerators outperform their corresponding software implementations by an average of 72x for a broad class of computation kernels.

arXiv.org e-Print Archive

Scipedia

Proving Safety with Trace Automata and Bounded Model Checking

Author: A. Finkel
A.V. Aho
D. Kroening
D. Kroning
E. Clarke
G. Nelson
H. Hojjat
I. Dillig
K.L. McMillan
L. Kovács
L. de Moura
M. Bozga
M. Heizmann
P. Schrammel
V. D’Silva
Publication date: 21/10/2014

Loop under-approximation is a technique that enriches C programs with additional branches that represent the effect of a (limited) range of loop iterations. While this technique can speed up the detection of bugs significantly, it introduces redundant execution traces which may complicate the verification of the program. This holds particularly true for verification tools based on Bounded Model Checking, which incorporate simplistic heuristics to determine whether all feasible iterations of a loop have been considered. We present a technique that uses trace automata to eliminate redundant executions after performing loop acceleration. The method reduces the diameter of the program under analysis, which is in certain cases sufficient to allow a safety proof using Bounded Model Checking. Our transformation is precise: it does not introduce false positives, nor does it mask any errors. We have implemented the analysis as a source-to-source transformation, and present experimental results showing the applicability of the technique.
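As a rough illustration of what an added branch for "a range of loop iterations" can look like, the sketch below compares a loop executed step by step with a closed-form shortcut for k iterations. The loop (`while x < n: x += 3`) and the names `step_k` and `accelerated` are hypothetical and chosen only for this example; the sketch does not reproduce the trace-automata construction or the tool described in the abstract.

```python
def step_k(x, n, k):
    """Run the body `x += 3` exactly k times, requiring the guard x < n each time.

    Returns the final x, or None if the guard fails before k iterations finish.
    """
    for _ in range(k):
        if not x < n:
            return None
        x += 3
    return x


def accelerated(x, n, k):
    """Closed-form effect of k >= 1 iterations of the same loop.

    Because x only increases, the guard holds on every one of the k iterations
    exactly when it holds before the last one: x + 3 * (k - 1) < n.
    Returns None when the shortcut is not applicable.
    """
    if k < 1 or not (x + 3 * (k - 1) < n):
        return None
    return x + 3 * k


# The shortcut agrees with step-by-step execution on a small range of inputs.
agree = all(
    step_k(x, n, k) == accelerated(x, n, k)
    for x in range(-5, 5)
    for n in range(0, 12)
    for k in range(1, 7)
)
```

In this toy setting, the shortcut yields only traces that the original loop could also produce, which is the sense in which such a branch under-approximates the loop. The step-by-step path and the shortcut path can reach the same state, which hints at the kind of redundancy the abstract refers to.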

arXiv.org e-Print Archive

CiteSeerX

Oxford University Research Archive

DiVinE-CUDA - A Tool for GPU Accelerated LTL Model Checking

Author: A. L. Lafuente
D. Bosnacki
I. Černá
J. Barnat
Jaco van de Pol
Jiří Barnat
L. Brim
Lubos Brim
Luboš Brim
M. E. Fuess
M. Y. Vardi
Michael Garland
Milan Češka
S. Edelkamp
Stefan Edelkamp
U. Stern
Publication venue: 'Open Publishing Association'
Publication date: 01/12/2009

In this paper we present a tool that performs CUDA-accelerated LTL Model Checking. The tool exploits the parallel algorithm MAP, adjusted to the NVIDIA CUDA architecture, in order to efficiently detect the presence of accepting cycles in a directed graph. Accepting cycle detection is the core algorithmic procedure in automata-based LTL Model Checking. We demonstrate that the tool outperforms a non-accelerated version of the algorithm, and we discuss where the limits of the tool are and what we intend to do in the future to avoid them.
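To make the accepting-cycle problem concrete, here is a minimal sequential sketch: a graph has an accepting cycle if some accepting state that is reachable from the initial state can reach itself again. This is a plain reachability check written for illustration; it is not the MAP algorithm and not the CUDA implementation from the paper, and the names `reachable` and `has_accepting_cycle` are assumptions of this example.

```python
from collections import deque


def reachable(graph, starts):
    """Return the set of states reachable from `starts` (inclusive) by BFS."""
    starts = list(starts)
    seen = set(starts)
    queue = deque(starts)
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


def has_accepting_cycle(graph, initial, accepting):
    """True if a reachable accepting state lies on a cycle.

    `graph` maps each state to a list of successor states.
    """
    for state in reachable(graph, [initial]):
        if state in accepting:
            # The state is on a cycle iff it is reachable from its own successors.
            if state in reachable(graph, graph.get(state, ())):
                return True
    return False


# Hypothetical example graph: 0 -> 1 -> 2 -> 1, plus an isolated self-loop on 3.
example = {0: [1], 1: [2], 2: [1], 3: [3]}
on_cycle = has_accepting_cycle(example, 0, {2})
off_cycle = has_accepting_cycle(example, 0, {0})
unreachable_loop = has_accepting_cycle(example, 0, {3})
```

This check repeats a search for every accepting state, so it is quadratic in the worst case. It is meant only to show what is being detected, not how a parallel tool would do it efficiently.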

arXiv.org e-Print Archive

Directory of Open Access Journals

Verifying collision avoidance behaviours for unmanned surface vehicles using probabilistic model checking

Author: Lu Yu
Niu Hanlin
Savvaris Al
Tsourdos Antonios
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Collision avoidance is an essential safety requirement for unmanned surface vehicles (USVs). Normally, its practical verification is non-trivial, due to the stochastic behaviours of both the USVs and the intruders. This paper presents a formalism based on probabilistic timed automata (PTAs) for three collision avoidance behaviours of USVs in uncertain dynamic environments, which are associated with the crossing situation in COLREGs. Steering right, acceleration, and deceleration are considered as potential evasive manoeuvres. The state-of-the-art PRISM model checker is applied to analyse the underlying models. This work provides a framework for, and a practical application of, probabilistic model checking for decision making in collision avoidance for USVs.

Performance Considerations for an Embedded Implementation of OMA DRM 2

Author: Sannino Roberto
Thull Daniel
Publication date: 25/10/2007

As digital content services gain importance in the mobile world, Digital Rights Management (DRM) applications will become a key component of mobile terminals. This paper examines the effect that dedicated hardware macros for specific cryptographic functions have on the performance of a mobile terminal that supports version 2 of the open standard for Digital Rights Management defined by the Open Mobile Alliance (OMA). Following a general description of the standard, the paper contains a detailed analysis of the cryptographic operations that have to be carried out before protected content can be accessed. Combining this analysis with data on execution times for specific algorithms realized in hardware and software has made it possible to build a model, which has allowed us to assert that hardware acceleration for specific cryptographic algorithms can significantly reduce the impact DRM has on a mobile terminal's processing performance and battery life. Comment: Submitted on behalf of EDAA (http://www.edaa.com/)

arXiv.org e-Print Archive

CiteSeerX