AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture
CPU-FPGA heterogeneous architectures are attracting ever-increasing attention
in an attempt to advance computational capabilities and energy efficiency in
today's datacenters. These architectures provide programmers with the ability
to reprogram the FPGAs for flexible acceleration of many workloads.
Nonetheless, this advantage is often overshadowed by the poor programmability
of FPGAs, whose programming is conventionally an RTL design practice. Although
recent advances in high-level synthesis (HLS) significantly improve FPGA
programmability, they still leave programmers facing the challenge of
identifying the optimal design configuration in a tremendous design space.
This paper aims to address this challenge and pave the path from software
programs towards high-quality FPGA accelerators. Specifically, we first propose
the composable, parallel and pipeline (CPP) microarchitecture as a template of
accelerator designs. Such a well-defined template is able to support efficient
accelerator designs for a broad class of computation kernels, and more
importantly, drastically reduce the design space. Also, we introduce an
analytical model to capture the performance and resource trade-offs among
different design configurations of the CPP microarchitecture, which lays the
foundation for fast design space exploration. On top of the CPP
microarchitecture and its analytical model, we develop the AutoAccel framework
to make the entire accelerator generation automated. AutoAccel accepts a
software program as an input and performs a series of code transformations
based on the result of the analytical-model-based design space exploration to
construct the desired CPP microarchitecture. Our experiments show that the
AutoAccel-generated accelerators outperform their corresponding software
implementations by an average of 72x for a broad class of computation kernels.
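The idea of exploring a design space with an analytical model, as described in the abstract above, can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the cost formulas, the constants, and the names `estimate_cycles`, `estimate_resources`, and `explore` are hypothetical and are not the model or the API of AutoAccel.

```python
from itertools import product

# Hypothetical analytical model: every formula and constant here is an
# illustrative assumption, not the model used by AutoAccel.
def estimate_cycles(trip_count, parallel, pipelined, body_latency=8):
    """Estimate the cycle count of a loop with `trip_count` iterations."""
    iterations = -(-trip_count // parallel)  # ceiling division
    if pipelined:
        # After the pipeline fills, one iteration completes per cycle.
        return body_latency + iterations - 1
    return body_latency * iterations

def estimate_resources(parallel, pipelined, unit_cost=10):
    """Estimate resource units: replicated compute units plus pipeline overhead."""
    return unit_cost * parallel + (5 if pipelined else 0)

def explore(trip_count, budget, parallel_factors=(1, 2, 4, 8, 16)):
    """Score every configuration and keep the fastest one that fits the budget."""
    best = None
    for parallel, pipelined in product(parallel_factors, (False, True)):
        if estimate_resources(parallel, pipelined) > budget:
            continue  # configuration does not fit on the device
        cycles = estimate_cycles(trip_count, parallel, pipelined)
        if best is None or cycles < best[0]:
            best = (cycles, parallel, pipelined)
    return best  # (cycles, parallel factor, pipelined?) or None

print(explore(trip_count=1024, budget=100))
```

Because each configuration is scored by a closed-form estimate rather than by running synthesis, even an exhaustive search over this small template is cheap, which is the property the abstract attributes to a restricted microarchitecture template.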
Proving Safety with Trace Automata and Bounded Model Checking
Loop under-approximation is a technique that enriches C programs with
additional branches that represent the effect of a (limited) range of loop
iterations. While this technique can speed up the detection of bugs
significantly, it introduces redundant execution traces which may complicate
the verification of the program. This holds particularly true for verification
tools based on Bounded Model Checking, which incorporate simplistic heuristics
to determine whether all feasible iterations of a loop have been considered.
We present a technique that uses trace automata to eliminate redundant
executions after performing loop acceleration. The method reduces the diameter
of the program under analysis, which is in certain cases sufficient to allow a
safety proof using Bounded Model Checking. Our transformation is precise: it
does not introduce false positives, nor does it mask any errors. We have
implemented the analysis as a source-to-source transformation, and present
experimental results showing the applicability of the technique.
DiVinE-CUDA - A Tool for GPU Accelerated LTL Model Checking
In this paper we present a tool that performs CUDA-accelerated LTL Model
Checking. The tool exploits the parallel algorithm MAP, adapted to the NVIDIA
CUDA architecture, to efficiently detect the presence of accepting cycles in
a directed graph. Accepting cycle detection is the core algorithmic procedure
in automata-based LTL Model Checking. We demonstrate that the tool outperforms
the non-accelerated version of the algorithm, and we discuss the limits of the
tool and what we intend to do in the future to overcome them.
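To make the accepting-cycle problem mentioned in the abstract above concrete, here is a minimal sequential sketch. It is not the MAP algorithm and has nothing GPU-specific in it; it uses plain breadth-first reachability, and the function names and the example graph are hypothetical.

```python
from collections import deque

def reachable(graph, sources):
    """Return every state reachable from `sources` (sources included) by BFS."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        state = queue.popleft()
        for succ in graph.get(state, ()):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen

def has_accepting_cycle(graph, initial, accepting):
    """True if an accepting state is reachable from `initial` and lies on a cycle."""
    for state in reachable(graph, [initial]):
        if state in accepting:
            # A state lies on a cycle iff it is reachable from its own successors.
            if state in reachable(graph, list(graph.get(state, ()))):
                return True
    return False

# Hypothetical product graph: s1 and s2 form a cycle reachable from s0.
graph = {"s0": ["s1"], "s1": ["s2"], "s2": ["s1"], "s3": ["s3"]}
print(has_accepting_cycle(graph, "s0", {"s2"}))
```

This version runs one search per reachable accepting state, so it is quadratic in the worst case. It is meant only to pin down what is being detected, not how a parallel tool would detect it.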
Verifying collision avoidance behaviours for unmanned surface vehicles using probabilistic model checking
Collision avoidance is an essential safety requirement for unmanned surface vehicles (USVs). Its practical verification is non-trivial because of the stochastic behaviours of both the USVs and the intruders. This paper presents a formalism based on probabilistic timed automata (PTAs) for three collision avoidance behaviours of USVs in uncertain dynamic environments, which are associated with the crossing situation in COLREGs. Steering right, acceleration, and deceleration are considered as potential evasive manoeuvres. The state-of-the-art PRISM model checker is applied to analyse the underlying models. This work provides a framework and a practical application of probabilistic model checking for decision making in collision avoidance for USVs.
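The kind of question a probabilistic model checker answers can be sketched with a much simpler model than the one in the abstract above. The example below swaps the probabilistic timed automata for a plain discrete-time Markov chain and computes the probability of eventually reaching a collision state by fixed-point iteration. The states, the transition probabilities, and the name `reach_probability` are hypothetical and are not taken from the paper or from PRISM.

```python
# Hypothetical discrete-time Markov chain; all probabilities are made up
# for illustration and do not come from the paper.
CHAIN = {
    "approach":  {"evade": 0.9, "collision": 0.1},
    "evade":     {"safe": 0.95, "approach": 0.05},
    "safe":      {"safe": 1.0},        # absorbing: encounter resolved
    "collision": {"collision": 1.0},   # absorbing: failure state
}

def reach_probability(chain, target, iterations=1000):
    """Approximate, for each state, the probability of eventually reaching `target`."""
    prob = {s: 1.0 if s == target else 0.0 for s in chain}
    for _ in range(iterations):
        # Each state's value is the probability-weighted sum over its successors.
        prob = {
            s: 1.0 if s == target
            else sum(p * prob[t] for t, p in chain[s].items())
            for s in chain
        }
    return prob

result = reach_probability(CHAIN, "collision")
print(round(result["approach"], 4))
```

A real PTA model additionally carries clocks and nondeterministic choices, so a tool such as PRISM reports minimum and maximum probabilities over those choices rather than a single value.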
Performance Considerations for an Embedded Implementation of OMA DRM 2
As digital content services gain importance in the mobile world, Digital
Rights Management (DRM) applications will become a key component of mobile
terminals. This paper examines the effect dedicated hardware macros for
specific cryptographic functions have on the performance of a mobile terminal
that supports version 2 of the open standard for Digital Rights Management
defined by the Open Mobile Alliance (OMA). Following a general description of
the standard, the paper contains a detailed analysis of the cryptographic
operations that have to be carried out before protected content can be
accessed. The combination of this analysis with data on execution times for
specific algorithms realized in hardware and software has made it possible to
build a model which has allowed us to assert that hardware acceleration for
specific cryptographic algorithms can significantly reduce the impact DRM has
on a mobile terminal's processing performance and battery life.
Comment: Submitted on behalf of EDAA (http://www.edaa.com/