17,199 research outputs found

    AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture

    Full text link
    CPU-FPGA heterogeneous architectures are attracting ever-increasing attention in an attempt to advance computational capabilities and energy efficiency in today's datacenters. These architectures provide programmers with the ability to reprogram the FPGAs for flexible acceleration of many workloads. Nonetheless, this advantage is often overshadowed by the poor programmability of FPGAs whose programming is conventionally a RTL design practice. Although recent advances in high-level synthesis (HLS) significantly improve the FPGA programmability, it still leaves programmers facing the challenge of identifying the optimal design configuration in a tremendous design space. This paper aims to address this challenge and pave the path from software programs towards high-quality FPGA accelerators. Specifically, we first propose the composable, parallel and pipeline (CPP) microarchitecture as a template of accelerator designs. Such a well-defined template is able to support efficient accelerator designs for a broad class of computation kernels, and more importantly, drastically reduce the design space. Also, we introduce an analytical model to capture the performance and resource trade-offs among different design configurations of the CPP microarchitecture, which lays the foundation for fast design space exploration. On top of the CPP microarchitecture and its analytical model, we develop the AutoAccel framework to make the entire accelerator generation automated. AutoAccel accepts a software program as an input and performs a series of code transformations based on the result of the analytical-model-based design space exploration to construct the desired CPP microarchitecture. Our experiments show that the AutoAccel-generated accelerators outperform their corresponding software implementations by an average of 72x for a broad class of computation kernels

    Proving Safety with Trace Automata and Bounded Model Checking

    Full text link
    Loop under-approximation is a technique that enriches C programs with additional branches that represent the effect of a (limited) range of loop iterations. While this technique can speed up the detection of bugs significantly, it introduces redundant execution traces which may complicate the verification of the program. This holds particularly true for verification tools based on Bounded Model Checking, which incorporate simplistic heuristics to determine whether all feasible iterations of a loop have been considered. We present a technique that uses \emph{trace automata} to eliminate redundant executions after performing loop acceleration. The method reduces the diameter of the program under analysis, which is in certain cases sufficient to allow a safety proof using Bounded Model Checking. Our transformation is precise---it does not introduce false positives, nor does it mask any errors. We have implemented the analysis as a source-to-source transformation, and present experimental results showing the applicability of the technique

    DiVinE-CUDA - A Tool for GPU Accelerated LTL Model Checking

    Full text link
    In this paper we present a tool that performs CUDA accelerated LTL Model Checking. The tool exploits parallel algorithm MAP adjusted to the NVIDIA CUDA architecture in order to efficiently detect the presence of accepting cycles in a directed graph. Accepting cycle detection is the core algorithmic procedure in automata-based LTL Model Checking. We demonstrate that the tool outperforms non-accelerated version of the algorithm and we discuss where the limits of the tool are and what we intend to do in the future to avoid them

    Verifying collision avoidance behaviours for unmanned surface vehicles using probabilistic model checking

    Get PDF
    Collision avoidance is an essential safety requirement for unmanned surface vehicles (USVs). Normally, its practical verification is non-trivial, due to the stochastic behaviours of both the USVs and the intruders. This paper presents the probabilistic timed automata (PTAs) based formalism for three collision avoidance behaviours of USVs in uncertain dynamic environments, which are associated with the crossing situation in COLREGs. Steering right, acceleration, and deceleration are considered potential evasive manoeuvres. The state-of-the-art prism model checker is applied to analyse the underlying models. This work provides a framework and practical application of the probabilistic model checking for decision making in collision avoidance for USVs

    Performance Considerations for an Embedded Implementation of OMA DRM 2

    Full text link
    As digital content services gain importance in the mobile world, Digital Rights Management (DRM) applications will become a key component of mobile terminals. This paper examines the effect dedicated hardware macros for specific cryptographic functions have on the performance of a mobile terminal that supports version 2 of the open standard for Digital Rights Management defined by the Open Mobile Alliance (OMA). Following a general description of the standard, the paper contains a detailed analysis of the cryptographic operations that have to be carried out before protected content can be accessed. The combination of this analysis with data on execution times for specific algorithms realized in hardware and software has made it possible to build a model which has allowed us to assert that hardware acceleration for specific cryptographic algorithms can significantly reduce the impact DRM has on a mobile terminal's processing performance and battery life.Comment: Submitted on behalf of EDAA (http://www.edaa.com/
    • …
    corecore