19,125 research outputs found

    FASTCUDA: Open Source FPGA Accelerator & Hardware-Software Codesign Toolset for CUDA Kernels

    Get PDF
    Using FPGAs as hardware accelerators that communicate with a central CPU is becoming a common practice in the embedded design world but there is no standard methodology and toolset to facilitate this path yet. On the other hand, languages such as CUDA and OpenCL provide standard development environments for Graphical Processing Unit (GPU) programming. FASTCUDA is a platform that provides the necessary software toolset, hardware architecture, and design methodology to efficiently adapt the CUDA approach into a new FPGA design flow. With FASTCUDA, the CUDA kernels of a CUDA-based application are partitioned into two groups with minimal user intervention: those that are compiled and executed in parallel software, and those that are synthesized and implemented in hardware. A modern low power FPGA can provide the processing power (via numerous embedded micro-CPUs) and the logic capacity for both the software and hardware implementations of the CUDA kernels. This paper describes the system requirements and the architectural decisions behind the FASTCUDA approach

    AutoAccel: Automated Accelerator Generation and Optimization with Composable, Parallel and Pipeline Architecture

    Full text link
    CPU-FPGA heterogeneous architectures are attracting ever-increasing attention in an attempt to advance computational capabilities and energy efficiency in today's datacenters. These architectures provide programmers with the ability to reprogram the FPGAs for flexible acceleration of many workloads. Nonetheless, this advantage is often overshadowed by the poor programmability of FPGAs whose programming is conventionally a RTL design practice. Although recent advances in high-level synthesis (HLS) significantly improve the FPGA programmability, it still leaves programmers facing the challenge of identifying the optimal design configuration in a tremendous design space. This paper aims to address this challenge and pave the path from software programs towards high-quality FPGA accelerators. Specifically, we first propose the composable, parallel and pipeline (CPP) microarchitecture as a template of accelerator designs. Such a well-defined template is able to support efficient accelerator designs for a broad class of computation kernels, and more importantly, drastically reduce the design space. Also, we introduce an analytical model to capture the performance and resource trade-offs among different design configurations of the CPP microarchitecture, which lays the foundation for fast design space exploration. On top of the CPP microarchitecture and its analytical model, we develop the AutoAccel framework to make the entire accelerator generation automated. AutoAccel accepts a software program as an input and performs a series of code transformations based on the result of the analytical-model-based design space exploration to construct the desired CPP microarchitecture. Our experiments show that the AutoAccel-generated accelerators outperform their corresponding software implementations by an average of 72x for a broad class of computation kernels

    FPGA based data acquisition system for COMPASS experiment

    Full text link
    This paper discusses the present data acquisition system (DAQ) of the COMPASS experiment at CERN and presents development of a new DAQ. The new DAQ must preserve present data format and be able to communicate with FPGA cards. Parts of the new DAQ are based on state machines and they are implemented in C++ with usage of the QT framework, the DIM library, and the IPBus technology. Prototype of the system is prepared and communication through DIM between parts was tested. An implementation of the IPBus technology was prepared and tested. The new DAQ proved to be able to fulfill requirements.Comment: 8 pages, CHEP 201

    BDAQ53, a versatile pixel detector readout and test system for the ATLAS and CMS HL-LHC upgrades

    Full text link
    BDAQ53 is a readout system and verification framework for hybrid pixel detector readout chips of the RD53 family. These chips are designed for the upgrade of the inner tracking detectors of the ATLAS and CMS experiments. BDAQ53 is used in applications where versatility and rapid customization are required, such as in laboratory testing environments, test beam campaigns, and permanent setups for quality control measurements. It consists of custom and commercial hardware, a Python-based software framework, and FPGA firmware. BDAQ53 is developed as open source software with both software and firmware being hosted in a public repository.Comment: 6 pages, 6 figure
    corecore