34 research outputs found

    FBLAS: Streaming Linear Algebra on FPGA

    Full text link
    Spatial computing architectures pose an attractive alternative to mitigate control and data movement overheads typical of load-store architectures. In practice, these devices are rarely considered in the HPC community due to the steep learning curve, low productivity and lack of available libraries for fundamental operations. High-level synthesis (HLS) tools are facilitating hardware programming, but optimizing for these architectures requires factoring in new transformations and resources/performance trade-offs. We present FBLAS, an open-source HLS implementation of BLAS for FPGAs, that enables reusability, portability and easy integration with existing software and hardware codes. FBLAS' implementation allows scaling hardware modules to exploit on-chip resources, and module interfaces are designed to natively support streaming on-chip communications, allowing them to be composed to reduce off-chip communication. With FBLAS, we set a precedent for FPGA library design, and contribute to the toolbox of customizable hardware components necessary for HPC codes to start productively targeting reconfigurable platforms

    An enegy-efficient FPGA accelerator for convolutional neural networks

    Get PDF
    This project focuses on a state-of-the-art DNN specifically build for image clas sification. We develop a new architecture design that will run on an experimental Intel accelerator platform called HARP. We obtain low-power solution and evaluate the possibilities and requierements of HARPplatform
    corecore