745 research outputs found

    FPGA Architecture Optimization Using Geometric Programming

    Get PDF
    Volume 4 No 13 of the periodical Progression. Published November, February, May and August by The Radiant Healing Centre. SPCL PER BT 732 P76 V.1,1932-V.5,193

    HERO: Heterogeneous Embedded Research Platform for Exploring RISC-V Manycore Accelerators on FPGA

    Full text link
    Heterogeneous embedded systems on chip (HESoCs) co-integrate a standard host processor with programmable manycore accelerators (PMCAs) to combine general-purpose computing with domain-specific, efficient processing capabilities. While leading companies successfully advance their HESoC products, research lags behind due to the challenges of building a prototyping platform that unites an industry-standard host processor with an open research PMCA architecture. In this work we introduce HERO, an FPGA-based research platform that combines a PMCA composed of clusters of RISC-V cores, implemented as soft cores on an FPGA fabric, with a hard ARM Cortex-A multicore host processor. The PMCA architecture mapped on the FPGA is silicon-proven, scalable, configurable, and fully modifiable. HERO includes a complete software stack that consists of a heterogeneous cross-compilation toolchain with support for OpenMP accelerator programming, a Linux driver, and runtime libraries for both host and PMCA. HERO is designed to facilitate rapid exploration on all software and hardware layers: run-time behavior can be accurately analyzed by tracing events, and modifications can be validated through fully automated hard ware and software builds and executed tests. We demonstrate the usefulness of HERO by means of case studies from our research

    Hybrid FPGA: Architecture and Interface

    No full text
    Hybrid FPGAs (Field Programmable Gate Arrays) are composed of general-purpose logic resources with different granularities, together with domain-specific coarse-grained units. This thesis proposes a novel hybrid FPGA architecture with embedded coarse-grained Floating Point Units (FPUs) to improve the floating point capability of FPGAs. Based on the proposed hybrid FPGA architecture, we examine three aspects to optimise the speed and area for domain-specific applications. First, we examine the interface between large coarse-grained embedded blocks (EBs) and fine-grained elements in hybrid FPGAs. The interface includes parameters for varying: (1) aspect ratio of EBs, (2) position of the EBs in the FPGA, (3) I/O pins arrangement of EBs, (4) interconnect flexibility of EBs, and (5) location of additional embedded elements such as memory. Second, we examine the interconnect structure for hybrid FPGAs. We investigate how large and highdensity EBs affect the routing demand for hybrid FPGAs over a set of domain-specific applications. We then propose three routing optimisation methods to meet the additional routing demand introduced by large EBs: (1) identifying the best separation distance between EBs, (2) adding routing switches on EBs to increase routing flexibility, and (3) introducing wider channel width near the edge of EBs. We study and compare the trade-offs in delay, area and routability of these three optimisation methods. Finally, we employ common subgraph extraction to determine the number of floating point adders/subtractors, multipliers and wordblocks in the FPUs. The wordblocks include registers and can implement fixed point operations. We study the area, speed and utilisation trade-offs of the selected FPU subgraphs in a set of floating point benchmark circuits. We develop an optimised coarse-grained FPU, taking into account both architectural and system-level issues. Furthermore, we investigate the trade-offs between granularities and performance by composing small FPUs into a large FPU. The results of this thesis would help design a domain-specific hybrid FPGA to meet user requirements, by optimising for speed, area or a combination of speed and area

    Exploring FPGA Implementation for Binarized Neural Network Inference

    Get PDF
    Deep convolutional neural network has taken an important role in machine learning algorithm. It is widely used in different areas such as computer vision, robotics, and biology. However, the models of deep neural networks become larger and more computation complexity which is a big obstacle for such huge model to implement on embedded systems. Recent works have shown the binarized neural networks (BNN), utilizing binarized (i.e. +1 and -1) convolution kernel and binarized activation function, can significantly reduce the parameter size and computation cost, which makes it hardware-friendly for Field-Programmable Gate Arrays (FPGAs) implementation with efficient energy cost. This thesis proposes to implement a new parallel convolutional binarized neural network (i.e. PC-BNN) on FPGA with accurate inference. The embedded PC-BNN is designed for image classification on CIFAR-10 dataset and explores the hardware architecture and optimization of customized CNN topology. The parallel-convolution binarized neural network has two parallel binarized convolution layers which replaces the original single binarized convolution layer. It achieves around 86% on CIFAR-10 dataset and owns 2.3Mb parameter size. We implement our PC-BNN inference into the Xilinx PYNQ Z1 FPGA board which only has 4.9Mb on-chip Block RAM. Since the ultra-small network parameter, the whole model parameters can be stored on on-chip memory which can greatly reduce energy consumption and computation latency. Meanwhile, we design a new pipeline streaming architecture for PC-BNN hardware inference which can further increase the performance. The experiment results show that our PC-BNN inference on FPGA achieves 930 frames per second and 387.5 FPS/Watt, which are among the best throughput and energy efficiency compared to most recent works

    Field Programmable Gate Arrays (FPGAs) II

    Get PDF
    This Edited Volume Field Programmable Gate Arrays (FPGAs) II is a collection of reviewed and relevant research chapters, offering a comprehensive overview of recent developments in the field of Computer and Information Science. The book comprises single chapters authored by various researchers and edited by an expert active in the Computer and Information Science research area. All chapters are complete in itself but united under a common research study topic. This publication aims at providing a thorough overview of the latest research efforts by international authors on Computer and Information Science, and open new possible research paths for further novel developments
    • ā€¦
    corecore