176 research outputs found

    Analysis of performance variation in 16nm FinFET FPGA devices

    Get PDF

    Optimising and evaluating designs for reconfigurable hardware

    No full text
    Growing demand for computational performance, and the rising cost for chip design and manufacturing make reconfigurable hardware increasingly attractive for digital system implementation. Reconfigurable hardware, such as field-programmable gate arrays (FPGAs), can deliver performance through parallelism while also providing flexibility to enable application builders to reconfigure them. However, reconfigurable systems, particularly those involving run-time reconfiguration, are often developed in an ad-hoc manner. Such an approach usually results in low designer productivity and can lead to inefficient designs. This thesis covers three main achievements that address this situation. The first achievement is a model that captures design parameters of reconfigurable hardware and performance parameters of a given application domain. This model supports optimisations for several design metrics such as performance, area, and power consumption. The second achievement is a technique that enhances the relocatability of bitstreams for reconfigurable devices, taking into account heterogeneous resources. This method increases the flexibility of modules represented by these bitstreams while reducing configuration storage size and design compilation time. The third achievement is a technique to characterise the power consumption of FPGAs in different activity modes. This technique includes the evaluation of standby power and dedicated low-power modes, which are crucial in meeting the requirements for battery-based mobile devices

    Power Measurement Methodology for FPGA Devices

    Get PDF
    The efficiency of power optimization tools depends on information on design power provided by the power estimation models. Power models targeting different power groups can enable fast identification of the most power consuming parts of design and their properties. The accuracy of these estimation models is highly dependent on the accuracy of the method used for their characterization. The highest precision is achieved by using physical onboard measurements. In this paper, we present a measurement methodology that is primarily aimed at calibrating and validating high-level dynamic power estimation models. The measurements have been carefully designed to enable the separation of the interconnect power from the logic power and the power of the clock circuitry, so that each of these power groups can be used for the corresponding model validation. The standard measurement uncertainty is lower than 2% of the measured value even with a very small number of repeated measurements. Additionally, the accuracy of a commercial low-level power estimation tool has been also assessed for comparison purposes. The results indicate that the tool is not suitable for power estimation of data path-oriented designs

    Efficient algorithm and architecture for implementation of multiplier circuits in modern EPGAs

    Get PDF
    High speed multiplication in Field Programmable Gate Arrays is often performed either using logic cells or with built-in DSP blocks. The latter provides the highest performance for arithmetic operations while being also optimized in terms of power and area utilization. Scalability of input operands is limited to that of a single DSP block and the current CAD tools provide little help when the designer needs to build larger arithmetic blocks. The present thesis proposes an effective approach to the problem of building large integer multipliers out of smaller ones by giving two algorithms to the system designer, for a given FPGA technology. Large word length is required in applications such as cryptography and video processing. The first proposed algorithm partitions large input multipliers into an architecture-aware design. The second algorithm then places the generated design in an optimal layout minimizing interconnect delay. The thesis concludes with simulation and hardware generated data to support the proposed algorithms

    Floorplan-driven High-level Synthesis Algorithms for Latency Reduction Targeting FPGA Designs

    Get PDF
    早大学位記番号:新8126早稲田大

    Hybrid FPGA: Architecture and Interface

    No full text
    Hybrid FPGAs (Field Programmable Gate Arrays) are composed of general-purpose logic resources with different granularities, together with domain-specific coarse-grained units. This thesis proposes a novel hybrid FPGA architecture with embedded coarse-grained Floating Point Units (FPUs) to improve the floating point capability of FPGAs. Based on the proposed hybrid FPGA architecture, we examine three aspects to optimise the speed and area for domain-specific applications. First, we examine the interface between large coarse-grained embedded blocks (EBs) and fine-grained elements in hybrid FPGAs. The interface includes parameters for varying: (1) aspect ratio of EBs, (2) position of the EBs in the FPGA, (3) I/O pins arrangement of EBs, (4) interconnect flexibility of EBs, and (5) location of additional embedded elements such as memory. Second, we examine the interconnect structure for hybrid FPGAs. We investigate how large and highdensity EBs affect the routing demand for hybrid FPGAs over a set of domain-specific applications. We then propose three routing optimisation methods to meet the additional routing demand introduced by large EBs: (1) identifying the best separation distance between EBs, (2) adding routing switches on EBs to increase routing flexibility, and (3) introducing wider channel width near the edge of EBs. We study and compare the trade-offs in delay, area and routability of these three optimisation methods. Finally, we employ common subgraph extraction to determine the number of floating point adders/subtractors, multipliers and wordblocks in the FPUs. The wordblocks include registers and can implement fixed point operations. We study the area, speed and utilisation trade-offs of the selected FPU subgraphs in a set of floating point benchmark circuits. We develop an optimised coarse-grained FPU, taking into account both architectural and system-level issues. Furthermore, we investigate the trade-offs between granularities and performance by composing small FPUs into a large FPU. The results of this thesis would help design a domain-specific hybrid FPGA to meet user requirements, by optimising for speed, area or a combination of speed and area

    Radiation Mitigation and Power Optimization Design Tools for Reconfigurable Hardware in Orbit

    Get PDF
    The Reconfigurable Hardware in Orbit (RHinO)project is focused on creating a set of design tools that facilitate and automate design techniques for reconfigurable computing in space, using SRAM-based field-programmable-gate-array (FPGA) technology. In the second year of the project, design tools that leverage an established FPGA design environment have been created to visualize and analyze an FPGA circuit for radiation weaknesses and power inefficiencies. For radiation, a single event Upset (SEU) emulator, persistence analysis tool, and a half-latch removal tool for Xilinx/Virtex-II devices have been created. Research is underway on a persistence mitigation tool and multiple bit upsets (MBU) studies. For power, synthesis level dynamic power visualization and analysis tools have been completed. Power optimization tools are under development and preliminary test results are positive

    Design and application of reconfigurable circuits and systems

    No full text
    Open Acces

    Application development process for GNAT, a SOC networked system

    Get PDF
    The market for smart devices was identified years ago, and yet commercial progress into this field has not made significant progress. The reason such devices are so painfully slow to market is that the gap between the technologically possible and the market capitalizable is too vast. In order for inventions to succeed commercially, they must bridge the gap to tomorrow\u27s technology with marketability today. This thesis demonstrates a design methodology that enables such commercial success for one variety of smart device, the Ambient Intelligence Node (AIN). Commercial Off-The Shelf (COTS) design tools allowing a Model-Driven Architecture (MDA) approach are combined via custom middleware to form an end-to-end design flow for rapid prototyping and commercialization. A walkthrough of this design methodology demonstrates its effectiveness in the creation of Global Network Academic Test (GNAT), a sample AIN. It is shown how designers are given the flexibility to incorporate IP Blocks available in the Global Economy to reduce Time-To-Market and cost. Finally, new kinds of products and solutions built on the higher levels of design abstraction permitted by MDA design methods are explored

    A Modular Platform for Adaptive Heterogeneous Many-Core Architectures

    Get PDF
    Multi-/many-core heterogeneous architectures are shaping current and upcoming generations of compute-centric platforms which are widely used starting from mobile and wearable devices to high-performance cloud computing servers. Heterogeneous many-core architectures sought to achieve an order of magnitude higher energy efficiency as well as computing performance scaling by replacing homogeneous and power-hungry general-purpose processors with multiple heterogeneous compute units supporting multiple core types and domain-specific accelerators. Drifting from homogeneous architectures to complex heterogeneous systems is heavily adopted by chip designers and the silicon industry for more than a decade. Recent silicon chips are based on a heterogeneous SoC which combines a scalable number of heterogeneous processing units from different types (e.g. CPU, GPU, custom accelerator). This shifting in computing paradigm is associated with several system-level design challenges related to the integration and communication between a highly scalable number of heterogeneous compute units as well as SoC peripherals and storage units. Moreover, the increasing design complexities make the production of heterogeneous SoC chips a monopoly for only big market players due to the increasing development and design costs. Accordingly, recent initiatives towards agile hardware development open-source tools and microarchitecture aim to democratize silicon chip production for academic and commercial usage. Agile hardware development aims to reduce development costs by providing an ecosystem for open-source hardware microarchitectures and hardware design processes. Therefore, heterogeneous many-core development and customization will be relatively less complex and less time-consuming than conventional design process methods. In order to provide a modular and agile many-core development approach, this dissertation proposes a development platform for heterogeneous and self-adaptive many-core architectures consisting of a scalable number of heterogeneous tiles that maintain design regularity features while supporting heterogeneity. The proposed platform hides the integration complexities by supporting modular tile architectures for general-purpose processing cores supporting multi-instruction set architectures (multi-ISAs) and custom hardware accelerators. By leveraging field-programmable-gate-arrays (FPGAs), the self-adaptive feature of the many-core platform can be achieved by using dynamic and partial reconfiguration (DPR) techniques. This dissertation realizes the proposed modular and adaptive heterogeneous many-core platform through three main contributions. The first contribution proposes and realizes a many-core architecture for heterogeneous ISAs. It provides a modular and reusable tilebased architecture for several heterogeneous ISAs based on open-source RISC-V ISA. The modular tile-based architecture features a configurable number of processing cores with different RISC-V ISAs and different memory hierarchies. To increase the level of heterogeneity to support the integration of custom hardware accelerators, a novel hybrid memory/accelerator tile architecture is developed and realized as the second contribution. The hybrid tile is a modular and reusable tile that can be configured at run-time to operate as a scratchpad shared memory between compute tiles or as an accelerator tile hosting a local hardware accelerator logic. The hybrid tile is designed and implemented to be seamlessly integrated into the proposed tile-based platform. The third contribution deals with the self-adaptation features by providing a reconfiguration management approach to internally control the DPR process through processing cores (RISC-V based). The internal reconfiguration process relies on a novel DPR controller targeting FPGA design flow for RISC-V-based SoC to change the types and functionalities of compute tiles at run-time
    corecore