1,932 research outputs found

    From FPGA to ASIC: A RISC-V processor experience

    This work documents a correct design flow using these tools for the Lagarto RISC-V processor, along with the RTL design considerations that must be taken into account when moving from a design for FPGA to a design for ASIC

    P4-compatible High-level Synthesis of Low Latency 100 Gb/s Streaming Packet Parsers in FPGAs

    Packet parsing is a key step in SDN-aware devices. Packet parsers in SDN networks need to be both reconfigurable and fast, to support evolving network protocols and increasing multi-gigabit data rates. The combination of packet processing languages with FPGAs seems to be the perfect match for these requirements. In this work, we develop an open-source FPGA-based configurable architecture for arbitrary packet parsing to be used in SDN networks. We generate low-latency, high-speed streaming packet parsers directly from a packet processing program. Our architecture is pipelined and entirely modeled using templated C++ classes. The pipeline layout is derived from a parser graph that corresponds to the P4 code after a series of graph transformation rounds. The RTL code is generated from the C++ description using Xilinx Vivado HLS and synthesized with Xilinx Vivado. Our architecture achieves a 100 Gb/s data rate in a Xilinx Virtex-7 FPGA while reducing the latency by 45% and the LUT usage by 40% compared to the state of the art. Comment: Accepted for publication at the 26th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 25 - 27, 2018, Monterey Marriott Hotel, Monterey, California, 7 pages, 7 figures, 1 table
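
The abstract above describes a parser whose layout is derived from a parser graph. As a rough software illustration only (not the authors' C++/HLS architecture), the sketch below walks a small hand-written graph over a byte string. The graph contents, the fixed 20-byte IPv4 header (no options), and the names `PARSE_GRAPH` and `parse` are assumptions made for this example.

```python
# Hypothetical parser graph: each node gives a header's fixed length in bytes,
# the (offset, size) of the field that selects the next header, and the
# transitions keyed by that field's value. IPv4 options are ignored (assumption).
PARSE_GRAPH = {
    "ethernet": {"length": 14, "select": (12, 2), "next": {0x0800: "ipv4"}},
    "ipv4": {"length": 20, "select": (9, 1), "next": {17: "udp"}},
    "udp": {"length": 8, "select": None, "next": {}},
}


def parse(packet: bytes, start: str = "ethernet"):
    """Walk the graph; return a list of (header name, header bytes) and the payload offset."""
    headers = []
    offset = 0
    state = start
    while state is not None:
        node = PARSE_GRAPH[state]
        end = offset + node["length"]
        if end > len(packet):
            break  # packet is too short for this header; stop parsing
        header = packet[offset:end]
        headers.append((state, header))
        offset = end
        if node["select"] is None:
            state = None  # leaf node: no further headers
        else:
            pos, size = node["select"]
            key = int.from_bytes(header[pos:pos + size], "big")
            state = node["next"].get(key)  # unknown value ends the walk
    return headers, offset
```

In a hardware pipeline each graph node would typically map to a stage rather than a loop iteration; the loop here only shows the header-by-header transitions.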

    A Design Methodology for Space-Time Adapter

    This paper presents a solution to efficiently explore the design space of communication adapters. In most digital signal processing (DSP) applications, the overall architecture of the system is significantly affected by the communication architecture, so designers need specifically optimized adapters. By explicitly modeling these communications within an effective graph-theoretic model and analysis framework, we automatically generate an optimized architecture, named Space-Time AdapteR (STAR). Our design flow takes as input a C description of Input/Output data scheduling and user requirements (throughput, latency, parallelism...), and formalizes communication constraints through a Resource Constraints Graph (RCG). The RCG properties enable an efficient architecture space exploration in order to synthesize a STAR component. The proposed approach has been tested by designing an industrial data mixing block example: an Ultra-Wideband interleaver. Comment: ISBN: 978-1-59593-606-

    Exploring abstract interfaces in system-on-chip integration

    Modern mobile devices are marvels of computation. They can encode high definition video, processing and compressing over 350MB/s of image data in real time. They have no trouble driving displays with as much resolution as a full laptop, and smartphone manufacturers boast of running games with console quality graphics. Mobile devices pack all of this computational power into a 12\ handheld package by integrating a number of specialized hardware accelerators (IP) along with conventional CPUs and GPUs in a system on chip (SoC). Unfortunately, creating these specialized systems is becoming increasingly expensive. Since hardware accelerators come from a number of different sources and design cycles, different accelerator blocks will often contain incompatible hardware interfaces. Therefore, a large portion of SoC design cost comes in the form of designers manually interfacing each accelerator into a system. This work includes everything from building custom logic to wire up a block, to developing the drivers and API needed to take advantage of the hardware. My research focuses on generating these interfaces, including the physical hardware used to tie IP blocks into a system and the associated software collateral. Leveraging recent trends such as High Level Synthesis and other hardware generator methodologies, I propose an IP interface abstraction and parameterization designed to describe the interface of most current IP blocks. By encoding this knowledge at a higher level of abstraction, I am able to construct and demonstrate a hardware generator that maps an interface protocol description into synthesizable register transfer language (RTL), and that can automatically create hardware bridges between different interconnect standards. To ease the integration of the next generation of IP blocks (blocks that are automatically generated based on user specification), I propose a set of interface primitives.
When integrated into an IP generator, these primitives can automatically generate an interface that my interface system can tie to the rest of the system. I also demonstrate how the information stored in these types of primitives can be used to automatically generate a low-level software driver that manages access to the IP blocks. Finally, I show how the simulation environment provided with an IP generator can be used to provide a domain-appropriate application programming interface (API) to drive the software. Using an image signal processor generator as my platform, I demonstrate the construction of a map between the simulation software and hardware driver that enables a full one-button flow from algorithm development to applications running on specialized hardware within a working system

    Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device

    Currently, most designers in hardware/software co-design face the daunting task of researching different design flows and learning the intricacies of specific software from various manufacturers. Creating a scalable hardware/software co-design platform has therefore become an urgent need and a key strategic element for developing hardware/software integrated systems. In this paper, we propose a new design flow for building a scalable co-design platform on an FPGA-based system-on-chip. We employ an integrated approach to implement histogram of oriented gradients (HOG) feature extraction and support vector machine (SVM) classification on a programmable device for pedestrian tracking. We report a hardware resource analysis and also analyse the precision and success rates of pedestrian tracking on nine open access image data sets. Finally, our proposed design flow can be used for any real-time image processing-related product on programmable ZYNQ-based embedded systems; it reduces design time and provides a scalable solution for embedded image processing products

    Rapid prototyping from algorithm to FPGA prototype

    Abstract. Wireless data usage continuously increases in today’s world, setting higher requirements for wireless networks. Ever-increasing requirements result in more complex hardware (HW) implementations; in particular, the performance of telecommunication Systems-on-Chip (SoCs) plays a key role in this development. Complexity increases the design workload and therefore makes design flow times longer. High-Level Synthesis (HLS) tools have been designed to automate and accelerate design by moving manual work to a higher level. This Master’s Thesis studies the use of the MathWorks HLS workflow for rapid prototyping of Wireless Communication SoC Intellectual Property (IP). The thesis introduces the design and FPGA prototyping flow of an Application-Specific Integrated Circuit (ASIC). It presents good design practices targeted at HLS. It also studies the MathWorks Hardware Description Language (HDL) generation flow with HDL Coder, possible problems during the flow, and solutions to overcome them. The HLS flow is examined with an example design that scales and limits the power of IQ data. This work verifies the design in a Field-Programmable Gate Array (FPGA) environment. It concentrates on evaluating the usage and benefits of the MathWorks HLS workflow targeted at rapid prototyping of SoCs. The example IP is a Simulink model containing MATLAB algorithms and System Objects. The design is optimized at the algorithm level and synthesized into VHDL. The generated Register-Transfer Level (RTL) code is verified in co-simulation against the algorithm model. Optimization and verification methods are evaluated. The HDL model is further processed through logic synthesis using a third-party synthesis tool, run automatically with a script created by the MathWorks workflow. The generated design is tested on an FPGA in an FPGA-in-the-loop simulation configuration. The benefits of the FPGA prototyping flow for rapid prototyping are evaluated. 
Coding styles for generating synthesizable HDL code and simulation methods for improving the simulation speed of the hardware-like algorithm were discussed. The MathWorks HLS workflow was evaluated for rapid prototyping from algorithm to FPGA. Optimization methods and the capability to produce production-quality RTL for an ASIC target were also discussed. MathWorks’ tool flow provided promising results for rapid prototyping. It generated human-readable HDL that was successfully synthesized for an FPGA. The FPGA model was simulated successfully in an FPGA-in-the-loop configuration. It also provided good area and speed results for the ASIC target when the algorithm was written strictly from the hardware perspective. The process was found to be distinct and efficient.
Finnish abstract: Rapid prototyping from algorithm to FPGA prototype. Wireless data usage is growing continuously in today's world and sets higher requirements for wireless networks. The growing requirements make hardware implementation more complex; in particular, the performance of the systems-on-chip (SoC) used in telecommunications plays a key role. This increases the design workload, and consequently the time spent in the design flow grows longer. High-level synthesis (HLS) was developed to automate and accelerate digital design by moving manual work to a higher level. This Master's thesis studies the use of the MathWorks HLS flow for rapid prototyping of intellectual property (IP) blocks for SoCs designed for wireless communication. The thesis presents the traditional application-specific integrated circuit (ASIC) design flow, the FPGA prototyping flow, and design principles for HLS. It goes through the MathWorks hardware description language (HDL) generation flow with HDL Coder, possible problem points in the flow, and solutions to them. The HLS flow is studied using an example model that scales and limits the power of IQ data. The operation of the example model is verified on a field-programmable gate array (FPGA). 
The work concentrates on evaluating the use and benefits of the MathWorks HLS flow for rapid prototyping in SoC development. The example is a Simulink model containing MATLAB functions and System Objects. The model, optimized at the algorithm level, is synthesized into VHDL, and the operation of the register-transfer level (RTL) model is verified by co-simulation against the original algorithm model. The functionality and effectiveness of the optimization and verification methods are evaluated. The generated HDL model is synthesized with a third-party logic synthesis tool, which is launched by a script generated by the MathWorks tool flow. The resulting model is programmed onto an FPGA and its operation is verified with FPGA simulation. The coding styles required for generating synthesizable HDL code and methods for improving the simulation speed of the algorithm model were studied. The suitability of the MathWorks HLS flow for rapid prototyping from algorithm to FPGA prototype was considered. In addition, optimization methods and the flow's suitability for generating production-quality RTL were evaluated. The MathWorks tool flow showed promising results from the rapid prototyping perspective. It produced readable HDL code that synthesized for the FPGA. The model was run successfully on the FPGA. The flow achieved good area and speed results when the model was optimized for an ASIC. This required describing the model strictly from the hardware perspective. Overall, the process was clear and efficient
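
The thesis abstract mentions an example design that scales and limits the power of IQ data but gives no detail on how. The sketch below is one plausible reading, written in Python rather than MATLAB/Simulink: each complex sample is multiplied by a gain, and any sample whose instantaneous power exceeds a cap is shrunk back to the cap with its phase preserved. The function name `scale_and_limit` and its parameters are assumptions for illustration, not the thesis design.

```python
import math


def scale_and_limit(samples, gain, max_power):
    """Scale complex IQ samples by `gain`, then cap each sample's power |x|^2 at `max_power`.

    Hypothetical helper: the limiting rule (per-sample magnitude clipping that
    keeps the phase) is an assumption, not taken from the thesis.
    """
    max_mag = math.sqrt(max_power)
    out = []
    for x in samples:
        y = x * gain
        mag = abs(y)
        if mag > max_mag:
            y = y * (max_mag / mag)  # shrink magnitude to the cap, keep the phase
        out.append(y)
    return out
```

A hardware-oriented version would normally use fixed-point arithmetic and avoid the square root and division, which is the kind of hardware-perspective rewriting the abstract refers to.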

    Enabling Automated Bug Detection for IP-based Designs using High-Level Synthesis

    Modern System-on-Chip (SoC) architectures are increasingly composed of Intellectual Property (IP) blocks, usually designed and provided by different vendors. This burdens system designers with complex system-level integration and verification. In this paper, we propose an approach that leverages HLS techniques to automatically find bugs in designs composed of multiple IP blocks. Our method is particularly suitable for industrial adoption because it works without exposing sensitive information (e.g., the design specification or the component generation process). This advocates the definition and the adoption of an interoperable format for cross-vendor hardware bug detection

    RTL Design Quality Checks for Soft IPs

    Soft IPs are architectural modules delivered as synthesizable RTL code written in an HDL (hardware description language) such as Verilog, VHDL, or SystemVerilog. They are technology independent and offer a high degree of modification flexibility. RTL is the complete abstraction of our design. Since SoC complexity is growing day by day with new technologies and requirements, it becomes very difficult to debug and fix issues after the physical level. To reduce effort and increase efficiency and accuracy, it is therefore necessary to fix most bugs at the RTL level. Also, if we are using soft IP, then our bug-free IP can be used by third parties. Early detection of bugs thus saves us from going back through the entire design and repeating the whole process again and again. One of the important issues at the RTL level of a design is the Clock Domain Crossing (CDC) problem. This issue affects performance at each and every stage of the design flow. Failure to fix it at an early stage makes the design unreliable, and design performance collapses. The main issue in real-time clock designs is metastability. Although we cannot check or see these issues using our simulator, we have to take preventive measures at the RTL level. This is done by restructuring the design and adding the required synchronizers. One more important area of consideration in VLSI design is power consumption. In modern designs, low power is a key factor, so a design consuming less power is preferred over a design consuming more power. This decision should be made as early as possible, and RTL quality checks help with this. Using different tools, power estimation can be performed at the RTL stage, which saves a lot of redesign effort. This project aims at checking clock domain crossing faults at the RTL stage and redesigning the circuit to eliminate those faults. An effort is also made to compare the quality of two designs in terms of delay, power consumption, and area
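
The abstract above mentions adding synchronizers to handle clock domain crossings without naming a specific circuit. A common choice is the two-flip-flop synchronizer, so the behavioural sketch below assumes that structure. It is written in Python rather than an HDL, the class name `TwoFlopSynchronizer` is hypothetical, and it models only the register chain, not metastability itself.

```python
class TwoFlopSynchronizer:
    """Behavioural model of two back-to-back flip-flops in the destination clock domain."""

    def __init__(self):
        self.ff1 = 0  # first stage: samples the asynchronous input
        self.ff2 = 0  # second stage: the output seen by destination logic

    def clock(self, async_in):
        """Apply one destination clock edge and return the synchronized output."""
        # Both flops update on the same edge: ff2 takes the old ff1 value,
        # ff1 takes the current input.
        self.ff2, self.ff1 = self.ff1, async_in
        return self.ff2


sync = TwoFlopSynchronizer()
outputs = [sync.clock(v) for v in [1, 1, 1, 0, 0]]
# outputs == [0, 1, 1, 1, 0]
```

The extra stage gives a possibly metastable first flop a full clock period to settle before downstream logic sees the value, at the cost of added latency.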