262 research outputs found

    FPGA design methodology for industrial control systems—a review

    This paper reviews the state of the art of field-programmable gate array (FPGA) design methodologies with a focus on industrial control system applications. This paper starts with an overview of FPGA technology development, followed by a presentation of design methodologies, development tools and relevant CAD environments, including the use of portable hardware description languages and system level programming/design tools. They enable a holistic functional approach with the major advantage of setting up a unique modeling and evaluation environment for complete industrial electronics systems. Three main design rules are then presented. These are algorithm refinement, modularity, and systematic search for the best compromise between the control performance and the architectural constraints. An overview of contributions and limits of FPGAs is also given, followed by a short survey of FPGA-based intelligent controllers for modern industrial systems. Finally, two complete and timely case studies are presented to illustrate the benefits of an FPGA implementation when using the proposed system modeling and design methodology. These consist of the direct torque control for induction motor drives and the control of a diesel-driven synchronous stand-alone generator with the help of fuzzy logic.

    HL-Pow: A Learning-Based Power Modeling Framework for High-Level Synthesis

    High-level synthesis (HLS) enables designers to customize hardware designs efficiently. However, it is still challenging to foresee the correlation between power consumption and HLS-based applications at an early design stage. To overcome this problem, we introduce HL-Pow, a power modeling framework for FPGA HLS based on state-of-the-art machine learning techniques. HL-Pow incorporates an automated feature construction flow to efficiently identify and extract features that exert a major influence on power consumption, simply based upon HLS results, and a modeling flow that can build an accurate and generic power model applicable to a variety of designs with HLS. By using HL-Pow, the power evaluation process for FPGA designs can be significantly expedited because the power inference of HL-Pow is established on HLS instead of the time-consuming register-transfer level (RTL) implementation flow. Experimental results demonstrate that HL-Pow can achieve accurate power modeling that is only 4.67% (24.02 mW) away from onboard power measurement. To further facilitate power-oriented optimizations, we describe a novel design space exploration (DSE) algorithm built on top of HL-Pow to trade off between latency and power consumption. This algorithm can reach a close approximation of the real Pareto frontier while only requiring running HLS flow for 20% of design points in the entire design space. Comment: published as a conference paper in ASP-DAC 202
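To make the notion of a latency/power Pareto frontier concrete, here is a minimal sketch in Python. It is not the DSE algorithm from the HL-Pow paper, which the abstract does not detail; it only filters a set of already-evaluated design points down to the non-dominated ones. The function name `pareto_frontier` and the sample numbers are hypothetical.

```python
from typing import List, Tuple

# A design point is (latency, power); lower is better for both.
# Units and values are illustrative assumptions, not results from the paper.
Point = Tuple[float, float]


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the points that no other point beats in both latency and power."""
    frontier: List[Point] = []
    best_power = float("inf")
    # Sorting by latency (then power) means a point is non-dominated
    # exactly when its power is lower than every power seen so far.
    for latency, power in sorted(set(points)):
        if power < best_power:
            frontier.append((latency, power))
            best_power = power
    return frontier


if __name__ == "__main__":
    designs = [(100, 50.0), (80, 70.0), (120, 45.0), (90, 75.0), (100, 60.0)]
    print(pareto_frontier(designs))
```

A DSE flow of the kind described would evaluate only a subset of design points and aim to approximate this frontier; the filter above assumes every candidate has already been evaluated.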

    Compiling dataflow graphs into hardware

    Department Head: L. Darrell Whitley. 2005 Fall. Includes bibliographical references (pages 121-126). Conventional computers are programmed by supplying a sequence of instructions that perform the desired task. A reconfigurable processor is "programmed" by specifying the interconnections between hardware components, thereby creating a "hardwired" system to do the particular task. For some applications such as image processing, reconfigurable processors can produce dramatic execution speedups. However, programming a reconfigurable processor is essentially a hardware design discipline, making programming difficult for application programmers who are only familiar with software design techniques. To bridge this gap, a programming language, called SA-C (Single Assignment C, pronounced "sassy"), has been designed for programming reconfigurable processors. The process involves two main steps. First, the SA-C compiler analyzes the input source code and produces a hardware-independent intermediate representation of the program, called a dataflow graph (DFG). Second, this DFG is combined with hardware-specific information to create the final configuration. This dissertation describes the design and implementation of a system that performs the DFG to hardware translation. The DFG is broken up into three sections: the data generators, the inner loop body, and the data collectors. The second of these, the inner loop body, is used to create a computational structure that is unique for each program. The other two sections are implemented by using prebuilt modules, parameterized for the particular problem. Finally, a "glue module" is created to connect the various pieces into a complete interconnection specification. The dissertation also explores optimizations that can be applied while processing the DFG, to improve performance.
A technique for pipelining the inner loop body is described that uses an estimation tool for the propagation delay of the nodes within the dataflow graph. A scheme is also described that identifies subgraphs within the dataflow graph that can be replaced with lookup tables. The lookup tables provide a faster implementation than random logic in some instances.
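The idea of pipelining a dataflow graph from per-node delay estimates can be sketched as follows. This is a simple greedy scheme assumed for illustration, not the dissertation's actual technique: nodes are visited in topological order, and a register boundary is inserted whenever the accumulated combinational delay would exceed a target clock period. The names `assign_stages`, `delays`, `preds`, and `period` are hypothetical, and each node's delay is assumed to fit within one period.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def assign_stages(delays, preds, period):
    """Greedily assign each DFG node to a pipeline stage.

    delays: node -> estimated propagation delay (e.g. in ns)
    preds:  node -> list of predecessor nodes (missing key means no inputs)
    period: target clock period, in the same unit as the delays
    """
    stage, finish = {}, {}
    graph = {node: preds.get(node, []) for node in delays}
    for node in TopologicalSorter(graph).static_order():
        inputs = preds.get(node, [])
        # A node cannot be placed earlier than its latest input.
        s = max((stage[p] for p in inputs), default=0)
        # Inputs from earlier stages are registered, so they arrive at time 0;
        # inputs in the same stage arrive when their logic settles.
        start = max((finish[p] for p in inputs if stage[p] == s), default=0.0)
        if start + delays[node] > period:
            s, start = s + 1, 0.0  # insert a register boundary before this node
        stage[node], finish[node] = s, start + delays[node]
    return stage


if __name__ == "__main__":
    delays = {"a": 3, "b": 4, "c": 5, "d": 2}
    preds = {"b": ["a"], "c": ["b"], "d": ["a"]}
    print(assign_stages(delays, preds, period=8))
```

In the example, the chain a, b, c has a total delay of 12 against a period of 8, so node c is pushed into a second stage while d stays in the first.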

    FPGA-SPICE: A Simulation-Based Architecture Evaluation Framework for FPGAs

    In this paper, we developed a simulation-based architecture evaluation framework for field-programmable gate arrays (FPGAs), called FPGA-SPICE, which enables automatic layout-level estimation and electrical simulations of FPGA architectures. FPGA-SPICE can automatically generate Verilog and SPICE netlists based on realistic FPGA configurations and a high-level eXtensible Markup Language-based FPGA architectural description language. The outputted Verilog netlists can be used to generate layouts of full FPGA fabrics through a semicustom design flow. SPICE simulation decks can be generated at three levels of complexity, namely, full-chip-level, grid-level, and component-level, providing different tradeoffs between accuracy and simulation time. In order to enable such level of analysis, we presented two SPICE netlist partitioning techniques: loads extraction and parasitic net activity estimation. Electrical simulations showed that averaged over the selected benchmarks, the grid-/component-level approach can achieve 6.1x/7.5x execution speed-up with 9.9%/8.3% accuracy loss, respectively, compared to the full-chip level simulation. FPGA-SPICE was showcased through three different case studies: 1) an area breakdown analysis for static random access memory-based FPGAs, showing that configuration memories are a dominant factor; 2) a power breakdown comparison to analytical models, analyzing the source of accuracy loss; and 3) a robustness evaluation against process corners, studying their impact on energy consumption of full FPGA fabrics.

    Configurable data center switch architectures

    In this thesis, we explore alternative architectures for implementing configurable Data Center Switches along with the advantages that can be provided by such switches. Our first contribution centers around determining switch architectures that can be implemented on Field Programmable Gate Arrays (FPGAs) to provide configurable switching protocols. In the process, we identify a gap in the availability of frameworks to realistically evaluate the performance of switch architectures in data centers and contribute a simulation framework that relies on realistic data center traffic patterns. Our framework is then used to evaluate the performance of currently existing as well as newly proposed FPGA-amenable switch designs. Through collaborative work with Meng and Papaphilippou, we establish that only small-medium range switches can be implemented on today's FPGAs. Our second contribution is a novel switch architecture that integrates a custom in-network hardware accelerator with a generic switch to accelerate Deep Neural Network training applications in data centers. Our proposed accelerator architecture is prototyped on an FPGA, and a scalability study is conducted to demonstrate the trade-offs of an FPGA implementation when compared to an ASIC implementation. In addition to the hardware prototype, we contribute a lightweight load-balancing and congestion control protocol that leverages the unique communication patterns of ML data-parallel jobs to enable fair sharing of network resources across different jobs. Our large-scale simulations demonstrate the ability of our novel switch architecture and lightweight congestion control protocol to both accelerate the training time of machine learning jobs by up to 1.34x and benefit other latency-sensitive applications by reducing their 99%-tile completion time by up to 4.5x.
As for our final contribution, we identify the main requirements of in-network applications and propose a Network-on-Chip (NoC)-based architecture for supporting a heterogeneous set of applications. Observing the lack of tools to support such research, we provide a tool that can be used to evaluate NoC-based switch architectures. Open Access

    Design and implementation of an ETSI-SDR OFDM transmitter with power amplifier linearizer

    Satellite radio has attained great popularity because of its wide geographical coverage and high signal quality compared to terrestrial broadcasts. Most Satellite Digital Radio (SDR) based systems favor multi-carrier transmission schemes, especially orthogonal frequency division multiplexing (OFDM) transmission, because of their high data transfer rate and spectral efficiency. It is a challenging task to find a suitable platform that supports the fast data rates and superior processing capabilities required for the development and deployment of the new SDR standards. Field programmable gate array (FPGA) devices have the potential to become a suitable development platform for such standards. Another challenging factor in SDR systems is the distortion of variable envelope signals used in OFDM transmission by the nonlinear RF power amplifiers (PA) used in the base station transmitters. An attractive option is to use a linearizer that would compensate for the nonlinear effects of the PA. In this research, an OFDM transmitter, according to European Telecommunications Standard Institute (ETSI) SDR Technical Specifications 2007-2008, was designed and implemented on a low-cost Xilinx FPGA platform. A weakly nonlinear PA, operating in the L-band SDR frequency (1.450-1.490 GHz), was used for signal transmission. An FPGA-based, low-cost, adaptive linearizer was designed and implemented based on the digital predistortion (DPD) reference design from Xilinx, to correct the distortion effects of the PA on the transmitted signal.
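The core of any OFDM transmitter is an inverse DFT over the subcarrier symbols followed by a cyclic prefix. The sketch below shows that step in plain Python. The symbol count, prefix length, and the name `ofdm_symbol` are illustrative assumptions and are not taken from the ETSI SDR specification or from the implementation described above; a real design would use an FFT core rather than this direct O(n^2) sum.

```python
import cmath


def ofdm_symbol(subcarriers, cp_len):
    """Build one time-domain OFDM symbol with a cyclic prefix.

    subcarriers: complex modulation symbols, one per subcarrier
    cp_len:      number of tail samples copied to the front as the prefix
    """
    n = len(subcarriers)
    # Direct inverse DFT: each time sample is a sum of all subcarriers.
    time = [
        sum(x * cmath.exp(2j * cmath.pi * k * t / n)
            for k, x in enumerate(subcarriers)) / n
        for t in range(n)
    ]
    # Cyclic prefix: repeat the last cp_len samples ahead of the symbol.
    return time[n - cp_len:] + time


if __name__ == "__main__":
    qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]  # hypothetical QPSK symbols
    samples = ofdm_symbol(qpsk, cp_len=1)
    print(len(samples))
```

The summed subcarriers produce the variable-envelope signal that a nonlinear power amplifier distorts, which is what motivates a predistortion linearizer.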

    Investigating Single Precision Floating General Matrix Multiply in Heterogeneous Hardware

    The fundamental operation of matrix multiplication is ubiquitous across a myriad of disciplines. Yet, the identification of new optimizations for matrix multiplication remains relevant for emerging hardware architectures and heterogeneous systems. Frameworks such as OpenCL enable computation orchestration on existing systems, and its availability using the Intel High Level Synthesis compiler allows users to architect new designs for reconfigurable hardware using C/C++. Using the HARPv2 as a vehicle for exploration, we investigate the utility of several of the most notable matrix multiplication optimizations to better understand the performance portability of OpenCL and the implications for such optimizations on this and future heterogeneous architectures. Our results give targeted insights into the applicability of best practices that were established for existing architectures when used on emerging heterogeneous systems.
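The abstract does not name the optimizations it studies. As one commonly cited example, assumed here purely for illustration, loop tiling (blocking) restructures the multiply so that small sub-blocks are reused while they sit in fast local memory. The Python sketch below shows only the loop structure; it is not OpenCL and not the authors' kernel, and `matmul_tiled` and `tile` are hypothetical names.

```python
def matmul_tiled(a, b, tile=2):
    """Multiply matrices a (n x k) and b (k x m) using square tiles."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer loops walk over tiles; inner loops stay inside one tile so the
    # same small blocks of a and b are reused before moving on.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    print(matmul_tiled([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The `min(...)` bounds let the tile size be any positive integer, including one that does not divide the matrix dimensions evenly.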