110 research outputs found

    Implementation of Genetic Algorithms in FPGA-based Reconfigurable Computing Systems

    Get PDF
    Genetic Algorithms (GAs) are used to solve many optimization problems in science and engineering. GA is a heuristics approach which relies largely on random numbers to determine the approximate solution of an optimization problem. We use the Mersenne Twister Algorithm (MTA) to generate a non-overlapping sequence of random numbers with a period of 219937-1. The random numbers are generated from a state vector that consists of 624 elements. Our work on state vector generation and the GA implementation targets the solution of a flow-line scheduling problem where the flow-lines have jobs to process and the goal is to find a suitable completion time for all jobs using a GA. The state vector generation algorithm (MTA) performs poorly in traditional von Neumann architectures due to its poor temporal and spatial locality. Therefore its performance is limited by the speed at which we can access memory. With an approximate increase of processor performance by 60% per year and a drop of memory latency only 7% per year, a new approach is needed for performance improvement. On the other hand, the GA implementation in a general-purpose microprocessor, though performs reasonably well, has scope for performance gain in a parallel implementation. The parallel implementation of the GA can work as a kernel for applications that uses a GA to reach a solution. Our approach is to implement the state vector generation process and the GA in an FPGA-based Reconfigurable Computing (RC) system with the goal of improving the overall performance. Application design for FPGA-based RC systems is not trivial and the performance improvement is not guaranteed. Designing for RC systems requires algorithmic parallelism in order to exploit the inherent parallelism of the FPGA. We are using a high-level language that provides a level of abstraction from the lower-level hardware in the RC system making it difficult to fully exploit some of the architectural benefits of the FPGA. Considering these factors, we improve the state vector generation process algorithmically. Our implementation generates state vectors 5X faster than the previous implementation in an Intel Xeon microprocessor of 2GHz. The modified algorithm is also implemented in a Xilinx Virtex-4 FPGA that results in a 2.4X speedup. Improvement in this preprocessing step accelerates GA application performance as random numbers are generated from these state vectors for the genetic operators. We simulate the basic operations of a GA in an FPGA to study its behavior in a parallel environment and analyze the results. The initial FPGA implementation of the GA runs about 7X slower than its microprocessor counterpart. The reasons are explained along with suggestions for improvement and future work

    PyHGL: A Python-based Hardware Generation Language Framework

    Full text link
    Hardware generation languages (HGLs) increase hardware design productivity by creating parameterized modules and test benches. Unfortunately, existing tools are not widely adopted due to several demerits, including limited support for asynchronous circuits and unknown states, lack of concise and efficient language features, and low integration of simulation and verification functions. This paper introduces PyHGL, an open-source Python framework that aims to provide a simple and unified environment for hardware generation, simulation, and verification. PyHGL language is a syntactical superset of Python, which greatly reduces the lines of code (LOC) and improves productivity by providing unique features such as dynamic typing, vectorized operations, and automatic port deduction. In addition, PyHGL integrates an event-driven simulator that simulates the asynchronous behaviors of digital circuits using three-state logic. We also propose an algorithm that eliminates the calculation and transmission overhead of unknown state propagation for binary stimuli. The results suggest that PyHGL code is up to 6.1x denser than traditional RTL and generates high-quality synthesizable RTL code. Moreover, the optimized simulator achieves 2.9x speed up and matches the performance of a commonly used open-source logic simulator

    FPGA Implementation of Procedures for Video Quality Assessment

    Get PDF
    Video resolutions used in a variety of media are constantly rising. While manufacturers struggle to perfect their screens, it is also important to ensure high quality of displayed image. Overall quality can be measured using Mean Opinion Score (MOS). Video quality can be aected by miscellaneous artifacts, appearing at every stage of video creation and transmission. In this paper, we present a solution to calculate four distinct video quality metrics that can be applied to a real-time video quality assessment system. Our assessment module is capable of processing 8K resolution in real time set at the level of 30 frames per second. The throughput of 2.19 GB/s surpasses the performance of pure software solutions. The module was created using a high-level language to concentrate on the architectural optimization

    Best practices for building hardware designs for living computational science applications

    Get PDF
    Scientific computing or Computational science, is a field of study where engineers and scientists use computer simulations to solve equations that model the physical world. In some cases, these equations come from the first principles of physics. In the past, these simulations were run on a single processor machine. However, due to various technological reasons, the performance of these machines are not likely to improve at the same rate as in the past. In order to improve the performance per watt of these simulations, special-purpose hardware accelerators can be used. This work mainly focuses on using FPGA-based hardware accelerators. In order to run these simulations on an FPGA accelerator, the application code needs to be re-factored into software and hardware sections. These faster simulations have motivated scientists to capture more behavior of the physical world. As additional behavior is captured, the application code needs to be re-factored each time, and a significant effort is required to re-build the design. Unfortunately, these multiple cycles of re-design reduces the overall productivity of scientists and engineers. This work proposes a set of hardware design guidelines for changing computational science codes or living computational science codes. These guidelines co-evolve the hardware with the software, reducing the overall effort of re-design and improving productivity. The design guidelines are evaluated for effectiveness, communicability, and broad applicability. Experimental results have shown that the overall re-design effort is reduced, and these guidelines are broadly applicable to a wide variety of scientific computing applications

    Optimization of DSSS Receivers Using Hardware-in-the-Loop Simulations

    Get PDF
    Over the years, there has been significant interest in defining a hardware abstraction layer to facilitate code reuse in software defined radio (SDR) applications. Designers are looking for a way to enable application software to specify a waveform, configure the platform, and control digital signal processing (DSP) functions in a hardware platform in a way that insulates it from the details of realization. This thesis presents a tool-based methodolgy for developing and optimizing a Direct Sequence Spread Spectrum (DSSS) transceiver deployed in custom hardware like Field Programmble Gate Arrays (FPGAs). The system model consists of a tranmitter which employs a quadrature phase shift keying (QPSK) modulation scheme, an additive white Gaussian noise (AWGN) channel, and a receiver whose main parts consist of an analog-to-digital converter (ADC), digital down converter (DDC), image rejection low-pass filter (LPF), carrier phase locked loop (PLL), tracking locked loop, down-sampler, spread spectrum correlators, and rectangular-to-polar converter. The design methodology is based on a new programming model for FPGAs developed in the industry by Xilinx Inc. The Xilinx System Generator for DSP software tool provides design portability and streamlines system development by enabling engineers to create and validate a system model in Xilinx FPGAs. By providing hierarchical modeling and automatic HDL code generation for programmable devices, designs can be easily verified through hardware-in-the-loop (HIL) simulations. HIL provides a significant increase in simulation speed which allows optimization of the receiver design with respect to the datapath size for different functional parts of the receiver. The parameterized datapath points used in the simulation are ADC resolution, DDC datapath size, LPF datapath size, correlator height, correlator datapath size, and rectangular-to-polar datapath size. These parameters are changed in the software enviornment and tested for bit error rate (BER) performance through real-time hardware simualtions. The final result presents a system design with minimum harware area occupancy relative to an acceptable BER degradation

    High-level synthesis for FPGAs: From prototyping to deployment

    Get PDF
    Abstract-Escalating System-on-Chip design complexity is pushing the design community to raise the level of abstraction beyond RTL. Despite the unsuccessful adoptions of early generations of commercial high-level synthesis (HLS) systems, we believe that the tipping point for transitioning to HLS methodology is happening now, especially for FPGA designs. The latest generation of HLS tools has made significant progress in providing wide language coverage and robust compilation technology, platform-based modeling, advancement in core HLS algorithms, and a domain-specific approach. In this paper we use AutoESL's AutoPilot HLS tool coupled with domain-specific system-level implementation platforms developed by Xilinx as an example to demonstrate the effectiveness of state-of-art C-to-FPGA synthesis solutions targeting multiple application domains. Complex industrial designs targeting Xilinx FPGAs are also presented as case studies, including comparison of HLS solutions versus optimized manual designs. Index Terms-Domain-specific design, field-programmable gate array (FPGA), high-level synthesis (HLS), quality of results (QoR)

    A reconfigurable accelerator card for high performance computing

    Get PDF
    Includes abstract.Includes bibliographical references (leaves 68-70).This thesis describes the design, implementation, and testing of a reconfigurable accelerator card. The goal of the project was to provide a hardware platform for future students to carry out research into reconfigurable computing. Our accelerator design is an expansion card for a traditional Von Neumann host machine, and contains two field-programmable gate arrays. By inserting the card into a host machine, intrinsically parallel processing tasks can be exported to the FPGAs. This is similar to the way in which video game rendering tasks can be exported to the GFC on a graphics accelerator. We show how an FPGA is a suitable processing element, in terms of performance per watt, for many computing tasks. We set out to design and build a reconfigurable card that harnessed the latest FPGAs and fastest available I/O interfaces. The resultant design is one which can run within a host machine, in an array of host machines, or as a stand-alone processing node

    NASA Tech Briefs, December 2011

    Get PDF
    Topics covered include: 1) SNE Industrial Fieldbus Interface; 2) Composite Thermal Switch; 3) XMOS XC-2 Development Board for Mechanical Control and Data Collection; 4) Receiver Gain Modulation Circuit; 5) NEXUS Scalable and Distributed Next-Generation Avionics Bus for Space Missions; 6) Digital Interface Board to Control Phase and Amplitude of Four Channels; 7) CoNNeCT Baseband Processor Module; 8) Cryogenic 160-GHz MMIC Heterodyne Receiver Module; 9) Ka-Band, Multi-Gigabit-Per-Second Transceiver; 10) All-Solid-State 2.45-to-2.78-THz Source; 11) Onboard Interferometric SAR Processor for the Ka-Band Radar Interferometer (KaRIn); 12) Space Environments Testbed; 13) High-Performance 3D Articulated Robot Display; 14) Athena; 15) In Situ Surface Characterization; 16) Ndarts; 17) Cryo-Etched Black Silicon for Use as Optical Black; 18) Advanced CO2 Removal and Reduction System; 19) Correcting Thermal Deformations in an Active Composite Reflector; 20) Umbilical Deployment Device; 21) Space Mirror Alignment System; 22) Thermionic Power Cell To Harness Heat Energies for Geothermal Applications; 23) Graph Theory Roots of Spatial Operators for Kinematics and Dynamics; 24) Spacesuit Soft Upper Torso Sizing Systems; 25) Radiation Protection Using Single-Wall Carbon Nanotube Derivatives; 26) PMA-PhyloChip DNA Microarray to Elucidate Viable Microbial Community Structure; 27) Lidar Luminance Quantizer; 28) Distributed Capacitive Sensor for Sample Mass Measurement; 29) Base Flow Model Validation; 30) Minimum Landing Error Powered-Descent Guidance for Planetary Missions; 31) Framework for Integrating Science Data Processing Algorithms Into Process Control Systems; 32) Time Synchronization and Distribution Mechanisms for Space Networks; 33) Local Estimators for Spacecraft Formation Flying; 34) Software-Defined Radio for Space-to-Space Communications; 35) Reflective Occultation Mask for Evaluation of Occulter Designs for Planet Finding; and 36) Molecular Adsorber Coatin
    • …
    corecore