6,626 research outputs found

    A Fixed Point VHDL Component Library for a High Efficiency Reconfigurable Radio Design Methodology

    Get PDF
    Advances in Field Programmable Gate Array (FPGA) technologies enable the implementation of reconfigurable radio systems for both ground and space applications. The development of such systems challenges the current design paradigms and requires more robust design techniques to meet the increased system complexity. Among these techniques is the development of component libraries to reduce design cycle time and to improve design verification, consequently increasing the overall efficiency of the project development process while increasing design success rates and reducing engineering costs. This paper describes the reconfigurable radio component library developed at the Software Defined Radio Applications Research Center (SARC) at Goddard Space Flight Center (GSFC) Microwave and Communications Branch (Code 567). The library is a set of fixed-point VHDL components that link the Digital Signal Processing (DSP) simulation environment with the FPGA design tools. This provides a direct synthesis path based on the latest developments of the VHDL tools as proposed by the BEE VBDL 2004 which allows for the simulation and synthesis of fixed-point math operations while maintaining bit and cycle accuracy. The VHDL Fixed Point Reconfigurable Radio Component library does not require the use of the FPGA vendor specific automatic component generators and provide a generic path from high level DSP simulations implemented in Mathworks Simulink to any FPGA device. The access to the component synthesizable, source code provides full design verification capability

    Benchmarking CPUs and GPUs on embedded platforms for software receiver usage

    Get PDF
    Smartphones containing multi-core central processing units (CPUs) and powerful many-core graphics processing units (GPUs) bring supercomputing technology into your pocket (or into our embedded devices). This can be exploited to produce power-efficient, customized receivers with flexible correlation schemes and more advanced positioning techniques. For example, promising techniques such as the Direct Position Estimation paradigm or usage of tracking solutions based on particle filtering, seem to be very appealing in challenging environments but are likewise computationally quite demanding. This article sheds some light onto recent embedded processor developments, benchmarks Fast Fourier Transform (FFT) and correlation algorithms on representative embedded platforms and relates the results to the use in GNSS software radios. The use of embedded CPUs for signal tracking seems to be straight forward, but more research is required to fully achieve the nominal peak performance of an embedded GPU for FFT computation. Also the electrical power consumption is measured in certain load levels.Peer ReviewedPostprint (published version

    Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

    Get PDF
    In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

    Optimizations for real-time implementation of H264/AVC video encoder on DSP processor

    Get PDF
    International audienceReal-time H.264/AVC high definition video encoding represents a challenging workload to most existing programmable processors. The new technologies of programmable processors such as Graphic Processor Unit (GPU) and multicore Digital signal Processor (DSP) offer a very promising solution to overcome these constraints. In this paper, an optimized implementation of H264/AVC video encoder on a single core among the six cores of TMS320C6472 DSP for Common Intermediate Format (CIF) (352x288) resolution is presented in order to move afterwards to a multicore implementation for standard and high definitions (SD,HD).Algorithmic optimization is applied to the intra prediction module to reduce the computational time. Furthermore, based on the DSP architectural features, various structural and hardware optimizations are adopted to minimize external memory access. The parallelism between CPU processing and data transfers is fully exploited using an Enhanced Direct Memory Access controller (EDMA). Experimental results show that the whole proposed optimizations, on a single core running at 700 MHz for CIF resolution, improve the encoding speed by up to 42.91%. They allow reaching the real-time encoding 25 f/s without inducing any Peak Signal to Noise Ratio (PSNR) degradation or bit-rate increase and make possible to achieve real time implementation for SD and HD resolutions when exploiting multicore features

    An FPGA implementation of OFDM transceiver for LTE applications

    Get PDF
    The paper presents a real-time transceiver using an Orthogonal Frequency-Division Multiplexing (OFDM) signaling scheme. The transceiver is implemented on a Field- Programmable Gate Array (FPGA) through Xilinx System Generator for DSP and includes all the blocks needed for the transmission path of OFDM. The transmitter frame can be reconfigured for different pilot and data schemes. In the receiver, time-domain synchronization is achieved thr ough a joint maximum likelihood (ML) symbol arrival-time and carrier frequency offset (CFO) estimator through the redundant information contained in the cyclic prefix (CP). A least-squares channel estimation retrieves the channel state information and a simple zero-forcing scheme has been implemented for channel equalization. Results show that a rough implementation of the signal path can be impleme nted by using only Xilinx System Generator for DSP
    corecore