6 research outputs found

    ASIC Design of Radix-2, 8-point FFT Processor 

    In split-radix architectures, large Fast Fourier Transforms (FFTs) are decomposed into small independent computations to reduce the storage burden. The radix-2, 8-point FFT is one of the popular choices for this small independent computation. The author proposes an FFT processor architecture for this small independent computation, i.e. the radix-2, 8-point FFT. This paper briefly describes the architecture, which comprises a Butterfly Unit (BU), a register set and a controller. The novelty of this architecture is that it replaces the series of Processing Elements (PEs) with a single BU. The BU computes the two halves of the computation concurrently. Arithmetic is performed in floating-point form to overcome the nonlinearities. All computations are controlled by a tailored instruction set. All instructions have the same size and the same execution time. Twiddle constants are implicitly available in the instruction. Results of internal computations are stored in the register set to avoid load and store operations with memory. The mean square error of the computation is reduced by 41.95% and 55.76% in magnitude and phase, respectively, compared with computations performed by rounding the twiddle constants. The FFT processor is synthesized, placed and routed in 45 nm technology using the Nangate Open Cell Library. The BU of this architecture is 18% smaller and 5% faster than the smallest and fastest BU reported previously. The hardware cost metric, i.e. Dp (mm² ns² mW), of the proposed processor is 1.37, which is 32.51% less than that of the previous work.
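The idea of reusing one butterfly for every stage of a radix-2, 8-point FFT can be illustrated in software. The following is a minimal sketch, not the processor described in the abstract: the names `butterfly`, `fft8` and `regs` are illustrative, the decimation-in-time ordering is an assumption, and the twiddle factors are computed at run time here rather than being embedded in instructions.

```python
import cmath

N = 8  # transform size discussed in the abstract


def butterfly(a, b, w):
    """One radix-2 butterfly: returns (a + w*b, a - w*b)."""
    t = w * b
    return a + t, a - t


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the index i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft8(x):
    """Radix-2 decimation-in-time FFT of 8 samples using one shared butterfly."""
    assert len(x) == N
    # The list plays the role of a register set: all intermediate
    # results stay here, loaded once in bit-reversed order.
    regs = [complex(x[bit_reverse(i, 3)]) for i in range(N)]
    span = 1
    while span < N:                      # three stages: span = 1, 2, 4
        step = N // (2 * span)           # twiddle index stride for this stage
        for start in range(0, N, 2 * span):
            for k in range(span):
                w = cmath.exp(-2j * cmath.pi * k * step / N)
                i, j = start + k, start + k + span
                regs[i], regs[j] = butterfly(regs[i], regs[j], w)
        span *= 2
    return regs


def dft(x):
    """Direct O(n^2) DFT, used only as a reference for checking."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]
```

Comparing `fft8(x)` with `dft(x)` for any 8-sample input is a quick way to confirm that the twelve butterfly calls (three stages of four) reproduce the full transform.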

    ASIC Design of Radix-2, 8-Point FFT Processor

    pp. 230-238. In split-radix architectures, large Fast Fourier Transforms (FFTs) are decomposed into small independent computations to reduce the storage burden. The radix-2, 8-point FFT is one of the popular choices for this small independent computation. The authors propose an FFT processor architecture for this small independent computation, i.e. the radix-2, 8-point FFT. This paper briefly describes the architecture, which comprises a Butterfly Unit (BU), a register set and a controller. The novelty of this architecture is that it replaces the series of Processing Elements (PEs) with a single BU. The BU computes the two halves of the computation concurrently. Arithmetic is performed in floating-point form to overcome the nonlinearities. All computations are controlled by a tailored instruction set. All instructions have the same size and the same execution time. Twiddle constants are implicitly available in the instruction. Results of internal computations are stored in the register set to avoid load and store operations with memory. The mean square error of the computation is reduced by 41.95% and 55.76% in magnitude and phase, respectively, compared with computations performed by rounding the twiddle constants. The FFT processor is synthesized, placed and routed in 45 nm technology using the Nangate Open Cell Library. The BU of this architecture is 18.89% smaller and 5.13% faster than the smallest and fastest BU reported previously. The hardware cost metric, i.e. Dp (mm² ns² mW), of the proposed processor is 1.37. This cost metric is also 32.51% less than that of the previous work.

    Improving Energy Efficiency of OFDM Using Adaptive Precision Reconfigurable FFT

    Energy efficiency is an essential issue in digital systems, especially battery-powered devices, and has been the subject of intensive research. In this research, a multi-precision FFT module with dynamic run-time reconfigurability is proposed to trade off accuracy against the energy efficiency of OFDM in an SDR-based architecture. To support variable-size FFTs, a reconfigurable memory-based architecture is investigated. It is shown that the radix-4 FFT has the minimum computational complexity in this architecture. To account for implementation constraints such as fixed-width memory, a noise model is used to analyze the proposed architecture statistically. The required FFT word-lengths for different criteria (namely BER, modulation scheme, FFT size, and SNR) are computed analytically and confirmed by simulations in AWGN and Rayleigh fading channels. At run time, the most energy-efficient word-length is chosen and the FFT is reconfigured until the required application-specific BER is met. Evaluations show that the implementation area and the number of memory accesses are reduced. Results obtained by synthesizing the basic operators of the proposed design on an FPGA show an energy saving of over 80%.
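The run-time choice of the smallest word-length that still meets an accuracy target can be sketched as a simple search. This is only an illustration under stated assumptions: it uses a radix-2 recursive FFT rather than the radix-4 memory-based architecture of the abstract, it uses signal-to-quantization-noise ratio (SQNR) as a stand-in for the BER criterion, it does not model overflow or scaling, and the names `quantize`, `fft_fixed`, `sqnr_db` and `pick_word_length` are hypothetical.

```python
import cmath
import math


def quantize(z, frac_bits):
    """Round real and imaginary parts to a grid of 2**-frac_bits."""
    s = 1 << frac_bits
    return complex(round(z.real * s) / s, round(z.imag * s) / s)


def fft_fixed(x, frac_bits):
    """Recursive radix-2 FFT that re-quantizes after every operation."""
    n = len(x)
    if n == 1:
        return [quantize(complex(x[0]), frac_bits)]
    even = fft_fixed(x[0::2], frac_bits)
    odd = fft_fixed(x[1::2], frac_bits)
    out = [0j] * n
    for k in range(n // 2):
        w = quantize(cmath.exp(-2j * cmath.pi * k / n), frac_bits)
        t = quantize(w * odd[k], frac_bits)
        out[k] = quantize(even[k] + t, frac_bits)
        out[k + n // 2] = quantize(even[k] - t, frac_bits)
    return out


def sqnr_db(ref, approx):
    """Signal-to-quantization-noise ratio in dB between two spectra."""
    sig = sum(abs(r) ** 2 for r in ref)
    err = sum(abs(r - a) ** 2 for r, a in zip(ref, approx))
    return float("inf") if err == 0 else 10 * math.log10(sig / err)


def pick_word_length(x, target_db, candidates=range(4, 25)):
    """Return the smallest candidate word-length meeting the SQNR target."""
    ref = fft_fixed(x, 40)  # 40 fractional bits serve as the reference
    for bits in candidates:
        if sqnr_db(ref, fft_fixed(x, bits)) >= target_db:
            return bits
    return None  # no candidate met the target
```

Scanning the candidates in increasing order mirrors the intent of picking the cheapest precision first and only widening the datapath when the accuracy target is missed; the input length must be a power of two for this sketch.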

    Doctor of Philosophy

    The design of an integrated circuit (IC) requires exhaustive verification and a thorough test mechanism to ensure the functionality and robustness of the circuit. This dissertation employs the theory of relative timing, which enables designers to create designs that have significant power and performance advantages over traditional clocked designs. Research has been carried out to enable the relative timing approach to be supported by commercial electronic design automation (EDA) tools. This allows asynchronous and sequential circuits to be designed using commercial CAD tools. However, two very significant holes in the flow exist: the lack of support for timing verification and for manufacturing test. Relative timing (RT) utilizes circuit delay to enforce and measure event sequencing in a circuit design. Asynchronous circuits can optimize the power-performance product by adjusting the circuit timing. A thorough analysis of the timing characteristics of each and every timing path is required to ensure the robustness and correctness of RT designs. All timing paths have to conform to the circuit timing constraints. This dissertation addresses back-end design robustness by validating full cyclical path timing verification with static timing analysis and by implementing design for testability (DFT). Circuit reliability and correctness are necessary for the technology to become commercially ready. In this study, scan-chain, a commercial DFT implementation, is applied to burst-mode RT designs. In addition, a novel testing approach is developed along with scan-chain to achieve over 90% fault coverage on two fault models: the stuck-at fault model and the delay fault model. This work evaluates the cost of DFT and its coverage trade-off, then determines the best implementation.
    Designs such as a 64-point fast Fourier transform (FFT) design, an I2C design, and a mixed-signal design are built to demonstrate the power, area, and performance advantages of the relative timing methodology and are used as a platform for developing the back-end robustness. Results are verified by performing post-silicon timing validation and test. This work strengthens the overall relative-timed circuit flow, reliability, and testability.

    Doctor of Philosophy

    Asynchronous design has very promising potential even though it has largely received a cold reception from industry. Part of this reluctance has been due to the necessity of custom design languages and computer aided design (CAD) flows to design, optimize, and validate asynchronous modules and systems. Next-generation asynchronous flows should support modern hardware description languages (e.g., Verilog) and application specific integrated circuit (ASIC) CAD tools. They also have to support multifrequency designs that mix synchronous (clocked) and asynchronous (unclocked) blocks. This work presents a novel relative timing (RT) based methodology for generating multifrequency designs using synchronous CAD tools and flows. Synchronous CAD tools must be constrained for them to work with asynchronous circuits. The identification of these constraints and a characterization flow to derive them automatically are presented. The effect of the constraints on the designs and the way they are handled by the synchronous CAD tools are analyzed and reported in this work. Automating the generation of asynchronous design templates and of the constraints is an important problem. Algorithms for automating reset addition to asynchronous circuits and for applying power and/or performance optimizations to the circuits using logical effort are explored, thus filling an important hole in the automation flow. Representing cyclic asynchronous circuits as directed acyclic graphs (DAGs) to the CAD tools is necessary for applying synchronous CAD optimizations such as sizing and path delay optimization, and for using static timing analysis (STA) on these circuits. A thorough investigation of the requirements for cycle cutting while preserving timing paths is presented, with an algorithm to automate the process of generating the cuts.
    A large set of designs for 4-phase handshake protocol circuit implementations with early and late data validity is characterized for area, power and performance. Benchmark circuits with automated scripts to generate various configurations for better understanding of the designs are proposed and analyzed. Extensions to the methodology, such as scan insertion using automatic test pattern generation (ATPG) tools to add testability to the datapath in bundled-data asynchronous circuit implementations, and timing closure approaches are also described. Energy, area, and performance of purely asynchronous circuits and of circuits with mixed synchronous and asynchronous blocks are explored. Results indicate the benefits that can be derived by generating circuits with asynchronous components using this methodology.
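Cycle cutting, i.e. turning a cyclic circuit graph into a DAG so that tools such as STA can process it, can be illustrated with a generic depth-first search that collects back edges. This is a textbook technique shown as a minimal sketch, not the dissertation's algorithm, which additionally has to preserve the timing paths of interest. The function names and the node names in the usage note are hypothetical.

```python
def find_cut_edges(graph):
    """Return the back edges found by DFS; removing them leaves a DAG.

    `graph` maps each node to a list of its successor nodes.
    """
    nodes = set(graph) | {v for succs in graph.values() for v in succs}
    state = {n: 0 for n in nodes}  # 0 = unvisited, 1 = on DFS stack, 2 = done
    cuts = []

    def visit(u):
        state[u] = 1
        for v in graph.get(u, ()):
            if state[v] == 1:        # edge closes a cycle: mark it for cutting
                cuts.append((u, v))
            elif state[v] == 0:
                visit(v)
        state[u] = 2

    for n in sorted(nodes):          # sorted for a deterministic result
        if state[n] == 0:
            visit(n)
    return cuts


def cut(graph, cuts):
    """Return a copy of `graph` with the given edges removed."""
    removed = set(cuts)
    return {u: [v for v in succs if (u, v) not in removed]
            for u, succs in graph.items()}


def is_acyclic(graph):
    """A graph is acyclic exactly when DFS finds no back edges."""
    return not find_cut_edges(graph)
```

For a small handshake-style loop such as `{"req": ["ctl"], "ctl": ["ack", "latch"], "ack": ["req"], "latch": []}`, `find_cut_edges` reports one edge on the req/ctl/ack cycle, and `cut` applied to that result yields a graph for which `is_acyclic` holds. Which edge of a cycle gets cut depends on the traversal order here; a real flow would choose cut points so that the constrained timing paths stay intact.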

    HW/SW Co-Design Framework für Hochgeschwindigkeits-OFDM Signalverarbeitung (HW/SW Co-Design Framework for High-Speed OFDM Signal Processing)

    In this work, a HW/SW co-design framework for creating customized multiprocessor systems-on-chip was developed, enabling new trade-offs between performance and flexibility for modern OFDM systems. Various experiments on high-speed OFDM transmission demonstrated the functionality of the systems and achieved data rates in the Gb/s range, which had previously been reserved for inflexible, dedicated circuits.