1,251 research outputs found

    Pipelined Two-Operand Modular Adders

    Get PDF
    Pipelined two-operand modular adder (TOMA) is one of basic components used in digital signal processing (DSP) systems that use the residue number system (RNS). Such modular adders are used in binary/residue and residue/binary converters, residue multipliers and scalers as well as within residue processing channels. The design of pipelined TOMAs is usually obtained by inserting an appriopriate number of latch layers inside a nonpipelined TOMA structure. Hence their area is also determined by the number of latches and the delay by the number of latch layers. In this paper we propose a new pipelined TOMA that is based on a new TOMA, that has the smaller area and smaller delay than other known structures. Comparisons are made using data from the very large scale of integration (VLSI) standard cell library

    Auto-Generation of Pipelined Hardware Designs for Polar Encoder

    Full text link
    This paper presents a general framework for auto-generation of pipelined polar encoder architectures. The proposed framework could be well represented by a general formula. Given arbitrary code length NN and the level of parallelism MM, the formula could specify the corresponding hardware architecture. We have written a compiler which could read the formula and then automatically generate its register-transfer level (RTL) description suitable for FPGA or ASIC implementation. With this hardware generation system, one could explore the design space and make a trade-off between cost and performance. Our experimental results have demonstrated the efficiency of this auto-generator for polar encoder architectures

    CAD Tool Design for NCL and MTNCL Asynchronous Circuits

    Get PDF
    This thesis presents an implementation of a method developed to readily convert Boolean designs into an ultra-low power asynchronous design methodology called MTNCL, which combines multi-threshold CMOS (MTCMOS) with NULL Convention Logic (NCL) systems. MTNCL provides the leakage power advantages of an all high-Vt implementation with a reasonable speed penalty compared to the all low-Vt implementation, and has negligible area overhead. The proposed tool utilizes industry-standard CAD tools. This research also presents an Automated Gate-Level Pipelining with Bit-Wise Completion (AGLPBW) method to maximize throughput of delay-insensitive full-word pipelined NCL circuits. These methods have been integrated into the Mentor Graphics and Synopsis CAD tools, using a C-program, which performs the majority of the computations, such that the method can be easily ported to other CAD tool suites. Both methods have been successfully tested on circuits, including a 4-bit Ă— 4-bit multiplier, an unsigned Booth2 multiplier, and a 4-bit/8-operation arithmetic logic unit (ALU

    High throughput spatial convolution filters on FPGAs

    Get PDF
    Digital signal processing (DSP) on field- programmable gate arrays (FPGAs) has long been appealing because of the inherent parallelism in these computations that can be easily exploited to accelerate such algorithms. FPGAs have evolved significantly to further enhance the mapping of these algorithms, included additional hard blocks, such as the DSP blocks found in modern FPGAs. Although these DSP blocks can offer more efficient mapping of DSP computations, they are primarily designed for 1-D filter structures. We present a study on spatial convolutional filter implementations on FPGAs, optimizing around the structure of the DSP blocks to offer high throughput while maintaining the coefficient flexibility that other published architectures usually sacrifice. We show that it is possible to implement large filters for large 4K resolution image frames at frame rates of 30–60 FPS, while maintaining functional flexibility

    Architectural Approaches For Gallium Arsenide Exploitation In High-Speed Computer Design

    Get PDF
    Continued advances in the capability of Gallium Arsenide (GaAs)technology have finally drawn serious interest from computer system designers. The recent demonstration of very large scale integration (VLSI) laboratory designs incorporating very fast GaAs logic gates herald a significant role for GaAs technology in high-speed computer design:1 In this thesis we investigate design approaches to best exploit this promising technology in high-performance computer systems. We find significant differences between GaAs and Silicon technologies which are of relevance for computer design. The advantage that GaAs enjoys over Silicon in faster transistor switching speed is countered by a lower transistor count capability for GaAs integrated circuits. In addition, inter-chip signal propagation speeds in GaAs systems do not experience the same speedup exhibited by GaAs transistors; thus, GaAs designs are penalized more severely by inter-chip communication. The relatively low density of GaAs chips and the high cost of communication between them are significant obstacles to the full exploitation of the fast transistors of GaAs technology. A fast GaAs processor may be excessively underutilized unless special consideration is given to its information (instructions and data) requirements. Desirable GaAs system design approaches encourage low hardware resource requirements, and either minimize the processor’s need for off-chip information, maximize the rate of off-chip information transfer, or overlap off-chip information transfer with useful computation. We show the impact that these considerations have on the design of the instruction format, arithmetic unit, memory system, and compiler for a GaAs computer system. Through a simulation study utilizing a set of widely-used benchmark programs, we investigate several candidate instruction pipelines and candidate instruction formats in a GaAs environment. We demonstrate the clear performance advantage of an instruction pipeline based upon a pipelined memory system over a typical Silicon-like pipeline. We also show the performance advantage of packed instruction formats over typical Silicon instruction formats, and present a packed format which performs better than the experimental packed Stanford MIPS format

    FIR filter optimization for video processing on FPGAs

    Get PDF

    Design and Analysis of an Adaptive Asynchronous System Architecture for Energy Efficiency

    Get PDF
    Power has become a critical design parameter for digital CMOS integrated circuits. With performance still garnering much concern, a central idea has emerged: minimizing power consumption while maintaining performance. The use of dynamic voltage scaling (DVS) with parallelism has shown to be an effective way of saving power while maintaining performance. However, the potency of DVS and parallelism in traditional, clocked synchronous systems is limited because of the strict timing requirements such systems must comply with. Delay-insensitive (DI) asynchronous systems have the potential to benefit more from these techniques due to their flexible timing requirements and high modularity. This dissertation presents the design and analysis of a real-time adaptive DVS architecture for paralleled Multi-Threshold NULL Convention Logic (MTNCL) systems. Results show that energy-efficient systems with low area overhead can be created using this approach

    Automatic rapid prototyping of semi-custom VLSI circuits using FPGAs

    Get PDF
    Journal ArticleWe describe a technique for translating semi-custom VLSI circuits automatically, integrating two design environments, into field programmable gate arrays (FPGAs) for rapid and inexpensive prototyping. The VLSI circuits are designed using a cell-matrix based environment that produces chips with density comparable to full custom VLSI design. These circuits are translated automatically into FPGAs for testing and system development. A four-bit pipelined array multiplier is used as an example of this translation. The multiplier is implemented in CMOS in both synchronous and asynchronous pipelined versions, and translated into Actel FPGAs both automatically, and by hand for comparison. The six test chips were all found to be fully functional, and the translation efficiency in terms of chip speed and area is shown. This result demonstrates the potential of this approach to system development

    Ultra-Low Power and Radiation Hardened Asynchronous Circuit Design

    Get PDF
    This dissertation proposes an ultra-low power design methodology called bit-wise MTNCL for bit-wise pipelined asynchronous circuits, which combines multi-threshold CMOS (MTCMOS) with bit-wise pipelined NULL Convention Logic (NCL) systems. It provides the leakage power advantages of an all high-Vt implementation with a reasonable speed penalty compared to the all low-Vt implementation, and has negligible area overhead. It was enhanced to handle indeterminate standby states. The original MTNCL concept was enhanced significantly by sleeping Registers and Completion Logic as well as Combinational circuits to reduce area, leakage power, and energy per operation. This dissertation also develops an architecture that allows NCL circuits to recover from a Single Event Upset (SEU) or Single Event Latchup (SEL) fault without any data loss. Finally, an accurate throughput derivation formula for pipelined NCL circuits was developed, which can be used for static timing analysis
    • …
    corecore