Search CORE

4,172 research outputs found

A Reconfigurable Digital Multiplier and 4:2 Compressor Cells Design

Author: Chang Peng
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2008
Field of study

With the continually growing use of portable computing devices and increasingly complex software applications, there is a constant push for low power high speed circuitry to support this technology. Because of the high usage and large complex circuitry required to carry out arithmetic operations used in applications such as digital signal processing, there has been a great focus on increasing the efficiency of computer arithmetic circuitry. A key player in the realm of computer arithmetic is the digital multiplier and because of its size and power consumption, it has moved to the forefront of today\u27s research. A digital reconfigurable multiplier architecture will be introduced. Regulated by a 2-bit control signal, the multiplier is capable of double and single precision multiplication, as well as fault tolerant and dual throughput single precision execution. The architecture proposed in this thesis is centered on a recursive multiplication algorithm, where a large multiplication is carried out using recursions of simpler submultiplier modules. Within each sub-multiplier module, instead of carry save adder arrays, 4:2 compressor rows are utilized for partial product reduction, which present greater efficiency, thus result in lower delay and power consumption of the whole multiplier. In addition, a study of various digital logic circuit styles are initially presented, and then three different designs of 4:2 compressor in Domino Logic are presented and simulation results confirm the property of proposed design in terms of delay, power consumption and operation frequenc

Scholarship at UWindsor

A versatile, scalable, and open memory architecture in CMOS 0.18 μm

Author: Leboeuf Karl
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2008
Field of study

A lookup table is a permanent memory storate element in which every stored value corresponds to a unique address. Range addressable lookup tables differ in that every stored value corresponds to a range of addresses. This type of memory has important applications in a recently proposed central processing unit which employs a multi-digit logarithmic number system that is well suited for digital signal processing applications. This thesis details the work done to improve range addressable lookup tables in terms of operating speed and area utilization. Two range addressable lookup table designs are proposed. Ideal design parameters are determined. An integrated circuit test platform is proposed to determine the real-world ability of these lookup tables. A case study exploring how non-linear functions can be approximated with range addressable lookup tables is presented

Scholarship at UWindsor

A Structured Design Methodology for High Performance VLSI Arrays

Author
Publication venue
Publication date: 01/01/2012
Field of study

abstract: The geometric growth in the integrated circuit technology due to transistor scaling also with system-on-chip design strategy, the complexity of the integrated circuit has increased manifold. Short time to market with high reliability and performance is one of the most competitive challenges. Both custom and ASIC design methodologies have evolved over the time to cope with this but the high manual labor in custom and statistic design in ASIC are still causes of concern. This work proposes a new circuit design strategy that focuses mostly on arrayed structures like TLB, RF, Cache, IPCAM etc. that reduces the manual effort to a great extent and also makes the design regular, repetitive still achieving high performance. The method proposes making the complete design custom schematic but using the standard cells. This requires adding some custom cells to the already exhaustive library to optimize the design for performance. Once schematic is finalized, the designer places these standard cells in a spreadsheet, placing closely the cells in the critical paths. A Perl script then generates Cadence Encounter compatible placement file. The design is then routed in Encounter. Since designer is the best judge of the circuit architecture, placement by the designer will allow achieve most optimal design. Several designs like IPCAM, issue logic, TLB, RF and Cache designs were carried out and the performance were compared against the fully custom and ASIC flow. The TLB, RF and Cache were the part of the HEMES microprocessor.Dissertation/ThesisPh.D. Electrical Engineering 201

ASU Digital Repository

STUDY OF SINGLE-EVENT EFFECTS ON DIGITAL SYSTEMS

Author: Wang Haibin
Publication venue: 'University of Saskatchewan Library'
Publication date
Field of study

Microelectronic devices and systems have been extensively utilized in a variety of radiation environments, ranging from the low-earth orbit to the ground level. A high-energy particle from such an environment may cause voltage/current transients, thereby inducing Single Event Effect (SEE) errors in an Integrated Circuit (IC). Ever since the first SEE error was reported in 1975, this community has made tremendous progress in investigating the mechanisms of SEE and exploring radiation tolerant techniques. However, as the IC technology advances, the existing hardening techniques have been rendered less effective because of the reduced spacing and charge sharing between devices. The Semiconductor Industry Association (SIA) roadmap has identified radiation-induced soft errors as the major threat to the reliable operation of electronic systems in the future. In digital systems, hardening techniques of their core components, such as latches, logic, and clock network, need to be addressed. Two single event tolerant latch designs taking advantage of feedback transistors are presented and evaluated in both single event resilience and overhead. These feedback transistors are turned OFF in the hold mode, thereby yielding a very large resistance. This, in turn, results in a larger feedback delay and higher single event tolerance. On the other hand, these extra transistors are turned ON when the cell is in the write mode. As a result, no significant write delay is introduced. Both designs demonstrate higher upset threshold and lower cross-section when compared to the reference cells. Dynamic logic circuits have intrinsic single event issues in each stage of the operations. The worst case occurs when the output is evaluated logic high, where the pull-up networks are turned OFF. In this case, the circuit fails to recover the output by pulling the output up to the supply rail. A capacitor added to the feedback path increases the node capacitance of the output and the feedback delay, thereby increasing the single event critical charge. Another differential structure that has two differential inputs and outputs eliminates single event upset issues at the expense of an increased number of transistors. Clock networks in advanced technology nodes may cause significant errors in an IC as the devices are more sensitive to single event strikes. Clock mesh is a widely used clocking scheme in a digital system. It was fabricated in a 28nm technology and evaluated through the use of heavy ions and laser irradiation experiments. Superior resistance to radiation strikes was demonstrated during these tests. In addition to mitigating single event issues by using hardened designs, built-in current sensors can be used to detect single event induced currents in the n-well and, if implemented, subsequently execute fault correction actions. These sensors were simulated and fabricated in a 28nm CMOS process. Simulation, as well as, experimental results, substantiates the validity of this sensor design. This manifests itself as an alternative to existing hardening techniques. In conclusion, this work investigates single event effects in digital systems, especially those in deep-submicron or advanced technology nodes. New hardened latch, dynamic logic, clock, and current sensor designs have been presented and evaluated. Through the use of these designs, the single event tolerance of a digital system can be achieved at the expense of varying overhead in terms of area, power, and delay

eCommons@USASK

University of Saskatchewan Research Archive

Asynchronous design of a multi-dimensional logarithmic number system processor for digital hearing instruments.

Author: Wu Jiansong
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2004
Field of study

This thesis presents an asynchronous Multi-Dimensional Logarithmic Number System (MDLNS) processor that exhibits very low power dissipation. The target application is for a hearing instrument DSP. The MDLNS is a newly developed number system that has the advantage of reducing hardware complexity compared to the classical Logarithmic Number System (LNS). A synchronous implementation of a 2-digit 2DLNS filterbank, using the MDLNS to construct a FIR filterbank, has successfully proved that this novel number representation can benefit this digital hearing instrument application in the requirement of small size and low power. In this thesis we demonstrate that the combination of using the MDLNS, along with an asynchronous design methodology, produces impressive power savings compared to the previous synchronous design. A 4-phase bundled-data full-handshaking protocol is applied to the asynchronous control design. We adopt the Differential Cascade Voltage Switch Logic (DCVSL) circuit family for the design of the computation cells in this asynchronous MDLNS processor. Besides the asynchronous design methodology, we also use finite ring calculations to reduce adder bit-width to provide improvements compared to the previous MDLNS filterbank architecture. Spectre power simulation results from simulations of this asynchronous MDLNS processor demonstrate that over 70 percent power savings have been achieved compared to the synchronous design. This full-custom asynchronous MDLNS processor has been submitted for fabrication in the TSMC 0.18mum CMOS technology. A further contribution in this thesis is the development of a novel synchronizing method of design for testability (DfT), which is offered as a possible solution for asynchronous DfT methods.Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .W85. Source: Masters Abstracts International, Volume: 43-01, page: 0288. Advisers: G. A. Jullien; W. C. Miller. Thesis (M.A.Sc.)--University of Windsor (Canada), 2004

Scholarship at UWindsor

Power Reductions with Energy Recovery Using Resonant Topologies

Author: Bezzam Ignatius S.A.
Publication venue: Scholar Commons
Publication date: 05/05/2015
Field of study

The problem of power densities in system-on-chips (SoCs) and processors has become more exacerbated recently, resulting in high cooling costs and reliability issues. One of the largest components of power consumption is the low skew clock distribution network (CDN), driving large load capacitance. This can consume as much as 70% of the total dynamic power that is lost as heat, needing elaborate sensing and cooling mechanisms. To mitigate this, resonant clocking has been utilized in several applications over the past decade. An improved energy recovering reconfigurable generalized series resonance (GSR) solution with all the critical support circuitry is developed in this work. This LC resonant clock driver is shown to save about 50% driver power (\u3e40% overall), on a 22nm process node and has 50% less skew than a non-resonant driver at 2GHz. It can operate down to 0.2GHz to support other energy savings techniques like dynamic voltage and frequency scaling (DVFS). As an example, GSR can be configured for the simpler pulse series resonance (PSR) operation to enable further power saving for double data rate (DDR) applications, by using de-skewing latches instead of flip-flop banks. A PSR based subsystem for 40% savings in clocking power with 40% driver active area reduction xii is demonstrated. This new resonant driver generates tracking pulses at each transition of clock for dual edge operation across DVFS. PSR clocking is designed to drive explicit-pulsed latches with negative setup time. Simulations using 45nm IBM/PTM device and interconnect technology models, clocking 1024 flip-flops show the reductions, compared to non-resonant clocking. DVFS range from 2GHz/1.3V to 200MHz/0.5V is obtained. The PSR frequency is set \u3e3× the clock rate, needing only 1/10th the inductance of prior-art LC resonance schemes. The skew reductions are achieved without needing to increase the interconnect widths owing to negative set-up times. Applications in data circuits are shown as well with a 90nm example. Parallel resonant and split-driver non-resonant configurations as well are derived from GSR. Tradeoffs in timing performance versus power, based on theoretical analysis, are compared for the first time and verified. This enables synthesis of an optimal topology for a given application from the GSR

Scholar Commons - Santa Clara University

Floating point multiply/add unit for the M-machine node processor

Author: Hartman Daniel K
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/1996
Field of study

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996.Includes bibliographical references (p. 177).by Daniel K. Hartman.M.Eng

DSpace@MIT

Energy Efficient Design for Deep Sub-micron CMOS VLSIs

Author: Elgebaly Mohamed
Publication venue: 'University of Waterloo'
Publication date: 01/01/2005
Field of study

Over the past decade, low power, energy efficient VLSI design has been the focal point of active research and development. The rapid technology scaling, the growing integration capacity, and the mounting active and leakage power dissipation are contributing to the growing complexity of modern VLSI design. Careful power planning on all design levels is required. This dissertation tackles the low-power, low-energy challenges in deep sub-micron technologies on the architecture and circuit levels. Voltage scaling is one of the most efficient ways for reducing power and energy. For ultra-low voltage operation, a new circuit technique which allows bulk CMOS circuits to work in the sub-0. 5V supply territory is presented. The threshold voltage of the slow PMOS transistor is controlled dynamically to get a lower threshold voltage during the active mode. Due to the reduced threshold voltage, switching speed becomes faster while active leakage current is increased. A technique to dynamically manage active leakage current is presented. Energy reduction resulting from using the proposed structure is demonstrated through simulations of different circuits with different levels of complexity. As technology scales, the mounting leakage current and degraded noise immunity impact performance especially that of high performance dynamic circuits. Dual threshold technology shows a good potential for leakage reduction while meeting performance goals. A model for optimally selecting threshold voltages and transistor sizes in wide fan-in dynamic circuits is presented. On the circuit level, a novel circuit level technique which handles the trade-off between noise immunity and energy dissipation for wide fan-in dynamic circuits is presented. Energy efficiency of the proposed wide fan-in dynamic circuit is further enhanced through efficient low voltage operation. Another direct consequence of technology scaling is the growing impact of interconnect parasitics and process variations on performance. Traditionally, worst case process, parasitics, and environmental conditions are considered. Designing for worst case guarantees a fail-safe operation but requires a large delay and voltage margins. This large margin can be recovered if the design can adapt to the actual silicon conditions. Dynamic voltage scaling is considered a key enabler in reducing such margin. An on-chip process identifier to recover the margin required due to process variations is described. The proposed architecture adjusts supply voltage using a hybrid between the one-time voltage setting and the continuous monitoring modes of operation. The interconnect impact on delay is minimized through a novel adaptive voltage scaling architecture. The proposed system recovers the large delay and voltage margins required by conventional systems by closely tracking the actual critical path at anytime. By tracking the actual critical path, the proposed system is robust and more energy efficient compared to both the conventional open-loop and closed-loop systems

University of Waterloo's Institutional Repository

Low power predictable memory and processing architectures

Author: Chen Jiaoyan
Publication venue: 'University College Cork'
Publication date: 01/01/2013
Field of study

Great demand in power optimized devices shows promising economic potential and draws lots of attention in industry and research area. Due to the continuously shrinking CMOS process, not only dynamic power but also static power has emerged as a big concern in power reduction. Other than power optimization, average-case power estimation is quite significant for power budget allocation but also challenging in terms of time and effort. In this thesis, we will introduce a methodology to support modular quantitative analysis in order to estimate average power of circuits, on the basis of two concepts named Random Bag Preserving and Linear Compositionality. It can shorten simulation time and sustain high accuracy, resulting in increasing the feasibility of power estimation of big systems. For power saving, firstly, we take advantages of the low power characteristic of adiabatic logic and asynchronous logic to achieve ultra-low dynamic and static power. We will propose two memory cells, which could run in adiabatic and non-adiabatic mode. About 90% dynamic power can be saved in adiabatic mode when compared to other up-to-date designs. About 90% leakage power is saved. Secondly, a novel logic, named Asynchronous Charge Sharing Logic (ACSL), will be introduced. The realization of completion detection is simplified considerably. Not just the power reduction improvement, ACSL brings another promising feature in average power estimation called data-independency where this characteristic would make power estimation effortless and be meaningful for modular quantitative average case analysis. Finally, a new asynchronous Arithmetic Logic Unit (ALU) with a ripple carry adder implemented using the logically reversible/bidirectional characteristic exhibiting ultra-low power dissipation with sub-threshold region operating point will be presented. The proposed adder is able to operate multi-functionally

Irish Universities

Cork Open Research Archive