373 research outputs found

    Power Reduction Techniques in Clock Distribution Networks with Emphasis on LC Resonant Clocking

    Get PDF
    In this thesis we propose a set of independent techniques in the overall concept of LC resonant clocking where each technique reduces power consumption and improve system performance. Low-power design is becoming a crucial design objective due to the growing demand on portable applications and the increasing difficulties in cooling and heat removal. The clock distribution network delivers the clock signal which acts as a reference to all sequential elements in the synchronous system. The clock distribution network consumes a considerable amount of power in synchronous digital systems. Resonant clocking is an emerging promising technique to reduce the power of the clock network. The inductor used in resonant clocking enables the conversion of the electric energy stored on the clock capacitance to magnetic energy in the inductor and vice versa. In this thesis, the concept of the slack in the clock skew has been extended for an LC fully-resonant clock distribution network. This extra slack in comparison to standard clock distribution networks can be used to reduce routing complexity, achieve reduction in wire elongation, total wire length, and power consumption. Simulation results illustrate that by utilizing the proposed approach, an average reduction of 53% in the number of wire elongations and 11% reduction in total wire length can be achieved. A dual-edge clocking scheme introduced in the literature to enable the operation of the flip-flop at the rising- and falling edges of the clock has been modified. The interval by which the charging elements in the flip-flop are being switched-on was reduced causing a reduction in power consumption. Simulating the flip-flop in STMicroelectronics 90-nm technology shows correct functionality of the Sense Amplifier flip-flop with a resonant clock signal of 500 MHz and a throughput of 1 GHz under process, voltage, and temperature (PVT) variations. Modeling the resonant system with the proposed flip-flop illustrates that dual-edge compared to single-edge triggering can achieve up to 58% reduction in power consumption when the clock capacitance is the dominating factor. The application of low-swing clocking to LC resonant clock distribution network has been investigated on-chip. The proposed low-swing resonant clocking scheme operates with one voltage supply and does not require an additional supply voltage. The Differential Conditional Capturing flip-flop introduced in the literature was modified to operate with a low-swing sinusoidal clock. Low-swing resonant clocking achieved around 5.8% reduction in total power with 5.7% area overhead. Modeling the clock network with the proposed flip-flop illustrates that low-swing clocking can achieve up to 58% reduction in the power consumption of the resonant clock. An analytical approach was introduced to estimate the required driver strength in the clock generator. Using the proposed approach early in the design stage reduces area and power overhead by eliminating the need for programmable switches in the driving circuit

    Design of variation-tolerant synchronizers for multiple clock and voltage domains

    Get PDF
    PhD ThesisParametric variability increasingly affects the performance of electronic circuits as the fabrication technology has reached the level of 32nm and beyond. These parameters may include transistor Process parameters (such as threshold voltage), supply Voltage and Temperature (PVT), all of which could have a significant impact on the speed and power consumption of the circuit, particularly if the variations exceed the design margins. As systems are designed with more asynchronous protocols, there is a need for highly robust synchronizers and arbiters. These components are often used as interfaces between communication links of different timing domains as well as sampling devices for asynchronous inputs coming from external components. These applications have created a need for new robust designs of synchronizers and arbiters that can tolerate process, voltage and temperature variations. The aim of this study was to investigate how synchronizers and arbiters should be designed to tolerate parametric variations. All investigations focused mainly on circuit-level and transistor level designs and were modeled and simulated in the UMC90nm CMOS technology process. Analog simulations were used to measure timing parameters and power consumption along with a “Monte Carlo” statistical analysis to account for process variations. Two main components of synchronizers and arbiters were primarily investigated: flip-flop and mutual-exclusion element (MUTEX). Both components can violate the input timing conditions, setup and hold window times, which could cause metastability inside their bistable elements and possibly end in failures. The mean-time between failures is an important reliability feature of any synchronizer delay through the synchronizer. The MUTEX study focused on the classical circuit, in addition to a number of tolerance, based on increasing internal gain by adding current sources, reducing the capacitive loading, boosting the transconductance of the latch, compensating the existing Miller capacitance, and adding asymmetry to maneuver the metastable point. The results showed that some circuits had little or almost no improvements, while five techniques showed significant improvements by reducing τ and maintaining high tolerance. Three design approaches are proposed to provide variation-tolerant synchronizers. wagging synchronizer proposed to First, the is significantly increase reliability over that of the conventional two flip-flop synchronizer. The robustness of the wagging technique can be enhanced by using robust τ latches or adding one more cycle of synchronization. The second approach is the Metastability Auto-Detection and Correction (MADAC) latch which relies on swiftly detecting a metastable event and correcting it by enforcing the previously stored logic value. This technique significantly reduces the resolution time down from uncertain synchronization technique is proposed to transfer signals between Multiple- Voltage Multiple-Clock Domains (MVD/MCD) that do not require conventional level-shifters between the domains or multiple power supplies within each domain. This interface circuit uses a synchronous set and feedback reset protocol which provides level-shifting and synchronization of all signals between the domains, from a wide range of voltage-supplies and clock frequencies. Overall, synchronizer circuits can tolerate variations to a greater extent by employing the wagging technique or using a MADAC latch, while MUTEX tolerance can suffice with small circuit modifications. Communication between MVD/MCD can be achieved by an asynchronous handshake without a need for adding level-shifters.The Saudi Arabian Embassy in London, Umm Al-Qura University, Saudi Arabi

    Ultra Low Power Digital Circuit Design for Wireless Sensor Network Applications

    Get PDF
    Ny forskning innenfor feltet trådløse sensornettverk åpner for nye og innovative produkter og løsninger. Biomedisinske anvendelser er blant områdene med størst potensial og det investeres i dag betydelige beløp for å bruke denne teknologien for å gjøre medisinsk diagnostikk mer effektiv samtidig som man åpner for fjerndiagnostikk basert på trådløse sensornoder integrert i et ”helsenett”. Målet er å forbedre tjenestekvalitet og redusere kostnader samtidig som brukerne skal oppleve forbedret livskvalitet som følge av økt trygghet og mulighet for å tilbringe mest mulig tid i eget hjem og unngå unødvendige sykehusbesøk og innleggelser. For å gjøre dette til en realitet er man avhengige av sensorelektronikk som bruker minst mulig energi slik at man oppnår tilstrekkelig batterilevetid selv med veldig små batterier. I sin avhandling ” Ultra Low power Digital Circuit Design for Wireless Sensor Network Applications” har PhD-kandidat Farshad Moradi fokusert på nye løsninger innenfor konstruksjon av energigjerrig digital kretselektronikk. Avhandlingen presenterer nye løsninger både innenfor aritmetiske og kombinatoriske kretser, samtidig som den studerer nye statiske minneelementer (SRAM) og alternative minnearkitekturer. Den ser også på utfordringene som oppstår når silisiumteknologien nedskaleres i takt med mikroprosessorutviklingen og foreslår løsninger som bidrar til å gjøre kretsløsninger mer robuste og skalerbare i forhold til denne utviklingen. De viktigste konklusjonene av arbeidet er at man ved å introdusere nye konstruksjonsteknikker både er i stand til å redusere energiforbruket samtidig som robusthet og teknologiskalerbarhet øker. Forskningen har vært utført i samarbeid med Purdue University og vært finansiert av Norges Forskningsråd gjennom FRINATprosjektet ”Micropower Sensor Interface in Nanometer CMOS Technology”

    Aika-digitaalimuunnin laajakaistaisiin aikapohjaisiin analogia-digitaalimuuntimiin

    Get PDF
    Modern deeply scaled semiconductor processes make the design of voltage-domain circuits increasingly challenging. On the contrary, the area and power consumption of digital circuits are improving with every new process node. Consequently, digital solutions are designed in place of their purely analog counterparts in applications such as analog-to-digital (A/D) conversion. Time-based analog-to-digital converters (ADC) employ digital-intensive architectures by processing analog quantities in time-domain. The quantization step of the time-based A/D-conversion is carried out by a time-to-digital converter (TDC). A free-running ring oscillator -based TDC design is presented for use in wideband time-based ADCs. The proposed architecture aims to maximize time resolution and full-scale range, and to achieve error resilient conversion performance with minimized power and area consumptions. The time resolution is maximized by employing a high-frequency multipath ring oscillator, and the full-scale range is extended using a high-speed gray counter. The error resilience is achieved by custom sense-amplifier -based sampling flip-flops, gray coded counter and a digital error correction algorithm for counter sampling error correction. The implemented design achieves up to 9-bit effective resolution at 250 MS/s with 4.3 milliwatt power consumption.Modernien puolijohdeteknologioiden skaalautumisen seurauksena jännitetason piirien suunnittelu tulee entistä haasteellisemmaksi. Toisaalta digitaalisten piirirakenteiden pinta-ala sekä tehonkulutus pienenevät prosessikehityksen myötä. Tästä syystä digitaalisia ratkaisuja suunnitellaan vastaavien puhtaasti analogisien rakenteiden tilalle. Analogia-digitaalimuunnos (A/D-muunnos) voidaan toteuttaa jännitetason sijaan aikatasossa käyttämällä aikapohjaisia A/D-muuntimia, jotka ovat rakenteeltaan pääosin digitaalisia. Kvantisointivaihe aikapohjaisessa A/D-muuntimessa toteutetaan aika-digitaalimuuntimella. Työ esittelee vapaasti oskilloivaan silmukkaoskillaattoriin perustuvan aika-digitaalimuuntimen, joka on suunniteltu käytettäväksi laajakaistaisessa aikapohjaisessa A/D-muuntimessa. Esitelty rakenne pyrkii maksimoimaan muuntimen aikaresoluution sekä muunnosalueen, sekä saavuttamaan virhesietoisen muunnostoiminnan minimoidulla tehon sekä pinta-alan kulutuksella. Aikaresoluutio on maksimoitu hyödyntämällä suuritaajuista monipolkuista silmukkaoskillaattoria, ja muunnosalue on maksimoitu nopealla Gray-koodi -laskuripiirillä. Muunnosprosessin virhesietoisuus on saavutettu toteuttamalla näytteistys herkillä kiikkuelementeillä, hyödyntämällä Gray-koodattua laskuria, sekä jälkiprosessoimalla laskurin näytteistetyt arvot virheenkorjausalgoritmilla. Esitelty muunnintoteutus saavuttaa 9 bitin efektiivisen resoluution 250 MS/s näytetaajuudella ja 4.3 milliwatin tehonkulutuksella

    Integrated Circuit Design for Radiation Sensing and Hardening.

    Full text link
    Beyond the 1950s, integrated circuits have been widely used in a number of electronic devices surrounding people’s lives. In addition to computing electronics, scientific and medical equipment have also been undergone a metamorphosis, especially in radiation related fields where compact and precision radiation detection systems for nuclear power plants, positron emission tomography (PET), and radiation hardened by design (RHBD) circuits for space applications fabricated in advanced manufacturing technologies are exposed to the non-negligible probability of soft errors by radiation impact events. The integrated circuit design for radiation measurement equipment not only leads to numerous advantages on size and power consumption, but also raises many challenges regarding the speed and noise to replace conventional design modalities. This thesis presents solutions to front-end receiver designs for radiation sensors as well as an error detection and correction method to microprocessor designs under the condition of soft error occurrence. For the first preamplifier design, a novel technique that enhances the bandwidth and suppresses the input current noise by using two inductors is discussed. With the dual-inductor TIA signal processing configuration, one can reduce the fabrication cost, the area overhead, and the power consumption in a fast readout package. The second front-end receiver is a novel detector capacitance compensation technique by using the Miller effect. The fabricated CSA exhibits minimal variation in the pulse shape as the detector capacitance is increased. Lastly, a modified D flip-flop is discussed that is called Razor-Lite using charge-sharing at internal nodes to provide a compact EDAC design for modern well-balanced processors and RHBD against soft errors by SEE.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/111548/1/iykwon_1.pd

    High-Speed and Low-Energy On-Chip Communication Circuits.

    Full text link
    Continuous technology scaling sharply reduces transistor delays, while fixed-length global wire delays have increased due to less wiring pitch with higher resistance and coupling capacitance. Due to this ever growing gap, long on-chip interconnects pose well-known latency, bandwidth, and energy challenges to high-performance VLSI systems. Repeaters effectively mitigate wire RC effects but do little to improve their energy costs. Moreover, the increased complexity and high level of integration requires higher wire densities, worsening crosstalk noise and power consumption of conventionally repeated interconnects. Such increasing concerns in global on-chip wires motivate circuits to improve wire performance and energy while reducing the number of repeaters. This work presents circuit techniques and investigation for high-performance and energy-efficient on-chip communication in the aspects of encoding, data compression, self-timed current injection, signal pre-emphasis, low-swing signaling, and technology mapping. The improved bus designs also consider the constraints of robust operation and performance/energy gains across process corners and design space. Measurement results from 5mm links on 65nm and 90nm prototype chips validate 2.5-3X improvement in energy-delay product.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/75800/1/jseo_1.pd

    A novel approach to embedding memory with crosstalk logic circuits for high performance applications

    Get PDF
    Title from PDF of title page viewed January 20, 2022Thesis advisor: Mostafizur RahmanVitaIncludes bibliographical references (page 33-36)Thesis (M.S.)--School of Computing and Engineering. University of Missouri--Kansas City, 2021One of the essential elements of computing is the memory element. Flip-flops form an integral part of a System-on-Chip (SoC) and consume most of the area on the die. To meet the high-speed performance demands by data-intensive applications like artificial intelligence, cloud computing, and machine learning, we intend to integrate memory with the logic to get built-in memory logic circuits that operate based on the crosstalk computing logic. These circuits are called Crosstalk Built-in Memory Logic (CBML) circuits which exploit the detrimental interconnect crosstalk and astutely turn this unwanted effect into a computing principle with embedded memory. By virtue of our novel CBML circuit technique, the logic is computed by the novel computing technique called Crosstalk Computing, and the result is stored intrinsically within these complex circuits. The stored values will be retained irrespective of the change in input until the next logic evaluation cycle. This neoteric embedding of memory in logic provides high-speed operation with a reduced number of transistors. In this work, we present how Crosstalk Computing can be leveraged to embed memory in logic. We have manifested the in-built memory feature of the complex CBML circuits using 16 nanometer (nm) PTM models in HSPICE. Benchmarking is performed with the equivalent static CMOS circuits to compare the number of transistors, performance, and power. It is observed that the number of transistors consumed by the CBML logic circuits like 3-input AND, OR, is 40% less than the equivalent CMOS circuits. The cascaded CBML 4-bit Full-Adder (the key element prevalent in Arithmetic circuits, e.g., ALU, Counters, etc.,) is up to 46% less, and performance is improved by 27% than the equivalent CMOS circuits. This circuit serves as an example of a large-scale CBML circuit. Also, the performance improvement achieved by other circuits such as 3-input AND and the CARRY logic is up to 60% along with a 40% reduction in the number of transistors. The qualitative analysis with the existing flip-flop architectures shows that the transistor count reduction of the CBML circuits is around 45% less than the architectures like Semi-dynamic Single-Phase Pulsed FF (SDFF), Dual dynamic node FF (DDFF) which embed logic into their flip-flop circuits. Additionally, these existing architectures do not have capability to embed complex circuits like full adder into a single flip-flop circuit. Hence, CBML circuits have the potential to pave the way for special high-speed macros with specifically engineered structures.Introduction and Motivation -- Basic Flip-Flop Memory -- Literature survey on Existing Flip-Flop Architectures With Embedded Logic -- Overview: Basic Crosstalk Circuits -- Crosstalk Logic With Embedded Memory Feature -- Comparison and Benchmarking -- Conclusion and Future Wor

    Robust low-power digital circuit design in nano-CMOS technologies

    Get PDF
    Device scaling has resulted in large scale integrated, high performance, low-power, and low cost systems. However the move towards sub-100 nm technology nodes has increased variability in device characteristics due to large process variations. Variability has severe implications on digital circuit design by causing timing uncertainties in combinational circuits, degrading yield and reliability of memory elements, and increasing power density due to slow scaling of supply voltage. Conventional design methods add large pessimistic safety margins to mitigate increased variability, however, they incur large power and performance loss as the combination of worst cases occurs very rarely. In-situ monitoring of timing failures provides an opportunity to dynamically tune safety margins in proportion to on-chip variability that can significantly minimize power and performance losses. We demonstrated by simulations two delay sensor designs to detect timing failures in advance that can be coupled with different compensation techniques such as voltage scaling, body biasing, or frequency scaling to avoid actual timing failures. Our simulation results using 45 nm and 32 nm technology BSIM4 models indicate significant reduction in total power consumption under temperature and statistical variations. Future work involves using dual sensing to avoid useless voltage scaling that incurs a speed loss. SRAM cache is the first victim of increased process variations that requires handcrafted design to meet area, power, and performance requirements. We have proposed novel 6 transistors (6T), 7 transistors (7T), and 8 transistors (8T)-SRAM cells that enable variability tolerant and low-power SRAM cache designs. Increased sense-amplifier offset voltage due to device mismatch arising from high variability increases delay and power consumption of SRAM design. We have proposed two novel design techniques to reduce offset voltage dependent delays providing a high speed low-power SRAM design. Increasing leakage currents in nano-CMOS technologies pose a major challenge to a low-power reliable design. We have investigated novel segmented supply voltage architecture to reduce leakage power of the SRAM caches since they occupy bulk of the total chip area and power. Future work involves developing leakage reduction methods for the combination logic designs including SRAM peripherals

    Robust Design of Variation-Sensitive Digital Circuits

    Get PDF
    The nano-age has already begun, where typical feature dimensions are smaller than 100nm. The operating frequency is expected to increase up to 12 GHz, and a single chip will contain over 12 billion transistors in 2020, as given by the International Technology Roadmap for Semiconductors (ITRS) initiative. ITRS also predicts that the scaling of CMOS devices and process technology, as it is known today, will become much more difficult as the industry advances towards the 16nm technology node and further. This aggressive scaling of CMOS technology has pushed the devices to their physical limits. Design goals are governed by several factors other than power, performance and area such as process variations, radiation induced soft errors, and aging degradation mechanisms. These new design challenges have a strong impact on the parametric yield of nanometer digital circuits and also result in functional yield losses in variation-sensitive digital circuits such as Static Random Access Memory (SRAM) and flip-flops. Moreover, sub-threshold SRAM and flip-flops circuits, which are aggravated by the strong demand for lower power consumption, show larger sensitivity to these challenges which reduces their robustness and yield. Accordingly, it is not surprising that the ITRS considers variability and reliability as the most challenging obstacles for nanometer digital circuits robust design. Soft errors are considered one of the main reliability and robustness concerns in SRAM arrays in sub-100nm technologies due to low operating voltage, small node capacitance, and high packing density. The SRAM arrays soft errors immunity is also affected by process variations. We develop statistical design-oriented soft errors immunity variations models for super-threshold and sub-threshold SRAM cells accounting for die-to-die variations and within-die variations. This work provides new design insights and highlights the important design knobs that can be used to reduce the SRAM cells soft errors immunity variations. The developed models are scalable, bias dependent, and only require the knowledge of easily measurable parameters. This makes them useful in early design exploration, circuit optimization as well as technology prediction. The derived models are verified using Monte Carlo SPICE simulations, referring to an industrial hardware-calibrated 65nm CMOS technology. The demand for higher performance leads to very deep pipelining which means that hundreds of thousands of flip-flops are required to control the data flow under strict timing constraints. A violation of the timing constraints at a flip-flop can result in latching incorrect data causing the overall system to malfunction. In addition, the flip-flops power dissipation represents a considerable fraction of the total power dissipation. Sub-threshold flip-flops are considered the most energy efficient solution for low power applications in which, performance is of secondary importance. Accordingly, statistical gate sizing is conducted to different flip-flops topologies for timing yield improvement of super-threshold flip-flops and power yield improvement of sub-threshold flip-flops. Following that, a comparative analysis between these flip-flops topologies considering the required overhead for yield improvement is performed. This comparative analysis provides useful recommendations that help flip-flops designers on selecting the best flip-flops topology that satisfies their system specifications while taking the process variations impact and robustness requirements into account. Adaptive Body Bias (ABB) allows the tuning of the transistor threshold voltage, Vt, by controlling the transistor body voltage. A forward body bias reduces Vt, increasing the device speed at the expense of increased leakage power. Alternatively, a reverse body bias increases Vt, reducing the leakage power but slowing the device. Therefore, the impact of process variations is mitigated by speeding up slow and less leaky devices or slowing down devices that are fast and highly leaky. Practically, the implementation of the ABB is desirable to bias each device in a design independently, to mitigate within-die variations. However, supplying so many separate voltages inside a die results in a large area overhead. On the other hand, using the same body bias for all devices on the same die limits its capability to compensate for within-die variations. Thus, the granularity level of the ABB scheme is a trade-off between the within-die variations compensation capability and the associated area overhead. This work introduces new ABB circuits that exhibit lower area overhead by a factor of 143X than that of previous ABB circuits. In addition, these ABB circuits are resolution free since no digital-to-analog converters or analog-to-digital converters are required on their implementations. These ABB circuits are adopted to high performance critical paths, emulating a real microprocessor architecture, for process variations compensation and also adopted to SRAM arrays, for Negative Bias Temperature Instability (NBTI) aging and process variations compensation. The effectiveness of the new ABB circuits is verified by post layout simulation results and test chip measurements using triple-well 65nm CMOS technology. The highly capacitive nodes of wide fan-in dynamic circuits and SRAM bitlines limit the performance of these circuits. In addition, process variations mitigation by statistical gate sizing increases this capacitance further and fails in achieving the target yield improvement. We propose new negative capacitance circuits that reduce the overall parasitic capacitance of these highly capacitive nodes. These negative capacitance circuits are adopted to wide fan-in dynamic circuits for timing yield improvement up to 99.87% and to SRAM arrays for read access yield improvement up to 100%. The area and power overheads of these new negative capacitance circuits are amortized over the large die area of the microprocessor and the SRAM array. The effectiveness of the new negative capacitance circuits is verified by post layout simulation results and test chip measurements using 65nm CMOS technology
    corecore