Search CORE

285 research outputs found

CMOS array design automation techniques

Author: Feller A.
Lombardi T.
Noto R.
Ramondetta P.
Publication venue
Publication date
Field of study

A low cost, quick turnaround technique for generating custom metal oxide semiconductor arrays using the standard cell approach was developed, implemented, tested and validated. Basic cell design topology and guidelines are defined based on an extensive analysis that includes circuit, layout, process, array topology and required performance considerations particularly high circuit speed

Near-Threshold Computing: Past, Present, and Future.

Author: Pinckney Nathaniel Ross
Publication venue
Publication date
Field of study

Transistor threshold voltages have stagnated in recent years, deviating from constant-voltage scaling theory and directly limiting supply voltage scaling. To overcome the resulting energy and power dissipation barriers, energy efficiency can be improved through aggressive voltage scaling, and there has been increased interest in operating at near-threshold computing (NTC) supply voltages. In this region sizable energy gains are achieved with moderate performance loss, some of which can be regained through parallelism. This thesis first provides a methodical definition of how near to threshold is "near threshold" and continues with an in-depth examination of NTC across past, present, and future CMOS technologies. By systematically defining near-threshold, the trends and tradeoffs are analyzed, lending insight in how best to design and optimize near-threshold systems. NTC works best for technologies that feature good circuit delay scalability, therefore technologies without strong short-channel effects. Early planar technologies (prior to 90nm or so) featured good circuit scalability (8x energy gains), but lacked area in which to add cores for parallelization. Recent planar nodes (32nm – 20nm) feature more area for cores but suffer from poor delay scalability, and so are not well-suited for NTC (4x energy gains). The switch to FinFET CMOS technology allows for a return to strong voltage scalability (8x gain), reversing trends seen in planar technologies, while dark silicon has created an opportunity to add cores for parallelization. Improved FinFET voltage scalability even allows for latency reduction of a single task, as long as the task is sufficiently parallelizable (< 10% serial code). Finally, we will look at a technique for fast voltage boosting, called Shortstop, in which a core's operating voltage is raised in 10s of cycles. Shortstop can be used to quickly respond to single-threaded performance demands of a near-threshold system by leveraging the innate parasitic inductance of a dedicated dirty supply rail, further improving energy efficiency. The technique is demonstrated in a wirebond implementation and is able to boost a core up to 1.8x faster than a header-based approach, while reducing supply droop by 2-7x. An improved flip-chip architecture is also proposed.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/113600/1/npfet_1.pd

Recommended from our members

Cross-Layer Pathfinding for Off-Chip Interconnects

Author: Srinivas Vaishnav
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

Off-chip interconnects for integrated circuits (ICs) today induce a diverse design space, spanning many different applications that require transmission of data at various bandwidths, latencies and link lengths. Off-chip interconnect design solutions are also variously sensitive to system performance, power and cost metrics, while also having a strong impact on these metrics. The costs associated with off-chip interconnects include die area, package (PKG) and printed circuit board (PCB) area, technology and bill of materials (BOM). Choices made regarding off-chip interconnects are fundamental to product definition, architecture, design implementation and technology enablement. Given their cross-layer impact, it is imperative that a cross-layer approach be employed to architect and analyze off-chip interconnects up front, so that a top-down design flow can comprehend the cross-layer impacts and correctly assess the system performance, power and cost tradeoffs for off-chip interconnects. Chip architects are not exposed to all the tradeoffs at the physical and circuit implementation or technology layers, and often lack the tools to accurately assess off-chip interconnects. Furthermore, the collaterals needed for a detailed analysis are often lacking when the chip is architected; these include circuit design and layout, PKG and PCB layout, and physical floorplan and implementation. To address the need for a framework that enables architects to assess the system-level impact of off-chip interconnects, this thesis presents power-area-timing (PAT) models for off-chip interconnects, optimization and planning tools with the appropriate abstraction using these PAT models, and die/PKG/PCB co-design methods that help expose the off-chip interconnect cross-layer metrics to the die/PKG/PCB design flows. Together, these models, tools and methods enable cross-layer optimization that allows for a top-down definition and exploration of the design space and helps converge on the correct off-chip interconnect implementation and technology choice. The tools presented cover off-chip memory interfaces for mobile and server products, silicon photonic interfaces, 2.5D silicon interposers and 3D through-silicon vias (TSVs). The goal of the cross-layer framework is to assess the key metrics of the interconnect (such as timing, latency, active/idle/sleep power, and area/cost) at an appropriate level of abstraction by being able to do this across layers of the design flow. In additional to signal interconnect, this thesis also explores the need for such cross-layer pathfinding for power distribution networks (PDN), where the system-on-chip (SoC) floorplan and pinmap must be optimized before the collateral layouts for PDN analysis are ready. Altogether, the developed cross-layer pathfinding methodology for off-chip interconnects enables more rapid and thorough exploration of a vast design space of off-chip parallel and serial links, inter-die and inter-chiplet links and silicon photonics. Such exploration will pave the way for off-chip interconnect technology enablement that is optimized for system needs. The basis of the framework can be extended to cover other interconnect technology as well, since it fundamentally relates to system-level metrics that are common to all off-chip interconnects

eScholarship - University of California

Low-power spatial computing using dynamic threshold devices

Author: Beckett P
Publication venue: IEEE (Piscataway, NJ)
Publication date: 01/01/2005
Field of study

Asynchronous spatial computing systems exhibit only localized communication, their overall data-flow being controlled by handshaking. It is therefore straightforward to determine when a particular part of such a system is active. We show that using thin-body double-gate fully depleted SOI transistors, the shift in threshold voltage that can be produced by modulating the back-gate bias is sufficient to reduce subthreshold leakage power by a factor of more than 104 in typical circuits. Using TBFDSOI devices in spatial computing architectures will allow overall power to be greatly reduced while maintaining high performance

The Topological Processor for the future ATLAS Level-1 Trigger: from design to commissioning

Author: Simioni Eduard
Publication venue
Publication date: 01/01/2014
Field of study

The ATLAS detector at LHC will require a Trigger system to efficiently select events down to a manageable event storage rate of about 400 Hz. By 2015 the LHC instantaneous luminosity will be increased up to 3 x 10^34 cm-2s-1, this represents an unprecedented challenge faced by the ATLAS Trigger system. To cope with the higher event rate and efficiently select relevant events from a physics point of view, a new element will be included in the Level-1 Trigger scheme after 2015: the Topological Processor (L1Topo). The L1Topo system, currently developed at CERN, will consist initially of an ATCA crate and two L1Topo modules. A high density opto-electroconverter (AVAGO miniPOD) drives up to 1.6 Tb/s of data from the calorimeter and muon detectors into two high-end FPGA (Virtex7-690), to be processed in about 200 ns. The design has been optimized to guarantee excellent signal in- tegrity of the high-speed links and low latency data transmission on the Real Time Data Path (RTDP). The L1Topo receives data in a standalone protocol from the calorimeters and muon detectors to be processed into several VHDL topological algorithms. Those algorithms perform geometrical cuts, correlations and calculate complex observables such as the invariant mass. The output of such topological cuts is sent to the Central Trigger Processor. This talk focuses on the relevant high-density design characteristic of L1Topo, which allows several hundreds optical links to processed (up to 13 Gb/s each) using ordinary PCB material. Relevant test results performed on the L1Topo prototypes to characterize the high-speed links latency (eye diagram, bit error rate, margin analysis) and the logic resource utilization of the algorithms are discussed.Comment: 5 pages, 6 figure

arXiv.org e-Print Archive

Analysis and Design of Resilient VLSI Circuits

Author: Garg Rajesh
Publication venue
Publication date
Field of study

The reliable operation of Integrated Circuits (ICs) has become increasingly difficult to achieve in the deep sub-micron (DSM) era. With continuously decreasing device feature sizes, combined with lower supply voltages and higher operating frequencies, the noise immunity of VLSI circuits is decreasing alarmingly. Thus, VLSI circuits are becoming more vulnerable to noise effects such as crosstalk, power supply variations and radiation-induced soft errors. Among these noise sources, soft errors (or error caused by radiation particle strikes) have become an increasingly troublesome issue for memory arrays as well as combinational logic circuits. Also, in the DSM era, process variations are increasing at an alarming rate, making it more difficult to design reliable VLSI circuits. Hence, it is important to efficiently design robust VLSI circuits that are resilient to radiation particle strikes and process variations. The work presented in this dissertation presents several analysis and design techniques with the goal of realizing VLSI circuits which are tolerant to radiation particle strikes and process variations. This dissertation consists of two parts. The first part proposes four analysis and two design approaches to address radiation particle strikes. The analysis techniques for the radiation particle strikes include: an approach to analytically determine the pulse width and the pulse shape of a radiation induced voltage glitch in combinational circuits, a technique to model the dynamic stability of SRAMs, and a 3D device-level analysis of the radiation tolerance of voltage scaled circuits. Experimental results demonstrate that the proposed techniques for analyzing radiation particle strikes in combinational circuits and SRAMs are fast and accurate compared to SPICE. Therefore, these analysis approaches can be easily integrated in a VLSI design flow to analyze the radiation tolerance of such circuits, and harden them early in the design flow. From 3D device-level analysis of the radiation tolerance of voltage scaled circuits, several non-intuitive observations are made and correspondingly, a set of guidelines are proposed, which are important to consider to realize radiation hardened circuits. Two circuit level hardening approaches are also presented to harden combinational circuits against a radiation particle strike. These hardening approaches significantly improve the tolerance of combinational circuits against low and very high energy radiation particle strikes respectively, with modest area and delay overheads. The second part of this dissertation addresses process variations. A technique is developed to perform sensitizable statistical timing analysis of a circuit, and thereby improve the accuracy of timing analysis under process variations. Experimental results demonstrate that this technique is able to significantly reduce the pessimism due to two sources of inaccuracy which plague current statistical static timing analysis (SSTA) tools. Two design approaches are also proposed to improve the process variation tolerance of combinational circuits and voltage level shifters (which are used in circuits with multiple interacting power supply domains), respectively. The variation tolerant design approach for combinational circuits significantly improves the resilience of these circuits to random process variations, with a reduction in the worst case delay and low area penalty. The proposed voltage level shifter is faster, requires lower dynamic power and area, has lower leakage currents, and is more tolerant to process variations, compared to the best known previous approach. In summary, this dissertation presents several analysis and design techniques which significantly augment the existing work in the area of resilient VLSI circuit design

Operational experience, improvements, and performance of the CDF Run II silicon vertex detector

The Collider Detector at Fermilab (CDF) pursues a broad physics program at Fermilab's Tevatron collider. Between Run II commissioning in early 2001 and the end of operations in September 2011, the Tevatron delivered 12 fb-1 of integrated luminosity of p-pbar collisions at sqrt(s)=1.96 TeV. Many physics analyses undertaken by CDF require heavy flavor tagging with large charged particle tracking acceptance. To realize these goals, in 2001 CDF installed eight layers of silicon microstrip detectors around its interaction region. These detectors were designed for 2--5 years of operation, radiation doses up to 2 Mrad (0.02 Gy), and were expected to be replaced in 2004. The sensors were not replaced, and the Tevatron run was extended for several years beyond its design, exposing the sensors and electronics to much higher radiation doses than anticipated. In this paper we describe the operational challenges encountered over the past 10 years of running the CDF silicon detectors, the preventive measures undertaken, and the improvements made along the way to ensure their optimal performance for collecting high quality physics data. In addition, we describe the quantities and methods used to monitor radiation damage in the sensors for optimal performance and summarize the detector performance quantities important to CDF's physics program, including vertex resolution, heavy flavor tagging, and silicon vertex trigger performance.Comment: Preprint accepted for publication in Nuclear Instruments and Methods A (07/31/2013

arXiv.org e-Print Archive

eScholarship - University of California

Oxford University Research Archive

Generadores de pulso del orden de nanosegundos para control de calidad y diagnosis de las cámaras de telescopios Cherenkov

Author: Vegas Azcárate Ignacio
Publication venue: 'Universidad Complutense de Madrid (UCM)'
Publication date: 22/01/2016
Field of study

Tesis inédita de la Universidad Complutense de Madrid, Facultad de Ciencias Físicas, Departamento de Física Aplicada III (Electricidad y Electrónica), leída el 30-11-2015Depto. de Estructura de la Materia, Física Térmica y ElectrónicaFac. de Ciencias FísicasTRUEunpu

A Charge-Recycling Scheme and Ultra Low Voltage Self-Startup Charge Pump for Highly Energy Efficient Mixed Signal Systems-On-A-Chip

Author: Ulaganathan Chandradevi
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2012
Field of study

The advent of battery operated sensor-based electronic systems has provided a pressing need to design energy-efficient, ultra-low power integrated circuits as a means to improve the battery lifetime. This dissertation describes a scheme to lower the power requirement of a digital circuit through the use of charge-recycling and dynamic supply-voltage scaling techniques. The novel charge-recycling scheme proposed in this research demonstrates the feasibility of operating digital circuits using the charge scavenged from the leakage and dynamic load currents inherent to digital design. The proposed scheme efficiently gathers the “ground-bound” charge into storage capacitor banks. This reclaimed charge is then subsequently recycled to power the source digital circuit. The charge-recycling methodology has been implemented on a 12-bit Gray-code counter operating at frequencies of less than 50 MHz. The circuit has been designed in a 90-nm process and measurement results reveal more than 41% reduction in the average energy consumption of the counter. The total energy savings including the power consumed for the generation of control signals aggregates to an average of 23%. The proposed methodology can be applied to an existing digital path without any design change to the circuit but with only small loss to the performance. Potential applications of this scheme are described, specifically in wide-temperature dynamic power reduction and as a source for energy harvesters. The second part of this dissertation deals with the design and development of a self-starting, ultra-low voltage, switched-capacitor (SC) DC-DC converter that is essential to an energy harvesting system. The proposed charge-pump based SC-converter operates from 125-mV input and thus enables battery-less operation in ultra-low voltage energy harvesters. The charge pump does not require any external components or expensive post-fabrication processing to enable low-voltage operation. This design has been implemented in a 130-nm CMOS process. While the proposed charge pump provides significant efficiency enhancement in energy harvesters, it can also be incorporated within charge recycling systems to facilitate adaptable charge-recycling levels. In total, this dissertation provides key components needed for highly energy-efficient mixed signal systems-on-a-chip