285 research outputs found
CMOS array design automation techniques
A low-cost, quick-turnaround technique for generating custom metal oxide semiconductor arrays using the standard cell approach was developed, implemented, tested and validated. Basic cell design topology and guidelines are defined based on an extensive analysis that includes circuit, layout, process, array topology and required performance considerations, particularly high circuit speed.
Near-Threshold Computing: Past, Present, and Future.
Transistor threshold voltages have stagnated in recent years, deviating from constant-voltage scaling theory and directly limiting supply voltage scaling. To overcome the resulting energy and power dissipation barriers, energy efficiency can be improved through aggressive voltage scaling, and there has been increased interest in operating at near-threshold computing (NTC) supply voltages. In this region sizable energy gains are achieved with moderate performance loss, some of which can be regained through parallelism.
This thesis first provides a methodical definition of how near to threshold is "near threshold" and continues with an in-depth examination of NTC across past, present, and future CMOS technologies. By systematically defining near-threshold, the trends and tradeoffs are analyzed, lending insight into how best to design and optimize near-threshold systems.
NTC works best for technologies that feature good circuit delay scalability, that is, technologies without strong short-channel effects. Early planar technologies (prior to 90nm or so) featured good circuit scalability (8x energy gains), but lacked area in which to add cores for parallelization. Recent planar nodes (32nm – 20nm) feature more area for cores but suffer from poor delay scalability, and so are not well-suited for NTC (4x energy gains).
The switch to FinFET CMOS technology allows for a return to strong voltage scalability (8x gain), reversing trends seen in planar technologies, while dark silicon has created an opportunity to add cores for parallelization. Improved FinFET voltage scalability even allows for latency reduction of a single task, as long as the task is sufficiently parallelizable (< 10% serial code).
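The condition on serial code above is an instance of Amdahl's law. The following is a minimal sketch of that reasoning, not the thesis's own model: the function names, the core count and the per-core slowdown factor are illustrative assumptions that do not appear in the abstract.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Ideal speedup of one task spread over `cores` identical cores (Amdahl's law)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


def relative_latency(serial_fraction: float, cores: int, slowdown: float) -> float:
    """Task latency on `cores` slowed-down cores, relative to one full-speed core.

    `slowdown` is a hypothetical per-core delay penalty at the reduced supply
    voltage (an assumed input, not a figure from the abstract).
    A result below 1.0 means the parallel low-voltage system finishes sooner.
    """
    return slowdown / amdahl_speedup(serial_fraction, cores)


if __name__ == "__main__":
    # Assumed example: 10% serial code, 16 cores, each core 5x slower.
    print(amdahl_speedup(0.10, 16))
    print(relative_latency(0.10, 16, 5.0))
```

With a larger serial fraction the achievable speedup saturates quickly, which is why a latency reduction is only possible when the task is sufficiently parallelizable.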
Finally, we will look at a technique for fast voltage boosting, called Shortstop, in which a core's operating voltage is raised in tens of cycles. Shortstop can be used to quickly respond to single-threaded performance demands of a near-threshold system by leveraging the innate parasitic inductance of a dedicated dirty supply rail, further improving energy efficiency. The technique is demonstrated in a wirebond implementation and is able to boost a core up to 1.8x faster than a header-based approach, while reducing supply droop by 2-7x. An improved flip-chip architecture is also proposed.
PhD, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/113600/1/npfet_1.pd
Cross-Layer Pathfinding for Off-Chip Interconnects
Off-chip interconnects for integrated circuits (ICs) today induce a diverse design space, spanning many different applications that require transmission of data at various bandwidths, latencies and link lengths. Off-chip interconnect design solutions are also variously sensitive to system performance, power and cost metrics, while also having a strong impact on these metrics. The costs associated with off-chip interconnects include die area, package (PKG) and printed circuit board (PCB) area, technology and bill of materials (BOM). Choices made regarding off-chip interconnects are fundamental to product definition, architecture, design implementation and technology enablement. Given their cross-layer impact, it is imperative that a cross-layer approach be employed to architect and analyze off-chip interconnects up front, so that a top-down design flow can comprehend the cross-layer impacts and correctly assess the system performance, power and cost tradeoffs for off-chip interconnects. Chip architects are not exposed to all the tradeoffs at the physical and circuit implementation or technology layers, and often lack the tools to accurately assess off-chip interconnects. Furthermore, the collaterals needed for a detailed analysis are often lacking when the chip is architected; these include circuit design and layout, PKG and PCB layout, and physical floorplan and implementation. To address the need for a framework that enables architects to assess the system-level impact of off-chip interconnects, this thesis presents power-area-timing (PAT) models for off-chip interconnects, optimization and planning tools with the appropriate abstraction using these PAT models, and die/PKG/PCB co-design methods that help expose the off-chip interconnect cross-layer metrics to the die/PKG/PCB design flows. 
Together, these models, tools and methods enable cross-layer optimization that allows for a top-down definition and exploration of the design space and helps converge on the correct off-chip interconnect implementation and technology choice. The tools presented cover off-chip memory interfaces for mobile and server products, silicon photonic interfaces, 2.5D silicon interposers and 3D through-silicon vias (TSVs). The goal of the cross-layer framework is to assess the key metrics of the interconnect (such as timing, latency, active/idle/sleep power, and area/cost) at an appropriate level of abstraction by being able to do this across layers of the design flow. In addition to signal interconnect, this thesis also explores the need for such cross-layer pathfinding for power distribution networks (PDN), where the system-on-chip (SoC) floorplan and pinmap must be optimized before the collateral layouts for PDN analysis are ready. Altogether, the developed cross-layer pathfinding methodology for off-chip interconnects enables more rapid and thorough exploration of a vast design space of off-chip parallel and serial links, inter-die and inter-chiplet links and silicon photonics. Such exploration will pave the way for off-chip interconnect technology enablement that is optimized for system needs. The basis of the framework can be extended to cover other interconnect technology as well, since it fundamentally relates to system-level metrics that are common to all off-chip interconnects.
Low-power spatial computing using dynamic threshold devices
Asynchronous spatial computing systems exhibit only localized communication, their overall data-flow being controlled by handshaking. It is therefore straightforward to determine when a particular part of such a system is active. We show that using thin-body double-gate fully depleted SOI transistors, the shift in threshold voltage that can be produced by modulating the back-gate bias is sufficient to reduce subthreshold leakage power by a factor of more than 10^4 in typical circuits. Using TBFDSOI devices in spatial computing architectures will allow overall power to be greatly reduced while maintaining high performance.
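The link between a threshold-voltage shift and a leakage reduction of several orders of magnitude can be illustrated with the standard first-order subthreshold model, in which off-current falls by one decade for every subthreshold swing S of added threshold voltage. This is a minimal sketch under that textbook assumption; the swing value of 70 mV/decade and the function names are illustrative and are not taken from the paper.

```python
import math


def leakage_reduction(delta_vth_mv: float, swing_mv_per_decade: float) -> float:
    """Factor by which subthreshold leakage drops for a given threshold shift.

    First-order model: I_off is proportional to 10 ** (-Vth / S).
    """
    return 10.0 ** (delta_vth_mv / swing_mv_per_decade)


def required_vth_shift_mv(reduction: float, swing_mv_per_decade: float) -> float:
    """Threshold shift (mV) needed to cut leakage by the factor `reduction`."""
    if reduction <= 0:
        raise ValueError("reduction must be positive")
    return swing_mv_per_decade * math.log10(reduction)


if __name__ == "__main__":
    # Assumed swing of 70 mV/decade; four decades of reduction as a target.
    print(required_vth_shift_mv(1e4, 70.0))
    print(leakage_reduction(280.0, 70.0))
```

Under these assumptions a shift of a few hundred millivolts is enough for a four-decade reduction, and a steeper (smaller) swing lowers the required shift proportionally.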
The Topological Processor for the future ATLAS Level-1 Trigger: from design to commissioning
The ATLAS detector at LHC will require a Trigger system to efficiently select
events down to a manageable event storage rate of about 400 Hz. By 2015 the LHC
instantaneous luminosity will be increased up to 3 x 10^34 cm^-2 s^-1, which
represents an unprecedented challenge for the ATLAS Trigger system. To
cope with the higher event rate and efficiently select relevant events from a
physics point of view, a new element will be included in the Level-1 Trigger
scheme after 2015: the Topological Processor (L1Topo). The L1Topo system,
currently developed at CERN, will consist initially of an ATCA crate and two
L1Topo modules. A high density opto-electroconverter (AVAGO miniPOD) drives up
to 1.6 Tb/s of data from the calorimeter and muon detectors into two high-end
FPGA (Virtex7-690), to be processed in about 200 ns. The design has been
optimized to guarantee excellent signal integrity of the high-speed links and
low latency data transmission on the Real Time Data Path (RTDP). The L1Topo
receives data in a standalone protocol from the calorimeters and muon detectors
to be processed into several VHDL topological algorithms. Those algorithms
perform geometrical cuts, correlations and calculate complex observables such
as the invariant mass. The output of such topological cuts is sent to the
Central Trigger Processor. This talk focuses on the relevant high-density
design characteristics of L1Topo, which allow several hundred optical links to
be processed (up to 13 Gb/s each) using ordinary PCB material. Relevant test
results performed on the L1Topo prototypes to characterize the high-speed links
latency (eye diagram, bit error rate, margin analysis) and the logic resource
utilization of the algorithms are discussed.
Comment: 5 pages, 6 figures
Analysis and Design of Resilient VLSI Circuits
The reliable operation of Integrated Circuits (ICs) has become increasingly difficult to
achieve in the deep sub-micron (DSM) era. With continuously decreasing device feature
sizes, combined with lower supply voltages and higher operating frequencies, the noise
immunity of VLSI circuits is decreasing alarmingly. Thus, VLSI circuits are becoming
more vulnerable to noise effects such as crosstalk, power supply variations and radiation-induced
soft errors. Among these noise sources, soft errors (or errors caused by radiation
particle strikes) have become an increasingly troublesome issue for memory arrays as well
as combinational logic circuits. Also, in the DSM era, process variations are increasing
at an alarming rate, making it more difficult to design reliable VLSI circuits. Hence, it
is important to efficiently design robust VLSI circuits that are resilient to radiation particle
strikes and process variations. This dissertation presents several
analysis and design techniques with the goal of realizing VLSI circuits which are tolerant
to radiation particle strikes and process variations.
This dissertation consists of two parts. The first part proposes four analysis and two
design approaches to address radiation particle strikes. The analysis techniques for the
radiation particle strikes include: an approach to analytically determine the pulse width
and the pulse shape of a radiation induced voltage glitch in combinational circuits, a technique
to model the dynamic stability of SRAMs, and a 3D device-level analysis of the
radiation tolerance of voltage scaled circuits. Experimental results demonstrate that the proposed techniques for analyzing radiation particle strikes in combinational circuits and
SRAMs are fast and accurate compared to SPICE. Therefore, these analysis approaches
can be easily integrated in a VLSI design flow to analyze the radiation tolerance of such
circuits, and harden them early in the design flow. From 3D device-level analysis of the radiation
tolerance of voltage scaled circuits, several non-intuitive observations are made and
correspondingly, a set of guidelines is proposed, which are important to consider to realize
radiation hardened circuits. Two circuit level hardening approaches are also presented
to harden combinational circuits against a radiation particle strike. These hardening approaches
significantly improve the tolerance of combinational circuits against low and very
high energy radiation particle strikes respectively, with modest area and delay overheads.
The second part of this dissertation addresses process variations. A technique is developed
to perform sensitizable statistical timing analysis of a circuit, and thereby improve the
accuracy of timing analysis under process variations. Experimental results demonstrate that
this technique is able to significantly reduce the pessimism due to two sources of inaccuracy
which plague current statistical static timing analysis (SSTA) tools. Two design approaches
are also proposed to improve the process variation tolerance of combinational circuits and
voltage level shifters (which are used in circuits with multiple interacting power supply
domains), respectively. The variation tolerant design approach for combinational circuits
significantly improves the resilience of these circuits to random process variations, with a
reduction in the worst case delay and low area penalty. The proposed voltage level shifter
is faster, requires lower dynamic power and area, has lower leakage currents, and is more
tolerant to process variations, compared to the best known previous approach.
In summary, this dissertation presents several analysis and design techniques which
significantly augment the existing work in the area of resilient VLSI circuit design.
Operational experience, improvements, and performance of the CDF Run II silicon vertex detector
The Collider Detector at Fermilab (CDF) pursues a broad physics program at
Fermilab's Tevatron collider. Between Run II commissioning in early 2001 and
the end of operations in September 2011, the Tevatron delivered 12 fb^-1 of
integrated luminosity of p-pbar collisions at sqrt(s)=1.96 TeV. Many physics
analyses undertaken by CDF require heavy flavor tagging with large charged
particle tracking acceptance. To realize these goals, in 2001 CDF installed
eight layers of silicon microstrip detectors around its interaction region.
These detectors were designed for 2 to 5 years of operation, radiation doses up
to 2 Mrad (0.02 MGy), and were expected to be replaced in 2004. The sensors were
not replaced, and the Tevatron run was extended for several years beyond its
design, exposing the sensors and electronics to much higher radiation doses
than anticipated. In this paper we describe the operational challenges
encountered over the past 10 years of running the CDF silicon detectors, the
preventive measures undertaken, and the improvements made along the way to
ensure their optimal performance for collecting high quality physics data. In
addition, we describe the quantities and methods used to monitor radiation
damage in the sensors for optimal performance and summarize the detector
performance quantities important to CDF's physics program, including vertex
resolution, heavy flavor tagging, and silicon vertex trigger performance.
Comment: Preprint accepted for publication in Nuclear Instruments and Methods
A (07/31/2013)
Nanosecond-order pulse generators for quality control and diagnosis of Cherenkov telescope cameras
Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Ciencias Físicas, Departamento de Física Aplicada III (Electricidad y Electrónica), defended on 30-11-2015. Depto. de Estructura de la Materia, Física Térmica y Electrónica. Fac. de Ciencias Físicas.
A Charge-Recycling Scheme and Ultra Low Voltage Self-Startup Charge Pump for Highly Energy Efficient Mixed Signal Systems-On-A-Chip
The advent of battery-operated, sensor-based electronic systems has created a pressing need to design energy-efficient, ultra-low power integrated circuits as a means to improve battery lifetime. This dissertation describes a scheme to lower the power requirement of a digital circuit through the use of charge-recycling and dynamic supply-voltage scaling techniques. The novel charge-recycling scheme proposed in this research demonstrates the feasibility of operating digital circuits using the charge scavenged from the leakage and dynamic load currents inherent to digital design. The proposed scheme efficiently gathers the “ground-bound” charge into storage capacitor banks. This reclaimed charge is then recycled to power the source digital circuit.
The charge-recycling methodology has been implemented on a 12-bit Gray-code counter operating at frequencies of less than 50 MHz. The circuit has been designed in a 90-nm process, and measurement results reveal more than 41% reduction in the average energy consumption of the counter. The total energy savings, including the power consumed for the generation of control signals, amount to an average of 23%. The proposed methodology can be applied to an existing digital path without any design change to the circuit and with only a small loss in performance. Potential applications of this scheme are described, specifically in wide-temperature dynamic power reduction and as a source for energy harvesters.
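The two savings figures above can be related with simple arithmetic. The sketch below assumes, as a simplification not stated in the abstract, that both percentages are measured against the same baseline counter energy; the function names and the battery-lifetime interpretation (which further assumes the counter is the only load) are illustrative.

```python
def control_overhead(gross_saving: float, net_saving: float) -> float:
    """Energy spent on control signals, as a fraction of baseline energy.

    Assumes both savings are fractions of the same baseline.
    """
    if net_saving > gross_saving:
        raise ValueError("net saving cannot exceed gross saving")
    return gross_saving - net_saving


def lifetime_gain(net_saving: float) -> float:
    """Battery-lifetime multiplier for a fixed energy budget,
    assuming the circuit is the only load on the battery."""
    if not 0.0 <= net_saving < 1.0:
        raise ValueError("net_saving must lie in [0, 1)")
    return 1.0 / (1.0 - net_saving)


if __name__ == "__main__":
    # Figures quoted in the abstract: 41% for the counter alone, 23% net.
    print(control_overhead(0.41, 0.23))
    print(lifetime_gain(0.23))
```

Under these assumptions, control-signal generation costs roughly 18% of the baseline energy, which is why reducing that overhead matters as much as improving the recycling itself.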
The second part of this dissertation deals with the design and development of a self-starting, ultra-low voltage, switched-capacitor (SC) DC-DC converter that is essential to an energy harvesting system. The proposed charge-pump based SC-converter operates from 125-mV input and thus enables battery-less operation in ultra-low voltage energy harvesters. The charge pump does not require any external components or expensive post-fabrication processing to enable low-voltage operation. This design has been implemented in a 130-nm CMOS process. While the proposed charge pump provides significant efficiency enhancement in energy harvesters, it can also be incorporated within charge recycling systems to facilitate adaptable charge-recycling levels.
In total, this dissertation provides key components needed for highly energy-efficient mixed signal systems-on-a-chip.