
    On-Chip Digital Decoupling Capacitance Methodology

    Signal integrity has become a major problem in digital IC design. One cause of this problem is device scaling, which results in a sharp reduction of supply voltage, creating stringent noise margin requirements to ensure functionality. Reductions in feature size also result in increased clock speeds, leading to many different high-frequency noise-producing components. As on-chip area increases to allow for more computational capability, so does the amount of digital logic to be placed, magnifying the effects of noisy interconnect structures. Supply noise, modeled as ΔV = L·di/dt, is caused by rapid current spikes during a rise or fall time. Decoupling capacitors often fill empty on-chip space for the purpose of limiting this noise. This work introduces a novel methodology that attempts to quantify and locate decoupling capacitors within a power distribution network. The bondwire attached on the periphery of the face of the die is taken to be the dominant source of inductance. It is shown that distributing capacitance closer to the switching elements is most effective at reducing supply noise. A chip has been designed using TSMC 90 nm technology that implements the ideas presented in this work. Simulation results show that noise fluctuations are high enough that random placement of decoupling capacitance is not effective for large digital structures. The amount of interconnect-generated on-chip noise increases with area, resulting in the need for an optimal decoupling scheme. As scaling continues, supply voltages and noise margins will decrease, creating the need for a robust decoupling capacitance methodology.
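
    As a rough numeric illustration of the L·di/dt relationship described above, the sketch below estimates the supply droop for an assumed bondwire inductance and current step, and contrasts it with the droop when the same transient charge is supplied by local decoupling capacitance; all values are illustrative assumptions, not figures from the work.

```python
# Illustrative L*di/dt supply-droop estimate (all values are assumed, not from the paper).
L_bondwire = 1e-9      # 1 nH bondwire inductance (typical order of magnitude, assumed)
di = 50e-3             # 50 mA current step drawn by switching logic (assumed)
dt = 100e-12           # 100 ps rise time of the current spike (assumed)

delta_v = L_bondwire * di / dt          # dV = L * di/dt
print(f"Droop through the bondwire: {delta_v * 1e3:.1f} mV")   # ~500 mV against a ~1 V supply

# A decoupling capacitor placed near the switching elements supplies the transient
# charge locally, so far less current has to flow through the inductive bondwire.
C_decap = 1e-9         # 1 nF of nearby on-chip decoupling capacitance (assumed)
dq = di * dt           # charge delivered during the transient
print(f"Droop if the charge comes from local decap: {dq / C_decap * 1e3:.3f} mV")
```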

    Increasing the robustness of digital circuits with ring oscillator clocks

    Technology scaling enables lower supply voltages, but also increases the power density of integrated circuits. In this context, power integrity becomes a major concern in the implementation of high-performance designs. This paper analyzes the influence of Ring Oscillator Clocks (ROCs) on mitigating the impacts of voltage noise. A design with an ROC as the clock source is able to work correctly even in the presence of severe and unpredictable voltage emergencies, without degrading the average performance and power metrics of the circuit. ROCs offer an instantaneous and continuous adaptation to the environmental conditions, thus reducing the margins used to prevent timing failures. ROCs provide robustness independently of the power delivery network, thus relaxing the constraints required for the design of the PCB and package. As a by-product, the inherent jitter generated by ROCs produces a spread-spectrum effect that reduces electromagnetic emissions.
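
    A minimal sketch of why an ROC adapts to voltage noise: the ring oscillator is built from the same gates as the logic it clocks, so its period stretches with roughly the same supply-voltage dependence as the critical path. The model below uses an alpha-power-law delay; the threshold voltage, alpha, and droop values are assumed for illustration, not taken from the paper.

```python
# Ring-oscillator period tracking supply droop (alpha-power-law delay model;
# Vth, alpha, and the droop values are assumed illustrative numbers).
def gate_delay(vdd, vth=0.35, alpha=1.3, k=1.0):
    """Alpha-power-law stage delay: proportional to Vdd / (Vdd - Vth)**alpha."""
    return k * vdd / (vdd - vth) ** alpha

nominal_vdd = 1.0
for vdd in (1.0, 0.95, 0.90, 0.85):          # progressively deeper voltage droops
    stretch = gate_delay(vdd) / gate_delay(nominal_vdd)
    print(f"Vdd = {vdd:.2f} V -> clock period stretches by {stretch:.2f}x")
# Because the ROC and the logic slow down together, the clock period lengthens
# during a droop instead of relying on a fixed worst-case timing margin.
```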

    Efficient and Scalable Computing for Resource-Constrained Cyber-Physical Systems: A Layered Approach

    With the evolution of computing and communication technology, cyber-physical systems such as self-driving cars, unmanned aerial vehicles, and mobile cognitive robots are achieving increasing levels of multifunctionality and miniaturization, enabling them to execute versatile tasks in a resource-constrained environment. Therefore, the computing systems that power these resource-constrained cyber-physical systems (RCCPSs) have to achieve high efficiency and scalability. First of all, given a fixed amount of onboard energy, these computing systems should not only be power-efficient but also exhibit sufficiently high performance to gracefully handle complex algorithms for learning-based perception and AI-driven decision-making. Meanwhile, scalability requires that the current computing system and its components can be extended both horizontally, with more resources, and vertically, with emerging advanced technology. To achieve efficient and scalable computing systems in RCCPSs, my research broadly investigates a set of techniques and solutions via a bottom-up layered approach. This layered approach leverages the characteristics of each system layer (e.g., the circuit, architecture, and operating system layers) and their interactions to discover and explore the optimal system tradeoffs among performance, efficiency, and scalability. At the circuit layer, we investigate the benefits of novel power delivery and management schemes enabled by integrated voltage regulators (IVRs). Then, between the circuit and microarchitecture/architecture layers, we present a voltage-stacked power delivery system that offers best-in-class power delivery efficiency for many-core systems. After this, using Graphics Processing Units (GPUs) as a case study, we develop a real-time resource scheduling framework at the architecture and operating system layers for heterogeneous computing platforms with guaranteed task deadlines. Finally, fast dynamic voltage and frequency scaling (DVFS) based power management across the circuit, architecture, and operating system layers is studied through a learning-based hierarchical power management strategy for multi-/many-core systems.
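
    To make the closing DVFS point concrete, here is a minimal sketch that picks the lowest voltage/frequency operating point still meeting a task deadline, using the common approximation that dynamic power scales as C·V²·f; the operating points, switched capacitance, and workload are assumed values, not results from this research.

```python
# Minimal DVFS sketch: pick the lowest V/f operating point that still meets a deadline.
# The operating points, switched capacitance, and workload are illustrative assumptions.
operating_points = [      # (Vdd in volts, frequency in Hz), sorted low to high
    (0.7, 0.8e9),
    (0.9, 1.6e9),
    (1.1, 2.4e9),
]
C_eff = 1e-9              # effective switched capacitance per cycle (farads, assumed)
work_cycles = 2.0e9       # cycles required by the task (assumed)
deadline_s = 1.5          # task deadline in seconds (assumed)

for vdd, freq in operating_points:
    runtime = work_cycles / freq
    if runtime <= deadline_s:
        power = C_eff * vdd**2 * freq                  # dynamic power ~ C * V^2 * f
        energy = power * runtime
        print(f"Chose {freq/1e9:.1f} GHz at {vdd} V: "
              f"{runtime:.2f} s runtime, {energy:.2f} J dynamic energy")
        break
```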

    Exploiting Adaptive Techniques to Improve Processor Energy Efficiency

    Rapid device miniaturization keeps inducing challenges in building energy-efficient microprocessors. As transistor sizes continue to decrease, more uncertainties emerge in their operation. On the other hand, integrating more and more transistors on a single chip accentuates the need to lower the supply voltage. This dissertation investigates one of the primary device uncertainties, timing error, as a microprocessor performance bottleneck in the near-threshold computing (NTC) era. It then proposes various innovative techniques that exploit these opportunities to maintain processor energy efficiency in the context of the emerging challenges. Evaluated with the cross-layer methodology, the proposed approaches achieve substantial improvements in processor energy efficiency compared to other state-of-the-art techniques.

    Near-Threshold Computing: Past, Present, and Future.

    Transistor threshold voltages have stagnated in recent years, deviating from constant-voltage scaling theory and directly limiting supply voltage scaling. To overcome the resulting energy and power dissipation barriers, energy efficiency can be improved through aggressive voltage scaling, and there has been increased interest in operating at near-threshold computing (NTC) supply voltages. In this region sizable energy gains are achieved with moderate performance loss, some of which can be regained through parallelism. This thesis first provides a methodical definition of how near to threshold is "near threshold" and continues with an in-depth examination of NTC across past, present, and future CMOS technologies. By systematically defining near-threshold, the trends and tradeoffs are analyzed, lending insight into how best to design and optimize near-threshold systems. NTC works best for technologies that feature good circuit delay scalability, that is, technologies without strong short-channel effects. Early planar technologies (prior to 90nm or so) featured good circuit scalability (8x energy gains), but lacked area in which to add cores for parallelization. Recent planar nodes (32nm – 20nm) feature more area for cores but suffer from poor delay scalability, and so are not well-suited for NTC (4x energy gains). The switch to FinFET CMOS technology allows for a return to strong voltage scalability (8x gain), reversing trends seen in planar technologies, while dark silicon has created an opportunity to add cores for parallelization. Improved FinFET voltage scalability even allows for latency reduction of a single task, as long as the task is sufficiently parallelizable (< 10% serial code). Finally, we will look at a technique for fast voltage boosting, called Shortstop, in which a core's operating voltage is raised in 10s of cycles. Shortstop can be used to quickly respond to single-threaded performance demands of a near-threshold system by leveraging the innate parasitic inductance of a dedicated dirty supply rail, further improving energy efficiency. The technique is demonstrated in a wirebond implementation and is able to boost a core up to 1.8x faster than a header-based approach, while reducing supply droop by 2-7x. An improved flip-chip architecture is also proposed. (PhD dissertation, Electrical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies; http://deepblue.lib.umich.edu/bitstream/2027.42/113600/1/npfet_1.pd)
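
    The claim that FinFET-era NTC can reduce the latency of a single task when it is sufficiently parallelizable (< 10% serial code) follows from an Amdahl's-law argument; the sketch below compares one nominal-voltage core against several slower near-threshold cores, with the per-core slowdown and core count chosen as illustrative assumptions rather than figures from the thesis.

```python
# Amdahl's-law sketch of near-threshold latency: one fast core at nominal Vdd versus
# many slow cores at near-threshold. Slowdown and core count are assumed values.
def ntc_speedup(serial_fraction, cores, per_core_slowdown):
    """Speedup of NTC many-core execution relative to one nominal-voltage core."""
    parallel_fraction = 1.0 - serial_fraction
    ntc_time = per_core_slowdown * (serial_fraction + parallel_fraction / cores)
    return 1.0 / ntc_time

for serial in (0.05, 0.10, 0.30):
    s = ntc_speedup(serial, cores=8, per_core_slowdown=3.0)   # NTC core ~3x slower (assumed)
    verdict = "faster" if s > 1.0 else "slower"
    print(f"{serial:.0%} serial code: {s:.2f}x vs. one nominal core ({verdict})")
```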

    Effect of a Polywell geometry on a CMOS Photodiode Array

    The effect of a polywell geometry, hybridized with a stacked gradient poly-homojunction architecture, on the response of a CMOS-compatible photodiode array was simulated. Crosstalk and sensitivity improved compared to the polywell geometry alone, for both back and front illumination.

    Constraint-Aware, Scalable, and Efficient Algorithms for Multi-Chip Power Module Layout Optimization

    Moving towards an electrified world requires ultra high-density power converters. Electric vehicles, electrified aerospace, data centers, etc. are just a few fields among the wide application areas of power electronic systems where high-density power converters are essential. As a critical part of these power converters, power semiconductor modules and their layout optimization have been identified as a crucial step in achieving the maximum performance and density for wide-bandgap technologies (i.e., GaN and SiC). New packaging technologies are also introduced to produce reliable and efficient multichip power module (MCPM) designs to push the current limits. The complexity of the emerging MCPM layouts is surpassing the capability of a manual, iterative design process to produce an optimum design under agile development requirements. An electronic design automation tool called PowerSynth has been introduced with ongoing research toward enhanced capabilities to speed up the optimized MCPM layout design process. This dissertation presents the PowerSynth progression timeline with the methodology updates and corresponding critical results compared to v1.1. The first released version (v1.1) of PowerSynth demonstrated the benefits of layout abstraction and reduced-order modeling techniques to perform rapid optimization of the MCPM module compared to the traditional, manual, and iterative design approach. However, that version is limited by several key factors: layout representation technique, layout generation algorithms, iterative design-rule-checking (DRC), optimization algorithm candidates, etc. To address these limitations and enhance PowerSynth’s capabilities, constraint-aware, scalable, and efficient algorithms have been developed and implemented. The PowerSynth layout engine has evolved from v1.3 to v2.0 throughout the last five years to incorporate the algorithm updates and generate all 2D/2.5D/3D Manhattan layout solutions. These fundamental changes in the layout generation methodology have also called for updates in the performance modeling techniques and enabled exploring different optimization algorithms. The latest PowerSynth 2 architecture has been implemented to enable electro-thermo-mechanical and reliability optimization on 2D/2.5D/3D MCPM layouts and to set up a path toward cabinet-level optimization. The PowerSynth v2.0 computer-aided design (CAD) flow has been hardware-validated through manufacturing and testing of an optimized novel 3D MCPM layout. The flow has shown significant speedup compared to the manual design flow with comparable optimization results.
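
    As an illustration of the kind of multi-objective tradeoff such a layout optimizer must navigate (a generic sketch, not PowerSynth's actual algorithm), the code below filters candidate MCPM layouts down to the Pareto front over two assumed metrics, loop inductance and peak temperature; the candidate layouts and their values are invented for illustration.

```python
# Generic Pareto-filtering sketch over candidate MCPM layouts; the candidate list,
# metric names, and numbers are invented for illustration, not PowerSynth output.
candidates = [
    {"name": "layout_A", "loop_inductance_nH": 8.2, "peak_temp_C": 112.0},
    {"name": "layout_B", "loop_inductance_nH": 6.5, "peak_temp_C": 121.0},
    {"name": "layout_C", "loop_inductance_nH": 7.0, "peak_temp_C": 118.0},
    {"name": "layout_D", "loop_inductance_nH": 9.1, "peak_temp_C": 119.0},
]

def dominates(a, b):
    """True if layout a is no worse than b in both metrics and better in at least one."""
    keys = ("loop_inductance_nH", "peak_temp_C")
    return all(a[k] <= b[k] for k in keys) and any(a[k] < b[k] for k in keys)

# Keep only layouts that no other candidate dominates (the Pareto front).
pareto = [c for c in candidates
          if not any(dominates(other, c) for other in candidates if other is not c)]
for layout in pareto:
    print(layout["name"], layout["loop_inductance_nH"], "nH,", layout["peak_temp_C"], "C")
```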

    AI/ML Algorithms and Applications in VLSI Design and Technology

    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual and thus time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort for understanding and processing the data within and across different abstraction levels via automated learning algorithms. This, in turn, improves the IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced to date for VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications in the future at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations.
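
    As a toy example of the kind of automated learning such a review surveys, the sketch below fits a linear regression that predicts a path delay from simple layout features; the feature set, the synthetic data, and the use of scikit-learn are illustrative assumptions, not methods described in the paper.

```python
# Toy ML-for-VLSI sketch: predict critical-path delay (ns) from simple features.
# The feature set and the tiny synthetic dataset are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

# Features per path: [logic depth, total wire length in mm, fanout count]
X = np.array([
    [10, 0.8, 4],
    [14, 1.2, 6],
    [8,  0.5, 3],
    [20, 2.0, 9],
    [16, 1.5, 7],
])
y = np.array([1.1, 1.6, 0.8, 2.4, 1.9])   # measured delays in ns (synthetic)

model = LinearRegression().fit(X, y)
new_path = np.array([[12, 1.0, 5]])       # a hypothetical unseen path
print(f"Predicted delay: {model.predict(new_path)[0]:.2f} ns")
```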