253 research outputs found
Circuits and Systems Advances in Near Threshold Computing
Modern society is witnessing a sea change in ubiquitous computing, in which people have embraced computing systems as an indispensable part of day-to-day existence. Computation, storage, and communication abilities of smartphones, for example, have undergone monumental changes over the past decade. However, global emphasis on creating and sustaining green environments is leading to a rapid and ongoing proliferation of edge computing systems and applications. As a broad spectrum of healthcare, home, and transport applications shift to the edge of the network, near-threshold computing (NTC) is emerging as one of the promising low-power computing platforms. An NTC device sets its supply voltage close to its threshold voltage, dramatically reducing the energy consumption. Despite showing substantial promise in terms of energy efficiency, NTC is yet to see widescale commercial adoption. This is because circuits and systems operating with NTC suffer from several problems, including increased sensitivity to process variation, reliability problems, performance degradation, and security vulnerabilities, to name a few. To realize its potential, we need designs, techniques, and solutions to overcome these challenges associated with NTC circuits and systems. The readers of this book will be able to familiarize themselves with recent advances in electronics systems, focusing on near-threshold computing
A Survey of Fault-Injection Methodologies for Soft Error Rate Modeling in Systems-on-Chips
The development of process technology has increased system performance, but the system failure probability has also significantly increased. It is important to consider the system reliability in addition to the cost, performance, and power consumption. In this paper, we describe the types of faults that occur in a system and where these faults originate. Then, fault-injection techniques, which are used to characterize the fault rate of a system-on-chip (SoC), are investigated to provide a guideline to SoC designers for the realization of resilient SoCs
Cmos Rf Cituits Sic] Variability And Reliability Resilient Design, Modeling, And Simulation
The work presents a novel voltage biasing design that helps the CMOS RF circuits resilient to variability and reliability. The biasing scheme provides resilience through the threshold voltage (VT) adjustment, and at the mean time it does not degrade the PA performance. Analytical equations are established for sensitivity of the resilient biasing under various scenarios. Power Amplifier (PA) and Low Noise Amplifier (LNA) are investigated case by case through modeling and experiment. PTM 65nm technology is adopted in modeling the transistors within these RF blocks. A traditional class-AB PA with resilient design is compared the same PA without such design in PTM 65nm technology. Analytical equations are established for sensitivity of the resilient biasing under various scenarios. A traditional class-AB PA with resilient design is compared the same PA without such design in PTM 65nm technology. The results show that the biasing design helps improve the robustness of the PA in terms of linear gain, P1dB, Psat, and power added efficiency (PAE). Except for post-fabrication calibration capability, the design reduces the majority performance sensitivity of PA by 50% when subjected to threshold voltage (VT) shift and 25% to electron mobility (μn) degradation. The impact of degradation mismatches is also investigated. It is observed that the accelerated aging of MOS transistor in the biasing circuit will further reduce the sensitivity of PA. In the study of LNA, a 24 GHz narrow band cascade LNA with adaptive biasing scheme under various aging rate is compared to LNA without such biasing scheme. The modeling and simulation results show that the adaptive substrate biasing reduces the sensitivity of noise figure and minimum noise figure subject to process variation and iii device aging such as threshold voltage shift and electron mobility degradation. Simulation of different aging rate also shows that the sensitivity of LNA is further reduced with the accelerated aging of the biasing circuit. Thus, for majority RF transceiver circuits, the adaptive body biasing scheme provides overall performance resilience to the device reliability induced degradation. Also the tuning ability designed in RF PA and LNA provides the circuit post-process calibration capability
ParaDox: Eliminating Voltage Margins via Heterogeneous Fault Tolerance.
Providing reliability is becoming a challenge for chip manufacturers, faced with simultaneously trying to improve miniaturization, performance and energy efficiency. This leads to very large margins on voltage and frequency, designed to avoid errors even in the worst case, along with significant hardware expenditure on eliminating voltage spikes and other forms of transient error, causing considerable inefficiency in power consumption and performance. We flip traditional ideas about reliability and performance around, by exploring the use of error resilience for power and performance gains. ParaMedic is a recent architecture that provides a solution for reliability with low overheads via automatic hardware error recovery. It works by splitting up checking onto many small cores in a heterogeneous multicore system with hardware logging support. However, its design is based on the idea that errors are exceptional. We transform ParaMedic into ParaDox, which shows high performance in both error-intensive and scarce-error scenarios, thus allowing correct execution even when undervolted and overclocked. Evaluation within error-intensive simulation environments confirms the error resilience of ParaDox and the low associated recovery cost. We estimate that compared to a non-resilient system with margins, ParaDox can reduce energy-delay product by 15% through undervolting, while completely recovering from any induced errors
Probabilistic Compute-in-Memory Design For Efficient Markov Chain Monte Carlo Sampling
Markov chain Monte Carlo (MCMC) is a widely used sampling method in modern
artificial intelligence and probabilistic computing systems. It involves
repetitive random number generations and thus often dominates the latency of
probabilistic model computing. Hence, we propose a compute-in-memory (CIM)
based MCMC design as a hardware acceleration solution. This work investigates
SRAM bitcell stochasticity and proposes a novel ``pseudo-read'' operation,
based on which we offer a block-wise random number generation circuit scheme
for fast random number generation. Moreover, this work proposes a novel
multi-stage exclusive-OR gate (MSXOR) design method to generate strictly
uniformly distributed random numbers. The probability error deviating from a
uniform distribution is suppressed under . Also, this work presents a
novel in-memory copy circuit scheme to realize data copy inside a CIM
sub-array, significantly reducing the use of R/W circuits for power saving.
Evaluated in a commercial 28-nm process development kit, this CIM-based MCMC
design generates 4-bit32-bit samples with an energy efficiency of
~pJ/sample and high throughput of up to M~samples/s. Compared to
conventional processors, the overall energy efficiency improves
to times
A Technique for Designing Variation Resilient Subthreshold Sram Cell
This paper presents a technique for designing a variability aware subthreshold SRAM cell. The architecture of the proposed cell is similar to the standard read-decoupled 8-transistor (RD8T) SRAM cell with the exception that the access FETS are replaced with transmission gates (TGs). In this work, various design metrics are assessed and compared with RD8T SRAM cell. The proposed design offers 2.14× and 1.75× improvement in TRA (read access time) and TWA (write access time) respectively compared with RD8T. It proves its robustness against process variations by featuring narrower spread in TRA distribution (2.35×) and TWA distribution (3.79×) compared with RD8T. The proposed bitcell offers 1.16× higher read current (IREAD) and 1.64× lower bitline leakage current (ILEAK) respectively compared with RD8T. It also shows its robustness by offering 1.34× (1.58×) tighter spread in IREAD (ILEAK) compared with RD8T. It exhibits 1.42× larger IREAD to ILEAK ratio. It shows 2.2× higher frequency @ 250 mV with read bitline capacitance of 10 fF. Besides, the proposed bitcell achieves same read stability and write-ability as that of RD8T at the cost of 3 extra transistors. The leakage power of the proposed design is close to that of RD8T.
ABSTRAK: Kertas kerja ini membentangkan teknik merekabentuk sel bawah ambang SRAM yang bolehubah. Senibina sel yang dicadangkan adalah sama dengan sel SRAM 8-transistor (RD8T) “pisahan-bacaan” piawai kecuali FET akses digantikan dengan sel pintu transmisi (TGs). Di dalam kajian ini, beberapa metrik rekabentuk dinilai dan dibandingkan dengan sel RD8T SRAM. Rekabentuk yang dicadangkan menawarkan peningkatan 2.14× dan 1.75× dalam TRA (masa akses baca) dan TWA (masa akses tulis) berbanding dengan RD8T. Ia membuktikan kekukuhan variasi proses dengan menampilkan tebaran yang lebih sempit dalam pengagihan TRA (2.35 ×) dan pengagihan TWA (3.79 ×) berbanding dengan RD8T. Sel-Bit yang dicadangkan mempunyai arus baca 1.16 × lebih tinggi (IREAD) dan arus bocor bitline 1.64 × lebih rendah (ILEAK) berbanding dengan RD8T. Ia juga membuktikan kekukuhan dengan menawarkan 1.34 × (1.58 ×) penyebaran sempit di IREAD (ILEAK) berbanding dengan RD8T dan nisbah IREAD / ILEAK 1.42 × lebih besar. Ia menunjukkan kekerapan 2.2 × lebih tinggi pada 250 mV dengan kemuatan membaca bitline sebanyak 10 fF. Selain itu, sel bit yang dicadangkan mencapai kestabilan membaca dan keupayaan menulis yang sama seperti RD8T dengan kos tambahan 3 transistor. Kebocoran kuasa rekabentuk yang dicadangkan hampir sama dengan RD8T.
KEYWORDS: variability; robust, subthreshold; random dopant fluctuation (RDF); read static noise margin (RSNM); write static noise margin (WSNM)
Design and Robustness Analysis on Non-volatile Storage and Logic Circuit
By combining the flexibility of MOS logic and the non-volatility of spintronic devices, spin-MOS logic and storage circuitry offer a promising approach to implement highly integrated, power-efficient, and nonvolatile computing and storage systems. Besides the persistent errors due to process variations, however, the functional correctness of Spin-MOS circuitry suffers from additional non-persistent errors that are incurred by the randomness of spintronic device operations, i.e., thermal fluctuations. This work quantitatively investigates the impact of thermal fluctuations on the operations of two typical Spin-MOS circuitry: one transistor and one magnetic tunnel junction (1T1J) spin-transfer torque random access memory (STT-RAM) cell and a nonvolatile latch design. A new nonvolatile latch design is proposed based on magnetic tunneling junction (MTJ) devices. In the standby mode, the latched data can be retained in the MTJs without consuming any power. Two types of operation errors can occur, namely, persistent and non-persistent errors. These are quantitatively analyzed by including models for process variations and thermal fluctuations during the read and write operations. A mixture importance sampling methodology is applied to enable yield-driven design and extend its application beyond memories to peripheral circuits and logic blocks. Several possible design techniques to reduce thermal induced non-persistent error rate are also discussed
Exploiting Adaptive Techniques to Improve Processor Energy Efficiency
Rapid device-miniaturization keeps on inducing challenges in building energy efficient microprocessors. As the size of the transistors continuously decreasing, more uncertainties emerge in their operations. On the other hand, integrating more and more transistors on a single chip accentuates the need to lower its supply-voltage. This dissertation investigates one of the primary device uncertainties - timing error, in microprocessor performance bottleneck in NTC era. Then it proposes various innovative techniques to exploit these opportunities to maintain processor energy efficiency, in the context of emerging challenges. Evaluated with the cross-layer methodology, the proposed approaches achieve substantial improvements in processor energy efficiency, compared to other start-of-art techniques
Stream Processor Development using Multi-Threshold NULL Convention Logic Asynchronous Design Methodology
Decreasing transistor feature size has led to an increase in the number of transistors in integrated circuits (IC), allowing for the implementation of more complex logic. However, such logic also requires more complex clock tree synthesis (CTS) to avoid timing violations as the clock must reach many more gates over larger areas. Thus, timing analysis requires significantly more computing power and designer involvement than in the past. For these reasons, IC designers have been pushed to nix conventional synchronous (SYNC) architecture and explore novel methodologies such as asynchronous, self-timed architecture. This dissertation evaluates the nominal active energy, voltage-scaled active energy, and leakage power dissipation across two cores of a stream processor: Smoothing Filter (SF) and Histogram Equalization (HEQ). Both cores were implemented in Multi-Threshold NULL Convention Logic (MTNCL) and clock-gated synchronous methodologies using a gate-level netlist to avoid any architectural discrepancies while guaranteeing impartial comparisons. MTNCL designs consumed more active energy than their synchronous counterparts due to the dual-rail encoding system; however, high-threshold-voltage (High-Vt) transistors used in MTNCL threshold gates reduced leakage power dissipation by up to 227%. During voltage-scaling simulations, MTNCL circuits showed a high level of robustness as the output results were logically valid across all voltage sweeps without any additional circuitry. SYNC circuits, however, needed extra logic, such as a DVS controller, to adjust the circuit’s speed when VDD changed. Although SYNC circuits still consumed less average energy, MTNCL circuit power gains accelerated when switching to lower voltage domains
- …