54 research outputs found

    Preliminary Radiation Testing of a State-of-the-Art Commercial 14nm CMOS Processor/System-on-a-Chip

    Get PDF
    Hardness assurance test results of Intel state-of-the-art 14nm Broadwell U-series processor / System-on-a-Chip (SoC) for total ionizing dose (TID) are presented, along with exploratory results from trials at a medical proton facility. Test method builds upon previous efforts [1] by utilizing commercial laptop motherboards and software stress applications as opposed to more traditional automated test equipment (ATE)

    Design for Reliability and Low Power in Emerging Technologies

    Get PDF
    Die fortlaufende Verkleinerung von Transistor-Strukturgrößen ist einer der wichtigsten Antreiber für das Wachstum in der Halbleitertechnologiebranche. Seit Jahrzehnten erhöhen sich sowohl Integrationsdichte als auch Komplexität von Schaltkreisen und zeigen damit einen fortlaufenden Trend, der sich über alle modernen Fertigungsgrößen erstreckt. Bislang ging das Verkleinern von Transistoren mit einer Verringerung der Versorgungsspannung einher, was zu einer Reduktion der Leistungsaufnahme führte und damit eine gleichbleibenden Leistungsdichte sicherstellte. Doch mit dem Beginn von Strukturgrößen im Nanometerbreich verlangsamte sich die fortlaufende Skalierung. Viele Schwierigkeiten, sowie das Erreichen von physikalischen Grenzen in der Fertigung und Nicht-Idealitäten beim Skalieren der Versorgungsspannung, führten zu einer Zunahme der Leistungsdichte und, damit einhergehend, zu erschwerten Problemen bei der Sicherstellung der Zuverlässigkeit. Dazu zählen, unter anderem, Alterungseffekte in Transistoren sowie übermäßige Hitzeentwicklung, nicht zuletzt durch stärkeres Auftreten von Selbsterhitzungseffekten innerhalb der Transistoren. Damit solche Probleme die Zuverlässigkeit eines Schaltkreises nicht gefährden, werden die internen Signallaufzeiten üblicherweise sehr pessimistisch kalkuliert. Durch den so entstandenen zeitlichen Sicherheitsabstand wird die korrekte Funktionalität des Schaltkreises sichergestellt, allerdings auf Kosten der Performance. Alternativ kann die Zuverlässigkeit des Schaltkreises auch durch andere Techniken erhöht werden, wie zum Beispiel durch Null-Temperatur-Koeffizienten oder Approximate Computing. Wenngleich diese Techniken einen Großteil des üblichen zeitlichen Sicherheitsabstandes einsparen können, bergen sie dennoch weitere Konsequenzen und Kompromisse. Bleibende Herausforderungen bei der Skalierung von CMOS Technologien führen außerdem zu einem verstärkten Fokus auf vielversprechende Zukunftstechnologien. Ein Beispiel dafür ist der Negative Capacitance Field-Effect Transistor (NCFET), der eine beachtenswerte Leistungssteigerung gegenüber herkömmlichen FinFET Transistoren aufweist und diese in Zukunft ersetzen könnte. Des Weiteren setzen Entwickler von Schaltkreisen vermehrt auf komplexe, parallele Strukturen statt auf höhere Taktfrequenzen. Diese komplexen Modelle benötigen moderne Power-Management Techniken in allen Aspekten des Designs. Mit dem Auftreten von neuartigen Transistortechnologien (wie zum Beispiel NCFET) müssen diese Power-Management Techniken neu bewertet werden, da sich Abhängigkeiten und Verhältnismäßigkeiten ändern. Diese Arbeit präsentiert neue Herangehensweisen, sowohl zur Analyse als auch zur Modellierung der Zuverlässigkeit von Schaltkreisen, um zuvor genannte Herausforderungen auf mehreren Designebenen anzugehen. Diese Herangehensweisen unterteilen sich in konventionelle Techniken ((a), (b), (c) und (d)) und unkonventionelle Techniken ((e) und (f)), wie folgt: (a)\textbf{(a)} Analyse von Leistungszunahmen in Zusammenhang mit der Maximierung von Leistungseffizienz beim Betrieb nahe der Transistor Schwellspannung, insbesondere am optimalen Leistungspunkt. Das genaue Ermitteln eines solchen optimalen Leistungspunkts ist eine besondere Herausforderung bei Multicore Designs, da dieser sich mit den jeweiligen Optimierungszielsetzungen und der Arbeitsbelastung verschiebt. (b)\textbf{(b)} Aufzeigen versteckter Interdependenzen zwischen Alterungseffekten bei Transistoren und Schwankungen in der Versorgungsspannung durch „IR-drops“. Eine neuartige Technik wird vorgestellt, die sowohl Über- als auch Unterschätzungen bei der Ermittlung des zeitlichen Sicherheitsabstands vermeidet und folglich den kleinsten, dennoch ausreichenden Sicherheitsabstand ermittelt. (c)\textbf{(c)} Eindämmung von Alterungseffekten bei Transistoren durch „Graceful Approximation“, eine Technik zur Erhöhung der Taktfrequenz bei Bedarf. Der durch Alterungseffekte bedingte zeitlich Sicherheitsabstand wird durch Approximate Computing Techniken ersetzt. Des Weiteren wird Quantisierung verwendet um ausreichend Genauigkeit bei den Berechnungen zu gewährleisten. (d)\textbf{(d)} Eindämmung von temperaturabhängigen Verschlechterungen der Signallaufzeit durch den Betrieb nahe des Null-Temperatur Koeffizienten (N-ZTC). Der Betrieb bei N-ZTC minimiert temperaturbedingte Abweichungen der Performance und der Leistungsaufnahme. Qualitative und quantitative Vergleiche gegenüber dem traditionellen zeitlichen Sicherheitsabstand werden präsentiert. (e)\textbf{(e)} Modellierung von Power-Management Techniken für NCFET-basierte Prozessoren. Die NCFET Technologie hat einzigartige Eigenschaften, durch die herkömmliche Verfahren zur Spannungs- und Frequenzskalierungen zur Laufzeit (DVS/DVFS) suboptimale Ergebnisse erzielen. Dies erfordert NCFET-spezifische Power-Management Techniken, die in dieser Arbeit vorgestellt werden. (f)\textbf{(f)} Vorstellung eines neuartigen heterogenen Multicore Designs in NCFET Technologie. Das Design beinhaltet identische Kerne; Heterogenität entsteht durch die Anwendung der individuellen, optimalen Konfiguration der Kerne. Amdahls Gesetz wird erweitert, um neue system- und anwendungsspezifische Parameter abzudecken und die Vorzüge des neuen Designs aufzuzeigen. Die Auswertungen der vorgestellten Techniken werden mithilfe von Implementierungen und Simulationen auf Schaltkreisebene (gate-level) durchgeführt. Des Weiteren werden Simulatoren auf Systemebene (system-level) verwendet, um Multicore Designs zu implementieren und zu simulieren. Zur Validierung und Bewertung der Effektivität gegenüber dem Stand der Technik werden analytische, gate-level und system-level Simulationen herangezogen, die sowohl synthetische als auch reale Anwendungen betrachten

    Impact of NCFET on Neural Network Accelerators

    Get PDF
    This is the first work to investigate the impact that Negative Capacitance Field-Effect Transistor (NCFET) brings on the efficiency and accuracy of future Neural Networks (NN). NCFET is at the forefront of emerging technologies, especially after it has become compatible with the existing fabrication process of CMOS. Neural Network inference accelerators are becoming ubiquitous in modern SoCs and there is an ever-increasing demand for tighter and tighter throughput constraints and lower energy consumption. To explore the benefits that NCFET brings to NN inference regarding frequency, energy, and accuracy, we investigate different configurations of the multiply-add (MADD) circuit, which is the core computational unit in any NN accelerator. We demonstrate that, compared to the baseline 7nm FinFET technology, its negative capacitance counterpart reduces the energy by 55%, without any frequency reduction. In addition, it enables leveraging higher computational precision, which results to a considerable improvement in the inference accuracy. Importantly, the achieved accuracy improvement comes also together with a significant energy reduction and without any loss in frequency

    Near-Threshold Computing: Past, Present, and Future.

    Full text link
    Transistor threshold voltages have stagnated in recent years, deviating from constant-voltage scaling theory and directly limiting supply voltage scaling. To overcome the resulting energy and power dissipation barriers, energy efficiency can be improved through aggressive voltage scaling, and there has been increased interest in operating at near-threshold computing (NTC) supply voltages. In this region sizable energy gains are achieved with moderate performance loss, some of which can be regained through parallelism. This thesis first provides a methodical definition of how near to threshold is "near threshold" and continues with an in-depth examination of NTC across past, present, and future CMOS technologies. By systematically defining near-threshold, the trends and tradeoffs are analyzed, lending insight in how best to design and optimize near-threshold systems. NTC works best for technologies that feature good circuit delay scalability, therefore technologies without strong short-channel effects. Early planar technologies (prior to 90nm or so) featured good circuit scalability (8x energy gains), but lacked area in which to add cores for parallelization. Recent planar nodes (32nm – 20nm) feature more area for cores but suffer from poor delay scalability, and so are not well-suited for NTC (4x energy gains). The switch to FinFET CMOS technology allows for a return to strong voltage scalability (8x gain), reversing trends seen in planar technologies, while dark silicon has created an opportunity to add cores for parallelization. Improved FinFET voltage scalability even allows for latency reduction of a single task, as long as the task is sufficiently parallelizable (< 10% serial code). Finally, we will look at a technique for fast voltage boosting, called Shortstop, in which a core's operating voltage is raised in 10s of cycles. Shortstop can be used to quickly respond to single-threaded performance demands of a near-threshold system by leveraging the innate parasitic inductance of a dedicated dirty supply rail, further improving energy efficiency. The technique is demonstrated in a wirebond implementation and is able to boost a core up to 1.8x faster than a header-based approach, while reducing supply droop by 2-7x. An improved flip-chip architecture is also proposed.PhDElectrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/113600/1/npfet_1.pd

    Strain integration and performance optimization in sub-20nm FDSOI CMOS technology

    Get PDF
    La technologie CMOS à base de Silicium complètement déserté sur isolant (FDSOI) est considérée comme une option privilégiée pour les applications à faible consommation telles que les applications mobiles ou les objets connectés. Elle doit cela à son architecture garantissant un excellent comportement électrostatique des transistors ainsi qu'à l'intégration de canaux contraints améliorant la mobilité des porteurs. Ce travail de thèse explore des solutions innovantes en FDSOI pour nœuds 20nm et en deçà, comprenant l'ingénierie de la contrainte mécanique à travers des études sur les matériaux, les dispositifs, les procédés d'intégration et les dessins des circuits. Des simulations mécaniques, caractérisations physiques (µRaman), et intégrations expérimentales de canaux contraints (sSOI, SiGe) ou de procédés générant de la contrainte (nitrure, fluage de l'oxyde enterré) nous permettent d'apporter des recommandations pour la technologie et le dessin physique des transistors en FDSOI. Dans ce travail de thèse, nous avons étudié le transport dans les dispositifs à canal court, ce qui nous a amené à proposer une méthode originale pour extraire simultanément la mobilité des porteurs et la résistance d'accès. Nous mettons ainsi en évidence la sensibilité de la résistance d'accès à la contrainte que ce soit pour des transistors FDSOI ou nanofils. Nous mettons en évidence et modélisons la relaxation de la contrainte dans le SiGe apparaissant lors de la gravure des motifs et causant des effets géométriques (LLE) dans les technologies FDSOI avancées. Nous proposons des solutions de type dessin ainsi que des solutions technologiques afin d'améliorer la performance des cellules standard digitales et de mémoire vive statique (SRAM). En particulier, nous démontrons l'efficacité d'une isolation duale pour la gestion de la contrainte et l'extension de la capacité de polarisation arrière, qui un atout majeur de la technologie FDSOI. Enfin, la technologie 3D séquentielle rend possible la polarisation arrière en régime dynamique, à travers une co-optimisation dessin/technologie (DTCO).The Ultra-Thin Body and Buried oxide Fully Depleted Silicon On Insulator (UTBB FDSOI) CMOS technology has been demonstrated to be highly efficient for low power and low leakage applications such as mobile, internet of things or wearable. This is mainly due to the excellent electrostatics in the transistor and the successful integration of strained channel as a carrier mobility booster. This work explores scaling solutions of FDSOI for sub-20nm nodes, including innovative strain engineering, relying on material, device, process integration and circuit design layout studies. Thanks to mechanical simulations, physical characterizations and experimental integration of strained channels (sSOI, SiGe) and local stressors (nitride, oxide creeping, SiGe source/drain) into FDSOI CMOS transistors, we provide guidelines for technology and physical circuit design. In this PhD, we have in-depth studied the carrier transport in short devices, leading us to propose an original method to extract simultaneously the carrier mobility and the access resistance and to clearly evidence and extract the strain sensitivity of the access resistance, not only in FDSOI but also in strained nanowire transistors. Most of all, we evidence and model the patterning-induced SiGe strain relaxation, which is responsible for electrical Local Layout Effects (LLE) in advanced FDSOI transistors. Taking into account these geometrical effects observed at the nano-scale, we propose design and technology solutions to enhance Static Random Access Memory (SRAM) and digital standard cells performance and especially an original dual active isolation integration. Such a solution is not only stress-friendly but can also extend the powerful back-bias capability, which is a key differentiating feature of FDSOI. Eventually the 3D monolithic integration can also leverage planar Fully-Depleted devices by enabling dynamic back-bias owing to a Design/Technology Co-Optimization

    Function Implementation in a Multi-Gate Junctionless FET Structure

    Get PDF
    Title from PDF of title page, viewed September 18, 2023Dissertation advisor: Mostafizur RahmanVitaIncludes bibliographical references (pages 95-117)Dissertation (Ph.D.)--Department of Computer Science and Electrical Engineering, Department of Physics and Astronomy. University of Missouri--Kansas City, 2023This dissertation explores designing and implementing a multi-gate junctionless field-effect transistor (JLFET) structure and its potential applications beyond conventional devices. The JLFET is a promising alternative to conventional transistors due to its simplified fabrication process and improved electrical characteristics. However, previous research has focused primarily on the device's performance at the individual transistor level, neglecting its potential for implementing complex functions. This dissertation fills this research gap by investigating the function implementation capabilities of the JLFET structure and proposing novel circuit designs based on this technology. The first part of this dissertation presents a comprehensive review of the existing literature on JLFETs, including their fabrication techniques, operating principles, and performance metrics. It highlights the advantages of JLFETs over traditional metal-oxide-semiconductor field-effect transistors (MOSFETs) and discusses the challenges associated with their implementation. Additionally, the review explores the limitations of conventional transistor technologies, emphasizing the need for exploring alternative device architectures. Building upon the theoretical foundation, the dissertation presents a detailed analysis of the multi-gate JLFET structure and its potential for realizing advanced functions. The study explores the impact of different design parameters, such as channel length, gate oxide thickness, and doping profiles, on the device performance. It investigates the trade-offs between power consumption, speed, and noise immunity, and proposes design guidelines for optimizing the function implementation capabilities of the JLFET. To demonstrate the practical applicability of the JLFET structure, this dissertation introduces several novel circuit designs based on this technology. These designs leverage the unique characteristics of the JLFET, such as its steep subthreshold slope and improved on/off current ratio, to implement complex functions efficiently. The proposed circuits include arithmetic units, memory cells, and digital logic gates. Detailed simulations and analyses are conducted to evaluate their performance, power consumption, and scalability. Furthermore, this dissertation explores the potential of the JLFET structure for emerging technologies, such as neuromorphic computing and bioelectronics. It investigates how the JLFET can be employed to realize energy-efficient and biocompatible devices for applications in artificial intelligence and biomedical engineering. The study investigates the compatibility of the JLFET with various materials and substrates, as well as its integration with other functional components. In conclusion, this dissertation contributes to the field of nanoelectronics by providing a comprehensive investigation into the function implementation capabilities of the multi-gate JLFET structure. It highlights the potential of this device beyond its individual transistor performance and proposes novel circuit designs based on this technology. The findings of this research pave the way for the development of advanced electronic systems that are more energy-efficient, faster, and compatible with emerging applications in diverse fields.Introduction -- Literature review -- Crosstalk principle -- Experiment of crosstalk -- Device architecture -- Simulation & results -- Conclusio

    Approximate and timing-speculative hardware design for high-performance and energy-efficient video processing

    Get PDF
    Since the end of transistor scaling in 2-D appeared on the horizon, innovative circuit design paradigms have been on the rise to go beyond the well-established and ultraconservative exact computing. Many compute-intensive applications – such as video processing – exhibit an intrinsic error resilience and do not necessarily require perfect accuracy in their numerical operations. Approximate computing (AxC) is emerging as a design alternative to improve the performance and energy-efficiency requirements for many applications by trading its intrinsic error tolerance with algorithm and circuit efficiency. Exact computing also imposes a worst-case timing to the conventional design of hardware accelerators to ensure reliability, leading to an efficiency loss. Conversely, the timing-speculative (TS) hardware design paradigm allows increasing the frequency or decreasing the voltage beyond the limits determined by static timing analysis (STA), thereby narrowing pessimistic safety margins that conventional design methods implement to prevent hardware timing errors. Timing errors should be evaluated by an accurate gate-level simulation, but a significant gap remains: How these timing errors propagate from the underlying hardware all the way up to the entire algorithm behavior, where they just may degrade the performance and quality of service of the application at stake? This thesis tackles this issue by developing and demonstrating a cross-layer framework capable of performing investigations of both AxC (i.e., from approximate arithmetic operators, approximate synthesis, gate-level pruning) and TS hardware design (i.e., from voltage over-scaling, frequency over-clocking, temperature rising, and device aging). The cross-layer framework can simulate both timing errors and logic errors at the gate-level by crossing them dynamically, linking the hardware result with the algorithm-level, and vice versa during the evolution of the application’s runtime. Existing frameworks perform investigations of AxC and TS techniques at circuit-level (i.e., at the output of the accelerator) agnostic to the ultimate impact at the application level (i.e., where the impact is truly manifested), leading to less optimization. Unlike state of the art, the framework proposed offers a holistic approach to assessing the tradeoff of AxC and TS techniques at the application-level. This framework maximizes energy efficiency and performance by identifying the maximum approximation levels at the application level to fulfill the required good enough quality. This thesis evaluates the framework with an 8-way SAD (Sum of Absolute Differences) hardware accelerator operating into an HEVC encoder as a case study. Application-level results showed that the SAD based on the approximate adders achieve savings of up to 45% of energy/operation with an increase of only 1.9% in BD-BR. On the other hand, VOS (Voltage Over-Scaling) applied to the SAD generates savings of up to 16.5% in energy/operation with around 6% of increase in BD-BR. The framework also reveals that the boost of about 6.96% (at 50°) to 17.41% (at 75° with 10- Y aging) in the maximum clock frequency achieved with TS hardware design is totally lost by the processing overhead from 8.06% to 46.96% when choosing an unreliable algorithm to the blocking match algorithm (BMA). We also show that the overhead can be avoided by adopting a reliable BMA. This thesis also shows approximate DTT (Discrete Tchebichef Transform) hardware proposals by exploring a transform matrix approximation, truncation and pruning. The results show that the approximate DTT hardware proposal increases the maximum frequency up to 64%, minimizes the circuit area in up to 43.6%, and saves up to 65.4% in power dissipation. The DTT proposal mapped for FPGA shows an increase of up to 58.9% on the maximum frequency and savings of about 28.7% and 32.2% on slices and dynamic power, respectively compared with stat
    • …
    corecore