27 research outputs found

    Testability and redundancy techniques for improved yield and reliability of CMOS VLSI circuits

    The research presented in this thesis is concerned with the design of fault-tolerant integrated circuits as a contribution to the design of fault-tolerant systems. The economical manufacture of very large area ICs will necessitate the incorporation of fault-tolerance features, which are routinely employed in current high-density dynamic random access memories. Furthermore, the growing use of ICs in safety-critical applications and/or hostile environments, in addition to the prospect of single-chip systems, will mandate the use of fault tolerance for improved reliability. A fault-tolerant IC must be able to detect and correct all possible faults that may affect its operation. The ability of a chip to detect its own faults is not only necessary for fault tolerance, but is also regarded as the ultimate solution to the problem of testing. Off-line periodic testing is selected for this research because it achieves better coverage of physical faults and requires less extra hardware than on-line error detection techniques. Tests for CMOS stuck-open faults are shown to detect all other faults. Simple test sequence generation procedures for the detection of all faults are derived. The test sequences generated by these procedures produce a trivial output, thereby greatly simplifying the task of test response analysis. A further advantage of the proposed test generation procedures is that they do not require the enumeration of faults. The implementation of built-in self-test is considered, and it is shown that the hardware overhead is comparable to that associated with pseudo-random and pseudo-exhaustive techniques while achieving a much higher fault coverage through the use of the proposed test generation procedures. Consideration of the problem of testing the test circuitry leads to the conclusion that complete test coverage may be achieved if separate chips cooperate in testing each other's untested parts. An alternative approach towards complete test coverage is to design the test circuitry so that it is as distributed as possible and is tested as it performs its function. Fault correction relies on the provision of spare units and a means of reconfiguring the circuit so that the faulty units are discarded. This raises the question of what the optimum size of a unit should be. A mathematical model linking yield and reliability is therefore developed to answer this question and to study the effects of parameters such as the amount of redundancy, the size of the additional circuitry required for testing and reconfiguration, and the effect of periodic testing on reliability. The stringent requirement on the size of the reconfiguration logic is illustrated by applying the model to a typical example. Another important result concerns the effect of periodic testing on reliability: it is shown that periodic off-line testing can achieve approximately the same level of reliability as on-line testing, even when the time between tests is many hundreds of hours.
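    The yield and reliability trade-off described above can be made concrete with a standard first-order redundancy model; the formulation below is a generic textbook sketch, not the model developed in the thesis. Assuming Poisson-distributed defects of density d, modules of area A, N required modules plus S identical spares, and test/reconfiguration circuitry of area A_r that must itself be fault-free, the module yield y and the chip yield Y are

        y = e^{-A d}, \qquad
        Y = e^{-A_r d} \sum_{i=0}^{S} \binom{N+S}{i} (1-y)^{i} \, y^{\,N+S-i}

    The factor e^{-A_r d} makes the point about the reconfiguration logic explicit: any overhead area that spares cannot repair degrades the chip yield directly, which is why its size requirement is so stringent.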

    Reliable Low-Power High Performance Spintronic Memories

    Following Moore's Law, the chip industry has achieved explosive growth over the past five decades. This has likewise led to an exponential increase in the demand for memory components, which in turn results in memory-dominated chips in today's computer systems. However, traditional on-chip memory technologies such as Static Random Access Memories (SRAMs), Dynamic Random Access Memories (DRAMs) and flip-flops pose challenges in terms of scalability, power dissipation and reliability. These very challenges, together with the overwhelming demand for higher performance and integration density of on-chip memory, motivate researchers to look for new non-volatile memory technologies. Emerging spintronic memory technologies such as Spin Orbit Torque (SOT) and Spin Transfer Torque (STT) have received considerable attention in recent years because they offer a number of advantages, including non-volatility, scalability, high endurance, CMOS compatibility and immunity to soft errors. In spintronics, the spin of an electron represents its information. The data is stored as a resistance level, which can be changed by applying a polarized current to the storage medium. These memory devices tackle the problem of static power both through their leakage-free storage and through their normally-off/instant-on behaviour. Nevertheless, other problems, such as the high access latency and the energy consumption, still have to be solved before they can find widespread adoption. To address these problems, new computing paradigms, architectures and design philosophies are necessary. The high access latency of spintronic technology stems from a comparatively long switching time, which exceeds that of conventional SRAM. Furthermore, because of the stochastic switching behaviour of the memory cell and the influence of process variation, a non-negligible time window is required during which a constant write current is driven through the bit cell to guarantee the switching process. This causes high energy consumption. The read operation likewise requires a considerable time window, again due to the influence of process variation. On top of this come various reliability problems, including read disturbance and other degradation problems such as Time Dependent Dielectric Breakdown (TDDB). These reliability problems can in turn be traced back to the longer switching times required, which imply a read or write current that is applied for a longer period. It is therefore necessary to reduce both the energy and the latency, and thereby improve reliability, in order to make this technology a viable candidate for an on-chip memory system. In this dissertation we present design strategies that aim to tackle the challenges of cache, register-file and flip-flop design. We achieve this with the help of a cross-layer approach. For caches, we developed several circuit-level techniques, which are evaluated at the memory-architecture level as well as at the system level with respect to energy consumption, performance gains and reliability improvements.
We develop a self-termination technique for both the read and the write operation of caches, which is able to detect the completion of the respective operation dynamically. Once completion has been detected, the read or write operation is stopped immediately in order to save energy. In addition, the self-termination technique limits the duration of the current flow through the memory cell, which in turn reduces the occurrence of TDDB and read disturbance during write and read operations, respectively. To improve the write latency, we boost the write current at the bit cell in order to accelerate the magnetic switching process. To account for register-file-specific requirements, we additionally designed a multi-port memory architecture that exploits a unique property of the SOT cell to perform read and write operations simultaneously. It is therefore possible to resolve read/write conflicts at the bit-cell level, which in turn translates into a much simpler multi-port register-file architecture. In addition to the memory techniques, we also present two flip-flop architectures. The first is a non-volatile non-shadow flip-flop architecture that uses the storage cell as an active component. This allows the supply voltage to be switched on and off instantly and therefore makes the design particularly well suited for aggressive power gating. Overall, the proposed flip-flop design exhibits timing characteristics similar to those of conventional CMOS flip-flops, while at the same time allowing a significant reduction in static power consumption compared to non-volatile shadow flip-flops. The second is a fault-tolerant flip-flop architecture that is robust against various defects and faults. The effectiveness of all proposed techniques is demonstrated by extensive circuit-level simulations, which are further substantiated by detailed system-level evaluations. Overall, we were able to develop several techniques that achieve substantial improvements in the performance, energy and reliability of spintronic on-chip memories such as caches, register files and flip-flops.
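    A minimal sketch of the self-termination idea, assuming a fixed write pulse as the baseline and a Gaussian distribution for the stochastic switching time; the function name, parameters and numbers below are illustrative assumptions, not values from the dissertation.

        # Illustrative only: estimate the write energy saved when the write pulse is
        # terminated as soon as switching completes, instead of running for a fixed time.
        # All parameter values are assumptions chosen for demonstration.
        import random

        def simulate_write_energy(n_cells=100_000, i_write=1e-4, v_bitline=1.0,
                                  t_fixed=10e-9, mean_switch=4e-9, sigma=1.5e-9, seed=0):
            rng = random.Random(seed)
            e_fixed = e_self = 0.0
            for _ in range(n_cells):
                # Stochastic switching time, clipped to the fixed pulse width.
                t_switch = min(max(rng.gauss(mean_switch, sigma), 0.5e-9), t_fixed)
                power = i_write * v_bitline
                e_fixed += power * t_fixed     # current flows for the full fixed pulse
                e_self += power * t_switch     # current stops once completion is detected
            return e_fixed, e_self

        e_fixed, e_self = simulate_write_energy()
        print(f"energy saved by self-termination: {100 * (1 - e_self / e_fixed):.1f}%")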

    Design-for-delay-testability techniques for high-speed digital circuits

    The importance of delay faults is growing with the ever-increasing clock rates and shrinking geometries of today's circuits. This thesis focuses on the development of Design-for-Delay-Testability (DfDT) techniques for high-speed circuits and embedded cores. The rising costs of IC testing, and in particular the costs of Automatic Test Equipment, are major concerns for the semiconductor industry. To reverse the trend of rising testing costs, DfDT is becoming more and more important.

    Resilience of an embedded architecture using hardware redundancy

    In the last decade, the dominance of general-purpose computing systems in the market has been overtaken by embedded systems, with billions of units manufactured every year. Embedded systems appear in contexts where continuous operation is of utmost importance and failure can be profound. Nowadays, radiation poses a serious threat to the reliable operation of safety-critical systems. Fault avoidance techniques, such as radiation hardening, have commonly been used in space applications. However, these components are expensive, lag behind commercial components with regard to performance, and do not provide 100% fault elimination. Without fault-tolerant mechanisms, many of these faults can become errors at the application or system level, which in turn can result in catastrophic failures. In this work we study the concepts of fault tolerance and dependability and extend these concepts, providing our own definition of resilience. We analyse the physics of radiation-induced faults, the damage mechanisms of particles, and the process that leads to computing failures. We provide extensive taxonomies of (1) existing fault-tolerant techniques and (2) the effects of radiation on state-of-the-art electronics, analysing and comparing their characteristics. We propose a detailed model of faults and provide a classification of the different types of faults at various levels. We introduce a fault-tolerance algorithm and define the system states and actions necessary to implement it. We introduce novel hardware and system software techniques that provide a more efficient combination of reliability, performance and power consumption than existing techniques. We propose a new element of the system, called the syndrome, that is the core of a resilient architecture whose software and hardware can adapt to reliable and unreliable environments. We implement a software simulator and disassembler and introduce a testing framework in combination with ERA's assembler and commercial hardware simulators.
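    As a minimal illustration of the hardware-redundancy principle on which this work builds, the sketch below shows classic triple modular redundancy (TMR) with a majority voter; it is a generic textbook construction, not the syndrome-based resilient architecture proposed in the thesis.

        # Generic triple-modular-redundancy (TMR) voter; illustrative only.
        from collections import Counter

        def tmr_vote(replica_outputs):
            """Return the majority value of three replicated computations.

            A single faulty replica is masked; if all three replicas disagree,
            the fault is detected but cannot be corrected by voting alone.
            """
            value, votes = Counter(replica_outputs).most_common(1)[0]
            if votes >= 2:
                return value, True    # fault-free or corrected
            return None, False        # detected but uncorrectable by majority

        # Example: one replica suffers a radiation-induced upset.
        print(tmr_vote([42, 41, 42]))  # -> (42, True)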

    Semiconductor-technology exploration: getting the most out of the MOST


    Network-on-Chip

    Addresses the challenges associated with system-on-chip integration. Network-on-Chip: The Next Generation of System-on-Chip Integration examines the current issues restricting chip-on-chip communication efficiency and explores the network-on-chip (NoC), a promising alternative that equips designers with the capability to produce a scalable, reusable, and high-performance communication backbone by allowing the integration of a large number of cores on a single system-on-chip (SoC). This book provides a basic overview of topics associated with NoC-based design: communication infrastructure design, communication methodology, evaluation framework, and mapping of applications onto NoC. It details the design and evaluation of different proposed NoC structures, low-power techniques, signal integrity and reliability issues, application mapping, testing, and future trends. Utilizing examples of chips that have been implemented in industry and academia, this text presents the full architectural design of components verified through implementation in industrial CAD tools. It describes NoC research and developments, incorporates theoretical proofs strengthening the analysis procedures, and includes algorithms used in NoC design and synthesis. In addition, it considers other upcoming NoC issues, such as low-power NoC design, signal integrity issues, NoC testing, reconfiguration, synthesis, and 3-D NoC design. The text comprises 12 chapters and covers: the evolution of NoC from SoC, including its research and developmental challenges; NoC protocols, elaborating flow control, available network topologies, routing mechanisms, fault tolerance, quality-of-service support, and the design of network interfaces; the router design strategies followed in NoCs; the evaluation mechanism of NoC architectures; the application mapping strategies followed in NoCs; low-power design techniques specifically followed in NoCs; the signal integrity and reliability issues of NoC; the details of NoC testing strategies reported so far; the problem of synthesizing application-specific NoCs; reconfigurable NoC design issues; and the direction of future research and development in the field of NoC. Network-on-Chip: The Next Generation of System-on-Chip Integration covers the basic topics, technology, and future trends relevant to NoC-based design, and can be used by engineers, students, researchers, and other industry professionals interested in computer architecture, embedded systems, and parallel/distributed systems.
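    To make the routing-mechanism topic concrete, the sketch below shows dimension-ordered (XY) routing on a 2D mesh, one of the simplest deterministic NoC routing algorithms; it is a generic example rather than code from the book. XY routing is deadlock-free on a mesh because packets never turn from the Y dimension back into the X dimension.

        # Dimension-ordered (XY) routing on a 2D mesh NoC; generic illustration.
        def xy_route(src, dst):
            """Return the list of (x, y) routers visited from src to dst,
            resolving the X offset first and then the Y offset."""
            x, y = src
            dx, dy = dst
            path = [(x, y)]
            while x != dx:                      # move along X first
                x += 1 if dx > x else -1
                path.append((x, y))
            while y != dy:                      # then along Y
                y += 1 if dy > y else -1
                path.append((x, y))
            return path

        # Route a flit from router (0, 0) to router (2, 3).
        print(xy_route((0, 0), (2, 3)))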

    Reliable Design of Three-Dimensional Integrated Circuits


    Memory architectures for exaflop computing systems

    Most computing systems are heavily dependent on their main memories, either as their primary storage or as an intermediate cache for slower storage systems (HDDs). The capacity of memory systems, as well as their performance, has a direct impact on the overall computing capabilities of the system, and both are major contributors to its initial and operating costs. Dynamic Random Access Memory (DRAM) technology has dominated the main memory landscape from its beginnings in the 1970s until today. However, due to DRAM's inherent limitations, its steady rate of development has saturated over the past decade, creating a disparity between CPU and main memory performance known as the memory wall. Modern parallel architectures, such as High-Performance Computing (HPC) clusters and manycore solutions, create even more stress on their memory systems. It is not trivial to estimate the memory requirements that these systems will have in the future, whether DRAM technology will be able to meet them, or whether we will need to look for a novel memory solution. This thesis attempts to give insight into the most important technological challenges that future memory systems need to address in order to meet the ever-growing requirements of users and their applications in the manycore and HPC context. We describe the limitations of DRAM, as the dominant technology in today's main memory systems, that may impede performance or increase the cost of future systems. We discuss some of the emerging memory technologies and, by comparing them with DRAM, estimate their potential usage in future memory systems. The thesis evaluates the requirements of manycore scientific applications, in terms of memory bandwidth and footprint, and estimates how these requirements may change in the future. With this evaluation in mind, we propose a hybrid memory solution that employs DRAM and PCM, as well as several page placement and page migration policies, to bridge the gap between fast but small DRAM and larger but slower non-volatile memory. As the aforementioned evaluations required custom software solutions, we present the tools produced over the course of this PhD, which continue to be used in the Heterogeneous Computer Architectures group at the Barcelona Supercomputing Center. First, Limpio, a LIghtweight MPI instrumentatiOn framework that provides an interface for low-overhead instrumentation and profiling of MPI applications with user-defined routines. Second, MemTraceMPI, a Valgrind tool used to produce memory access traces of MPI applications, with several innovative concepts included (filter-cache, iteration tracing, compressed trace files).
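    The hybrid DRAM+PCM proposal above can be illustrated with a toy epoch-based page-placement policy: pages accessed often in an epoch are kept in a small DRAM pool, while everything else resides in the larger, slower PCM pool. The class, threshold and capacity below are invented for illustration and are not the policies evaluated in the thesis.

        # Toy hot/cold page placement for a hybrid DRAM+PCM main memory; illustrative only.
        class HybridMemory:
            def __init__(self, dram_pages, hot_threshold=8):
                self.dram_capacity = dram_pages      # pages the DRAM pool can hold
                self.hot_threshold = hot_threshold   # accesses per epoch to count as hot
                self.access_counts = {}              # page -> accesses in current epoch
                self.in_dram = set()                 # pages currently placed in DRAM

            def access(self, page):
                self.access_counts[page] = self.access_counts.get(page, 0) + 1

            def end_epoch(self):
                """Migrate the hottest pages to DRAM; the rest stay in PCM."""
                hot = [p for p, c in self.access_counts.items() if c >= self.hot_threshold]
                hot.sort(key=lambda p: self.access_counts[p], reverse=True)
                self.in_dram = set(hot[:self.dram_capacity])
                self.access_counts.clear()           # start counting the next epoch

        mem = HybridMemory(dram_pages=2)
        for page in [1] * 9 + [2] + [3] * 10:
            mem.access(page)
        mem.end_epoch()
        print(sorted(mem.in_dram))                   # -> [1, 3]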

    Modeling and Mitigation of Soft Errors in Nanoscale SRAMs

    Energetic particle (alpha particle, cosmic neutron, etc.) induced single-event data upset, or soft error, has emerged as a key reliability concern for SRAMs in sub-100 nanometre technologies. Low operating voltage, small node capacitance, high packing density, and the lack of error masking mechanisms are primarily responsible for the soft error susceptibility of SRAMs. In addition, since SRAM occupies the majority of die area in system-on-chips (SoCs) and microprocessors, different leakage reduction techniques, such as supply voltage reduction and gated grounding, are applied to SRAMs in order to limit the overall chip leakage. These leakage reduction techniques exponentially increase the soft error rate in SRAMs. The soft error rate is further accentuated by process variations, which are prominent in scaled-down technologies. In this research, we address these concerns and propose techniques to characterize and mitigate soft errors in nanoscale SRAMs. We develop a comprehensive analytical model of the critical charge, which is key to assessing the soft error susceptibility of SRAMs. The model is based on the dynamic behaviour of the cell and a simple decoupling technique for the non-linearly coupled storage nodes. The model describes the critical charge in terms of NMOS and PMOS transistor parameters, cell supply voltage, and noise current parameters. Consequently, it enables characterizing the spread of critical charge due to process-induced variations in these parameters and to manufacturing defects such as resistive contacts or vias. In addition, the model can estimate the improvement in critical charge when MIM capacitors are added to the cell in order to improve soft error robustness. The model is validated by SPICE simulations (90nm CMOS) and a radiation test. The critical charge calculated by the model is in good agreement with SPICE simulations, with a maximum discrepancy of less than 5%. The soft error rate estimated by the model for low-voltage (sub-0.8 V) operation is within 10% of the soft error rate measured in the radiation test. Therefore, the model can serve as a reliable alternative to time-consuming SPICE simulations for optimizing the critical charge, and hence the soft error rate, at the design stage. In order to limit the soft error rate further, we propose an area-efficient multiword-based error correction code (MECC) scheme. The MECC scheme combines four 32-bit data words to form a composite 128-bit ECC word and uses an optimized 4-input transmission-gate XOR logic. Thus MECC significantly reduces the area overhead for check-bit storage and the delay penalty for error correction. In addition, MECC interleaves two composite words in a row to limit cosmic-neutron-induced multi-bit errors. The ground potentials of the composite words are controlled to minimize leakage power without compromising read data stability. However, the use of composite words involves a unique write operation in which one data word is written while the other three data words are read to update the check-bits. A power-efficient word line signaling technique is developed to facilitate this write operation. A 64 kb SRAM macro with MECC is designed and fabricated in a commercial 90nm CMOS technology. Measurement results show that the SRAM consumes 534 µW at 100 MHz with a data latency of 3.3 ns for single-bit error correction. This translates into an 82% per-bit energy saving and an 8x speed improvement over recently reported multiword ECC schemes.
    An accelerated neutron radiation test carried out at TRIUMF in Vancouver confirms that the proposed MECC scheme can correct up to 85% of soft errors.
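    The area saving from combining four data words into one composite ECC word follows from standard Hamming SEC-DED arithmetic: the number of check bits grows only logarithmically with the protected word length. The short calculation below illustrates this; it is generic arithmetic, not the optimized transmission-gate XOR encoder designed in the thesis.

        # Check-bit cost of a Hamming SEC-DED code for several data widths; generic arithmetic.
        def secded_check_bits(k):
            """Smallest r with 2**r >= k + r + 1, plus one extra parity bit for
            double-error detection."""
            r = 1
            while 2 ** r < k + r + 1:
                r += 1
            return r + 1

        for k in (32, 64, 128):
            r = secded_check_bits(k)
            print(f"{k:4d} data bits -> {r} check bits ({100 * r / k:.1f}% overhead)")

        # Protecting four 32-bit words separately costs 4 * 7 = 28 check bits, whereas a
        # single composite 128-bit word needs only 9, which is the source of the area saving.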