
    Design-for-delay-testability techniques for high-speed digital circuits

    The importance of delay faults is heightened by the ever-increasing clock rates and shrinking geometry sizes of today's circuits. This thesis focuses on the development of Design-for-Delay-Testability (DfDT) techniques for high-speed circuits and embedded cores. The rising cost of IC testing, and in particular the cost of Automatic Test Equipment, is a major concern for the semiconductor industry. To reverse the trend of rising test costs, DfDT is becoming more and more important.

    Fault and Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices

    This research addresses the design of a reliable computer from unreliable device technologies. A system architecture is developed for a fault and defect tolerant (FDT) computer. Trade-offs between different techniques are studied, and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 × 10^-6, three orders of magnitude better than non-fault-tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10^-6. The required hardware redundancy is approximately 15 times that of a non-fault-tolerant design. While larger than current FT designs, this architecture allows the use of devices much more likely to fail than silicon CMOS. As part of model development, an improved model is derived for NAND multiplexing. The model is the first accurate model for small and medium amounts of redundancy. Previous models are extended to account for dependence between the inputs and produce more accurate results.
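
    As context for the NAND-multiplexing result above, the following is a minimal Python sketch of the classic first-order (mean-value) recursion for NAND multiplexing, assuming random pairing of bundle wires and a per-gate flip probability eps. It illustrates only the textbook background technique, not the improved model derived in this work.

        # Sketch: classic mean-value recursion for NAND multiplexing (background only).
        def nand_mux_stage(x, y, eps):
            """Expected fraction of stimulated (logic-1) outputs of a NAND bundle,
            given fractions x, y of stimulated inputs and gate flip probability eps."""
            z = 1.0 - x * y                       # ideal NAND on randomly paired wires
            return (1 - eps) * z + eps * (1 - z)  # each gate flips its output with prob. eps

        def restorative_unit(x, eps):
            """Two NAND stages in series restore a bundle driven by the same signal."""
            mid = nand_mux_stage(x, x, eps)
            return nand_mux_stage(mid, mid, eps)

        if __name__ == "__main__":
            for eps in (1e-3, 1e-2, 5e-2):
                x = 0.9                           # 90% of wires initially carry the correct 1
                for _ in range(10):               # iterate restoring stages
                    x = restorative_unit(x, eps)
                print(f"eps={eps:g}: stimulated fraction after 10 stages = {x:.4f}")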

    AI/ML Algorithms and Applications in VLSI Design and Technology

    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual and thus time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort needed to understand and process the data within and across different abstraction levels via automated learning algorithms. This, in turn, improves IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced to date for VLSI design and manufacturing. Moreover, we discuss the scope of future AI/ML applications at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations.

    Characterisation and mitigation of long-term degradation effects in programmable logic

    Reliability has always been an issue in silicon device engineering, but until now it has been managed by the carefully tuned fabrication process. In the future, the underlying physical limitations of silicon-based electronics, plus the practical challenges of manufacturing with such complexity at such a small scale, will lead to a crunch point where transistor-level reliability must be forfeited to continue achieving better productivity. Field-programmable gate arrays (FPGAs) are built on state-of-the-art silicon processes, but it has been recognised for some time that their distinctive characteristics put them in a favourable position over application-specific integrated circuits in the face of the reliability challenge. The literature shows how a regular structure, interchangeable resources and an ability to reconfigure can all be exploited to detect, locate, and overcome degradation and keep an FPGA application running. To fully exploit these characteristics, a better understanding is needed of the behavioural changes seen in the resources that make up an FPGA under ageing. Modelling is an attractive approach to this, and in this thesis the causes and effects of three important degradation mechanisms are explored. All are shown to have an adverse effect on FPGA operation, but their characteristics present novel opportunities for ageing mitigation. Any modelling exercise is built on assumptions, so an empirical method is developed for investigating ageing on hardware with an accelerated-life test. Here, experiments show that timing degradation due to negative-bias temperature instability is the dominant process in the technology considered. Building on simulated and experimental results, this work also demonstrates a variety of methods for increasing the lifetime of FPGA lookup tables. The pre-emptive measure of wear-levelling is investigated in particular detail, and it is shown by experiment how different reconfiguration algorithms can result in a significant reduction in the rate of degradation.
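
    To make the wear-levelling idea concrete, the sketch below implements a generic least-stressed-first placement policy in Python; the Region/choose_region names and the epoch-based stress accounting are illustrative assumptions, not the specific reconfiguration algorithms compared in the thesis.

        # Sketch: generic wear-levelling across interchangeable FPGA resources.
        from dataclasses import dataclass

        @dataclass
        class Region:
            name: str
            stress_hours: float = 0.0   # accumulated time spent hosting the active design

        def choose_region(regions):
            """Place the active configuration on the least-aged region, spreading
            NBTI-style stress evenly across interchangeable resources."""
            return min(regions, key=lambda r: r.stress_hours)

        def run_epochs(regions, epoch_hours, epochs):
            for _ in range(epochs):
                active = choose_region(regions)   # reconfigure at each epoch boundary
                active.stress_hours += epoch_hours
            return regions

        if __name__ == "__main__":
            pool = [Region(f"LUT-cluster-{i}") for i in range(4)]
            for r in run_epochs(pool, epoch_hours=24, epochs=40):
                print(r.name, r.stress_hours)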

    Investigation into yield and reliability enhancement of TSV-based three-dimensional integration circuits

    Three-dimensional integrated circuits (3D ICs) have been acknowledged as a promising technology to overcome the interconnect delay bottleneck brought by continuous CMOS scaling. Recent research shows that through-silicon vias (TSVs), which act as vertical links between layers, pose yield and reliability challenges for 3D design. This thesis presents three original contributions. The first contribution presents a grouping-based technique to improve the yield of 3D ICs under manufacturing TSV defects, where regular and redundant TSVs are partitioned into groups. In each group, signals can select good TSVs through rerouting multiplexers, avoiding defective TSVs. The grouping ratio (regular to redundant TSVs in one group) has an impact on yield and hardware overhead. Mathematical probabilistic models are presented for yield analysis under both independent and clustering defect distributions. Simulation results using MATLAB show that, for a given number of TSVs and a given TSV failure rate, careful selection of the grouping ratio achieves 100% yield at minimal hardware cost (number of multiplexers and redundant TSVs) in comparison to a design that does not exploit TSV grouping ratios. The second contribution presents an efficient online fault tolerance technique based on redundant TSVs that detects TSV manufacturing defects and addresses thermally induced reliability issues. The proposed technique accounts for both fault detection and recovery in the presence of three TSV defects: voids, delamination between the TSV and the landing pad, and TSV short-to-substrate. Simulations using HSPICE and ModelSim validate fault detection and recovery. Results show that regular and redundant TSVs can be divided into groups to minimise area overhead without affecting the fault tolerance capability of the technique. Synthesis results using a 130-nm design library show that 100% repair capability can be achieved with low area overhead (4% in the best case). The last contribution proposes a technique that jointly considers temperature mitigation and fault tolerance without introducing additional redundant TSVs. This is achieved by reusing spare TSVs that are frequently deployed for improving yield and reliability in 3D ICs. The proposed technique consists of two steps: a TSV determination step, which achieves an optimal partition of regular and spare TSVs into groups, and a TSV placement step, which targets temperature mitigation while optimizing total wirelength and routing difference. Simulation results show that, using the proposed technique, 100% repair capability is achieved across all (five) benchmarks with an average temperature reduction of 75.2°C (34.1%) (best case 99.8°C (58.5%)), while increasing wirelength by only a small amount.
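
    For intuition about the grouping-ratio trade-off, a group of m regular and s spare TSVs survives as long as at most s of its m+s TSVs are defective. The Python sketch below evaluates that binomial expression under an independent-defect assumption with illustrative parameters; the thesis's models additionally cover clustered defect distributions.

        # Sketch: binomial yield model for TSV grouping with spares (independent defects).
        from math import comb

        def group_yield(m, s, f):
            """Probability that a group of m regular + s spare TSVs has at most s defects."""
            n = m + s
            return sum(comb(n, k) * f**k * (1 - f)**(n - k) for k in range(s + 1))

        def chip_yield(total_regular, m, s, f):
            """Yield when the regular TSVs are split into groups of m, each with s spares."""
            groups = total_regular // m
            return group_yield(m, s, f) ** groups

        if __name__ == "__main__":
            f = 1e-3                                # assumed TSV failure rate
            for m, s in [(4, 1), (8, 1), (8, 2)]:   # candidate grouping ratios
                print(f"{m}:{s} grouping -> chip yield = {chip_yield(1024, m, s, f):.4f}")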

    Constraint-driven RF test stimulus generation and built-in test

    With the explosive growth in wireless applications, the last decade has witnessed an ever-increasing test challenge for radio frequency (RF) circuits. While the design community has pushed the envelope far into the future by extending CMOS processes to high-frequency wireless devices, test methodology has not advanced at the same pace. Consequently, testing such devices has become a major bottleneck in high-volume production, further driven by the growing need for tighter quality control. RF devices undergo testing during the prototype phase and during high-volume manufacturing (HVM). The benchtop test equipment used throughout prototyping is very precise yet specialized for a subset of functionalities. HVM calls for a different kind of test paradigm that emphasizes throughput and sufficiency, during which the projected performance parameters are measured one by one for each device by automated test equipment (ATE) and compared against defined limits called specifications. The set of tests required for each product differs greatly in terms of the equipment required and the time taken to test individual devices. Together with signal integrity, precision, and repeatability concerns, the initial cost of RF ATE is prohibitively high. As more functionality and protocols are integrated into a single RF device, the required number of specifications to be tested also increases, adding to the overall cost of testing, both in terms of the initial and the recurring operating costs. In addition to the cost problem, RF testing poses another challenge when these components are integrated into package-level system solutions. In systems-on-package (SOP), the test problems resulting from signal integrity, input/output (I/O) bandwidth, and limited controllability and observability have initiated a paradigm shift in high-speed analog testing, favoring alternative approaches such as built-in test (BIT), where the test functionality is brought into the package. This scheme can make use of a low-cost external tester connected through a low-bandwidth link in order to perform demanding response evaluations, as well as make use of the analog-to-digital converters and the digital signal processors available in the package to facilitate testing. Although research on analog built-in test has demonstrated hardware solutions for single specifications, the paradigm shift calls for a rather general approach in which a single methodology can be applied across different devices, and multiple specifications can be verified through a single test hardware unit, minimizing the area overhead. Specification-based alternate test methodology provides a suitable and flexible platform for handling the challenges addressed above. In this thesis, a framework that integrates ATE and system constraints into test stimulus generation and test response extraction is presented for the efficient production testing of high-performance RF devices using specification-based alternate tests. The main components of the presented framework are as follows:
    Constraint-driven RF alternate test stimulus generation: An automated test stimulus generation algorithm is developed for RF devices that are evaluated by a specification-based alternate test solution. The high-level models of the test signal path define constraints in the search space of the optimized test stimulus. These models are generated in enough detail such that they inherently define the limitations of the low-cost ATE and the I/O restrictions of the device under test (DUT), yet they are simple enough that the non-linear optimization problem can be solved empirically in a reasonable amount of time.
    Feature extractors for BIT: A methodology for the built-in testing of RF devices integrated into SOPs is developed using additional hardware components. These hardware components correlate the high-bandwidth test response to low-bandwidth signatures while extracting the test-critical features of the DUT. Supervised learning is used to map these extracted features, which otherwise are too complicated to decipher by plain mathematical analysis, into the specifications under test.
    Defect-based alternate testing of RF circuits: A methodology for the efficient testing of RF devices with low-cost defect-based alternate tests is developed. The signature of the DUT is probabilistically compared with a class of defect-free device signatures to explore possible corners under acceptable levels of process parameter variations. Such a defect filter applies discrimination rules generated by a supervised classifier and eliminates the need for a library of possible catastrophic defects.
    Ph.D. Committee Chair: Chatterjee, Abhijit; Committee Members: Durgin, Greg; Keezer, David; Milor, Linda; Sitaraman, Sures
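
    The core of specification-based alternate test is a learned map from low-bandwidth response signatures to specification values. The Python sketch below shows that idea with a plain least-squares regression on synthetic data; the actual feature extractors, stimulus optimization, and learning models in this thesis are considerably more elaborate.

        # Sketch: fit a signature-to-specification map, then predict a spec for a new device.
        import numpy as np

        rng = np.random.default_rng(0)

        # Synthetic training set: each row is a low-bandwidth signature; the target is
        # one RF specification (e.g., gain) measured conventionally on training devices.
        signatures = rng.normal(size=(200, 6))
        true_weights = rng.normal(size=6)
        gain_spec = signatures @ true_weights + 0.05 * rng.normal(size=200)

        # Learn the map from signatures to the specification.
        W, *_ = np.linalg.lstsq(signatures, gain_spec, rcond=None)

        # Predict the specification of a new device from its signature alone.
        new_signature = rng.normal(size=(1, 6))
        predicted_gain = (new_signature @ W)[0]
        print(f"predicted gain spec: {predicted_gain:.3f}")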

    Dependable Embedded Systems

    This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. It introduces the most prominent reliability concerns from today's point of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as the circuit level or the system level alone, this book deals with the different reliability challenges across different levels, starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. It provides readers with the latest insights into novel, cross-layer methods and models with respect to the dependability of embedded systems; describes cross-layer approaches that can leverage reliability through techniques that are pro-actively designed with respect to techniques at other layers; and explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.

    On Energy Efficient Computing Platforms

    In accordance with Moore's law, the increasing number of on-chip integrated transistors has enabled modern computing platforms with not only higher processing power but also more affordable prices. As a result, these platforms, including portable devices, workstations and data centres, are becoming an integral part of society. However, with the demand for portability and the rising cost of power, energy efficiency has emerged as a major concern for modern computing platforms. As the complexity of on-chip systems increases, the Network-on-Chip (NoC) has proven to be an efficient communication architecture which can further improve system performance and scalability while reducing the design cost. Therefore, in this thesis, we study and propose energy optimization approaches based on the NoC architecture, with a special focus on the following aspects. As the architectural trend of future computing platforms, 3D systems have many benefits including higher integration density, smaller footprint, heterogeneous integration, etc. Moreover, 3D technology can significantly improve network communication and effectively avoid long wirings, and therefore provide higher system performance and energy efficiency. With the dynamic nature of on-chip communication in large-scale NoC-based systems, run-time system optimization is of crucial importance in order to achieve higher system reliability and, ultimately, energy efficiency. In this thesis, we propose an agent-based system design approach where agents are on-chip components which monitor and control system parameters such as supply voltage, operating frequency, etc. With this approach, we have analysed the implementation alternatives for dynamic voltage and frequency scaling and power gating techniques at different granularities, which reduce both dynamic and leakage energy consumption. Topologies, being one of the key factors for NoCs, are also explored for energy-saving purposes. A Honeycomb NoC architecture is proposed in this thesis with turn-model based deadlock-free routing algorithms. Our analysis and simulation-based evaluation show that Honeycomb NoCs outperform their Mesh-based counterparts in terms of network cost, system performance, and energy efficiency.
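
    As a toy illustration of the agent-based control described above, the sketch below implements a threshold-driven DVFS agent in Python that moves between a few voltage/frequency levels based on monitored utilisation; the levels and thresholds are assumed values, not results from the thesis.

        # Sketch: a per-region DVFS agent reacting to monitored utilisation.
        VF_LEVELS = [(0.8, 400), (0.9, 700), (1.0, 1000)]   # (volts, MHz), low to high

        class DvfsAgent:
            def __init__(self, up=0.8, down=0.3):
                self.level = len(VF_LEVELS) - 1   # start at the highest level
                self.up, self.down = up, down

            def step(self, utilisation):
                """Raise the V/F level under high load, lower it under light load."""
                if utilisation > self.up and self.level < len(VF_LEVELS) - 1:
                    self.level += 1
                elif utilisation < self.down and self.level > 0:
                    self.level -= 1
                return VF_LEVELS[self.level]

        if __name__ == "__main__":
            agent = DvfsAgent()
            for u in [0.9, 0.7, 0.2, 0.1, 0.95]:
                v, f = agent.step(u)
                print(f"util={u:.2f} -> {v:.1f} V, {f} MHz")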

    Architecting a One-to-many Traffic-Aware and Secure Millimeter-Wave Wireless Network-in-Package Interconnect for Multichip Systems

    With the aggressive scaling of device geometries, the yield of complex Multi-Core Single-Chip (MCSC) systems with many cores will decrease due to the higher probability of manufacturing defects, especially in dies with a large area. Disintegration of large Systems-on-Chip (SoCs) into smaller chips called chiplets has been shown to improve the yield and cost of complex systems. Therefore, platform-based computing modules such as embedded systems and micro-servers have already adopted Multi-Core Multi-Chip (MCMC) architectures over MCSC architectures. Due to the scaling of memory-intensive parallel applications in such systems, data is more likely to be shared among various cores residing in different chips, resulting in a significant increase in chip-to-chip traffic, especially one-to-many traffic. This one-to-many traffic originates mainly from maintaining cache coherence between many cores residing in multiple chips. One-to-many traffic is also exploited by many parallel programming models, system-level synchronization mechanisms, and control signals. However, state-of-the-art Network-on-Chip (NoC)-based wired interconnection architectures do not provide enough support, as they handle such one-to-many traffic as multiple unicast traffic using a multi-hop MCMC communication fabric. As a result, even a small portion of such one-to-many traffic can significantly reduce system performance, as traditional NoC-based interconnects cannot mask the high latency and energy consumption caused by chip-to-chip wired I/Os. Moreover, with the increase in memory-intensive applications and the scaling of MCMC systems, traditional NoC-based wired interconnects fail to provide the scalable interconnection solution required to support the increased cache-coherence and synchronization-generated one-to-many traffic in future MCMC-based High-Performance Computing (HPC) nodes. Therefore, these computation- and memory-intensive MCMC systems need an energy-efficient, low-latency, and scalable one-to-many (broadcast/multicast) traffic-aware interconnection infrastructure to ensure high performance. Research in recent years has shown that Wireless Network-in-Package (WiNiP) architectures with CMOS-compatible Millimeter-Wave (mm-wave) transceivers can provide a scalable, low-latency, and energy-efficient interconnect solution for on- and off-chip communication. In this dissertation, a one-to-many traffic-aware WiNiP interconnection architecture with a starvation-free hybrid Medium Access Control (MAC), an asymmetric topology, and a novel flow control is proposed. The different components of the proposed architecture are individually one-to-many traffic-aware and, as a system, they collaborate with each other to provide the required support for one-to-many traffic communication in an MCMC environment. It has been shown that such an interconnection architecture can reduce energy consumption and average packet latency by 46.96% and 47.08% respectively for MCMC systems. Despite providing performance enhancements, the wireless channel, being an unguided medium, is vulnerable to various security attacks such as jamming-induced Denial-of-Service (DoS), eavesdropping, and spoofing. Further, to minimize time-to-market and design costs, modern SoCs often use Third Party IPs (3PIPs) from untrusted organizations. An adversary, either at the foundry or at the 3PIP design house, can introduce a malicious circuitry to jeopardize an SoC. Such malicious circuitry is known as a Hardware Trojan (HT).
    An HT planted in the WiNiP through a vulnerable design or manufacturing process can compromise a Wireless Interface (WI) to enable illegitimate transmission through the infected WI, resulting in a potential DoS attack on other WIs in the MCMC system. Moreover, HTs can be used for various other malicious purposes, including battery exhaustion, functionality subversion, and information leakage. This information, when leaked to a malicious external attacker, can reveal important details about the application suites running on the system, thereby compromising the user profile. To address persistent jamming-based DoS attacks in the WiNiP, this dissertation proposes a secure WiNiP interconnection architecture for MCMC systems that reuses the one-to-many traffic-aware MAC and existing Design-for-Testability (DFT) hardware along with a Machine Learning (ML) approach. Furthermore, a novel Simulated Annealing (SA)-based routing obfuscation mechanism is also proposed to protect against an HT-assisted novel traffic analysis attack. Simulation results show that the ML classifiers can achieve an accuracy of 99.87% for DoS attack detection, while SA-based routing obfuscation reduces application detection accuracy to only 15% for the HT-assisted traffic analysis attack, and hence secures the WiNiP fabric from age-old and emerging attacks.
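
    To make the SA-based obfuscation idea concrete, the sketch below gives a generic simulated-annealing loop in Python that reassigns flows among candidate routes so as to flatten the per-link traffic profile, one plausible way to make traffic patterns harder for a Trojan-assisted observer to fingerprint; the cost function, move set, and WiNiP-specific details in the dissertation differ.

        # Sketch: generic simulated annealing over route assignments (illustrative cost).
        import math, random

        def cost(assignment, links):
            """Variance of per-link load; a flatter profile is assumed harder to fingerprint."""
            load = {l: 0 for l in links}
            for path in assignment.values():
                for l in path:
                    load[l] += 1
            mean = sum(load.values()) / len(links)
            return sum((v - mean) ** 2 for v in load.values())

        def anneal(flows, candidate_paths, links, t0=10.0, alpha=0.95, steps=500):
            assignment = {f: random.choice(candidate_paths[f]) for f in flows}
            best, best_cost, t = dict(assignment), cost(assignment, links), t0
            for _ in range(steps):
                f = random.choice(flows)
                old = assignment[f]
                assignment[f] = random.choice(candidate_paths[f])    # propose a new route
                delta = cost(assignment, links) - cost({**assignment, f: old}, links)
                if delta > 0 and random.random() >= math.exp(-delta / t):
                    assignment[f] = old                              # reject the worsening move
                if cost(assignment, links) < best_cost:
                    best, best_cost = dict(assignment), cost(assignment, links)
                t *= alpha                                           # cool down
            return best, best_cost

        if __name__ == "__main__":
            links = ["L0", "L1", "L2"]
            flows = ["f0", "f1", "f2", "f3"]
            candidate_paths = {f: [["L0"], ["L1"], ["L2"]] for f in flows}
            _, final_cost = anneal(flows, candidate_paths, links)
            print("final load-variance cost:", final_cost)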