350 research outputs found

    Challenge of a multiple-valued technology in recent deep-submicron VLSI

    Get PDF
    科研費報告書収録論文(課題番号:13558026・基盤研究(B)(2)・13~16/研究代表者:羽生, 貴弘/転送ボトルネックフリー多値ロジックインメモリVLSIの開発と応用

    Empirical timing analysis of CPUs and delay fault tolerant design using partial redundancy

    Get PDF
    The operating clock frequency is determined by the longest signal propagation delay, setup/hold time, and timing margin. These are becoming less predictable with the increasing design complexity and process miniaturization. The difficult challenge is then to ensure that a device operating at its clock frequency is error-free with quantifiable assurance. Effort at device-level engineering will not suffice for these circuits exhibiting wide process variation and heightened sensitivities to operating condition stress. Logic-level redress of this issue is a necessity and we propose a design-level remedy for this timing-uncertainty problem. The aim of the design and analysis approaches presented in this dissertation is to provide framework, SABRE, wherein an increased operating clock frequency can be achieved. The approach is a combination of analytical modeling, experimental analy- sis, hardware /time-redundancy design, exception handling and recovery techniques. Our proposed design replicates only a necessary part of the original circuit to avoid high hardware overhead as in triple-modular-redundancy (TMR). The timing-critical combinational circuit is path-wise partitioned into two sections. The combinational circuits associated with long paths are laid out without any intrusion except for the fan-out connections from the first section of the circuit to a replicated second section of the combinational circuit. Thus only the second section of the circuit is replicated. The signals fanning out from the first section are latches, and thus are far shorter than the paths spanning the entire combinational circuit. The replicated circuit is timed at a subsequent clock cycle to ascertain relaxed timing paths. This insures that the likelihood of mistiming due to stress or process variation is eliminated. During the subsequent clock cycle, the outcome of the two logically identical, yet time-interleaved, circuit outputs are compared to detect faults. When a fault is detected, the retry sig- nal is triggered and the dynamic frequency-step-down takes place before a pipe flush, and retry is issued. The significant timing overhead associated with the retry is offset by the rarity of the timing violation events. Simulation results on ISCAS Benchmark circuits show that 10% of clock frequency gain is possible with 10 to 20 % of hardware overhead of replicated timing-critical circuit

    Modelling and Test Generation for Crosstalk Faults in DSM Chips

    Get PDF
    In the era of deep submicron technology (DSM), many System-on-Chip (SoC) applications require the components to be operating at high clock speeds. With the shrinking feature size and ever increasing clock frequencies, the DSM technology has led to a well-known problem of Signal Integrity (SI) more especially in the connecting layout design. The increasing aspect ratios of metal wires and also the ratio of coupling capacitance over substrate capacitance result in electrical coupling of interconnects which leads to crosstalk problems. In this thesis, first the work carried out to model the crosstalk behaviour between aggressor and victim by considering the distributed RLGC parameters of interconnect and the coupling capacitance and mutual conductance between the two nets is presented. The proposed model also considers the RC linear models of the CMOS drivers and receivers. The behaviour of crosstalk in case of under etching problem has been studied and modelled by distributing and approximating the defect behaviour throughout the nets. Next, the proposed model has also been extended to model the behaviour of crosstalk in case of one victim is influenced by several aggressors by considering all aggressors have similar effect (worst-case) on victim. In all the above cases simulation experiments were also carried out and compared with well-known circuit simulation tool PSPICE. It has been proved that the generated crosstalk model is faster and the results generated are within 10% of error margin compared to latter simulation tool. Because of the accuracy and speed of the proposed model, the model is very useful for both SoC designers and test engineers to analyse the crosstalk behaviour. Each manufactured device needs to be tested thoroughly to ensure the functionality before its delivery. The test pattern generation for crosstalk faults is also necessary to test the corresponding crosstalk faults. In this thesis, the well-known PODEM algorithm for stuck-at faults is extended to generate the test patterns for crosstalk faults between single aggressor and single victim. To apply modified PODEM for crosstalk faults, the transition behaviour has been divided into two logic parts as before transition and after transition. After finding individually required test patterns for before transition and after transition, the generated logic vectors are appended to create transition test patterns for crosstalk faults. The developed algorithm is also applied for a few ISCAS 85 benchmark circuits and the fault coverage is found excellent in most circuits. With the incorporation of proposed algorithm into the ATPG tools, the efficiency of testing will be improved by generating the test patterns for crosstalk faults besides for the conventional stuck-at faults. In generating test patterns for crosstalk faults on single victim due to multiple aggressors, the modified PODEM algorithm is found to be more time consuming. The search capability of Genetic Algorithms in finding the required combination of several input factors for any optimized problem fascinated to apply GA for generating test patterns as generating the test pattern is also similar to finding the required vector out of several input transitions. Initially the GA is applied for generating test patterns for stuck-at faults and compared the results with PODEM algorithm. As the fault coverage is almost similar to the deterministic algorithm PODEM, the GA developed for stuck-at faults is extended to find test patterns for crosstalk faults between single aggressor and single victim. The elitist GA is also applied for a few ISCAS 85 benchmark circuits. Later the algorithm is extended to generate test patterns for worst-case crosstalk faults. It has been proved that elitist GA developed in this thesis is also very useful in generating test patterns for crosstalk faults especially for multiple aggressor and single victim crosstalk faults

    Can deep-sub-micron device noise be used as the basis for probabilistic neural computation?

    Get PDF
    This thesis explores the potential of probabilistic neural architectures for computation with future nanoscale Metal-Oxide-Semiconductor Field Effect Transistors (MOSFETs). In particular, the performance of a Continuous Restricted Boltzmann Machine {CRBM) implemented with generated noise of Random Telegraph Signal (RTS) and 1/ f form has been studied with reference to the 'typical' Gaussian implementation. In this study, a time domain RTS based noise analysis capability has been developed based upon future nanoscale MOSFETs, to represent the effect of nanoscale MOSFET noise on circuit implementation in particular the synaptic analogue multiplier which is subsequently used to implement stochastic behaviour of the CRBM. The result of this thesis indicates little degradation in performance from that of the typical Gaussian CRBM. Through simulation experiments, the CRBM with nanoscale MOSFET noise shows the ability to reconstruct training data, although it takes longer to converge to equilibrium. The results in this thesis do not prove that nanoscale MOSFET noise can be exploited in all contexts and with all data, for probabilistic computation. However, the result indicates, for the first time, that nanoscale MOSFET noise has the potential to be used for probabilistic neural computation hardware implementation. This thesis thus introduces a methodology for a form of technology-downstreaming and highlights the potential of probabilistic architecture for computation with future nanoscale MOSFETs

    AI/ML Algorithms and Applications in VLSI Design and Technology

    Full text link
    An evident challenge ahead for the integrated circuit (IC) industry in the nanometer regime is the investigation and development of methods that can reduce the design complexity ensuing from growing process variations and curtail the turnaround time of chip manufacturing. Conventional methodologies employed for such tasks are largely manual; thus, time-consuming and resource-intensive. In contrast, the unique learning strategies of artificial intelligence (AI) provide numerous exciting automated approaches for handling complex and data-intensive tasks in very-large-scale integration (VLSI) design and testing. Employing AI and machine learning (ML) algorithms in VLSI design and manufacturing reduces the time and effort for understanding and processing the data within and across different abstraction levels via automated learning algorithms. It, in turn, improves the IC yield and reduces the manufacturing turnaround time. This paper thoroughly reviews the AI/ML automated approaches introduced in the past towards VLSI design and manufacturing. Moreover, we discuss the scope of AI/ML applications in the future at various abstraction levels to revolutionize the field of VLSI design, aiming for high-speed, highly intelligent, and efficient implementations

    Design and modelling of variability tolerant on-chip communication structures for future high performance system on chip designs

    Get PDF
    The incessant technology scaling has enabled the integration of functionally complex System-on-Chip (SoC) designs with a large number of heterogeneous systems on a single chip. The processing elements on these chips are integrated through on-chip communication structures which provide the infrastructure necessary for the exchange of data and control signals, while meeting the strenuous physical and design constraints. The use of vast amounts of on chip communications will be central to future designs where variability is an inherent characteristic. For this reason, in this thesis we investigate the performance and variability tolerance of typical on-chip communication structures. Understanding of the relationship between variability and communication is paramount for the designers; i.e. to devise new methods and techniques for designing performance and power efficient communication circuits in the forefront of challenges presented by deep sub-micron (DSM) technologies. The initial part of this work investigates the impact of device variability due to Random Dopant Fluctuations (RDF) on the timing characteristics of basic communication elements. The characterization data so obtained can be used to estimate the performance and failure probability of simple links through the methodology proposed in this work. For the Statistical Static Timing Analysis (SSTA) of larger circuits, a method for accurate estimation of the probability density functions of different circuit parameters is proposed. Moreover, its significance on pipelined circuits is highlighted. Power and area are one of the most important design metrics for any integrated circuit (IC) design. This thesis emphasises the consideration of communication reliability while optimizing for power and area. A methodology has been proposed for the simultaneous optimization of performance, area, power and delay variability for a repeater inserted interconnect. Similarly for multi-bit parallel links, bandwidth driven optimizations have also been performed. Power and area efficient semi-serial links, less vulnerable to delay variations than the corresponding fully parallel links are introduced. Furthermore, due to technology scaling, the coupling noise between the link lines has become an important issue. With ever decreasing supply voltages, and the corresponding reduction in noise margins, severe challenges are introduced for performing timing verification in the presence of variability. For this reason an accurate model for crosstalk noise in an interconnection as a function of time and skew is introduced in this work. This model can be used for the identification of skew condition that gives maximum delay noise, and also for efficient design verification

    Empirical timing analysis of CPUs and delay fault tolerant design using partial redundancy

    Get PDF
    The operating clock frequency is determined by the longest signal propagation delay, setup/hold time, and timing margin. These are becoming less predictable with the increasing design complexity and process miniaturization. The difficult challenge is then to ensure that a device operating at its clock frequency is error-free with quantifiable assurance. Effort at device-level engineering will not suffice for these circuits exhibiting wide process variation and heightened sensitivities to operating condition stress. Logic-level redress of this issue is a necessity and we propose a design-level remedy for this timing-uncertainty problem. The aim of the design and analysis approaches presented in this dissertation is to provide framework, SABRE, wherein an increased operating clock frequency can be achieved. The approach is a combination of analytical modeling, experimental analy- sis, hardware /time-redundancy design, exception handling and recovery techniques. Our proposed design replicates only a necessary part of the original circuit to avoid high hardware overhead as in triple-modular-redundancy (TMR). The timing-critical combinational circuit is path-wise partitioned into two sections. The combinational circuits associated with long paths are laid out without any intrusion except for the fan-out connections from the first section of the circuit to a replicated second section of the combinational circuit. Thus only the second section of the circuit is replicated. The signals fanning out from the first section are latches, and thus are far shorter than the paths spanning the entire combinational circuit. The replicated circuit is timed at a subsequent clock cycle to ascertain relaxed timing paths. This insures that the likelihood of mistiming due to stress or process variation is eliminated. During the subsequent clock cycle, the outcome of the two logically identical, yet time-interleaved, circuit outputs are compared to detect faults. When a fault is detected, the retry sig- nal is triggered and the dynamic frequency-step-down takes place before a pipe flush, and retry is issued. The significant timing overhead associated with the retry is offset by the rarity of the timing violation events. Simulation results on ISCAS Benchmark circuits show that 10% of clock frequency gain is possible with 10 to 20 % of hardware overhead of replicated timing-critical circuit
    corecore