1,074 research outputs found

    Gate-level timing analysis and waveform evaluation

    Get PDF
    Static timing analysis (STA) is an integral part of modern VLSI chip design. Table lookup based methods are widely used in current industry due to its fast runtime and mature algorithms. Conventional STA algorithms based on table-lookup methods are developed under many assumptions in timing analysis; however, most of those assumptions, such as that input signals and output signals can be accurately modeled as ramp waveforms, are no longer satisfactory to meet the increasing demand of accuracy for new technologies. In this dissertation, we discuss several crucial issues that conventional STA has not taken into consideration, and propose new methods to handle these issues and show that new methods produce accurate results. In logic circuits, gates may have multiple inputs and signals can arrive at these inputs at different times and with different waveforms. Different arrival times and waveforms of signals can cause very different responses. However, multiple-input transition effects are totally overlooked by current STA tools. Using a conventional single-input transition model when multiple-input transition happens can cause significant estimation errors in timing analysis. Previous works on this issue focus on developing a complicated gate model to simulate the behavior of logic gates. These methods have high computational cost and have to make significant changes to the prevailing STA tools, and are thus not feasible in practice. This dissertation proposes a simplified gate model, uses transistor connection structures to capture the behavior of multiple-input transitions and requires no change to the current STA tools. Another issue with table lookup based methods is that the load of each gate in technology libraries is modeled as a single lumped capacitor. But in the real circuit, the Abstract 2 gate connects to its subsequent gates via metal wires. As the feature size of integrated circuit scales down, the interconnection cannot be seen as a simple capacitor since the resistive shielding effect will largely affect the equivalent capacitance seen from the gate. As the interconnection has numerous structures, tabulating the timing data for various interconnection structures is not feasible. In this dissertation, by using the concept of equivalent admittance, we reduce an arbitrary interconnection structure into an equivalent π-model RC circuit. Many previous works have mapped the π-model to an effective capacitor, which makes the table lookup based methods useful again. However, a capacitor cannot be equivalent to a π-model circuit, and will thus result in significant inaccuracy in waveform evaluation. In order to obtain an accurate waveform at gate output, a piecewise waveform evaluation method is proposed in this dissertation. Each part of the piecewise waveform is evaluated according to the gate characteristic and load structures. Another contribution of this dissertation research is a proposed equivalent waveform search method. The signal waveforms can be very complicated in the real circuits because of noises, race hazards, etc. The conventional STA only uses one attribute (i.e., transition time) to describe the waveform shape which can cause significant estimation errors. Our approach is to develop heuristic search functions to find equivalent ramps to approximate input waveforms. Here the transition time of a final ramp can be completely different from that of the original waveform, but we can get higher accuracy on output arrival time and transition time. All of the methods mentioned in this dissertation require no changes to the prevailing STA tools, and have been verified across different process technologies

    Innovative Techniques for Testing and Diagnosing SoCs

    Get PDF
    We rely upon the continued functioning of many electronic devices for our everyday welfare, usually embedding integrated circuits that are becoming even cheaper and smaller with improved features. Nowadays, microelectronics can integrate a working computer with CPU, memories, and even GPUs on a single die, namely System-On-Chip (SoC). SoCs are also employed on automotive safety-critical applications, but need to be tested thoroughly to comply with reliability standards, in particular the ISO26262 functional safety for road vehicles. The goal of this PhD. thesis is to improve SoC reliability by proposing innovative techniques for testing and diagnosing its internal modules: CPUs, memories, peripherals, and GPUs. The proposed approaches in the sequence appearing in this thesis are described as follows: 1. Embedded Memory Diagnosis: Memories are dense and complex circuits which are susceptible to design and manufacturing errors. Hence, it is important to understand the fault occurrence in the memory array. In practice, the logical and physical array representation differs due to an optimized design which adds enhancements to the device, namely scrambling. This part proposes an accurate memory diagnosis by showing the efforts of a software tool able to analyze test results, unscramble the memory array, map failing syndromes to cell locations, elaborate cumulative analysis, and elaborate a final fault model hypothesis. Several SRAM memory failing syndromes were analyzed as case studies gathered on an industrial automotive 32-bit SoC developed by STMicroelectronics. The tool displayed defects virtually, and results were confirmed by real photos taken from a microscope. 2. Functional Test Pattern Generation: The key for a successful test is the pattern applied to the device. They can be structural or functional; the former usually benefits from embedded test modules targeting manufacturing errors and is only effective before shipping the component to the client. The latter, on the other hand, can be applied during mission minimally impacting on performance but is penalized due to high generation time. However, functional test patterns may benefit for having different goals in functional mission mode. Part III of this PhD thesis proposes three different functional test pattern generation methods for CPU cores embedded in SoCs, targeting different test purposes, described as follows: a. Functional Stress Patterns: Are suitable for optimizing functional stress during I Operational-life Tests and Burn-in Screening for an optimal device reliability characterization b. Functional Power Hungry Patterns: Are suitable for determining functional peak power for strictly limiting the power of structural patterns during manufacturing tests, thus reducing premature device over-kill while delivering high test coverage c. Software-Based Self-Test Patterns: Combines the potentiality of structural patterns with functional ones, allowing its execution periodically during mission. In addition, an external hardware communicating with a devised SBST was proposed. It helps increasing in 3% the fault coverage by testing critical Hardly Functionally Testable Faults not covered by conventional SBST patterns. An automatic functional test pattern generation exploiting an evolutionary algorithm maximizing metrics related to stress, power, and fault coverage was employed in the above-mentioned approaches to quickly generate the desired patterns. The approaches were evaluated on two industrial cases developed by STMicroelectronics; 8051-based and a 32-bit Power Architecture SoCs. Results show that generation time was reduced upto 75% in comparison to older methodologies while increasing significantly the desired metrics. 3. Fault Injection in GPGPU: Fault injection mechanisms in semiconductor devices are suitable for generating structural patterns, testing and activating mitigation techniques, and validating robust hardware and software applications. GPGPUs are known for fast parallel computation used in high performance computing and advanced driver assistance where reliability is the key point. Moreover, GPGPU manufacturers do not provide design description code due to content secrecy. Therefore, commercial fault injectors using the GPGPU model is unfeasible, making radiation tests the only resource available, but are costly. In the last part of this thesis, we propose a software implemented fault injector able to inject bit-flip in memory elements of a real GPGPU. It exploits a software debugger tool and combines the C-CUDA grammar to wisely determine fault spots and apply bit-flip operations in program variables. The goal is to validate robust parallel algorithms by studying fault propagation or activating redundancy mechanisms they possibly embed. The effectiveness of the tool was evaluated on two robust applications: redundant parallel matrix multiplication and floating point Fast Fourier Transform

    High-level power optimisation for Digital Signal Processing in Recon gurable Logic

    No full text
    This thesis is concerned with the optimisation of Digital Signal Processing (DSP) algorithm implementations on recon gurable hardware via the selection of appropriate word-lengths for the signals in these algorithms, in order to minimise system power consumption. Whilst existing word-length optimisation work has concentrated on the minimisation of the area of algorithm implementations, this work introduces the rst set of power consumption models that can be evaluated quickly enough to be used within the search of the enormous design space of multiple word-length optimisation problems. These models achieve their speed by estimating both the power consumed within the arithmetic components of an algorithm and the power in the routing wires that connect these components, using only a high-level description of the algorithm itself. Trading o a small reduction in power model accuracy for a large increase in speed is one of the major contributions of this thesis. In addition to the work on power consumption modelling, this thesis also develops a new technique for selecting the appropriate word-lengths for an algorithm implementation in order to minimise its cost in terms of power (or some other metric for which models are available). The method developed is able to provide tight lower and upper bounds on the optimal cost that can be obtained for a particular word-length optimisation problem and can, as a result, nd provably near-optimal solutions to word-length optimisation problems without resorting to an NP-hard search of the design space. Finally the costs of systems optimised via the proposed technique are compared to those obtainable by word-length optimisation for minimisation of other metrics (such as logic area) and the results compared, providing greater insight into the nature of wordlength optimisation problems and the extent of the improvements obtainable by them

    Piezoelectric Transformer and Hall-Effect Based Sensing and Disturbance Monitoring Methodology for High-Voltage Power Supply Lines

    Get PDF
    Advancements in relaying algorithms have led to an accurate and robust protection system widely used in power distribution. However, in low power sections of relaying systems, standard voltage and current measurement techniques are still used. These techniques have disadvantages like higher cost, size, electromagnetic interference, resistive losses and measurement errors and hence provide a number of opportunities for improvement and integration. We present a novel microsystem methodology to sense low-power voltage and current signals and detect disturbances in high-voltage power distribution lines. The system employs dual sensor architecture that consists of a piezoelectric transformer in combination with Hall-effect sensor, used to detect the disturbances whose harmonics are in the kHz frequency range. Our numerical analysis is based on three-dimensional finite element models of the piezoelectric transformer (PT) and the principle of Hall-effect based “Integrated Magnetic Concentrator (IMC)” sensor. This model is verified by using experimental data recorded in the resonant frequency and low frequency regions of operation of PT for voltage sensing. Actual measurements with the commercial IMC sensor too validate the modelling results. These results describe a characteristic low frequency behaviour of rectangular piezoelectric transformer, which enables it to withstand voltages as high as 150V. In the frequency range of 10Hz to 250Hz, the PT steps down 10-150V input with a linearity of ±1%. The recorded group delay data shows that propagation delay through PT reduces to few microseconds above 1kHz input signal frequency. Similarly, the non-intrusive current sensor detects current with a response time of 8μs and converts the current into corresponding output voltage. These properties, in addition to frequency spectrum of voltage and current input signals, have been used to develop a signal processing and fault detection system for two real-time cases of faults to produce a 6-bit decision logic capable of detecting various types of line disturbances in less than 3ms of delay

    Empirical timing analysis of CPUs and delay fault tolerant design using partial redundancy

    Get PDF
    The operating clock frequency is determined by the longest signal propagation delay, setup/hold time, and timing margin. These are becoming less predictable with the increasing design complexity and process miniaturization. The difficult challenge is then to ensure that a device operating at its clock frequency is error-free with quantifiable assurance. Effort at device-level engineering will not suffice for these circuits exhibiting wide process variation and heightened sensitivities to operating condition stress. Logic-level redress of this issue is a necessity and we propose a design-level remedy for this timing-uncertainty problem. The aim of the design and analysis approaches presented in this dissertation is to provide framework, SABRE, wherein an increased operating clock frequency can be achieved. The approach is a combination of analytical modeling, experimental analy- sis, hardware /time-redundancy design, exception handling and recovery techniques. Our proposed design replicates only a necessary part of the original circuit to avoid high hardware overhead as in triple-modular-redundancy (TMR). The timing-critical combinational circuit is path-wise partitioned into two sections. The combinational circuits associated with long paths are laid out without any intrusion except for the fan-out connections from the first section of the circuit to a replicated second section of the combinational circuit. Thus only the second section of the circuit is replicated. The signals fanning out from the first section are latches, and thus are far shorter than the paths spanning the entire combinational circuit. The replicated circuit is timed at a subsequent clock cycle to ascertain relaxed timing paths. This insures that the likelihood of mistiming due to stress or process variation is eliminated. During the subsequent clock cycle, the outcome of the two logically identical, yet time-interleaved, circuit outputs are compared to detect faults. When a fault is detected, the retry sig- nal is triggered and the dynamic frequency-step-down takes place before a pipe flush, and retry is issued. The significant timing overhead associated with the retry is offset by the rarity of the timing violation events. Simulation results on ISCAS Benchmark circuits show that 10% of clock frequency gain is possible with 10 to 20 % of hardware overhead of replicated timing-critical circuit

    Degradation in FPGAs: Monitoring, Modeling and Mitigation

    Get PDF
    This dissertation targets the transistor aging degradation as well as the associated thermal challenges in FPGAs (since there is an exponential relation between aging and chip temperature). The main objectives are to perform experimentation, analysis and device-level model abstraction for modeling the degradation in FPGAs, then to monitor the FPGA to keep track of aging rates and ultimately to propose an aging-aware FPGA design flow to mitigate the aging

    Power supply noise in delay testing

    Get PDF
    As technology scales into the Deep Sub-Micron (DSM) regime, circuit designs have become more and more sensitive to power supply noise. Excessive noise can significantly affect the timing performance of DSM designs and cause non-trivial additional delay. In delay test generation, test compaction and test fill techniques can produce excessive power supply noise. This will eventually result in delay test overkill. To reduce this overkill, we propose a low-cost pattern-dependent approach to analyze noise-induced delay variation for each delay test pattern applied to the design. Two noise models have been proposed to address array bond and wire bond power supply networks, and they are experimentally validated and compared. Delay model is then applied to calculate path delay under noise. This analysis approach can be integrated into static test compaction or test fill tools to control supply noise level of delay tests. We also propose an algorithm to predict transition count of a circuit, which can be applied to control switching activity during dynamic compaction. Experiments have been performed on ISCAS89 benchmark circuits. Results show that compacted delay test patterns generated by our compaction tool can meet a moderate noise or delay constraint with only a small increase in compacted test set size. Take the benchmark circuit s38417 for example: a 10% delay increase constraint only results in 1.6% increase in compacted test set size in our experiments. In addition, different test fill techniques have a significant impact on path delay. In our work, a test fill tool with supply noise analysis has been developed to compare several test fill techniques, and results show that the test fill strategy significant affect switching activity, power supply noise and delay. For instance, patterns with minimum transition fill produce less noise-induced delay than random fill. Silicon results also show that test patterns filled in different ways can cause as much as 14% delay variation on target paths. In conclusion, we must take noise into consideration when delay test patterns are generated

    Toward Reliable, Secure, and Energy-Efficient Multi-Core System Design

    Get PDF
    Computer hardware researchers have perennially focussed on improving the performance of computers while stipulating the energy consumption under a strict budget. While several innovations over the years have led to high performance and energy efficient computers, more challenges have also emerged as a fallout. For example, smaller transistor devices in modern multi-core systems are afflicted with several reliability and security concerns, which were inconceivable even a decade ago. Tackling these bottlenecks happens to negatively impact the power and performance of the computers. This dissertation explores novel techniques to gracefully solve some of the pressing challenges of the modern computer design. Specifically, the proposed techniques improve the reliability of on-chip communication fabric under a high power supply noise, increase the energy-efficiency of low-power graphics processing units, and demonstrate an unprecedented security loophole of the low-power computing paradigm through rigorous hardware-based experiments

    High-speed Design Of High-resolution Dacs

    Get PDF
    Tez (Doktora) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2009Thesis (PhD) -- İstanbul Technical University, Institute of Science and Technology, 2009Bu çalışmada, yüksek çözünürlüklü akım yönlendirmeli sayısal-analog dönüştürücülerin (SAD) hızlı tasarımını sağlayan yöntemler incelenmekte ve yeni yaklaşımlar önerilmektedir. Veri dönüştürücüler analog ve sayısal dünyalar arasında bir köprü oluşturdukları için hızlı ve verimli bir şekilde gerçekleştirilmeleri yüksek derecede arzu edilmektedir. Yüksek hızlı (birkaç 100MHz) ve yüksek çözünürlüklü (10 bitten fazla) SAD için artan rağbet, akım yönlendirmeli SADların kullanımını zorunlu kılmaktadır. Yüksek performanslı akım yönlendirmeli SADların tasarımında ve gerçekleştirmesinde kesimleme (segmentation) yöntemi kullanılmaktadır. Bu yöntem, yüksek hız ve yüksek çözünürlük gerektiren uygulamaların çoğunda avantajlı olmasına rağmen uzun süreli tasarım zamanı, karmaşıklık ve yüksek maliyet yüzünden değer kaybetmektedir. Böylece, bazı uygulamalar için zaman ve maliyet açısından bu yöntemin kullanılması hızlı ve verimli olmayabilir. Bu problemlerin üstesinden gelmek için yüksek çözünürlüklü SADların yüksek hızlı tasarımını sağlayan hızlı ve verimli yöntemler dikkate alınmaktadır. Uygun bir tasarım yöntemi ve yeni bir yapı önerilmektedir. Akım yönlendirmeli SADlar gibi karmaşık karma yapılı sistemlerin tasarımı için davranışsal modelin oluşturulması zorunlu olmaktadır. Bu amaçla gerçekleştirilen modellerin çoğu sistemin davranışı hakkında istenilen eksiksiz manzarayı vermemektedir. Bu yüzden, transistor seviyesindeki tasarıma geçmeden önce, tasarımı hızlandırabilen ve sistemin davranışını doğru bir şekilde yansıtabilen modeller geliştirilmektedir. SIMULINK® kullanılarak bir davranışsal model kurulmakta ve modelin performansı benzetimlerle sınanmaktadır. Sonuç olarak, uygulanan yöntemin verimliliğini ve davranışsal modelin doğruluğunu sınamak için 0.35µm CMOS proses teknolojisi için tasarlanan bir 12 bitlik melez akım yönlendirmeli SAD kullanılmaktadır. Yapı bloklarında yapılan iyileştirmeler ve kullanılan farklı yöntemler, gerçekleştirilen SAD’ın serimindeki ilgili kısımlarda yer almaktadırlar. CADENCE Geleneksel Tümleşik Devre Tasarım Araçları kullanılarak serim sonrası benzetimleri yapılmakta ve SAD’ın performans karakteristikleri incelenmektedir.In this thesis, different problems related to the design speed-up of high-resolution current-steering digital-to-analog converters (DAC) are addressed and novel solutions are proposed. Since data converters form the bridge between the analog and digital world their efficient implementation is highly desirable. The increase in demand for high-speed (several 100MHz) and high-resolution (higher than 10-bit) DAC, forces the use of current-steering DACs. Segmentation method is used for the design and the implementation of high performance current-steering DACs. Although this methodology is advantageous in most of the applications requiring high-speed and high-resolution, it suffers from the prolonged design time, complexity and high cost. Thus, the use of this methodology for some applications is not efficient concerning the time and the cost. To overcome these problems efficient methodologies for the high-speed design of high-resolution DACs are considered. A proper design methodology and a novel architecture are introduced. Behavioral modeling is necessary for the design of complex mixed-mode systems like current-steering DACs. Most of the models constructed can not give a complete view of the system’s behavior. For this reason, models that speed up the design and reflect accurately the behavior of the system prior to transistor level implementation are developed. A SIMULINK® based behavioral model is developed and verified through simulations. To conclude, the efficiency of the applied methodology and the accuracy of the behavioral model are validated through the implementation of a 12-bit hybrid current-steering DAC in a 0.35µm CMOS process technology. The improvements in the building blocks and the different approaches used are reflected in the respective parts of the layout of the implemented DAC. Post-layout simulations are obtained using CADENCE Custom IC Design Tools and the performance metrics of the DAC are investigated.DoktoraPh

    IMPROVING THE SENSITIVITY OF ADVANCED LIGO THROUGH DETECTOR CHARACTERIZATION

    Get PDF
    This dissertation focuses on the impact of detector characterization work on searches for gravitational waves from compact binary coalescences (CBCs) in Ad- vanced LIGO’s second observing run (O2). This observing run started on November 30, 2016, and lasted until August 25, 2017, and resulted in the identification of 8 unambiguous gravitational-wave signals, including the first observation of a binary neutron star merger. The role of detector characterization is to leverage knowledge of both the interferometers and the data in order to improve aLIGO’s ability to observe gravitational-waves. I focus on the construction of the O2 noise subtracted data set that was searched as a part of the LIGO-Virgo Collaboration’s first gravitational-wave catalog, GWTC- 1. This data set was processed with a noise subtraction pipeline to remove the excess noise identified at each interferometer that resulted in a 30% improvement in the sensitive volume that aLIGO was able to probe. Equally important to the finalized data set is the inclusion of data quality vetoes that indicate periods of instrumental artifacts. I also examine how these instrumental artifacts can mimic gravitational-wave waveforms and reduce the sensitivity of searches for CBC signals, with particular emphasis on the PyCBC pipeline. Understanding this connection is one of the key ways that gravitational waves are differentiated from instrumental artifacts. Finally, I detail the final results presented in the GWTC-1 catalog from a detector charac- terization perspective, and discuss how the efforts highlighted in this dissertation allowed for the detection of new gravitational-wave events and improved analyses of previously identified events
    corecore