11 research outputs found

    Design for pre-bond testability in 3D integrated circuits

    Get PDF
    In this dissertation we propose several DFT techniques specific to 3D stacked IC systems. The goal has explicitly been to create techniques that integrate easily with existing IC test systems. Specifically, this means utilizing scan- and wrapper-based techniques, two foundations of the digital IC test industry. First, we describe a general test architecture for 3D ICs. In this architecture, each tier of a 3D design is wrapped in test control logic that both manages tier test pre-bond and integrates the tier into the large test architecture post-bond. We describe a new kind of boundary scan to provide the necessary test control and observation of the partial circuits, and we propose a new design methodology for test hardcore that ensures both pre-bond functionality and post-bond optimality. We present the application of these techniques to the 3D-MAPS test vehicle, which has proven their effectiveness. Second, we extend these DFT techniques to circuit-partitioned designs. We find that boundary scan design is generally sufficient, but that some 3D designs require special DFT treatment. Most importantly, we demonstrate that the functional partitioning inherent in 3D design can potentially decrease the total test cost of verifying a circuit. Third, we present a new CAD algorithm for designing 3D test wrappers. This algorithm co-designs the pre-bond and post-bond wrappers to simultaneously minimize test time and routing cost. On average, our algorithm utilizes over 90% of the wires in both the pre-bond and post-bond wrappers. Finally, we look at the 3D vias themselves to develop a low-cost, high-volume pre-bond test methodology appropriate for production-level test. We describe the shorting probes methodology, wherein large test probes are used to contact multiple small 3D vias. This technique is an all-digital test method that integrates seamlessly into existing test flows. Our experimental results demonstrate two key facts: neither the large capacitance of the probe tips nor the process variation in the 3D vias and the probe tips significantly hinders the testability of the circuits. Taken together, this body of work defines a complete test methodology for testing 3D ICs pre-bond, eliminating one of the key hurdles to the commercialization of 3D technology.PhDCommittee Chair: Lee, Hsien-Hsin; Committee Member: Bakir, Muhannad; Committee Member: Lim, Sung Kyu; Committee Member: Vuduc, Richard; Committee Member: Yalamanchili, Sudhaka

    Harnessing Checker Hierarchy for Reliable Microprocessors

    Get PDF
    Traditional fault-tolerant multi-threading architectures provide good fault tolerance by re-executing all the computations. However, such a full re-execution significantly increases the demand on the processor resources, resulting in severe performance degradation. To address this problem, this dissertation presents Active Verification Management (AVM) approaches that utilize a checker hierarchy to increase its performance with a minimal effect on the overall reliability. Based on a simplified queueing model, AVM employs a filter checker which prioritizes the verification candidates to selectively do verification. This dissertation proposes three filter checkers - based on (1) result usage, (2) result bitwidth, and (3) result anomaly - that exploit correctness-criticality metrics and anomaly speculation. Binary Correctness Criticality (BCC) and Likelihood of Correctness Criticality (LoCC) are metrics that quantify whether an instruction is important for reliability or how likely an instruction is correctness-critical, respectively. Based on the BCC, a result-usage-based filter checker mitigates the verification workload by bypassing instructions that are unnecessary for correct execution. Computing the LoCC is accomplished by exploiting information redundancy of compressing computationally useful data bits. Numerical significance hints let the result-bitwidth-based filter checker guide a verification priority effectively before the re-execution process starts. A result-anomaly-based filter checker exploits a value similarity property, which is defined by a frequent occurrence of partially identical values. Based on the biased distribution of similarity distance measure, this dissertation further investigates another application to exploit similar values for soft error tolerance with anomaly speculation. Extensive measurements show that the majority of instructions produce values that are different from the previous result value only in a few bits. Experimental results show that the proposed schemes accelerate the processor to be 180% faster than traditional fully-fault-tolerant processor, with a minimal impact on the overall soft error rate. With no AVM, congestion at the checker badly affects performance, to the tune of 57%, when compared to that of a non-fault-tolerant processor. These results explain that the proposed AVM has the potential to solve the verification congestion problem when perfect fault coverage is not needed

    Designing Efficient Network Interfaces For System Area Networks

    Full text link
    The network is the key component of a Cluster of Workstations/PCs. Its performance, measured in terms of bandwidth and latency, has a great impact on the overall system performance. It quickly became clear that traditional WAN/LAN technology is not too well suited for interconnecting powerful nodes into a cluster. Their poor performance too often slows down communication-intensive applications. This observation led to the birth of a new class of networks called System Area Networks (SAN). The ATOLL network introduces a new optimized architecture for SANs. On a single chip, not one but four network interfaces (NI) have been implemented, together with an on-chip 4x4 full-duplex switch and four link interfaces. This unique "Network on a Chip" architecture is best suited for interconnecting SMP nodes, where multiple CPUs are given an exclusive NI and do not have to share a single interface. It also removes the need for any additional switching hardware, since the four byte-wide full-duplex links can be connected by cables with neighbor nodes in an arbitrary network topology

    Communication synthesis of networks-on-chip (NoC)

    Get PDF
    The emergence of networks-on-chip (NoC) as the communication infrastructure solution for complex multi-core SoCs presents communication synthesis challenges. This dissertation addresses the design and run-time management aspects of communication synthesis. Design reuse and the infeasibility of Intellectual Property (IP) core interface redesign, requires the development of a Core-Network Interface (CNI) which allows them to communicate over the on-chip network. The absence of intelligence amongst the NoC components, entails the introduction of a CNI capable of not only providing basic packetization and depacketization, but also other essential services such as reliability, power management, reconguration and test support. A generic CNI architecture providing these services for NoCs is proposed and evaluated in this dissertation. Rising on-chip communication power costs and reliability concerns due to these, motivate the development of a peak power management technique that is both scalable to dierent NoCs and adaptable to varying trac congurations. A scalable and adaptable peak power management technique - SAPP - is proposed and demonstrated. Latency and throughput improvements observed with SAPP demonstrate its superiority over existing techniques. Increasing design complexity make prediction of design lifetimes dicult. Post SoC deployment, an on-line health monitoring scheme, is essential to maintain con- dence in the correct operation of on-chip cores. The rising design complexity and IP core test costs makes non-concurrent testing of the IP cores infeasible. An on-line scheme capable of managing IP core test in the presence of executing applications is essential. Such a scheme ensures application performance and system power budgets are eciently managed. This dissertation proposes Concurrent On-Line Test (COLT) for NoC-based systems and demonstrates how a robust implementation of COLT using a Test Infrastructure-IP (TI-IP) can be used to maintain condence in the correct operation of the SoC

    Low-cost and efficient fault detection and diagnosis schemes for modern cores

    Get PDF
    Continuous improvements in transistor scaling together with microarchitectural advances have made possible the widespread adoption of high-performance processors across all market segments. However, the growing reliability threats induced by technology scaling and by the complexity of designs are challenging the production of cheap yet robust systems. Soft error trends are haunting, especially for combinational logic, and parity and ECC codes are therefore becoming insufficient as combinational logic turns into the dominant source of soft errors. Furthermore, experts are warning about the need to also address intermittent and permanent faults during processor runtime, as increasing temperatures and device variations will accelerate inherent aging phenomena. These challenges specially threaten the commodity segments, which impose requirements that existing fault tolerance mechanisms cannot offer. Current techniques based on redundant execution were devised in a time when high penalties were assumed for the sake of high reliability levels. Novel light-weight techniques are therefore needed to enable fault protection in the mass market segments. The complexity of designs is making post-silicon validation extremely expensive. Validation costs exceed design costs, and the number of discovered bugs is growing, both during validation and once products hit the market. Fault localization and diagnosis are the biggest bottlenecks, magnified by huge detection latencies, limited internal observability, and costly server farms to generate test outputs. This thesis explores two directions to address some of the critical challenges introduced by unreliable technologies and by the limitations of current validation approaches. We first explore mechanisms for comprehensively detecting multiple sources of failures in modern processors during their lifetime (including transient, intermittent, permanent and also design bugs). Our solutions embrace a paradigm where fault tolerance is built based on exploiting high-level microarchitectural invariants that are reusable across designs, rather than relying on re-execution or ad-hoc block-level protection. To do so, we decompose the basic functionalities of processors into high-level tasks and propose three novel runtime verification solutions that combined enable global error detection: a computation/register dataflow checker, a memory dataflow checker, and a control flow checker. The techniques use the concept of end-to-end signatures and allow designers to adjust the fault coverage to their needs, by trading-off area, power and performance. Our fault injection studies reveal that our methods provide high coverage levels while causing significantly lower performance, power and area costs than existing techniques. Then, this thesis extends the applicability of the proposed error detection schemes to the validation phases. We present a fault localization and diagnosis solution for the memory dataflow by combining our error detection mechanism, a new low-cost logging mechanism and a diagnosis program. Selected internal activity is continuously traced and kept in a memory-resident log whose capacity can be expanded to suite validation needs. The solution can catch undiscovered bugs, reducing the dependence on simulation farms that compute golden outputs. Upon error detection, the diagnosis algorithm analyzes the log to automatically locate the bug, and also to determine its root cause. Our evaluations show that very high localization coverage and diagnosis accuracy can be obtained at very low performance and area costs. The net result is a simplification of current debugging practices, which are extremely manual, time consuming and cumbersome. Altogether, the integrated solutions proposed in this thesis capacitate the industry to deliver more reliable and correct processors as technology evolves into more complex designs and more vulnerable transistors.El continuo escalado de los transistores junto con los avances microarquitect贸nicos han posibilitado la presencia de potentes procesadores en todos los segmentos de mercado. Sin embargo, varios problemas de fiabilidad est谩n desafiando la producci贸n de sistemas robustos. Las predicciones de "soft errors" son inquietantes, especialmente para la l贸gica combinacional: soluciones como ECC o paridad se est谩n volviendo insuficientes a medida que dicha l贸gica se convierte en la fuente predominante de soft errors. Adem谩s, los expertos est谩n alertando acerca de la necesidad de detectar otras fuentes de fallos (causantes de errores permanentes e intermitentes) durante el tiempo de vida de los procesadores. Los segmentos "commodity" son los m谩s vulnerables, ya que imponen unos requisitos que las t茅cnicas actuales de fiabilidad no ofrecen. Estas soluciones (generalmente basadas en re-ejecuci贸n) fueron ideadas en un tiempo en el que con tal de alcanzar altos nivel de fiabilidad se asum铆an grandes costes. Son por tanto necesarias nuevas t茅cnicas que permitan la protecci贸n contra fallos en los segmentos m谩s populares. La complejidad de los dise帽os est谩 encareciendo la validaci贸n "post-silicon". Su coste excede el de dise帽o, y el n煤mero de errores descubiertos est谩 aumentando durante la validaci贸n y ya en manos de los clientes. La localizaci贸n y el diagn贸stico de errores son los mayores problemas, empeorados por las altas latencias en la manifestaci贸n de errores, por la poca observabilidad interna y por el coste de generar las se帽ales esperadas. Esta tesis explora dos direcciones para tratar algunos de los retos causados por la creciente vulnerabilidad hardware y por las limitaciones de los enfoques de validaci贸n. Primero exploramos mecanismos para detectar m煤ltiples fuentes de fallos durante el tiempo de vida de los procesadores (errores transitorios, intermitentes, permanentes y de dise帽o). Nuestras soluciones son de un paradigma donde la fiabilidad se construye explotando invariantes microarquitect贸nicos gen茅ricos, en lugar de basarse en re-ejecuci贸n o en protecci贸n ad-hoc. Para ello descomponemos las funcionalidades b谩sicas de un procesador y proponemos tres soluciones de `runtime verification' que combinadas permiten una detecci贸n de errores a nivel global. Estas tres soluciones son: un verificador de flujo de datos de registro y de computaci贸n, un verificador de flujo de datos de memoria y un verificador de flujo de control. Nuestras t茅cnicas usan el concepto de firmas y permiten a los dise帽adores ajustar los niveles de protecci贸n a sus necesidades, mediante compensaciones en 谩rea, consumo energ茅tico y rendimiento. Nuestros estudios de inyecci贸n de errores revelan que los m茅todos propuestos obtienen altos niveles de protecci贸n, a la vez que causan menos costes que las soluciones existentes. A continuaci贸n, esta tesis explora la aplicabilidad de estos esquemas a las fases de validaci贸n. Proponemos una soluci贸n de localizaci贸n y diagn贸stico de errores para el flujo de datos de memoria que combina nuestro mecanismo de detecci贸n de errores, junto con un mecanismo de logging de bajo coste y un programa de diagn贸stico. Cierta actividad interna es continuamente registrada en una zona de memoria cuya capacidad puede ser expandida para satisfacer las necesidades de validaci贸n. La soluci贸n permite descubrir bugs, reduciendo la necesidad de calcular los resultados esperados. Al detectar un error, el algoritmo de diagn贸stico analiza el registro para autom谩ticamente localizar el bug y determinar su causa. Nuestros estudios muestran un alto grado de localizaci贸n y de precisi贸n de diagn贸stico a un coste muy bajo de rendimiento y 谩rea. El resultado es una simplificaci贸n de las pr谩cticas actuales de depuraci贸n, que son enormemente manuales, inc贸modas y largas. En conjunto, las soluciones de esta tesis capacitan a la industria a producir procesadores m谩s fiables, a medida que la tecnolog铆a evoluciona hacia dise帽os m谩s complejos y m谩s vulnerables

    Scalably Verifiable Cache Coherence

    Get PDF
    <p>The correctness of a cache coherence protocol is crucial to the system since a subtle bug in the protocol may lead to disastrous consequences. However, the verification of a cache coherence protocol is never an easy task due to the complexity of the protocol. Moreover, as more and more cores are compressed into a single chip, there is an urge for the cache coherence protocol to have higher performance, lower power consumption, and less storage overhead. People perform various optimizations to meet these goals, which unfortunately, further exacerbate the verification problem. The current situation is that there are no efficient and universal methods for verifying a realistic cache coherence protocol for a many-core system. </p><p>We, as architects, believe that we can alleviate the verification problem by changing the traditional design paradigm. We suggest taking verifiability as a first-class design constraint, just as we do with other traditional metrics, such as performance, power consumption, and area overhead. To do this, we need to incorporate verification effort in the early design stage of a cache coherence protocol and make wise design decisions regarding the verifiability. Such a protocol will be amenable to verification and easier to be verified in a later stage. Specifically, we propose two methods in this thesis for designing scalably verifiable cache coherence protocols. </p><p>The first method is Fractal Coherence, targeting verifiable hierarchical protocols. Fractal Coherence leverages the fractal idea to design a cache coherence protocol. The self-similarity of the fractal enables the inductive verification of the protocol. Such a verification process is independent of the number of nodes and thus is scalable. We also design example protocols to show that Fractal Coherence protocols can attain comparable performance compared to a traditional snooping or directory protocol. </p><p>As a system scales hierarchically, Fractal Coherence can perfectly solve the verification problem of the implemented cache coherence protocol. However, Fractal Coherence cannot help if the system scales horizontally. Therefore, we propose the second method, PVCoherence, targeting verifiable flat protocols. PVCoherence is based on parametric verification, a widely used method for verifying the coherence of a flat protocol with infinite number of nodes. PVCoherence captures the fundamental requirements and limitations of parametric verification and proposes a set of guidelines for designing cache coherence protocols that are compatible with parametric verification. As long as designers follow these guidelines, their protocols can be easily verified. </p><p>We further show that Fractal Coherence and PVCoherence can also facilitate the verification of memory consistency, another extremely challenging problem. One piece of previous work proves that the verification of memory consistency can be decomposed into three steps. The most complex and non-scalable step is the verification of the cache coherence protocol. If we design the protocol following the design methodology of Fractal Coherence or PVCoherence, we can easily verify the cache coherence protocol and overcome the biggest obstacle in the verification of memory consistency. </p><p>As system expands and cache coherence protocols get more complex, the verification problem of the protocol becomes more prominent. We believe it is time to reconsider the traditional design flow in which verification is totally separated from the design stage. We show that by incorporating the verifiability in the early design stage and designing protocols to be scalably verifiable in the first place, we can greatly reduce the burden of verification. Meanwhile, we perform various experiments and show that we do not lose benefits in performance as well as in other metrics when we obtain the correctness guarantee.</p>Dissertatio

    Aeronautical Engineering: A cumulative index to the 1984 issues of the continuing bibliography

    Get PDF
    This bibliography is a cumulative index to the abstracts contained in NASA SP-7037(171) through NASA SP-7037(182) of Aeronautical Engineering: A Continuing Bibliography. NASA SP-7037 and its supplements have been compiled through the cooperative efforts of the American Institute of Aeronautics and Astronautics (AIAA) and the National Aeronautics and Space Administration (NASA). This cumulative index includes subject, personal author, corporate source, foreign technology, contract, report number, and accession number indexes

    Testability Features of the Alpha 21364 Microprocessor

    No full text
    The custom testability strategy of the Alpha 21364, Hewlett-Packard鈥檚 most recent Alpha microprocessor, builds upon its Alpha 21264 embedded core. Several additional DFT features integrate to meet the testing challenges of the new generation. 1

    Aeronautical engineering: A cumulative index to a continuing bibliography (supplement 274)

    Get PDF
    This publication is a cumulative index to the abstracts contained in supplements 262 through 273 of Aeronautical Engineering: A Continuing Bibliography. The bibliographic series is compiled through the cooperative efforts of the American Institute of Aeronautics and Astronautics (AIAA) and the National Aeronautics and Space Administration (NASA). Seven indexes are included: subject, personal author, corporate source, foreign technology, contract number, report number, and accession number
    corecore