Error Mitigation Using Approximate Logic Circuits: A Comparison of Probabilistic and Evolutionary Approaches
Technology scaling poses an increasing challenge to the reliability of digital circuits. Hardware redundancy solutions, such as triple modular redundancy (TMR), produce very high area overhead, so partial redundancy is often used to reduce the overheads. Approximate logic circuits provide a general framework for optimized mitigation of errors arising from a broad class of failure mechanisms, including transient, intermittent, and permanent failures. However, generating an optimal redundant logic circuit that is able to mask the faults with the highest probability while minimizing the area overheads is a challenging problem. In this study, we propose and compare two new approaches to generate approximate logic circuits to be used in a TMR schema. The probabilistic approach approximates a circuit in a greedy manner based on a probabilistic estimation of the error. The evolutionary approach can provide radically different solutions that are hard to reach by other methods. By combining these two approaches, the solution space can be explored in depth. Experimental results demonstrate that the evolutionary approach can produce better solutions, but the probabilistic approach is close. Moreover, these approaches provide much better scalability than other existing partial redundancy techniques. This work was supported by the Ministry of Economy and Competitiveness of Spain under project ESP2015-68245-C4-1-P, and by the Czech Science Foundation project GA16-17538S and the Ministry of Education, Youth and Sports of the Czech Republic from the National Programme of Sustainability (NPU II); project IT4Innovations excellence in science - LQ1602.
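As an illustration of how approximate circuits can sit in a TMR schema, the following minimal sketch assumes one commonly used arrangement that the abstract does not spell out: the original function is voted against an under-approximation (it outputs 1 only where the original does) and an over-approximation (it outputs 0 only where the original does). The three Boolean functions are hypothetical toy examples, not circuits from the paper.

```python
from itertools import product

def original(a, b, c):
    # Hypothetical target function G = ab + c.
    return (a & b) | c

def under_approx(a, b, c):
    # Assumed under-approximation F = ab, so F = 1 implies G = 1.
    return a & b

def over_approx(a, b, c):
    # Assumed over-approximation H = a + c, so G = 1 implies H = 1.
    return a | c

def majority(x, y, z):
    # Standard TMR majority voter.
    return (x & y) | (x & z) | (y & z)

def voted_output(a, b, c, fault_in_original=False):
    # Optionally flip the output of the original circuit to model a single fault.
    g = original(a, b, c) ^ int(fault_in_original)
    return majority(g, under_approx(a, b, c), over_approx(a, b, c))

vectors = list(product((0, 1), repeat=3))
# Input vectors for which a fault in the original circuit is masked by the voter.
protected = [v for v in vectors
             if voted_output(*v, fault_in_original=True) == original(*v)]
print(f"{len(protected)} of {len(vectors)} input vectors are protected")
```

In this toy case the fault-free voter always reproduces the original function, and a fault is masked exactly on the inputs where the two approximations agree. That is the trade-off between area and protection the abstract refers to.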
On Co-Optimization Of Constrained Satisfiability Problems For Hardware Software Applications
Manufacturing technology has permitted an exponential growth in transistor count and density. However, making efficient use of the available transistors in the design has become exceedingly difficult. Standard design flow involves synthesis, verification, placement and routing followed by final tape out of the design. Due to the presence of various undesirable effects like capacitive crosstalk, supply noise, high temperatures, etc., verification/validation of the design has become a challenging problem. Therefore, having a good design convergence may not be possible within the target time, due to a need for a large number of design iterations.
Capacitive crosstalk is one of the major causes of design convergence problems in the deep sub-micron era. With scaling, the number of crosstalk violations has been increasing because of reduced inter-wire distances. Consequently, only the most severe crosstalk faults are fixed pre-silicon while the rest are tested post-silicon. Testing for capacitive crosstalk involves generation of input patterns which can be applied post-silicon to the integrated circuit and comparison of the output response. These patterns are generated at the gate or Register Transfer Level (RTL) of abstraction using Automatic Test Pattern Generation (ATPG) tools. In this dissertation, an Integer Linear Programming (ILP) based ATPG technique for maximizing crosstalk induced delay increase at the victim net, for multiple aggressor crosstalk faults, is presented. Moreover, various solutions for pattern generation considering both zero and unit delay models are also proposed.
With voltage scaling, power supply switching noise has become one of the leading causes of signal integrity related failures in deep sub-micron designs. Hence, during power supply network design and analysis of power supply switching noise, computation of peak supply current is an essential step. Traditional peak current estimation approaches involve addition of peak current associated with all the CMOS gates which are switching in a combinational circuit. Consequently, this approach does not take the Boolean and temporal relationships of the circuit into account. This work presents an ILP based technique for generation of an input pattern pair which maximizes switching supply currents for a combinational circuit in the presence of integer gate delays. The input pattern pair generated using the above approach can be applied post-silicon for power droop testing.
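To make the gap between the traditional sum-of-peaks estimate and a pattern-aware estimate concrete, here is a minimal sketch. It assumes a zero-delay model and replaces the ILP formulation with exhaustive search over input pattern pairs; the three-gate circuit and the per-gate peak currents are hypothetical.

```python
from itertools import product

# Hypothetical per-gate peak currents in arbitrary units.
PEAK = {"g1": 1, "g2": 1, "g3": 3}

def gate_outputs(a, b):
    # Toy circuit: g1 = a AND b, g2 = a OR b, g3 = g1 XOR g2.
    g1 = a & b
    g2 = a | b
    g3 = g1 ^ g2
    return {"g1": g1, "g2": g2, "g3": g3}

def switching_current(v1, v2):
    # Sum the peak current of every gate whose output toggles between v1 and v2.
    before, after = gate_outputs(*v1), gate_outputs(*v2)
    return sum(PEAK[g] for g in PEAK if before[g] != after[g])

def worst_pattern_pair():
    # Exhaustive stand-in for the ILP: try every ordered pair of input vectors.
    vectors = list(product((0, 1), repeat=2))
    return max(product(vectors, vectors),
               key=lambda pair: switching_current(*pair))

naive_bound = sum(PEAK.values())
best_pair = worst_pattern_pair()
print("naive bound:", naive_bound)
print("achievable peak:", switching_current(*best_pair), "with pair", best_pair)
```

Because no pattern pair can toggle all three gates at once, the achievable peak stays below the naive bound. This is the pessimism that accounting for Boolean relationships removes.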
With high level of integration, Multi-Processor Systems on Chip (MPSoC) feature multiple processor cores and accelerators on the same die, so as to exploit the instruction level parallelism in the application. For hardware-software co-design, application programming model is based on a Task Graph, which represents task dependencies and execution/transfer times for various threads and processes within an application. Mapping an application to an MPSoC traditionally involves representing it in the form of a task graph and employing static scheduling in order to minimize the schedule length. Dynamic system behavior is not taken into consideration during static scheduling, while dynamic scheduling requires the knowledge of task graph at runtime. A run-time task graph extraction heuristic to facilitate dynamic scheduling is also presented here. A novel game theory based approach uses this extracted task graph to perform run-time scheduling in order to minimize total schedule length.
With the increase in transistor density, power density has gone up substantially. This has led to the generation of regions with very high temperature called hotspots. Hotspots lead to reliability and performance issues and affect design convergence. In current generation Integrated Circuits (ICs), temperature is controlled by reducing power dissipation using Dynamic Thermal Management (DTM) techniques like frequency and/or voltage scaling. These techniques are reactive in nature and have detrimental effects on performance. Here, a look-ahead based task migration technique is proposed, in order to utilize the multitude of cores available in an MPSoC to eliminate thermal emergencies. Our technique is based on temperature prediction, leveraging a novel wavelet based thermal modeling approach.
Hence, this work addresses several optimization problems that can be reduced to constrained max-satisfiability, involving integer as well as Boolean constraints in hardware and software domains. Moreover, it provides domain-specific heuristic solutions for each of them.
Timing speculation and adaptive reliable overclocking techniques for aggressive computer systems
Computers have changed our lives beyond our own imagination in the past several decades. The continued and progressive advancements in VLSI technology and numerous micro-architectural innovations have played a key role in the design of spectacular low-cost high performance computing systems that have become omnipresent in today's technology-driven world. Performance and dependability have become key concerns as these ubiquitous computing machines continue to drive our everyday life. Every application has unique demands, as they run in diverse operating environments. Dependable, aggressive and adaptive systems improve efficiency in terms of speed, reliability and energy consumption.
Traditional computing systems run at a fixed clock frequency, which is determined by taking into account the worst-case timing paths, operating conditions, and process variations. Timing speculation based reliable overclocking advocates going beyond worst-case limits to achieve best performance while not avoiding, but detecting and correcting a modest number of timing errors. The success of this design methodology relies on the fact that timing critical paths are rarely exercised in a design, and typical execution happens much faster than the timing requirements dictated by worst-case design methodology. Better-than-worst-case design methodology is advocated by several recent research pursuits, which exploit dependability techniques to enhance computer system performance.
In this dissertation, we address different aspects of timing speculation based adaptive reliable overclocking schemes, and evaluate their role in the design of low-cost, high performance, energy efficient and dependable systems. We visualize various control knobs in the design that can be favorably controlled to ensure different design targets.
As part of this research, we extend the SPRIT3E (Superscalar PeRformance Improvement Through Tolerating Timing Errors) framework, and characterize the extent of application dependent performance acceleration achievable in superscalar processors by scrutinizing the various parameters that impact the operation beyond worst-case limits. We study the limitations imposed by short-path constraints on our technique, and present ways to exploit them to maximize performance gains. We analyze the sensitivity of our technique's adaptiveness by exploring the necessary hardware requirements for dynamic overclocking schemes. Experimental analysis based on SPEC2000 benchmarks running on a SimpleScalar Alpha processor simulator, augmented with error rate data obtained from hardware simulations of a superscalar processor, is presented.
Even though reliable overclocking guarantees functional correctness, it leads to higher power consumption. As a consequence, reliable overclocking without considering on-chip temperatures will bring down the lifetime reliability of the chip. In this thesis, we analyze how reliable overclocking impacts the on-chip temperature of a microprocessor and evaluate the effects of overheating, due to such reliable dynamic frequency tuning mechanisms, on the lifetime reliability of these systems. We then evaluate the effect of performing thermal throttling, a technique that clamps the on-chip temperature below a predefined value, on system performance and reliability. Our study shows that a reliably overclocked system with dynamic thermal management achieves 25% performance improvement, while lasting for 14 years when being operated within 353K.
Over the past five decades, technology scaling, as predicted by Moore's law, has been the bedrock of semiconductor technology evolution. The continued downscaling of CMOS technology to deep sub-micron gate lengths has been the primary reason for its dominance in today's omnipresent silicon microchips. Even as the transition to the next technology node is indispensable, the initial cost and time associated with doing so present a non-level playing field for the competitors in the semiconductor business. As part of this thesis, we evaluate the capability of speculative reliable overclocking mechanisms to maximize performance at a given technology level. We evaluate its competitiveness when compared to technology scaling, in terms of performance, power consumption, energy and energy delay product. We present a comprehensive comparison for integer and floating point SPEC2000 benchmarks running on a simulated Alpha processor at three different technology nodes in normal and enhanced modes. Our results suggest that adopting reliable overclocking strategies will help skip a technology node altogether, or be competitive in the market, while porting to the next technology node.
Reliability has become a serious concern as systems embrace nanometer technologies. In this dissertation, we propose a novel fault tolerant aggressive system that combines soft error protection and timing error tolerance. We replicate both the pipeline registers and the pipeline stage combinational logic. The replicated logic receives its inputs from the primary pipeline registers while writing its output to the replicated pipeline registers. The organization of redundancy in the proposed Conjoined Pipeline system supports overclocking, provides concurrent error detection and recovery capability for soft errors, intermittent faults and timing errors, and flags permanent silicon defects. The fast recovery process requires no checkpointing and takes three cycles. Back annotated post-layout gate-level timing simulations, using 45nm technology, of a conjoined two-stage arithmetic pipeline and a conjoined five-stage DLX pipeline processor, with forwarding logic, show that our approach, even under a severe fault injection campaign, achieves near 100% fault coverage and an average performance improvement of about 20% when dynamically overclocked.
IC design for reliability
As the feature size of integrated circuits goes down to the nanometer scale,
transient and permanent reliability issues are becoming a significant concern for circuit
designers. Traditionally, the reliability issues were mostly handled at the device level as a
device engineering problem. However, the increasing severity of reliability challenges
and higher error rates due to transient upsets favor higher-level design for reliability
(DFR). In this work, we develop several methods for DFR at the circuit level.
A major source of transient errors is the single event upset (SEU). SEUs are
caused by high-energy particles present in the cosmic rays or emitted by radioactive
contaminants in the chip packaging materials. When these particles hit an N+/P+ depletion
region of an MOS transistor, they may generate a temporary logic fault. Depending on
where the MOS transistor is located and what state the circuit is at, an SEU may result in
a circuit-level error. We analyze SEUs both in combinational logic and memories
(SRAM). For combinational logic circuits, we propose FASER, a Fast Analysis tool of
Soft ERror susceptibility for cell-based designs. The efficiency of FASER is achieved
through its static and vector-less nature. In order to evaluate the impact of SEU on SRAM, a theory for estimating dynamic noise margins is developed analytically. The
results allow predicting the transient error susceptibility of an SRAM cell using a closed-form
expression.
Among the many permanent failure mechanisms that include time-dependent
oxide breakdown (TDDB), electro-migration (EM), hot carrier effect (HCE), and
negative bias temperature instability (NBTI), NBTI has recently become important.
Therefore, the main focus of our work is NBTI. NBTI occurs when the gate of PMOS is
negatively biased. The voltage stress across the gate generates interface traps, which
degrade the threshold voltage of PMOS. The degraded PMOS may eventually fail to meet
timing requirement and cause functional errors. NBTI becomes severe at elevated
temperatures. In this dissertation, we propose an NBTI degradation model that takes into
account the temperature variation on the chip and gives an accurate estimation of the
degraded threshold voltage.
In order to account for the degradation of devices, traditional design methods add
guard-bands to ensure that the circuit will function properly during its lifetime. However,
the worst-case based guard-bands lead to significant penalty in performance. In this
dissertation, we propose an effective macromodel-based reliability tracking and
management framework, based on a hybrid network of on-chip sensors, consisting of
temperature sensors and ring oscillators. The model is concerned specifically with NBTI-induced
transistor aging. The key feature of our work, in contrast to the traditional
tracking techniques that rely solely on direct measurement of the increase of threshold
voltage or circuit delay, is an explicit macromodel which maps operating temperature to
circuit degradation (the increase of circuit delay). The macromodel allows for cost-effective
tracking of reliability using temperature sensors and is also essential for
enabling the control loop of the reliability management system. The developed methods reduce the over-conservatism of the device-level, worst-case
reliability estimation techniques. As the severity of reliability challenges continues to
grow with technology scaling, it will become more important for circuit designers and CAD
tools to be equipped with the developed methods. Electrical and Computer Engineering
Transient error mitigation by means of approximate logic circuits
International Mention in the doctoral degree. The technological advances in the manufacturing of electronic circuits have
greatly improved their performance, but they have also increased the sensitivity of electronic
devices to radiation-induced errors. Among them, the most common effects are
the SEEs, i.e., electrical perturbations provoked by the strike of high-energy particles,
which may modify the internal state of a memory element (SEU) or generate erroneous
transient pulses (SET), among other effects. These events pose a threat for the reliability
of electronic circuits, and therefore fault-tolerance techniques must be applied to
deal with them.
The most common fault-tolerance techniques are based on full replication (DWC or
TMR). These techniques are able to cover a wide range of failure mechanisms present
in electronic circuits. However, they suffer from high overheads in terms of area and
power consumption. For this reason, lighter alternatives are often sought at the expense
of slightly reducing reliability for the least critical circuit sections. In this context a new
paradigm of electronic design is emerging, known as approximate computing, which
is based on improving circuit performance in exchange for slight modifications of the
intended functionality. This is an interesting approach for the design of lightweight
fault-tolerant solutions, which has not yet been studied in depth.
The main goal of this thesis consists in developing new lightweight fault-tolerant
techniques with partial replication, by means of approximate logic circuits. These
circuits can be designed with great flexibility. This way, the level of protection as
well as the overheads can be adjusted at will depending on the necessities of each
application. However, finding optimal approximate circuits for a given application is
still a challenge.
In this thesis a method for approximate circuit generation is proposed, denoted
as fault approximation, which consists in assigning constant logic values to specific
circuit lines. On the other hand, several criteria are developed to generate the most
suitable approximate circuits for each application, by using this fault approximation
mechanism. These criteria are based on the idea of approximating the least testable
sections of circuits, which allows reducing overheads while minimising the loss of reliability.
Therefore, in this thesis the selection of approximations is linked to testability
measures.
The first criterion for fault selection developed in this thesis uses static testability
measures. The approximations are generated from the results of a fault simulation of
the target circuit, and from a user-specified testability threshold. The number of approximated
faults depends on the chosen threshold, which allows generating approximate circuits with different performances. Although this approach was initially intended for
combinational circuits, an extension to sequential circuits has been performed as well,
by considering the flip-flops as both inputs and outputs of the combinational part of
the circuit. The experimental results show that this technique achieves a wide scalability,
and an acceptable trade-off between reliability versus overheads. In addition, its
computational complexity is very low.
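The static criterion can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis implementation: the netlist is hypothetical, exhaustive simulation stands in for the fault simulation, and the "testability" of a stuck-at fault is taken to be the fraction of input vectors that detect it at the output.

```python
from itertools import product

# Hypothetical toy netlist in topological order: line -> (operator, fan-in lines).
INPUTS = ("a", "b", "c", "d", "e")
NETLIST = {
    "n1": ("and", ("a", "b")),
    "n2": ("and", ("c", "d")),
    "n3": ("and", ("n2", "e")),
    "out": ("or", ("n1", "n3")),
}
OPS = {"and": all, "or": any}
VECTORS = list(product((0, 1), repeat=len(INPUTS)))

def evaluate(vector, forced=None):
    # Simulate the netlist; lines listed in `forced` are tied to a constant value.
    forced = forced or {}
    values = dict(zip(INPUTS, vector))
    for line, (op, fanin) in NETLIST.items():
        computed = int(OPS[op](values[i] for i in fanin))
        values[line] = forced.get(line, computed)
    return values["out"]

def detectability(line, stuck_value):
    # Fraction of input vectors for which the stuck-at fault changes the output.
    hits = sum(evaluate(v) != evaluate(v, {line: stuck_value}) for v in VECTORS)
    return hits / len(VECTORS)

def approximate(threshold):
    # Tie each internal line to the constant of its least testable fault,
    # provided that fault's testability does not exceed the threshold.
    forced = {}
    for line in NETLIST:
        if line == "out":
            continue
        score, value = min((detectability(line, v), v) for v in (0, 1))
        if score <= threshold:
            forced[line] = value
    return forced

print(approximate(0.1))
```

With a threshold of 0.1, the sketch ties `n2` and `n3` to 0, removing the rarely exercised `c·d·e` branch. Raising the threshold approximates more lines, giving a smaller circuit and a higher error.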
However, the selection criterion based on static testability measures has some drawbacks.
Adjusting the performance of the generated approximate circuits by means of
the approximation threshold is not intuitive, and the static testability measures do not
take into account the changes that occur as faults are approximated. Therefore, an alternative
criterion is proposed, which is based on dynamic testability measures. With this
criterion, the testability of each fault is computed by means of an implication-based
probability analysis. The probabilities are updated with each new approximated fault,
in such a way that on each iteration the most beneficial approximation is chosen, that
is, the fault with the lowest probability. In addition, the computed probabilities allow
estimating the level of protection against faults that the generated approximate circuits
provide. Therefore, it is possible to generate circuits that meet a target error rate.
By modifying this target, circuits with different performances can be obtained. The
experimental results show that this new approach is able to stick to the target error rate
with reasonably good precision. In addition, the approximate circuits generated with
this technique show better performance than those generated with the approach based on static testability
measures. The fault implications have also been reused to
implement a new type of logic transformation, which consists in substituting functionally
similar nodes.
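The iterative, target-driven selection described above can be sketched as a greedy loop. The sketch makes several assumptions: the netlist is a hypothetical toy, exhaustive simulation replaces the implication-based probability analysis, and the quantity minimised at each step is the error rate of the resulting approximate circuit under uniformly distributed inputs.

```python
from itertools import product

# Hypothetical toy netlist in topological order: line -> (operator, fan-in lines).
INPUTS = ("a", "b", "c", "d", "e")
NETLIST = {
    "n1": ("and", ("a", "b")),
    "n2": ("and", ("c", "d")),
    "n3": ("and", ("n2", "e")),
    "out": ("or", ("n1", "n3")),
}
OPS = {"and": all, "or": any}
VECTORS = list(product((0, 1), repeat=len(INPUTS)))

def evaluate(vector, forced):
    # Simulate the netlist; lines listed in `forced` are tied to a constant value.
    values = dict(zip(INPUTS, vector))
    for line, (op, fanin) in NETLIST.items():
        computed = int(OPS[op](values[i] for i in fanin))
        values[line] = forced.get(line, computed)
    return values["out"]

def error_rate(forced):
    # Fraction of input vectors where the approximate circuit differs from the original.
    return sum(evaluate(v, {}) != evaluate(v, forced) for v in VECTORS) / len(VECTORS)

def greedy_approximation(target_error):
    forced = {}
    while True:
        # Re-evaluate every remaining candidate against the current approximate circuit.
        candidates = [
            (error_rate({**forced, line: value}), line, value)
            for line in NETLIST if line != "out" and line not in forced
            for value in (0, 1)
        ]
        if not candidates:
            return forced
        rate, line, value = min(candidates)
        if rate > target_error:
            return forced  # the best remaining step would exceed the target
        forced[line] = value

result = greedy_approximation(0.25)
print(result, error_rate(result))
```

Unlike a fixed threshold, the stopping condition here is expressed directly as the error rate the user is willing to accept. This mirrors the motivation given for the dynamic criterion.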
Once the fault selection criteria have been developed, they are applied to different
scenarios. First, an extension of the proposed techniques to FPGAs is performed,
taking into account the particularities of this kind of circuits. This approach has been
validated by means of radiation experiments, which show that a partial replication with
approximate circuits can be even more robust than a full replication approach, because
a smaller area reduces the probability of SEE occurrence. Besides, the proposed
techniques have been applied to a real application circuit as well, in particular to the
microprocessor ARM Cortex M0. A set of software benchmarks is used to generate
the required testability measures. Finally, a comparative study of the proposed approaches
with approximate circuit generation by means of evolutionary techniques has
been performed. These approaches make use of a high computational capacity to generate
multiple circuits by trial-and-error, thus reducing the possibility of falling into
local minima. The experimental results demonstrate that the circuits generated with
evolutionary approaches are slightly better in performance than the circuits generated with
the techniques here proposed, although with a much higher computational effort.
In summary, several original fault mitigation techniques with approximate logic
circuits are proposed. These approaches are demonstrated in various scenarios, showing
that the scalability and adaptability to the requirements of each application are their
main virtues.
Official Doctoral Programme in Electrical, Electronic and Automation Engineering. Committee: President: Raoul Velazco; Secretary: Almudena Lindoso Muñoz; Member: Jaume Segura Fuste
Design for Test and Hardware Security Utilizing Tester Authentication Techniques
Design-for-Test (DFT) techniques have been developed to improve testability of integrated circuits. Among the known DFT techniques, scan-based testing is considered an efficient solution for digital circuits. However, scan architecture can be exploited to launch a side-channel attack. Scan chains can be used to access a cryptographic core inside a system-on-chip to extract critical information such as a private encryption key. For a scan-enabled chip, if an attacker is given unlimited access to apply all sorts of inputs to the Circuit-Under-Test (CUT) and observe the outputs, the probability of gaining access to critical information increases. In this thesis, solutions are presented to improve hardware security and protect circuits against attacks that exploit the scan architecture. A solution based on tester authentication is presented in which the CUT requests the tester to provide a secret code for authentication. The tester authentication circuit limits access to the scan architecture to known testers. Moreover, in the proposed solution the number of attempts to apply test vectors and observe the results through the scan architecture is limited to make brute-force attacks practically impossible. A tester authentication scheme utilizing a Phase Locked Loop (PLL) to encrypt the operating frequency of both the DUT and the tester is also presented. In this method, access to critical security circuits such as crypto-cores is not granted in the test mode. Instead, a built-in self-test method is used in the test mode to protect the circuit against scan-based attacks. Security for the new generation of three-dimensional (3D) integrated circuits has been investigated through 3D simulations in the COMSOL Multiphysics environment. It is shown that the process of wafer thinning for 3D stacked IC integration reduces the leakage current, which increases the chip security against side-channel attacks.
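The attempt-limited authentication idea can be modelled in software as below. This is only a behavioural sketch under assumptions: the class name, the byte-string secret, and the default limit of three attempts are all hypothetical, and the thesis describes a hardware circuit rather than code.

```python
import hmac

class ScanAccessGuard:
    """Hypothetical behavioural model of an attempt-limited tester authenticator."""

    def __init__(self, secret: bytes, max_attempts: int = 3):
        self._secret = secret
        self._remaining = max_attempts
        self.unlocked = False

    def authenticate(self, code: bytes) -> bool:
        # Once the attempt budget is spent, the guard stays locked for good.
        if self._remaining <= 0:
            return False
        # Constant-time comparison avoids leaking how many bytes matched.
        if hmac.compare_digest(code, self._secret):
            self.unlocked = True
            return True
        self._remaining -= 1
        return False

guard = ScanAccessGuard(b"tester-secret", max_attempts=2)
print(guard.authenticate(b"guess-1"))        # wrong code, one attempt left
print(guard.authenticate(b"guess-2"))        # wrong code, budget exhausted
print(guard.authenticate(b"tester-secret"))  # correct code, but the guard is locked
```

Capping the number of attempts means an attacker cannot enumerate the code space, which is the property the abstract relies on to rule out brute-force attacks.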
A Survey of Fault-Injection Methodologies for Soft Error Rate Modeling in Systems-on-Chips
The development of process technology has increased system performance, but the system failure probability has also significantly increased. It is important to consider the system reliability in addition to the cost, performance, and power consumption. In this paper, we describe the types of faults that occur in a system and where these faults originate. Then, fault-injection techniques, which are used to characterize the fault rate of a system-on-chip (SoC), are investigated to provide a guideline to SoC designers for the realization of resilient SoCs.
Efficient Path Delay Test Generation with Boolean Satisfiability
This dissertation focuses on improving the accuracy and efficiency of path delay test generation using a Boolean satisfiability (SAT) solver. As part of this research, one of the most commonly used SAT solvers, MiniSat, was integrated into the path delay test generator CodGen. A mixed structural-functional approach was implemented in CodGen where longest paths were detected using the K Longest Path Per Gate (KLPG) algorithm and path justification and dynamic compaction were handled with the SAT solver.
Advanced techniques were implemented in CodGen to further speed up SAT based path delay test generation using knowledge of the circuit structure. SAT solvers are inherently unaware of circuit structure, and significant speedup can be obtained if structural information about the circuit is provided to the SAT solver. The advanced techniques explored include: Dynamic SAT Solving (DSS), Circuit Observability Don’t Care (Cir-ODC), SAT based static learning, dynamic learnt clause management and Approximate Observability Don’t Care (ACODC). Both ISCAS 89 and ITC 99 benchmarks as well as industrial circuits were used to demonstrate that the performance of CodGen was significantly improved with MiniSat and the use of circuit structure.
Empirical timing analysis of CPUs and delay fault tolerant design using partial redundancy
The operating clock frequency is determined by the longest signal propagation
delay, setup/hold time, and timing margin. These are becoming less predictable with
the increasing design complexity and process miniaturization. The difficult challenge
is then to ensure that a device operating at its clock frequency is error-free with
quantifiable assurance. Effort at device-level engineering will not suffice for these
circuits exhibiting wide process variation and heightened sensitivities to operating
condition stress. Logic-level redress of this issue is a necessity and we propose a
design-level remedy for this timing-uncertainty problem.
The aim of the design and analysis approaches presented in this dissertation is to
provide a framework, SABRE, wherein an increased operating clock frequency can be
achieved. The approach is a combination of analytical modeling, experimental analysis,
hardware/time-redundancy design, exception handling and recovery techniques.
Our proposed design replicates only a necessary part of the original circuit to avoid
high hardware overhead as in triple-modular-redundancy (TMR). The timing-critical
combinational circuit is path-wise partitioned into two sections. The combinational
circuits associated with long paths are laid out without any intrusion except for the
fan-out connections from the first section of the circuit to a replicated second section
of the combinational circuit. Thus only the second section of the circuit is replicated.
The signals fanning out from the first section are latched, and thus their paths are far shorter than the paths spanning the entire combinational circuit. The replicated circuit is timed
at a subsequent clock cycle to ascertain relaxed timing paths. This ensures that the
likelihood of mistiming due to stress or process variation is eliminated. During the
subsequent clock cycle, the outputs of the two logically identical, yet time-interleaved,
circuits are compared to detect faults. When a fault is detected, the retry signal
is triggered and the dynamic frequency-step-down takes place before a pipe flush,
and retry is issued. The significant timing overhead associated with the retry is offset
by the rarity of the timing violation events. Simulation results on ISCAS Benchmark
circuits show that a 10% clock frequency gain is possible with 10 to 20% hardware
overhead for the replicated timing-critical circuit.