42 research outputs found
Design, Analysis and Test of Logic Circuits under Uncertainty.
Integrated circuits are increasingly susceptible to uncertainty caused by soft
errors, inherently probabilistic devices, and manufacturing variability. As device technologies
scale, these effects become detrimental to circuit reliability. In order to address
this, we develop methods for analyzing, designing, and testing circuits subject to probabilistic
effects. Our main contributions are: 1) a fast, soft-error rate (SER) analyzer
that uses functional-simulation signatures to capture error effects, 2) novel design techniques
that improve reliability using little area and performance overhead, 3) a matrix-based
reliability-analysis framework that captures many types of probabilistic faults, and
4) test-generation/compaction methods aimed at probabilistic faults in logic circuits.
SER analysis must account for the main error-masking mechanisms in ICs: logic,
timing, and electrical masking. We relate logic masking to node testability of the circuit
and utilize functional-simulation signatures, i.e., partial truth tables, to efficiently compute
estability (signal probability and observability). To account for timing masking, we compute
error-latching windows (ELWs) from timing analysis information. Electrical masking
is incorporated into our estimates through derating factors for gate error probabilities. The
SER of a circuit is computed by combining the effects of all three masking mechanisms
within our SER analyzer called AnSER.
Using AnSER, we develop several low-overhead techniques that increase reliability,
including: 1) an SER-aware design method that uses redundancy already present within
the circuit, 2) a technique that resynthesizes small logic windows to improve area and
reliability, and 3) a post-placement gate-relocation technique that increases timing masking by decreasing ELWs.
We develop the probabilistic transfer matrix (PTM) modeling framework to analyze
effects beyond soft errors. PTMs are compressed into algebraic decision diagrams (ADDs)
to improve computational efficiency. Several ADD algorithms are developed to extract
reliability and error susceptibility information from PTMs representing circuits.
We propose new algorithms for circuit testing under probabilistic faults, which require
a reformulation of existing test techniques. For instance, a test vector may need to be
repeated many times to detect a fault. Also, different vectors detect the same fault with
different probabilities. We develop test generation methods that account for these differences, and integer linear programming (ILP) formulations to optimize test sets.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/61584/1/smita_1.pd
Approximate logic circuits: Theory and applications
CMOS technology scaling, the process of shrinking transistor dimensions based
on Moore's law, has been the thrust behind increasingly powerful integrated circuits
for over half a century. As dimensions are scaled to few tens of nanometers, process
and environmental variations can significantly alter transistor characteristics, thus
degrading reliability and reducing performance gains in CMOS designs with technology
scaling. Although design solutions proposed in recent years to improve reliability
of CMOS designs are power-efficient, the performance penalty associated with these
solutions further reduces performance gains with technology scaling, and hence these
solutions are not well-suited for high-performance designs.
This thesis proposes approximate logic circuits as a new logic synthesis paradigm
for reliable, high-performance computing systems. Given a specification, an approximate
logic circuit is functionally equivalent to the given specification for a "significant"
portion of the input space, but has a smaller delay and power as compared to a
circuit implementation of the original specification. This contributions of this thesis
include (i) a general theory of approximation and efficient algorithms for automated
synthesis of approximations for unrestricted random logic circuits, (ii) logic design solutions
based on approximate circuits to improve reliability of designs with negligible
performance penalty, and (iii) efficient decomposition algorithms based on approxiiii
mate circuits to improve performance of designs during logic synthesis. This thesis
concludes with other potential applications of approximate circuits and identifies. open
problems in logic decomposition and approximate circuit synthesis
Design Automation and Application for Emerging Reconfigurable Nanotechnologies
In the last few decades, two major phenomena have revolutionized the electronic industry – the ever-increasing dependence on electronic circuits and the Complementary Metal Oxide Semiconductor (CMOS) downscaling. These two phenomena have been complementing each other in a way that while electronics, in general, have demanded more computations per functional unit, CMOS downscaling has aptly supported such needs. However, while the computational demand is still rising exponentially, CMOS downscaling is reaching its physical limits. Hence, the need to explore viable emerging nanotechnologies is more imperative than ever. This thesis focuses on streamlining the existing design automation techniques for a class of emerging reconfigurable nanotechnologies. Transistors based on this technology exhibit duality in conduction, i.e. they can be configured dynamically either as a p-type or an n-type device on the application of an external bias. Owing to this dynamic reconfiguration, these transistors are also referred to as Reconfigurable Field-Effect Transistors (RFETs).
Exploring and developing new technologies just like CMOS, require tackling two main challenges – first, design automation flow has to be modified to enable tailor- made circuit designs. Second, possible application opportunities should be explored where such technologies can outsmart the existing CMOS technologies. This thesis targets the above two objectives for emerging reconfigurable nanotechnologies by proposing approaches for enabling an Electronic Design Automation (EDA) flow for circuits based on RFETs and exploring hardware security as an application that exploits the transistor-level dynamic reconfiguration offered by this technology.
This thesis explains the bottom-up approach adopted to propose a logic synthesis flow by identifying new logic gates and circuit design paradigms that can particularly exploit the dynamic reconfiguration offered by these novel nanotechnologies. This led to the subsequent need of finding natural Boolean logic abstraction for emerging reconfigurable nanotechnologies as it is shown that the existing abstraction of negative unate logic for CMOS technologies is sub-optimal for RFETs-based circuits. In this direction, it has been shown that duality in Boolean logic is a natural abstraction for this technology and can truly represent the duality in conduction offered by individual transistors. Finding this abstraction paved the way for defining suitable primitives and proposing various algorithms for logic synthesis and technology mapping.
The following step is to explore compatible physical synthesis flow for emerging reconfigurable nanotechnologies. Using silicon nanowire-based RFETs, .lef and .lib files have been provided which can provide an end-to-end flow to generate .GDSII file for circuits exclusively based on RFETs. Additionally, new approaches have been explored to improve placement and routing for circuits based on reconfigurable nanotechnologies. It has been demonstrated how these approaches led to superior results as compared to the native flow meant for CMOS.
Lastly, the unique property of transistor-level reconfiguration offered by RFETs is utilized to implement efficient Intellectual Property (IP) protection schemes against adversarial attacks. The ability to control the conduction of individual transistors can be argued as one of the impactful features of this technology and suitably fits into the paradigm of security measures. Prior security schemes based on CMOS technology often come with large overheads in terms of area, power, and delay. In contrast, RFETs-based hardware security measures such as logic locking, split manufacturing, etc. proposed in this thesis, demonstrate affordable security solutions with low overheads.
Overall, this thesis lays a strong foundation for the two main objectives – design automation, and hardware security as an application, to push emerging reconfigurable nanotechnologies for commercial integration. Additionally, contributions done in this thesis are made available under open-source licenses so as to foster new research directions and collaborations.:Abstract
List of Figures
List of Tables
1 Introduction
1.1 What are emerging reconfigurable nanotechnologies?
1.2 Why does this technology look so promising?
1.3 Electronics Design Automation
1.4 The game of see-saw: key challenges vs benefits for emerging reconfigurable nanotechnologies
1.4.1 Abstracting ambipolarity in logic gate designs
1.4.2 Enabling electronic design automation for RFETs
1.4.3 Enhanced functionality: a suitable fit for hardware security applications
1.5 Research questions
1.6 Entire RFET-centric EDA Flow
1.7 Key Contributions and Thesis Organization
2 Preliminaries
2.1 Reconfigurable Nanotechnology
2.1.1 1D devices
2.1.2 2D devices
2.1.3 Factors favoring circuit-flexibility
2.2 Feasibility aspects of RFET technology
2.3 Logic Synthesis Preliminaries
2.3.1 Circuit Model
2.3.2 Boolean Algebra
2.3.3 Monotone Function and the property of Unateness
2.3.4 Logic Representations
3 Exploring Circuit Design Topologies for RFETs
3.1 Contributions
3.2 Organization
3.3 Related Works
3.4 Exploring design topologies for combinational circuits: functionality-enhanced logic gates
3.4.1 List of Combinational Functionality-Enhanced Logic Gates based on RFETs
3.4.2 Estimation of gate delay using the logical effort theory
3.5 Invariable design of Inverters
3.6 Sequential Circuits
3.6.1 Dual edge-triggered TSPC-based D-flip flop
3.6.2 Exploiting RFET’s ambipolarity for metastability
3.7 Evaluations
3.7.1 Evaluation of combinational logic gates
3.7.2 Novel design of 1-bit ALU
3.7.3 Comparison of the sequential circuit with an equivalent CMOS-based design
3.8 Concluding remarks
4 Standard Cells and Technology Mapping
4.1 Contributions
4.2 Organization
4.3 Related Work
4.4 Standard cells based on RFETs
4.4.1 Interchangeable Pull-Up and Pull-Down Networks
4.4.2 Reconfigurable Truth-Table
4.5 Distilling standard cells
4.6 HOF-based Technology Mapping Flow for RFETs-based circuits
4.6.1 Area adjustments through inverter sharings
4.6.2 Technology Mapping Flow
4.6.3 Realizing Parameters For The Generic Library
4.6.4 Defining RFETs-based Genlib for HOF-based mapping
4.7 Experiments
4.7.1 Experiment 1: Distilling standard-cells from a benchmark suite
4.7.2 Experiment 2A: HOF-based mapping .
4.7.3 Experiment 2B: Using the distilled standard-cells during mapping
4.8 Concluding Remarks
5 Logic Synthesis with XOR-Majority Graphs
5.1 Contributions
5.2 Organization
5.3 Motivation
5.4 Background and Preliminaries
5.4.1 Terminologies
5.4.2 Self-duality in NPN classes
5.4.3 Majority logic synthesis
5.4.4 Earlier work on XMG
5.4.5 Classification of Boolean functions
5.5 Preserving Self-Duality
5.5.1 During logic synthesis
5.5.2 During versatile technology mapping
5.6 Advanced Logic synthesis techniques
5.6.1 XMG resubstitution
5.6.2 Exact XMG rewriting
5.7 Logic representation-agnostic Mapping
5.7.1 Versatile Mapper
5.7.2 Support of supergates
5.8 Creating Self-dual Benchmarks
5.9 Experiments
5.9.1 XMG-based Flow
5.9.2 Experimental Setup
5.9.3 Synthetic self-dual benchmarks
5.9.4 Cryptographic benchmark suite
5.10 Concluding remarks and future research directions
6 Physical synthesis flow and liberty generation
6.1 Contributions
6.2 Organization
6.3 Background and Related Work
6.3.1 Related Works
6.3.2 Motivation
6.4 Silicon Nanowire Reconfigurable Transistors
6.5 Layouts for Logic Gates
6.5.1 Layouts for Static Functional Logic Gates
6.5.2 Layout for Reconfigurable Logic Gate
6.6 Table Model for Silicon Nanowire RFETs
6.7 Exploring Approaches for Physical Synthesis
6.7.1 Using the Standard Place & Route Flow
6.7.2 Open-source Flow
6.7.3 Concept of Driver Cells
6.7.4 Native Approach
6.7.5 Island-based Approach
6.7.6 Utilization Factor
6.7.7 Placement of the Island on the Chip
6.8 Experiments
6.8.1 Preliminary comparison with CMOS technology
6.8.2 Evaluating different physical synthesis approaches
6.9 Results and discussions
6.9.1 Parameters Which Affect The Area
6.9.2 Use of Germanium Nanowires Channels
6.10 Concluding Remarks
7 Polymporphic Primitives for Hardware Security
7.1 Contributions
7.2 Organization
7.3 The Shift To Explore Emerging Technologies For Security
7.4 Background
7.4.1 IP protection schemes
7.4.2 Preliminaries
7.5 Security Promises
7.5.1 RFETs for logic locking (transistor-level locking)
7.5.2 RFETs for split manufacturing
7.6 Security Vulnerabilities
7.6.1 Realization of short-circuit and open-circuit scenarios in an RFET-based inverter
7.6.2 Circuit evaluation on sub-circuits
7.6.3 Reliability concerns: A consequence of short-circuit scenario
7.6.4 Implication of the proposed security vulnerability
7.7 Analytical Evaluation
7.7.1 Investigating the security promises
7.7.2 Investigating the security vulnerabilities
7.8 Concluding remarks and future research directions
8 Conclusion
8.1 Concluding Remarks
8.2 Directions for Future Work
Appendices
A Distilling standard-cells
B RFETs-based Genlib
C Layout Extraction File (.lef) for Silicon Nanowire-based RFET
D Liberty (.lib) file for Silicon Nanowire-based RFET
Reliable chip design from low powered unreliable components
The pace of technological improvement of the semiconductor market is driven by Moore’s Law, enabling chip transistor density to double every two years. The transistors would continue to decline in cost and size but increase in power. The continuous transistor scaling and extremely lower power constraints in modern Very Large Scale Integrated(VLSI) chips can potentially supersede the benefits of the technology shrinking due to reliability issues. As VLSI technology scales into nanoscale regime, fundamental physical limits are approached, and higher levels of variability, performance degradation, and higher rates of manufacturing defects are experienced. Soft errors, which traditionally affected only the memories, are now also resulting in logic circuit reliability degradation. A solution to these limitations is to integrate reliability assessment techniques into the Integrated Circuit(IC) design flow. This thesis investigates four aspects of reliability driven circuit design: a)Reliability estimation; b) Reliability optimization; c) Fault-tolerant techniques, and d) Delay degradation analysis. To guide the reliability driven synthesis and optimization of combinational circuits, highly accurate probability based reliability estimation methodology christened Conditional Probabilistic Error Propagation(CPEP) algorithm is developed to compute the impact of gate failures on the circuit output. CPEP guides the proposed rewriting based logic optimization algorithm employing local transformations. The main idea behind this methodology is to replace parts of the circuit with functionally equivalent but more reliable counterparts chosen from a precomputed subset of Negation-Permutation-Negation(NPN) classes of 4-variable functions. Cut enumeration and Boolean matching driven by reliability-aware optimization algorithm are used to identify the best possible replacement candidates. Experiments on a set of MCNC benchmark circuits and 8051 functional microcontroller units indicate that the proposed framework can achieve up to 75% reduction of output error probability. On average, about 14% SER reduction is obtained at the expense of very low area overhead of 6.57% that results in 13.52% higher power consumption. The next contribution of the research describes a novel methodology to design fault tolerant circuitry by employing the error correction codes known as Codeword Prediction Encoder(CPE). Traditional fault tolerant techniques analyze the circuit reliability issue from a static point of view neglecting the dynamic errors. In the context of communication and storage, the study of novel methods for reliable data transmission under unreliable hardware is an increasing priority. The idea of CPE is adapted from the field of forward error correction for telecommunications focusing on both encoding aspects and error correction capabilities. The proposed Augmented Encoding solution consists of computing an augmented codeword that contains both the codeword to be transmitted on the channel and extra parity bits. A Computer Aided Development(CAD) framework known as CPE simulator is developed providing a unified platform that comprises a novel encoder and fault tolerant LDPC decoders. Experiments on a set of encoders with different coding rates and different decoders indicate that the proposed framework can correct all errors under specific scenarios. On average, about 1000 times improvement in Soft Error Rate(SER) reduction is achieved. Last part of the research is the Inverse Gaussian Distribution(IGD) based delay model applicable to both combinational and sequential elements for sub-powered circuits. The Probability Density Function(PDF) based delay model accurately captures the delay behavior of all the basic gates in the library database. The IGD model employs these necessary parameters, and the delay estimation accuracy is demonstrated by evaluating multiple circuits. Experiments results indicate that the IGD based approach provides a high matching against HSPICE Monte Carlo simulation results, with an average error less than 1.9% and 1.2% for the 8-bit Ripple Carry Adder(RCA), and 8-bit De-Multiplexer(DEMUX) and Multiplexer(MUX) respectively
Transient error mitigation by means of approximate logic circuits
Mención Internacional en el tÃtulo de doctorThe technological advances in the manufacturing of electronic circuits have allowed to
greatly improve their performance, but they have also increased the sensitivity of electronic
devices to radiation-induced errors. Among them, the most common effects are
the SEEs, i.e., electrical perturbations provoked by the strike of high-energy particles,
which may modify the internal state of a memory element (SEU) or generate erroneous
transient pulses (SET), among other effects. These events pose a threat for the reliability
of electronic circuits, and therefore fault-tolerance techniques must be applied to
deal with them.
The most common fault-tolerance techniques are based in full replication (DWC or
TMR). These techniques are able to cover a wide range of failure mechanisms present
in electronic circuits. However, they suffer from high overheads in terms of area and
power consumption. For this reason, lighter alternatives are often sought at the expense
of slightly reducing reliability for the least critical circuit sections. In this context a new
paradigm of electronic design is emerging, known as approximate computing, which
is based on improving the circuit performance in change of slight modifications of the
intended functionality. This is an interesting approach for the design of lightweight
fault-tolerant solutions, which has not been yet studied in depth.
The main goal of this thesis consists in developing new lightweight fault-tolerant
techniques with partial replication, by means of approximate logic circuits. These
circuits can be designed with great flexibility. This way, the level of protection as
well as the overheads can be adjusted at will depending on the necessities of each
application. However, finding optimal approximate circuits for a given application is
still a challenge.
In this thesis a method for approximate circuit generation is proposed, denoted
as fault approximation, which consists in assigning constant logic values to specific
circuit lines. On the other hand, several criteria are developed to generate the most
suitable approximate circuits for each application, by using this fault approximation
mechanism. These criteria are based on the idea of approximating the least testable
sections of circuits, which allows reducing overheads while minimising the loss of reliability.
Therefore, in this thesis the selection of approximations is linked to testability
measures.
The first criterion for fault selection developed in this thesis uses static testability
measures. The approximations are generated from the results of a fault simulation of
the target circuit, and from a user-specified testability threshold. The amount of approximated
faults depends on the chosen threshold, which allows to generate approximate circuits with different performances. Although this approach was initially intended for
combinational circuits, an extension to sequential circuits has been performed as well,
by considering the flip-flops as both inputs and outputs of the combinational part of
the circuit. The experimental results show that this technique achieves a wide scalability,
and an acceptable trade-off between reliability versus overheads. In addition, its
computational complexity is very low.
However, the selection criterion based in static testability measures has some drawbacks.
Adjusting the performance of the generated approximate circuits by means of
the approximation threshold is not intuitive, and the static testability measures do not
take into account the changes as long as faults are approximated. Therefore, an alternative
criterion is proposed, which is based on dynamic testability measures. With this
criterion, the testability of each fault is computed by means of an implication-based
probability analysis. The probabilities are updated with each new approximated fault,
in such a way that on each iteration the most beneficial approximation is chosen, that
is, the fault with the lowest probability. In addition, the computed probabilities allow
to estimate the level of protection against faults that the generated approximate circuits
provide. Therefore, it is possible to generate circuits which stick to a target error rate.
By modifying this target, circuits with different performances can be obtained. The
experimental results show that this new approach is able to stick to the target error rate
with reasonably good precision. In addition, the approximate circuits generated with
this technique show better performance than with the approach based in static testability
measures. In addition, the fault implications have been reused too in order to
implement a new type of logic transformation, which consists in substituting functionally
similar nodes.
Once the fault selection criteria have been developed, they are applied to different
scenarios. First, an extension of the proposed techniques to FPGAs is performed,
taking into account the particularities of this kind of circuits. This approach has been
validated by means of radiation experiments, which show that a partial replication with
approximate circuits can be even more robust than a full replication approach, because
a smaller area reduces the probability of SEE occurrence. Besides, the proposed
techniques have been applied to a real application circuit as well, in particular to the
microprocessor ARM Cortex M0. A set of software benchmarks is used to generate
the required testability measures. Finally, a comparative study of the proposed approaches
with approximate circuit generation by means of evolutive techniques have
been performed. These approaches make use of a high computational capacity to generate
multiple circuits by trial-and-error, thus reducing the possibility of falling into
local minima. The experimental results demonstrate that the circuits generated with
evolutive approaches are slightly better in performance than the circuits generated with
the techniques here proposed, although with a much higher computational effort.
In summary, several original fault mitigation techniques with approximate logic
circuits are proposed. These approaches are demonstrated in various scenarios, showing
that the scalability and adaptability to the requirements of each application are their
main virtuesLos avances tecnológicos en la fabricación de circuitos electrónicos han permitido mejorar
en gran medida sus prestaciones, pero también han incrementado la sensibilidad
de los mismos a los errores provocados por la radiación. Entre ellos, los más comunes
son los SEEs, perturbaciones eléctricas causadas por el impacto de partÃculas de alta
energÃa, que entre otros efectos pueden modificar el estado de los elementos de memoria
(SEU) o generar pulsos transitorios de valor erróneo (SET). Estos eventos suponen
un riesgo para la fiabilidad de los circuitos electrónicos, por lo que deben ser tratados
mediante técnicas de tolerancia a fallos.
Las técnicas de tolerancia a fallos más comunes se basan en la replicación completa
del circuito (DWC o TMR). Estas técnicas son capaces de cubrir una amplia variedad
de modos de fallo presentes en los circuitos electrónicos. Sin embargo, presentan un
elevado sobrecoste en área y consumo. Por ello, a menudo se buscan alternativas más
ligeras, aunque no tan efectivas, basadas en una replicación parcial. En este contexto
surge una nueva filosofÃa de diseño electrónico, conocida como computación aproximada,
basada en mejorar las prestaciones de un diseño a cambio de ligeras modificaciones
de la funcionalidad prevista. Es un enfoque atractivo y poco explorado para el diseño
de soluciones ligeras de tolerancia a fallos.
El objetivo de esta tesis consiste en desarrollar nuevas técnicas ligeras de tolerancia
a fallos por replicación parcial, mediante el uso de circuitos lógicos aproximados. Estos
circuitos se pueden diseñar con una gran flexibilidad. De este forma, tanto el nivel de
protección como el sobrecoste se pueden regular libremente en función de los requisitos
de cada aplicación. Sin embargo, encontrar los circuitos aproximados óptimos para
cada aplicación es actualmente un reto.
En la presente tesis se propone un método para generar circuitos aproximados, denominado
aproximación de fallos, consistente en asignar constantes lógicas a ciertas
lÃneas del circuito. Por otro lado, se desarrollan varios criterios de selección para, mediante
este mecanismo, generar los circuitos aproximados más adecuados para cada
aplicación. Estos criterios se basan en la idea de aproximar las secciones menos testables
del circuito, lo que permite reducir los sobrecostes minimizando la perdida de
fiabilidad. Por tanto, en esta tesis la selección de aproximaciones se realiza a partir de
medidas de testabilidad.
El primer criterio de selección de fallos desarrollado en la presente tesis hace uso de
medidas de testabilidad estáticas. Las aproximaciones se generan a partir de los resultados
de una simulación de fallos del circuito objetivo, y de un umbral de testabilidad
especificado por el usuario. La cantidad de fallos aproximados depende del umbral escogido, lo que permite generar circuitos aproximados con diferentes prestaciones.
Aunque inicialmente este método ha sido concebido para circuitos combinacionales,
también se ha realizado una extensión a circuitos secuenciales, considerando los biestables
como entradas y salidas de la parte combinacional del circuito. Los resultados
experimentales demuestran que esta técnica consigue una buena escalabilidad, y unas
prestaciones de coste frente a fiabilidad aceptables. Además, tiene un coste computacional
muy bajo.
Sin embargo, el criterio de selección basado en medidas estáticas presenta algunos
inconvenientes. No resulta intuitivo ajustar las prestaciones de los circuitos aproximados
a partir de un umbral de testabilidad, y las medidas estáticas no tienen en cuenta los
cambios producidos a medida que se van aproximando fallos. Por ello, se propone un
criterio alternativo de selección de fallos, basado en medidas de testabilidad dinámicas.
Con este criterio, la testabilidad de cada fallo se calcula mediante un análisis de probabilidades
basado en implicaciones. Las probabilidades se actualizan con cada nuevo
fallo aproximado, de forma que en cada iteración se elige la aproximación más favorable,
es decir, el fallo con menor probabilidad. Además, las probabilidades calculadas
permiten estimar la protección frente a fallos que ofrecen los circuitos aproximados
generados, por lo que es posible generar circuitos que se ajusten a una tasa de fallos
objetivo. Modificando esta tasa se obtienen circuitos aproximados con diferentes prestaciones.
Los resultados experimentales muestran que este método es capaz de ajustarse
razonablemente bien a la tasa de fallos objetivo. Además, los circuitos generados
con esta técnica muestran mejores prestaciones que con el método basado en medidas
estáticas. También se han aprovechado las implicaciones de fallos para implementar
un nuevo tipo de transformación lógica, consistente en sustituir nodos funcionalmente
similares.
Una vez desarrollados los criterios de selección de fallos, se aplican a distintos
campos. En primer lugar, se hace una extensión de las técnicas propuestas para FPGAs,
teniendo en cuenta las particularidades de este tipo de circuitos. Esta técnica se ha validado
mediante experimentos de radiación, los cuales demuestran que una replicación
parcial con circuitos aproximados puede ser incluso más robusta que una replicación
completa, ya que un área más pequeña reduce la probabilidad de SEEs. Por otro lado,
también se han aplicado las técnicas propuestas en esta tesis a un circuito de aplicación
real, el microprocesador ARM Cortex M0, utilizando un conjunto de benchmarks
software para generar las medidas de testabilidad necesarias. Por ´último, se realiza un
estudio comparativo de las técnicas desarrolladas con la generación de circuitos aproximados
mediante técnicas evolutivas. Estas técnicas hacen uso de una gran capacidad
de cálculo para generar múltiples circuitos mediante ensayo y error, reduciendo la posibilidad
de caer en algún mÃnimo local. Los resultados confirman que, en efecto, los
circuitos generados mediante técnicas evolutivas son ligeramente mejores en prestaciones
que con las técnicas aquà propuestas, pero con un coste computacional mucho
mayor.
En definitiva, se proponen varias técnicas originales de mitigación de fallos mediante
circuitos aproximados. Se demuestra que estas técnicas tienen diversas aplicaciones,
haciendo de la flexibilidad y adaptabilidad a los requisitos de cada aplicación
sus principales virtudes.Programa Oficial de Doctorado en IngenierÃa Eléctrica, Electrónica y AutomáticaPresidente: Raoul Velazco.- Secretario: Almudena Lindoso Muñoz.- Vocal: Jaume Segura Fuste
Algorithms and architectures for the multirate additive synthesis of musical tones
In classical Additive Synthesis (AS), the output signal is the sum of a large number of independently controllable sinusoidal partials. The advantages of AS for music synthesis are well known as is the high computational cost. This thesis is concerned with the computational optimisation of AS by multirate DSP techniques. In note-based music synthesis, the expected bounds of the frequency trajectory of each partial in a finite lifecycle tone determine critical time-invariant partial-specific sample rates which are lower than the conventional rate (in excess of 40kHz) resulting in computational savings. Scheduling and interpolation (to suppress quantisation noise) for many sample rates is required, leading to the concept of Multirate Additive Synthesis (MAS) where these overheads are minimised by synthesis filterbanks which quantise the set of available sample rates. Alternative AS optimisations are also appraised. It is shown that a hierarchical interpretation of the QMF filterbank preserves AS generality and permits efficient context-specific adaptation of computation to required note dynamics. Practical QMF implementation and the modifications necessary for MAS are discussed. QMF transition widths can be logically excluded from the MAS paradigm, at a cost. Therefore a novel filterbank is evaluated where transition widths are physically excluded. Benchmarking of a hypothetical orchestral synthesis application provides a tentative quantitative analysis of the performance improvement of MAS over AS. The mapping of MAS into VLSI is opened by a review of sine computation techniques. Then the functional specification and high-level design of a conceptual MAS Coprocessor (MASC) is developed which functions with high autonomy in a loosely-coupled master- slave configuration with a Host CPU which executes filterbanks in software. Standard hardware optimisation techniques are used, such as pipelining, based upon the principle of an application-specific memory hierarchy which maximises MASC throughput
A Structured Design Methodology for High Performance VLSI Arrays
abstract: The geometric growth in the integrated circuit technology due to transistor scaling also with system-on-chip design strategy, the complexity of the integrated circuit has increased manifold. Short time to market with high reliability and performance is one of the most competitive challenges. Both custom and ASIC design methodologies have evolved over the time to cope with this but the high manual labor in custom and statistic design in ASIC are still causes of concern. This work proposes a new circuit design strategy that focuses mostly on arrayed structures like TLB, RF, Cache, IPCAM etc. that reduces the manual effort to a great extent and also makes the design regular, repetitive still achieving high performance. The method proposes making the complete design custom schematic but using the standard cells. This requires adding some custom cells to the already exhaustive library to optimize the design for performance. Once schematic is finalized, the designer places these standard cells in a spreadsheet, placing closely the cells in the critical paths. A Perl script then generates Cadence Encounter compatible placement file. The design is then routed in Encounter. Since designer is the best judge of the circuit architecture, placement by the designer will allow achieve most optimal design. Several designs like IPCAM, issue logic, TLB, RF and Cache designs were carried out and the performance were compared against the fully custom and ASIC flow. The TLB, RF and Cache were the part of the HEMES microprocessor.Dissertation/ThesisPh.D. Electrical Engineering 201
Architectural Exploration of KeyRing Self-Timed Processors
RÉSUMÉ
Les dernières décennies ont vu l’augmentation des performances des processeurs contraintes
par les limites imposées par la consommation d’énergie des systèmes électroniques : des très
basses consommations requises pour les objets connectés, aux budgets de dépenses électriques
des serveurs, en passant par les limitations thermiques et la durée de vie des batteries des
appareils mobiles. Cette forte demande en processeurs efficients en énergie, couplée avec
les limitations de la réduction d’échelle des transistors—qui ne permet plus d’améliorer les
performances à densité de puissance constante—, conduit les concepteurs de circuits intégrés
à explorer de nouvelles microarchitectures permettant d’obtenir de meilleures performances
pour un budget énergétique donné. Cette thèse s’inscrit dans cette tendance en proposant
une nouvelle microarchitecture de processeur, appelée KeyRing, conçue avec l’intention de
réduire la consommation d’énergie des processeurs.
La fréquence d’opération des transistors dans les circuits intégrés est proportionnelle à leur
consommation dynamique d’énergie. Par conséquent, les techniques de conception permettant
de réduire dynamiquement le nombre de transistors en opération sont très largement
adoptées pour améliorer l’efficience énergétique des processeurs. La technique de clock-gating
est particulièrement usitée dans les circuits synchrones, car elle réduit l’impact de l’horloge
globale, qui est la principale source d’activité. La microarchitecture KeyRing présentée dans
cette thèse utilise une méthode de synchronisation décentralisée et asynchrone pour réduire
l’activité des circuits. Elle est dérivée du processeur AnARM, un processeur développé par
Octasic sur la base d’une microarchitecture asynchrone ad hoc. Bien qu’il soit plus efficient
en énergie que des alternatives synchrones, le AnARM est essentiellement incompatible avec
les méthodes de synthèse et d’analyse temporelle statique standards. De plus, sa technique
de conception ad hoc ne s’inscrit que partiellement dans les paradigmes de conceptions asynchrones.
Cette thèse propose une approche rigoureuse pour définir les principes généraux
de cette technique de conception ad hoc, en faisant levier sur la littérature asynchrone. La
microarchitecture KeyRing qui en résulte est développée en association avec une méthode
de conception automatisée, qui permet de s’affranchir des incompatibilités natives existant
entre les outils de conception et les systèmes asynchrones. La méthode proposée permet de
pleinement mettre à profit les flots de conception standards de l’industrie microélectronique
pour réaliser la synthèse et la vérification des circuits KeyRing. Cette thèse propose également
des protocoles expérimentaux, dont le but est de renforcer la relation de causalité
entre la microarchitecture KeyRing et une réduction de la consommation énergétique des
processeurs, comparativement à des alternatives synchrones équivalentes.----------ABSTRACT
Over the last years, microprocessors have had to increase their performances while keeping
their power envelope within tight bounds, as dictated by the needs of various markets: from
the ultra-low power requirements of the IoT, to the electrical power consumption budget
in enterprise servers, by way of passive cooling and day-long battery life in mobile devices.
This high demand for power-efficient processors, coupled with the limitations of technology
scaling—which no longer provides improved performances at constant power densities—, is
leading designers to explore new microarchitectures with the goal of pulling more performances
out of a fixed power budget. This work enters into this trend by proposing a new
processor microarchitecture, called KeyRing, having a low-power design intent.
The switching activity of integrated circuits—i.e. transistors switching on and off—directly
affects their dynamic power consumption. Circuit-level design techniques such as clock-gating
are widely adopted as they dramatically reduce the impact of the global clock in synchronous
circuits, which constitutes the main source of switching activity. The KeyRing microarchitecture
presented in this work uses an asynchronous clocking scheme that relies on decentralized
synchronization mechanisms to reduce the switching activity of circuits. It is derived from
the AnARM, a power-efficient ARM processor developed by Octasic using an ad hoc asynchronous
microarchitecture. Although it delivers better power-efficiency than synchronous
alternatives, it is for the most part incompatible with standard timing-driven synthesis and
Static Timing Analysis (STA). In addition, its design style does not fit well within the existing
asynchronous design paradigms. This work lays the foundations for a more rigorous
definition of this rather unorthodox design style, using circuits and methods coming from the
asynchronous literature. The resulting KeyRing microarchitecture is developed in combination
with Electronic Design Automation (EDA) methods that alleviate incompatibility issues
related to ad hoc clocking, enabling timing-driven optimizations and verifications of KeyRing
circuits using industry-standard design flows. In addition to bridging the gap with standard
design practices, this work also proposes comprehensive experimental protocols that aims to
strengthen the causal relation between the reported asynchronous microarchitecture and a
reduced power consumption compared with synchronous alternatives.
The main achievement of this work is a framework that enables the architectural exploration
of circuits using the KeyRing microarchitecture
High level compilation for gate reconfigurable architectures
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001.Includes bibliographical references (p. 205-215).A continuing exponential increase in the number of programmable elements is turning management of gate-reconfigurable architectures as "glue logic" into an intractable problem; it is past time to raise this abstraction level. The physical hardware in gate-reconfigurable architectures is all low level - individual wires, bit-level functions, and single bit registers - hence one should look to the fetch-decode-execute machinery of traditional computers for higher level abstractions. Ordinary computers have machine-level architectural mechanisms that interpret instructions - instructions that are generated by a high-level compiler. Efficiently moving up to the next abstraction level requires leveraging these mechanisms without introducing the overhead of machine-level interpretation. In this dissertation, I solve this fundamental problem by specializing architectural mechanisms with respect to input programs. This solution is the key to efficient compilation of high-level programs to gate reconfigurable architectures. My approach to specialization includes several novel techniques. I develop, with others, extensive bitwidth analyses that apply to registers, pointers, and arrays. I use pointer analysis and memory disambiguation to target devices with blocks of embedded memory. My approach to memory parallelization generates a spatial hierarchy that enables easier-to-synthesize logic state machines with smaller circuits and no long wires.(cont.) My space-time scheduling approach integrates the techniques of high-level synthesis with the static routing concepts developed for single-chip multiprocessors. Using DeepC, a prototype compiler demonstrating my thesis, I compile a new benchmark suite to Xilinx Virtex FPGAs. Resulting performance is comparable to a custom MIPS processor, with smaller area (40 percent on average), higher evaluation speeds (2.4x), and lower energy (18x) and energy-delay (45x). Specialization of advanced mechanisms results in additional speedup, scaling with hardware area, at the expense of power. For comparison, I also target IBM's standard cell SA-27E process and the RAW microprocessor. Results include sensitivity analysis to the different mechanisms specialized and a grand comparison between alternate targets.by Jonathan William Babb.Ph.D