**UNIVERSITY OF OSLO Department of Informatics**

## Low Power CMOS Design

Exploring Radiation Tolerance in a 90 nm Low Power Commercial Process

Master thesis

Amir Hasanbegovic

June 2, 2010

## **Abstract**

This thesis examines the radiation tolerance of low power digital CMOS circuits in a commercial 90 nm low power triple-well process from TSMC. By combining supply voltage scaling and Radiation-Hardened By Design (RHBD) techniques, the goal is to achieve radiation tolerant circuit behavior at low supply voltage. The target circuit architecture for comparison between different radiation hardening techniques is a Successive Approximation Register (SAR) architecture comprising both combinational and sequential logic. The purpose of the SAR architecture is to emulate a larger system, since larger systems are usually composed of combinational and sequential building blocks. The method used for achieving low power operation is primarily voltage scaling, with the ultimate goal of reaching subthreshold operation while maintaining radiation tolerant circuit behavior. Radiation hardening is performed at the circuit level by applying RHBD circuit topologies, as well as through architectural-level mitigation techniques.

This thesis includes three papers within the field of robust low power CMOS design. Two of the papers cover low power level shifter designs in 90 nm and 65 nm processes from ST Microelectronics. The third paper examines memory element design using minority-3 gates and inverters for robust low voltage operation.

Prototyping has been conducted on low power CMOS building blocks, including level shifter and memory design, for potential use in future radiation tolerant designs. Prototyping has been conducted on two chips from two different 90 nm processes from ST Microelectronics and TSMC. A test setup for radiation induced errors has been developed. Experimental radiation tests of the SAR architectures were conducted at SAFE, revealing no radiation induced errors.
## **Preface**

This thesis is submitted as part of the degree Master of Science in Microelectronics, to the Department of Informatics, Faculty of Mathematics and Natural Sciences, University of Oslo (UiO). The project was initiated in February 2009 and concluded the following year, in June 2010.

The thesis work has been both inspiring and challenging. Among other things, it resulted in two published papers and an invited paper to the MICPRO Elsevier journal. I have been fortunate enough to take part in the design, tapeout and prototyping of two chips from two different processes (ST Microelectronics and TSMC). Moreover, working with the "Robust Ultra-Low-Power Circuits for Nano-Scale CMOS Technologies" project through participation in the DAAD researcher exchange program between Norway and Germany (project number: 193831) has been very educational and inspiring.

I first and foremost want to express my gratitude to my supervisor, Snorre Aunet, for all the valuable guidance and inspiration he has given me throughout this thesis. I also want to thank Tom Arne Danielsen for his assistance with VHDL coding. Special thanks go to associate professor Torfinn Lindem from the Department of Physics, UiO, and professor Jon Petter Omtvedt from the Department of Chemistry, UiO, for assisting in the radiation testing at SAFE. I am very grateful to Hans K. Otnes Berge and Olav Stanly Kyrvestad for providing me with helpful information regarding many theoretical and technical aspects encountered in the course of this thesis work. I also want to thank the students at the lab, Mats, Dag and Geir, for both relevant and irrelevant discussions during my stay at the Nanoelectronics group.

On a more personal note, I want to thank my family for their support throughout my master thesis work.
## **Contents**

- Abstract
- Preface
- 1 Introduction
  - 1.1 Motivation
  - 1.2 Thesis outline
- 2 Low Power- and Radiation Tolerant CMOS Design
  - 2.1 Low Power Digital CMOS
    - 2.1.1 Subthreshold Operation
  - 2.2 Radiation Environments
  - 2.3 Radiation Effects In Electronics
    - 2.3.1 Single Event Transients
    - 2.3.2 Single Event Upsets
    - 2.3.3 Multiple Bit Upset
    - 2.3.4 Single Event Latchup
    - 2.3.5 Total Ionizing Dose Effects
  - 2.4 Radiation Tolerant CMOS Design
- 3 Transistor -Structures and -Properties in RHBD Applications
  - 3.1 Linear CMOS devices
    - 3.1.1 Impact of technology scaling on radiation tolerance
    - 3.1.2 Radiation induced leakage
  - 3.2 Alternative CMOS devices
- 4 Circuit- and Architectural Level Hardening of SAR Architectures
  - 4.1 Circuit level radiation hardening of logic blocks
    - 4.1.1 SET mitigation
    - 4.1.2 SEU mitigation
  - 4.2 Architectural level hardening
    - 4.2.1 Triple Modular Redundancy
    - 4.2.2 Dual Modular Redundancy
  - 4.3 SEE Layout Considerations
  - 4.4 SAR architecture
- 5 Soft Error Simulation- and Test Methodology
  - 5.1 Simulation Methodology
  - 5.2 Template-based Soft Error Characterization
    - Soft Error Detection using a FPGA
- 6 Results
  - 6.1 Papers
    - 6.1.1 Paper I: Low-Power Subthreshold to Above Threshold Level Shifter in 90 nm Process
    - 6.1.2 Paper II: Memory Elements Based on Minority-3 Gates and Inverters Implemented in 90 nm CMOS
    - 6.1.3 Paper III
  - 6.2 Irradiation of SAR architectures
- 7 Discussion
- 8 Conclusion
- A VHDL code for template-based soft error characterization
- B Matlab code for error detection
- C PCB Layout
- D ST Microelectronics chip (90 nm general purpose process)
- E TSMC chip (90 nm low power process)

## List of Figures

- 1.1 Multiple voltage domains interfaced with level shifters
- 2.1 Earth's radiation environment
- 2.2 Terrestrial radiation environment
- 2.3 Principle behind charge collection in an n+/p junction immediately after a particle strike
- 3.1 Lateral leakage paths, NMOS intra-device leakage
- 3.2 Modeling of lateral leakage paths
- 3.3 Lateral leakage paths, NMOS intra- and inter-device leakage
- 4.1 C-element
- 5.1 Particle hit induced current
- 6.1 Preliminary error detection of the TMR SAR architecture with 1.2 V supply voltage
- 6.2 Preliminary error detection of the TMR SAR architecture with 350 mV supply voltage
- C.1 PCB layout
- D.1 ST Microelectronics chip with level shifter and a D flip flop based minority-3 gates and inverters
- E.1 TSMC chip for radiation testing

## Chapter 1

## Introduction

The increasing demand for portability and extended battery life has been a major driving force for optimization between power consumption, performance and
reliability. The use of multiple voltage- and clock domains is a good way to optimize device performance and reliability, while maintaining low power consumption [3]. Such implementations dedicate low supply voltage to circuits that operate at low speed in order to save power, while higher supply voltages are dedicated to circuits that have higher speed requirements at the expense of increased power consumption [4]. The use of multiple voltage domains increases circuit complexity, and therefore requires thorough performance and reliability analysis. Level shifters need to be inserted to interface the different voltage domains, and digital logic blocks need to be designed for their intended supply voltage [3,5,6]. Dynamic voltage scaling enables further optimization between performance and power saving by varying the supply voltage based on the speed requirements of the integrated circuit (IC) at any given time. Figure 1.1 shows the principle behind multiple voltage domain utilization.

**Figure 1.1** Multiple voltage domains interfaced with level shifters

Low power IC design is also of interest to the radiation tolerant market. Radiation tolerance of standard circuits is however strongly dependent on the amount of current flowing through each node in the circuit. For this reason, low power and radiation tolerant circuit implementations often have conflicting design requirements, and therefore incur increased complexity and overhead compared to equivalent standard circuits. Contemporary technology scaling trends, which have contributed to smaller feature sizes, reduced supply voltages and higher density designs, have made deep submicron ICs increasingly susceptible to radiation induced errors.
Such errors are of main concern in memory design; however, standard logic circuits are also becoming more sensitive to radiation induced errors, and this needs to be taken into account when designing circuits for applications requiring high reliability. Parity and error-correcting code (ECC) are usually used in memory designs in order to protect memory from radiation induced errors. Nonetheless, when the peripheral combinational and sequential logic exhibits poor radiation tolerance, the effectiveness of ECC is limited [7]. Reports have shown that the reliability of Earth based designs has been compromised<sup>1</sup>, sending a clear message to the electronics industry that radiation induced errors are of great concern for further advancement of IC design. Moreover, state of the art automotive and medical devices, such as implantable biomedical devices, must be hardened against radiation induced errors in order to minimize the risk of harm to humans<sup>2</sup>.

Comparable Earth based applications initially have the lowest radiation tolerance requirements. The radiation tolerance requirements increase as the application areas get more specific, such as military use and nuclear power plants. For applications such as satellite communication, the radiation tolerance requirements may become relatively higher because of the value and cost implications of such projects. System failures that occur in Earth based applications are more accessible for repair than failures occurring in space applications. Therefore, the radiation tolerance requirements of systems found in space applications are often relatively higher, in order to minimize the probability that repairs will be needed.
Current research efforts within the field of radiation tolerant electronics design are focused on meeting these requirements by using commercial process foundries instead of radiation hardened process foundries (which are dedicated to the purpose of radiation tolerance). Design and development of radiation tolerant digital custom cell libraries in commercial processes may contribute to increased efficiency for applications requiring radiation tolerance in terms of power, area, speed and time to market [8]. However, before customized cell libraries can be implemented, a thorough radiation tolerance characterization of the commercial process needs to be conducted. The impact of different types of radiation effects needs to be identified, and techniques for mitigating radiation effects need to be employed. Simulation- and physical verification methodologies are essential in order to be able to predict the radiation tolerance of such cell libraries.

#### 1.1 Motivation

Technology and market trends have altered the way radiation tolerance is achieved in present day electronics design. The design trend is shifting towards radiation hardened by design (RHBD) methodologies and away from radiation hardened by process (RHBP) methodologies. Over the past few decades, the RHBP IC market has been substantially reduced. There are several reasons for this trend, the most significant being the exponential growth of the consumer IC market, which has attracted many semiconductor suppliers [9]. Environmental directives such as RoHS and WEEE have indirectly restricted the radiation hardened supplier market, because suppliers are more interested in the more lucrative commercial IC market. Moreover, unintentional consequences of the advancements in commercial semiconductor processing technologies have contributed to increased total dose hardness of commercially produced ICs.
The increase in total dose hardness has made commercial processes favorable in many applications where high reliability is required, provided that the negative aspects of radiation induced effects are tackled appropriately. Although there are positive sides to utilizing commercial processes for radiation tolerant IC design, there are also negative sides, especially related to the technology scaling trends. The reduction of the supply voltage makes circuits more vulnerable to radiation induced particle hits, which is an effect that needs to be countered using specialized circuit, architectural and layout topologies, i.e. RHBD.

<sup>1</sup>Latchup on CISCO routers in 2004 (www.cisco.com)

<sup>2</sup>Pacemakers experienced neutron-induced shutdowns (IEEE TNS 96)

RHBD circuits do impose an area, speed and power consumption overhead compared to the equivalent unhardened circuits in the same process. Nevertheless, because radiation hardened process foundries lag 2-3 generations behind commercial process foundries in terms of performance (and availability), commercial process flows are attractive for meeting radiation tolerance requirements while increasing performance [10, 9]. The ultimate goal of utilizing commercial IC processes is to enable cheaper and faster design and production of radiation tolerant electronics, while maintaining the best possible performance and minimizing RHBD penalties.

#### 1.2 Thesis outline

**Chapter 2** gives a brief introduction to low power- and radiation tolerant CMOS design. A brief introduction to radiation effects in electronics is also provided.

**Chapter 3** presents the most substantial transistor properties in digital RHBD applications.

**Chapter 4** presents circuit- and architectural level SET and SEU mitigation techniques, which are applied to three SAR architectures for SEU characterization.
**Chapter 5** gives an overview of the soft error simulation- and test methodology used to evaluate the SEU response of the SAR architectures.

**Chapter 6** presents some of the results obtained during this thesis work. Three papers are included, as well as the results from the SEU characterization of SAR architectures.

**Chapter 7** is a discussion of the thesis contributions.

**Chapter 8** concludes this thesis work and provides some remarks regarding future work.

Five appendices are also included:

**Appendix A** shows a part of the VHDL code for template-based soft error characterization.

**Appendix B** shows a part of the Matlab code for error detection.

**Appendix C** shows the PCB layout.

**Appendix D** shows the layout of the ST Microelectronics chip (90 nm general purpose process).

**Appendix E** shows the layout of the TSMC chip (90 nm low power process).

A CD has been enclosed with additional material, including the full VHDL code for the template-based soft error characterization, the Matlab code necessary for running the experiment, as well as additional measurement results.

## Chapter 2

## Low Power- and Radiation Tolerant CMOS Design

## 2.1 Low Power Digital CMOS

The increasing requirement of portability has been a major driving force for development in low power digital CMOS design over the last few decades. Low power IC design incorporates many different design techniques in order to achieve low power consumption while maintaining reasonable speed performance. Such techniques may include utilization of Multi-threshold CMOS (MTCMOS), low power clocking, multi-supply voltages, dynamic voltage- and frequency scaling and/or standby modes [11]. Although there are many ways of attaining low power consumption, the most direct and dramatic means of reducing power consumption in ICs is by operating circuits in the subthreshold region [12, 13].
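The leverage of supply voltage scaling can be illustrated with the standard first-order model for switching power, $P_{dyn} = \alpha C V_{dd}^2 f$. This model is not stated in the thesis text above and is used here as an assumption. The following Python sketch compares 1.2 V and 350 mV, the two supply voltages named in the Chapter 6 figure captions. The activity factor, capacitance and frequency are hypothetical values chosen only for illustration, and the frequency is held constant even though a real circuit would have to run slower at the lower voltage.

```python
def dynamic_power(alpha, c_load, v_dd, freq):
    """First-order switching power in watts: P = alpha * C * Vdd^2 * f."""
    return alpha * c_load * v_dd ** 2 * freq


# Hypothetical example values, not taken from the thesis measurements.
ALPHA = 0.1      # switching activity factor (dimensionless)
C_LOAD = 1e-12   # switched capacitance in farads
FREQ = 1e6       # clock frequency in hertz (assumed equal in both cases)

p_nominal = dynamic_power(ALPHA, C_LOAD, 1.2, FREQ)   # nominal supply
p_scaled = dynamic_power(ALPHA, C_LOAD, 0.35, FREQ)   # scaled supply

# The ratio depends only on the squared supply voltage ratio.
ratio = p_nominal / p_scaled
print(f"Switching power reduction factor: {ratio:.2f}")
```

Because the other factors cancel, the reduction factor equals the square of the supply voltage ratio. Static power and the accompanying speed loss are not captured by this model.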
## 2.1.1 Subthreshold Operation

Subthreshold operation has become a well established region of operation for digital circuits when ultra low power circuit operation is in demand and speed is of secondary importance [14]. Subthreshold operation implies that the gate-source voltage, $V_{gs}$, and the power supply voltage, $V_{dd}$, are below the absolute value of the transistor's threshold voltage, $|V_T|$ [15]. By reducing the power supply voltage down to the subthreshold region, a large decrease in static and dynamic power consumption can be achieved. However, a decrease in power supply voltage also contributes to circuit reliability issues. Process- and mismatch variations gain an increased impact on circuit behavior due to exponential current dependencies in the subthreshold region. The subthreshold current $I_{Dsub}$ is given by [15]

$$I_{Dsub} = I_0 e^{\frac{V_{gs} - V_T}{nV_{th}}} \left( 1 - e^{\frac{-V_{ds}}{V_{th}}} \right) \tag{2.1}$$

where $I_0$ is the drain current when $V_{gs} = V_T$

$$I_0 = \mu_0 C_{ox} \frac{W}{L} (n-1) V_{th}^2 \tag{2.2}$$

Here $n$ is the subthreshold slope factor and $V_{th}$ is the thermal voltage.

The exponential current dependencies may lead to large variations in propagation delay, which makes design specifications hard to predict [16]. The affected specifications range from $I_{ON}/I_{OFF}$ ratio variations to setup- and hold time violations in sequential logic, and depend strongly on the choice of circuit topology. Even though subthreshold operation involves several design challenges, it remains an attractive method for achieving ultra low power CMOS circuits, given that proper design techniques are employed.

#### 2.2 Radiation Environments

Electronic devices are often exposed to radiation effects, and the most frequently associated radiation environment is the naturally-occurring space radiation environment. Put in simple terms, space electronics are exposed to radiation from the Van Allen belts, solar flares, solar wind and cosmic rays.
Spacecraft and satellites in Earth orbit encounter large amounts of radiation from the Van Allen radiation belts, which are regions of trapped protons and electrons, see Figure 2.1. The inner Van Allen belt contains mainly protons with energies up to 30 MeV, whereas the outer Van Allen belt contains fast moving electrons and slow moving ions, that are trapped in the magnetosphere, with energies which can exceed 100 MeV. The particle flux in these regions depends highly on the altitude, the orbital inclination and the solar activity.

**Figure 2.1** Earth's radiation environment

Electronic devices also encounter hazardous radiation environments in the Earth's atmosphere. Galactic cosmic rays, which originate from outside the solar system, consist of low flux, highly energetic particles with energies up to the GeV range. These particles produce intense ionization and are very hard to shield against. When cosmic rays interact with atmospheric nuclei, nuclear spallation reactions take place and induce production of high energy neutrons and pions [17]. Figure 2.2 illustrates the terrestrial nuclear cascade shower resulting from incoming cosmic rays. As in the geospace environment, the particle flux and energy in the Earth's atmosphere depend highly on altitude as well as events in the intergalactic space weather. Therefore, the radiation impact on electronics is higher for avionics applications than for ground based applications [2]. Cosmic rays can also induce soft errors in electronics at sea level [1]. The main radiation sources causing soft errors in electronics at sea level are cosmic ray induced neutrons and pions, and alpha particles emitted by microelectronic packaging [1]. As neutrons and pions interact with silicon, nuclear reactions take place and the resulting high energy secondary particles scatter in all directions. Furthermore, natural radiation environments are not the only radiation environments hazardous for electronics.
Man-made radiation environments can also induce errors in electronic devices. Such environments are encountered in, but not limited to, nuclear power plants, medical applications and warfare.

**Figure 2.2** Terrestrial radiation environment, adapted from [1]

### 2.3 Radiation Effects In Electronics

Radiation effects in electronics are often divided into several categories. Galactic cosmic rays, cosmic solar particles and trapped protons in radiation belts are the main sources of energetic particles causing single event effects (SEE). These single event effects are initially non-destructive events in electronic devices. However, radiation effects can also result in semiconductor degradation (both over time and instantly) by causing radiation damage to local or global parts of the device. Therefore, radiation induced errors in electronics are often categorized into soft and hard errors. Single event effects have been recognized since the 1950s, and their nature has been researched thoroughly since then [1].

Soft errors are commonly related to the post impact state of high energy ionizing particles interacting with semiconductor materials at locations close to sensitive circuit nodes of active devices. Figure 2.3 shows the principle behind this phenomenon. When a high energy particle passes through the silicon near a sensitive node, the particle produces a dense radial distribution of electron-hole pairs along its trajectory (Figure 2.3 a). Electrons and holes then drift towards their opposite potential via the electric field in the depletion region, thereby extending and distorting the depletion region, thus giving rise to rapid charge collection via drift (Figure 2.3 b). When the excess carriers along the particle track have been collected, recombined or diffused away from the junction area, the slower diffusion collection takes over until equilibrium is reached (Figure 2.3 c).
A current pulse is then generated as a result of the collected charge (Figure 2.3 c) and causes a voltage fluctuation at this node [7]. This is only a principal model of a particle strike, and the final result depends on many variables such as the type of particle, its initial energy, its angle of impact and process parameters.

**Figure 2.3** Principle behind charge collection in an n+/p junction immediately after a particle strike, adapted from [2]

Hard errors, on the other hand, are errors that cause destructive damage to transistors and other semiconductor materials. Hard errors often occur over time due to cumulative effects, and can therefore be avoided if the potential hard error inducing effects are discovered and corrected in time. The following subsections briefly describe the most common radiation effects, both non-destructive and destructive, encountered in electronic devices.

## 2.3.1 Single Event Transients

Single Event Transients (SET) are transient voltage fluctuations induced by charge deposition from a subatomic particle strike. The voltage fluctuations can propagate to storage elements and cause an erroneous latched logic state, resulting in a Single Event Upset (SEU) [18]. Single event transients are of great concern for analog electronics such as comparators [19]. Mitigation techniques primarily involve capacitive hardening and in some cases digital correction techniques.

## 2.3.2 Single Event Upsets

A SEU can be defined as a change in the logic state of a latched logic element from a logic one to a logic zero or vice-versa. SEUs occur when high energy particles alter the charge deposition in critical nodes in such a way that it results in a bit error. SEUs are defined to be non-destructive events, and therefore the affected logic can be rewritten or reset to regain proper operational behavior [18].
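The link between collected charge and the voltage fluctuation behind a SET or SEU can be made concrete with a deliberately crude sketch. It assumes that all collected charge lands on a single lumped node capacitance ($\Delta V = Q/C$) and that a disturbance of at least half the supply voltage may flip the logic state. Both are illustrative assumptions rather than the simulation method of this thesis, and the charge and capacitance values are hypothetical. The restoring current of the driving gate, the pulse shape and clamping at the supply rails are ignored.

```python
def voltage_disturbance(q_collected, c_node):
    """Voltage change in volts for charge q_collected (C) on capacitance c_node (F)."""
    return q_collected / c_node


def may_upset(q_collected, c_node, v_dd, threshold_fraction=0.5):
    """Return True if the disturbance reaches the assumed switching threshold.

    threshold_fraction is an assumption: the node is taken to flip when the
    disturbance reaches this fraction of the supply voltage.
    """
    return voltage_disturbance(q_collected, c_node) >= threshold_fraction * v_dd


# Hypothetical values chosen only for illustration.
Q_COLLECTED = 1e-15   # 1 fC of collected charge
C_NODE = 2e-15        # 2 fF node capacitance

for v_dd in (1.2, 0.35):
    flipped = may_upset(Q_COLLECTED, C_NODE, v_dd)
    print(f"Vdd = {v_dd} V: possible upset = {flipped}")
```

Under these assumptions the same strike stays below the threshold at 1.2 V but exceeds it at 350 mV, which matches the qualitative statement in Section 1.1 that a reduced supply voltage makes circuits more vulnerable to particle hits.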
## 2.3.3 Multiple Bit Upset

Multiple bit upsets (MBU) are upsets resulting from a single high velocity charged particle strike, passing through several sensitive nodes in an electronic device, and thereby changing the logic state in several adjacent latched logic elements. For high velocity particle strikes, the number of bit upsets has been shown to be highly dependent on the impact angle of the particle [20]. MBUs can also occur as a result of proton induced, as well as terrestrial neutron induced, nuclear reactions that produce secondary ionizing particles which interact with capacitive nodes in close proximity to the initial strike [21]. Furthermore, due to the technology feature size scaling trends, the device sensitivity to MBUs is expected to keep increasing in high density ultra deep-submicron designs [22, 23].

#### 2.3.4 Single Event Latchup

Single event latch-up is a permanent and potentially destructive state of the device under test (DUT) whereby a parasitic thyristor structure is triggered by a high energy particle strike and a low impedance, high current path is established between the power rails. A power cycle is usually sufficient to reestablish normal operation of the device. If a latch-up is left uncorrected, the high current path may result in destructive failure, referred to as Single Event Burn-out (SEBO).

#### 2.3.5 Total Ionizing Dose Effects

Total ionizing dose (TID) effects are cumulative damage effects caused by ionizing radiation over time. In semiconductor devices, ionizing radiation contaminates oxide layers by accumulation of trapped positive charge, which degrades transistor performance by causing channel inversion at the oxide/silicon interface. These inverted channels result in current conducting leakage paths, and given high enough radiation doses over a longer period of time, the trapped charge can become sufficiently high to force a transistor to be permanently open/closed.
TID effects may also result in displacement damage in electronic devices as a result of proton induced degradation mechanisms [24].

## 2.4 Radiation Tolerant CMOS Design

Throughout this decade, the space industry has been signaling an increased interest in robust ultra low power integrated circuit design for use in future space missions [25, 26, 27]. A special case of this trend has come to be known as "CMOS ULTRA-LOW POWER RADIATION TOLERANT" (CULPRIT) integrated circuits [27]. This concept revolves around scaling the power supply voltage to reduce power consumption, while at the same time scaling the threshold voltage in order to compensate for the performance loss imposed by the decrease in supply voltage. Traditional circuits used in space applications are designed using radiation hardened processes and may consume relatively large amounts of power. However, with the decline in the RHBP market and with new semiconductor technologies emerging, new mitigation techniques need to be employed in order to successfully suppress the impact of soft errors on modern electronic devices [2].

In recent years, several papers have been published exploring the potential advantages of subthreshold operation in radiation tolerant digital CMOS design [28, 29]. Experimental results have shown that the TID induced leakage current can be reduced by up to 100 times by lowering the supply voltage [28]. With the use of circuit-level soft error mitigation techniques, digital circuits operating in the subthreshold region may become relevant for hazardous radiation environments, and even space applications, in the future. In such applications, subthreshold operation may be employed by secondary systems or voltage domains, thereby freeing up power for more crucial components.
Traditional design techniques such as voltage scaling and multi supply voltage domain utilization have been proposed for realization of low power, radiation tolerant circuit designs [30]. The proposed idea is based on the use of low-VDD operation for circuits that are less sensitive to radiation induced errors, and high-VDD operation for circuits that are more sensitive to radiation induced errors. A challenge with reduced power supply voltages is the reduction of the critical charge, $Q_{crit}$, which is the amount of charge required to upset a capacitive node in a circuit [31]. $Q_{crit}$ is a supply voltage- and transistor geometry dependent parameter and will be addressed in Section 4.1.2 and in Section 5.1.

<sup>1</sup>Iridium Technologies have put radiation hardened subthreshold circuit design as one of their ultimate goals

## Chapter 3

## Transistor -Structures and -Properties in RHBD Applications

When designing circuits for radiation tolerant applications, the designer must primarily evaluate the transistor behavior for the targeted radiation environment. For these types of applications, transistor behavior is highly dependent on process choice as well as technology node [32]. The primary concern regarding transistor behavior is TID induced leakage. Several approaches have been proposed to counter the effects of TID, and thereby retain stable transistor operation in harsh radiation environments using standard commercial CMOS processes.

#### 3.1 Linear CMOS devices

The standard transistor, also referred to as a linear device, has traditionally been regarded as a poor choice for RHBD applications. The main reason for this is the impact of TID induced leakage, which is highly dependent on the device isolation technique and to some extent on the gate oxide thickness, the former being implemented using shallow trench isolation (STI) in newer processes [33, 2].
For gate oxide thicknesses above 6 nm, trapped positive charge in the gate oxide has been a big source of threshold voltage shifts in the transistor [34]. This variation in threshold voltage may potentially drive the transistor out of its intended operating region, and thereby render the transistor unreliable. In recent years however, this trend has shifted and standard CMOS transistors have become more attractive for use in radiation tolerant integrated circuit design due to the impact of technology scaling.

## 3.1.1 Impact of technology scaling on radiation tolerance

Technology scaling trends in commercial processes have brought about both positive and negative effects in terms of radiation tolerance. The TID induced leakage currents and threshold voltage shifts have been substantially reduced as a consequence of using thinner gate oxides [10]. Contemporary commercial processes exhibit little sensitivity to radiation induced threshold voltage shifts due to the reduced possibility of radiation induced positive charge getting trapped in the thin oxide layers [10]. Furthermore, the utilization of STI instead of Local Oxidation of Silicon (LOCOS) for inter-device isolation has contributed to large reductions in TID induced leakage currents. These factors are some of the main reasons RHBD has become a realistic possibility. Reports have also shown that thin gate oxides are adequate for operating in heavy ion environments, without great risk of gate oxide breakdown [35].

On the other hand, technology scaling has also led to smaller feature sizes, which in turn result in smaller nodal capacitances. This makes devices more prone to SEUs, because less energy is needed to produce a SET. Additionally, as device sizes shrink, higher density is achieved, which enables a single particle hit to upset several devices in close proximity.
These properties are further discussed in Section 4.1.2.

In contrast to thin gate oxides, isolation oxides are thicker and therefore allow TID induced positive charge to be trapped. They thereby need to be taken into account when designing circuits for radiation tolerant applications.

#### 3.1.2 Radiation induced leakage

Radiation induced leakage may be categorized into intra-device leakage and inter-device leakage, since these make up the main sources of leakage currents degrading the transistor behavior [36]. Intra-device leakage refers to the leakage currents that form between the drain and source terminals of a transistor, while inter-device leakage refers to the leakage currents that form between adjacent transistors. In a standard NMOS, lateral leakage paths can be generated due to buildup of TID induced positive trapped charge in the STI oxide sidewalls, at the Si/SiO<sub>2</sub> interface [37, 33]. Figure 3.1 illustrates intra-device leakage current in a standard MOSFET. The positive trapped charge inverts portions of the STI oxide sidewalls, thereby creating an edge leakage path between the source and drain terminals or between adjacent transistors [38].

Figure 3.1 Lateral leakage paths, NMOS intra-device leakage

These leakage paths can be modeled as lateral parasitic transistors with the same length as the actual transistor. The transistor model can however only be used as a guidance model, as the lateral leakage paths also depend on p-substrate doping and radiation dose, as well as other physical parameters such as Length of Diffusion (LOD) and Shallow Trench Isolation Width (STIW) [39]. Figure 3.2 illustrates the transistor-level modeling of lateral leakage in a single MOSFET.
Figure 3.2 Modeling of lateral leakage paths

$$W_{main} = n_{finger} \cdot W_{finger} \tag{3.1}$$

$$W_{lateral} = 2 \cdot n_{finger} \cdot W_{lateral, finger} \tag{3.2}$$

$$W_{tot} = W_{main} + W_{lateral} \tag{3.3}$$

As the radiation dose increases, a larger portion of the STI oxide sidewalls near the surface becomes inverted, resulting in higher intra- and inter-device leakage currents. Figure 3.3 illustrates the effect a large radiation dose can have on both intra- and inter-device leakage. As a result of the increased radiation dose, larger portions of the deeper STI edges become inverted, creating current conducting parasitic channels which further increase the leakage currents. It should be noted that the inversion strength is lower at the deeper levels of the STI oxide sidewall.

Figure 3.3 Lateral leakage paths, NMOS intra- and inter-device leakage

As seen in Figure 3.3, the intra- and inter-device leakage in M2 and M3 is dominated by the lateral leakage paths in and between the transistors when the transistors share active drain/source area. This also implies an increased leakage current vulnerability of transistors utilizing fingered structures, as expressed by Eq. 3.3 [40]. However, where adjacent transistors do not share drain or source and in addition are separated by STI, the STI can significantly reduce the edge leakage. Nevertheless, given a high enough radiation dose, parasitic current conducting channels would still form in the deeper portions of the STI, thereby establishing inter-device leakage paths.

For PMOS transistors, experimental results have shown that the TID induced leakage is very small compared to NMOS transistors for normal bias conditions [41, 36]. The reason is that the trapped positive charge results in a negative threshold voltage shift for PMOS transistors, and thus does not induce the previously mentioned leakage paths.
This will in turn result in a decrease in drive current for the PMOS.

TID induced leakage and threshold voltage shifts are regarded as negligible in sub-180 nm commercial processes when transistors are operated at full supply voltage and annealing effects are taken into account [10]. On the other hand, the TID induced leakage may have a significant impact on low supply voltage operation. In order to utilize supply voltage scaling for the purpose of power saving, the TID leakage currents need to be suppressed so that the minimum supply voltage, $V_{min}$, can be kept as low as possible (subthreshold for example). When operating circuits at low supply voltages, the TID induced leakage may reduce the $I_{ON}/I_{OFF}$ ratio due to degradation in the subthreshold slope.

### 3.2 Alternative CMOS devices

In order to counter the radiation induced leakage along the STI oxide sidewalls that occurs in the linear MOSFET structure, the ELT (Enclosed Layout Transistor) has been a popular transistor device for use in radiation environments in the multi-MeV range [2]. A layout of an ELT is shown in Figure 3.4.

**Figure 3.4** Layout of an ELT

Due to the physical structure of the ELT, the radiation induced leakage is substantially reduced at high TID levels compared to that of the linear transistor. The main reason is that the intra-device leakage in the ELT is not affected by the STI sidewalls, since no STI sidewalls exist between the source and drain terminals in the ELT structure. Furthermore, the utilization of ELTs substantially reduces threshold voltage shifts compared to linear transistor devices.

Even though customized transistors such as the ELT offer better total dose hardness at higher radiation doses, they also introduce complications associated with device geometries and modeling [2].
As may be apparent from Figure 3.4, the source and drain of the ELT are not symmetrical, and the current conductance of the transistor will therefore vary depending on where the drain terminal is assigned. If the drain terminal is assigned to the enclosed part of the transistor, the transistor will conduct a larger current, owing to the lower drain terminal capacitance, than it would with the source terminal in the enclosed part. Furthermore, the layout geometry of the ELT imposes a minimum W/L ratio, since the width is defined by the ring length and the length is defined by the ring width (in simplified terms). For small aspect ratios, a large area consumption is to be expected, since the only way of achieving very small aspect ratios is by increasing the length parameter. More extensive modeling of the ELT may be found in [42].

In the last couple of years, several papers have proposed new radiation tolerant transistor structures with the aim of reducing the area overhead imposed by the ELT layout topology [43] [44]. However, the device behavior still remains an issue when using commercial CAD tools for simulation without extensive alterations. The most reliable device modeling is therefore still achieved using Technology Computer Aided Design (TCAD) tools<sup>1</sup>. Standard commercial CAD tools are by default not intended for extraction of the physical geometries of ELTs, and therefore require manual netlist alterations. Moreover, even with altered extracted netlists, the embedded transistor models may not be sufficient to model the complex behavior of ELT layout topologies.
<sup>1</sup>Examples of such tools are Sentaurus TCAD from Synopsys and NanoTCAD from CFDRC.

## Chapter 4

## Circuit- and Architectural Level Hardening of SAR Architectures

This chapter focuses on the mitigation techniques selected to achieve SEU and SET tolerance in three SAR architectures, each of which employs its own mitigation technique. The mitigation techniques include circuit-level radiation hardening techniques as well as architecture-level hardening techniques. Due to technology scaling, layout considerations also need to be taken into account when employing SEU and SET mitigation techniques.

A SAR architecture, consisting of both combinational and sequential logic, is adapted from [45]. Since large systems are built up using combinational and sequential logic blocks which operate in conjunction, the SAR architecture can be regarded as a scaled model of a larger system. Analysing the SET and SEU tolerance of the hardened SAR architectures provides a better understanding of how the different mitigation techniques compare.

## 4.1 Circuit level radiation hardening of logic blocks

Circuit level radiation hardening involves the use of circuit topologies capable of withstanding the impact of subatomic particle hits. Several papers have been published on novel mitigation topologies ever since the discovery of particle hit induced errors in electronic devices [46, 50, 49]. Circuit level radiation hardening is primarily aimed at SEU and SET mitigation. Combinational logic is only vulnerable to SETs, whereas sequential logic is vulnerable to both SETs and SEUs.

## 4.1.1 SET mitigation

Very few circuit-level SET mitigation techniques have been proposed in the literature. This is mainly due to the fact that every interaction between a subatomic particle hit and a sensitive node in a circuit will result in a SET, given that the particle hit produces enough energy.
The magnitude of the impact of the SET on a circuit is closely related to the node capacitance and thereby $Q_{crit}$. The effect of a radiation induced strike on any node in a circuit can be illustrated by Eq. 4.1.

$$\Delta q = C \cdot \Delta V \tag{4.1}$$

where $\Delta q$ is the collected charge due to the radiation strike, $C$ is the effective capacitance of a given node and $\Delta V$ is the resulting voltage change at that particular node. As feature sizes shrink, the node capacitance decreases with the square of the feature size, resulting in a further reduction in $Q_{crit}$ and an increased impact on the node voltage. The increased impact on node voltage is a major contributor to single event transients at low supply voltages. This implies that increasing $Q_{crit}$ through design techniques is a good way to start hardening a circuit against radiation induced errors. This approach is often referred to as capacitive hardening [47]. There are several ways of increasing $Q_{crit}$; one of them is to increase the node capacitance by increasing the transistor size (i.e. the drain/source geometries of transistors) or by using circuit topologies that increase node capacitance [48].

Due to transistor geometries and parasitic components, SETs naturally diminish as they propagate through logic gates. However, the longer SETs propagate, the higher the probability is that they will become latched. Therefore, it is beneficial to suppress the impact of SETs by filtering them out as early as possible, thereby prohibiting SET propagation to other logic blocks, where they might cause SEUs if latched. Several papers have proposed the use of C-elements for this purpose [49, 50]. Figure 4.1 shows the schematic of a C-element.

Figure 4.1 C-element

The C-element has two inputs and one output, and the output changes its value only when both inputs change their values (i.e. it acts as an inverter when both inputs agree, and otherwise holds its previous value).
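The C-element behavior described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the function names `c_element` and `run`, the discrete time steps and the example bit streams are hypothetical and are not taken from the thesis, and the model ignores all analog effects such as pulse width and drive strength.

```python
def c_element(a, b, prev_out):
    """Inverting C-element: invert the inputs when they agree, otherwise hold."""
    if a == b:
        return 1 - a
    return prev_out


def run(a_stream, b_stream, initial_out=1):
    """Apply two input streams sample by sample and record the output."""
    out = initial_out
    trace = []
    for a, b in zip(a_stream, b_stream):
        out = c_element(a, b, out)
        trace.append(out)
    return trace


# Hypothetical example: two redundant copies of the same signal.
clean = [0, 0, 0, 1, 1, 1]      # undisturbed copy
glitched = [0, 1, 0, 1, 0, 1]   # same signal with one-sample transients at index 1 and 4
trace = run(clean, glitched)
print(trace)
```

In this idealized model, a transient that appears on only one input never reaches the output, so the output stays equal to the inverse of the undisturbed signal.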
Since a SET only results in a short voltage pulse<sup>1</sup>, the C-element is able to suppress the propagation of the SET, given that the SET occurs on only one of the inputs. This implies that two identical logic blocks are needed to drive the C-element inputs in order to use the C-element as a SET mitigation technique. Such a configuration fits well with dual modular redundancy (DMR) architectures, see Section 4.2.2. Further SET mitigation is discussed in Section 4.2 in terms of architecture level temporal redundancy.

<sup>1</sup>Experimental results have shown that SET pulse widths can vary from 900 ps to over 3 ns at laser energies from 85 pJ to 179 pJ respectively [51].

#### 4.1.2 SEU mitigation

Circuit level SEU mitigation is primarily aimed at sequential logic blocks capable of storage, such as latches and flip flops. SEU mitigation involves the use of circuit topologies that prevent a SET from becoming latched on the output of a sequential logic block. Only a D flip flop was needed to realize the SAR architecture, and it is therefore the only storage element evaluated in this thesis. Many SEU tolerant latch topologies have been proposed over the years [52, 53, 54]. Several of these topologies have been evaluated and compared against each other using the simulation method described in Section 5.1. The latch that showed the best SEU tolerance with the best tradeoff in terms of speed, area and power consumption was the dual interlocked cell (DICE) based latch used in [55], which originates from [56]. The DICE latch is shown in Figure 4.2.

Figure 4.2 DICE latch with C-elements

This DICE based latch makes use of two locally separated data and clock inputs which, if not corrupted simultaneously, provide SEU tolerance due to the intrinsic upset tolerance of the DICE interlocking configuration.
The principle behind a DICE configuration is based on four interlocked nodes, where the nodes pairwise hold complementary values at all times (A',C' = B,D). A SET on any single node in a DICE configuration is not able to trigger the feedback, due to the fast recovery time imposed by the interlocked node pairs. However, if a SET occurs on both nodes of a pair holding equal values (for example A and C), the feedback would be triggered and the state of the latch would flip.

## 4.2 Architectural level hardening

Architectural level SET and SEU mitigation involves the use of spatial redundancy and temporal redundancy techniques. By distributing logic blocks in space, the probability of a SEU occurring on the output of such architectures is substantially reduced. An error is produced only if a particle hits several logic blocks at the same time. Temporal redundancy introduces signal distribution in time. This is an effective way of screening vulnerable inputs, such as clock inputs, from incoming SETs. In order to utilize temporal redundancy, two or more correlated data paths need to be routed to two or more logic blocks. The time distributed data paths ensure that a transient pulse can be expected at only one of the logic block inputs at any given time.

## 4.2.1 Triple Modular Redundancy

A TMR architecture is based on three identical logic blocks which are connected to one or sometimes three majority voters. The majority voter produces a correct output value if at least two of the three identical logic blocks contain the intended (correct) values. The voter may be implemented as a synchronous or asynchronous circuit, and is itself vulnerable to soft errors. It is therefore critical to harden voter circuits in order to prevent SET propagation. Figure 4.3 shows a TMR architecture which utilizes both spatial and temporal redundancy. Several degrees of spatial redundancy can be achieved by using TMR architectures.
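The majority voting principle described above can be sketched as follows. This is a purely logical illustration and not the voter circuit used in the thesis; the function names `majority` and `tmr_output` and the way upsets are injected are assumptions made for the example.

```python
def majority(a, b, c):
    """Bitwise two-out-of-three majority vote."""
    return (a & b) | (b & c) | (a & c)


def tmr_output(correct_value, upset_blocks=()):
    """Vote over three copies of a 1-bit value, flipping the copies listed in upset_blocks."""
    copies = [correct_value] * 3
    for index in upset_blocks:
        copies[index] ^= 1  # emulate an upset in that logic block
    return majority(*copies)


single_upset = tmr_output(1, upset_blocks=(0,))    # one block upset
double_upset = tmr_output(1, upset_blocks=(0, 1))  # two blocks upset
print(single_upset, double_upset)
```

With one upset block the voter still returns the intended value, while two simultaneously upset blocks produce an erroneous output, which is why the voter and the spatial separation of the blocks both matter.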
Figure 4.3 TMR architecture with temporal sampling

Standard spatial redundancy is achieved by simply stacking the layout cells close to each other. A higher degree of spatial redundancy can be achieved by interleaving layout cells; however, such an implementation comes at the cost of higher interconnect complexity. The temporal sampling is in this case applied to the clock input, but can also be applied to the data inputs. Temporal sampling is effective for low/medium frequency operation. For high performance applications, however, the delay imposed by the delay elements will amount to a large portion of the overall latency of the circuit. The latency resulting from the delay elements is usually set equal to the maximum expected SET duration.

The utilization of the TMR architecture contributes to an area overhead more than three times greater compared to a standard implementation of the same circuit. Furthermore, the power consumption is increased by approximately the same factor, which makes TMR circuit implementations somewhat undesirable in applications where area and power consumption are key parameters.

## 4.2.2 Dual Modular Redundancy

The DMR implementation presented in this thesis is based on the utilization of two identical, separated data paths. Figure 4.4 shows a DMR architecture which makes use of DICE latches and C-elements. The two separated data paths achieve SET tolerance by interconnecting C-element inputs between identical logic elements in the two data paths. The technique has been proposed in [50], but not in the context of dual modular redundancy. The use of interconnected C-elements ensures that SET mitigation at the input stages of latches is achieved as long as two interconnected C-elements are not hit by a particle strike at the same time.
Should a SET in one of the data paths propagate to a neighbouring logic block, the SET would be filtered out by the same C-element interlocking configuration, or by the fast recovery time imposed by the interlocked node pairs in the DICE latch. C-elements and DICE latches are a good fit for the DMR architecture, in the sense that all logic blocks are duplicated and C-elements are added. However, the additional C-elements and the additional circuitry required to realize the DICE latch impose an area and power consumption overhead that needs to be taken into account. For further SET and SEU mitigation in the DMR architecture, temporal sampling can also be applied to C-elements and latch inputs [50, 57].

Figure 4.4 DMR architecture

## 4.3 SEE Layout Considerations

When designing SEE tolerant circuits, it is important to consider the physical implications of a particle hit on or near a sensitive node in the circuit. Layout considerations therefore need to be taken into account in order to achieve the best possible SEE tolerance. Due to the impact of technology scaling, deep submicron implementations of SEU tolerant topologies are highly affected by charge sharing between sensitive nodes [58]. Charge sharing implies that the charge generated by a particle hit is shared between transistors in close proximity.

The DICE D flip flop is therefore designed with a spatial separation of the sensitive nodes, see Figure 4.5. As mentioned in Section 2.3, the depleted diffusion areas of a transistor are critical points of impact for a particle hit. By separating the critical diffusion areas that make up nodes A and C, and nodes B and D in the DICE latch, a higher particle hit tolerance can be achieved. The layout of the master/slave DICE configuration is made up by interleaving the critical nodes of two DICE latches. Master latch, node B (MLB) is placed next to slave latch, node B (SLB).
Their other critical nodes, MLD and SLD, are placed as far away as possible from the B nodes in order to minimize the probability of a single particle hit upsetting several sensitive nodes. This approach contributes to more complex nodal interconnect and more parasitic components. Nevertheless, the resulting small decrease in speed is a small price to pay for a large increase in SEU tolerance. P+ and partial N+ guardbands have been added to increase TID hardness in terms of inter-device leakage currents. The guardbands also increase the single event latch-up threshold by suppressing the positive feedback path between PMOS and NMOS transistors. The use of guardbands comes at the expense of a further increase in area overhead, on top of the redundancy factor.

Figure 4.5 DICE D flip flop layout

## 4.4 SAR architecture

The purpose of implementing the SAR architecture using different SET and SEU mitigation techniques is to compare the radiation tolerance of each of the mitigation techniques and evaluate the potential penalties in terms of power consumption, speed and area overhead. Three different implementations of the SAR architecture have been realized on the TSMC test chip. Figure 4.6 shows the layout of the SAR architectures, and Table 4.1 shows the attributes of the implemented SAR architectures.

The standard (STD) SAR architecture is designed using standard combinational logic and PowerPC603 D flip flops [59] in a master-slave configuration. The TMR SAR architecture is designed using the procedure presented in Section 4.2.1, and the DMR SAR architecture is designed using the procedure presented in Section 4.2.2. The TMR SAR utilizes architectural level SET and SEU mitigation techniques, and the DMR SAR utilizes both architectural and circuit level SET and SEU mitigation techniques.

Synchronization is very important in CMOS design, especially in high speed, low supply voltage operation, due to the impact of propagation delays.
Therefore, clock trees have been implemented in all of the SAR implementations.

**Figure 4.6** Layout of the three SAR architectures

By comparing the area overhead imposed by the implementation of the different SET and SEU mitigation techniques, we see that the TMR SAR and DMR SAR are approximately 400% and 160% larger than the STD SAR respectively. It should be noted that the DMR SAR utilizes a radiation tolerant D flip flop topology, and that the TMR SAR utilizes a standard unhardened D flip flop topology. If the TMR SAR were to use circuit level SET and SEU mitigation techniques, the area overhead imposed by the TMR architecture would be even greater.

**Table 4.1** SAR architecture implementation comparison

| | D flip flop type | Circuit level hardening | Architectural level hardening |
|---------|------------------|-------------------------|-------------------------------|
| STD SAR | PowerPC603 | none | none |
| DMR SAR | DICE DFF | DICE (+ C-elements) | DMR (+ C-elements) |
| TMR SAR | PowerPC603 | none | TMR (+ temporal sampling) |

## Chapter 5

## Soft Error Simulation- and Test Methodology

Any electronic device that is intended for, or that risks exposure to, a hostile environment must undergo an extended set of tests in order to be cleared for safe operation in the designated environment. These comprehensive tests often include a wide variety of soft and hard error characterization forms, and are often performed under the influence of different radiation sources, depending on the targeted environment of operation and the application for the device. Typical characterization parameters may for example include the radiation induced effects discussed in Section 2.3, as well as other application specific parameters, depending on the DUT. However, implementing a complete and reliable hardness assurance flow which includes all these characterization parameters is a very complex and resource demanding procedure [2].
In this thesis, a small portion of a circuit hardness assurance flow is presented, limited to SEU and SET characterization only. The assurance flow starts at schematic simulation level with SET and SEU evaluation, and ends at a physical measurement level with SEU evaluation only. The procedure utilizes an FPGA development board for interfacing with a PC, and for error detection. The supplementary equipment consists of a power supply and an amperemeter, aiding in the characterization of the DUT by recording current consumption and enabling supply voltage control.

## 5.1 Simulation Methodology

When characterizing CMOS cells for SEU and SET tolerance, a current pulse emulating a particle hit may be injected into sensitive nodes in the cell. The simulation method used in this thesis is similar to the methods used in recent publications on SEU and SET characterization on the schematic level [60, 61]. Schematic level simulation is a cheap and effective way of obtaining an approximate prediction of the SEU and SET susceptibility of CMOS circuits using standard SPICE simulation tools. However, the method does not provide a realistic particle hit scenario, as the injected current pulse only affects a single node in the circuit, and parameters such as the angle of particle impact and device separation are not included. Figure 5.1 shows the current pulse used for emulating a particle hit.

Figure 5.1 Particle hit induced current

In this thesis, the nodal injection of the current pulse was performed using the Cadence Spectre simulator by importing the data from a MATLAB generated pulse. The pulse injection was used to determine $Q_{crit}$ of sensitive nodes, by evaluating Eq. 5.1

$$Q_{crit} = \int_0^T I(t)dt \tag{5.1}$$

where T is the time required to drive a node from a logic high to a logic low, or vice-versa.
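Eq. 5.1 can be evaluated numerically once the injected current and the time T are known. The following is a minimal sketch, assuming a double-exponential current pulse, which is a common model for particle-hit currents but is not confirmed as the pulse shape used in the thesis. All parameter values (`I_SCALE`, `TAU_RISE`, `TAU_FALL`, `T_FLIP`) are hypothetical; in practice T would be read from the circuit simulation.

```python
import math


def double_exp_pulse(t, i_scale, tau_rise, tau_fall):
    """Assumed double-exponential current model (not taken from the thesis)."""
    return i_scale * (math.exp(-t / tau_fall) - math.exp(-t / tau_rise))


def collected_charge(current, t_end, steps=10_000):
    """Trapezoidal approximation of the integral of current(t) from 0 to t_end."""
    dt = t_end / steps
    total = 0.5 * (current(0.0) + current(t_end))
    for k in range(1, steps):
        total += current(k * dt)
    return total * dt


# Hypothetical values chosen only for illustration.
I_SCALE = 100e-6    # current scale factor [A]
TAU_RISE = 10e-12   # rise time constant [s]
TAU_FALL = 100e-12  # fall time constant [s]
T_FLIP = 300e-12    # time T at which the node has changed logic state [s]

q_crit = collected_charge(
    lambda t: double_exp_pulse(t, I_SCALE, TAU_RISE, TAU_FALL), T_FLIP
)
print(f"Q_crit estimate: {q_crit * 1e15:.2f} fC")
```

For this pulse shape the integral also has a closed form, which gives a convenient check on the numerical result.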
Higher level circuit SEU and SET susceptibility analysis can be achieved by emulating a somewhat more realistic radiation environment. This was achieved by using random pulse generators which inject current pulses with variable amplitudes on all nodes in the circuits, including inputs, see Figure 5.2. In this thesis the analysis was performed at different supply voltages to evaluate the SEU and SET tolerance of RHBD circuit topologies. The idea behind this approach is to determine whether radiation tolerant circuit topologies can make up for the loss in $Q_{crit}$ as the supply voltage scales down.

Figure 5.2 Simulation method based on random pulse injection

The random pulse injection method was used to determine the soft error rate on the circuit and architecture level [62].

## 5.2 Template-based Soft Error Characterization

The methodology chosen for characterization of soft errors revolves around the comparison of pre-irradiation data with post-irradiation data. Based on a set of input states, we can predict the expected output from the DUT under normal/ideal operating conditions (pre-irradiation). The expected response from the DUT can then be implemented as a template for comparison with the DUT response while the DUT is being irradiated. The comparison function is realized using an XOR function, generating '0's for a correct response and '1's for an erroneous response. Based on the number of '1's observed, the response, if erroneous, can be characterized as a single event upset or a single event transient. Figure 5.3 shows the principle behind the template-based soft error characterization approach.

**Figure 5.3** Principle behind template-based soft error characterization

Figure 5.3 illustrates a 4 bit DUT response (waves 1, 2, 3 and 4 from the top) and a 4 bit ideal response template (waves 5, 6, 7 and 8 from the top), which are compared using the XOR function (bottom wave).
There are two timelines illustrated, one for the DUT, $t_{DUT}$, and one for the interface and error detection platform, $t_{FPGA}$, which is an FPGA in this case. The two timelines may or may not be executed simultaneously, depending on the error detection implementation and requirements. For simultaneous execution, four XOR functions are needed for the comparison. In this particular implementation, the two timelines are executed sequentially, so only one XOR function is needed. In other words, the DUT timeline is initiated, denoted by $t_{DUT} = start$, which signals the DUT to send data to the error detection platform. As soon as all of the data has been sent, indicated by $t_{DUT} = end$, the FPGA starts the error detection algorithm, indicated by $t_{FPGA} = start$. At this point, the comparison between DUT response and template is initiated, sequentially evaluating $DUT\_b0 \oplus TEM\_b0$, $DUT\_b1 \oplus TEM\_b1$, $DUT\_b2 \oplus TEM\_b2$ and $DUT\_b3 \oplus TEM\_b3$. The sequence completion is indicated by $t_{FPGA} = end$, at which point the processed data can be forwarded for further processing.

Figure 5.3 also illustrates a scenario where both a SET and a SEU are detected. A SET is detected during the comparison between $DUT\_b3$ and $TEM\_b3$, and a SEU is detected during the comparison between $DUT\_b2$ and $TEM\_b2$.

## 5.2.1 Soft Error Detection using a FPGA

The implementation of the template-based soft error characterization was realized using an FPGA Spartan-3 Starter Board from Digilent. The purpose of the FPGA is to function as an interface between a computer and the DUT, by stimulating the DUT and by processing the data received from the irradiated ASIC. Figure 5.4 illustrates a functional diagram of a physical implementation of the template based soft error characterization procedure.
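The sequential XOR comparison described above can be sketched in software as follows. This is only an illustration of the principle, not the FPGA implementation used in the thesis: the function names and the example bit patterns are hypothetical, and the sketch only counts mismatches per output instead of applying a specific SET/SEU decision rule.

```python
def compare_to_template(dut_bits, template_bits):
    """XOR each DUT sample with its template counterpart; a 1 marks a mismatch."""
    if len(dut_bits) != len(template_bits):
        raise ValueError("DUT response and template must have the same length")
    return [d ^ t for d, t in zip(dut_bits, template_bits)]


def evaluate_sequentially(dut, template):
    """Compare outputs one after the other, as a single shared XOR function would."""
    return {name: compare_to_template(dut[name], template[name]) for name in template}


# Hypothetical 4 bit template and DUT response, 8 samples per output.
template = {
    "b0": [0, 1, 0, 1, 0, 1, 0, 1],
    "b1": [0, 0, 1, 1, 0, 0, 1, 1],
    "b2": [0, 0, 0, 0, 1, 1, 1, 1],
    "b3": [1, 1, 1, 1, 0, 0, 0, 0],
}
dut = {
    "b0": [0, 1, 0, 1, 0, 1, 0, 1],  # matches the template
    "b1": [0, 0, 1, 1, 0, 0, 1, 1],  # matches the template
    "b2": [0, 0, 1, 1, 0, 0, 0, 0],  # persistent deviation from sample 2 onwards
    "b3": [1, 1, 0, 1, 0, 0, 0, 0],  # single-sample deviation at sample 2
}

errors = evaluate_sequentially(dut, template)
error_counts = {name: sum(bits) for name, bits in errors.items()}
print(error_counts)
```

In this example the short deviation on `b3` yields a single '1', while the persistent deviation on `b2` yields a long run of '1's, which is the kind of difference the text uses to separate a transient from an upset.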
Figure 5.4 Block level diagram of the template based soft error characterization

The FPGA is configured to use 8 x 256 shift registers, where 4 x 256 shift registers are dedicated to the data received from the irradiated DUT, and 4 x 256 shift registers are dedicated to the ideal response, i.e. the template. The total number of shift register stages used for the comparison procedure is 2048, and they are realized using the available CLBs on the FPGA. Each shift register on the receiving end is sequentially XORed with its template shift register counterpart. The XOR comparison function enables detection of any mismatch between the response and the template.

Since the characterization methodology is aimed at characterizing circuit susceptibility at low supply voltages (and therefore low speed), the available processing power on the FPGA allowed for oversampling of the DUT response. The receiving shift registers are oversampled at $f_s = f \cdot 64$, i.e. 64 times the ASIC output signal frequency. The oversampling approach therefore provides a certain degree of temporal resolution, and thereby also a certain probability of SET detection. The SET detection is primarily aimed at detecting potential SET occurrences that result from secondary radiation effects, which may interfere with surrounding ICs and equipment.

In this particular implementation, the maximum sampling rate is limited by the Spartan-3 maximum clock frequency, which is 50 MHz. However, the physical implementation is easily portable to FPGAs capable of sampling at much higher frequencies, such as the Stratix III from Altera, which can operate at clock frequencies up to 600 MHz. In other words, higher clock frequencies enable higher temporal resolution of the DUT outputs, and thereby a higher probability of observing SETs.

A limiting factor of this particular approach is its limited flexibility.
Each template needs to be coded in the FPGA depending on the expected response from the DUT. However, since the FPGA can be reconfigured in a matter of minutes, the existing templates may easily be replaced with new templates, even during radiation testing.

## 5.2.2 PCB Design

The printed circuit board (PCB) is designed using a two layer FR-4 board. The bottom layer is primarily used as a ground plane for ease of routing. The top layer is dedicated to signal and power routing only. Almost all of the components are placed on the top side, aside from a few decoupling capacitors, which are used for DUT power decoupling and are placed directly below the DUT, on the bottom PCB layer. All ICs have at least one decoupling capacitor in close proximity to the IC package, depending on the requirements. All level shifters are powered by two different supply voltages, and therefore have two decoupling capacitors.

The SN74AVC4T774PW level shifters are used both for converting DUT output signals up from 1.2 V to the FPGA voltage level of 3.3 V, and for converting down in the opposite direction. The down-conversion is used for the control signals from the FPGA to the DUT, controlling the SAR architectures. Two level shifters are dedicated to interfacing with an ADS830E 8 bit SAR ADC for the purpose of testing an 8 bit DAC located on the DUT. Two L6932D1.2 voltage regulators are used for supplying power to the level shifters and the DUT. The voltage regulators are configured for 1.2 V and 3.3 V respectively, and are powered by two 9 V batteries. A 40-pin connector is placed on the PCB for interfacing with an FPGA development board. Several jumpers have been placed on the PCB for debugging purposes and for different DUT input and power supply configurations. Figure 5.5 shows the PCB design with indicators to the main components.
**Figure 5.5** The PCB design

The PCB design complexity for this particular experiment is kept relatively low because the experiment is run at low speeds. The speed is not limited by the ASIC, but rather by the maximum clock frequency of the FPGA and the oversampling rate, which are given by $max(f_{FPGA\_clk})$ and $X$ respectively. In order to oversample the output signals of the ASIC, the clock frequency of the FPGA must fulfill the requirement in Eq. 5.2

$$f_{FPGA\_clk} = X \cdot f_{ASIC\_output} \tag{5.2}$$

where $f_{ASIC\_output}$ is the output signal frequency of the ASIC. The maximum signal frequency present on the PCB, $max(f_{PCB})$, is thereby given by

$$max(f_{PCB}) = f_{ASIC\_clk} = \frac{max(f_{FPGA\_clk})}{X} \cdot 2 \tag{5.3}$$

since

$$f_{ASIC\_clk} = 2 \cdot f_{ASIC\_output} \tag{5.4}$$

Given an oversampling rate $X = 64$ and a maximum FPGA clock frequency $max(f_{FPGA\_clk}) = 50$ MHz, the highest signal frequency on the PCB is approximately 1.56 MHz, which can be categorized as low speed operation. However, it is evident from Eq. 5.3 that higher speeds may become relevant if $max(f_{FPGA\_clk})$ is increased or the oversampling rate $X$ is decreased. In such a scenario, more care should be taken when designing the PCB in order to preserve acceptable signal integrity.

A few measures have been taken in order to prepare the PCB design for a radiation testing environment. As a result of high energy particles interacting with materials around the electronic devices, secondary particles may scatter in all directions and thereby have a significant impact on other nearby ICs or cables [63]. These secondary particles can induce noise or SEUs in sensitive ICs. Therefore, every other wire in the 40-wire ribbon cable is grounded for better crosstalk performance when a long ribbon cable needs to be utilized, and/or when higher speeds are needed.
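As a numerical check of Eq. 5.2 to 5.4, the small Python sketch below evaluates the maximum PCB signal frequency for the values quoted in the text. The function name `max_pcb_frequency` is an assumption introduced for illustration. The 600 MHz case reuses the clock frequency mentioned for a faster FPGA, under the assumption that the oversampling rate stays at 64.

```python
def max_pcb_frequency(f_fpga_clk_max: float, oversampling: int) -> float:
    """Eq. 5.3: max(f_PCB) = f_ASIC_clk = 2 * max(f_FPGA_clk) / X."""
    if oversampling <= 0:
        raise ValueError("oversampling rate must be positive")
    f_asic_output = f_fpga_clk_max / oversampling  # Eq. 5.2 solved for f_ASIC_output
    return 2.0 * f_asic_output                     # Eq. 5.4


# Values from the text: 50 MHz FPGA clock, X = 64.
print(f"{max_pcb_frequency(50e6, 64) / 1e6:.4f} MHz")   # 1.5625 MHz

# Same oversampling rate with a 600 MHz FPGA clock (assumed scenario).
print(f"{max_pcb_frequency(600e6, 64) / 1e6:.4f} MHz")  # 18.7500 MHz
```

With the 50 MHz clock the result is roughly 1.56 MHz, which is still low speed operation, whereas a 600 MHz clock would push the PCB signals close to 19 MHz, where signal integrity deserves more attention.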
Furthermore, an effort has been made to keep a distance between the DUT and the surrounding ICs in order to suppress the effect of possible secondary particles.

## 5.2.3 Measurement Setup

The measurement setup used for prototyping and radiation testing is described in this subsection. The setup is used for automated extraction of data from the DUT. Figure 5.6 shows the block diagram of the measurement setup. The setup is built around a host computer that can run the necessary data extractions remotely. A Keithley 6512 Electrometer is configured as an ammeter for measuring the current consumption of the DUT. The ammeter is used for two purposes: measuring the current consumption of the three SAR architectures, and measuring the leakage current through the NMOS transistor. Since only one Keithley 6512 is used, only one of these tasks can be performed at any given time. Since the SAR architectures were to be tested with different supply voltages, the HP/Agilent E3631 voltage source was configured to use two output channels, one as a variable power supply for the SAR architectures, and one for the gate voltage of the NMOS transistor. The power supply and the ammeter are connected to the computer using GPIB.

The soft error detection algorithm implemented on the FPGA development board is controlled by the computer via the RS-232 port. The FPGA development board sends clock, control and reset signals to the DUT and receives the response from the specified SAR architectures. The DUT response is evaluated for errors on the FPGA and thereafter sent back to the computer via the RS-232 port for further processing in Matlab. All cables used for interfacing with the DUT PCB are at least 1 meter long in order to minimize the number of secondary particles hitting the peripheral equipment. The measurement setup can plot response data from the DUT continuously in real time.
This ability is preferable in order to observe any irregularities that may occur during irradiation testing [64]. Accounting for irregularities early on during radiation characterisation of a device may save both time and resources, since radiation environments can be highly unpredictable.

**Figure 5.6** Measurement setup

According to the JEDEC JESD89 standard<sup>1</sup>, radiation testing may be performed using static or dynamic testing of the DUT. Static testing involves writing a predefined sequence to the DUT, followed by irradiation of the device. After the DUT has been irradiated, the irradiation is terminated and the DUT response is recorded. In dynamic testing, the DUT is stimulated and the response is recorded during irradiation. The test setup can be modified to perform either dynamic or static testing; however, dynamic testing is adequate for testing algorithms such as the SAR architectures. Static testing can be performed in order to observe whether bit flips occur in D flip-flops, while dynamic testing is preferable in order to test for SET induced SEUs while devices are active. Ultimately, both dynamic and static testing should be performed in order to get the best possible SEU characterization of the DUT [65].

<sup>1</sup> http://www4.tsl.uu.se/ bumpen/jedec.pdf

# Chapter 6

## Results

This chapter presents papers related to the field of robust low power CMOS design, as well as results from irradiation testing of the SAR architectures implemented on the TSMC 90 nm chip. The papers are included as potential contributions to future radiation hardened building blocks, and provide individual introductions to their respective topics.
## 6.1 Papers

## 6.1.1 Paper I: Low-Power Subthreshold to Above Threshold Level Shifter in 90 nm Process

Published at NORCHIP 2009, November 16-17, 2009, Trondheim, Norway

# Low-Power Subthreshold to Above Threshold Level Shifter in 90 nm Process

Amir Hasanbegovic, Department of Informatics, University of Oslo, Email: amirh@ifi.uio.no

Snorre Aunet, Department of Informatics, University of Oslo, Email: sa@ifi.uio.no

Abstract: The use of multiple voltage domains in an integrated circuit has been widely utilized with the aim of finding a tradeoff between power saving and performance. Level shifters allow for effective interfacing between voltage domains supplied by different voltage levels. In this paper we present a low power level shifter in the 90 nm technology node capable of converting subthreshold voltage signals to above threshold voltage signals. The level shifter makes use of the MTCMOS design technique, which gives more design flexibility, especially in low power systems. Post layout simulations indicate low power consumption and low energy consumption across process, mismatch and temperature variations. The minimum input voltage attainable while maintaining robust operation is found to be around 180 mV, at a maximum frequency of 1 MHz. The level shifter employs an enable/disable feature, allowing for power saving when the level shifter is not in use.

#### I. INTRODUCTION

Recently there has been a lot of focus on low power electronics capable of maintaining acceptable performance. One effective way of reducing the power consumption of an integrated circuit (IC) is to reduce its supply voltage. The reduction in supply voltage contributes to a dramatic decrease in dynamic and static power consumption; however, a low supply voltage has a negative impact on the speed of the circuit [1]. In order to reduce power consumption while limiting the sacrifice of speed, several voltage domains may be implemented on the same IC.
By doing so, less critical sections of the circuit may be supplied by a low supply voltage, Vddl, while critical sections are supplied by a higher supply voltage, Vddh. In order to connect the different voltage domains in an effective way, the use of level shifters for interfacing is vital. For the sections of the IC where speed is less important, the supply voltage may be reduced down to the subthreshold region in order to save as much power as possible. Subthreshold operation has the potential to provide considerable power savings, which may benefit modern portable devices.

Level shifters converting subthreshold signals to above threshold signals are most likely to have transistors operating in the subthreshold region. Transistors operating in the above threshold region have a supply voltage above the threshold voltage $V_T$. However, when the supply voltage is reduced below the threshold voltage, $Vdd < V_T$, the transistor is considered to be operating in the subthreshold region. Transistors operating in the subthreshold region have an exponential reduction in Ids when $Vgs < V_T$. Ids is given by [2]

$$I_{Dsub} = I_0 e^{\frac{V_{gs} - V_T}{nV_{th}}} \left( 1 - e^{\frac{-V_{ds}}{V_{th}}} \right) \tag{1}$$

where $V_T$ is the threshold voltage, $V_{th}$ is the thermal voltage, $n$ is the subthreshold slope factor, and $I_0$ is the drain current when $V_{gs} = V_T$ (for $V_{ds} \gg V_{th}$)

$$I_0 = \mu_0 C_{ox} \frac{W}{L} (n-1) V_{th}^2 \tag{2}$$

Circuits employing transistors in subthreshold are prone to design challenges and reliability issues. As the temperature decreases, the drive strength of the transistor is substantially weakened, especially for transistors in subthreshold. This presents a problem particularly in cases where transistors operating in the subthreshold region are stacked in the same signal path as transistors operating in the above threshold region. In the opposite case, when the temperature increases, the drive strength of the transistors increases due to increased carrier mobility, resulting in high leakage currents [3].

#### II. CONVENTIONAL LEVEL SHIFTER

Fig. 1. Dual Cascode Voltage Switch

The conventional level shifter, shown in Figure 1, can be implemented as an interface between two voltage domains as long as the input voltage is above the threshold voltage of MN1 (and MN2). The level shifter has the following operational behavior: when the input goes from a logic low to a logic high, MN1 is turned on and MN2 is turned off. The voltage at node nA is then pulled towards ground through the conducting path established by MN1. If the voltage at node nA reaches (Vdd-VthMP2), where VthMP2 is the threshold voltage of MP2, the positive feedback is triggered as MP2 turns on and pulls node nB high. The input has then been shifted from a lower voltage level to a higher voltage level through the output inverter.

The voltage shift can be completed only if the pull-up and pull-down strengths are roughly equal, i.e. the pull-up/pull-down ratio is close to unity. If the pull-up/pull-down ratio is not close to unity, contention will take place between the pull-up transistors (MP1 and MP2) and the pull-down transistors (MN1 and MN2), which increases delay and power consumption [4]. This contention worsens when the input signal approaches subthreshold. For subthreshold voltages on the input, the drive strength of MN1 cannot overcome the drive strength of MP1. Hence, node nA cannot be pulled down and the positive feedback cannot be triggered, rendering the conventional level shifter impractical for subthreshold conversion. In order to enable subthreshold conversion, the pull-up/pull-down ratio has to be equalized.

#### III. PROPOSED LEVEL SHIFTER

Fig. 2. Proposed level shifter

The proposed level shifter (MDCVS), shown in Figure 2, is designed in a 90 nm process from ST Microelectronics and utilizes the multithreshold voltage CMOS (MTCMOS) design technique.
Low-threshold transistors (lvt) are placed where speed is of importance, at the expense of leakage current, and high-threshold transistors (hvt) are placed where leakage current can be reduced at the expense of speed. In addition to the low Vt and high Vt transistors, a standard-threshold transistor (svt) is provided in the library, which presents a tradeoff between leakage and speed in itself. The use of multi-$V_{th}$ transistors enables us to find a good tradeoff between static power consumption, dynamic power consumption and propagation delay. The proposed level shifter has been designed in two versions to satisfy the needs of both high speed (MDCVSHS) and low power (MDCVSLP) operation, with subthreshold operation taken into consideration.

#### A. Modified Dual Cascode Voltage Switch (MDCVS)

The new level shifter builds upon the same principle as the DCVS. The circuit exploits several techniques in order to limit the contention between pull-up and pull-down at nodes nA and nC. Primarily, the pull-down transistors (MN1 and MN2) are low-threshold transistors, meaning they conduct more current at a given gate-source voltage. This increase in current makes it easier for the pull-down strength to follow the pull-up strength. However, the low-threshold transistors are not enough to establish an even pull-up/pull-down ratio. As proposed in [5], diode connected PMOS transistors (MP5 and MP6) can be used to limit the pull-up strength of the two branches in the level shifter. When the level shifter is in steady state, $|V_{gs}|$ of the diodes is small and equal to the diode voltage drop $V_{MPD}$. When the input signal switches, the diode voltage drop is kept stable, limiting the pull-up strength. In our design this configuration is taken one step further in an attempt to reduce energy consumption. Since the voltage at node nA only goes up to $Vdd-V_{MPD}$ and the voltage at node nC only goes down to $gnd+V_{MPD}$, MP4 and MN4 are never completely turned on.
This results in a reduction of short circuit current, and may thereby contribute to a reduction of dynamic power consumption in the output inverter, given that the capacitive load is relatively low. Another useful feature of the PMOS diodes is that they enable Vddl scaling, which is critical in adaptive voltage scaling systems [1]. When Vddl increases, the pull-down strength of the level shifter increases as a result of the increased sink current. The increase in sink current contributes to a faster increase in the voltage drop across the PMOS diodes during the transistor switching time, thereby maintaining the pull-up/pull-down ratio. An increase in both pull-up and pull-down strength results in a reduction of the propagation delay, which allows a higher operational frequency as Vddl scales up.

In order to achieve robust operation at lower temperatures, MP7 and MP8 are added. They may further limit or increase the drive strength of the pull-up transistors (MP1 and MP2), enabling fine adjustment of the pull-up/pull-down ratio. Transistors MP7 and MP8 are biased in the off-state, providing a leakage current in their respective branches. This configuration allows the pull-up strength of MP1 and MP2 to be controlled by the sizing of MP7 and MP8. By increasing (decreasing) the size of MP7 and MP8, the pull-up transistors conduct more (less) current, thereby increasing (decreasing) the pull-up strength. This enables control over the rise and fall delays on the output. Furthermore, as the temperature increases, the leakage current from MP7 and MP8 becomes larger, which in turn increases the pull-up strength of MP1 and MP2. Proper functionality is preserved because of the constant voltage drop over the PMOS diodes, which limits the pull-up strength at high temperatures as well.

By adding four additional "sleep" transistors, MN5, MN6, MN7 and MN8, the level shifter can be turned on or off (sleep mode) by setting their gates to '1' or '0' respectively.
The sleep configuration is implemented using high-threshold NMOS transistors, since the on-resistance of an NMOS is smaller than the on-resistance of a PMOS, given the same size [6]. When the level shifter is turned off, the output is pulled up to Vddh via MP5, thereby avoiding an intermediate state on the output.

#### IV. CIRCUIT SIMULATIONS

In this section we present the simulation results, focusing on the following design parameters: static power consumption, energy consumption, propagation delay and average power consumption. These design parameters are put in context with scaling of the lower supply voltage (Vddl). The simulation results are based on post layout simulations at 27°C with a 10 fF capacitive load. Simulations were run for 20 input periods to capture any behavioral irregularities in the circuit that may corrupt the results. Transistor types and sizes used for the simulations are shown in Table I for the MDCVSLP and Table II for the MDCVSHS.

Fig. 3. Level shifter (MDCVSHS) performance with varying Vddl when Vddh=1V and the input signal frequency is 1MHz. (a) Static power and energy consumption. (b) Propagation delay and total power consumption.

Fig. 4. Level shifter (MDCVSLP) performance with varying Vddl when Vddh=1V and the input signal frequency is 500kHz. (a) Static power and energy consumption. (b) Propagation delay and average power consumption.

Figure 3(a) and Figure 4(a) show the static power consumption and the energy consumption as a function of Vddl for MDCVSHS and MDCVSLP respectively. Simulation results show that an increase in the lower supply voltage contributes to an increase in both static power and energy consumption. This is primarily due to the low-threshold transistors MN1 and MN2 leaving the subthreshold region. As indicated, the energy consumption is lowest when Vddl is around 200mV-300mV. At this operating point the contention between pull-up and pull-down is the lowest.
However, further reduction of Vddl gives rise to crowbar current due to increasing contention between pull-up and pull-down transistors, thereby increasing the dynamic power consumption. This is also reflected in the average power consumption illustrated in Figure 3(b) and Figure 4(b). Figure 3(b) and Figure 4(b) show how the propagation delay decreases as Vddl leaves the subthreshold region. The decrease in propagation delay confirms the level shifter's ability to work at higher operational frequencies as the lower supply voltage increases. The average power consumption increases with Vddl as a result of increased dynamic (switching) power consumption in the above threshold region.

The robustness of the level shifter was verified with Monte Carlo simulations for both MDCVSLP and MDCVSHS, for all the stated simulation conditions. Figure 5 illustrates the worst case simulation condition for the level shifter, which is low temperature operation. Proper operation is demonstrated across a temperature range of -40 °C to 150 °C. When the level shifter is put in sleep mode, the MDCVSHS shows a power consumption of 175 pW, while the MDCVSLP shows 48 pW.

Fig. 5. Level shifter (MDCVSHS) output when Vddl=180mV and Vddh=1V at -40°C (300 post layout Monte Carlo simulations)

TABLE I Transistor types and sizes (MDCVSLP)

| Transistor | Type | W/L (µm) | Transistor | Type | W/L (µm) |
|------------|------|-----------|------------|------|-----------|
| MN1 | lvt | 0.3/1 | MP2 | hvt | 0.12/0.15 |
| MN2 | lvt | 0.3/1 | MP3 | svt | 2/0.2 |
| MN3 | svt | 2/0.2 | MP4 | hvt | 0.6/0.2 |
| MN4 | hvt | 0.24/0.2 | MP5 | lvt | 0.12/0.15 |
| MN5 | hvt | 0.12/0.2 | MP6 | lvt | 0.12/0.15 |
| MN6 | hvt | 0.12/0.2 | MP7 | lvt | 6/0.1 |
| MN7 | hvt | 0.12/0.2 | MP8 | lvt | 6/0.1 |
| MN8 | hvt | 0.12/0.2 | MP9 | hvt | 0.6/0.1 |
| MP1 | hvt | 0.12/0.15 | | | |

TABLE II Transistor types and sizes (MDCVSHS)

| Transistor | Type | W/L (µm) | Transistor | Type | W/L (µm) |
|------------|------|-----------|------------|------|----------|
| MN1 | lvt | 0.12/0.2 | MP2 | hvt | 0.12/0.1 |
| MN2 | lvt | 0.12/0.2 | MP3 | lvt | 0.36/0.1 |
| MN3 | lvt | 0.12/0.1 | MP4 | hvt | 0.6/0.2 |
| MN4 | hvt | 0.12/0.1 | MP5 | lvt | 0.2/0.1 |
| MN5 | hvt | 0.12/0.2 | MP6 | lvt | 0.2/0.1 |
| MN6 | hvt | 0.12/0.2 | MP7 | lvt | 8/0.1 |
| MN7 | hvt | 0.12/0.2 | MP8 | lvt | 8/0.1 |
| MN8 | hvt | 0.12/0.2 | MP9 | hvt | 0.6/0.1 |
| MP1 | hvt | 0.12/0.15 | | | |

#### V. DISCUSSION

The level shifter is primarily aimed at converting voltage signals from levels near the transistor threshold voltage to voltage levels several hundred mV above. Nevertheless, the level shifter shows acceptable performance while converting from a wider range of voltage levels. This makes the level shifter suitable for applications where dynamic voltage scaling is required to satisfy speed requirements by dynamically increasing or decreasing the lower supply voltage. Table III illustrates how the level shifter performance varies across different processes. The energy and power needed to complete a voltage shift from subthreshold level to above threshold level are substantially reduced in a sub-µm process.

#### VI. CONCLUSION

By applying the MTCMOS design technique to subthreshold level shifter design, low power and energy consumption may be achieved while maintaining reliable performance.
The level shifter also demonstrates compatibility with dynamic voltage scaling, at the expense of an increase in power consumption. For applications requiring voltage level shifting from subthreshold voltages to above threshold voltages, the proposed level shifter may serve as a good solution. However, for voltage shifting in the above threshold region, the conventional level shifter is more efficient in terms of power and energy consumption.

#### REFERENCES

- [1] V. Gutnik and A. P. Chandrakasan, "Embedded power supply for low-power DSP," vol. 5, no. 4, pp. 425–435, Dec. 1997.
- [2] A. Wang, B. H. Calhoun, and A. P. Chandrakasan, *Sub-Threshold Design for Ultra Low-Power Systems*. Springer-Verlag New York, 2006.
- [3] Y. P. Tsividis, *Operation and Modeling of the MOS Transistor*, 2nd ed. Boston: McGraw-Hill, 1999.
- [4] C. Q. Tran, H. Kawaguchi, and T. Sakurai, "Low-power high-speed level shifter design for block-level dynamic voltage scaling environment," in *Proc. International Conference on Integrated Circuit Design and Technology ICICDT 2005*, May 9–11, 2005, pp. 229–232.
- [5] H. Shao and C.-Y. Tsui, "A robust, input voltage adaptive and low energy consumption level converter for sub-threshold logic," in *Proc. ESSCIRC 33rd European Solid State Circuits Conference*, Sep. 11–13, 2007, pp. 312–315.
- [6] J. Kao, A. Chandrakasan, and D. Antoniadis, "Transistor sizing issues and tool for multi-threshold CMOS technology," in *Proc. 34th Design Automation Conference*, Jun. 1997, pp. 409–414.
- [7] A. Chavan and E. MacDonald, "Ultra low voltage level shifters to interface sub and super threshold reconfigurable logic cells," in *Proc. IEEE Aerospace Conference*, Mar. 1–8, 2008, pp. 1–6.
- [8] T.-H. Chen, J. Chen, and L. T. Clark, "Subthreshold to above threshold level shifter design," *J. Low Power Electronics*, vol. 2, no. 2, pp. 251–258, 2006.

TABLE III LEVEL SHIFTER COMPARISON

| Design Parameter | DSLS1b [7] | DSLS2 [7] | DSLS2b [7] | SSLSb [7] | CMLS [8] | LC [5] | MDCVSHS |
|------------------|------------|-----------|------------|-----------|----------|--------|---------|
| Propagation delay | 252 ns | 125 ns | 110 ns | 161 ns | 50 ns | 10 µs | 32 ns |
| Energy cons. per trans. | 25 pJ | 21.4 pJ | 21.8 pJ | 40.5 pJ | 25 pJ | 8 nJ | 17 fJ |
| Static power cons. | n/a | n/a | n/a | n/a | 5 nW | n/a | 2.5 nW |
| Vddl/min Vddl | 0.35 V/0.35 V | 0.35 V/0.35 V | 0.35 V/0.35 V | 0.35 V/0.35 V | 0.18 V/0.1 V | 0.2 V/0.13 V | 0.18 V/0.18 V |
| Vddh | 1.2 V | 1.2 V | 1.2 V | 1.2 V | 1.2 V | 1.8 V | 1 V |
| Process | SOI 0.25 µm | SOI 0.25 µm | SOI 0.25 µm | SOI 0.25 µm | 0.13 µm bulk | 0.18 µm | 90 nm bulk |

## 6.1.2 Paper II: Memory Elements Based on Minority-3 Gates and Inverters Implemented in 90 nm CMOS

Published at DDECS 2010, April 14-16, 2010, Vienna, Austria

# Memory Elements Based on Minority-3 Gates and Inverters Implemented in 90 nm CMOS

Snorre Aunet, Amir Hasanbegovic, Nanoelectronics group, Department of Informatics, University of Oslo, Postbox 1080 Blindern, 0316 Oslo, Norway, Email: aunet@ieee.org, amirh@ifi.uio.no

Abstract: Two memory elements, or latches, are introduced. They are similar in functionality to the widely used NOR- and NAND-based crosscoupled latches, but unlike the traditional latches they do not risk producing stable states where Q and Q' have identical binary values. The suggested solutions are built from two inverters and one minority-3 gate.
Monte Carlo simulations in 90 nm CMOS are used to demonstrate that the circuits may maintain the digital abstraction under mismatch and process variations for a supply voltage down to 140 mV at 20 degrees C and 100 nm gate lengths. Chip measurements are included. Reliability considerations for low fan-in threshold gates might favour them over some traditional Boolean implementations, which, if proven, may contribute to increased use of CMOS threshold gates.

#### I. INTRODUCTION

To the best of the authors' knowledge, this paper presents two new memory elements with functionalities similar to the traditional SR latches found in standard teaching material in electrical engineering, which are used as memory elements in synchronous circuits, asynchronous circuits [1], and in analog circuits like PLLs. The first circuit, shown in Figure 1, consists of two inverters and a minority-3 gate. As will be shown, the function table (Table I) for this circuit is the same as for the SR latch with NOR gates (Figure 2), except for the condition when both S and R equal 1, a state that is normally avoided [1]. This circuit was inspired by a latch consisting of 3 inverters and a majority-3 gate, meant for single electron tunneling technology, from [2]. The static CMOS solution presented here should normally save at least 4 transistors compared to a directly mapped static CMOS solution based on [2], since one inverter is saved and a minority-3 gate is used instead of the majority-3 gate. A second latch, with functionality similar to the SR latch using two NAND gates (depicted in Figure 2, with its function table in Table II [1]), is made simply by interchanging the S and R nodes in Figure 1. Unlike the traditional SR latches based on Boolean gates, which violate the requirement that the outputs be the complement of each other for certain combinations of S and R [1], the proposed memory elements will never be able to settle with Q and Q' being equal.
This is seen from the schematic in Figure 1, since Q is the direct inverse of Q'.

The threshold gates (depicted in Figure 3), or perceptrons, are in this case restricted to minority-3 gates, which output a logic 0 if, and only if, 2 or 3 of the digital inputs are logic 1. Otherwise, the output is logic 1. A survey of a wide range of threshold gates may be found in [3]. The 10-transistor gate from [4], shown in Figure 4, was used when implementing the two memory elements, as it is relatively reliable in subthreshold operation [5]. A subthreshold implementation was chosen, though the topology in itself does not restrict the circuit to this region of operation. Subthreshold operation [6], exploiting the lowest possible supply voltages, may consume orders of magnitude less power than comparable regular strong-inversion circuits operating at the same frequency, and may consume less power than other known low-power circuits [7]. Subthreshold circuits are believed to play a significant role in the scaling path towards the 10 nm node, according to [8].

This paper is organized as follows: Section II provides brief background on the traditional SR latches shown in Figure 2, threshold gates and subthreshold operation. Section III includes Spice simulations and chip measurements for a standard triple-well 90 nm CMOS process, demonstrating basic functionality. Section IV discusses a few aspects, before the conclusion in Section V.

Fig. 1. The first among the proposed memory elements.

#### II. BACKGROUND

#### A. SR Latches implemented with NOR and NAND gates

The NOR- and NAND-based SR latches are shown in Figure 2, while their overall behaviour conforms to the function tables in Table I and Table II, respectively [1]. These latches are the basic circuits from which many flip-flops are constructed, where the differences lie in the number of inputs and how they affect the binary states [1]. For the latch based on NOR gates, both inputs remain at 0 unless the state has to be changed.
The application of a momentary 1 to the S input causes the latch to go to the set state (Q=1, Q'=0). S must go back to 0 before R=1 puts the latch in the reset state, to avoid the state (Q=0, Q'=0). The latch based on NAND gates (Figure 2) operates with S=R=1 unless the state has to be changed. The application of 0 to the S input puts the latch in the set state, where it remains when S goes back to 1. When both inputs are at 1, the state can be changed by placing a 0 on the R input. The latch then goes to the reset state and stays there until both inputs return to 1. 0 on both inputs should be prohibited, according to Table II.

Fig. 2. SR latches implemented using crosscoupled NOR- and NAND-gates.

TABLE I Function Table for an SR Latch with NOR gates.

| S | R | Q | Qb | Comment |
|---|---|---|----|---------|
| 1 | 0 | 1 | 0 | |
| 0 | 0 | 1 | 0 | after SR = 10 |
| 0 | 1 | 0 | 1 | |
| 0 | 0 | 0 | 1 | after SR = 01 |
| 1 | 1 | 0 | 0 | |

TABLE II Function Table for an SR Latch with NAND gates.

| S | R | Q | Qb | Comment |
|---|---|---|----|---------|
| 1 | 0 | 0 | 1 | |
| 1 | 1 | 0 | 1 | after SR = 10 |
| 0 | 1 | 1 | 0 | |
| 1 | 1 | 1 | 0 | after SR = 01 |
| 0 | 0 | 1 | 1 | |

#### B. Threshold Logic and Minority-3 Gates

Perceptrons, or threshold gates, are simplified mathematical models of biological neurons, which compute the sign of the weighted sum of their inputs [9] and may be expressed as

$$f(x_1, ..., x_n) = sgn\left(\sum_{i=1}^n w_i x_i - T\right) \tag{1}$$

where $w_i$ is the synaptic weight associated with $x_i$, $n$ is the fan-in and $T$ is the threshold.

Fig. 3. The input signals, X, Y,... have weights, $W_X$, $W_Y$,... If the weighted sum of inputs exceeds the threshold, T, the binary output, Y, changes.

Fig. 4. Minority 3 gate [4].

A minority-N gate produces the inverted binary output compared to a majority-N gate given identical inputs. The above equation should therefore produce the inverted function to be valid for the minority-3 case.

#### C. Subthreshold Operation

For an NMOS transistor in subthreshold we have [10]:

$$I_{ds,n} = I_{0} \exp\left\{\frac{\kappa V_{gs}}{V_{t}}\right\} \exp\left\{(1-\kappa)\frac{V_{bs}}{V_{t}}\right\} \left(1-\exp\left\{\frac{-V_{ds}}{V_{t}}\right\}+\frac{V_{ds}}{V_{0}}\right) \tag{2}$$

which expresses the current between the drain and source. $I_0$ is the zero-bias current for the given device, a constant into which all the pre-exponential constants have been absorbed. This includes the channel width ("W") and length ("L") of the MOSFET structure. $V_{gs}$ is the gate-to-source potential, $V_{ds}$ the drain-to-source potential and $V_{bs}$ the substrate-to-source potential. $V_0$ is the Early voltage, which is proportional to the channel length. $\kappa$ gives the efficiency with which the gate potential controls the channel current, and is often approximately 0.7-0.75 [10]. The thermal voltage is expressed as $V_t = kT/q$. With Boltzmann's constant $k$ and the elementary charge $q$, $V_t = 25.8$ mV at room temperature, $T = 300$ K. Though equation (2) does not take into account all physical effects, nor the nonmonotonic behaviour seen in certain cases [11], it provides sufficient insight for a brief analysis of many subthreshold circuits. When $V_{ds} \ge 4V_t$, ignoring the Early effect, we obtain:

$$I_{ds} \approx I_0 \exp\left\{\frac{\kappa V_{gs}}{V_t}\right\} \exp\left\{(1-\kappa)\frac{V_{bs}}{V_t}\right\}. \tag{3}$$

It is clear that the substrate voltage is able to control the drain-source current to a significant extent, which is often exploited through body biasing techniques.

#### III. RESULTS

#### A. Spice Simulations for the New Memory Having Functionality Similar to the SR Latch Based on NOR Gates

The memory in Figure 1 is simulated in Figure 5, for a supply voltage of 180 mV. In this transient simulation, time runs from 0 to 10 $\mu s$. It was implemented in a triple-well 90 nm CMOS process available from CMP [12], using the mirrored gate [4] and two standard inverters.
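The subthreshold model of Section II-C, equations (2) and (3), can be checked numerically. The Python sketch below computes the thermal voltage at 300 K and the ratio between equation (2) without the Early term and equation (3) at $V_{ds} = 4V_t$. The value of $I_0$ is a hypothetical placeholder, $\kappa = 0.7$ is taken from the quoted range, and the function names are assumptions introduced for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def thermal_voltage(temperature_k: float) -> float:
    """V_t = kT/q."""
    return K_BOLTZMANN * temperature_k / Q_ELECTRON


def ids_subthreshold(i0, kappa, v_gs, v_bs, v_ds, v_t, v_early=None):
    """Equation (2); the Early term V_ds/V_0 is omitted when v_early is None."""
    drain_term = 1.0 - math.exp(-v_ds / v_t)
    if v_early is not None:
        drain_term += v_ds / v_early
    gate_term = math.exp(kappa * v_gs / v_t)
    body_term = math.exp((1.0 - kappa) * v_bs / v_t)
    return i0 * gate_term * body_term * drain_term


def ids_saturated(i0, kappa, v_gs, v_bs, v_t):
    """Equation (3): approximation for V_ds >= 4 V_t, Early effect ignored."""
    return i0 * math.exp(kappa * v_gs / v_t) * math.exp((1.0 - kappa) * v_bs / v_t)


v_t = thermal_voltage(300.0)
print(f"V_t at 300 K: {v_t * 1e3:.2f} mV")  # V_t at 300 K: 25.85 mV

I0 = 1e-9     # hypothetical zero-bias current in A (illustration only)
KAPPA = 0.7   # lower end of the 0.7-0.75 range quoted in the text
V_GS = 0.18   # V, equal to the 180 mV supply used in the simulations

full = ids_subthreshold(I0, KAPPA, V_GS, 0.0, 4.0 * v_t, v_t)
approx = ids_saturated(I0, KAPPA, V_GS, 0.0, v_t)
ratio = full / approx
print(f"(2)/(3) at V_ds = 4 V_t: {ratio:.4f}")  # (2)/(3) at V_ds = 4 V_t: 0.9817
```

At $V_{ds} = 4V_t$ the drain term equals $1 - e^{-4}$, so equation (3) overestimates the current of equation (2) by less than 2 % when the Early effect is ignored.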
The lengths of all 7 PMOS and 7 NMOS transistors were 100 nm, while the widths of the PMOS and NMOS transistors were 650 nm and 120 nm, respectively. The p-wells and n-wells were shorted and biased at $V_{dd}/2$ [13]. Simulations for the standard latch are included in Figure 5 so that the functionalities can be compared. The Q and Q' outputs for the standard latch can be seen as signals number 3 ("Q\_SRNOR2") and 5 ("QN\_SRNOR2") from the top. Correspondingly, for the proposed memory element, the voltages at the corresponding output nodes are shown as signals number 4 ("Q\_SR") and 6 ("QN\_SR") from the top. Initially both S (uppermost) and R (2nd signal) are 0. When S goes high, the Q output is set. The next time S changes it goes low, while Q remains at 1. When R goes high, Q is reset, and it stays at 0 after the next event, when R returns to 0. Later both S and R go high, and the NOR-based latch (Figure 2) produces a 0 on both Q and Q'. This demonstrates the 5 states in Table I. Comparing the standard NOR-based latch to the corresponding circuit using the minority-3 gate demonstrates the same behaviour except for the case where S = R = 1. In that case the latter circuit computes opposite binary values for Q and Q' (signals number 4 and 6 from the top), unlike the SR latch using cross-coupled NOR gates.

Fig. 5. Signals, from top: S, R, Q\_SRNOR2, Q\_SR, QN\_SRNOR2, QN\_SR. Steady state logic 1 signals are close to 180 mV.

A comparison for a supply voltage of 120 mV and the four allowed states in Table I may be found in Figure 6 and Table III. 100 Monte Carlo simulations ("process and mismatch") at 20 °C were performed. The circuits were regarded as functional if they could produce correct logic 1 signals above 0.75 times $V_{dd}$, and logic 0 signals below 0.25 times $V_{dd}$. Transistor sizing was as before, except that "svt" transistors were used, and PMOS widths were 460 nm.

Fig. 6.
Signals, from top: Q' and Q for the traditional SR latch, Q' and Q for the proposed circuit, R and S.

TABLE III: Monte Carlo simulations. Percentage of functional circuits.

| $V_{dd}$ (mV) | Traditional SR latch | Minority-3 SR latch |
|-----|----------------------|---------------------|
| 180 | 100 % | 100 % |
| 140 | 100 % | 100 % |
| 130 | 97 % | 99 % |
| 120 | 91 % | 99 % |
| 110 | 83 % | 92 % |
| 100 | 66 % | 87 % |

#### B. Spice Simulations for the New Memory Having Functionality Similar to the SR Latch Based on NAND Gates

The SR latch based on NAND gates, depicted in Figure 2, had its threshold gate counterpart implemented exactly like the previous circuit shown in Figure 1, except that the R input was no longer inverted prior to the minority-3 gate; instead, an inverter was placed between the S input and the minority-3 gate. The PMOS and NMOS dimensions were the same as for the implementation of the new memory element in Figure 1. Simulations for comparison to the function table in Table II are shown in Figure 7. The supply voltage was 180 mV. The input signals S and R are 1st and 2nd from the top. The NAND-based latch has its Q output as the third signal from the top ("Q\_SRNAND2"), and Q' as the fifth ("QN\_SRNAND2") in Figure 7. The corresponding latch based on the minority gate and inverters has the Q and Q' signals as number 4 ("Q\_SR") and 6 ("QN\_SR") from the top, respectively. The S and R signals are initially 1 and 0, producing Q=0 and Q'=1 for both latches. Afterwards R goes high, and Q and Q' remain unchanged, which corresponds to the 2nd case in Table II. Later S goes low, switching the Q and Q' nodes for both implementations, corresponding to the 3rd case in Table II. When S goes high again, and R remains high, all four outputs remain at the same steady states. This means that there is a match between the simulations and the function table for the allowed combinations of inputs under normal operation [1].

Fig. 7. Signals, from top: S, R, Q\_SRNOR2, Q\_SR, QN\_SRNOR2, QN\_SR.
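The comparison with Table II can also be reproduced with a small behavioural model. The Python sketch below is a purely logical model, without delays or voltages. It assumes the structure described above: Q' is the minority-3 function of the inverted S input, the R input and the fed-back Q, and Q is obtained through an inverter. The function names and the applied input sequence are chosen here for illustration.

```python
def inv(a):
    """Logical inverter on a 0/1 value."""
    return 1 - a


def nand2(a, b):
    """Two-input NAND on 0/1 values."""
    return 1 - (a & b)


def minority3(a, b, c):
    """Output is 1 when at most one of the three inputs is 1."""
    return 1 if a + b + c <= 1 else 0


def nand_latch(s, r, q, qn):
    """Settle a cross-coupled NAND SR latch for fixed inputs s and r."""
    for _ in range(4):
        q_new = nand2(s, qn)
        qn_new = nand2(r, q_new)
        if (q_new, qn_new) == (q, qn):
            break
        q, qn = q_new, qn_new
    return q, qn


def minority_latch(s, r, q):
    """Settle the assumed minority-3 variant: Q' = min3(S', R, Q), Q = (Q')'."""
    for _ in range(4):
        qn = minority3(inv(s), r, q)
        if inv(qn) == q:
            break
        q = inv(qn)
    return q, qn


# Input sequence (S, R): reset, hold, set, hold, then both inputs at 0.
SEQUENCE = [(1, 0), (1, 1), (0, 1), (1, 1), (0, 0)]


def run_sequence(sequence):
    """Return the (Q, Q') pairs of both latches after each input change."""
    nand_state = (0, 1)
    min_q = 0
    nand_trace, minority_trace = [], []
    for s, r in sequence:
        nand_state = nand_latch(s, r, *nand_state)
        min_q, min_qn = minority_latch(s, r, min_q)
        nand_trace.append(nand_state)
        minority_trace.append((min_q, min_qn))
    return nand_trace, minority_trace


if __name__ == "__main__":
    nand_trace, minority_trace = run_sequence(SEQUENCE)
    print("S R | NAND (Q, Q') | minority-3 (Q, Q')")
    for (s, r), n, m in zip(SEQUENCE, nand_trace, minority_trace):
        print(f"{s} {r} | {n} | {m}")
```

In this model the two latches agree for every allowed input combination and differ only for S = R = 0, where the NAND latch drives both outputs to 1 while the minority-3 variant keeps complementary outputs.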
For the NAND-based latch the simulated results show that for S=R=0, both Q and Q' stabilize at 1, as can be seen in the 3rd and 5th signals in Figure 7. This is not the case for the minority-3 gate implementation, where S=R=0 results in Q=1 and Q'=0 (Figure 7).

#### C. Analysis of Behaviour for the First New Memory Element

The analysis is based on methods presented in [1]. The transition table including the *secondary variable* y and the *excitation variable* Y = Q is shown in Figure 8, and was obtained by analyzing the schematics in Figure 1. *Stable* states are circled. The difference from the table for the NOR2-based SR latch is that the traditional circuit has an unstable output for SRy = 111.

| y | SR = 00 | SR = 01 | SR = 11 | SR = 10 |
|---|----|----|----|----|
| 0 | 0 | 0 | 0 | 1 |
| 1 | 1 | 0 | 1 | 1 |

Fig. 8. Y = SR'+R'y+Sy.

The behaviour of the latch can be investigated from the transition table. With SR = 10, the output Q = Y = 1 and the latch is said to be *set*. Changing S to 0 leaves the circuit in the set state. With SR = 01, the output Y = Q = 0 and the latch is said to be *reset*. A change of R back to 0 leaves the circuit in the reset state. From the transition table one can also see that going from SR=11 to SR=00 produces an unpredictable result, depending on whether S or R goes to 0 first. If S goes to 0 first, the output remains at 0, but if R goes to 0 first, the output goes to 1. This behaviour is shared with the traditional NOR-based SR latch. For the new circuit, if it is in the stable state SRy = 000, changing SR from 00 to 01 to 11 will make the circuit end up with a stable 0 output, while 00 to 10 to 11 gives a stable 1 output. Starting instead from the stable state SRy = 001, the sequence SR = 00 - 01 - 11 makes the Q output stabilize at 0, while the sequence SR = 00 - 10 - 11 leads to a stable Y = Q = 1.

#### D.
D-flip-flop

A D-flip-flop making use of the new memory element in Figure 1 was made. It consists of 2 inverters and 2 D-latches, and is depicted in Figure 9. The D-latch embedded in the D-flip-flop in Figure 9 is shown in Figure 10. The three rightmost circuit symbols in Figure 10 and their interconnect make up the structure in Figure 1. The three leftmost circuit symbols represent two NAND2 functions and one INV. The 2 minority-3 gates in the D-latch, each having one of their inputs shorted to ground, implement the NAND2 function, as in [14]. The NAND2 with its output connected to the R input (with reference to Figure 1) has D (Data) and CLK (clock signal) as inputs, while the S input receives D' and CLK from its corresponding NAND2.

Fig. 9. D flip-flop made from two inverters and two D-latches.

Fig. 10. D-latch structure.

Simulated results for the D-latch and the D-flip-flop, using the same transistor dimensions as previously mentioned, but lvt transistors, are shown in Figure 11. A layout for the D-flip-flop from a prototype chip in 90 nm CMOS submitted in March 2009 is shown in Figure 12. The width and length of the D-flip-flop are approximately 26.8 $\mu m$ times 11.95 $\mu m$. Chip measurements for a supply voltage of 160 mV (layout in Figure 12) are included in Figure 13.

Fig. 11. From top: CLK, D, Q (D-latch), Q (D-flip-flop).

Fig. 12. Layout for the D-flip-flop.

Fig. 13. The clock signal (100 kHz) and the output (uppermost) are shown, for a supply voltage of 160 mV and input frequency of about 50 kHz.

#### IV. DISCUSSION

#### A. Regarding the analysis of the first memory element

As seen in the previous section the circuit may end up in different stable states if the SR inputs change from 00 to 11, or 11 to 00, too close in time.
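The input sequences discussed in the analysis can be traced mechanically from the excitation function Y = SR' + R'y + Sy of Figure 8. The following Python sketch is a simple illustration that assumes the circuit settles completely before the next single-input change; the helper names are chosen here and do not come from [1].

```python
from itertools import product


def excitation(s, r, y):
    """Excitation variable Y = SR' + R'y + Sy from Figure 8."""
    not_r = 1 - r
    return (s & not_r) | (not_r & y) | (s & y)


def settle(s, r, y):
    """Hold the inputs fixed and iterate until y equals Y (a stable state)."""
    for _ in range(4):
        nxt = excitation(s, r, y)
        if nxt == y:
            return y
        y = nxt
    raise RuntimeError("no stable state reached")


def apply_sequence(y, sequence):
    """Apply (S, R) input pairs one after another, settling after each."""
    for s, r in sequence:
        y = settle(s, r, y)
    return y


def stable_states():
    """All total states (S, R, y) for which the excitation equals y."""
    return [(s, r, y) for s, r, y in product((0, 1), repeat=3)
            if excitation(s, r, y) == y]


if __name__ == "__main__":
    print("stable (S, R, y):", stable_states())
    for start in (0, 1):
        via_01 = apply_sequence(start, [(0, 1), (1, 1)])
        via_10 = apply_sequence(start, [(1, 0), (1, 1)])
        print(f"start y={start}: 00-01-11 -> {via_01}, 00-10-11 -> {via_10}")
    # Leaving SR = 11: the result depends on which input falls first.
    s_first = apply_sequence(0, [(1, 1), (0, 1), (0, 0)])
    r_first = apply_sequence(0, [(1, 1), (1, 0), (0, 0)])
    print(f"11 -> 00 with S first: {s_first}, with R first: {r_first}")
```

The final state depends only on the order in which the two inputs change, which is the reason simultaneous changes of S and R lead to an outcome that cannot be predicted from the logic alone.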
Similar situations may appear in many asynchronous circuits, which means that only one input variable may change at any one time, and that the time between two input changes must be longer than the time it takes the circuit to reach a stable state. This is called operating in *fundamental mode* [1]. The transition table in Figure 8 has no columns containing unstable states only, meaning that no fixed combination of S and R inputs leaves y and Y with permanently different binary values. Therefore the circuit will be stable, and not oscillate. The SR latch with NOR gates, shown in Figure 2, violates the requirement that the outputs are the complement of each other when S = R = 1, as can be seen in Table I, so this condition is avoided under normal operation [1]. The circuit in Figure 1 does not have this problem, as Q is logically the inverted Q' signal. The standard SR latch based on NAND gates must, in a similar way, avoid simultaneous 0's on the S and R inputs, to keep up the abstraction $Q \neq Q'$.

#### B. Verifying SR latch, D-latch and D-flip-flop

The functionalities of the standard SR latches using NOR or NAND gates were demonstrated, as well as those of the new memory elements. The use of the circuit in Figure 1 in a D-flip-flop and a D-latch (Figures 9 and 10) was demonstrated by simulations in Figure 11. Monte Carlo simulations across temperature would provide results regarding the robustness of the basic memory elements as well as of the D-latch and D-flip-flop exploiting them. The Monte Carlo simulations demonstrated that both the traditional SR latch and the circuit containing the minority-3 gate had a 100 % "yield" at a supply voltage of 140 mV and above. Similar numbers for static CMOS and the "mirrored" minority-3 gate have been reported in [5].

#### C.
General digital systems / Finite State Machines may be built entirely from minority-3 gates and inverters

Minority-3 gates and inverters have been used as the only basic building blocks for 32-bit serial ripple carry and Kogge-Stone adders [22]. The minority-3 gate function in combination with inverters is sufficient to implement any basic Boolean function like NAND, NOR, AND, OR, INV, BUF, EXOR and EXNOR. In this paper we have demonstrated how minority-3 gates and inverters provide enough functionality to build latches and flip-flops. Therefore, minority-3 gates and inverters are sufficient to build any Finite State Machine, in principle.

#### D. Circuit complexity and regularity

The circuit may have a relatively symmetric and regular layout, as it contains minority-3 gates and inverters only, providing symmetry between the PMOS and NMOS transistors. This may also be reflected in the name of the mirrored gate [4]. Such symmetry is not found in traditional Boolean logic, including for example static CMOS NAND2 and NOR2 circuits. Regularity in layout may be favourable for good matching [15], [16], which is especially important in subthreshold operation [5], due to the many exponential dependencies controlling for example the drain currents, as may be seen from equation (2). In general, it is better to have circuits with a relatively small number of transistors for a given function, to reduce area, price and power consumption. The well-known basic NAND D-flip-flop, which may be found in [17], contains 8 NAND gates and 2 inverters and may be said to resemble the D-flip-flop presented here. In a straightforward static CMOS implementation this could lead to 8 times 4 plus 2 times 2 transistors, for a total of 36 transistors. The D-flip-flop in Figure 9 includes 6 minority-3 gates and 7 inverters, for a total of $6 \times 10 + 7 \times 2 = 74$ transistors.
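Two statements from this discussion lend themselves to a quick check: that a minority-3 gate with one input tied to a constant behaves as a two-input Boolean gate, and the transistor counts quoted above. The Python sketch below is illustrative only. The per-gate transistor counts (4 for a static CMOS NAND2, 2 for an inverter, 10 for the mirrored minority-3 gate) are the ones implied by the arithmetic in the text, the NOR case (one input tied to logic 1) is a standard property that the text does not state explicitly, and all names are chosen here.

```python
from itertools import product


def minority3(a, b, c):
    """Output is 1 when at most one of the three inputs is 1."""
    return 1 if a + b + c <= 1 else 0


def nand2_from_minority(a, b):
    """NAND2 obtained by tying the third input to ground (logic 0)."""
    return minority3(a, b, 0)


def nor2_from_minority(a, b):
    """NOR2 obtained by tying the third input to the supply (logic 1)."""
    return minority3(a, b, 1)


# Transistors per gate, as implied by the counts in the text.
TRANSISTORS = {"nand2": 4, "inv": 2, "minority3_mirrored": 10}


def transistor_count(gate_counts, transistors_per_gate=TRANSISTORS):
    """Total transistor count for a dictionary {gate type: number of gates}."""
    return sum(n * transistors_per_gate[gate] for gate, n in gate_counts.items())


BASIC_NAND_DFF = {"nand2": 8, "inv": 2}
MINORITY_DFF = {"minority3_mirrored": 6, "inv": 7}

if __name__ == "__main__":
    for a, b in product((0, 1), repeat=2):
        print(a, b, "NAND:", nand2_from_minority(a, b),
              "NOR:", nor2_from_minority(a, b))
    print("basic NAND D-flip-flop:", transistor_count(BASIC_NAND_DFF))
    print("minority-3 D-flip-flop:", transistor_count(MINORITY_DFF))
```

Changing the entries of the `TRANSISTORS` dictionary gives the corresponding totals for other minority-3 gate implementations.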
If the 6-transistor gate from [18] were used the number of transistors could be reduced to 50, but at the expense of increased power consumption and less tolerance for parameter variations [5]. Floating-gate, or pseudo floating-gate, implementations could reduce the transistor count to 2 per minority-3 gate [14], giving 28 transistors for the D-flip-flop, but would probably not be robust and useful for contemporary and future state-of-the-art CMOS technologies [19]. In any case, the minority-3 gate implementations contain a higher number of transistors than, for example, the PowerPC603 D-latch from [20] with 16 transistors, as well as the 10-transistor version in [21]. The SR latches based on NAND2 or NOR2 gates would in static CMOS implementations normally contain 8 transistors each. Comparing the suggested memory elements with these gives a lower relative penalty than for the D-flip-flops: a total of 14 transistors when using the mirrored gate from [4], 10 transistors when using the one from [18], or 16 transistors when building the basic functionality with the 12-transistor minority-3 gate in [22]. If automatic body bias / substrate biasing is applied, it should not be necessary to bias individual types of gates differently, as one could limit the types of gates to minority-3 gates and inverters. One or more substrate voltages could then be identical for the minority-3 gates. Biasing of inverters might not be necessary, as the inverter is regarded as the static CMOS circuit that tolerates the lowest $V_{dd}$ [23].

#### E. Reliability and small fan-in threshold gates

Some recent studies have suggested that certain circuits exploiting threshold gates with fan-in 3 are more defect tolerant and reliable than traditional Boolean implementations [24], [25]. If minority-3 gates should prove to have an advantage in this respect, it could possibly make them more widespread for implementations of general digital systems as well as neural networks.

#### V.
CONCLUSION

Two new memory circuits containing minority-3 gates and inverters are introduced. Basic functionalities are analyzed and simulated for subthreshold operation, in 90 nm CMOS. The basic circuits have been demonstrated when used in a D-latch and a D-flip-flop, for supply voltages around 180 mV. The basic memory element produced adequate digital outputs for supply voltages of 140 mV and 180 mV for 100 % of the Monte Carlo runs over process and mismatch. A D-flip-flop exploiting the proposed memory structure has been demonstrated by chip measurements, for a supply voltage of 160 mV. Implementing many basic logic and memory functions with minority-3 gates appears to require a significantly higher number of transistors than with traditional Boolean gates. Whether minority-3 gates should be preferred over traditional Boolean gates for general digital circuits in future nanoscale CMOS depends on whether they offer any competitive advantages in terms of, for example, robustness to parameter variations and defects.

#### REFERENCES

- [1] M. Mano, Digital Design, 3rd ed., Prentice-Hall, 2002.
- [2] C. Lageweg, S. Cotofana, S. Vassiliadis, Single Electron Encoded Latches and Flip-Flops, IEEE Trans. on Nanotechn., No. 2, Jun. 2004, pp. 237-248.
- [3] V. Beiu, J. M. Quintana, M. J. Avedillo, VLSI Implementations of Threshold Logic - a comprehensive survey, IEEE Transactions on Neural Networks, No. 5, Jun. 2003, pp. 1217-1243.
- [4] D. Hampel, K. J. Prost, N. R. Scheinberg, Threshold Logic Using Complementary MOS Device, U. S. Patent 3 900 742, Jun. 24, 1974.
- [5] S. Aunet, H. K. Otnes Berge, Statistical Simulations for Exploring Defect Tolerance and Power Consumption in 4 Subthreshold 1-bit Addition Circuits, Lecture Notes in Comp. Sci., No. 4507, Jun. 2007, pp. 455-462.
- [6] F. Leuenberger, E. Vittoz, Complementary-MOS low-power low-voltage integrated binary counter, IEEE Journal of Solid-State Circuits, No. 9, Sep. 1969, pp. 1528-1532.
- [7] H.
Soeleman, K. Roy, B. C. Paul, Robust Subthreshold Logic for Ultra-Low Power Operation, IEEE Transactions on Very Large Scale Integration Systems, Vol. 9, Feb. 2001, pp. 90-99.
- [8] E. J. Nowak, Maintaining the benefits of CMOS scaling when scaling bogs down, IBM J. of Res. and Developm., Mar./May 2002, pp. 169-180.
- [9] W. S. McCulloch, W. Pitts, A logical calculus of the ideas immanent in nervous activity, Bull. Math. Biophys., Vol. 5, 1943, pp. 115-133.
- [10] A. G. Andreou, K. A. Boahen, P. O. Pouliquen, A. Pavasovic, R. E. Jenkins, K. Strohben, Current-Mode Subthreshold MOS Circuits for Analog VLSI Neural Systems, IEEE Trans. on Neural Networks, No. 2, Mar. 1991, pp. 205-213.
- [11] B. H. Calhoun, A. Wang, A. Chandrakasan, Device Sizing for Minimum Energy Operation in Subthreshold Circuits, Proc. of IEEE Custom Integrated Circuits Conference, 2004, Vol. 3, pp. 95-98.
- [12] Circuits Multi Projets, CMOS090 process: http://www.imag.fr/products/ic
- [13] A. Bryant, J. Brown, P. Cottrell, M. Ketchen, J. Ellis-Monaghan, E. J. Nowak, Low Power CMOS at Vdd=4kT/q, Proceedings of Device Research Conference, Notre Dame, IN, USA, Jun. 25-27, 2001, pp. 22-23.
- [14] S. Aunet, Y. Berg, T. Sæther, Real-Time Reconfigurable Linear Threshold Elements Implemented in Floating-Gate CMOS, IEEE Trans. on Neural Networks, No. 5, Sep. 2003, pp. 1244-1256.
- [15] B. P. Wong, A. Mittal, Y. Cao, G. Starr, Nano-CMOS Scaling Problems and Implications, in Nano-CMOS Circuit and Physical Design, Wiley, 2005, pp. 1-23.
- [16] L. Lewyn, T. Ytterdal, C. Wulff, K. Martin, Analog Circuit Design in Nanoscale CMOS Technologies, Proceedings of the IEEE, No. 10, Oct. 2009, pp. 1687-1714.
- [17] H. P. Alstad, S. Aunet, Seven Subthreshold Flip-Flop Cells, Proc. of IEEE Norchip Conference, Aalborg, Denmark, 2007.
- [18] S. Aunet, Kretselement, Norwegian patent 320344, application 20035537, Dec. 11, 2003.
- [19] J.
Alfredsson, Limitations of Subthreshold Digital Floating-Gate Circuits in Present and Future Nanoscale CMOS Technologies, Dissertation for the degree Technologie Doktor, Mid-Sweden University, Sweden, 2008.
- [20] H. P. Alstad, S. Aunet, Three Subthreshold Flip-Flop Cells Characterized in 90 nm and 65 nm CMOS Technology, Proc. of Design and Diagnostics of Electronic Circuits and Systems, Bratislava, Slovakia, 2008.
- [21] S. Fisher, A. Teman, D. Vaysman, A. Gertsman, O. Yadid-Pecht, A. Fish, Ultra-Low Power Subthreshold Flip-Flop Design, Proc. of IEEE Int'l Symp. on Circuits and Systems, Taiwan, 2009, pp. 1573-1576.
- [22] S. Aunet, Subthreshold Minority-3 Gates and Inverters used for 32-bit Serial and Parallel Adders Implemented in 90 nm CMOS, Proc. of IEEE Norchip Conference, Trondheim, Norway, 2009.
- [23] G. Schrom, S. Selberherr, Ultra-Low-Power CMOS Technologies, Proc. of International Semiconductor Conference, Sinaia, Oct. 1996, pp. 237-246.
- [24] S. Roy, V. Beiu, Majority Multiplexing Economical Redundant Fault-Tolerant Designs for Nanoarchitectures, IEEE Transactions on Nanotechnology, Vol. 4, Jul. 2005, pp. 441-451.
- [25] W. Ibrahim, V. Beiu, M. H. Sulieman, On the Reliability of Majority Gates Full Adders, IEEE Trans. Nanotechn., Vol. 7, Jan. 2008, pp. 56-67.

## 6.1.3 Paper III: Low-Power Subthreshold to Above Threshold Level Shifters in 90 nm and 65 nm Process

Invited paper to Microprocessors and Microsystems: Embedded Hardware Design (MICPRO). Submitted.

#### Low-Power Subthreshold to Above Threshold Level Shifters in 90 nm and 65 nm Process

Amir Hasanbegovic<sup>a</sup>, Snorre Aunet<sup>b</sup>

<sup>a</sup>Email: amirh@ifi.uio.no

<sup>b</sup>Email: sa@ifi.uio.no

Department of Informatics, University of Oslo

#### Abstract

In this paper we present low power level shifters in the 90 nm (general purpose) and 65 nm (low power) technology nodes capable of converting subthreshold voltage signals to above threshold voltage signals.
The level shifters make use of the MTCMOS design technique, which gives more design flexibility, especially in low power systems. Post layout simulations indicate static power consumption down to 1 nW and 83 pW in the 90 nm and 65 nm process respectively. Energy consumption per transition is recorded to be below 30 fJ in both processes, orders of magnitude lower than other published level shifter implementations. Propagation delay is found to be as low as 32 ns for subthreshold logic high input signals of 180 mV. The functionality of the level shifters is verified across process, mismatch and temperature variations between -40 °C and 150 °C. The minimum input voltage attainable while maintaining robust operation is found to be around 180 mV at operational frequencies above 1 MHz in the 90 nm process, and 350 mV at operational frequencies above 500 kHz in the 65 nm process. The level shifters employ an enable/disable feature, allowing for power saving when the level shifter is not in use.

Keywords: Level shifter, low power, subthreshold operation, MTCMOS, dynamic voltage scaling

#### 1. Introduction

Recently there has been a lot of focus on low power electronics capable of maintaining acceptable performance requirements in terms of speed and power consumption. The most effective means of reducing the power consumption of an integrated circuit (IC) is to reduce the supply voltage. The reduction in supply voltage contributes to a dramatic decrease in dynamic and static power consumption; however, the use of a low supply voltage has a negative impact on the speed of the circuit [1]. In order to reduce power consumption while limiting the sacrifice of speed, multiple voltage domains may be implemented on the same IC. By doing so, less critical sections of the circuit may be supplied by a low supply voltage, VDDL, while critical sections are supplied by higher supply voltages, VDDH.
In order to connect the different voltage domains in an effective way, the use of level shifters for interfacing is vital for minimizing delay and power consumption. The different voltage domains may also be represented by digital logic (VDDL) interfaced with I/O buffers (VDDH) via level shifters, which is important for effective off-chip communication. Figure 1(a) illustrates a typical application area for level shifters, and Figure 1(b) shows the transient response of a previously fabricated level shifter<sup>1</sup> which is based on a different circuit topology. For the sections of the IC where speed is of less importance, the supply voltage may be reduced down to the subthreshold region with the goal of saving as much power as possible. Subthreshold operation has the potential to contribute considerable power savings, which may highly benefit modern portable devices.

Level shifters converting subthreshold signals to above threshold signals are most likely to have transistors operating in the subthreshold region. Transistors operating in the above threshold region have a supply voltage above their inherent threshold voltage $|V_{th}|$. However, when the supply voltage is reduced below the threshold voltage, $V_{dd} < V_{th}$, the transistor is considered to be operating in the subthreshold region. Transistors operating in the subthreshold region have an exponential reduction in $I_{ds}$ when $V_{gs} < V_{th}$. $I_{Dsub}$ is given by [2]

$$I_{Dsub} = I_0 e^{\frac{V_{gs} - V_T}{nV_{th}}} \left( 1 - e^{\frac{-V_{ds}}{V_{th}}} \right) \tag{1}$$

where $I_0$ is the drain current when $V_{gs} = V_T$

$$I_0 = \mu_0 C_{ox} \frac{W}{L} (n-1) V_{th}^2 \tag{2}$$

In equations (1) and (2), $V_T$ is the threshold voltage, n is the subthreshold slope factor, and $V_{th}$ denotes the thermal voltage $kT/q$, not the threshold voltage referred to as $|V_{th}|$ in the text above.

Circuits employing transistors in subthreshold are prone to design challenges and reliability issues. As the temperature decreases the drive strength of the transistor is substantially weakened, especially for transistors in subthreshold.
This presents a problem particularly in the cases where transistors operating in the subthreshold region are stacked in the same signal path as transistors operating in the above threshold region. In the opposite case, when the temperature increases, the drive strength of the transistors increases due to increased carrier mobility, resulting in high leakage currents [3].

<sup>1</sup>Measurements from May 12, 2010, of a level shifter fabricated in a 90 nm process from STMicroelectronics.

Figure 1: Typical level shifter application area, and transient response. (a) Typical application for the level shifters. (b) Measured transient response of a level shifter interfacing two voltage domains, VDDL = 160 mV and VDDH = 1.2 V.

#### 2. Conventional level shifter

Figure 2: Dual Cascode Voltage Switch

The conventional level shifter, shown in Figure 2, can be implemented as an interface between two voltage domains as long as the input voltage is above the threshold voltage, $V_{th}$, of Mn1 (and Mn2). The level shifter has the following operational behavior: When the input goes from a logic low to a logic high, Mn1 is turned on and Mn2 is turned off. Then, the voltage at *node* nA is pulled towards ground due to the conducting path established by Mn1. If the voltage at node nA reaches $(V_{dd}-V_{th(Mp2)})$, the positive feedback is triggered as Mp2 turns on and pulls node nB high. The input has then been shifted from a lower voltage level to a higher voltage level, through the output inverter. The circuit behavior is inverted for the opposite input case. The voltage shift can be completed only if the pull-up/pull-down ratio is close to unity. In other words, the pull-up strength has to be close or equal to the pull-down strength. If the pull-up/pull-down ratio is not close to unity, contention will take place between the pull-up transistors (Mp1 and Mp2) and pull-down transistors (Mn1 and Mn2), which will increase delay and power consumption [4].
This contention worsens when the input signal approaches subthreshold. For subthreshold voltages on the input, the drive strength of Mn1 may not be able to overcome the drive strength of Mp1. Hence, node nA cannot be pulled down, and the positive feedback cannot be triggered, making the conventional level shifter impractical for subthreshold conversion. In order to enable subthreshold conversion, the pull-up/pull-down ratio has to be equalized.

#### 3. Proposed level shifters in 90 nm and 65 nm process

Figure 3: Proposed circuit topology of the level shifters

The proposed level shifter topology (MDCVS), shown in Figure 3, is designed in a 90 nm general purpose process from STMicroelectronics and utilizes the multi-threshold voltage CMOS (MTCMOS) design technique. Low-threshold transistors ("lvt") are placed where speed is of importance at the expense of leakage current, and high-threshold transistors ("hvt") are placed where leakage current can be reduced at the expense of speed. In addition to the low-$V_{th}$ and the high-$V_{th}$ transistors, a standard threshold transistor ("svt") is provided in the library, which in itself presents a tradeoff between leakage and speed. The use of multi-$V_{th}$ transistors enables us to find a good tradeoff between static power consumption, dynamic power consumption and propagation delay. The proposed level shifter has been designed in two versions to satisfy the needs of both high speed (MDCVSHS) and low power (MDCVSLP) operation, with subthreshold operation taken into consideration.

The level shifter topology has also been implemented in a 65 nm *low power* process from STMicroelectronics (MDCVS65) and is based on the same topology as the level shifter in the 90 nm process. The level shifter works in the same manner as the level shifter in Figure 3. The main changes made in the 65 nm version are the choice of transistor types and sizes.

#### 3.1.
Modified Dual Cascode Voltage Switch (MDCVS)

The new level shifter topology is inspired by the DCVS circuit topology, and makes use of topological modifications in order to enable subthreshold to above threshold level conversion. The circuit exploits several techniques in order to limit the contention between pull-up and pull-down at nodes nA and nC. Primarily, the pull-down transistors (Mn1 and Mn2) are low threshold transistors, meaning they conduct more current at a given gate voltage. This increase in current makes it easier for the pull-down strength to follow the pull-up strength. However, the low threshold transistors are not enough to establish an even pull-up/pull-down ratio for subthreshold inputs. As proposed in [5], diode-connected PMOS transistors (Mp5 and Mp6) can be used to limit the pull-up strength of the two branches in the level shifter. When the level shifter is in steady state, $|V_{gs}|$ of the diodes is small and is equal to the diode voltage drop $V_{MPD}$. When the input signal switches, the diode voltage drop is kept stable, limiting the pull-up strength. However, in our design this configuration is taken one step further in an attempt to reduce energy consumption. Since the voltage at node nA only goes up to $V_{dd} - V_{MPD}$, and the voltage at node nC only goes down to $gnd + V_{MPD}$, Mp4 and Mn4 are never completely turned on. This results in a reduction of short circuit current, and may thereby contribute to a reduction of dynamic power consumption in the output inverter, given that the capacitive load is relatively low.

Another useful feature of the PMOS diodes is that they enable VDDL scaling, which is critical in adaptive voltage scaling systems [1]. When VDDL increases, the pull-down strength of the level shifter increases as a result of the increased sink current.
The increase in sink current contributes to a faster increase in voltage drop across the PMOS diodes during the transistor switching time, thereby maintaining the pull-up/pull-down ratio. An increase in both pull-up and pull-down strength results in a reduction of the propagation delay, which allows higher operational frequencies as VDDL scales up.

In order to achieve robust operation at lower temperatures Mp7 and Mp8 are added, which may increase or limit the drive strength of the pull-up transistors (Mp1 and Mp2) by sizing the transistors up or down respectively, enabling fine adjustments of the pull-up/pull-down ratio. Transistors Mp7 and Mp8 are biased in off-state, providing a leakage current in their respective branch. This configuration allows the pull-up strength of Mp1 and Mp2 to be controlled by the sizing of Mp7 and Mp8. By increasing (decreasing) the size of Mp7 and Mp8, the pull-up transistors will conduct more (less) current, thereby increasing (decreasing) the pull-up strength. This enables control over the rise and fall delay on the output by balancing the pull-up/pull-down ratio. Furthermore, as the temperature increases, the leakage current from Mp7 and Mp8 becomes larger, which in turn increases the pull-up strength of Mp1 and Mp2. Proper functionality is preserved because of the constant voltage drop over the PMOS diodes, limiting the pull-up strength at high temperatures as well.

By adding four additional sleep transistors, Mn5, Mn6, Mn7 and Mn8, the level shifter can be turned on or off (sleep-mode) by setting their gate to '1' or '0' respectively. The sleep configuration is implemented using high threshold NMOS transistors since the on-resistance of an NMOS is smaller than the on-resistance of a PMOS, given the same size [6]. When the level shifter is turned off, an isolation cell is needed to separate neighbouring cells supplied by different supply voltages. When EN is '0' (i.e.
the level shifter is in sleep-mode), the output node is pulled up to VDDH via Mp9, thereby avoiding an intermediate state on the output. Intermediate or floating outputs may cause a large static current path between power and ground, therefore a pull-up transistor is needed if the output is driving a transistor gate. The combination of the sleep and isolation configurations makes the level shifter power gating compliant while utilizing a single cell.

#### 4. Implementation considerations

Factors such as area, speed and power consumption are very important when it comes to effectively implementing level shifters in larger designs. Therefore, some implementation considerations need to be taken into account.

#### 4.1. Implementation of sleep transistors

The length and width of the sleep transistors will have a significant impact on the NMOS sink current. An increase in the sleep transistor length will increase the source voltage of the input NMOS transistors, Mn1 and Mn2 in Figure 3, due to an increase in the ON-resistance of the sleep transistor, thereby decreasing their $V_{gs}$ [7]. This decrease in $V_{gs}$ leads to a decrease in the sink current of the NMOS transistor, which contributes to lower power consumption, but also degrades the speed of the circuit, since the rise and fall transitions become slower. Therefore, the lengths of the sleep transistors should be set to meet a reasonable trade-off between speed and power.

In order to achieve an effective sleep-mode in a low power system, one should carefully choose the approach to realizing the sleep-mode configuration. As presented in [8], "sneak leakage" can have a substantial impact on the power consumption when in sleep-mode. In the proposed level shifter topology, sleep transistors have been implemented in an NMOS footer configuration due to the low ON-resistance of NMOS transistors.
Keeping this in mind, the circuitry the level shifter is interfacing should utilize the same sleep-mode configuration as the level shifter, i.e. an NMOS footer configuration, given that the circuitry is based on MTCMOS. An extended set of rules for avoiding sneak leakage may be found in [8]. A sleep-mode configuration comprising both footer and header transistors may be an unattractive approach due to a larger area overhead with little to gain in terms of leakage control [9]. Therefore, for this particular level shifter topology, a sleep-mode configuration that utilizes either a header or a footer configuration is sufficient.

#### 4.2. Level shifter layout

When designing a larger system that makes use of level shifters, the layout of the level shifter becomes an important factor. The proposed level shifters have been implemented using the same layout strategy as in [10]. The routing of the level shifters is realized using metal layers 1 and 2, thereby leaving the other metal layers available for interconnect on a higher abstraction level. The layout of the MDCVS65 is illustrated in Figure 4, and utilizes a dual-height cell approach, meaning the cell height of the level shifter is twice the cell height of a standard cell. The layout dimensions of the level shifters are 9.8 µm × 9.2 µm, 9.8 µm × 6.1 µm and 6.4 µm × 6.4 µm for MDCVSLP, MDCVSHS and MDCVS65 respectively.

Figure 4: Layout of MDCVS65

In Figure 4, two power supply lines, VDDL (bottom) and VDDH (top), are shown together with the GND line (middle). The advantage of a dual-height cell implementation of the proposed level shifter topology is the ability to utilize well-sharing. Well-sharing leads to a compact physical implementation by relieving the dimensional limitations related to physical layout constraints.
Although dual-height cells may imply some impracticalities when deployed in a single-height cell design, a streamlined interface between VDDL and VDDH domains can easily be accomplished by flipping layout cells. Figure 5 illustrates a physical layout strategy for interfacing VDDL and VDDH domains while exploiting the advantages of well-sharing. "Level shifter 1" and "Level shifter 2" can be put next to each other horizontally in order to comply with the cell height of two digital cells (denoted as "VDDL cell" and "VDDH cell" in Figure 5). By flipping "VDDL cell 2" across the x-axis with respect to "VDDL cell 1", the proper placement configuration is achieved for exploiting the advantages of well-sharing. For this particular case, "VDDL cell 1" and "VDDL cell 2" would be sharing the ground connection. Correspondingly, flipping "VDDL cell 3" across the x-axis with respect to "VDDL cell 2" (i.e. "VDDL cell 3" has the same orientation as "VDDL cell 1") enables "VDDL cell 3" to share its n-well with "VDDL cell 2". In the same way as for the digital cells, flipping the level shifters will also lead to well-sharing. "Level shifter 3" and "Level shifter 4" are flipped across the x-axis, enabling them to share an n-well with "Level shifter 1" and "Level shifter 2". By applying the necessary connections between the VDDL cells (in the VDDL domain) and the level shifters, the VDDH domain can easily be interfaced, given that the VDDH cells have the same orientation as the VDDL cells. This approach will, however, impose special placement and orientation rules on a placement tool.

#### 5. Circuit simulations

In this section we present the simulation results, focusing on the following design parameters: static power consumption, energy consumption, propagation delay and average power consumption. These design parameters are put in context with scaling of the lower supply voltage (VDDL).
The simulation results are based on post layout simulations at 27 °C with a 10 fF capacitive load and input signal rise and fall times of 10 ns. The rise and fall times of the input signal are chosen based on worst case Monte Carlo simulations of a minimum sized inverter under the same temperature conditions. The capacitive load is chosen arbitrarily, emulating a (large) buffer input or equivalent circuit. In most cases, however, the level shifter would drive smaller capacitive loads, such as close to minimum sized transistor gates. Simulations were run for 20 input periods to capture any behavioral irregularities in the circuit simulations that may corrupt the results. All simulation results are extracted from the Cadence® Virtuoso® Spectre® Circuit Simulator. Transistor types and sizes used for the simulations are shown in Table 1 for the MDCVSLP, Table 2 for the MDCVSHS and Table 3 for the MDCVS65.

#### 5.1. Simulation Results of 90 nm Level Shifter

Figure 6(a) and 7(a) show the static power consumption and the total energy consumption per transition as a function of VDDL for MDCVSHS and MDCVSLP respectively. Simulation results show that an increase in VDDL contributes to an increase in both static power and energy consumption when VDDL > $V_{tn}$. This trend is primarily due to transistors Mn1 and Mn2 being in the above threshold region and depends on their capability to sink current. When VDDL is scaled further up in the above threshold region, the static current in Mn1 and Mn2 is increased, which also increases the leakage energy. As indicated, the energy consumption is lowest when VDDL is around 200 mV to 300 mV. At this operating point, the combined contribution of the contention between pull-up and pull-down transistors and the leakage energy is at its lowest. However, the reduction of VDDL below $V_{tn}$ gives rise to crowbar currents due to increasing contention between pull-up and pull-down transistors.
The contention results in higher propagation delay and thereby an increase in the active energy consumption, since the active energy consumption is determined by the time period necessary to perform a voltage transition. This is also reflected in the average power consumption illustrated in Figure 6(b) and Figure 7(b), where the active energy is the dominating power consumption contribution when VDDL $< V_{tn}$.

Figure 5: Physical layout strategy for interfacing VDDL and VDDH domains

Figure 6: Level shifter (MDCVSHS) performance with varying VDDL when VDDH = 1 V and the input signal frequency is 1 MHz

When VDDL is in the subthreshold region, the propagation delay has an exponential dependency on VDDL, arising from the exponential behavior of $I_{ds}$ of Mn1, Mn2, Mn3 and Mp3. Figure 6(b) and Figure 7(b) show how the propagation delay varies with VDDL. The decrease in propagation delay with increasing VDDL confirms the level shifters' ability to work at higher operational frequencies as VDDL increases. The average power consumption increases with VDDL, where the main contribution is due to an increase in dynamic (switching) power consumption in the above threshold region.

Figure 7: Level shifter (MDCVSLP) performance with varying VDDL when VDDH = 1 V and the input signal frequency is 500 kHz

The robustness of the level shifters is verified with Monte Carlo simulations both for the case of MDCVSLP and MDCVSHS, for all the stated simulation conditions. Figure 8 illustrates the worst case simulation condition for the level shifters, being low temperature operation.

Figure 8: Level shifter (MDCVSHS) output when VDDL = 180 mV and VDDH = 1 V at -40 °C (300 post layout Monte Carlo simulations)

Proper operation is verified across a temperature range of -40 °C to 150 °C.
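The exponential dependence of the propagation delay on VDDL in the subthreshold region can be illustrated with a first-order model. The sketch below is not derived from the post layout simulations in this paper: it assumes the textbook subthreshold expression $I_{ds} \approx I_0 \exp((V_{gs} - V_{tn})/(n V_T))$ and the crude delay estimate $t \approx C \cdot (VDDH/2) / I_{ds}$. All parameter values (`i0`, `vtn`, `n`) are hypothetical and chosen only to show the trend.

```python
import math

K_B_OVER_Q = 8.617e-5  # Boltzmann constant divided by elementary charge, in V/K


def thermal_voltage(temp_c):
    """Thermal voltage V_T = kT/q in volts for a temperature in degrees Celsius."""
    return K_B_OVER_Q * (temp_c + 273.15)


def subthreshold_current(vgs, vtn=0.3, i0=1e-7, n=1.5, temp_c=27.0):
    """First-order subthreshold drain current (assumed model, hypothetical parameters).

    Only meaningful for vgs below vtn.
    """
    return i0 * math.exp((vgs - vtn) / (n * thermal_voltage(temp_c)))


def delay_estimate(vddl, vddh=1.0, c_load=10e-15):
    """Rough delay: time to move the load by VDDH/2 with the subthreshold sink current."""
    return c_load * (vddh / 2.0) / subthreshold_current(vddl)


if __name__ == "__main__":
    for vddl in (0.18, 0.22, 0.26):
        print(f"VDDL = {vddl:.2f} V -> estimated delay {delay_estimate(vddl):.3e} s")
```

In this model, every increase of VDDL by $n V_T \ln 10$ (roughly 90 mV with the assumed slope factor) reduces the estimated delay by a factor of ten, which is the qualitative behaviour described for the subthreshold region.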
Table 1: Transistor types and sizes for the MDCVSLP

| Transistor | Type | W/L (µm) | Transistor | Type | W/L (µm) |
|------------|------|-----------|------------|------|-----------|
| MN1 | lvt | 0.3/1 | MP2 | hvt | 0.12/0.15 |
| MN2 | lvt | 0.3/1 | MP3 | svt | 2/0.2 |
| MN3 | svt | 2/0.2 | MP4 | hvt | 0.6/0.2 |
| MN4 | hvt | 0.24/0.2 | MP5 | lvt | 0.12/0.15 |
| MN5 | hvt | 0.12/0.2 | MP6 | lvt | 0.12/0.15 |
| MN6 | hvt | 0.12/0.2 | MP7 | lvt | 6/0.1 |
| MN7 | hvt | 0.12/0.2 | MP8 | lvt | 6/0.1 |
| MN8 | hvt | 0.12/0.2 | MP9 | hvt | 0.6/0.1 |
| MP1 | hvt | 0.12/0.15 | | | |

Table 2: Transistor types and sizes for the MDCVSHS

| Transistor | Type | W/L (µm) | Transistor | Type | W/L (µm) |
|------------|------|-----------|------------|------|----------|
| MN1 | lvt | 0.12/0.2 | MP2 | hvt | 0.12/0.1 |
| MN2 | lvt | 0.12/0.2 | MP3 | lvt | 0.36/0.1 |
| MN3 | lvt | 0.12/0.1 | MP4 | hvt | 0.6/0.2 |
| MN4 | hvt | 0.12/0.1 | MP5 | lvt | 0.2/0.1 |
| MN5 | hvt | 0.12/0.2 | MP6 | lvt | 0.2/0.1 |
| MN6 | hvt | 0.12/0.2 | MP7 | lvt | 8/0.1 |
| MN7 | hvt | 0.12/0.2 | MP8 | lvt | 8/0.1 |
| MN8 | hvt | 0.12/0.2 | MP9 | hvt | 0.6/0.1 |
| MP1 | hvt | 0.12/0.15 | | | |

When the level shifters are put in sleep-mode, the MDCVSHS shows an average power consumption of 175 pW while the MDCVSLP shows 48 pW.

#### 5.2. Simulation Results of 65 nm Level Shifter

The simulation results of the 65 nm implementation are based on the same design parameters and simulation specifications as the simulation results in the 90 nm process.
The simulation results of the MDCVS65 show the same trends as the 90 nm implementation with respect to the design parameters. Since the MDCVS65 is based on the exact same topology as the level shifters in 90 nm, the same contention problems between pull-up and pull-down are observed in the subthreshold region. Figure 10(a) and (b) show the simulation results of the MDCVS65.

Table 3: Transistor types and sizes for the MDCVS65

| Transistor | Type | W/L (µm) | Transistor | Type | W/L (µm) |
|------------|-------|-----------|------------|-------|-----------|
| MN1 | svtlp | 0.2/0.4 | MP2 | hvtlp | 0.12/0.06 |
| MN2 | svtlp | 0.2/0.4 | MP3 | lvtlp | 0.36/0.1 |
| MN3 | svtlp | 0.8/0.08 | MP4 | hvtlp | 0.36/0.06 |
| MN4 | svtlp | 0.12/0.12 | MP5 | lvtlp | 0.3/0.2 |
| MN5 | hvtlp | 0.12/0.2 | MP6 | lvtlp | 0.3/0.2 |
| MN6 | hvtlp | 0.12/0.2 | MP7 | svt | 6/0.06 |
| MN7 | hvtlp | 0.12/0.2 | MP8 | svt | 6/0.06 |
| MN8 | hvtlp | 0.12/0.2 | MP9 | hvtlp | 0.5/0.5 |
| MP1 | hvtlp | 0.12/0.15 | | | |

We observe a dramatic decrease in static power consumption in the 65 nm implementation compared to the 90 nm implementation, as seen in Figure 10(a). The main reason for the low static power consumption is that the process is a *low power* process, which inherently provides low leakage transistor behavior. Due to $|V_{th}|$ being relatively higher in the 65 nm process compared to the 90 nm process, the contention currents become apparent when VDDL drops below approximately 475 mV. This is observed in the increase of the energy consumption when VDDL is around the threshold voltage, $V_{tn}$, of Mn1 and Mn2. The 65 nm implementation displays the same robustness qualities as seen in the 90 nm simulations of the level shifter topology. Figure 9 shows Monte Carlo simulations of the level shifter output at -40 °C. When put in sleep-mode, the MDCVS65 shows an average power consumption of 9 pW.
Figure 9: Level shifter (MDCVS65) output when VDDL = 350 mV and VDDH = 1.2 V at -40 °C (300 post layout Monte Carlo simulations)

Table 4: Level shifter comparison

| Design Parameter | DSLS2 [11] | DSLS2b [11] | CMLS [12] | LC [5] | MDCVSLP | MDCVSHS | MDCVS65 |
|-------------------------|---------------|---------------|--------------|--------------|---------------|---------------|---------------|
| Propagation delay | 125 ns | 110 ns | 50 ns | 10 µs | 120 ns | 32 ns | 64 ns |
| Energy cons. per trans. | 21.4 pJ | 21.8 pJ | 25 pJ | 8 nJ | 21 fJ | 17 fJ | 23 fJ |
| Static power cons. | n/a | n/a | 5 nW | n/a | 1 nW | 2.5 nW | 84 pW |
| VDDL/min. VDDL | 0.35 V/0.35 V | 0.35 V/0.35 V | 0.18 V/0.1 V | 0.2 V/0.13 V | 0.18 V/0.18 V | 0.18 V/0.18 V | 0.35 V/0.35 V |
| VDDH | 1.2 V | 1.2 V | 1.2 V | 1.8 V | 1 V | 1 V | 1.2 V |
| Process | SOI 0.25 µm | SOI 0.25 µm | 0.13 µm bulk | 0.18 µm bulk | 90 nm bulk | 90 nm bulk | 65 nm bulk |

Figure 10: Level shifter (MDCVS65) performance with varying VDDL when VDDH = 1.2 V and the input signal frequency is 500 kHz: (a) static power and energy consumption, (b) propagation delay and average power consumption

#### 6. Discussion

The level shifters are primarily aimed at converting voltage signals from subthreshold voltage to voltage levels several hundred mV above. Nevertheless, the level shifters show acceptable performance while converting from a wider range of voltage levels. This makes the level shifters suitable for applications where dynamic voltage scaling is required to satisfy speed requirements by dynamically increasing or decreasing VDDL. When interfacing two voltage domains, level shifter performance is of critical importance.
Our analysis shows that by adjusting transistor sizing and type, we are able to modify the performance of the level shifter while using the same topology. This adjustment will, however, have an impact on the physical layout size of the level shifter, but will most probably not exceed the dual-height cell specification of the level shifter, since these adjustments can be implemented by exploiting the cell width instead. When lowering VDDL below the threshold voltage, we see a rapid increase in propagation delay. This indicates that this level shifter topology is highly susceptible to supply bounce and IR drop. In order to achieve the intended performance of the level shifters, it is critical to minimize the effects of supply bounce and IR drops. Alternatively, the transistor sizing and type may be adjusted to meet the delay requirements based on the expected supply bounce and IR drop behavior. The resulting decrease in delay will therefore come at the expense of increased power and, potentially, area consumption. Table 4 illustrates how the level shifter performance varies across different processes, with emphasis on subthreshold to above threshold conversion. The energy and power needed to complete a voltage shift from subthreshold level to above threshold level is substantially reduced in a deep submicron process. To the authors' knowledge, according to simulation results, this is the lowest power and energy consumption reported for subthreshold to above threshold level shifters with similar design parameters.

#### 7. Conclusion

Our work shows that by applying the MTCMOS design technique to subthreshold level shifter design, low power and energy consumption may be achieved while maintaining reliable performance in 90 nm and 65 nm processes. The level shifters also demonstrate compatibility with dynamic voltage scaling, at the expense of variations in power and energy consumption.
We also illustrate a physical implementation approach which confirms effective integration of the level shifters in a larger design, in terms of maximum area utilization. The proposed level shifters are designed with sleep-mode and isolation capability while making use of a single dual-height cell physical implementation strategy. For applications requiring voltage level shifting from subthreshold voltages to above threshold voltage, the proposed level shifters may serve as a good solution. However, for voltage shifting in the above threshold region, the conventional level shifter is more efficient in terms of power and energy consumption. #### References - [1] V. Gutnik and A. P. Chandrakasan, "Embedded power supply for low-power dsp," vol. 5, no. 4, pp. 425–435, Dec. 1997. - [2] A. C. A. Wang, B.H. Calhoun, Sub-Threshold Design for Ultra Low-Power Systems. Springer-Verlag New York, 2006. - [3] Y. P. Tsividis, Operation and Modeling of the MOS Transistor, 2nd ed. Boston: McGraw-Hill, 1999. - [4] C. Q. Tran, H. Kawaguchi, and T. Sakurai, "Low-power high-speed level shifter design for block-level dynamic voltage scaling environment," in Proc. International Conference on Integrated Circuit Design and Technology ICICDT 2005, May 9–11, 2005, pp. 229–232. - [5] H. Shao and C.-Y. Tsui, "A robust, input voltage adaptive and low energy consumption level converter for sub-threshold logic," in *Proc. ESSCIRC* 33rd European Solid State Circuits Conference, Sep. 11–13, 2007, pp. 312–315. - [6] J. Kao, A. Chandrakasan, and D. Antoniadis, "Transistor sizing issues and tool for multi-threshold cmos technology," in *Proc. 34th Design Au*tomation Conference, Jun. 1997, pp. 409–414. - [7] M. Seok, S. Hanson, D. Sylvester, and D. Blaauw, "Analysis and optimization of sleep modes in subthreshold circuit design," in *Design Automation Conference*, 2007. DAC '07. 44th ACM/IEEE, june 2007, pp. 694–699. - [8] B. Calhoun, F. Honore, and A. 
Chandrakasan, "Design methodology for fine-grained leakage control in mtcmos," aug. 2003, pp. 104–109. - [9] K. Roy, S. Mukhopadhyay, and H. Mahmoodi-Meimand, "Leakage current mechanisms and leakage reduction techniques in deep-submicrometer cmos circuits," *Proceedings of the IEEE*, vol. 91, no. 2, pp. 305–327, feb 2003. - [10] F. Ishihara, F. Sheikh, and B. Nikolic, "Level conversion for dual-supply systems," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on, vol. 12, no. 2, pp. 185–195, feb. 2004. - [11] A. Chavan and E. MacDonald, "Ultra low voltage level shifters to interface sub and super threshold reconfigurable logic cells," in *Proc. IEEE Aerospace Conference*, Mar. 1–8, 2008, pp. 1–6. - [12] T.-H. Chen, J. Chen, and L. T. Clark, "Subthreshold to above threshold level shifter design," J. Low Power Electronics, vol. 2, no. 2, pp. 251–258, 2006

## 6.2 Irradiation of SAR architectures

The process which eventually led to irradiation of the DUT consisted of two phases: a preliminary testing phase and an irradiation testing phase using a Caesium 137 radiation source. The preliminary testing phase revealed that only two of the implemented SAR architectures, STD SAR and TMR SAR, worked correctly. The DMR SAR architecture did not function properly<sup>1</sup>, and was therefore excluded from the testing procedure. The STD SAR was included in order to observe the magnitude of the impact redundancy has on radiation tolerance. The test parameters intended for the irradiation testing are displayed in Table 6.1.

| | Device on chip | Characterization setup | Supply voltage scaling | Observation |
|----------|-----------------|------------------------|--------------------------------|---------------------|
| Param. 1 | STD/TMR SAR | Template 1 | 1.2 V, 0.8 V, 0.5 V and 0.35 V | SEU |
| Param. 2 | STD/TMR SAR | Template 2 | 1.2 V, 0.8 V, 0.5 V and 0.35 V | SEU |
| Param. 3 | NMOS transistor | I/V sweep | 1.2 V | TID induced leakage |

Table 6.1 Test parameters

SEU characterization of the SAR architectures was performed by analysing their response to two different input combinations. The analysis also takes supply voltage into account, and performs SEU characterization at four different supply voltages, including subthreshold operation. Furthermore, the test parameters include NMOS TID induced leakage characterization, which is performed by sweeping the gate voltage of the transistor. The NMOS analysis is executed between supply voltage scaling runs. The SEU characterization of this particular experiment was performed according to the dynamic testing procedure described in Section 5.2.3. Stimuli were applied to the DUT every 3 seconds, with a clock frequency of 391 kHz (which amounts to 1 run sequence).

## **6.2.1** Preliminary Testing

Prior to radiation testing of the DUT, several preliminary prototyping runs were performed in a non-radiation lab environment in order to ensure correct operation of the test setup. A key requirement for the test setup was that it should be capable of sustaining stable operation without the need for continuous supervision. Figure 6.1 and Figure 6.2 show the preliminary error detection runs at a supply voltage of 1.2 V and 350 mV respectively. The top left plot shows raw data from the XOR function on the FPGA, which displays a '1' if an uncorrelated event between the template and the DUT response is observed, and a '0' if the event is correlated. The top right plot shows accumulated errors per RS-232 packet received from the FPGA. The lower left plot shows how many SETs are observed per run sequence, and the lower right plot shows how many SEUs are observed per run sequence. The plot in the lower right corner is the key plot, as the other plots are only included for observation and post analysis purposes. No errors were observed in the preliminary testing runs of the DUT and the test setup.
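The template comparison described above can be sketched in a few lines. The example below is an illustration only, not the FPGA logic or host software used in this work: it XORs a DUT response against its template bit by bit and counts the mismatches for each run sequence. The function names and bit patterns are hypothetical, and the sketch makes no attempt to distinguish SETs from SEUs.

```python
def xor_compare(template, response):
    """Return 1 for every uncorrelated bit and 0 for every correlated bit."""
    if len(template) != len(response):
        raise ValueError("template and response must have the same length")
    return [t ^ r for t, r in zip(template, response)]


def errors_per_run(template, runs):
    """Count mismatching bits for each run sequence."""
    return [sum(xor_compare(template, run)) for run in runs]


# Hypothetical 8-bit template and two DUT responses.
template = [1, 0, 1, 1, 0, 0, 1, 0]
runs = [
    [1, 0, 1, 1, 0, 0, 1, 0],  # identical to the template
    [1, 0, 0, 1, 0, 0, 1, 0],  # third bit flipped
]
counts = errors_per_run(template, runs)
print(counts)  # [0, 1]
```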
The test setup showed sufficient ability to operate continuously without any significant irregularities with respect to the specified test parameters. Figure 6.2 shows some irregularities, which are due to delay occurrences in subthreshold operation. These irregularities can be compensated for either in software or on the FPGA. However, as can be observed from the plots, the pattern is regular when the delay occurrences are pronounced, and any error occurrences of significance can be distinguished from the delay induced patterns.

<sup>1</sup>Due to time constraints on this project, the reason remains unknown.

**Figure 6.1** Preliminary error detection of the TMR SAR architecture with 1.2 V supply voltage

**Figure 6.2** Preliminary error detection of the TMR SAR architecture with 350 mV supply voltage

Before the test setup could be cleared for radiation testing, its SEU detection capability needed to be confirmed. To this end, an error generating environment was emulated by exploiting propagation delay variations that occur in circuits that operate in the subthreshold region. Early on in the prototyping procedure, it was discovered that the Spartan 3 development board generated a lot of noise on the power traces on the DUT PCB. The noise was not of significant importance when the SAR architectures were operated in the above threshold region and at low operating frequencies. However, when the supply voltage is scaled down to the subthreshold region, the power supply noise has a considerable impact on propagation delay variation. By reducing the supply voltage well below the threshold voltage of the transistors in the SAR architecture, the error detection setup was able to detect errors that resemble SEUs. Figure 6.3 shows a plot of the error detection algorithm when the SAR architectures are operated at a supply voltage of 300 mV in a non-radiation environment.
**Figure 6.3** Error detection at 300 mV supply voltage

Due to a fixed clock frequency on the DUT and the propagation delay variations in the circuits, it was possible to force an erroneous response from the DUT. One hypothesis is that the propagation delay variations caused setup and hold time violations for the D flip flops in the SAR architectures. These errors caused the D flip flops in the SAR algorithms to malfunction and thereby produce wrong outputs, which were registered as SEUs by the error detection setup.

A preliminary measurement of the NMOS drain current was also performed. The drain current measurements were compared to schematic simulations of an NMOS transistor with the same dimensions as the transistor implemented on the test chip. Figure 6.4 shows the simulation and measurements of the NMOS drain current as a function of gate voltage. The measurements of the NMOS drain current revealed a deviation from the simulation results around and below a gate voltage of 250 mV. In the gate voltage range from 0 V to 250 mV, the measured current was lower than the drain current observed in the simulation results. The reasons for this trend are unknown, and therefore need more thorough examination. Moreover, the deviation takes place in a region of operation which is of high interest to this experiment, namely the subthreshold region. In order to determine the TID induced leakage effects on the threshold voltage and the $I_{ON}/I_{OFF}$ ratio of a linear transistor, the reasons for the deviation in the current measurements need to be identified and corrected.

**Figure 6.4** Current measurements NMOS

#### 6.2.2 Experimental radiation testing of the DUT

The initial plan was to perform radiation tests at the Cyclotron Laboratory at the Department of Physics, University of Oslo. However, due to scheduling and maintenance issues, this was unfortunately not possible to achieve within the time scope of this thesis [66].
Therefore, a simplified radiation test was set up at SAFE - Centre for Accelerator Based Research and Energy Physics at the University of Oslo. The simplified test procedure involved the use of a Caesium 137 (Cs-137) gamma emitting radiation source with energies up to 0.662 MeV. The DUT was irradiated by placing two radiation sources on top of the test chip. The package on the test chip was a QFN64 package with the possibility of delidding. However, in order to minimize the risk of damaging the chip during delidding, the lid was left intact on the test chip. This also contributed to a longer distance between the radiation source and the silicon die, which has an impact on the radiation dose applied to the circuits on the test chip due to the radial distribution effect. Figure 6.5 shows the test setup at SAFE. The DUT PCB and radiation sources were placed in a lead shielded area, so that people working in close proximity were not exposed to unnecessary radiation. All peripheral equipment was placed outside the range of the radiation.

The irradiation of the DUT lasted for 13 hours and included all test parameters (see Table 6.1) except the NMOS transistor I/V sweep. The NMOS transistor characterization was excluded due to technical difficulties encountered at the time of radiation testing. Figure 6.6 shows the measurement results of the STD SAR, which is based on the unhardened implementation. The SAR architecture is biased in the worst case supply voltage condition, being 340 mV. No errors were observed during radiation testing of 1000 run sequences. The current consumption of the SAR architectures, $I_{DD}$, was also measured during irradiation of the DUT. There were no deviations observed when compared to pre-radiation current consumption.

**Figure 6.5** Radiation testing setup at SAFE

**Figure 6.6** Error detection of the STD SAR architecture with 340 mV supply voltage, during irradiation
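The radial distribution effect mentioned in this section can be illustrated with the inverse-square law for an idealized isotropic point source. This is a simplified sketch, not a dosimetry calculation for the setup at SAFE: it ignores attenuation in the package lid and the finite size of the sources, and the distances used are hypothetical.

```python
def relative_dose_rate(distance, reference_distance):
    """Dose rate at `distance` relative to the dose rate at `reference_distance`.

    Assumes an isotropic point source in free space, so the dose rate
    scales with 1 / distance**2. Both distances must use the same unit.
    """
    if distance <= 0 or reference_distance <= 0:
        raise ValueError("distances must be positive")
    return (reference_distance / distance) ** 2


# Hypothetical example: doubling the source-to-die distance.
print(relative_dose_rate(2.0, 1.0))  # 0.25
```

Under these assumptions, even a small increase in the source-to-die distance, such as the one introduced by leaving the lid in place, reduces the dose rate at the die quadratically.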
# Chapter 7

## Discussion

The memory elements and level shifters discussed in the papers included in this thesis are all aimed at reliable low power applications. The level shifters show high reliability in terms of process mismatch and temperature variations, while displaying low power and energy consumption. The memory elements implemented using minority-3 gates and inverters have a symmetric and regular physical implementation, which is very important for subthreshold operation due to the exponential current dependencies in that specific operating region. Yield and defect tolerance of the minority-3 gates have already been evaluated and confirmed in [67], which makes the use of minority-3 gates relevant for applications requiring radiation tolerance. The minority-3 gate in combination with inverters is sufficient to implement any basic Boolean function (NAND, OR, AND ...). Therefore, a larger system utilizing dynamic voltage scaling and multiple voltage domains can be implemented using only minority-3 gates, inverters and level shifters. By applying circuit and architectural level SET and SEU mitigation techniques, in conjunction with redundancy techniques, such systems may become relevant for harsh radiation environments, even outside the Earth's atmosphere.

The radiation testing using a non-focused radiation source proved to be insufficient to determine the radiation tolerance of an electronic device. A focused beam test in the form of a pulsed laser or a particle accelerator is required for proper SEU characterization of the DUT [2]. The proton beam configuration at the Cyclotron lab in Oslo may be much better suited for SEU characterization, since the radiation would be focused on a small area on the chip. The interaction of energetic protons with the semiconductor material is enough to cause ionizing leakage effects as well as SEUs due to nuclear interaction with p-type silicon.
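The statement earlier in this chapter that minority-3 gates combined with inverters suffice for the basic Boolean functions can be checked exhaustively. The sketch below models the gate behaviourally (the output is '1' when at most one input is '1'), ties the third input to a constant to obtain NAND and NOR, and adds an inverter for AND and OR. It is a logic-level illustration only; the function names are hypothetical and nothing here reflects the transistor-level implementation used in the papers.

```python
from itertools import product


def min3(a, b, c):
    """Minority-3 gate: output is 1 when fewer than two inputs are 1."""
    return 1 if (a + b + c) <= 1 else 0


def inv(a):
    """Inverter."""
    return 1 - a


def nand(a, b):
    return min3(a, b, 0)  # third input tied to '0'


def nor(a, b):
    return min3(a, b, 1)  # third input tied to '1'


def and_(a, b):
    return inv(nand(a, b))


def or_(a, b):
    return inv(nor(a, b))


# Exhaustive check against Python's bitwise operators.
for a, b in product((0, 1), repeat=2):
    assert nand(a, b) == 1 - (a & b)
    assert nor(a, b) == 1 - (a | b)
    assert and_(a, b) == (a & b)
    assert or_(a, b) == (a | b)
print("all two-input truth tables match")
```

Since NAND alone is functionally complete, the NAND construction already implies that any combinational function can be built from these two cell types.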
Although it was verified that the proposed test methodology is sufficient for SEU characterization of the DUT, the test setup can still be regarded as being in an early state. There is, however, a lot of potential for further implementation and development of new functionality on the test setup. Latch-up detection can be implemented in software while utilizing the tools already in use in the test setup. As technology scales down even further, multiple bit upsets become an important issue due to less separation between transistors in high density designs. Therefore, it would be desirable to expand the test software to enable multiple bit upset detection with a certain spatial resolution. Such an approach would provide a better understanding of how particle strikes affect high density designs and how they interact with the specific process used in this thesis.

The pulse injection simulations are the only simulations that were performed solely on the schematic level. All other necessary simulations were performed based on netlists extracted from the layout. Monte Carlo simulations were used to verify the circuit reliability and temperature dependence. For more precise SET and SEU evaluation of different circuit and architectural topologies, pulse injection may be performed on extracted netlists, followed by Monte Carlo simulations. Thereby the impact of process, mismatch and temperature variations on SET and SEU performance can be evaluated.

All fabricated designs that are covered in this thesis are made using linear transistors. Even though linear transistors are more susceptible to TID induced leakage current than, for example, ELTs, the utilization of linear transistors contributes to quick realization of circuits while maintaining total dose hardness. However, the use of linear transistors with low supply voltage may be problematic due to the TID induced leakage and thereby the reduction of the $I_{ON}/I_{OFF}$ ratio.
For circuits operating with low supply voltages, ELT transistors may be necessary in order to preserve correct operational behavior of a circuit. Therefore, incorporation of ELTs in customized cell libraries may become relevant if the linear devices show insufficient operational behaviour at low supply voltages.

# Chapter 8

## Conclusion

In this thesis, design and analysis of several digital building blocks have been presented within the field of subthreshold digital circuit design. Simulation results have shown, to the author's knowledge, the lowest power and energy consumption of subthreshold to above threshold level shifters in 90 nm and 65 nm processes. Furthermore, a D latch and a D flip flop based on minority-3 gates have been presented. The memory elements demonstrate robust operational behaviour across process and mismatch variations, and their functionality is confirmed with chip measurements in the 90 nm general purpose process from ST Microelectronics.

Radiation induced effects in commercially fabricated ICs have been evaluated. A test chip has been fabricated in the 90 nm low power process from TSMC with the goal of evaluating different SET and SEU mitigation techniques. An SEU characterization test setup has been proposed and was used in experimental radiation tests at SAFE. There were no radiation induced errors observed during radiation testing while irradiating the test chip with a Caesium 137 gamma emitting radiation source.

For further research, an SEU characterization of the test chip should be conducted with a focused radiation beam in order to determine the radiation tolerance of the different SEU mitigation techniques discussed in this thesis. Such an experiment may determine potential radiation environments that are suited for each of the SET and SEU mitigation techniques, given that they are implemented in the TSMC 90 nm process used in this thesis.
After the radiation characterization has been conducted, work on building a radiation tolerant digital cell library can be initiated. Such a customized cell library would open up many possibilities for the design of larger systems using a hardware description language and place and route tools. Multiple voltage domains may be implemented to optimize power consumption and reliability.

## **Bibliography**

- [1] H. M. C. J. M. B. C. M. N. J.F. Ziegler, H. W. Curtis, "IBM experiments in soft fails in computer electronics," IBM Journal of Research and Development **40** (1996).
- [2] D. M. F. R. D. Schrimpf, *Radiation Effects and Soft Errors in Integrated Circuits and Electronic Devices* (World Scientific, 2004), Chap. Hardness Assurance for Commercial Microelectronics.
- [3] G. Semeraro, G. Magklis, R. Balasubramonian, D. Albonesi, S. Dwarkadas, and M. Scott, "Energy-efficient processor design using multiple clock domains with dynamic voltage and frequency scaling," In *High-Performance Computer Architecture, 2002. Proceedings. Eighth International Symposium on*, pp. 29–40 (2002).
- [4] T. Burd and R. Brodersen, "Energy efficient CMOS microprocessor design," In *System Sciences, 1995. Proceedings of the Twenty-Eighth Hawaii International Conference on*, **1**, 288–297 (1995).
- [5] J.-M. Chang and M. Pedram, "Energy minimization using multiple supply voltages," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on **5**, 436–443 (1997).
- [6] S. Luetkemeier, T. Kaulmann, and U. Rückert, "A Sub-200mV 32bit ALU with 0.45pJ/instruction in 90nm CMOS," In *Semiconductor Conference Dresden*, (2009).
- [7] R. C. Bauman, in *Radiation Effects and Soft Errors in Integrated Circuits and Electronic Devices*, D. M. F. R. D. Schrimpf, ed., (World Scientific, 2004), Chap. Soft Errors in Commercial Integrated Circuits, pp. 15–25.
- [8] V. R. V. S. R. D. S. D. Lunardini, B. Narasimham and W. H.
Robinson, "A performance comparison between hardened-by-design and conventional-design standard cells," Workshop on Radiation Effects on Components and Systems, Radiation Hardening Techniques and New Developments (2004).
- [9] W. T. Holman, in *Radiation Effects and Soft Errors in Integrated Circuits and Electronic Devices*, D. M. F. R. D. Schrimpf, ed., (World Scientific, 2004), Chap. Radiation-tolerant Design for High Performance Mixed-Signal Circuits, pp. 69–366.
- [10] R. Lacoe, "Improving Integrated Circuit Performance Through the Application of Hardness-by-Design Methodology," Nuclear Science, IEEE Transactions on **55**, 1903–1925 (2008).
- [11] K. Roy, S. Mukhopadhyay, and H. Mahmoodi-Meimand, "Leakage current mechanisms and leakage reduction techniques in deep-submicrometer CMOS circuits," Proceedings of the IEEE **91**, 305–327 (2003).
- [12] R. Swanson and J. Meindl, "Ion-implanted complementary MOS transistors in low-voltage circuits," Solid-State Circuits, IEEE Journal of **7**, 146–153 (1972).
- [13] F. Leuenberger and E. Vittoz, "Complementary-MOS low-power low-voltage integrated binary counter," Proceedings of the IEEE **57**, 1528–1532 (1969).
- [14] D. Liu and C. Svensson, "Trading speed for low power by choice of supply and threshold voltages," Solid-State Circuits, IEEE Journal of **28**, 10–17 (1993).
- [15] A. C. A. Wang, B.H. Calhoun, *Sub-Threshold Design for Ultra Low-Power Systems* (Springer-Verlag New York, 2006).
- [16] S. Aunet and H. K. O. Berge, *Computational and Ambient Intelligence* (2007), Vol. 4507/2007, Chap. Statistical Simulations for Exploring Defect Tolerance and Power Consumption for 4 Subthreshold 1-Bit Addition Circuits, pp. 455–462.
- [17] E. I. Y. Y. H. K. Takashi Nakamura, Mamoru Baba, *Terrestrial Neutron-Induced Soft Errors in Advanced Memory Devices* (World Scientific, 2008), Chap. 1.2 General Description of the SEE Mechanism.
- [18] D. M. F. R. D.
Schrimpf, *Radiation Effects and Soft Errors in Integrated Circuits and Electronic Devices* (World Scientific, 2004).
- [19] D. M. S. Buchner, "Single Event Transients in Linear Integrated Circuits," IEEE Nuclear and Space Radiation Effects Conference Short Course.
- [20] A. Tipton, X. Zhu, H. Weng, J. Pellish, P. Fleming, R. Schrimpf, R. Reed, R. Weller, and M. Mendenhall, "Increased Rate of Multiple-Bit Upset From Neutrons at Large Angles of Incidence," Device and Materials Reliability, IEEE Transactions on **8**, 565–570 (2008).
- [21] F. Wrobel, "Use of nuclear codes for neutron-induced nuclear reactions in microelectronics," In *On-Line Testing Symposium, 2005. IOLTS 2005. 11th IEEE International*, pp. 82–86 (2005).
- [22] G. Swift and S. Guertin, "In-flight observations of multiple-bit upset in DRAMs," Nuclear Science, IEEE Transactions on **47**, 2386–2391 (2000).
- [23] M. Baze, B. Hughlock, J. Wert, J. Tostenrude, L. Massengill, O. Amusan, R. Lacoe, K. Lilja, and M. Johnson, "Angular Dependence of Single Event Sensitivity in Hardened Flip/Flop Designs," Nuclear Science, IEEE Transactions on **55**, 3295–3301 (2008).
- [24] D. Cochran, S. Buchner, C. Poivey, K. LaBel, R. Ladbury, M. O'Bryan, J. Howard, A. Sanders, and T. Oldham, "Compendium of Current Total Ionizing Dose Results and Displacement Damage Results for Candidate Spacecraft Electronics for NASA," In *Radiation Effects Data Workshop, 2007 IEEE*, **0**, 146–152 (2007).
- [25] P.-S. Y. Gary K. Maki, "Radiation Tolerant Ultra Low Power CMOS Microelectronics: Technology Development Status," NASA Earth Science Technology Conference, (2003).
- [26] K. L. J. G. C. P. Ken Li, Mike Xapsos and R. F. Stone, "Single Event Effect and Total Ionizing Dose Testing of CULPRiT Reed Solomon Encoders," (2004).
- [27] M. Xapsos, "Technology Readiness Overview: CMOS Ultra-Low Power Radiation Tolerant (CULPRiT) Integrated Circuits," (2004).
- [28] C. Tai-Hua, C. Jinhui, L. Clark, J. Knudsen, and G.
Samson, "Ultra-Low Power Radiation Hardened by Design Memory Circuits," Nuclear Science, IEEE Transactions on **54**, 2004–2011 (2007).
- [29] K. Wang, L. Chen, and J. Yang, "An ultra low power fault tolerant SRAM design in 90nm CMOS," In *Electrical and Computer Engineering, 2009. CCECE '09. Canadian Conference on*, pp. 1076–1079 (2009).
- [30] K.-C. Wu and D. Marculescu, "Power-aware soft error hardening via selective voltage scaling," In *Computer Design, 2008. ICCD 2008. IEEE International Conference on*, pp. 301–306 (2008).
- [31] A. Chavan, G. Dukle, B. Graniello, and E. MacDonald, "Robust Ultra-Low Power Subthreshold Logic Flip-Flop Design for Reconfigurable Architectures," In *Reconfigurable Computing and FPGA's, 2006. ReConFig 2006. IEEE International Conference on*, pp. 1–7 (2006).
- [32] L. Gonella, F. Faccio, M. Silvestri, S. Gerardin, D. Pantano, V. Re, M. Manghisoni, L. Ratti, and A. Ranieri, "Total Ionizing Dose effects in 130-nm commercial CMOS technologies for HEP experiments," Nucl. Instrum. Methods Phys. Res., A **582**, 750–754 (2007).
- [33] T. Oldham and F. McLean, "Total ionizing dose effects in MOS oxides and devices," Nuclear Science, IEEE Transactions on **50**, 483–499 (2003).
- [34] A. P. A. Cester, in *Radiation Effects and Soft Errors in Integrated Circuits and Electronic Devices*, D. M. F. R. D. Schrimpf, ed., (World Scientific, 2004), Chap. Ionizing Radiation Effects on Ultra-Thin Oxide MOS Structures, pp. 279–290.
- [35] L. Massengill *et al.*, "Heavy-ion-induced breakdown in ultra-thin gate oxides and high-k dielectrics," Nuclear Science, IEEE Transactions on **48**, 1904–1912 (2001).
- [36] N. F. Haddad, "Considerations for the development of radiation resistant devices and VLSI circuits," J. Electron. Mater. **16**, 603–608 (1990).
- [37] J. Schwank, D. Fleetwood, M. Shaneyfelt, and P. Winokur, "Latent thermally activated interface-trap generation in MOS devices," Electron Device Letters, IEEE **13**, 203–205 (1992).
- [38] M.
Shaneyfelt, P. Dodd, B. Draper, and R. Flores, "Challenges in hardening technologies using shallow-trench isolation," Nuclear Science, IEEE Transactions on **45**, 2584–2592 (1998).
- [39] A. B. Kahng, P. Sharma, and R. O. Topaloglu, "Exploiting STI stress for performance," In *ICCAD*, G. G. E. Gielen, ed., pp. 83–90 (IEEE, 2007).
- [40] M. McLain, H. Barnaby, I. Esqueda, J. Oder, and B. Vermeire, "Reliability of high performance standard two-edge and radiation hardened by design enclosed geometry transistors," In *Reliability Physics Symposium, 2009 IEEE International*, pp. 174–179 (2009).
- [41] L. D.-M. *et al.*, "Study of total ionizing dose radiation effects on enclosed gate transistors in a commercial CMOS technology," Chinese Phys 19 (2007).
- [42] G. Anelli *et al.*, "Radiation tolerant VLSI circuits in standard deep submicron CMOS technologies for the LHC experiments: practical design aspects," Nuclear Science, IEEE Transactions on **46**, 1690–1696 (1999).
- [43] S. Binzaid and J. Attia, "Configurable Active-Region-Cutout-Transistor for radiation hardened circuit applications," In *Electrical and Computer Engineering, 2008. CCECE 2008. Canadian Conference on*, pp. 001215–001218 (2008).
- [44] W. Snoeys, T. Gutierrez, and G. Anelli, "A new NMOS layout structure for radiation tolerance," Nuclear Science, IEEE Transactions on **49**, 1829–1833 (2002).
- [45] R. Dlugosz and K. Iniewski, "Flexible architecture of ultra-low-power current-mode interleaved successive approximation analog-to-digital converter for wireless sensor networks," VLSI Des. **2007**, 1–13 (2007).
- [46] R. M. Orndorff, "Radiation Hardened Flip Flop," 1974.
- [47] J. W. G. K. J. H. S. R. Whitaker, "Radiation Hardness of Ultra Low Power CMOS VLSI," 11th NASA Symposium on VLSI Design (2003).
- [48] P. Roche, M. Lysinger, G. Gasiot, J.-M. Daveau, M. Zamanian, and P. Dautriche, "Growing Interest of Advanced Commercial CMOS Technologies for Space and Medical Applications.
Illustration with a New Nano-Power and Radiation-Hardened SRAM in 130nm CMOS," In *On-Line Testing Symposium, 2008. IOLTS '08. 14th IEEE International*, pp. 46–48 (2008).
- [49] M. Zhang, S. Mitra, T. M. Mak, N. Seifert, N. J. Wang, Q. Shi, K. S. Kim, N. R. Shanbhag, and S. J. Patel, "Sequential Element Design With Built-In Soft Error Resilience," Very Large Scale Integration (VLSI) Systems, IEEE Transactions on **14**, 1368–1378 (2006).
- [50] A. Balasubramanian, B. Bhuva, J. Black, and L. Massengill, "RHBD techniques for mitigating effects of single-event hits using guard-gates," Nuclear Science, IEEE Transactions on **52**, 2531–2535 (2005).
- [51] B. Narasimham, V. Ramachandran, B. Bhuva, R. Schrimpf, A. Witulski, W. Holman, L. Massengill, J. Black, W. Robinson, and D. McMorrow, "On-Chip Characterization of Single-Event Transient Pulsewidths," Device and Materials Reliability, IEEE Transactions on **6**, 542–549 (2006).
- [52] Z. Huang and H. Liang, "A New Radiation Hardened by Design Latch for Ultra-Deep-Sub-Micron Technologies," In *On-Line Testing Symposium, 2008. IOLTS '08. 14th IEEE International*, pp. 175–176 (2008).
- [53] T. Monnier, F. Roche, and G. Cathebras, "Flip-flop hardening for space applications," In *Memory Technology, Design and Testing, 1998. Proceedings. International Workshop on*, pp. 104–107 (1998).
- [54] L. Wang, S. Yue, and Y. Zhao, "Low-Overhead SEU-Tolerant Latches," In *Microwave and Millimeter Wave Technology, 2007. ICMMT '07. International Conference on*, pp. 1–4 (2007).
- [55] S. Bonacini, F. Faccio, K. Kloukinas, and A. Marchioro, "An SEU-Robust Configurable Logic Block for the Implementation of a Radiation-Tolerant FPGA," Nuclear Science, IEEE Transactions on **53**, 3408–3416 (2006).
- [56] T. Calin, M. Nicolaidis, and R. Velazco, "Upset hardened memory design for submicron CMOS technology," Nuclear Science, IEEE Transactions on **43**, 2874–2878 (1996).
- [57] B. K.
Kim, "Reliability analysis of real-time controllers with dual-modular temporal redundancy," In *Real-Time Computing Systems and Applications, 1999. RTCSA '99. Sixth International Conference on*, pp. 364–371 (1999).
- [58] M. L. B. M. B. B. W. A. B. J. B. A. C. M. B. D. A. J. R. R. M. M. Amusan, O.A., "Mitigation Techniques for Single-Event-Induced Charge Sharing in a 90-nm Bulk CMOS Process," Device and Materials Reliability, IEEE Transactions on **9**, 311–317 (2009).
- [59] H. Alstad and S. Aunet, "Three Subthreshold Flip-Flop Cells Characterized in 90 nm and 65 nm CMOS Technology," In *Design and Diagnostics of Electronic Circuits and Systems, 2008. DDECS 2008. 11th IEEE Workshop on*, pp. 1–4 (2008).
- [60] L. Wang, S. Yue, and Y. Zhao, "Low-Overhead SEU-Tolerant Latches," In *Microwave and Millimeter Wave Technology, 2007. ICMMT '07. International Conference on*, pp. 1–4 (2007).
- [61] R. Aitken and B. Hold, "Modeling soft-error susceptibility for IP blocks," In *On-Line Testing Symposium, 2005. IOLTS 2005. 11th IEEE International*, pp. 70–73 (2005).
- [62] M. Omana, D. Rossi, and C. Metra, "Latch Susceptibility to Transient Faults and New Hardening Approach," Computers, IEEE Transactions on **56**, 1255–1268 (2007).
- [63] R. Rasmussen, "Spacecraft electronics design for radiation tolerance," Proceedings of the IEEE **76**, 1527–1537 (1988).
- [64] M. R. S. James R. Schwank and P. E. Dodd, "Radiation Hardness Assurance Testing of Microelectronic Devices and Integrated Circuits: Radiation Environments, Physical Mechanisms, and Foundations for Hardness Assurance," SANDIA NATIONAL LABORATORIES DOCUMENT SAND (2008).
- [65] M. R. S. James R. Schwank and P. E. Dodd, "Radiation Hardness Assurance Testing of Microelectronic Devices and Integrated Circuits: Radiation Environments, Physical Mechanisms, and Foundations for Hardness Assurance," James R. Schwank, Marty R. Shaneyfelt, and Paul E. Dodd 6851P (2008).
- [66] J. P. Omtvedt, "Personal communication."
- [67] K. Granhaug and S. Aunet, "Improving Yield and Defect Tolerance in Multifunction Subthreshold CMOS Gates," In *Defect and Fault Tolerance in VLSI Systems, 2006. DFT '06. 21st IEEE International Symposium on*, pp. 20–28 (2006).

# Appendix A

## VHDL code for template-based soft error characterization

```
library IEEE;
use IEEE.std_logic_1164.all;
use ieee.numeric_std.all;

entity X_OR is
  generic (n : NATURAL := 256);
  port (
    areset     : in  std_logic;
    clk        : in  std_logic;
    tmpdata1   : in  std_logic_vector(n-1 downto 0);
    tmpdata2   : in  std_logic_vector(n-1 downto 0);
    tmpdata3   : in  std_logic_vector(n-1 downto 0);
    tmpdata4   : in  std_logic_vector(n-1 downto 0);
    data_out1  : in  std_logic_vector(n-1 downto 0);
    data_out2  : in  std_logic_vector(n-1 downto 0);
    data_out3  : in  std_logic_vector(n-1 downto 0);
    data_out4  : in  std_logic_vector(n-1 downto 0);
    data_ready : in  std_logic;
    active     : out std_logic;
    error      : out std_logic  -- RS232 output
  );
end X_OR;

architecture behave of X_OR is

  signal cnt            : integer range 0 to 256;
  signal errorcnt       : integer range 0 to 1025;
  signal shift_cnt      : integer range 1 to 6;
  signal abort_hit      : std_logic;
  signal slowclk        : std_logic;
  signal slowcount      : integer range 0 to 5500;
  signal sender_buff    : std_logic_vector(10 downto 0);  -- RS232 data out
  signal RS232_cnt      : integer range 0 to 10;
  signal RS232_ready    : std_logic;
  signal write_cnt      : integer range 0 to 11;
  signal error_detected : std_logic;

begin

  -- Fixed framing bits of the RS232 word: bit 0 is sent first (low),
  -- bits 9 and 10 are sent last (high).
  sender_buff(10) <= '1';
  sender_buff(9)  <= '1';
  sender_buff(0)  <= '0';
  active <= abort_hit;

  process (areset, clk)
  begin  -- setting up RS232 baud rate
    if areset = '1' then
      slowclk   <= '0';
      slowcount <= 0;
    elsif rising_edge(clk) then
      if data_ready = '1' then
        slowcount <= slowcount + 1;
        if slowcount = 4970 then
          slowclk   <= not slowclk;
          slowcount <= 0;
        end if;
      end if;
    end if;
  end process;

  process (areset, slowclk)
    variable reg1 : std_logic_vector(7 downto 0);  -- Shift register for data packs on RS232
  begin
    if areset = '1' then
      cnt            <= 0;
      errorcnt       <= 0;
      shift_cnt      <= 1;
      abort_hit      <= '0';
      error_detected <= '0';
      RS232_ready    <= '1';
      RS232_cnt      <= 0;
      write_cnt      <= 0;

    elsif rising_edge(slowclk) and abort_hit = '0' then

      if RS232_ready = '1' then  -- writing to buffer
        error <= '1';
        reg1 := error_detected & reg1(7 downto 1);  -- shift register for error data
        RS232_cnt <= RS232_cnt + 1;                 -- 8 bits to shift register
        if RS232_cnt = 7 then                       -- 8 bits to RS232 register
          RS232_ready <= '0';                       -- ready for sending
          RS232_cnt   <= 0;
          sender_buff(8 downto 1) <= reg1;          -- Loading register to sender register
        end if;

        if data_ready = '1' then
          cnt            <= cnt + 1;
          errorcnt       <= errorcnt + 1;  -- Index for read-in of error values
          error_detected <= '0';           -- = 0 as long as no error is detected

          if errorcnt < 1024 then
            if cnt = 255 then
              shift_cnt <= shift_cnt + 1;  -- Reads another register set
              cnt <= 0;
              if shift_cnt > 4 then
                shift_cnt <= 1;
              end if;
            end if;
          else
            errorcnt  <= 0;
            abort_hit <= '1';
            cnt       <= 0;
          end if;

          if shift_cnt = 1 then
            if tmpdata1(cnt) /= data_out1(cnt) then  -- XOR for first register set.
              error_detected <= '1';
            end if;
          elsif shift_cnt = 2 then
            if tmpdata2(cnt) /= data_out2(cnt) then  -- second..
              error_detected <= '1';
            end if;
          elsif shift_cnt = 3 then
            if tmpdata3(cnt) /= data_out3(cnt) then
              error_detected <= '1';
            end if;
          elsif shift_cnt = 4 then
            if tmpdata4(cnt) /= data_out4(cnt) then
              error_detected <= '1';
            end if;
          end if;

        end if;
      end if;

      if RS232_ready = '0' then  -- writing the RS232 register
        write_cnt <= write_cnt + 1;
        error     <= sender_buff(write_cnt);
        if write_cnt = 10 then   -- 10 bits are sent
          write_cnt   <= 0;
          RS232_ready <= '1';
        end if;
      end if;

    end if;
  end process;
end architecture;
```

Code/xor.vhd

# Appendix B

## Matlab code for error detection

```
function Error_detection(x, y5, timer, VDD)
% Error_detection receives error data that has been processed by an FPGA.
% The FPGA needs to be connected via a serial port. The FPGA processes the
% error data based on an expected input and compares (XOR) the input data
% with a data expectancy template. The script accepts several OPERATION
% MODES that determine what input (architecture) is to be observed and what
% stimuli (template) is imposed on the UUT (TSMC Chip)
%
% OPERATION MODES:   '1' => TMR, template1
%                    '2' => STD, template1
%                    '3' => DMR, template1
%                    '4' => TMR, template2
%                    '5' => STD, template2
%                    '6' => DMR, template2
%
% The plots show:
% Error occurrences  -> SET and SEU, not distinguished.
% Accumulated errors -> Sum of Error occurrences over all executed data
%                       streams
%
% Serial port info: BaudRate = 4800, 8bit, stopbit = 1.
%
% Syntax:     Error_detection(x, y5, timer, VDD);
%
% Arguments:  x is the number of data streams
%             y5 is an ASCII string that determines OPERATION MODE
%             timer is time between data streams, in seconds
%             VDD is the supply voltage, used only in the plot titles
%
% Example:    Error_detection (10, '2', 60) -> 10 runs, template2, one
%             run every minute
%             (Plots displayed continuously)
%
% Created: Amir Hasanbegovic, 2010 April 10
%
% Last Modified: Amir Hasanbegovic, 2010 April 26
%

%% String inits
string = 'oooops';

%% Plot Names
if strcmp(y5, '1') == 1
    string = 'TMR, template1';
elseif strcmp(y5, '2') == 1
    string = 'STD, template1';
elseif strcmp(y5, '3') == 1
    string = 'DMR, template1';
elseif strcmp(y5, '4') == 1
    string = 'TMR, template2';
elseif strcmp(y5, '5') == 1
    string = 'STD, template2';
elseif strcmp(y5, '6') == 1
    string = 'DMR, template2';
elseif strcmp(y5, 'A') == 1
    string = 'TMR, template1';
elseif strcmp(y5, 'B') == 1
    string = 'TMR, template2';
elseif strcmp(y5, 'C') == 1
    string = 'STD, template1';
elseif strcmp(y5, 'D') == 1
    string = 'STD, template2';
elseif strcmp(y5, 'E') == 1
    string = 'TMR, template1';
elseif strcmp(y5, 'F') == 1
    string = 'TMR, template2';
end

% Initiate dataset
binTemp = 0;
accumulated_error = 0;
SET = 0;
SEU = 0;
SER = 0;

seq_nr = 0;
seqplot = 0;

% Create a serial port object.
obj1 = instrfind('Type', 'serial', 'Port', 'COM3', 'Tag', '');

% Create the serial port object if it does not exist
% otherwise use the object that was found.
if isempty(obj1)
    obj1 = serial('COM3');
else
    fclose(obj1);
    obj1 = obj1(1);
end

% Configure instrument object, obj1.
set(obj1, 'BaudRate', 4800);
set(obj1, 'Timeout', 3);
%set(obj1, 'TimerPeriod', 2.0);
%set(obj1, 'OutputBufferSize', 10000);
%set(obj1, 'InputBufferSize', 10000);
set(obj1, 'Readasyncmode', 'continuous');

% --- set(obj1, 'Parity', 'space');

% Open connection to instrument object, obj1.
fopen(obj1);
flushinput(obj1);
flushoutput(obj1);

for i = 1 : x
    %get(obj1);

    % Communicating with instrument object, obj1.

    % increase sequence number
    seq_nr = seq_nr + 1;
    seqplot = [seqplot seq_nr];

    flushinput(obj1);
    flushoutput(obj1);
    pause(timer);

    fwrite(obj1, y5, 'uint8', 'async');
    readData = fread(obj1, 128, 'uint8');
    data = [readData];

    % Converting to binary representation (8 bit precision)
    temp = 0;
    y = uint8(data);
    for k = 1 : length(y)
        x1 = int2bin8(y(k));
        binTemp = vertcat(binTemp, x1);
        temp = vertcat(temp, x1);

        % Accumulated error (in a single data stream)
        accumulated_error = vertcat(accumulated_error, sum(binTemp));
    end

    % SEU Evaluation - to get total SEU, multiply SEU with number of
    % runs (SEU*seqplot)
    SEU = [SEU SEU_eval(temp)];

    % SET Evaluation - to get total SET, multiply SET with runs (SET*x)
    % IF a SEU is detected 61 SET are subtracted from the SET detection
    SET = [SET (SET_eval(temp) - (SEU(length(SEU))*61))];

    % Removal of delay induced "SETs" at low supply voltages
    if strcmp(y5, 'E') == 1
        if temp(456) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(585) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(200) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(649) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(840) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(839) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
    end

    if strcmp(y5, 'F') == 1
        if temp(200) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(378) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(393) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(585) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(840) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(841) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
    end

    if strcmp(y5, 'B') == 1
        if temp(201) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
    end

    if strcmp(y5, 'C') == 1
        if temp(201) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(393) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(456) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(585) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(649) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(841) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
    end

    if strcmp(y5, 'D') == 1
        if temp(265) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(393) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(585) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
        if temp(841) == 1
            SET(length(SET)) = SET(length(SET)) - 2;
        end
    end

    % Draw output
    subplot(2, 2, 1);
    y1 = plot(binTemp);
    title('Raw data from FPGA');
    ylabel('Uncorrelated events');
    xlabel('FPGA sample');

    % Good for survey during the experiment
    subplot(2, 2, 2);
    y2 = plot(accumulated_error);
    title(sprintf('Accumulated errors, VDD = %d -> SETUP: %s.', VDD, string))
    ylabel('Upsets');
    xlabel('RS232 package (FPGA sample/8 bit)');

    subplot(2, 2, 3);
    y2 = plot(seqplot, SET);
    title(sprintf('Single event transients, VDD = %d -> SETUP: %s.', VDD, string));
    ylabel('SET');
    xlabel('Run Sequence');

    subplot(2, 2, 4);
    y2 = plot(seqplot, SEU);
    title(sprintf('Single event upsets, VDD = %d -> SETUP: %s', VDD, string))
    ylabel('SEU');
    xlabel('Run Sequence');

    store = sprintf('VDD = %d -> SETUP: %s', VDD, string);
    %drawnow update
    save accumulated_error;
    save binTemp;
    save SEU;
    save SET;
end

% Disconnect from instrument object, obj1.
fclose(obj1);
```

# Appendix C

## PCB Layout

**Figure C.1** PCB layout

# Appendix D

## ST Microelectronics chip (90 nm general purpose process)

**Figure D.1** ST Microelectronics chip with a level shifter and a D flip flop based on minority-3 gates and inverters

# Appendix E

## TSMC chip (90 nm low power process)

**Figure E.1** TSMC chip for radiation testing
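As a companion to the listings in Appendix A and Appendix B, the following Python sketch illustrates how the 128 bytes read per data stream map back to 1024 compared bits (four register sets of 256 bits). It is an illustration only and not part of the thesis software: `unpack_error_bits` and `error_positions` are hypothetical names, and the LSB-first bit order is an assumption based on reading the shift register in Appendix A. The `int2bin8` helper called by the Matlab script is not listed, so its bit order may differ.

```python
from typing import Iterable, List


def unpack_error_bits(frame_bytes: Iterable[int]) -> List[int]:
    """Expand received bytes into one 0/1 flag per compared bit, oldest first."""
    bits: List[int] = []
    for byte in frame_bytes:
        if not 0 <= byte <= 255:
            raise ValueError(f"not an 8-bit value: {byte}")
        # Assumed LSB-first: bit 0 of each byte is the earliest comparison.
        bits.extend((byte >> k) & 1 for k in range(8))
    return bits


def error_positions(frame_bytes: Iterable[int]) -> List[int]:
    """Return the zero-based stream positions at which a mismatch was flagged."""
    return [i for i, bit in enumerate(unpack_error_bits(frame_bytes)) if bit]
```

Under these assumptions, position `i` belongs to register set `i // 256` at bit `i % 256`, which is how a flagged position could be traced back to one of the four compared registers.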