# Ultra Low-Power Fault-Tolerant SRAM Design in 90nm CMOS Technology

A Thesis Presented to the College of Graduate Studies and Research
In Fulfillment of the Requirement For the Degree of Master of Science
In the Department of Electrical and Computer Engineering
University of Saskatchewan
Saskatoon, Saskatchewan, Canada

By Kuande Wang

© Copyright Kuande Wang, June 2010. All rights reserved.

#### PERMISSION TO USE

In presenting this thesis in partial fulfilment of the requirements for a Postgraduate degree from the University of Saskatchewan, I agree that the libraries of this University may make it freely available for inspection. I further agree that permission for copying of this thesis in any manner, in whole or in part, for scholarly purposes may be granted by the professor or professors who supervised my thesis work or, in their absence, by the Head of the Department or the Dean of the College in which my thesis work was done. It is understood that any copying or publication or use of this thesis or parts thereof for financial gain shall not be allowed without my written permission. It is also understood that due recognition shall be given to me and to the University of Saskatchewan in any scholarly use which may be made of any material in my thesis.

Requests for permission to copy or to make other use of material in this thesis in whole or part should be addressed to:

Head of the Department of Electrical and Computer Engineering
57 Campus Drive
University of Saskatchewan
Saskatoon, Saskatchewan, Canada S7N 5A9

#### **ACKNOWLEDGEMENTS**

First, I would like to express my sincere gratitude and appreciation to my supervisor, Dr. Li Chen, for his tremendous support, invaluable guidance and constant encouragement during the course of my studies. The completion of this thesis would not have been possible without Dr. Chen's exceptional supervision and everlasting support.
I am also grateful to him for providing me with various opportunities to pursue a dynamic and fascinating area of digital systems as well as to explore opportunities outside the lab. I also wish to thank all the members of the VLSI lab; working with them made my time during graduate study a wonderful experience. My sincere thanks also go to my parents and my girlfriend for their continuous support and encouragement throughout my studies.

#### **ABSTRACT**

With the growth of mobile, biomedical and space applications, digital systems with low power consumption are required. As a main part of digital systems, low-power memories are especially desired. Reducing the power supply voltage to the sub-threshold region is one of the effective approaches for ultra low-power applications. However, the reduced Static Noise Margin (SNM) of Static Random Access Memory (SRAM) imposes great challenges on sub-threshold SRAM design. The conventional 6-transistor SRAM cell does not function properly in the sub-threshold supply voltage range because it does not have enough noise margin for reliable operation. To achieve ultra low power in sub-threshold operation, previous research has demonstrated that a read and write decoupled scheme is a good solution to the reduced SNM problem. A Dual Interlocked Storage Cell (DICE) based SRAM cell was proposed to eliminate the drawback of the conventional DICE cell during read operation. This cell can mitigate single-event effects, improve stability and maintain the low-power characteristic of sub-threshold SRAM. To make the proposed SRAM cell work under power supply voltages from 0.3 V to 0.6 V, an improved replica sense scheme was applied to produce a reference control signal, with which the optimal read time could be achieved. In this thesis, a 2K×8-bit SRAM test chip was designed, simulated and fabricated in 90nm CMOS technology provided by ST Microelectronics.
Simulation results suggest that the operating frequency at $V_{DD}$ = 0.3 V is up to 4.7 MHz with a power dissipation of 6.0 $\mu$W, while it is 45.5 MHz at $V_{DD}$ = 0.6 V, dissipating 140 $\mu$W. However, the area occupied by a single cell is larger than that of a conventional SRAM cell due to the additional transistors used. The main contribution of this thesis is a new design that simultaneously addresses the ultra low-power and radiation-tolerance problems in large-capacity memory design.

*Key words*: SRAM, sub-threshold, Dual Interlocked cell (DICE), fault tolerance, Single Event Upset (SEU)

# TABLE OF CONTENTS

- PERMISSION TO USE
- ACKNOWLEDGEMENTS
- ABSTRACT
- TABLE OF CONTENTS
- LIST OF FIGURES
- LIST OF TABLES
- LIST OF ABBREVIATIONS
- CHAPTER 1 BACKGROUND AND MOTIVATION
  - 1.1 Ultra Low-Power Design
  - 1.2 Single Event Upset Tolerant Design
  - 1.3 Thesis Outline
- CHAPTER 2 INTRODUCTION
  - 2.1 Conventional SRAM Cell and SNM
  - 2.2 Challenges for Sub-Threshold SRAM Design
    - 2.2.1 Stability of SRAM Cells
    - 2.2.2 Sense Amplifier Problems
    - 2.2.3 Reduced Number of Cells Per Bitline
  - 2.3 Previous Sub-Threshold SRAM Design
    - 2.3.1 Verma's 8T Sub-Threshold SRAM
    - 2.3.2 Kim's 10T Sub-Threshold SRAM
    - 2.3.3 Jinhui Chen's Sub-Threshold Register File Design
    - 2.3.4 Zhai's 6T Sub-Threshold SRAM Design
    - 2.3.5 Chang's 10T Sub-Threshold SRAM Design
    - 2.3.6 Sub-Threshold SRAM Design Summary
  - 2.4 Single-Event Upset in SRAM Design and Mitigation Methods
- CHAPTER 3 FAULT TOLERANT ULTRA LOW-POWER SRAM DESIGN
  - 3.1 Overview of the Chip Design
  - 3.2 Detailed SRAM Design
    - 3.2.1 Fault Tolerant Sub-Threshold SRAM Cell Design
    - 3.2.2 Replica Technique for Read Operation Employing Dummy Column and Row
    - 3.2.3 Sense Amplifier Design
    - 3.2.4 Boost Circuit Design
    - 3.2.5 Row and Column Decoders
    - 3.2.6 Address Transition Detector (ATD)
    - 3.2.7 Input / Output Buffer Design
  - 3.3 Read and Write Control Sequences
    - 3.3.1 Read Control Sequence
    - 3.3.2 Write Control Sequence
  - 3.4 Layout of the SRAM Design
- CHAPTER 4 DESIGN SIMULATION AND VERIFICATION
  - 4.1 Verification of Critical Components in SRAM Chip
    - 4.1.1 SRAM Cell Simulation
    - 4.1.2 Sense Amplifier Simulation
    - 4.1.3 Verification of Voltage Boosting Circuit
    - 4.1.4 Verification of ATD Circuit
  - 4.2 Verification for the Overall SRAM Chip
  - 4.3 Performance Simulation of the SRAM Design
- CHAPTER 5 CHIP TESTING RESULTS
  - 5.1 SRAM Functional Testing
    - 5.1.1 Testing Board Configuration
    - 5.1.2 Functional Testing Results
  - 5.2 Single Event Upset Tolerance Testing
  - 5.3 Testing Results Analysis
- CHAPTER 6 CONCLUSION AND FUTURE WORK
  - 6.1 Summary and Conclusions
  - 6.2 Future Work

## **LIST OF FIGURES**

- Figure 2.1 The conventional 6-transistor SRAM cell
- Figure 2.2 The cross coupled inverter with noise source included [5]
- Figure 2.3 SNM estimation based on maximum square [7]
- Figure 2.4 Circuit implementation of SNM simulation [7]
- Figure 2.5 Leakage current in one bitline
- Figure 2.6 Verma's 8T cell [17]
- Figure 2.7 Charge pump circuit for read buffer feet [17]
- Figure 2.8 Kim's 10T cell [18]
- Figure 2.9 Read buffer circuit used in sensing
- Figure 2.10 Register file cell design proposed by Chen [19]
- Figure 2.11 6T SRAM cell design in [20]
- Figure 2.12 Chang's sub-threshold SRAM cell [22]
- Figure 2.13 SEU in SRAM cell
- Figure 2.14 SEU tolerant DICE cell
- Figure 2.15 Single event transient responses in DICE cell (from [30])
- Figure 3.1 The simplified SRAM test chip architecture
- Figure 3.2 Simulation waveforms for upsets in DICE cell
- Figure 3.3 Proposed SEU tolerant sub-threshold SRAM cell
- Figure 3.4 Proposed cell layout
- Figure 3.5 Dummy cells in replica technique
- Figure 3.6 Self-timed sensing scheme
- Figure 3.7 Differential latch-typed sense amplifier
- Figure 3.8 Voltage at cell nodes with and without boosted write voltage
- Figure 3.9 Write word line boost circuit
- Figure 3.10 Row and column decoders
- Figure 3.11 Address transition detector circuit
- Figure 3.12 Inverter chain based delay circuit
- Figure 3.13 Bidirectional data bus
- Figure 3.14 The complete read control circuit
- Figure 3.15 Read operation timing sequence
- Figure 3.16 Write control circuit
- Figure 3.17 Write operation timing sequence
- Figure 3.18 The whole SRAM layout design
- Figure 4.1 Single event tolerance of the proposed cell
- Figure 4.2 Simulation waveform of sense amplifier
- Figure 4.3 Monte-Carlo simulation of sense amplifier
- Figure 4.4 Boost circuit simulation waveform
- Figure 4.5 Simulation waveforms of ATD
- Figure 4.6 Simulation waveform for supply voltage at 0.3 V
- Figure 4.7 Simulation waveform for supply voltage at 0.4 V
- Figure 4.8 Simulation waveform for supply voltage at 0.5 V
- Figure 4.9 Simulation waveform for supply voltage at 0.6 V
- Figure 5.1 Schematic of functional testing board
- Figure 5.2 Simple on-chip level shifter for input signals
- Figure 5.3 SRAM scanning process for read functionality testing
- Figure 5.4 SRAM scanning process for write operation testing

# LIST OF TABLES

- Table 2.1 Hold SNM under various supply voltages in different technologies [9]
- Table 2.2 Comparison of SNM of 10T and conventional 6T cell
- Table 3.1 IO pads used in SRAM chip
- Table 4.1 Performance simulation results with different power supplies
- Table 4.2 Standby leakage current comparison of three cells
- Table 5.1 Main devices used in testing board
- Table 5.2 Data collected from Address 0-15 without writing
- Table 5.3 Data collected from Address 0-15 with Data Pattern "0x00" written in

#### LIST OF ABBREVIATIONS

- ATD: Address Transition Detection
- BL: Bitline
- CMOS: Complementary Metal-Oxide-Semiconductor
- CQFP: Ceramic Quad Flat Package
- DAC: Dummy Active Cell
- DFT: Design For Test
- DICE: Dual-Interlocked storage Cell
- DLT: Differential Latch-Typed
- DRBL: Dummy Read Bitline
- DSC: Dummy Sleeping Cell
- DUT: Device Under Testing
- DWWL: Dummy Write Wordline
- ECC: Error Correction Coding
- ESD: Electro Static Discharge
- FPGA: Field Programmable Gate Array
- GBV: Global Boosted Voltage
- HVT: High Voltage Transistor
- IO: Input and Output
- LVS: Layout Versus Schematic
- PC: Personal Computer
- PCB: Printed Circuit Board
- RBL: Read Bitline
- RD: Read signal
- RHBD: Radiation Hardened By Design
- RWL: Read Wordline
- SA: Sense Amplifier
- SEE: Single Event Effect
- SEN: Sense amplifier Enable signal
- SEQ: Sense amplifier Equalization signal
- SEU: Single Event Upset
- SNM: Static Noise Margin
- SRAM: Static Random Access Memory
- SSSC: Saskatchewan Structural Sciences Centre
- SVT: Standard Voltage Transistor
- VGND: Virtual Ground
- VVDD: Virtual VDD
- WBL: Write Bitline
- WL: Wordline
- WR: Write signal
- WWL: Write Wordline

#### **CHAPTER 1**

#### **BACKGROUND AND MOTIVATION**

A large number of register files and SRAMs (Static Random Access Memories) are used in modern microprocessors for high-speed computation. These data-storing sub-systems occupy a significant area of the whole microprocessor chip. Therefore, the leakage current of these memories contributes a large portion of the chip's total power consumption. With the increased complexity of microprocessors and digital signal processors, the amount of on-chip register files and SRAMs is expected to increase significantly while the demand for high speed persists. The motivation for ultra low-power fault-tolerant memory design comes from two emerging requirements: ultra low-power consumption and Single-Event Upset (SEU) tolerance.
#### 1.1 Ultra Low-Power Design

With the growth of mobile, biomedical and space applications in recent years, there is an aggressive demand to reduce power consumption and improve the reliability of semiconductor systems, while speed is a secondary consideration.

In biomedical applications, for both invasive and non-invasive medical monitoring systems, low-power or even ultra low-power dissipation is desired due to the light-weight and long-term requirements of such systems. In most recent realizations, power is the most critical limitation, since a heavy battery module makes wearable monitoring systems inconvenient for patients [1]. In battery-free monitoring systems (which use an energy harvesting device instead of a battery), this power constraint is even more critical, since the energy harvesting module can provide only a small fraction of the power consumed by normal semiconductor devices.

In space applications, most of the power is provided by solar panels during the five to ten years of the satellite lifetime. If the power consumption of the devices is lowered, the number of solar panels, and hence the weight, can be reduced, or more equipment can be carried on the satellite. Also, although rechargeable batteries are used, battery operation time is still a limitation for most mobile consumer electronics.

The most efficient method to reduce the overall power consumption of semiconductor devices is to lower the supply voltage. As the supply voltage is reduced, both static and dynamic power dissipation are reduced. Static power is reduced due to the reduced leakage current. Equation (1.1) indicates the relationship between supply voltage and dynamic power dissipation.

$$P_{dyn} \propto f \cdot V_{DD}^2 \cdot C \tag{1.1}$$

where $f$ is the operating frequency of the circuit, $V_{DD}$ is the power supply voltage, and $C$ is the load capacitance switched between ground and $V_{DD}$.
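To get a feel for Equation (1.1), the short sketch below evaluates the relative dynamic power when the supply is lowered. The function name and the 1.0 V reference supply are illustrative assumptions rather than values taken from this design; frequency and capacitance are held constant by default, although in practice the achievable frequency also drops as $V_{DD}$ is lowered.

```python
def relative_dynamic_power(vdd, vdd_ref=1.0, f_ratio=1.0, c_ratio=1.0):
    """Ratio P_dyn(vdd) / P_dyn(vdd_ref), with P_dyn proportional to f * VDD^2 * C.

    vdd_ref = 1.0 V is an assumed reference supply, not a thesis value.
    f_ratio and c_ratio are the frequency and capacitance ratios
    (new / reference); both default to 1.0, i.e. unchanged.
    """
    if vdd_ref <= 0:
        raise ValueError("vdd_ref must be positive")
    return f_ratio * c_ratio * (vdd / vdd_ref) ** 2


if __name__ == "__main__":
    for vdd in (0.6, 0.3):
        ratio = relative_dynamic_power(vdd)
        print(f"VDD = {vdd:.1f} V -> {ratio:.1%} of the reference dynamic power")
```

With frequency and capacitance unchanged, moving from the assumed 1.0 V reference to 0.3 V leaves about 9% of the dynamic power, which illustrates why supply scaling is so effective.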
Ultra low-power performance can be achieved by aggressively lowering $V_{DD}$ to the sub-threshold range of CMOS transistors. However, when the supply voltage is decreased to the sub-threshold range, the conventional 6-transistor SRAM is no longer functional due to the significantly reduced Static Noise Margin (SNM), while combinational logic still works, at the price of reduced speed. Therefore, in order to implement ultra low-power systems using sub-threshold operation, the first task is to design a functional memory, such as an SRAM, in the sub-threshold voltage region.

#### 1.2 Single Event Upset Tolerant Design

Single Event Effects (SEE) in microelectronics are caused when energetic particles present in the natural space environment (e.g., protons, neutrons, alpha particles, or other heavy ions) strike sensitive regions or paths of a microelectronic circuit [2]. If a change of logic state occurs due to the SEE, it is called a Single Event Upset (SEU). SEUs in memory have been reported in both terrestrial and space microelectronics since the 1970s [3, 4]. They were induced either by alpha particles from contaminated packages in terrestrial memories or by cosmic rays in satellite memory subsystems.

With the scaling of CMOS technologies, SEUs have become a major concern when designing highly reliable microelectronics. Especially in space applications, even an error rate of one per day cannot be ignored [2]. When an SRAM works at sub-threshold voltage, it becomes more vulnerable to SEUs than an SRAM working at normal voltage, since the noise margin is restricted by the decreased supply voltage [5]. Therefore, if ultra low-power SRAMs are to be used in space or reliability-critical applications, techniques to mitigate SEUs should be applied. Error Correction Coding (ECC) is a general approach to solve this problem.
However, when multi-bit upsets occur, ECC may not be suitable, since the cost of the complicated coding and decoding scheme needed for multi-bit correction is too high: it brings large overhead and high power consumption, as well as degraded speed. Therefore, other Radiation-Hardened-By-Design (RHBD) techniques targeted at the memory cells should be explored. Although some attempts to employ storage cell-based RHBD techniques in sub-threshold memory design (register files) already exist [6], data storage SRAM with reasonable cell density still needs to be studied, because it is essential in ultra low-power radiation-tolerant applications.

#### 1.3 Thesis Outline

The rest of this thesis is organized as follows. In Chapter 2, SRAM design and the stability issue are introduced first. Then the challenges in sub-threshold SRAM design are examined. After that, previous ultra low-power sub-threshold SRAM designs are reviewed, together with the methods they use to solve the problems of sub-threshold operation. An introduction to SEU and its mechanism in SRAMs, as well as previous SEU mitigation methods, is also given in Chapter 2.

The design of the sub-threshold DICE-based SRAM is given in Chapter 3. The overall architecture and the detailed designs are introduced, including the SRAM cell, sense amplifier, Address Transition Detector (ATD), decoders, and read and write sequence generation circuits. The chip layout and the pad design are also included.

Chapter 4 shows simulation results for the SRAM design in 90nm CMOS technology, covering both the individual components and the whole chip. This SRAM design was fabricated in a 7-metal-layer ST 90nm CMOS technology and has a core size of $520\,\mu m \times 600\,\mu m$. The testing results of the fabricated chip are presented in Chapter 5, following a description of the test PCB architecture.
Conclusions and future work to improve this design are given in Chapter 6.

## **CHAPTER 2**

## **INTRODUCTION**

Like most sequencing elements in digital systems, memory cells used in on-chip memories can be divided into static and dynamic structures. While a dynamic structure uses a capacitor, a static structure employs cross-coupled inverters to keep the data. Static memories are faster and more stable, but require more area per bit.

#### 2.1 Conventional SRAM Cell and SNM

The most popular and widely implemented structure in commercial SRAMs is the conventional 6-Transistor (6T) structure, shown in Figure 2.1.

Figure 2.1 The conventional 6-transistor SRAM cell

In microelectronics, the cross-coupled inverter pair in the 6T cell is the foundation of static storage elements, including register files, SRAMs, latches and flip-flops. As indicated in [7], there are two important aspects in SRAM design: one is cell area and the other is stability. The former can be reduced by technology scaling, but the latter is always a significant issue. To analyze the stability of SRAM cells, a number of researchers have proposed different ways to define it [7, 8]. The Static Noise Margin (SNM) and its analytic criteria, proposed by Seevinck [8], are widely used to evaluate the stability of SRAM cells. As shown in Figure 2.2, the SNM is defined as the maximum value of $V_n$ that can be tolerated by the SRAM cell before it changes state. *INV1* and *INV2* are the two cross-coupled inverters in the cell.

Figure 2.2 The cross coupled inverter with noise source included [5]

The SNM can be evaluated during read operation and in standby mode, giving the read SNM and the hold SNM respectively. During read operation, N3 and N4 in Figure 2.1 are turned on and BL and $\overline{\text{BL}}$ are connected to $V_{DD}$, while in standby mode N3 and N4 are turned off.
During a read, the conducting access transistors raise the voltage of the internal node storing "0" above ground. Therefore, the read SNM is much smaller than the hold SNM, which means the read SNM is much more critical in studying the stability of an SRAM cell. In the following, SNM always refers to the read SNM unless otherwise specified.

From the voltage transfer characteristics of the cross-coupled inverters, shown in Figure 2.3, we can estimate the SNM from the maximum squares that fit between the two transfer curves. The edge length of the maximum square is the quantitative measure of the SNM. In designing SRAM cells, we always pursue a larger maximum square, which means a higher SNM and higher cell stability.

Figure 2.3 SNM estimation based on maximum square [7]

Besides this estimation method, a convenient simulation method was also proposed by Seevinck [7]. The difference between the two outputs, $V_{out1} - V_{out2}$, can be calculated from simulation. The maximum values of $V_{out1} - V_{out2}$ are equal to the diagonals of the maximum squares. The SNM is obtained by multiplying the smaller of the two by $1/\sqrt{2}$.

Figure 2.4 Circuit implementation of SNM simulation [7]

Seevinck's method provides the foundation for studying SRAM stability and gives us the criteria for designing SRAM cells. In the following parts of this thesis, this method is applied when comparing the stability of different cell designs and of the same design under different supply voltages.

#### 2.2 Challenges for Sub-Threshold SRAM Design

There are several challenges when the supply voltage is scaled down to the sub-threshold region, including the stability of cells, sense amplifier problems, and the reduced number of cells per bitline.

#### 2.2.1 Stability of SRAM Cells

When an SRAM works at the standard supply voltage of a given technology, the SNM of the 6T cell is large enough to ensure sufficient stability and an acceptable soft error rate. However, if the supply voltage is lowered to near-threshold, the noise margin decreases significantly.
Table 2.1 shows the hold SNM under different $V_{DD}$ in 130nm, 90nm and 65nm CMOS technologies (part of the data is from [9]). At a 1 V supply, the hold SNMs are 356.2 mV, 346.0 mV and 335.2 mV respectively. However, when the supply voltage is reduced to near-threshold, the SNMs are only 153.8 mV, 148.4 mV and 142.5 mV at a 0.4 V $V_{DD}$ in the 130nm, 90nm and 65nm technologies respectively, less than one half of those at a 1 V $V_{DD}$. It was reported that for a conventional 6T cell SRAM, $V_{DD}$ cannot be scaled below 0.7 V if the SRAM is to function well in 65nm CMOS technology [10, 11].

Table 2.1 Hold SNM under various supply voltages in different technologies [9]

| Technology | $V_{DD}$ (V) | SNM (V) |
|------------|--------------|---------|
| 130nm | 1.0 | 0.3562 |
| 130nm | 0.8 | 0.3019 |
| 130nm | 0.6 | 0.2334 |
| 130nm | 0.4 | 0.1538 |
| 90nm | 1.0 | 0.3460 |
| 90nm | 0.8 | 0.2949 |
| 90nm | 0.6 | 0.2273 |
| 90nm | 0.4 | 0.1484 |
| 65nm | 1.0 | 0.3352 |
| 65nm | 0.8 | 0.2880 |
| 65nm | 0.6 | 0.2218 |
| 65nm | 0.4 | 0.1425 |

#### 2.2.2 Sense Amplifier Problems

Another critical component in SRAM design is the sense amplifier. The performance of the sense amplifier strongly affects the speed and power consumption of the whole SRAM design. Therefore it needs to be examined carefully under sub-threshold operation. Various sense amplifiers were used in previous super-threshold SRAM designs [12, 13, 14, 15, 16]. However, only sense amplifiers that function over a wide supply range and are fast enough are suitable for sub-threshold SRAM design. Since all the transistors in the sense amplifier work at ultra low voltage, the speed is significantly decreased compared to normal operation.
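As a quick numerical check on Table 2.1 in Section 2.2.1, the sketch below computes how much of the hold SNM remains at 0.4 V relative to 1.0 V in each technology. The dictionary simply transcribes the table; the variable and function names are illustrative.

```python
# Hold SNM values (in volts) transcribed from Table 2.1, keyed by
# technology node and then by supply voltage VDD (in volts).
HOLD_SNM = {
    "130nm": {1.0: 0.3562, 0.8: 0.3019, 0.6: 0.2334, 0.4: 0.1538},
    "90nm": {1.0: 0.3460, 0.8: 0.2949, 0.6: 0.2273, 0.4: 0.1484},
    "65nm": {1.0: 0.3352, 0.8: 0.2880, 0.6: 0.2218, 0.4: 0.1425},
}


def snm_retention(tech, vdd_low=0.4, vdd_high=1.0):
    """Fraction of the hold SNM at vdd_high that remains at vdd_low."""
    table = HOLD_SNM[tech]
    return table[vdd_low] / table[vdd_high]


if __name__ == "__main__":
    for tech in HOLD_SNM:
        kept = snm_retention(tech)
        print(f"{tech}: {kept:.1%} of the 1.0 V hold SNM remains at 0.4 V")
```

For all three technologies the ratio falls between 0.42 and 0.44, which agrees with the statement that the SNM at 0.4 V is less than one half of its value at 1 V.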
#### 2.2.3 Reduced Number of Cells per Bitline

When conventional SRAM cells are used in a sub-threshold design, the total number of cells per bitline is reduced because of the reduced ratio between the read current $I_{Read}$ of the accessed cell and the leakage current $I_{Leakage}$ of the un-accessed cells, shown in Figure 2.5. This phenomenon limits the cell density of the memory chip.

Figure 2.5 Leakage current in one bitline

It was reported that at most 64 cells could be attached to one bitline if no other technique was applied [17]. This number is small compared to the more than 256 cells attached to one bitline in most commercially available memories.

Other components in an SRAM, such as the address decoder, address transition detector, and read and write sequence generation circuits, must also be designed properly, though they are not as critical as the cell and sense amplifier designs.

In conclusion, the design of sub-threshold memories faces great challenges compared to conventional super-threshold memory design due to the aggressively reduced supply voltage.

#### 2.3 Previous Sub-Threshold SRAM Design

A number of sub-threshold SRAM designs have been introduced previously. Different structures were employed to increase the read SNM and obtain acceptable cell stability. Since the hold SNM is much larger than the read SNM for conventional 6T cells, most of the previous designs decouple the read and write operations, which raises the read SNM to the level of the hold SNM. Therefore, the noise margin is not reduced during read operation. The price of this increased noise margin is the additional transistors needed for the decoupling.

#### 2.3.1 Verma's 8T Sub-Threshold SRAM

A minimum supply voltage of 350 mV was achieved by an 8T sub-threshold SRAM proposed by Verma in a 65nm CMOS technology [17]. The schematic of the cell is shown in Figure 2.6.
To address the challenges of sub-threshold SRAM, this two-port cell topology has a 6T storage cell and a 2T read buffer, which isolates the data-retention structure during read accesses. As shown in Figure 2.6, when node *Q* is low and *QB* is high (data "0" is stored in the cell) during read operation, transistor *N5* is turned on, the read bitline RBL is discharged, and data "0" is read out. This 8T structure thus uses two additional transistors to achieve a higher read SNM.

Figure 2.6 Verma's 8T cell [17]

To allow more cells to be attached to the bitline, a "zero leakage read buffer" is applied in Verma's design. During read operation, the "buffer foot" of the accessed cell is set to GND while those of the un-accessed cells are set to $V_{DD}$. Since the read bitline RBL is pre-charged to $V_{DD}$, there is no leakage current from the un-accessed cells, so the number of cells attached to one bitline can be increased. Up to 256 cells can be attached to one bitline. Also, to make the buffer foot able to sink enough current to drive more cells in one row, a charge pump circuit is implemented, as shown in Figure 2.7. Furthermore, a sense amplifier redundancy technique is used to reduce the offset-induced sensing error for a given area constraint [17].

Figure 2.7 Charge pump circuit for read buffer feet [17]

#### 2.3.2 Kim's 10T Sub-Threshold SRAM

A 10T, 0.2 V sub-threshold SRAM design with 1K cells per bitline in 130nm CMOS technology was proposed by Kim [18]. Figure 2.8 shows the 10T cell design. It consists of a conventional 6T structure and four decoupled read access transistors.

Figure 2.8 Kim's 10T cell [18]

The write bitlines (WBL and WBLB) and the read bitline (RBL) are pre-charged to $V_{DD}$ before the cell is accessed. During read operation, if QB is high and Q is low ("0" is stored in the cell), RBL is discharged and the data "0" is consequently read out.
Otherwise, RBL stays at $V_{DD}$ and data "1" is read out. Regarding the stability of this cell, the SNM reported in [18] is listed in Table 2.2.

Table 2.2 Comparison of SNM of the 10T and conventional 6T cells

| $V_{DD}$ | 10T proposed in [18] | Conventional 6T |
|----------|----------------------|-----------------|
| 0.2 V | 82 mV | 24 mV |
| 1.2 V | 534 mV | 105 mV |

The SNM of the 10T cell is more than three times that of the conventional 6T cell, which makes the cell more stable during read operation.

In order to attach more cells to one bitline, the four read access transistors were designed so that the bitline leakage current is independent of the stored data. When the cell is un-accessed, the drain voltage of P3 is $V_{DD}$, which forces the leakage current to flow from the cell into the bitline regardless of the data stored in the cell. If the accessed cell holds data "0", the level of RBL is decided by the balance between the pull-down read current of the accessed cell and the pull-up leakage current of the remaining un-accessed cells. If the accessed cell holds data "1", RBL remains at $V_{DD}$. In this way, the bitline voltage swing during read operation can reach as high as 130 mV at a 0.2 V supply voltage for a 1K-cell bitline.

The sense amplifiers are replaced with static inverter-type read buffers in this design, shown in Figure 2.9(a).

Figure 2.9 Read buffer circuit used in sensing: (a) read buffer with VGND; (b) read buffer with zero ground level

The virtual ground VGND is obtained from a dummy column, which generates the lower voltage level used as the ground level of the read buffers. This sets the trip point of the read buffer to the middle of the RBL's logic high and low levels. Therefore the sensing margin of the read buffer is improved compared to a conventional inverter-type read buffer with zero ground level, shown in Figure 2.9(b).
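The benefit of the virtual ground can be illustrated with a small numerical sketch. It uses the 0.2 V supply and 130 mV bitline swing quoted above and assumes, purely for illustration, an ideal inverter whose trip point sits midway between its own supply rails and a VGND equal to the RBL logic low level. The function names are hypothetical, and real trip points depend on device sizing and process variation.

```python
def trip_point(v_high_rail, v_low_rail):
    """Trip point of an idealised inverter: midway between its rails.

    This symmetric-inverter model is an assumption for illustration only.
    """
    return 0.5 * (v_high_rail + v_low_rail)


def sensing_margin(rbl_high, rbl_low, v_trip):
    """Smaller of the two distances from the trip point to the RBL levels.

    A negative result means one RBL level lies on the wrong side of the
    trip point, i.e. that level cannot be sensed.
    """
    return min(rbl_high - v_trip, v_trip - rbl_low)


if __name__ == "__main__":
    vdd = 0.2                # supply voltage quoted for the design in [18]
    swing = 0.13             # RBL swing quoted for a 1K-cell bitline
    rbl_high = vdd
    rbl_low = vdd - swing    # RBL level when a "0" is read

    # Figure 2.9(b): conventional buffer with its ground rail at 0 V.
    margin_gnd = sensing_margin(rbl_high, rbl_low, trip_point(vdd, 0.0))
    # Figure 2.9(a): ground rail replaced by VGND, assumed equal to rbl_low.
    margin_vgnd = sensing_margin(rbl_high, rbl_low, trip_point(vdd, rbl_low))

    print(f"zero-ground buffer margin: {margin_gnd * 1e3:.0f} mV")
    print(f"VGND buffer margin:        {margin_vgnd * 1e3:.0f} mV")
```

Under these assumptions the margin roughly doubles, from about 30 mV to about 65 mV, because the VGND-referenced trip point is centred between the two RBL levels.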
#### 2.3.3 Chen's Sub-Threshold Register File Design

Register file cells are similar to SRAM cells and can sometimes be built from SRAM cells. Chen proposed a 10T sub-threshold register file design fabricated in 130nm CMOS technology. The schematic of this cell is shown in Figure 2.10 [19].

Figure 2.10 Register file cell design proposed by Chen [19]

During read operation, when data "1" is stored in the cell (node X is high), the read bitline RBL, pre-charged to $V_{DD}$, is driven low. To improve the write ability of this cell, a transistor Pgate is used to cut off the feedback path of the cell. This design applies a hierarchical bitline structure and hence has global and local bitlines. Because only a small number of cells are attached to each local bitline, it uses only two stacked read access transistors, N1 and N2. $V_{DD}$ can be decreased to as low as 103 mV for read operation and 129 mV for write operation. The full chip functions properly at 216 mV with a frequency of 260 kHz.

#### 2.3.4 Zhai's 6T Sub-Threshold SRAM Design

In order to increase the integration density of SRAM chips in sub-threshold operation, Zhai proposed a 6T SRAM cell design fabricated in 130nm CMOS technology [20], shown in Figure 2.11.

Figure 2.11 6T SRAM cell design in [20]

The read and write access transistors are carefully sized to obtain enough SNM during read operation. Since the access transistors are relatively small, write ability is constrained. To deal with this issue, virVDD and virGND are applied in this design. During write operation, the level of virVDD decreases and the level of virGND increases (while remaining at levels at which the values of un-accessed cells can still be held). In this way, the write ability of the cell increases significantly. To address the problem of the reduced $I_{Read}/I_{Leakage}$ ratio, only 16 cells are attached to one bitline, which may cause an area problem, since fewer cells per bitline means lower cell density.
The authors solved this problem by using inverters instead of sense amplifiers to read the bitline. The area of the relatively large number of inverters is compensated by the eliminated sense amplifiers. It is reported that this SRAM functions properly at a supply voltage as low as 193 mV, and it has already been used in a sensor processor [21].

#### 2.3.5 Chang's 10T Sub-Threshold SRAM Design

All the above sub-threshold cell designs share a common drawback, namely the single-ended read access structure, which can lead to a reduced bitline swing due to bitline leakage current. To address this problem, a differential sub-threshold SRAM cell design was proposed by Chang [22]. It was fabricated in 90nm CMOS technology and its schematic is shown in Figure 2.12.

Figure 2.12 Chang's sub-threshold SRAM cell [22]

In this design, during read operation, the wordline WL is high and either BL or $\overline{\rm BL}$ is driven low from its pre-charged $V_{DD}$ level according to the values at nodes Q and QB. The virtual ground VGND is tied to ground during read operation. In standby mode, VGND rises to $V_{DD}$ to lower the leakage current from the bitlines to VGND, so that more cells can be attached to the bitline. Since the two stacked transistors used for write operation decrease the write ability of the cell, a boosted voltage of 4/3 $V_{DD}$ is applied to the gates of transistors N1, N3 and N2, N4, which saves area by avoiding larger write access transistors. In this design, a latch-type sense amplifier is used to reduce the required sensing margin compared to the inverter-type read buffers above. Another merit of this sub-threshold SRAM design is that it interleaves the bits in one byte, which effectively reduces the multi-bit soft error rate.
Since a single-bit soft error can easily be corrected by conventional Error Correction Coding (ECC), this bit-interleaving technique shows great potential for improving the performance of the whole SRAM chip. According to [22], this 90nm CMOS SRAM design can function properly at a supply voltage as low as 160 mV at a frequency of 500 Hz.

#### 2.3.6 Sub-Threshold SRAM Design Summary

From all the above sub-threshold SRAM designs, we can see that the most critical task is to improve the read SNM of the cell. The designs either use a read and write decoupled scheme or size the read and write access transistors to obtain a higher SNM. We can also conclude that once the read SNM is increased to the level of the hold SNM of the conventional 6T cell, the cell can function properly at supply voltages as low as 200 mV. The second issue in sub-threshold SRAM design is the reduced $I_{Read}/I_{Leakage}$ ratio, and various ways were proposed to address it. Although sense amplifiers are a key component of conventional SRAM designs, they are not always applied in sub-threshold designs, since read speed at such low voltages is not the top priority. Inverter-type read buffers can therefore satisfy the speed requirement and are often applied.

#### 2.4 Single-Event Upset in SRAM Design and Mitigation Methods

Following the discussion of sub-threshold SRAM design, single-event upset (SEU) in SRAM must be considered if high reliability is required. In a modern processor, a large number of register files and SRAMs are implemented on the processor chip in order to improve speed and functional complexity. These storage devices therefore occupy a large portion of the chip area, and consequently capture a large portion of the particles in a radiation environment. Unlike the combinational parts of the processor, memories can "remember" the SEU induced by the particles and affect the functionality of the processor.
In the instruction and data memory of a processor, or in the configuration bits of an SRAM-based FPGA, an SEU can cause catastrophic failure of the system. In addition, for standalone SRAM chips, avoiding SEUs is similarly vital for the performance of the whole digital circuit system, especially in highly reliable applications.

To mitigate single-event effects, the mechanism of SEU needs to be studied first. A number of SEU mechanisms for SRAM have been discussed in previous publications [23-28]. The most widely accepted, simulation-confirmed mechanism is summarized by Dodd [28]. As shown in Figure 2.13, node Q is low and node QB is high, i.e. transistors N2 and P1 are off and N1 and P2 are on.

Figure 2.13 SEU in SRAM cell

Normally, there are two sensitive locations in an SRAM cell in the hold state, i.e. the reverse-biased drain junctions of the two transistors in the off state, namely N2 and P1. When an energetic particle strikes one of these sensitive locations (for example, transistor N2), the charge collected by the junction results in a transient current in the struck transistor. As this current flows through the struck transistor, the restoring transistor P2 sources current in an attempt to balance the particle-induced current. However, P2 has only a finite current drive, i.e. a finite channel conductance. Therefore, the current flowing through P2 causes a voltage drop at node QB. If this voltage drop is large enough, it will be latched by the positive feedback of the cross-coupled inverters. The voltage transient in response to the single-event current transient is the actual mechanism that causes upset in SRAM cells. It is essentially similar to a write voltage pulse at one node, and it can cause the wrong state to be locked into the memory cell.
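
The mechanism above can be illustrated with a first-order sketch. It assumes, purely for illustration, that the restoring transistor P2 behaves as a linear resistor `r_restore`, that the struck node has a lumped capacitance `c_node`, and that the particle current is a rectangular pulse. None of these parameter names or values come from the thesis, and a real sub-threshold device is strongly nonlinear, so the numbers only show the trend.

```python
import math


def peak_voltage_drop(i_pulse, t_pulse, r_restore, c_node):
    """Voltage drop at the struck node at the end of a rectangular current pulse.

    Assumption: the restoring transistor is modeled as a resistor to VDD and
    the node as a single capacitor, so the drop follows an RC step response.
    """
    tau = r_restore * c_node
    return i_pulse * r_restore * (1.0 - math.exp(-t_pulse / tau))


def is_upset(i_pulse, t_pulse, r_restore, c_node, v_flip):
    """True if the drop exceeds v_flip, a hypothetical threshold beyond which
    the cross-coupled inverters would latch the wrong state."""
    return peak_voltage_drop(i_pulse, t_pulse, r_restore, c_node) > v_flip


if __name__ == "__main__":
    # Hypothetical values: a weaker restoring device (larger resistance)
    # produces a larger drop for the same pulse.
    for r in (5e3, 20e3):
        drop = peak_voltage_drop(10e-6, 1e-9, r, 1e-15)
        print(f"r_restore = {r:.0f} ohm -> drop = {drop:.3f} V")
```

In this simplified picture, a stronger restoring transistor (smaller `r_restore`) or a larger node capacitance reduces the drop, which is consistent with the qualitative argument in the text.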
To address this problem, various SRAM structures have been proposed [29-31], among which the Dual Interlocked Storage Cell (DICE) is the most popular and is considered the most robust radiation-hardened memory cell [30]. It is shown in Figure 2.14.

Figure 2.14 SEU tolerant DICE cell

This upset-immune DICE cell design employs two conventional cross-coupled inverter latch structures and has double redundancy. The four nodes X0, X1, X2 and X3 store the data as two pairs of complementary values (i.e., 1010 or 0101), which are simultaneously accessed using transmission gates for write or read operation. Assume that the DICE cell is in the hold state and data 0101 is stored (i.e. nodes X0-X3 are 0101 respectively). If a particle strikes transistor N2 and induces a transient current in N2, there will be a voltage drop at node X1. Transistor P3 therefore turns on and attempts to drive node X2. However, the drive capacity of N3 is larger than that of P3, which results in only a slight level increase at node X2. Meanwhile, the voltage drop at node X1 turns off transistor N1 and leaves node X0 driven only by the leakage currents of P1 and N1. The capacitance at node X0 keeps its level at the original value if the single-event transient is not too long. The single-event transient response for this case is shown in Figure 2.15(a).

Figure 2.15 Single event transient responses in DICE cell (from [30])

If a particle strikes transistor P3 and the voltage level at node X2 increases, X1 will flip while X3 and X0 keep their original values due to their capacitance, as shown in Figure 2.15(b). In conclusion, the DICE cell is immune to both negative and positive voltage fluctuations induced by a single event, and it is widely used for register file cells and memory units in single-event tolerant circuit systems. There has also been an attempt to use the DICE cell in sub-threshold memory design [6].
In this design, a 32×18 bit sub-threshold radiation-tolerant register file is proposed. To further mitigate single-event upsets in a cell, the four nodes of the DICE cell are interleaved in the layout. In this way, multi-node transients can be avoided. SEUs in the DICE cell can therefore be corrected, since no single-node transient is locked into the cell. This DICE-based register file chip can work at a 206 mV supply voltage.

Most of the sub-threshold SRAM designs introduced in Section 2.3 deal only with the ultra low-power issue, and the register file design proposed in [6] is not suitable for mass data storage because of its small number of cells per bitline. A radiation-tolerant SRAM design with reasonable cell density is therefore in great demand. In the following chapter, a DICE-based sub-threshold design is discussed that targets ultra low-power, highly reliable applications in radiation environments.

#### **CHAPTER 3**

#### FAULT TOLERANT ULTRA LOW-POWER SRAM DESIGN

#### 3.1 Overview of the Chip Design

The test chip designed in this project is based on the DICE cell and takes advantage of its SEU immunity in the sub-threshold range. The test chip was designed and simulated in ST Microelectronics 90nm CMOS technology. It is an asynchronous SRAM with a capacity of 2K×8 bits and 256 cells attached to each bitline. It is targeted to work at supply voltages ranging from 0.3 V to 0.6 V. The simplified architecture of this SRAM design is shown in Figure 3.1.

Figure 3.1 The simplified SRAM test chip architecture

There are 8 row address inputs (A0-A7), 3 column address inputs (A8-A10) and 8 bidirectional data IOs (D0-D7), as well as read (RD) and write (WR) enable inputs. The SRAM chip has two major parts: the memory core and the periphery circuits. The memory core contains the cell array, dummy column, dummy row, sense amplifiers and boost circuit. Their names and functions are listed as follows.
- Cell Array: 2K×8 bits DICE-based cell array consisting of 8 columns with 256×8 bits in each column.
- Dummy Column: An additional column used to generate the enabling signal for the sense amplifiers. It has 256×1 bits and the same bitline capacitance as a bitline in the cell array. However, its read current is twice that in the cell array in order to generate the SA enable signal.
- Dummy Row: An additional row used to generate the dummy read wordline signal which "reads" the cells in the dummy column. This additional row gives the dummy read wordline the same capacitance as the wordlines in the cell array.
- Sense Amplifier: Dynamic latch-type sense amplifier with differential inputs.
- Boost Circuit: A boost circuit used to obtain a higher voltage to drive the write access transistors during write operation. It can boost the input voltage to twice $V_{DD}$. Since the highest voltage allowed in the 90nm CMOS technology is 1.2 V, the highest supply voltage of this test chip is therefore 0.6 V, slightly higher than the threshold voltage.

The periphery circuits contain the input/output buffers and the read and write control circuits, including the decoders and the address transition detector. The names are shown in Figure 3.1 and the functions of the periphery circuits are listed as follows.

- Row Decoder: An 8-256 decoder which decodes the lower 8 address inputs (A0-A7) to select the accessed row. There are two decoding outputs for each row, Write WordLine (WWL) and Read WordLine (RWL), active during write and read operation respectively.
- Column Decoder: A 3-8 decoder which decodes the higher 3 address inputs (A8-A10) to select the accessed column.
- ATD: The address transition detection circuit, which generates the reading sequence when a transition of the address or RD input is detected.
- BUF: The tri-state bidirectional buffer for data input and output.

In the following section, the detailed design for each part is given.
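
As a rough illustration of the organization listed above, the following sketch splits an 11-bit address into the row and column fields used by the two decoders and derives the supply limit implied by the 2× boost. It assumes A0 is the least significant address bit, which the text does not state explicitly, and all function and constant names are hypothetical.

```python
ROW_BITS = 8                # A0-A7 select one of 256 rows
COL_BITS = 3                # A8-A10 select one of 8 columns
MAX_DEVICE_VOLTAGE = 1.2    # V, limit quoted in the text for the 90nm process
BOOST_FACTOR = 2.0          # boosted write wordline voltage is about 2 * VDD


def split_address(addr):
    """Split an 11-bit address into (row, column), assuming A0 is the LSB."""
    if not 0 <= addr < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address must fit in 11 bits")
    row = addr & ((1 << ROW_BITS) - 1)
    col = addr >> ROW_BITS
    return row, col


def one_hot(index, width):
    """Behavioral model of a decoder output: exactly one line is active."""
    return [1 if i == index else 0 for i in range(width)]


def max_supply_voltage():
    """Highest VDD for which the boosted voltage stays within the device limit."""
    return MAX_DEVICE_VOLTAGE / BOOST_FACTOR


if __name__ == "__main__":
    row, col = split_address(0x2A5)
    print(row, col)
    print(one_hot(col, 1 << COL_BITS))
    print(max_supply_voltage())
```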
#### 3.2 Detailed SRAM Design

#### 3.2.1 Fault Tolerant Sub-Threshold SRAM Cell Design

As a Radiation-Hardened-By-Design (RHBD) memory cell, the DICE cell provides SEU immunity and is a good choice for sub-threshold memory circuits. Four nodes redundantly store the data as two pairs of complementary values, as shown in Figure 2.14. During the standby mode of the DICE cell, neither a negative nor a positive upset pulse can affect the logic state stored in the cell, which demonstrates the immunity of the DICE cell to radiation-induced upsets. However, this immunity is lost during read operation because of the turned-on access transistors and the precharged bitlines. Recalling the schematic of the DICE cell in Figure 2.14, when the bitlines are both precharged to $V_{DD}$, the turned-on access transistors drive the storage nodes. If an upset pulse occurs during the read operation, the logic state will be flipped, as shown in Figure 3.2. The supply voltage is 1 V and the current pulses induced by the SEU are 70 ps long.

(a) Negative pulse at node X2 (b) Positive pulse at node X3

Figure 3.2 Simulation waveforms for upsets in DICE cell

As indicated in Figure 3.2(a), logic state 1010 was stored in nodes X0, X1, X2, X3 (refer to Figure 2.14) before a positive current upset pulse occurred at node X1. Here, a 70 ps wide current pulse with a magnitude of 75 $\mu$A was applied to simulate a single-event induced current. Since the cell was in read operation, N5-N8 were all on. During the current pulse, node X1 was charged to '1', which induced a negative transient at node X0 since N1 and N8 were both sinking X0. Furthermore, both N4 and P4 were off when node X0 was '0', so the current of the access transistor N5 drove node X3 to '1'. Once X3 was '1', N3 was on, which sank node X2 to '0'. Finally, all the nodes were flipped, i.e. the SEU immunity of the storage nodes is destroyed during read operation.
The key difference between hold state and read operation is whether there is a driving current in access transistor N5 to flip node X3. Figure 3.2(b) shows a negative current upset pulse at node X2. The logic state is also flipped by this pulse.

In order to keep the immunity to SEUs during read operation, a decoupled read scheme is added to the DICE cell design, as shown in Figure 3.3.

Figure 3.3 Proposed SEU tolerant sub-threshold SRAM cell

In this proposed fault tolerant cell, transistors N1-N4 and P1-P4 form the DICE structure for SEU tolerance. Transistors N7-N10 are added for read operation so that no access transistor drives the storage nodes of the cell during read operations. Instead of using four transistors for write operation, we use only N5 and N6 in this cell. This saves some static power as well as area, and the required wordline driving capability is also reduced. There are two wordlines: the Read WordLine (RWL) for read access and the Write WordLine (WWL) for write access. Since only two transistors are used for write operation, a boosted WWL voltage is necessary when "1" is written to the cell. In this design the boost voltage $V_{boost}$ is twice $V_{DD}$, i.e. $V_{boost} = 2V_{DD}$. Although the read and write operations become more complicated and an additional boost circuit is needed, this cell is still valuable and promising in highly reliable applications, because it employs the basic DICE structure to store the data. Rather than adding extra transistors or special structures to support a large number of cells per bitline, Standard Threshold (SVT) transistors are used as the read access transistors, i.e. N7-N10, while the storage transistors are implemented with High Threshold (HVT) transistors in ST 90nm CMOS technology. SVT transistors offer larger driving capability and a higher on-off current ratio $I_{On}/I_{Off}$, but also higher static leakage current. The cell layout is shown in Figure 3.4 (two cells).
Figure 3.4 Proposed cell layout

#### 3.2.2 Replica Technique for Read Operation Employing Dummy Column and Row

In sub-threshold operation, due to the aggressively reduced supply voltage, the gain and speed of a voltage-mode sense amplifier are both degraded, while a current-mode sense amplifier performs better in both respects. Also, in low-power SRAM designs, clocked current-mode sense amplifiers are preferred for the sake of saving power. Therefore, the Differential Latch-Type (DLT) sense amplifier was applied in this design, which, however, raised the problem of how to generate the sense clock. In many previous works, an inverter chain was applied to match the delay of the data path. However, with the supply voltage varying from 0.3 V to 0.6 V, an inverter chain cannot always track and match the data path delay well. A self-timing sense scheme based on the replica technique was proposed in [32], in which the sense clock is obtained by using dummy cells and dummy bitlines. This method was used to generate the proper sensing clock in super-threshold operation. However, it needs to be modified for sub-threshold operation, where the standby leakage current has to be carefully considered.

In terms of leakage current during read operation, the worst case is that all un-accessed cells on a bitline hold the opposite data to the accessed cell. For example, the data stored in the accessed cell is '1', while the other cells on the same bitline store '0'. In this case, the current of the accessed cell, $I_{Read}$, discharges BL while the total leakage current $I_{Leakage}$ of the un-accessed cells in this column simultaneously discharges $\overline{\rm BL}$. In a sub-threshold SRAM in which each column has a large number of cells, the ratio $I_{Read}/I_{Leakage}$ decreases significantly, so the time needed to develop a sufficient voltage swing between BL and $\overline{\rm BL}$ varies with the stored data.
In order to sense this swing correctly, the sense clock should be generated according to the largest time delay. Therefore, in our design, two different types of dummy cells were proposed for the dummy column: the Dummy Active Cell (DAC) and the Dummy Sleeping Cell (DSC), which are shown in Figure 3.5(a) and (b) respectively.

(a) Dummy active cell (DAC) (b) Dummy sleeping cell (DSC)

Figure 3.5 Dummy cells in replica technique

The DAC is the proposed cell with two nodes and the Write WordLine WWL connected to GND, so that it always stores a "0" and is never written, as shown in Figure 3.5(a). The DSC is the proposed cell with the other two nodes and both wordlines connected to ground, as shown in Figure 3.5(b). The DSC always stores a "1" and is never read or written. The dummy column consists of 2 DACs and 254 DSCs, which matches the worst case of read operation.

Figure 3.6 Self-timed sensing scheme

Figure 3.6 shows the self-timing sense scheme. During read operation, RWL<sub>i</sub> and the dummy read wordline DRWL go high simultaneously. The storage cell in the cell array is accessed by RWL<sub>i</sub> while the two DACs in the dummy column are accessed by DRWL to discharge DBL. When DBL is discharged to low and $\overline{DBL}$ remains high, the output of the XOR gate, i.e. the sense clock, goes high to activate the sense amplifier. In this scheme, a DAC drives the same current as the accessed cell, so two DACs drive twice that current on DBL. The falling transition of DBL is therefore sped up, and when the sense clock fires the voltage swing between the accessed BL and $\overline{BL}$ is about 10% of $V_{DD}$, which is large enough for sensing. In this self-timing scheme, no delay element is needed and hence there is no inverter chain. Because the scheme is based on the worst case, it can eliminate the negative effect of standby leakage current and achieve high reliability.
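
The timing relationship behind this scheme can be sketched with a simple constant-current model. This is only an illustrative approximation, not the thesis's simulation: it assumes the bitlines discharge linearly, that the XOR gate trips when DBL falls to a level `v_trip`, and that leakage on the dummy column's complementary bitline can be ignored. The names `v_trip`, `c_bitline`, `i_read` and `i_leak`, as well as all numeric values, are hypothetical.

```python
def sense_time(c_bitline, vdd, v_trip, i_read, n_dac=2):
    """Time for DBL to fall from VDD to v_trip when n_dac dummy active cells
    each sink i_read (constant-current assumption)."""
    return c_bitline * (vdd - v_trip) / (n_dac * i_read)


def swing_at_sense(c_bitline, vdd, v_trip, i_read, i_leak, n_dac=2):
    """Differential swing between BL and its complement when the sense clock fires.

    BL is discharged by i_read while the complementary bitline is discharged
    by the total leakage i_leak of the un-accessed cells (worst case).
    """
    t = sense_time(c_bitline, vdd, v_trip, i_read, n_dac)
    return (i_read - i_leak) * t / c_bitline


if __name__ == "__main__":
    vdd = 0.3
    # Hypothetical trip level of 0.8 * VDD and negligible leakage.
    print(swing_at_sense(100e-15, vdd, 0.8 * vdd, 1e-9, 0.0))
    # Same column with leakage equal to half of the read current.
    print(swing_at_sense(100e-15, vdd, 0.8 * vdd, 1e-9, 0.5e-9))
```

Under these assumptions the bitline capacitance cancels out of the swing, which depends only on the trip level, the number of active dummy cells and the ratio of leakage to read current. This is one way to see why a replica column can track the data path across supply voltages better than a fixed delay chain.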
#### 3.2.3 Sense Amplifier Design

As discussed in Chapter 2, both the latch-type sense amplifier and the inverter-type read buffer can be used in sub-threshold SRAM design. The inverter-type read buffer requires the bitline swing to be almost rail-to-rail, which leads to longer delay and higher power consumption. The latch-type sense amplifier is therefore preferred when speed matters, since it does not require a large voltage swing on the bitline. In this thesis project, a dynamic latch was implemented as shown in Figure 3.7.

Figure 3.7 Differential latch-type sense amplifier

This sense amplifier has separate inputs and outputs for low-voltage operation. This is achieved by inserting two input transistors, M1 and M2, in the cross-coupled structure of the sense amplifier, so that each of these two transistors is driven by a bitline. When the enable signal SEN of the sense amplifier goes high, the sense amplifier is forced to latch according to the voltage difference between the input bitlines. Compared to the sense amplifier in [13], this design sacrifices some speed for easier generation of the reading sequence, because it has only one enable signal.

During read operation, SEN (enable signal) and SEQ (equalization signal) of the sense amplifier are first set low to pre-charge the bitlines to $V_{DD}$ and to make sure that the voltages of the two complementary bitlines are equalized. Then SEQ is set high and the accessed cell starts to drive one of the bitlines low. When the voltage difference between BL and $\overline{\text{BL}}$ is large enough, i.e. approximately one tenth of the supply voltage $(\frac{1}{10}V_{DD})$ in this design, SEN is set high to sense this voltage difference, and the data remains locked until SEN goes high at the next read operation.
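
The control sequence just described can be summarized as a small behavioral sketch. It only encodes the three signal combinations mentioned in the text and the one-tenth-of-$V_{DD}$ swing criterion. The phase names and the handling of the undescribed combination (SEQ low with SEN high) are assumptions made for illustration.

```python
def sense_amp_phase(seq, sen):
    """Map the SEQ and SEN levels (0 or 1) to the read phase described in the text."""
    if seq == 0 and sen == 0:
        return "equalize"   # bitlines pre-charged to VDD and equalized
    if seq == 1 and sen == 0:
        return "develop"    # accessed cell pulls one bitline low
    if seq == 1 and sen == 1:
        return "sense"      # latch resolves the bitline difference
    return "undefined"      # combination not described in the text


def swing_is_sufficient(v_bl, v_blb, vdd):
    """True once the bitline difference reaches about one tenth of VDD."""
    return abs(v_bl - v_blb) >= vdd / 10.0


if __name__ == "__main__":
    for seq, sen in [(0, 0), (1, 0), (1, 1)]:
        print(seq, sen, sense_amp_phase(seq, sen))
    print(swing_is_sufficient(0.30, 0.26, 0.3))
```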
#### 3.2.4 Boost Circuit Design

In the proposed cell shown in Figure 3.3, only two NMOS access transistors, connected to nodes holding the same value, are used for write operation. As a result, writing '0' to the cell poses no problem, while writing '1' is very difficult, or impossible, without a boosted voltage. An example for $V_{DD}$ = 0.3 V is shown in Figure 3.8.

Figure 3.8 Voltage at cell nodes with and without boosted write voltage

Compared to writing '0' (Figure 3.8(a)), writing '1' to the cell clearly fails, since only approximately 0.1 V can be reached at the cell nodes, as shown in Figure 3.8(b). This write ability problem can be solved by boosting the gate voltage of the access transistors, i.e. the write wordline voltage, as shown in Figure 3.8(c) and (d). Figure 3.9 shows the boost circuit used in this design, which is adopted from the boost circuit in [15]. This boost circuit is simple in structure and can generate the boosted voltage with only a negative input pulse, which is WWL in Figure 3.9. Hence, it is suitable for this design.

Figure 3.9 Write wordline boost circuit

In this design, the value of $C_{boost}$ is 5 pF, implemented as a fringe capacitor from Metal 1 to Metal 5 in ST 90nm CMOS technology. The circuit generates the Global Boosted Voltage (GBV), which is approximately $2V_{DD}$ and drives the write wordline of the accessed row and then the corresponding byte. For this boost circuit, the key challenge is the large capacitance on the GBV line. The value of $C_{boost}$ and the size of the transistors driving GBV were carefully chosen by simulation. Another challenge is the delay of the two-stage GBV switching before the boosted voltage arrives at the write access transistors of the cells, which limits the write time of the whole SRAM chip. The positive feedback switching circuit therefore needs to be simulated carefully, and the transistor sizes should be selected properly.
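
To see why the capacitance on the GBV line matters, the following sketch uses a textbook charge-sharing approximation for a bootstrapped capacitor. It is not the circuit of Figure 3.9: it assumes the GBV node (with a lumped load capacitance `c_load` to ground) is pre-charged to $V_{DD}$ and that the other plate of $C_{boost}$ is then stepped by $V_{DD}$, with no leakage or switch losses. The load values are hypothetical.

```python
def boosted_voltage(vdd, c_boost, c_load):
    """Ideal boosted level from charge sharing between c_boost and c_load.

    Assumption: the output node starts at VDD and the bottom plate of
    c_boost is stepped by VDD, so the output rises by
    VDD * c_boost / (c_boost + c_load).
    """
    return vdd * (1.0 + c_boost / (c_boost + c_load))


if __name__ == "__main__":
    vdd = 0.3
    c_boost = 5e-12  # 5 pF, the value given in the text
    for c_load in (0.0, 1e-12, 2.5e-12, 5e-12):  # hypothetical GBV line loads
        v = boosted_voltage(vdd, c_boost, c_load)
        print(f"c_load = {c_load * 1e12:.1f} pF -> boosted voltage = {v:.3f} V")
```

In this idealized model the output only reaches $2V_{DD}$ when the load is negligible, and a larger $C_{boost}$ relative to the load brings the boosted level closer to that limit.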
#### 3.2.5 Row and Column Decoders

Since the test SRAM chip has a capacity of 2K×8 bits, there are 11 address pins. These pins are divided into 8 row addresses and 3 column addresses. Therefore, an 8-256 row decoder and a 3-8 column decoder are implemented. The 8-256 row decoder is implemented in two stages to improve speed and area, as shown in Figure 3.10(a).

Figure 3.10 Row and column decoders

Since we use a read and write decoupled scheme, the row decoder has RD and WR as inputs, as well as $V_{boost}$. Compared to the row decoder, the column decoder is much simpler and has only one stage, as shown in Figure 3.10(b).

#### 3.2.6 Address Transition Detector (ATD)

Since the SRAM designed in this thesis project is asynchronous, there is no external clock signal from which to generate the read or write control sequences. All the control sequences must be derived from transitions of the input addresses and of the read or write enable signals. A reading pulse is generated when the read enable signal is active and any address input changes. This pulse is used to generate the sense amplifier equalization signal and, consequently, the read wordline signal and some other control signals. A typical ATD circuit is shown in Figure 3.11.

Figure 3.11 Address transition detector circuit

During read operation, if A0 switches from "1" to "0", the XOR gate whose inputs are A0 and a delayed copy of A0 generates a positive pulse. This pulse turns on one of the NMOS transistors, and consequently a negative pulse is generated at the output of the buffer. The transient waveform can be found in Figure 3.11, in which a small PMOS transistor with its gate shorted to ground is used as a pseudo resistor. It keeps the output of the ATD high while there is no address transition. The key component in the ATD is the delay circuit, which generates a proper delay of the input addresses.
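
The detection principle can be sketched as a discrete-time behavioral model: each input is compared (XOR) with a delayed copy of itself, and the active-low output pulses whenever any input differs from its delayed copy. This is a logic-level illustration only. The sampled time steps, the `delay` parameter and the function name are assumptions, and the analog details of Figure 3.11 (pull-down NMOS transistors, pseudo-resistor PMOS) are not modeled.

```python
def atd_output(samples, delay):
    """Active-low ATD output for a sequence of sampled input vectors.

    samples: list of tuples, one tuple of 0/1 input levels per time step.
    delay:   delay of the delay element, in time steps.
    Returns a list with 0 while a transition is being detected and 1 otherwise.
    """
    out = []
    for t, now in enumerate(samples):
        past = samples[max(t - delay, 0)]          # delayed copy of the inputs
        changed = any(a != b for a, b in zip(now, past))  # per-input XOR, then OR
        out.append(0 if changed else 1)
    return out


if __name__ == "__main__":
    # A0 falls from 1 to 0 at the third time step; the delay is two steps.
    a0 = [(1,), (1,), (0,), (0,), (0,), (0,), (0,)]
    print(atd_output(a0, delay=2))
```

In this model the width of the negative pulse equals the delay of the delay element, which is why the delay circuit determines the length of the reading pulse.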
There are two main ways to design a delay component in a circuit, namely RC delay circuits and inverter chains. In this design, the delay circuit is an inverter chain with irregular transistor sizing, shown in Figure 3.12.

Figure 3.12 Inverter chain based delay circuit

It can generate a reading pulse of approximately 25 ns at a 0.3 V supply voltage. The small first-stage inverter drives the relatively large second-stage inverter (25 times the size of the first stage). The second inverter can be viewed as a large capacitor due to its large gate area. The third and fourth stages are used as buffers to obtain steeper signal edges.

#### 3.2.7 Input / Output Buffer Design

Since a bidirectional data bus is used, the data buffer must be designed carefully. It should latch the data in the buffer during read operation and be switched off during write operation. In addition, the buffer should have enough drive capacity to achieve a reasonable speed at sub-threshold voltages, which imposes an additional requirement on the data buffer design. The data buffer used in this SRAM design is shown in Figure 3.13.

Figure 3.13 Bidirectional data bus

During read operation, $\overline{WR}$ is low and the buffers driving the bitlines BL and $\overline{BL}$ are disabled, so the bitlines can be pre-charged to $V_{DD}$ during the equalization phase. When the sense amplifier is enabled and the data is sensed, $\overline{Data}$ is locked in the latch and drives the I/O pin through a buffer. This buffer is used to drive the large capacitance of the I/O pad (often several picofarads) and to reshape the output data signal.

#### 3.3 Read and Write Control Sequences

Having introduced the overall architecture of the SRAM and the detailed design of each part, we now examine the read and write control sequences at the system level.
#### 3.3.1 Read Control Sequence

The read operation is divided into two phases, the equalization phase $\phi_{eq}$ and the sensing phase $\phi_{se}$. During $\phi_{eq}$, the sense amplifier drives the two complementary bitlines to the same level $(V_{DD})$ in order to achieve a higher read speed. Otherwise, if the initial voltage difference on the bitlines were opposite to the required one, the accessed cell would need much more time to develop enough voltage difference. After $\phi_{eq}$, the read wordline and the dummy read wordline are enabled simultaneously and the read process introduced in Section 3.2.2 starts. Figure 3.14 shows the complete read control circuit.

Figure 3.14 The complete read control circuit

To keep the figure as clear as possible, some buffers for input and internal signals are omitted here. They are used to obtain better rising and falling edges and do not affect the logic function of the circuit. The read signal waveforms are shown in Figure 3.15, which gives the expected timing sequence of the read control circuit shown in Figure 3.14.

Figure 3.15 Read operation timing sequence

When the read signal RD is high and there is an address transition, for example A0 goes from "0" to "1", the ATD generates a negative pulse that is used as the equalization signal SEQ for the sense amplifier. During this pulse, the bitlines BL and $\overline{\rm BL}$ are both pre-charged to $V_{DD}$, and the equalization phase is completed. On the rising edge of SEQ, "1" is written into the D flip-flop, i.e. RWLEN goes high, which enables the row decoder. RWL<sub>i</sub> and DRWL then go high simultaneously, and the voltage differences on the bitlines of the accessed cell and on the dummy bitlines start to increase. Once the dummy bitlines DBL and $\overline{DBL}$ have an adequate voltage difference, the output of the exclusive OR gate, i.e. the SEN signal, rises and turns on the sense amplifier to sense the data on BL and $\overline{BL}$.
Once the data is obtained and locked in the sense amplifier, RWLEN is cleared to lock the data in the output latch and to disable the row decoder at the same time. Then all signals except the output latch return to their initial states, waiting for the next read operation. The advantage of this read control scheme is its high speed, since there is no delay circuit except the ATD, and all the signals return to their initial values once the data is locked in the output latch.

#### 3.3.2 Write Control Sequence

The write control circuit is much simpler than the read control circuit, because it only needs to receive the input data and generate the proper boosted voltage for writing. The complete write control circuit is shown in Figure 3.16.

Figure 3.16 Write control circuit

The addresses and data should be present before the write signal WR becomes active. When WR is active, a global boosted voltage is generated. Then the write wordline of the accessed row is selected and switched to the boosted voltage. Since there are 8 columns in the cell array, the write wordlines of the un-accessed columns must not go high, otherwise the stored values could be overwritten by random values. Therefore, the write wordline WWL<sub>i</sub> has to be divided into 8 sub-wordlines, each of which accesses one column. When a sub-wordline is high (at the boosted voltage), the nodes of the cells can receive the data from the bitlines. The write timing sequence is shown in Figure 3.17.

Figure 3.17 Write operation timing sequence

#### 3.4 Layout of the SRAM Design

The layout of this SRAM design was carried out in ST Microelectronics 90nm CMOS technology, and standard cells from the library CORE90GPSVT were used to implement the digital parts such as the decoders and control circuits. The total size of the SRAM layout (without IO pads) is 520 μm × 600 μm and the size of the die is 1 mm × 1 mm.
In order to test the SEU tolerance of the cells and sense amplifiers, several regions were not filled with dummy metal layers, as shown in Figure 3.18. These unfilled regions can be accessed by the laser available at the Saskatchewan Structural Sciences Centre (SSSC).

Figure 3.18 The whole SRAM layout design

Since this SRAM is targeted to work under different supply voltages, analog IO pads were used as both power pins and IO signal pins, instead of the digital IO pads found in most commercial SRAM chips. Therefore, the interfacing circuit for this test chip must be designed carefully, and extra level shifting circuits are required during testing. This concern is discussed in detail in Chapter 5. In addition, two extra pairs of power and ground pads, as well as a substrate connection pad, were added to enhance the Electrostatic Discharge (ESD) protection of the chip. Table 3.1 shows the implemented IO pads from the ST Microelectronics standard IO library IO90GPHVT\_ANA\_50A\_7M2T.

Table 3.1 IO pads used in SRAM chip

| Pad Name | Function | Pin Name in SRAM Design |
|-------------------------|----------------|-------------------------|
| IP1_60U_ANA_LIN | Input/output | A0-A10, RD, WR, D0-D7 |
| | Power supply | VDD, GND |
| VDDIOCO_1V0_60U_ANA_LIN | ESD protection | N/A |
| VSSIOCO_1V0_60U_ANA_LIN | ESD protection | N/A |
| PPSUB_60U_ANA_LIN | ESD protection | N/A |

#### **CHAPTER 4**

#### **DESIGN SIMULATION AND VERIFICATION**

The purpose of design verification and simulation is to verify, before fabrication, whether the designed circuits meet the specifications. If any error is detected in simulation, the designed circuits should be corrected, since the simulation predicts the failure of the fabricated chip. The schematic of the SRAM designed in this project was first captured and simulated in Cadence with the ST 90nm CMOS device models provided by ST Microelectronics through CMC Microsystems.
The simulation tool is the Virtuoso Artist Analog Design Environment. After the layout of the design was created from the schematics, extraction and LVS (Layout Versus Schematic) checks were done in Mentor Graphics Calibre, to verify that the circuit layout topology matches that of the circuit schematic. Finally, post-layout circuit simulation was carried out with a more accurate estimate of the physical parasitic resistances, capacitances and inductances. The post-layout simulation results were verified to be correct, and the design was ready for fabrication. In this chapter, the simulation results of the critical parts, as well as of the whole SRAM chip, are presented.

#### 4.1 Verification of Critical Components in SRAM Chip

There are several critical components in this SRAM test chip design, including the cell, sense amplifier, boost circuit and ATD. The functionality and performance of these components are sensitive to the supply voltage $V_{DD}$ and therefore need to be verified individually. The other components, such as the row and column decoders, are combinational circuits with relatively small propagation delays, and therefore do not significantly affect the SRAM performance. Their simulation results are included in the whole SRAM simulation. A supply voltage of $V_{DD} = 0.3 \text{ V}$ was applied to obtain the worst-case simulation results for each major component, while $V_{DD} = 0.3 \text{ V}$, 0.4 V, 0.5 V and 0.6 V were applied in the whole SRAM chip simulation to evaluate the performance under different supply voltages.

#### 4.1.1 SRAM Cell Simulation

Since the proposed DICE-based SEU-tolerant cell has very high stability during the read and standby periods, it can tolerate noise occurring at any of the four nodes. The simulation results of the SET response during read operation are given in Figure 4.1. The supply voltage is 0.3 V, and the current pulses used to simulate SEU are 70 ps long with amplitudes of -75 $\mu$A and 75 $\mu$A respectively.

a.
Negative current pulse response b. Positive current pulse response Figure 4.1 Single Event Tolerance of the Proposed Cell From the simulation results, we can find that neither a negative nor a positive current pulse can flip the values stored in the cell during read operation. All the nodes level will go to their original values after the current pulse which simulates the single event. The read and write decoupled scheme in the cell can ensure that it is single-event upset-tolerance design even in read operation. ### 4.1.2 Sense Amplifier Simulation The sense amplifier is the most critical component for read operation since it senses the small voltage swing and gives the output data. Its speed, sensitivity and stability must be studied carefully. Figure 4.2 shows the simulation waveforms of the sense amplifer during the sensing phase. Figure 4.2 Simulation waveform of sense amplifier In Figure 4.2, the voltage difference between the two input is 30 mV, i.e. $\frac{1}{10} V_{DD}$ , which is the targeted minimum bitline voltage swing. The typical time delay of the sense amplifier is less than 50ns. Since the stability is very important, we have to take process variations into account. Therefore, a Monte-Carlo simulation was performed to examine the stability of sense amplifier, shown in Figure 4.3. Figure 4.3 Monte-Carlo simulation of sense amplifier An one hundred-time Monte-Carlo simulation was carried out with the process variations. The statistical results indicate that the output is correct, which means that the functionality is stable with process variation. However, the delay time is largely scattered according to the simulation. The worst case delay is more than 140 ns. Fortunately, in our design the sense scheme can tolerant such a time delay, which was explained in Chapter 3. ## 4.1.3 Verification of Voltage Boosting Circuit The success of the boosted voltage is the most critical precondition for the success of write operation of the SRAM chip. 
Therefore, verification of the boost circuit is necessary. Figure 4.4 shows the simulation results of the boost circuit with a 5 pF boost capacitor, operating at $V_{DD} = 0.3 \text{ V}$.

Figure 4.4 Boost Circuit Simulation Waveform

The waveforms show that once the write enable signal WR falls from 0.3 V to 0 V, the boosted voltage rises to around 0.5 V, which is enough for write operation. At a 0.3 V supply, a boosted voltage of 0.45 V was already large enough in the simulation of write operation. If a higher writing speed is required, a larger capacitor can be employed.

#### 4.1.4 Verification of ATD Circuit

The function of the ATD is to detect address transitions during read operation, as well as the high-to-low transition of the read enable signal RD. Whenever one of these input signals makes such a transition, the ATD generates a 25 ns long SEQ (sense amplifier equalization signal) pulse. The whole subsequent read sequence depends on this pulse. Therefore, a working ATD is the precondition of the read operation. Figure 4.5 shows the simulation waveforms of the ATD circuit, including transitions of both RD and the address signals. In the diagram, a transition of the lowest address bit A0 was simulated to represent all the address inputs.

Figure 4.5 Simulation waveforms of ATD

As shown in the waveforms, a negative pulse of approximately 25 ns was generated whenever there was an input transition. In fact, the absolute length of this pulse is not critical as long as it is wide enough for the following circuit to capture.

### 4.2 Verification for the Overall SRAM Chip

For the whole chip verification, writing and reading every cell of the SRAM once would take a tremendously long simulation time, so it is impractical to scan all of the SRAM cells. Therefore, only the lowest address input (A0) is altered and the corresponding cells are simulated.
The other cells at other addresses are identical and should show similar simulation results as long as the decoders work properly. In the simulation, data '10101010' was first written to address 0 (A0 to A10 all low) and data '01010101' to address 1 (A0 high and A1-A10 all low). Then the data in address 1 was read out once RD became active. Finally, the data in addresses 0 and 1 were read out. This simulation pattern was applied at supply voltages from 0.3 V to 0.6 V, as shown in Figure 4.6 to Figure 4.9.

Figure 4.6 Simulation waveform for supply voltage at 0.3 V

Figure 4.7 Simulation waveform for supply voltage at 0.4 V

Figure 4.8 Simulation waveform for supply voltage at 0.5 V

Figure 4.9 Simulation waveform for supply voltage at 0.6 V

Figure 4.6 to Figure 4.9 show that this SRAM design works properly at supply voltages from 0.3 V to 0.6 V. As $V_{DD}$ increases, both the read and write times decrease. The performance simulation results, including the maximum speed and power dissipation at different supply voltages, are given in Section 4.3.

### 4.3 Performance Simulation of the SRAM Design

Performance simulation of this $2K \times 8$ bits SRAM was also carried out in ST 90nm CMOS technology under typical conditions ( $25^{\circ}C$ ). The IO pads were not included. The bitline voltage swing is only approximately 10% of $V_{DD}$, which saves the power needed to charge the bitlines while still providing enough swing for sensing. The replica sense scheme obtains the optimal read time as the supply voltage varies. The read and write access times, maximum operating frequency ( $F_{MAX}$ ) and power dissipation at supply voltages of 0.3 V, 0.4 V, 0.5 V and 0.6 V are compared in Table 4.1, which shows that this SRAM consumes very low power while operating in the megahertz range.
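The speed and power trade-off reported in Table 4.1 can be made more concrete by converting each row into an energy per cycle. The following is a minimal Python sketch, not part of the original simulation flow. It assumes that the reported power is dissipated while running at the reported $F_{MAX}$, and the helper names `energy_per_cycle_pj` and `read_limited_fmax_mhz` are hypothetical. It also compares $F_{MAX}$ with the reciprocal of the read time, which is an observation about the tabulated numbers rather than a statement from the design itself.

```python
# Rows copied from Table 4.1: (V_DD in V, read time in us, F_MAX in MHz, power in uW)
TABLE_4_1 = [
    (0.3, 0.212, 4.7, 6.0),
    (0.4, 0.056, 17.8, 27.9),
    (0.5, 0.029, 34.5, 76.7),
    (0.6, 0.022, 45.5, 140.0),
]


def energy_per_cycle_pj(power_uw, f_mhz):
    """Energy per cycle in pJ: uW / MHz = 1e-6 W / 1e6 Hz = 1e-12 J."""
    return power_uw / f_mhz


def read_limited_fmax_mhz(read_time_us):
    """Frequency in MHz if one cycle lasted exactly one read access."""
    return 1.0 / read_time_us


for vdd, t_read, f_max, power in TABLE_4_1:
    energy = energy_per_cycle_pj(power, f_max)
    f_read = read_limited_fmax_mhz(t_read)
    print(f"{vdd:.1f} V: {energy:.2f} pJ/cycle, "
          f"1/t_read = {f_read:.1f} MHz (reported {f_max} MHz)")
```

Under these assumptions the energy per cycle grows from roughly 1.28 pJ at 0.3 V to roughly 3.08 pJ at 0.6 V, and the reciprocal of the read time matches the reported $F_{MAX}$ to within rounding.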
Table 4.1 Performance simulation results with different power supplies

| $V_{DD}$ (V) | Write Time (µs) | Read Time (µs) | $F_{MAX}$ (MHz) | Power dissipation (µW) |
|--------------|-----------------|----------------|-----------------|------------------------|
| 0.3          | 0.141           | 0.212          | 4.7             | 6.0                    |
| 0.4          | 0.042           | 0.056          | 17.8            | 27.9                   |
| 0.5          | 0.027           | 0.029          | 34.5            | 76.7                   |
| 0.6          | 0.019           | 0.022          | 45.5            | 140.0                  |

In sub-threshold circuit design, especially memory design, standby leakage power makes up a major part of the total power dissipation. Table 4.2 shows the standby leakage current of the conventional 6T cell, the conventional DICE cell and the cell proposed in this thesis at a supply voltage of 0.3 V.

Table 4.2 Standby leakage current comparison of three cells

| $I_{Leakage}$ (Conv. 6T) (pA) | $I_{Leakage}$ (Conv. DICE) (pA) | $I_{Leakage}$ (Proposed Cell) (pA) |
|-------------------------------|---------------------------------|------------------------------------|
| 117                           | 239                             | 275                                |

Table 4.2 shows that the proposed DICE-based cell has slightly more leakage current than the conventional DICE cell, whose leakage is in turn about twice that of the 6T cell. This slightly increased leakage current leads to a little more power dissipation, which is worthwhile in terms of reliability.

## CHAPTER 5 CHIP TESTING RESULTS

The SRAM chip designed in this thesis project was fabricated in ST 90nm CMOS technology through CMC Microsystems. Ten packaged dies and 15 loose dies were shipped back for testing. The package is a 64-pin detachable Ceramic Quad Flat Package (CQFP-64) with a pin pitch of 0.8 mm. In order to test the functionality of the chip, a Printed Circuit Board (PCB) based on a Microchip PIC16 series microcontroller was built. In order to compare the input data with the output data, the microcontroller was programmed to write the data into the SRAM chip, read the data back and send the data to a Personal Computer (PC).
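The write, read-back and compare flow described above can be sketched as follows. This is a host-side Python illustration only, not the firmware that ran on the microcontroller. `FakeSram`, `write_pattern` and `find_mismatches` are hypothetical names, and the in-memory model stands in for the real chip and its level-shifting interface.

```python
ADDRESS_COUNT = 2048  # 2K x 8 bits, addressed by A0-A10


class FakeSram:
    """Hypothetical stand-in for the device under test: an ideal 2K x 8 memory."""

    def __init__(self):
        self.cells = [0] * ADDRESS_COUNT

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF  # keep only the 8 data bits D0-D7

    def read(self, addr):
        return self.cells[addr]


def write_pattern(sram, pattern, addresses):
    """Write the same byte to every address in the given range."""
    for addr in addresses:
        sram.write(addr, pattern)


def find_mismatches(sram, pattern, addresses):
    """Read every address back and collect (address, value) pairs that differ."""
    mismatches = []
    for addr in addresses:
        value = sram.read(addr)
        if value != pattern:
            mismatches.append((addr, value))
    return mismatches


sram = FakeSram()
block = range(0, 16)  # one block of 16 addresses, chosen for illustration
write_pattern(sram, 0xAA, block)
print(find_mismatches(sram, 0xAA, block))
```

With an ideal memory model the mismatch list is empty; on real hardware the same comparison would list every byte that did not read back as written.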
A desktop program called SComAssistant V2.1 was used to collect and display the data from the microcontroller. In addition, to test the single-event tolerance of the SRAM chip, the Ti:Sapphire laser at the Saskatchewan Structural Sciences Centre (SSSC) was used to irradiate the chip and mimic the particles found in space environments.

#### **5.1 SRAM Functional Testing**

The purpose of functional testing is to verify whether the SRAM chip functions as expected in the design.

### **5.1.1 Testing Board Configuration**

Since the speed of the SRAM chip is not high, an 8-bit microcontroller is adequate to generate the read and write signals. However, testing this sub-threshold SRAM chip differs from testing an SRAM chip working at a normal voltage in two significant ways. Firstly, the SRAM chip is targeted to work at $V_{DD} = 0.3 \text{ V}$ to 0.6 V (0.3 V is taken as the example in the following discussion), and no commercial microcontroller is available that works at such a low supply voltage. Therefore, additional level shifting circuits are necessary on the testing board. Secondly, an adjustable low supply voltage must be generated. On our testing board, a fixed 1.2 V is first generated by a DC-DC converter and then divided by two resistors in series, one of which is adjustable to obtain the different targeted supply voltages. A simplified schematic of the testing circuits is given in Figure 5.1. The Device Under Test (DUT) is the fabricated SRAM chip.

Figure 5.1 Schematic of functional testing board

The maximum gate voltage for a transistor in ST 90nm CMOS technology can be up to 1.2 V, and commercial level shifting devices are available that can shift 5.0 V to 1.2 V. Therefore, we simply used a two-stage inverter chain as the on-chip input buffer, shown in Figure 5.2, where $V_{DD}$ is assumed to be 0.3 V. This type of input buffer can be used for the input signals of the SRAM chip, i.e. the addresses and the read and write enable signals.

Figure 5.2 Simple on-chip level shifter for input signals

This simple on-chip level shifter translates 0 to 1.2 V external inputs into 0 to $V_{DD}$ internal signals. In this way, we did not have to design a complicated voltage dividing circuit to generate inputs as low as 0.3 V on the board. Instead, commercial 5.0-to-1.2 V level shifting devices were used on the testing board. For the bi-directional data signals of the SRAM chip, we used the same scheme for the data input during write operation, and used 8 comparators to collect the data during read operation. During write operation, the switch in Figure 5.1 was turned off, so the data output from the microcontroller could be shifted to 1.2 V. During read operation, the switch was enabled and the level shifter was disabled. The sub-threshold output was then compared with the $1/2 V_{DD}$ reference voltage, i.e. 0.15 V, which is also generated by a voltage divider similar to that shown in Figure 5.1. The main components used on the testing board are listed in Table 5.1.

Table 5.1 Main devices used in testing board

| Description                  | Manufacturer No. | Manufacturer        | Package  |
|------------------------------|------------------|---------------------|----------|
| 1.2 V voltage generator      | LD1117S12TR      | ST Microelectronics | SOT-223  |
| 5.0 V voltage generator      | LD1117S50TR      | ST Microelectronics | SOT-223  |
| Micro-Controller             | PIC16F877        | Microchip           | DIP-40   |
| Single channel Comparator    | AD8611ARZ        | Analog Devices Inc  | SOIC-8   |
| 8-ch 5.0V-1.2V level shifter | ADG3308BRUZ      | Analog Devices Inc  | TSSOP-20 |
| Analog switch                | SN74LV4066ADR    | Texas Instruments   | SOIC-14  |

## **5.1.2 Functional Testing Results**

The test vectors were generated by a C program running on the microcontroller. Functional testing was done in two steps. The first step was to determine the read functionality and the second was to test the write functionality.
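The first step relies on comparing repeated reads of the same address. One possible way to quantify that comparison is a per-bit stability mask, sketched below in Python. The function name `stable_bit_mask` is hypothetical and this is not the analysis code used in the original testing. The sample values are the address 7 and address 10 columns of Table 5.2.

```python
def stable_bit_mask(reads):
    """Return an 8-bit mask with a 1 at every bit position that never changed."""
    first = reads[0]
    changed = 0
    for value in reads[1:]:
        changed |= first ^ value  # XOR marks the bits that differ from the first read
    return ~changed & 0xFF


# Ten consecutive reads, taken from the address 7 and address 10 columns of Table 5.2
addr7 = [0x9E, 0x96, 0x9E, 0x9E, 0x96, 0x9E, 0x96, 0x9E, 0x96, 0x96]
addr10 = [0xBF] * 10

for name, reads in (("address 7", addr7), ("address 10", addr10)):
    mask = stable_bit_mask(reads)
    stable = bin(mask).count("1")
    print(f"{name}: stable mask 0x{mask:02X}, {stable} of 8 bits stable")
```

For these samples the sketch reports a mask of 0xF7 (7 stable bits) for address 7 and 0xFF (8 stable bits) for address 10.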
If the read functionality failed, the write functionality could not be tested. In the first step, after the SRAM chip was powered up, the data from the lowest address to the highest address were read out 10 times to see whether they were consistent. We can assume that the cells hold randomly generated data when the chip is powered up. If the same data are observed from the same cell and different data from different cells, we can conclude that the read operation is working properly. In order for the microcontroller and PC to handle the data properly, the whole address space was scanned in steps of 16 addresses, as shown in Figure 5.3. The testing result for the SRAM working at a 0.3 V supply voltage is listed in Table 5.2, which shows the data collected 10 times from addresses 0-15. The data in Table 5.2 are given in hexadecimal format.

Figure 5.3 SRAM scanning process for read functionality testing

Table 5.2 Data collected from Address 0-15 without writing

| Time \ Addr | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| 1 | AC | AE | BF | BA | 8F | AB | FF | 9E | BE | FE | BF | EA | 8C | FF | A9 | DF |
| 2 | AE | BF | BA | BF | AE | BD | AC | 96 | BE | FE | BF | EA | D9 | AC | AF | 8E |
| 3 | FF | AE | AF | AA | 8E | BD | AB | 9E | AE | B9 | BF | FF | 9D | A9 | A9 | 8E |
| 4 | AE | BE | FF | AE | AE | AB | BD | 9E | BE | A9 | BF | EA | 9D | A9 | EF | 8E |
| 5 | AD | AE | BF | FA | 8E | AB | AB | 96 | AE | EE | BF | FF | D9 | FF | A9 | CE |
| 6 | BF | BE | AA | FF | AE | AD | BD | 9E | BE | B9 | BF | E8 | D8 | FF | EF | 8E |
| 7 | AD | AE | AF | BA | 8E | AB | AA | 96 | AE | A9 | BF | E8 | 9D | AD | EF | DF |
| 8 | BF | BF | BA | BE | AE | AD | AC | 9E | AE | A9 | BF | E8 | 9C | A9 | A9 | 8E |
| 9 | FF | AE | AF | AA | 9F | AB | AA | 96 | AE | BE | BF | EA | D9 | AD | A9 | DF |
| 10 | BE | BF | AA | AE | AE | BD | FF | 96 | AE | BE | BF | E8 | D8 | AD | EF | 8E |

Table 5.2 shows that the data read back from the same address were not always the same. Only a small portion (address 10) stayed constant. In the other bytes, only some of the bits kept the same value across reads. For example, the data in address 7 was either 10011110 (9E) or 10010110 (96), which means that 7 bits in this byte stayed constant. The data collected from other addresses, or with the SRAM chip working at other supply voltages, show similar behavior to Table 5.2. From this point of view, we can conclude that the read operation of this SRAM is partially working, although only for a small portion of all the cells.

In the second step, one of three data patterns was first written to the chip, and the data were then read out to see whether they were consistent with the written pattern. The three patterns were "0x00", "0xFF" and "0xAA" in hexadecimal format. The testing process is shown in Figure 5.4.

Figure 5.4 SRAM scanning process for write operation testing

Table 5.3 Data collected from Address 0-15 with Data Pattern "0x00" written in

| Time \ Addr | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| 1 | FF | FF | DF | 7F | FF | FF | FF | FA | EF | FF | FF | 7F | 7F | FF | FF | FF |
| 2 | FF | CF | FF | FF | FF | FF | DF | FF | FF | FF | FF | 7F | FF | FF | FF | FF |
| 3 | FF | FD | FF | 0E | FF | FF | 8D | FF | AF | 19 | FF | FF | FB | FF | FF | FF |
| 4 | FA | FF | EF | FF | FB | FF | FC | FF | FF | CF | FF | FF | FF | FF | FF | FF |
| 5 | FF BE | FF | FF | FF | FF | FF | FF | 8B |
| 6 | FF | DF | FF | FF | FF | FF | 37 | FF | 8F | FF | EF | FF | FF | FF | FF | FF |
| 7 | FF | FD | 3D | FF | FF | 7F | FF | FF | FF | 7F | FF | FF | FF | FF | DF | FF |
| 8 | FF | FF | FF | 01 | FF | FF | FF | FF | 9F | FF |
| 9 | 5F | FF | FF | 7F | FF AF | EF | FF | FF | EF |
| 10 | FF | DF | FF | 00 | FF | FF | DF | FF | FF | 7F | FF | FF | FF | AD | FF | FF |

The data collected from addresses 0-15 with the data pattern "0x00" written at 0.3 V are shown in Table 5.3. For the data patterns "0xFF" and "0xAA", as well as for different supply voltages, the collected data were essentially the same as in Table 5.3. Therefore, we can say that the write functionality failed, since we could not read back the data that was written to the SRAM chip.

### **5.2 Single Event Upset Tolerance Testing**

The laser experiments were performed at the Saskatchewan Structural Sciences Centre (SSSC). The laser source is a Ti:Sapphire laser with a tunable pulse repetition rate. The laser pulse width is 1 ps, with a repetition rate of 4.75 MHz. The wavelength of the laser is 800 nm, and the laser beam is focused onto the surface of the die ( $1\mu m \times 1\mu m$ spot size) through a microscope. The working distance of the lens is 3 mm. The testing board with the DUT was mounted on a platform with a step size of $0.1\mu m$.

Although the effort to improve the SEU tolerance of this fault-tolerant SRAM was spent on the cell design, the total SEU tolerance should be tested for all of the circuits in this SRAM design, including the sense amplifiers and data buffers. Since the number of "good" cells in the fabricated chip is very small and they happened to be located in the dummy-filled area of the layout, we were not able to run the laser experiment on these cells to assess their immunity: the dummy metals on the die block the laser light, so no SET is induced in the circuit. Therefore, we were not able to carry out the most important SEU tolerance test, the one on the proposed cell design. However, SEU-tolerance testing was performed on the sense amplifier and data buffer. SEUs in the sense amplifier may occur both during the sensing process and in the hold state.
Because the output of the sense amplifier is latched into the data buffer, a later change of its state is not seen at the chip output; the testing result for the sense amplifier therefore reflects its SEU tolerance during the sensing process. In the SEU-tolerance testing of the sense amplifier and data buffer, the supply voltage was not restricted to below 0.6 V, since both circuits work during read operation and the boost circuit does not then generate a high voltage that would damage the chip. In this way, we could examine the SEU performance of the sense amplifier and data buffer over a larger voltage range. Table 5.4 shows the minimum laser power that can induce an SEU in the sense amplifier and the data buffer. The minimum laser power is used here as the indicator of SEU tolerance: the larger the laser power needed to induce an SEU, the better the SEU tolerance of the circuit.

Table 5.4 SEU tolerance of Sense Amplifier and Data Buffer

| $V_{DD}$ (V) | Minimum Laser Power for Sense Amplifier (µW, ±20 µW) | Minimum Laser Power for Data Buffer (µW, ±20 µW) |
|--------------|------------------------------------------------------|--------------------------------------------------|
| 0.3-0.5      | ≈100                                                 | ≈135                                             |
| 0.6          | 205                                                  | 220                                              |
| 0.7          | 470                                                  | 320                                              |
| 0.8          | 745                                                  | 440                                              |
| 0.9          | 895                                                  | 470                                              |
| 1.0          | -                                                    | 485                                              |

## **5.3 Testing Results Analysis**

From the testing results, we can conclude that the write operations of the fabricated chips failed to work as predicted by the simulation results. Because we did not include any test signal IO pins in the chip design, it was difficult to give a definite explanation when this behavior was examined and analyzed. However, the following possible causes can be identified.

• Boost circuit failed to generate the boosted voltage

The testing results for write operation show that no data can be written to the memory. Since the boost circuit is the most critical component for write operation, it is the most likely part to have failed.
In addition, when we applied a 1.2 V supply voltage to the chip for write operation, the chip did not break down, which suggests that the expected 2.4 V $(2 \times V_{DD})$ was not generated and that the boost circuit was not working.

• Reading sequence generation circuit failed to generate proper control signals

Since we used a self-timing scheme to obtain an optimal reading time, there might not be enough timing margin to absorb the uncertainty due to process, voltage and temperature variations.

• Unmatched transistor models and real device parameters in sub-threshold operation

Because the device model in the simulation environment is intended for normal-voltage operation in a specific technology, simulating with this model in the sub-threshold region can lead to inconsistency between the simulation results and the experimental results.

The above discussion of the failure of the fabricated chip does not identify the specific cause and only lists some possible problems. The difficulty of troubleshooting also underlines the importance of Design For Test (DFT), which is a valuable lesson for our future work.

## **CHAPTER 6**

## **CONCLUSION AND FUTURE WORK**

# 6.1 Summary and Conclusions

The reduced SNM of SRAM in sub-threshold operation imposes great challenges on sub-threshold SRAM design. The conventional 6-transistor SRAM cell structure cannot work at sub-threshold supply voltages because it does not have enough noise margin. In order to achieve ultra low power in sub-threshold operation, previous research work has demonstrated that the read and write decoupled scheme is a good solution to the reduced SNM problem. Meanwhile, the single-event-tolerant DICE structure was implemented in this SRAM chip design to make the chip suitable for radiation environments in high-reliability applications such as space applications. A $2K\times8$ bits SRAM chip was designed, simulated and fabricated in 90nm CMOS technology provided by ST Microelectronics.
The simulation results indicate that the proposed SRAM can work at supply voltages from 0.3 V to 0.6 V, consuming 6.0 µW at a frequency of 4.7 MHz and 140 µW at 45.5 MHz respectively. This shows the ultra low-power consumption of the proposed design, at the cost of speed. For the fault-tolerance feature of this SRAM design, we focused on the cell design, which is sensitive to SEUs. With the DICE cell applied, single event tolerance is improved greatly, but at the price of a larger cell area due to the additional transistors. Consequently, the cell density of the proposed SRAM design is reduced compared to that of an SRAM made of conventional 6-transistor cells.

According to the testing results of the fabricated prototype chips, the write operation did not function as designed. The parts that possibly failed are the boost circuit and the read sequence generation circuit. A mismatch between the transistor model and the real device parameters in the sub-threshold region could also be the problem. The failure of the fabricated chip needs to be fixed in a future fabrication.

#### 6.2 Future Work

Since the fabricated chips failed to work as expected from simulation, the SRAM design will be re-fabricated in the future. Before it is sent to fabrication again, the parts of the chip suspected of failing will be modified, and additional test IO pins will be added to check the internal signals.

After the functionality of the designed SRAM is fixed, more practical research on applying this sub-threshold memory chip should be carried out. The targeted applications of this ultra low-power SRAM design are low-power biomedical and space applications. Therefore, the designed sub-threshold SRAM is intended to be integrated in digital systems, which inevitably involve sub-threshold microprocessors. Hence the design and fabrication of a sub-threshold microprocessor is one of the main research topics following the sub-threshold SRAM design.
This ultra low-power SRAM chip can also be applied in standalone manner. In this case, we need to design a low-power level shifter with reasonable speed which can form interfacing circuit between sub-threshold SRAM and normal digital processing circuits. There are some existing level shifter designs available now. However, they can rarely handle the level shifting between sub-threshold voltage and standard IO voltages, i.e. 2.5 V or 3.3 V for CMOS technology. One potential level shifter candidate is a comparator-based shifter which compares the logic output of sub-threshold SRAM with a reference voltage of half $V_{DD}$ and gives the output at standard 2.5 V or 3.3 V IO voltages. The capacity of 2K×8 bits in this project is not enough for modern digital systems, especially for some signal processing or data collection systems. Therefore, the design of subthreshold SRAM macro with larger memory capacity is also an interesting work.