I. INTRODUCTION
The translation lookaside buffer (TLB) speeds up virtual-to-physical address translation. The power consumed by the TLB is a significant part of the power consumed by the microprocessor as a whole [1]. Failures in the TLB manifest themselves at the system level with high probability [2], [3]; they result in user-visible errors with a probability of over 30% [2]. Software and algorithmic methods [2], [4] can reduce this probability by only 20-50% and cannot eliminate such errors.
CMOS matching elements for content-addressable memories (CAM) are currently built from 6T memory cells and "Exclusive OR" logic gates. The cells [5], [6] proposed for ternary content-addressable memory do not solve the problem of fault tolerance in content-addressable memories that operate under the influence of nuclear particles. The hardened ternary CAM cell designed in [6] consists of CMOS 6T memory cells, which is why data upsets under impacts of single nuclear particles remain a problem.
The new approach to hardening the TLB design against impacts of single nuclear particles is based on novel circuits developed using the STG DICE cell. The STG DICE (Spaced Transistor Groups DICE) memory cell [7], [8] differs from the DICE (Dual Interlocked Storage Cell) [9] in that its transistors are separated into two groups. Charge collection from the track of a single nuclear particle by the transistors of just one group does not upset the cell. The purpose of this work is to design a TLB on hardened elements resistant to single nuclear particles.
II. THE BASIC BLOCKS OF THE TRANSLATION LOOKASIDE BUFFER
The translation lookaside buffer (Fig. 1) contains a 64-word content-addressable memory (CAM) array and a 64-word random access memory (RAM) array. The "Register selection" input of the decoder (DEC) sets the number of the register in which data are written or read. The block of read and write buffers (R/W BUF) includes write buffers and sense amplifiers for the CAM and the RAM. The control logic (CONTROL) provides the clock and control signals in the TLB. The encoder (ENCOD) indicates the search result (hit or miss) and gives the number of the register in which the data matched.
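The block-level behaviour described above can be illustrated with a minimal functional sketch. It is not the authors' design: the class name `TlbModel`, the method names and the `translate` helper are hypothetical, masking is ignored, and the parallel CAM comparison is modelled by a sequential loop.

```python
from typing import List, Optional, Tuple

NUM_ENTRIES = 64  # the CAM and RAM arrays each hold 64 words


class TlbModel:
    """Behavioural sketch only: CAM holds tags, RAM holds data at the same index."""

    def __init__(self) -> None:
        self.cam: List[Optional[int]] = [None] * NUM_ENTRIES
        self.ram: List[Optional[int]] = [None] * NUM_ENTRIES

    def write(self, register: int, tag: int, data: int) -> None:
        # Models the decoder (DEC) selecting the register to write.
        if not 0 <= register < NUM_ENTRIES:
            raise IndexError("register out of range")
        self.cam[register] = tag
        self.ram[register] = data

    def search(self, tag: int) -> Tuple[bool, Optional[int]]:
        # Hardware compares all registers in parallel; this loop is sequential.
        for index, stored in enumerate(self.cam):
            if stored is not None and stored == tag:
                return True, index  # hit flag and register number (ENCOD role)
        return False, None  # miss

    def translate(self, tag: int) -> Optional[int]:
        # Hypothetical helper: read the RAM word of the matching register.
        hit, index = self.search(tag)
        if not hit or index is None:
            return None
        return self.ram[index]
```

For example, after `write(5, 0x1234, 0xABCD)` a search for `0x1234` reports a hit in register 5, and a search for any tag that was never written reports a miss.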
Each register of the CAM contains [10] three blocks of matching (BM) and two blocks of matching and masking (BMM). Fig. 2a shows the block of matching. The block of matching and masking (BMM) includes four "2 CAM cells" units, four mask cells arranged as two "2 mask cells" units, and the combinational logic for combining the match outputs and the bits of the lookup input. Fig. 2b shows one quarter of the BMM block.
III. SCHEMES OF THE HARDENED ELEMENTS
The element of data matching for associative memories [10] was developed using the STG DICE cell. Interleaving [11] the transistor groups of adjacent STG DICE cells increases the distance between the mutually sensitive pairs of transistors within one cell and thereby improves fault tolerance. In previous work [12], the noise immunity was simulated using SPICE models of the transistors.
Experimental studies [13] demonstrate the fault tolerance of 65-nm STG DICE cells compared with 6T memory cells and standard DICE cells during collection of charge generated by laser pulses. TCAD simulations [14] confirm the high fault tolerance of the STG DICE cell during charge collection from tracks of single nuclear particles. Charge collection from such a track in one of the STG DICE transistor groups inside the element of matching does not lead to a failure at linear energy transfer values up to 70 MeV·cm²/mg, but may lead to a temporary unsteady state of the STG DICE cell [15].
Fig. 4 shows the scheme of the memory element for masking with the combinational circuit for correct data reading during steady and unsteady states of the STG DICE nodes. The reading circuit consists of two tristate inverters, TRInv 1 and TRInv 2, and two normal inverters, Inv 1 and Inv 2. When the STG DICE cell is in an unsteady state caused by a particle impact, the reading circuit still forms the correct output signal. Correct reading is provided by the unchanged logic levels of two of the four nodes A, B, C, D. The two nodes that keep their initial logic levels belong to the group of transistors that does not collect charge from the track of the nuclear particle. Charge collection at linear energy transfer values up to 70 MeV·cm²/mg leads to no data upsets of the memory element for masking [14].
Fig. 5 presents a latch-type sense amplifier consisting of the trigger on the STG DICE cell (Fig. 5a) and the synchronous STG RS-latch (Fig. 5b). The sense amplifier is clocked by the enable signal En. Fig. 5b presents the scheme of the STG RS-latch [16] on four pairs of Ni and Pi transistors (i = 1, 2, 3, 4), which are in the same state (closed or open) as in the STG DICE cell; S1, S2, R1, R2 are the set and reset inputs, and Q11, Q12, Q21, Q22 are the outputs of the STG RS-latch.
The latch-type sense amplifiers are used in the CAM and the RAM arrays for data reading out of the bit lines.
TCAD simulation [16] of the STG RS-latch shows that the threshold linear energy transfer value LET THR is 55 MeV·cm²/mg when no interleaving of groups is applied, and that LET THR exceeds 100 MeV·cm²/mg when interleaving of groups is used.
We improve the reliability of STG-type elements using a common approach: the transistors of such logical elements are separated into two groups (or blocks), and the groups belonging to adjacent logical units are then interleaved [10]. Table 1 presents the parameters of the block of matching and the block of matching and masking, which cover 94% of the CAM register's area. In the "2 CAM cells" and "2 mask cells" units we interleave the joint groups of transistors to increase the minimum distance between the sensitive nodes of the STG DICE cell. This provides a minimum distance of 4.15 µm for the "2 CAM cells" unit and of 3 µm for the "2 mask cells" unit.
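The effect of interleaving on group separation can be sketched with a toy one-dimensional placement. This is an illustration under assumptions, not the actual layout: the simple linear ordering (all first groups, then all second groups) and the function names `interleave_groups` and `groups_between` are hypothetical.

```python
from typing import List, Tuple


def interleave_groups(num_cells: int) -> List[Tuple[int, int]]:
    """Return an assumed 1-D placement order of (cell, group) pairs.

    Group 1 of every cell is placed first, then Group 2 of every cell,
    so the two groups of one cell are separated by groups of other cells.
    """
    return [(cell, group) for group in (1, 2) for cell in range(num_cells)]


def groups_between(order: List[Tuple[int, int]], cell: int) -> int:
    """Count the foreign groups lying between the two groups of `cell`."""
    first = order.index((cell, 1))
    second = order.index((cell, 2))
    return abs(second - first) - 1


if __name__ == "__main__":
    for n in (2, 4):
        order = interleave_groups(n)
        print(n, "cells:", [groups_between(order, c) for c in range(n)])
```

In this toy ordering, interleaving two cells places one foreign group between the two groups of each cell, and interleaving four cells places three. The physical distances in micrometres depend on the real layout and cannot be derived from this sketch.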
IV. THE PARAMETERS OF ELEMENTS AND BLOCKS
In the RAM array we interleave the groups belonging to four STG DICE cells. Thus, three groups of adjacent STG DICE cells are situated between the two groups of one STG DICE cell. This provides a minimum distance of 2.95 µm between the mutually sensitive nodes of a cell.
The parameters of the blocks were simulated in Cadence Virtuoso using 65-nm CMOS transistors at the typical (tt) process corner after extraction of layout parasitics. The supply voltage is 1.0 V, the temperature is 25°C and the clock frequency is 1.0 GHz. Table 2 presents the power consumption of the blocks for the different operating modes: write, search and read. The maximum power consumption occurs in the CAM array (27.54 mW) in the search mode. This is due to the switching of all logical elements in each register during a parallel search in the TLB. In our case, the power consumed in the search mode is 0.43 mW per register at a clock frequency of 1 GHz. This is about six times the power consumption of a traditional CAM register (0.074 mW) with pre-charged match lines [17].
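The per-register figure and the ratio quoted above follow from simple arithmetic on the stated values. The short check below uses only numbers given in the text; the function and constant names are illustrative.

```python
CAM_SEARCH_POWER_MW = 27.54            # CAM array power in search mode
NUM_REGISTERS = 64                     # 64-word CAM array
TRADITIONAL_REGISTER_POWER_MW = 0.074  # traditional CAM register [17]


def per_register_power_mw(total_mw: float, registers: int) -> float:
    """Average search power per CAM register, in mW."""
    return total_mw / registers


def power_ratio(hardened_mw: float, traditional_mw: float) -> float:
    """How many times the hardened register power exceeds the traditional one."""
    return hardened_mw / traditional_mw


per_register = per_register_power_mw(CAM_SEARCH_POWER_MW, NUM_REGISTERS)
ratio = power_ratio(round(per_register, 2), TRADITIONAL_REGISTER_POWER_MW)

if __name__ == "__main__":
    print(f"per register: {per_register:.2f} mW, ratio: {ratio:.1f}x")
```

The division gives roughly 0.43 mW per register, and the ratio to the traditional register is a little under six, consistent with the rounded figure in the text.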
V. THE METHODOLOGY OF TCAD SIMULATION
Single event effects observed in the element of matching on the STG DICE cell during 3-D TCAD simulation depend on the linear energy transfer and the direction of the track. The results were obtained using the Sentaurus Device simulator at a temperature of 25°C and a supply voltage of 1.0 V for a 65-nm bulk CMOS structure. The 3-D device comprised transistors (channel width 150 nm) developed taking into account the models presented in [18].
Fig. 6 shows the layout of the element of matching, realized as two identical joint groups. One joint group contains the transistors of the first tristate inverter TRInv 1 (N1.2, N1.1, P1.2, P1.1) and of the cell's Group 1 (N D, N A, P A, P B). The second joint group contains the transistors of the tristate inverter TRInv 2 (N2.2, N2.1, P2.2, P2.1) and of Group 2 (N B, N C, P C, P D) of the STG DICE cell. In Fig. 6 the layout is shown without spacing between the joint groups; asterisks indicate the input points of the particle tracks. In this case, the distance between the transistors N A and N C is 1.35 µm. To ensure a high level of noise immunity, the transistors of the two joint groups may be spaced at an adequate distance by interleaving the joint groups of adjacent elements.
Fig. 7 shows the 3-D physical device model of the STG DICE cell. Guard bands (n+ and p+) isolate the n- and p-regions. The directions of the particle tracks are indicated by arrows labeled Track 1 and Track 2. These tracks are directed along the normal to the chip surface. Track 1 is the particle trajectory normal to the surface of the p-region. Track 2 is the trajectory normal to the surface of the chip's n-well. The area of the 3-D device model is 6.4 µm × 10.9 µm, and the depth is 3.0 µm.
VI. THE RESULTS OF TCAD SIMULATION
A. Effects in STG DICE cell
In the steady state of the STG DICE cell, the logic levels of nodes A and C are the same, and the logic levels of nodes B and D are the same too. The impact of a nuclear particle on Group 1 or Group 2 inside the STG DICE cell (Fig. 6) does not lead to a failure, but may lead to a temporary unsteady state [8]. Table 3 presents the maximum deviations of the voltage on the nodes A, B, C, D inside the element of matching (Fig. 3) during charge collection from tracks with the input points 1n, 2n, 3n, 4n, 5n (Fig. 6). In spite of the change in the voltage on node B (the deviation ΔV MAX.B = 0.49 V in Table 3), the output level "1" is unchanged. STG DICE cells significantly increase the reliability of the matching procedure during storage time.
B. Noise impulses on output of combinational circuit
When the voltage levels of the nodes are practically unchanged, the memory cell stores the recorded data securely and the main reason for noise pulses at the output of the element of matching is the charge collection by the transistors of the combinational circuit.
For the initial state of nodes ABCD = 0101 and Input 1 = "0" (Input 2 = "1"), matching data are indicated by Output = "1". In this case the inverter TRInv 1 is in the logical state "1", the transistors P1.1, P1.2, N1.2 are open, and the transistor N1.1 is closed (Fig. 3). The output of TRInv 2 is in the high-impedance state and does not shunt the output of the inverter TRInv 1.
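The logical behaviour of the element of matching can be summarized in a small sketch. Only the case ABCD = 0101 with Input 1 = "0" giving Output = "1" is stated explicitly above. The sketch assumes that node A carries the stored bit and that Output is "1" exactly when Input 1 equals it, which is an assumption made for illustration; the function names are hypothetical.

```python
STEADY_STATES = ("0101", "1010")  # steady states: A == C, B == D, A != B


def is_steady(nodes: str) -> bool:
    """True if the ABCD node string is one of the two steady states."""
    return nodes in STEADY_STATES


def match_output(nodes: str, input1: int) -> int:
    """Logical output of the element of matching (assumed behaviour).

    Assumption: node A holds the stored bit and the output is 1 when
    Input 1 equals that bit (Input 2 is taken as the complement of Input 1).
    """
    if not is_steady(nodes):
        raise ValueError("logical model is defined only for steady states")
    stored_bit = int(nodes[0])
    return 1 if stored_bit == input1 else 0


if __name__ == "__main__":
    for nodes in STEADY_STATES:
        for input1 in (0, 1):
            print(nodes, input1, match_output(nodes, input1))
```

The model is purely logical: it does not capture the transient noise pulses or the temporary unsteady states obtained in the TCAD simulation.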
In this mode, charge collection from each of the tracks 1n, 2n, 3n leads to a temporary switching of the inverter TRInv 1 from the level "1" down to "0" at LET values in the range from 20 to 70 MeV·cm²/mg. These processes take place as in a normal inverter: charge collection by the transistor N1.1 causes a temporary decrease of the voltage on the Output. If the input track point is 4n (Fig. 6), the charge is collected only by the transistor N B of the STG DICE cell. The voltage deviation on node B is ΔV MAX.B = 0.49 V, but the Output keeps "1".
Table 4 presents the maximum amplitudes of the output noise pulses that appear due to charge collection by transistors of the combinational circuit for all logic states of the element of matching at a LET value of 70 MeV·cm²/mg. The cases in which the input data match the stored data (the Output logic level is "1") are more sensitive to the generation of large noise pulses (with an amplitude up to 1.33 V) at the Output. Small noise pulses are observed in two cases. The first is a particle impact on PMOS transistors (the amplitude of the noise pulses is up to 0.20 V). The second is a particle impact on NMOS transistors (the amplitude of the noise pulses is up to 0.29 V) when the input data do not match the stored data (the Output logic level is "0").
The curves in Fig. 8 show the parameters of the noise pulses as a function of the coordinate of the input track point at LET = 70 MeV·cm²/mg. Fig. 8a presents the dependencies of the amplitudes A PULSE and durations t PULSE of the output pulses for the initial state of nodes ABCD = 0101 and Input 1 equal to "0". Fig. 8b shows the dependencies for ABCD = 1010 and Input 1 equal to "1". The noise pulses in both cases (Fig. 8a and Fig. 8b) are due to charge collection by the NMOS transistors of the tristate inverters. The curves in Fig. 8b are characterized by the same amplitudes and durations of the noise pulse as those in Fig. 8a.
The characteristics of Joint group 1 (Fig. 6) obtained for input track points 1n and 2n at the state ABCD = 0101 and Input 1 = "0", and the characteristics of Joint group 2 obtained for input track points 3n and 4n at the state ABCD = 1010 and Input 1 = "1", are practically the same. The reactions of these joint groups are interchanged. This is confirmed by the dependencies in Fig. 8, where the characteristics in Fig. 8b, rearranged into another sequence of track inputs, namely 3n, 4n, 1n, 2n, are similar to the characteristics in Fig. 8a.
Fig. 9 presents the amplitudes and durations of the output noise pulses as functions of linear energy transfer (LET). The dependencies are obtained for the input track points with the largest pulse response: 1n at the initial state of nodes ABCD = 0101 and 3n at the state ABCD = 1010.
VII. THE TOPOLOGY OF TLB
Fig. 10 shows the floorplan of the designed translation lookaside buffer. It uses five metallization layers, and all inputs and outputs are placed along one side of the TLB. Guard rings on the borders of the n-well and p-substrate prevent latch-up. The chip dimensions are 353 × 278 µm² (Fig. 10) and the area is 0.098 mm². The proposed TLB contains only one parity calculation element (instead of one element per word), which is placed in the R/W BUF block. This reduces the CAM area by 11%. The CAM, RAM and R/W BUF blocks contain only hardened elements and occupy 87% of the chip area; 5% of the area holds non-hardened elements, and 8% of the area contains no transistors.
Table 5 presents the main parameters of the TLB: the chip area; the power consumption in the write, search and read modes; and the delays in the search and read modes. To evaluate design efficiency, the energy consumption of the content-addressable memory is expressed as the figure of merit FOM = [Power × (Clock Period)/(Total Number of bits)]. Typical values of the energy consumption for a traditional 65-nm CMOS CAM are 0.77-2 fJ/bit/search [17]. The hardened CAM block expends 9.15-12.6 fJ/bit/search. This exceeds the consumption of a traditionally designed CAM by 5-6 times, since the power consumed in the search mode by each of our registers exceeds that of a non-hardened register. Efficient use of the chip area can be estimated using the "Area on bit" parameter. The values of "Area on bit" for non-hardened CAM designs are 7.6-16.9 µm²/bit. In our case, "Area on bit" is 21.1 µm²/bit for the hardened 65-nm CAM design. In contrast to the conventional CAM design, the block of matching is realized as a static combinational circuit rather than as a dynamic circuit with a pre-charged match line. This increases reliability but costs additional power.
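The figure of merit and the chip area quoted above can be evaluated with a short sketch. The function names are illustrative, and the bit count is left as a parameter because the total number of bits is not stated in this section; the value used in the demonstration call is a hypothetical placeholder, not a parameter of the design.

```python
def fom_fj_per_bit(power_w: float, clock_period_s: float, total_bits: int) -> float:
    """FOM = Power x (Clock Period) / (Total Number of bits), in fJ/bit/search."""
    return power_w * clock_period_s / total_bits * 1e15


def chip_area_mm2(width_um: float, height_um: float) -> float:
    """Rectangular chip area in mm^2 from dimensions in micrometres."""
    return width_um * height_um / 1e6


if __name__ == "__main__":
    # Chip dimensions from the text: 353 um x 278 um.
    print(f"area = {chip_area_mm2(353, 278):.3f} mm^2")
    # Hypothetical placeholder inputs, chosen only to show the unit handling:
    # 1 mW at a 1 ns clock period spread over 100 bits.
    print(f"FOM  = {fom_fj_per_bit(1e-3, 1e-9, 100):.2f} fJ/bit/search")
```

The area check reproduces the stated 0.098 mm². For the FOM, the period corresponding to the 1 GHz clock is 1 ns, so the result scales directly with the search power and inversely with the number of bits searched.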
VIII. CONCLUSION
The presented translation lookaside buffer is the first project based fully on STG DICE memory cells. This fault-tolerant solution was used in the CAM, the RAM, and the read and write buffers. The separation of the transistors into two groups and their spacing on the chip make the elements of matching and the elements for masking resistant to the impacts of single nuclear particles. The STG DICE cell reliably stores the logical state of the proposed elements under the influence of ions with LET up to 70 MeV·cm²/mg. Short-term noise pulses can occur at the outputs of these elements when they are exposed to ions with LET in the range from 20 to 70 MeV·cm²/mg.
