

Missouri University of Science and Technology Scholars' Mine

Electrical and Computer Engineering Faculty Research & Creative Works

**Electrical and Computer Engineering** 

01 Feb 2022

# A Stub Equalizer for Bidirectional and Single-Ended Channels in NAND Memory Storage Device Systems

Taelim Song

Jongjoo Lee

Chulsoon Hwang Missouri University of Science and Technology, hwangc@mst.edu


## **Recommended Citation**

T. Song et al., "A Stub Equalizer for Bidirectional and Single-Ended Channels in NAND Memory Storage Device Systems," *IEEE Transactions on Electromagnetic Compatibility*, vol. 64, no. 1, pp. 172 - 181, Institute of Electrical and Electronics Engineers, Feb 2022. The definitive version is available at https://doi.org/10.1109/TEMC.2021.3098481


# A Stub Equalizer for Bidirectional and Single-Ended Channels in NAND Memory Storage Device Systems

Taelim Song, Member, IEEE, Jongjoo Lee, Member, IEEE, and Chulsoon Hwang, Senior Member, IEEE

*Abstract*—In memory devices, such as solid-state drives, multitopology is used for interfaces where multiple memory packages are connected to a controller through a branched transmission line. Impedance mismatch caused by the branches and unwanted reflections from deactivated packages inevitably degrade signal quality, limiting the data rate of the interface. In this article, a simple stub equalizer is proposed to improve the data rate of the memory interface. An open-ended stub is placed between a transmitter and a receiver, and the length, impedance, and location of the stub line are chosen to cancel the reflections from the other branches. The parameters are optimized with peak distortion analysis and an exhaustive search that considers both read and write modes. The improvements are validated through eye-diagram simulations.

*Index Terms*—Equalizer, peak distortion analysis (PDA), signal integrity (SI), solid-state drive (SSD), stub equalizer.

#### I. INTRODUCTION

A trace structure with multiple branched lines is applied to a system in which two or more receivers (Rx) correspond to one transmitter (Tx). This interconnection type is frequently used in dynamic random-access memory modules for the interface with the CPU [1]-[3]. More recently, it has also been adopted as the interface between the controller and NAND flash memory in solid-state drive (SSD) modules, which use high-capacity NAND flash memory to replace hard disk drives [4]. Fig. 1 shows the interface between the NAND memory packages and a controller with multiple channels, which together form a high-capacity NAND flash memory storage device. However, as shown in Fig. 1, traces with a branched structure cause serious signal distortion that degrades the bit error rate (BER) at high I/O bandwidth [5], [6]. In particular, the edge-time distortion caused by reflections in the branched trace structure affects the eye width and height at the receiver side and can be a major cause of reduced timing margin [7].

As I/O interface speeds increase, it becomes more difficult to avoid BER degradation [8]. Recently developed SSD products use I/O signaling of 1 Gb/s or more to implement interfaces of thousands of megabits per second between the NAND and the controller. An SSD module must obtain the maximum capacity from a limited number of channels, so it has to run branched traces at high speed and therefore needs a method to improve the timing margin. However, it is very difficult to implement a filter or an active equalizer to counteract such signal distortion because of space constraints on the printed circuit board or chip.

Fig. 1. Typical NAND-controller interface with multitopology for SSD module.

Manuscript received February 1, 2021; revised May 28, 2021; accepted July 13, 2021. Date of publication August 11, 2021; date of current version February 17, 2022. This work was supported by National Science Foundation under Grant IIP-1916535. (*Corresponding author: Taelim Song.*)

Taelim Song and Chulsoon Hwang are with the Electromagnetic Compatibility Laboratory, Missouri University of Science and Technology, Rolla, MO 65401 USA (e-mail: terrious76@gmail.com; hwangc@mst.edu).

Jongjoo Lee is with the Solution Design and Integration Group, SK Hynix Inc., Seongnam-si 13558, South Korea (e-mail: ejongjoo@gmail.com).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/TEMC.2021.3098481.

Digital Object Identifier 10.1109/TEMC.2021.3098481

In this article, a stub equalizer in the form of an open-stub line is proposed to reduce the signal distortion caused by multitopology (multiple branched lines, or legs). Because the method does not use a dedicated component, such as a filter or an active equalizer, the signal is improved solely by selecting an appropriate stub length, impedance, and location. First, in Section II, the signal distortion in the controller-NAND flash memory interface circuit with a topology line is analyzed with a lattice diagram under the assumption of lossless transmission lines, and the effect of the proposed stub equalizer on this distortion is described. Second, in Section III, peak distortion analysis (PDA) based on the transfer function of the signaling interface circuit, including the topology, is presented as a method of optimizing the stub equalizer through an exhaustive search. Third, Section IV uses eye diagrams to verify that the optimized stub works effectively as a passive equalizer in a bidirectional transmission circuit. Finally, Section V concludes this article.

#### II. PROPOSED STUB EQUALIZER

Generally, to implement a high-density SSD system, the interface between one host processor and the NAND memory uses a branched trace, such as a 2T-topology or 4T-topology interconnection (for two or four NAND packages) [4]. In such

0018-9375 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.




Fig. 2. Lattice diagram to calculate the distortion for the balanced multitopology line.

a T-topology, the signal from the Tx reaches the Rx significantly distorted because the transmission coefficient is affected by the impedance change at the branch node. In this section, the signal distortion caused by these branched traces is analyzed. In addition, an open-stub equalizer that can improve the timing margin by reducing this distortion is proposed.

#### A. Problem Description

Before the stub equalizer is introduced and its improvement effect is analyzed, the cause of the signal distortion in the multitopology trace must be understood. The well-known lattice diagram can be used to explain the distortion caused by reflections from the legs of a balanced multitopology [9]. The signal distortion due to the multiple topology legs is calculated quantitatively with the lattice diagram shown in Fig. 2. The main trace length $L$ of this circuit is 800 mils, and each leg of the branched trace, $L_{T1}, L_{T2}, ..., L_{Tn}$, is 400 mils long. The characteristic impedance $Z_0$ of the main and branched traces is 50 $\Omega$. In the actual NAND–controller interface circuit, a load capacitance of a few picofarads and on-die termination (ODT) at the receiver (Rx) need to be applied. However, replacing the Rx side with an open load and removing the ODT of the enabled side simplifies the calculation and makes the signals reflected from the other legs, which cause the edge-time distortion, easier to identify.

The voltage level $V_L$ that enters a topology leg after the main trace is determined by the reflection coefficient $\Gamma_T$. Since the characteristic impedance of each leg and that of the main trace are the same, the reflection coefficients $\Gamma_T$ and $\Gamma_{T'}$ can both be calculated as (1 - n)/(1 + n), where *n* is the number of topology legs. The larger the number *n* of legs, the closer the node at which the legs start is to a short circuit, and thus the smaller the first launched $V_L$ in time section (1), as shown in Fig. 2. When calculating the voltage at the open node of each leg, the voltage arriving from the legs other than the self-leg must be added. The signal value traveling in time



Fig. 3. Low to high switching waveforms calculated according to the number of topology legs.

section (3) is the sum of the reflected signal $V_L\Gamma_{T'}$ of the self-leg (the solid line) and the transmitted signals $V_L(n-1)(1+\Gamma_{T'})$ from the other legs (the dotted line). The signal in (5) is obtained from the signal in (3) in the same way that the signal in (3) is obtained from the signal in (1). Therefore, the traveling signals in time sections (1)–(5) and the resulting step voltages can be calculated with the following equations:

$$V_L = \frac{V_S}{2} \times (1 + \Gamma_T) \tag{1}$$

$$V_L' = V_L \times \Gamma_{T'} + (n-1) \left( V_L + V_L \times \Gamma_{T'} \right) \tag{2}$$

$$\begin{aligned} V_L'' ={}& \Gamma_{T'} \left\{ V_L \times \Gamma_{T'} + (n-1) \left( V_L + V_L \times \Gamma_{T'} \right) \right\} \\ &+ (n-1) \left[ V_L \times \Gamma_{T'} + (n-1) \left( V_L + V_L \times \Gamma_{T'} \right) + \Gamma_{T'} \left\{ V_L \times \Gamma_{T'} + (n-1) \left( V_L + V_L \times \Gamma_{T'} \right) \right\} \right] \end{aligned} \tag{3}$$

$$A_n = 2V_L \cdot k^0 \tag{4}$$

$$B_n = 2V_L \cdot (k^0 + k^1) \tag{5}$$

$$C_n = 2V_L \cdot (k^0 + k^1 + k^2) \tag{6}$$

where $V_L$ in (1) is the signal initially traveling along the leg in time sections (1) and (2), and $V_L'$ in (2) is the traveling signal in time sections (3) and (4). The traveling signal in time section (5), $V_L''$, is determined by (3).

By continuing the calculation in this way, the stepped edge-time waveform for low-to-high switching can be obtained, as shown in Fig. 3. This plot shows analytically the signal distortion caused by the multitopology. With $k = \Gamma_{T'} + (n-1)(1 + \Gamma_{T'})$, the voltage values $A_n$, $B_n$, and $C_n$ at each step of the lattice diagram in Fig. 2 can be calculated more easily by (4), (5), and (6), respectively. As the number of legs increases, the absolute value of the reflection coefficient increases, so $V_L$ becomes lower. A low $V_L$ means that the signal needs more steps, and therefore more time, to saturate. The same mechanism appears symmetrically in high-to-low switching and is the main reason why the eye width is degraded.
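The step voltages of (4)–(6) can be checked numerically. The following is a minimal sketch, assuming the lossless, open-load, equal-impedance conditions of Fig. 2; the function name `step_voltages` and its arguments are illustrative and do not come from the article.

```python
def step_voltages(n_legs, v_source=1.0, n_steps=6):
    """Step levels A_n, B_n, C_n, ... at the open end of a leg, per (1) and (4)-(6)."""
    gamma = (1 - n_legs) / (1 + n_legs)        # Gamma_T = Gamma_T' for equal impedances
    v_l = v_source / 2 * (1 + gamma)           # first launched leg voltage, (1)
    k = gamma + (n_legs - 1) * (1 + gamma)     # k as defined in the text
    levels = []
    partial_sum = 0.0
    for m in range(n_steps):
        partial_sum += k ** m                  # k^0 + k^1 + ... + k^m
        levels.append(2 * v_l * partial_sum)   # factor 2 as in (4)-(6)
    return levels


# Compare the 2-leg and 4-leg cases for a 1.2 V source.
for n in (2, 4):
    print(n, [round(v, 3) for v in step_voltages(n, v_source=1.2)])
```

Under these assumptions the first step is lower and the approach to the source level is slower for four legs than for two, which is the trend described for Fig. 3.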

[Fig. 4 schematic labels: source $V_S$ = 0 to 1.2 V, $f_0$ = 800 MHz, edge times $t_f$ and $t_r$ = 50 ps, data rate = 1.6 Gb/s, signal pattern PRBS $2^7-1$; main traces ($L_1$, $L_2$, $Z_0$), legs ($L_{T1}$, $L_{T2}$, $Z_0$), open stub ($L_S$, $Z_S$), loads $C_0$, $C_1$, $C_2$.]

Fig. 4. Circuit with 2T-topology trace for the data signal interface between the host processor and NAND packages. (a) Write-mode signaling. (b) Read-mode signaling.

#### B. Stub Equalizer (Mainly Structure Description and Underlying Physics)

The circuits shown in Fig. 4 are data interface circuits with one host target and two NAND targets. They are used to simulate the data write mode and read mode between the host processor and the NAND memory over a 2T-topology line. In the write mode of Fig. 4(a), $V_S$ (= 1.2 V) generates the data signal output from Tx0 (host processor), the two Rx (NAND packages) are loads 1 and 2, and the capacitances $C_1$ and $C_2$ are defined as the respective Rx loads. Of the two or more NAND targets connected to the host processor, only the chip-enabled one uses the ODT; the other targets present only the capacitive load without the ODT. The main trace length $L_1+L_2$ of this circuit is 800 mils, and the branched traces $L_{T1}$ and $L_{T2}$ are both 400 mils long. The characteristic impedance $Z_0$ of the main trace and the two branched traces is 50 $\Omega$.

In order to mimic typical circuit conditions for the interface between the host processor and NAND memory,  $R_S$  to adjust the driving strength and  $R_T$  of ODT were applied in series and parallel to the Tx and Rx nodes, respectively, where  $R_S$  is 50  $\Omega$ and  $R_T$  is 150  $\Omega$ . The load capacitance of each Rx was set to 2 pF for  $C_1$  and  $C_2$ . The case of the read-mode signaling is shown in Fig. 4(b), where the driving and load conditions of Tx0 and Rx1 are reversed, respectively.

An eye-diagram plot in Advanced Design System (ADS) is used to simulate the signal distortion caused by the 2T-topology lines. The host processor outputs a pseudorandom binary sequence (PRBS) signal at 1.6 Gb/s with double data rate (DDR). Although the NAND-host processor interfaces developed so far run at 1.2 Gb/s, the test is set up at 1.6 Gb/s to verify the signal integrity (SI) improvement of the stub equalizer for anticipated future development. The fundamental frequency $f_0$ of the signal is 800 MHz, the swing level is 0–1.2 V, and the edge (transition) times $t_r$ and $t_f$ are 50 ps for the driving source $V_S$. In an eye diagram, the eye width is one



Fig. 5. Eye diagram of the write-mode signaling probed at Rx1 in 2T-topology. (a) Distortion due to the topology line without the stub equalizer. (b) Improvement effect of the stub equalizer.



Fig. 6. Eye diagram of the read-mode signaling probed at Rx1 in 2T-topology. (a) Distortion due to the topology line without the stub equalizer. (b) Improvement effect of the stub equalizer.

of the main figures of merit that determine SI quality. The eye width measured with the allowed ripple of $V_{\rm ref}$ taken into account represents the timing margin of the signal interface. The allowed ripple is $\pm 5\%$ of the drive swing level of the Tx. That is, as shown in Fig. 5, the height of the rectangle inside the eye opening is fixed at 0.12 V to determine the width.

As shown in Fig. 5(a), the eye opening at Rx1 is degraded by the topology line in write-mode signaling without the stub equalizer. The edge of the signal is distorted by the signal reflected from the NAND target that is not enabled. To reduce this edge distortion, a stub equalizer with $Z_S$ of 50 $\Omega$ and a length of 400 mils is connected at the center of the main trace ($L_1 = L_2 = 400$ mils). Fig. 5(b) shows the enhanced timing margin due to the stub equalizer.

By improving the edge time, the stub increases the timing margin by around 13%. The stub equalizer also works in read-mode signaling. A comparison of Fig. 6(a) and (b) shows that the timing margin improves in read mode as well. As in write mode, the edge-time distortion is decreased, and the timing margin is enhanced by about 7%. One reason for the different amounts of improvement in write and read modes is that the stub position acts differently in the two modes. Another reason is that a topology that behaves as balanced in write mode can become unbalanced in read mode.

A 4T-topology circuit with four NAND targets is often used to increase capacity, as shown in Fig. 7. Fig. 7(a) and (b) show the configurations for simulating write mode and read mode, respectively, and the driving conditions of the Tx are the same as in the 2T-topology case. The load C of the enabled NAND target with the ODT is 2 pF. All of the nonenabled packages have only the C load. The lengths of the main trace and topology



Fig. 7. Circuit with 4T-topology trace for the data signal interface between the host processor and NAND packages. (a) Write-mode signaling. (b) Read-mode signaling.



Fig. 8. Eye diagram of the write-mode signaling probed at Rx1 in 4T-topology. (a) Distortion due to multitopology line without the stub equalizer. (b) Improvement effect of the stub equalizer.

legs with 50 $\Omega$ (characteristic impedance $Z_0$) are 800 mils and 400 mils, respectively. As in the 2T-topology, this circuit has a balanced topology, with the same length from $L_{T1}$ to $L_{T4}$.

As shown in Fig. 8(a), the number of legs that cause reflection triples, making the edge-time distortion worse than in the 2T-topology case. As a result, the eye width in write-mode operation is about 37% smaller than that of the 2T-topology. In this case, the stub equalizer has a characteristic impedance $Z_S$ of 50 $\Omega$ and is used at 250 mil



Fig. 9. Eye diagram of the read-mode signaling probed at Rx1 in 4T-topology. (a) Distortion due to multitopology line without the stub equalizer. (b) Effect of the stub equalizer.



Fig. 10. Calculation of a lattice diagram for the effect of the stub equalizer.

in length and 200 mil from the left end of the main trace. It improves the timing margin by about 21% in write mode, as shown in Fig. 8(b). In the read-mode operation of Fig. 9, the stub equalizer changes the eye opening by reducing the edge-time distortion but does not increase the timing margin. With this stub condition, therefore, the timing margin is improved only in write mode. The improvement is not visible in read mode because the stub variables obtained through the PDA process introduced in Section III improve the read-mode margin by only a few picoseconds, and a gain this small is masked by the error between the eye diagram and the PDA result.

The main role of the stub equalizer in a multitopology can be described in two points. First, the stub adds another open topology leg, which increases the magnitude of the reflection coefficient and lowers the level of the signal output from the driver. In other words, the voltage level of the driving signal at step A decreases because the open stub lowers the input impedance before the signal reaches the topology leg on the Rx side. Second, the signal first reflected at the end of the open stub is transferred to the Rx, which increases the voltage level at step B. To make this mechanism easier to calculate in the lattice diagram of Fig. 10, all Rxs are set to an open load, the impedances of the main trace, legs, and stub line are set to 50 $\Omega$, and an open stub 400 mils long is located in the middle of the main trace ($L_1$, $L_2$, and $L_S$ are the same). This lattice diagram can be calculated by the following equations:

$$V_S = \frac{V_S}{2} \times T_S \tag{7}$$

$$V_S' = \frac{V_S}{2} \times T_S \cdot \Gamma_T \tag{8}$$

$$V_S'' = \frac{V_S}{2} \times T_S \cdot (\Gamma_{T'} \cdot \Gamma_S + 1 + \Gamma_S) \tag{9}$$

$$A'_n = 2V_L \cdot T_S \cdot k^0 \tag{10}$$

$$B'_{n} = 2V_{L} \cdot T_{S} \cdot (k^{0} + k^{1} + \Gamma_{T'} + \Gamma_{S} + T_{S}) \tag{11}$$

where $V_S$ on the left-hand side of (7) is the signal that enters the open stub (the dotted line) and also passes the open stub onto main trace 2 (the solid line) in time section (6). $T_S$ is the transmission coefficient of the stub when $Z_0$ and $Z_S$ are both 50 $\Omega$. $V'_S$ in (8) is the initial signal reflected by the topology (the solid line). According to (10), $A'_n$ of the initial step is reduced by the factor $T_S$. $B'_n$ in (11) is determined by the signal $V''_S$ of time section (8) (the solid line), which is calculated by (9).

As shown in Fig. 11, the time interval between the steps that form the saturating waveform stays constant because the leg length is fixed. Therefore, as the difference between the voltage levels of *A* and *B* increases over the constant A-B time interval, a steeper transition is formed. This change improves the timing margin in the eye diagram. The effect can be more noticeable under real conditions, such as a NAND interface circuit with an Rx load *C* (1 or 2 pF), because the load capacitance slows the signal transition.

#### III. EQUALIZER OPTIMIZATION

Three variables determine the characteristics of a stub equalizer: the characteristic impedance of the stub, its length, and its position. Finding values of these variables that increase the timing margin in a multitopology circuit requires many repetitive "what if" case studies and a large amount of time. This section proposes a methodology that reduces this effort and finds optimized open-stub variables effectively. To calculate the stub equalizer performance rapidly for each case, the multitopology circuit is analyzed by PDA with a transfer function derived from the *ABCD*-parameters.

#### A. Modeling

The first step of the PDA is to extract the transfer function of the multitopology circuit from its *ABCD*-parameters. The multitopology circuits of Figs. 4 and 7 in Section II can be redrawn as a series connection for the write mode (a) and read mode (b), as shown in Fig. 12, to extract the *ABCD*-parameters [10]. This figure makes it clear that a topology interface circuit, including the stub equalizer shown in Section II, can be treated as a cascade of parts. The matrices of parts 1–7 for the write mode with an *n*T-topology are written in order in (12), following Fig. 12(a).



Fig. 11. Comparison of the stub equalizer effect at the multitopology line. (a) Two-leg topology case. (b) Four-leg topology case.

In the case of read mode, the matrix product in (13) is composed of the same parts in the order 1, 6, 5, 4, 3, 2, and 7, as shown in Fig. 12(b)

$$\begin{aligned} \begin{bmatrix} A_W & B_W \\ C_W & D_W \end{bmatrix} ={}& \begin{bmatrix} 1 & R_S \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \cos \beta_0 l_1 & j Z_0 \sin \beta_0 l_1 \\ j Y_0 \sin \beta_0 l_1 & \cos \beta_0 l_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \frac{j \tan \beta_S l_S}{Z_S} & 1 \end{bmatrix} \\ &\times \begin{bmatrix} \cos \beta_0 l_2 & j Z_0 \sin \beta_0 l_2 \\ j Y_0 \sin \beta_0 l_2 & \cos \beta_0 l_2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \left( Z_0 \frac{\frac{1}{j \omega C_2} + j Z_0 \tan \beta_0 l_{T2}}{Z_0 + j \frac{1}{j \omega C_2} \tan \beta_0 l_{T2}} \right)^{-1} & 1 \end{bmatrix}^{n-1} \\ &\times \begin{bmatrix} \cos \beta_0 l_{T1} & j Z_0 \sin \beta_0 l_{T1} \\ j Y_0 \sin \beta_0 l_{T1} & \cos \beta_0 l_{T1} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ j \omega C_2 + \frac{1}{Z_t} & 1 \end{bmatrix} \end{aligned} \tag{12}$$



Fig. 12. Modified circuit with two-topology trace for the data signal interface between the host processor and NAND packages. (a) Write-mode signaling. (b) Read-mode signaling.

$$\begin{aligned} \begin{bmatrix} A_R & B_R \\ C_R & D_R \end{bmatrix} ={}& \begin{bmatrix} 1 & R_S \\ 0 & 1 \end{bmatrix} \begin{bmatrix} \cos \beta_0 l_{T1} & j Z_0 \sin \beta_0 l_{T1} \\ j Y_0 \sin \beta_0 l_{T1} & \cos \beta_0 l_{T1} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \left( Z_0 \frac{\frac{1}{j \omega C_2} + j Z_0 \tan \beta_0 l_{T2}}{Z_0 + j \frac{1}{j \omega C_2} \tan \beta_0 l_{T2}} \right)^{-1} & 1 \end{bmatrix}^{n-1} \\ &\times \begin{bmatrix} \cos \beta_0 l_2 & j Z_0 \sin \beta_0 l_2 \\ j Y_0 \sin \beta_0 l_2 & \cos \beta_0 l_2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ \frac{j \tan \beta_S l_S}{Z_S} & 1 \end{bmatrix} \\ &\times \begin{bmatrix} \cos \beta_0 l_1 & j Z_0 \sin \beta_0 l_1 \\ j Y_0 \sin \beta_0 l_1 & \cos \beta_0 l_1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ j \omega C_2 + \frac{1}{Z_t} & 1 \end{bmatrix} \end{aligned} \tag{13}$$

$$TF_{w/\,\mathrm{stub}} \text{ of write mode} = \frac{1}{A_W} \tag{14}$$

$$TF_{w/\,\mathrm{stub}} \text{ of read mode} = \frac{1}{A_R} \tag{15}$$

where *n* is the number of topology legs, and $\beta_0$ and $\beta_S$ are the phase constants of the trace and the stub equalizer, respectively. The transfer function of the write mode is the reciprocal of the element $A_W$ of (12); likewise, the transfer function of the read mode is $1/A_R$, as shown in (14) and (15). The transfer function without the stub equalizer is obtained by excluding the stub matrix (part 3 in Fig. 12) from the right-hand side of (12) and (13). With (12)–(15), the transfer function of a topology circuit with *n* legs can be generalized.
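The cascade in (12)–(15) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the effective permittivity `eps_eff` (which sets the phase constant, taken here to be the same for trace and stub) is a hypothetical value, $Z_t$ in the last matrix is assumed to be the 150 $\Omega$ ODT resistance, every load capacitance is assumed to be 2 pF, and all function names are illustrative.

```python
import math

MIL = 25.4e-6            # metres per mil
C0 = 299_792_458.0       # speed of light in vacuum, m/s


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


def series_z(z):
    """ABCD matrix of a series impedance."""
    return ((1, z), (0, 1))


def shunt_y(y):
    """ABCD matrix of a shunt admittance."""
    return ((1, 0), (y, 1))


def line(beta, length_m, z0):
    """ABCD matrix of a lossless transmission line."""
    theta = beta * length_m
    return ((math.cos(theta), 1j * z0 * math.sin(theta)),
            (1j * math.sin(theta) / z0, math.cos(theta)))


def transfer_function(freq_hz, mode="write", n_legs=2, l1=400, l2=400, lt=400,
                      ls=400, z0=50.0, zs=50.0, rs=50.0, rt=150.0,
                      c_load=2e-12, eps_eff=4.0, with_stub=True):
    """TF = 1/A of the cascaded ABCD matrix; lengths in mils, freq_hz > 0."""
    omega = 2 * math.pi * freq_hz
    beta = omega * math.sqrt(eps_eff) / C0       # assumed phase constant (trace and stub)
    y_stub = 1j * math.tan(beta * ls * MIL) / zs if with_stub else 0j
    z_cap = 1 / (1j * omega * c_load)            # capacitive load of a nonenabled leg
    t = math.tan(beta * lt * MIL)
    z_leg = z0 * (z_cap + 1j * z0 * t) / (z0 + 1j * z_cap * t)
    parts = {
        1: [series_z(rs)],
        2: [line(beta, l1 * MIL, z0)],
        3: [shunt_y(y_stub)],                    # open stub
        4: [line(beta, l2 * MIL, z0)],
        5: [shunt_y(1 / z_leg)] * (n_legs - 1),  # nonenabled legs
        6: [line(beta, lt * MIL, z0)],
        7: [shunt_y(1j * omega * c_load + 1 / rt)],
    }
    order = (1, 2, 3, 4, 5, 6, 7) if mode == "write" else (1, 6, 5, 4, 3, 2, 7)
    total = ((1, 0), (0, 1))
    for key in order:
        for matrix in parts[key]:
            total = matmul(total, matrix)
    return 1 / total[0][0]


for f in (0.8e9, 2.4e9):
    print(f, abs(transfer_function(f)), abs(transfer_function(f, with_stub=False)))
```

Setting `with_stub=False` replaces the stub admittance with zero, which turns part 3 into the identity matrix and is equivalent to leaving it out of the product.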

In the next step, the source to be combined with the transfer functions created in the previous step is implemented. A single-bit pulse is represented by using the function of the



Fig. 13. Periodic trapezoidal waveform in the time domain.

spectrum of periodic trapezoidal waveforms with a specific period and transition time in the time domain, as shown in Fig. 13. As a frequency-domain function, this waveform is represented by the following equations [11]:

$$c_n = A \frac{\tau}{T} \frac{\sin\left(n\frac{\pi\tau}{T}\right)}{n\frac{\pi\tau}{T}} \frac{\sin\left(n\frac{\pi\tau_r}{T}\right)}{n\frac{\pi\tau_r}{T}} e^{-jn\frac{\pi(\tau+\tau_r)}{T}} \bigg|_{\tau_r = \tau_f} \tag{16}$$

$$c_{0} = A \frac{\tau}{T}, \quad |c_{n}| = A \frac{\tau}{T} \left| \frac{\sin\left(n\frac{\pi\tau}{T}\right)}{n\frac{\pi\tau}{T}} \right| \left| \frac{\sin\left(n\frac{\pi\tau_{r}}{T}\right)}{n\frac{\pi\tau_{r}}{T}} \right| \Bigg|_{n \neq 0} : \text{magnitude} \tag{17}$$

$$\angle c_n = \begin{cases} -n\frac{\pi(\tau+\tau_r)}{T}, & \sin\left(n\frac{\pi\tau}{T}\right)\sin\left(n\frac{\pi\tau_r}{T}\right) \ge 0 \\ \pi - n\frac{\pi(\tau+\tau_r)}{T}, & \sin\left(n\frac{\pi\tau}{T}\right)\sin\left(n\frac{\pi\tau_r}{T}\right) < 0 \end{cases} : \text{angle} \tag{18}$$

where, as shown in Fig. 13, the pulsewidth $\tau$ (at 50%) of the single-bit pulse is 0.625 ns, and the rise and fall times, $\tau_r$ and $\tau_f$, are each 50 ps. This pulse serves as the Tx source that replaces the random pattern in the 1.6 Gb/s DDR interface. Thus, although *T* is the period of the waveform, it is set long enough (more than 10 ns) for the waveform to look like a single-bit pulse, which makes the PDA possible. The spectrum of the single-bit pulse from (16) is applied as the driving input to the transfer function of (14) or (15); multiplying the two in the frequency domain corresponds to convolution in the time domain and gives the response to the single-bit pulse.
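As an illustration of (16), the sketch below evaluates the Fourier coefficients of the trapezoidal pulse and rebuilds the time-domain waveform from a truncated series. It assumes equal rise and fall times and the values quoted in the text (A = 1.2 V, $\tau$ = 0.625 ns, $\tau_r$ = 50 ps, T = 10 ns); the function names and the harmonic count `n_harm` are hypothetical choices, not values from the article.

```python
import cmath
import math


def sinc(x):
    """sin(x)/x with the removable singularity at x = 0 filled in."""
    return 1.0 if x == 0 else math.sin(x) / x


def trapezoid_coeff(n, amp, tau, tau_r, period):
    """Two-sided Fourier coefficient c_n of (16), assuming tau_r == tau_f."""
    return (amp * tau / period
            * sinc(n * math.pi * tau / period)
            * sinc(n * math.pi * tau_r / period)
            * cmath.exp(-1j * n * math.pi * (tau + tau_r) / period))


def pulse_sample(t, amp, tau, tau_r, period, n_harm=1000):
    """x(t) = c_0 + 2 * Re(sum_{n>=1} c_n * exp(j*2*pi*n*t/T)), truncated at n_harm."""
    total = trapezoid_coeff(0, amp, tau, tau_r, period).real
    for n in range(1, n_harm + 1):
        c_n = trapezoid_coeff(n, amp, tau, tau_r, period)
        total += 2 * (c_n * cmath.exp(2j * math.pi * n * t / period)).real
    return total


AMP, TAU, TAU_R, PERIOD = 1.2, 0.625e-9, 50e-12, 10e-9
center = (TAU + TAU_R) / 2          # middle of the flat top for this phase reference
print(trapezoid_coeff(0, AMP, TAU, TAU_R, PERIOD).real)
print(pulse_sample(center, AMP, TAU, TAU_R, PERIOD))
print(pulse_sample(PERIOD / 2, AMP, TAU, TAU_R, PERIOD))
```

With the phase term of (16), the pulse starts rising at t = 0, so the flat top is centered at $(\tau+\tau_r)/2$ and the waveform is close to zero half a period later.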

Fig. 14 shows the single-bit responses obtained by this convolution and the inverse fast Fourier transform (IFFT). The figure compares the cases with and without the stub equalizer for the 2T-topology and 4T-topology in read mode and write mode. The transfer functions without a stub equalizer are obtained by replacing the $j \tan \beta_S l_S / Z_S$ term with 0 in (12) and (13), and the 2T-topology and 4T-topology cases are obtained with n = 2 and 4 in (12) and (13), respectively. The single-bit responses clearly show the effect of the stub equalizer.

#### B. Correlation of the PDA and Eye Diagram

PDA can be performed on the single-bit response obtained by convolution with the transfer function of the multitopology circuit, such as the results in Fig. 14. PDA is advantageous in optimizing the stub equalizer because it can handle thousands



Fig. 14. … trapezoidal input signal. (a) Write-mode case of 2T-topology. (b) Write-mode case of 4T-topology. (c) Read-mode case of 2T-topology. (d) Read-mode case of 4T-topology. [Plot axes: Time [ns] versus Voltage [V]; traces: trapezoidal source input, w/o open-stub line, w/ open-stub line.]

of "what if" cases in a shorter time than eye-diagram simulations, which rely on random patterns to produce the worst cases. In general, PDA shows the signal distortion due to intersymbol interference, which can be decomposed into two cases, worst-case "1" and worst-case "0", and analyzed [12], [13]. The eye edges due to the worst-case "1" and "0" are calculated by (19) and (20), respectively

$$S_{1}(t) = y(t) + \sum_{\substack{k=-\infty\\k\neq 0}}^{\infty} y(t - kT') \bigg|_{y(t - kT') < 0} \tag{19}$$

$$S_{0}(t) = \sum_{\substack{k=-\infty\\k\neq 0}}^{\infty} y(t - kT') \bigg|_{y(t - kT') > 0} \tag{20}$$

where y(t) is the single-bit pulse response of the interconnect, and T' is the symbol period (pulsewidth). To verify the correlation with the eye diagram, the eye edge of the PDA is calculated with k = 7 as a balance between accuracy and calculation time. A larger k would give a more accurate PDA, but a tradeoff is needed to keep the computation efficient. Here, T' is 625 ps, the same pulsewidth as the 1.6 Gb/s PRBS used as the input source of the eye diagrams in Section II.
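The worst-case sums in (19) and (20) can be sketched for a sampled single-bit response as follows. This is a minimal illustration rather than the article's code: it assumes a uniformly sampled response with an integer number of samples per symbol period, treats the response as zero outside the sampled window, and reads "k = 7" as truncating the sums to $|k| \le 7$. The toy response `y` is invented for the example.

```python
def pda_eye_edges(y, samples_per_ui, k_max=7):
    """Return the worst-case '1' edge S1 and worst-case '0' edge S0 per (19) and (20)."""
    def sample(i):
        # The response is assumed to be zero outside the sampled window.
        return y[i] if 0 <= i < len(y) else 0.0

    s1, s0 = [], []
    for i in range(len(y)):
        worst_one = sample(i)          # cursor term y(t) in (19)
        worst_zero = 0.0
        for k in range(-k_max, k_max + 1):
            if k == 0:
                continue
            v = sample(i - k * samples_per_ui)
            if v < 0:
                worst_one += v         # only negative ISI lowers the '1' level
            elif v > 0:
                worst_zero += v        # only positive ISI raises the '0' level
        s1.append(worst_one)
        s0.append(worst_zero)
    return s1, s0


# Toy response: a 1.2 V pulse one symbol wide, a +0.1 V echo, then a -0.05 V ripple.
ui = 4
y = [0.0] * ui + [1.2] * ui + [0.1] * ui + [-0.05] * ui + [0.0] * (2 * ui)
s1, s0 = pda_eye_edges(y, ui)
cursor = ui + ui // 2
print(s1[cursor], s0[cursor], s1[cursor] - s0[cursor])
```

The difference `s1[i] - s0[i]` at each sample gives the vertical eye opening at that sampling instant, from which an eye width at a fixed mask height could be read off.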

As shown in Fig. 15, the results of the eye diagram and the PDA almost match in terms of the eye edge. This comparison is for a 2T-topology circuit, and the main trace and stub equalizer conditions in write mode [see Fig. 15(a)] and read mode [see Fig. 15(b)] are the same as those used in Section II. Therefore, the PDA calculation used in this article



Fig. 15. Comparison of the eye diagram and PDA for the 2T-topology circuit. (a) Write mode. (b) Read mode.

can be sufficiently applied to the stub equalizer optimization method proposed in Section III-C.

#### C. Optimization

Three variables determine the characteristics of the stub equalizer that improves the SI of the topology circuit: the characteristic impedance ($Z_S$), length ($L_S$), and position ($P_S$) of the open stub. Optimizing these three variables requires thousands of "what if" experiments, and calculating the SI margin improvement of the stub equalizer for each of them with a general eye diagram may take a significant amount of time. However, evaluating the SI margin by a PDA calculation combined with an exhaustive (brute-force) search can be an effective strategy, because the exhaustive search finds all values of the variables that meet a specific condition within the range [14], [15].

The exhaustive search calculation with PDA follows the flowchart in Fig. 16. First, the Tx and Rx operating conditions of the circuit, including the multitopology information, are determined (input circuit). To optimize the conditions (variables) of the stub equalizer, the criteria can be chosen freely according to the direction of improvement desired from the stub. Fig. 17 shows an example of criteria chosen to improve the timing margin of the eye diagram relative to the default topology circuit without the stub equalizer. If the criteria are meant to improve the setup timing or hold timing with the allowed ripple fixed as described in Section II-B, an eye mask with a rectangular shape is effective. For example, when needed to improve the setup timing, the criteria to collect




Fig. 16. Flowchart for the exhaustive search using PDA.



Fig. 17. Example for the decision of criteria to improve timing margin.

the variables of the stub equalizer should be those that extend the eye mask to the left, as shown in Fig. 17. If the target is instead to improve the hold timing, the criteria can be changed to collect the results that extend the eye mask to the right. If stricter criteria are needed, the eye mask can be extended in both directions. In this article, the criteria were set so that the eye mask extends to the left to improve the setup timing margin.

Second, the exhaustive search is carried out by enumerating all cases for  $L_S$  and  $P_S$  ( =  $L_1$ ) of the stub and evaluating each case with the PDA used in Section III, which makes the "what if" simulations rapid under the predetermined criteria. The characteristic impedance  $(Z_S)$  of the stub is assumed to be fixed at 50  $\Omega$. For the two topologies of Section II, the stub equalizer variables range from 50 to 800 mils in length and from 100 to 700 mils in position. Assuming that both variables vary in steps of 20 mils, the numbers of cases for the length and position are 38 and 31, respectively. Therefore, the total number of combinations of the two variables is 1178.
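The case count and the search loop can be checked with a short sketch. The grid bounds and the 20-mil step come from the text; the function names, the simplified criterion (eye width larger than the no-stub baseline), and the toy evaluator are assumptions made only for illustration. In practice the evaluator would be a PDA calculation for the circuit under study.

```python
import itertools

STEP_MIL = 20
LENGTHS_MIL = range(50, 801, STEP_MIL)     # 50, 70, ..., 790 -> 38 values
POSITIONS_MIL = range(100, 701, STEP_MIL)  # 100, 120, ..., 700 -> 31 values

def exhaustive_search(eye_width_ps, baseline_ps):
    """Collect every (L_S, P_S) pair whose eye width exceeds the baseline.

    `eye_width_ps` is a hypothetical callable standing in for a PDA-based
    evaluation of one stub condition; `baseline_ps` is the no-stub eye width.
    """
    improved = set()
    for ls, ps in itertools.product(LENGTHS_MIL, POSITIONS_MIL):
        if eye_width_ps(ls, ps) > baseline_ps:
            improved.add((ls, ps))
    return improved

def toy_eye_width_ps(ls, ps):
    """Placeholder only, NOT a channel model: an arbitrary peak at (250, 300)."""
    return 300.0 - 0.1 * abs(ls - 250) - 0.05 * abs(ps - 300)

n_cases = len(LENGTHS_MIL) * len(POSITIONS_MIL)
print(len(LENGTHS_MIL), len(POSITIONS_MIL), n_cases)  # prints 38 31 1178
print(sorted(exhaustive_search(toy_eye_width_ps, baseline_ps=298.5)))
```

Running the search once per operating mode (write and read) gives one set of improved conditions per mode, which is the form needed for the later overlap step.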



Fig. 18. Contour map indicating improvement by the condition of the stub equalizer from the exhaustive search for the 2T-topology circuit. (a) Write mode. (b) Read mode.

 TABLE I

 Optimized Results of Stub Equalization for a 2T-Topology Circuit

| 2T-topology circuit         |                                   | Case #1 | Case #2 | Case #3 |
|-----------------------------|-----------------------------------|---------|---------|---------|
| Condition of stub equalizer | Length, L<sub>S</sub> [mil]       | 170     | 170     | 190     |
|                             | Position, P<sub>S</sub> [mil]     | 200     | 220     | 220     |
| Increase in eye width       | Write mode                        | 31 ps   | 25 ps   | 22 ps   |
|                             | Read mode                         | 29 ps   | 32 ps   | 26 ps   |

#### IV. VALIDATION

The effect of the stub equalizer variables obtained by optimization using PDA and exhaustive search should be confirmed in a time-domain simulation using a PRBS. In this section, the SI improvement provided by the optimized length and position of the stub equalizer for the 2T-topology and 4T-topology cases obtained in Section III is verified with an eye diagram.

#### A. 2T-Topology

Fig. 18 shows the result of the exhaustive search over the total of 1178 cases for a 2T-topology circuit performed under these conditions. As the third step in the flowchart, the cases that meet the criteria are collected into a contour map for the write and read modes. The blue area represents the range of variables over which the exhaustive search was actually run, and the red area indicates the stub equalizer variable values for which the eye width improves beyond the criteria. As can be seen in Fig. 18(a), for the write mode, 94 cases show an SI improvement, in the length  $(L_S)$  range of 350–520 mils at most positions  $(P_S)$. In the read mode, on the other hand, only six cases are improved. However, when the graphs [see Fig. 18(a) and (b)] are overlapped, three conditions of the stub equalizer, (490, 660), (490, 680), and (490, 700), that improve both the write mode and the read mode can be found as the final step of the flowchart.
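Overlapping the two contour maps amounts to intersecting the sets of improved conditions from each mode. The sketch below assumes each condition is written as ($L_S$, $P_S$) in mils, which is inferred from the quoted length range rather than stated explicitly. Only the three common conditions are taken from the text; the other points are made-up placeholders so that the two sets differ.

```python
def common_conditions(write_improved, read_improved):
    """Return the stub conditions that improve both modes, sorted for display."""
    return sorted(set(write_improved) & set(read_improved))

# The three conditions reported as improving both modes, as (L_S, P_S) in mils.
reported_common = {(490, 660), (490, 680), (490, 700)}

# Placeholder extras (NOT from the article) standing in for single-mode results.
write_improved = reported_common | {(350, 100), (410, 300)}
read_improved = reported_common | {(710, 120)}

print(common_conditions(write_improved, read_improved))
# prints [(490, 660), (490, 680), (490, 700)]
```

Using sets keeps this step independent of how each mode's search was run, as long as both use the same grid of stub conditions.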

Table I shows, based on the real-time eye diagram, how much the optimized conditions of the stub



Fig. 19. Contour map indicating improvement by the condition of the stub equalizer from the exhaustive search for the 4T-topology circuit. (a) Write mode. (b) Read mode.

 TABLE II

 Optimized Results of Stub Equalization for 4T-Topology Circuit

| 4T-topology circuit         |                               | Case #1 | Case #2 | Case #3 | Case #4 | Case #5 | Case #6 | Case #7 | Case #8 |
|-----------------------------|-------------------------------|---------|---------|---------|---------|---------|---------|---------|---------|
| Condition of stub equalizer | Length, L<sub>S</sub> [mil]   | 170     | 170     | 190     | 190     | 250     | 250     | 270     | 270     |
|                             | Position, P<sub>S</sub> [mil] | 200     | 220     | 220     | 240     | 180     | 200     | 140     | 160     |
| Increase in eye width       | Write mode                    | 25 ps   | 25 ps   | 35 ps   | 25 ps   | 44 ps   | 60 ps   | 50 ps   | 44 ps   |
|                             | Read mode                     | 3 ps    | 0 ps    | 0 ps    | -6 ps   | -6 ps   | -9 ps   | -12 ps  | -12 ps  |

equalizer for a 2T-topology circuit improve the eye width. Compared with the circuit before the stub is added, the three cases show an improvement of about 7%–10% in the write mode and about 6%–8% in the read mode.

#### B. 4T-Topology

Next, a circuit with 4T-topology is examined to verify whether this exhaustive search calculation with PDA is also applicable to a multitopology with four or more legs. The circuit used in this experiment has the same trace and Tx/Rx operating conditions as the 4T-topology case used in Section II. Fig. 19 shows the stub length and position values that improve the SI in a circuit with 4T-topology. In the write mode of Fig. 19(a), 51 cases improve the eye width, and in the read mode, 85 cases are effective, as shown in Fig. 19(b). The stub length and position values that can improve both modes are collected by overlapping the two graphs in the same way as in the previous 2T-topology analysis. For a circuit with 4T-topology, eight cases can improve the read and write modes.

The improvement of these eight optimized cases is given in Table II. There is an improvement of about 5%–13% in the write mode, whereas the read mode changes by only about -2% to 1%, tending toward a slight degradation. However, since the improvement in the write mode is relatively large and the loss in the read mode is very small, the stub equalizer can be considered effective even in multitopologies with four or more legs.
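The trade-off in Table II can be explored with a small sketch that picks the case with the largest write-mode gain among those whose read-mode loss stays within a chosen limit. The eye-width values are copied from Table II; the selection rule, the function name, and the loss limits are illustrative assumptions and are not part of the proposed method.

```python
# (case, L_S [mil], P_S [mil], write-mode change [ps], read-mode change [ps])
TABLE_II = [
    (1, 170, 200, 25, 3),
    (2, 170, 220, 25, 0),
    (3, 190, 220, 35, 0),
    (4, 190, 240, 25, -6),
    (5, 250, 180, 44, -6),
    (6, 250, 200, 60, -9),
    (7, 270, 140, 50, -12),
    (8, 270, 160, 44, -12),
]

def best_case(rows, max_read_loss_ps):
    """Largest write-mode gain among rows losing at most `max_read_loss_ps` in read mode."""
    allowed = [row for row in rows if row[4] >= -max_read_loss_ps]
    if not allowed:
        return None
    return max(allowed, key=lambda row: row[3])

print(best_case(TABLE_II, max_read_loss_ps=0))  # prints (3, 190, 220, 35, 0)
print(best_case(TABLE_II, max_read_loss_ps=9))  # prints (6, 250, 200, 60, -9)
```

With no read-mode loss allowed, Case #3 gives the largest write-mode gain; accepting up to 9 ps of read-mode loss makes Case #6 the choice under this hypothetical rule.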

#### V. CONCLUSION

In this article, a stub equalizer in the form of an open-stub line is proposed as a passive equalizer that can be used for SI improvement in a bidirectional high-speed interface circuit, such as the host processor-NAND memory interface of an SSD module. This stub equalizer increases the timing margin by improving the eye width. The lattice diagram shows that this improvement results from the reflection mechanism created by the impedance mismatch of the trace, which depends on the length and location of the stub. In addition, an exhaustive search technique using PDA is proposed to effectively find the optimized condition of the stub equalizer. It is verified that this methodology can be applied not only to 2T-topology circuits but also to circuits having four or more topological legs. The methodology using this stub equalization is expected to be widely used in a variety of high-speed memory interface circuits that use multitopology to implement high capacity.

#### REFERENCES

- [1] C.-C. Chiu, K.-Y. Yang, Y.-H. Lin, W.-S. Wang, T.-Y. Wu, and R.-B. Wu, "A novel dual-sided fly-by topology for 1–8 DDR with optimized signal integrity by EBG design," *IEEE Trans. Compon., Packag. Manuf. Technol.*, vol. 8, no. 10, pp. 1823–1829, Oct. 2018.
- [2] D. Wang, B. Ganesh, N. Tuaycharoen, K. Baynes, A. Jaleel, and B. Jacob, "DRAMsim: A memory system simulator," *ACM SIGARCH Comput. Archit. News*, vol. 33, no. 4, pp. 100–107, 2005.
- [3] K.-Il Oh, L.-S. Kim, K.-Il Park, Y.-H. Jun, J. S. Choi, and K. Kim, "A 5-Gb/s/pin transceiver for DDR memory interface with a crosstalk suppression scheme," *IEEE J. Solid-State Circuits*, vol. 44, no. 8, pp. 2222–2232, Aug. 2009.
- [4] Open NAND Flash Interface Workgroup, "Open NAND Flash Interface Specification Revision 4.1," 2017.
- [5] H.-J. Kim et al., "7.6 1Gb/s 2Tb NAND flash multi-chip package with frequency-boosting interface chip," in Proc. IEEE Int. Solid-State Circuits Conf.—Dig. Tech. Papers, 2015, pp. 1–3.
- [6] C. Kim *et al.*, "A 21 nm high performance 64 Gb MLC NAND flash memory with 400 MB/s asynchronous toggle DDR interface," *IEEE J. Solid-State Circuits*, vol. 47, no. 4, pp. 981–989, Apr. 2012.
- [7] S.-Y. Huang, Y.-S. Cheng, K.-Y. Yang, and R.-B. Wu, "Fast prediction and optimal design for eye-height performance of mismatched transmission lines," *IEEE Trans. Compon., Packag. Manuf. Technol.*, vol. 4, no. 5, pp. 896–904, May 2014.
- [8] G. Kim et al., "Modeling of eye-diagram distortion and data-dependent jitter in meander delay lines on high-speed printed circuit boards (PCBs) based on a time-domain even-mode and odd-mode analysis," *IEEE Trans. Microw. Theory Techn.*, vol. 56, no. 8, pp. 1962–1972, Aug. 2008.
- [9] S. H. Hall, G. W. Hall, and J. A. McCall, *High-Speed Digital System Design: A Handbook of Interconnect Theory and Design Practices*. New York, NY, USA: Wiley, 2000, pp. 92–94.
- [10] D. M. Pozar, *Microwave Engineering*. Hoboken, NJ, USA: Wiley, 2012, pp. 188–194.
- [11] N. Oswald, B. H. Stark, D. Holliday, C. Hargis, and B. Drury, "Analysis of shaped pulse transitions in power electronic switching waveforms for reduced EMI generation," *IEEE Trans. Ind. Appl.*, vol. 47, no. 5, pp. 2154–2165, Sep./Oct. 2011.
- [12] E. Song, J. Cho, J. Kim, Y. Shim, G. Kim, and J. Kim, "Modeling and design optimization of a wideband passive equalizer on PCB based on near-end crosstalk and reflections for high-speed serial data transmission," *IEEE Trans. Electromagn. Compat.*, vol. 52, no. 2, pp. 410–420, May 2010.
- [13] B. K. Casper, M. Haycock, and R. Mooney, "An accurate and efficient analysis method for multi-Gb/s chip-to-chip signaling schemes," in *Proc. Symp. VLSI Circuits Dig. Tech. Papers*, 2002, pp. 54–57.
- [14] M. Z. Coban and R. M. Mersereau, "A fast exhaustive search algorithm for rate-constrained motion estimation," *IEEE Trans. Image Process.*, vol. 7, no. 5, pp. 769–773, May 1998.
- [15] L. Neumann and J. Matas, "Text localization in real-world images using efficiently pruned exhaustive search," in *Proc. Int. Conf. Document Anal. Recognit.*, 2011, pp. 687–691.



**Taelim Song** (Member, IEEE) received the B.S. degree in electronics and radio engineering from Kyung Hee University, Seoul, South Korea, in 2003, the M.S. degree in electrical and electronic engineering from Yonsei University, Seoul, South Korea, in 2005, and the Ph.D. degree in electrical and electronic engineering from Yonsei University, Seoul, South Korea, in 2016.

From January 2005 to September 2010, he was a chip-package-PCB SI/PI/EMI Analysis Engineer with LG Electronics Inc., South Korea, and from July 2015 to February 2020, he was a chip-package-PCB SI/PI Analysis Engineer with SK Hynix Inc., South Korea. Since 2020, he has been a Visiting Assistant Research Professor with the Electrical and Computer Engineering Department, Missouri University of Science and Technology, Rolla, MO, USA. His main research interests include high-speed interface circuits, signal/power integrity, and electromagnetic interference field analysis in digital circuit systems. Recently, his research has consisted of the analysis of RF defense and signal integrity of a storage system.



**Jongjoo Lee** (Member, IEEE) received the M.S. and Ph.D. degrees in electrical engineering from the Korea Advanced Institute of Science and Technology, Daejeon, South Korea, in 1997 and 2001, respectively.

His doctoral dissertation demonstrated the world-first development of photoconductive vectorial electric near-field probes using micromachining for transient mapping of picosecond electric-pulse propagation phenomena. In 2002, he joined the Package team, Memory division, Samsung Electronics, Hwasung, South Korea, where, as an SI/PI Leader, he developed the SI/PI cosimulation method that is still widely used worldwide and the high-performance and high-density stack-package solutions. After leading the electromagnetic interference TF at the DRAM design team from 2009 to 2010, he was appointed to the Flash Solution development team, where he directed the design and development of SSD hardware, board design guides for mobile memory customers, and the SI/Phy/EMC of NAND-storage devices until 2018. He was a Visiting Scholar with Shanghai Jiaotong University, China, and the Missouri University of Science and Technology, Rolla, MO, USA, in 2015 and 2019, respectively. In 2020, he joined SK Hynix Inc., Seongnam-si, South Korea, where he is currently a Fellow of the Head of Solution Design and Integration Group. He has authored or coauthored more than 50 patents and 40 papers. His current research interests include SI/PI/EMC and codesign and integration of device to the system.



**Chulsoon Hwang** (Senior Member, IEEE) received the B.S., M.S., and Ph.D. degrees in electrical engineering from the Korea Advanced Institute of Science and Technology, Daejeon, South Korea, in 2007, 2009, and 2012, respectively.

From 2012 to 2015, he was a Senior Engineer with Samsung Electronics, Suwon, South Korea. In July 2015, he joined the Missouri University of Science and Technology (formerly University of Missouri-Rolla), Rolla, MO, USA, where he is currently an Assistant Professor. His research interests include RF defense, signal/power integrity in high-speed digital systems, electromagnetic interference/electromagnetic compatibility (EMC), hardware security, and machine learning.

Dr. Hwang was the recipient of the AP-EMC Young Scientist Award, the Google Faculty Research Award, and Missouri S&T's Faculty Research Award, and the corecipient of the IEEE EMC Best Paper Award, the AP-EMC Best Paper Award, and a two-time corecipient of the DesignCon Best Paper Award.