The data rate of VLSI interconnections has been increasing in response to the demand for high-speed operation of semiconductor devices such as CPUs. To realize high-performance VLSI systems, high-speed data communication has become an important factor. However, at high data rates, it is difficult to achieve accurate communication without bit errors because of inter-symbol interference (ISI). This paper presents high-speed data communication techniques for VLSI systems using Tomlinson-Harashima Precoding (THP). Since THP can eliminate the ISI while limiting the average and peak power of transmitter signaling, THP is suitable for implementing advanced low-voltage VLSI systems. In this paper, 4-PAM (pulse amplitude modulation) with THP is employed to achieve high-speed data communication in VLSI systems. Simulation results show that THP can remove the ISI without increasing the peak and average power of a transmitter. Moreover, the simulation results show that multiple-valued data communication is very effective in reducing the implementation costs of high-speed serial links.
key words: high-speed interface, Tomlinson-Harashima precoding (THP), inter-symbol interference, multiple-valued signaling, low-voltage VLSI systems
Introduction
Demand for data rates at Gbps levels and beyond has been increasing in order to manage the exponentially growing data traffic in VLSI communications. However, in such high-speed serial links, channel distortion such as inter-symbol interference (ISI) and noise significantly limit the I/O bandwidth relative to device performance. To avoid noise effects, efficient transmission techniques using coding methods, such as MV-CDMA, have been proposed [1]. For data rates at Gbps levels, the main difficulty is the equalization of signal distortion caused by the loss of the interconnections.
In VLSI systems, strip lines (SL) and/or microstrip lines (MSL) are widely used as interconnections. However, at Gbps data rates, these interconnections exhibit complex behavior that makes accurate transmission of information difficult. The reason is that the interconnections act as low-pass filters because of the skin effect [2] and dielectric loss, which attenuate the high-frequency components of signals. Therefore, the waveform of the transmitter signal is distorted at the receiver after passing through a bandwidth-limited interconnection. This distortion introduces ISI, which causes bit errors because it makes 0/1 detection at the receiver difficult [3].
Several signal processing techniques for pulse shaping have been proposed to reduce ISI at the receiver. In VLSI systems, pre-emphasis and post-equalizers, such as decision feedback equalizers (DFEs), are widely employed. Pre-emphasis cancels the ISI at the transmitter by emphasizing the rising and falling edges of the pulse shapes. These techniques are very effective for reducing ISI. However, pre-emphasis is well known to increase the peak and average power at the transmitter. If the characteristics of a transmission line H(z) are known in advance, ISI can be eliminated completely using signal processing techniques such as Tomlinson-Harashima Precoding (THP), which was proposed independently in 1971 by Tomlinson [4] and Harashima [5]. THP has been applied mainly to wireless communication systems and optical transmission systems [6].
THP can remove, at the transmitter, the ISI contributed by the post-cursors of the waveform response, and it can be realized by a digital filter that has a modulo-N adder. Because THP can limit the peak and average power of transmitter signaling by modulo-N arithmetic, it is suitable for implementation in high-speed serial links for low-voltage VLSI systems [7]. Recently, future memory link systems using THP in 22-nm silicon-on-insulator (SOI) CMOS have been proposed, and their advantages have been clarified [8].
This paper evaluates high-speed data transmission over interconnections in VLSI systems using MSLs, together with the hardware cost of THP. In [9], the authors evaluated the fundamental performance of THP for electrical communication using coaxial cables. To estimate THP performance for high-speed serial links for VLSI systems in a practical environment, this paper discusses transmission performance based on the measured characteristics of an MSL evaluation board. Using this transmission line, 2-PAM (binary) and 4-PAM (4-valued) data communication are compared by a numerical simulation based on measurement results of the test board. From these simulation results, the effectiveness of THP using multiple-valued signaling is discussed from the viewpoint of hardware costs.
In this paper, Sect. 2 provides measurement and simulation results for the transmission characteristics of a MSL. In Sect. 3, we provide simulations of high-speed data communication on the MSL. In Sect. 4, we demonstrate the effectiveness of THP for high-speed serial links. In Sect. 5, we discuss the hardware costs of the THP implementation. Section 6 concludes our paper.
Copyright © 2014 The Institute of Electronics, Information and Communication Engineers
Evaluation of a Microstrip Line
To estimate the transmission characteristics of VLSI interconnections, an evaluation board of a MSL was fabricated. The length of the evaluation board is 1 m, and the line is made on an FR-4 substrate, assuming VLSI interconnections such as backplane systems and serial memory link systems. As shown in Fig. 1, the line width was designed to realize a 50 Ω characteristic impedance [10]. Figure 2 shows the S21 parameter of the 1 m MSL test board, which was measured using a vector network analyzer (Agilent E8357A). As shown in Fig. 2 (a), the attenuation increases with frequency because of the skin effect and the dielectric loss of the MSL, so the line behaves as a low-pass filter. Figure 2 (b) shows that the phase delay increases with frequency. Figure 2 (c) shows that the group delay is flat over frequency.
Measurement of Frequency Characteristics of a MSL
As shown in Fig. 2 (a), the S21 magnitude is −5.8 dB at 1 GHz, which corresponds to the signal amplitude decaying to 51.2% at the receiver. At 2 GHz and 5 GHz, the values are −10.4 dB and −26.3 dB, which means that the transmitted signals are attenuated to 30.2% and 4.8%, respectively.
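The relation between the quoted dB values and the remaining amplitude can be checked with a short calculation. The following is a minimal sketch, assuming the usual voltage-ratio convention (20 log10); the function and variable names are illustrative and not taken from the paper. For −5.8 dB it gives about 51.3%, which agrees with the quoted 51.2% to within the rounding of the dB value.

```python
def db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a voltage gain in dB (negative for loss) to a linear amplitude ratio."""
    return 10.0 ** (gain_db / 20.0)


# S21 magnitudes quoted in the text for the 1 m MSL (frequency in Hz -> dB)
measured_s21_db = {1e9: -5.8, 2e9: -10.4, 5e9: -26.3}

# Percentage of the transmitted amplitude that remains at the receiver
remaining_percent = {
    freq: 100.0 * db_to_amplitude_ratio(gain_db)
    for freq, gain_db in measured_s21_db.items()
}

for freq, percent in remaining_percent.items():
    print(f"{freq / 1e9:.0f} GHz: {percent:.1f}% of the transmitted amplitude")
```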
Simulation of Time Domain Characteristics of a MSL
To simulate the time-domain characteristics of the MSL, an impulse response is obtained from the measured frequency characteristics by applying an Inverse Fast Fourier Transform (IFFT). Figure 3 shows the impulse response waveforms of the 1 m and 2 m MSLs. As shown in Fig. 3, the attenuation and delay increase with the length of the MSL, resulting in different waveforms. Using these impulse responses, the signal at the receiver can be calculated as the convolution of the impulse response with the transmitter signal.
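The two steps described above (frequency response to impulse response, then convolution with the transmitter signal) can be sketched as follows. This is a simplified illustration and not the simulation code used in the paper: the inverse transform is a naive inverse DFT rather than an FFT, and the impulse response `h` and the symbol sequence `tx` are hypothetical symbol-spaced values chosen only to make the ISI visible.

```python
import cmath


def inverse_dft(spectrum):
    """Naive O(N^2) inverse DFT; returns the real part of each time sample."""
    n = len(spectrum)
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * t / n) for k, x in enumerate(spectrum)).real / n
        for t in range(n)
    ]


def convolve(signal, impulse_response):
    """Full discrete convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for k, h_k in enumerate(impulse_response):
            out[i + k] += s * h_k
    return out


# A flat spectrum corresponds to an ideal (distortion-free) impulse.
flat = inverse_dft([1.0, 1.0, 1.0, 1.0])

# Hypothetical lossy channel: main cursor followed by two post-cursors.
h = [0.5, 0.25, 0.125]
# 2-PAM symbols 1, 0, 0, 1 mapped to +0.5 V / -0.5 V as in the text.
tx = [0.5, -0.5, -0.5, 0.5]
rx = convolve(tx, h)
print(rx)
```

In this toy example the fourth symbol (+0.5 V) arrives as `rx[3] = 0.0625`, whereas an isolated +0.5 V symbol would arrive as 0.25; the difference is the ISI left by the preceding symbols.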
Simulation of High-Speed Data Communication on VLSI Interconnections
In this section, accuracy and validity of our simulation are shown. Also, ISI effects of binary signaling (2-PAM) and 4-valued signaling (4-PAM) on VLSI interconnections are simulated.
Evaluation of Numerical Simulation
The data symbols of binary signaling (2-PAM) are 0 or 1. Figures 4 and 5 show the transmitter and receiver waveforms for 2 Gbps and 5 Gbps data communication over the 1 m MSL. The bits 0 and 1 are mapped to the voltage levels −0.5 V and +0.5 V, respectively. This simulation focuses on verifying the effects of ISI on high-speed data transmission, so AWGN (Additive White Gaussian Noise) is not considered. The waveforms at the receiver are distorted, and the distortion grows as the data rate increases because of propagation loss. Figure 5 shows that the symbol decision between 0 and 1 at the receiver becomes difficult at a 5 Gbps data rate because of the severe ISI.
To confirm these simulation results, the actual eye diagram of the evaluation board was measured. As shown in Fig. 6 (a), a pulse generator produces a random pulse pattern of 0s and 1s, which is applied to the 1 m MSL. A sampling oscilloscope then observes the transmitted waveform and draws the eye diagram. Figure 6 (b) shows the actual measurement system. A D3186 (ADVANTEST) and a WaveMASTER 8500A (LeCroy) are used as the pulse generator and the sampling oscilloscope, respectively. Figure 7 shows our numerical simulation results under the measurement conditions. A comparison of the simulation results and the experimental measurements (Fig. 8) indicates good agreement between the two. Figure 9 shows the eye diagram at 2 Gbps over 1 m, which was simulated using a circuit simulator, Agilent ADS (Advanced Design System). The measured S21 parameter of the test board can be imported into a transmission line model in ADS (Fig. 9 (a)). Comparing Fig. 9 (b) with Fig. 7, our numerical simulation method obtains the same results. In the following sections, we evaluate THP in terms of hardware costs by using our numerical simulation.
Binary Signaling Transmission on a MSL
Figures 10 (a)-(c) show simulated eye diagrams of binary signaling. As shown in Fig. 10 (a), the eye is open at 2 Gbps on the 2 m MSL. Hence, the amplitude information 0 or 1 can be detected by a threshold voltage as same as higher than 0.3, and it is hard to detect the information at the receiver.
4-Valued Data Signaling Transmission on a MSL
In 4-PAM, the transmitter signals are 4-valued, so each symbol carries 2 bits (00/01/10/11) of information using a 4-level signal. As a result, the symbol rate is half the bit rate. Figures 11 (a) and 11 (b) are eye diagrams of 4-PAM at a data rate of 2 Gsymbol/s (Gsps), which corresponds to binary 4 Gbps, on the 1 m and 2 m MSLs, respectively.
Although the ISI effect becomes more pronounced as the number of PAM levels increases, the eye is slightly open at 2 Gsps on the 1 m MSL, and hence the data can be detected correctly. However, the eye is completely closed on the 2 m MSL because of severe ISI, as shown in Fig. 11 (b), and hence bit decision is difficult.
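The bit-to-symbol mapping and the resulting symbol rate can be illustrated with a short sketch. The 2-PAM levels follow the text (−0.5 V and +0.5 V); the 4-PAM voltage levels and their assignment to bit pairs are not given in the paper and are assumed here purely for illustration, as are all function names.

```python
import math

# 2-PAM levels follow the text. The 4-PAM levels and their bit assignment
# are hypothetical (equally spaced, natural binary order).
PAM2_LEVELS = {(0,): -0.5, (1,): 0.5}
PAM4_LEVELS = {(0, 0): -0.75, (0, 1): -0.25, (1, 0): 0.25, (1, 1): 0.75}


def map_bits(bits, levels):
    """Group bits into symbols and map each group to its voltage level."""
    bits_per_symbol = len(next(iter(levels)))
    if len(bits) % bits_per_symbol != 0:
        raise ValueError("bit count must be a multiple of the bits per symbol")
    return [
        levels[tuple(bits[i:i + bits_per_symbol])]
        for i in range(0, len(bits), bits_per_symbol)
    ]


def symbol_rate(bit_rate_bps, num_levels):
    """Symbol rate needed to carry bit_rate_bps with num_levels-PAM."""
    return bit_rate_bps / math.log2(num_levels)


bits = [0, 0, 1, 1, 0, 1, 1, 0]
print(map_bits(bits, PAM2_LEVELS))  # eight 2-PAM symbols
print(map_bits(bits, PAM4_LEVELS))  # four 4-PAM symbols
print(symbol_rate(4e9, 4))          # symbol rate for 4 Gbps with 4-PAM
```

The same eight bits occupy four symbols instead of eight, which is why 4 Gbps can be carried at 2 Gsps.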
High-Speed Data Communication Using Tomlinson-Harashima Precoding
Tomlinson-Harashima Precoding Techniques
Because THP can limit the peak and average power of transmitter signaling, it is suitable for implementation in high-speed serial links for advanced low-voltage VLSI systems. To eliminate ISI effects at the receiver and thereby achieve high-speed interfaces, we have previously applied Tomlinson-Harashima Precoding (THP) to data transmission for VLSI systems [7]. Figure 12 illustrates a block diagram of the THP circuitry. As shown in Fig. 12, THP can be achieved with a digital filter that uses a modulo-N adder instead of a conventional adder. The output of the modulo-N adder always lies between −N/2 and +N/2. Consequently, the dynamic range of the THP output is limited to between −N/2 and +N/2, which has the advantage of limiting the average and peak power of the transmitter output. The modulo-N adder can be implemented by using digital adder circuits.
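The folding performed by the modulo-N adder can be sketched in a few lines. This is an illustrative software model, not the circuit of Fig. 12; the half-open interval [−N/2, +N/2) is a convention assumed here, since the text only states that the output lies between −N/2 and +N/2.

```python
def modulo_n(x: float, n: float) -> float:
    """Fold x into the half-open interval [-n/2, +n/2) (assumed convention)."""
    return (x + n / 2.0) % n - n / 2.0


# With N = 2, any input is folded back into [-1, +1).
for value in (0.75, 1.25, -1.5, 3.0):
    print(value, "->", modulo_n(value, 2.0))
```

Values already inside the interval pass through unchanged, while values outside it are shifted by a multiple of N, which is what bounds the transmitter amplitude.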
The weighting coefficients of THP (−h_1, −h_2, −h_3, ..., −h_n) are determined by the transfer function H(z) of a transmission line. Assuming that the transmission line is characterized by the transfer function

$$H(z) = 1 + h_1 z^{-1} + h_2 z^{-2} + \cdots + h_n z^{-n}$$

using the z-transform, the coefficients are −h_1, −h_2, −h_3, ..., −h_n. The THP coefficients h_1, h_2, h_3, ..., h_n are values of the impulse response sampled at the transmission data rate. The number of taps for THP depends on the characteristics of H(z), especially the post-cursor length. Note that the number of taps directly corresponds to circuit costs, as shown in Fig. 12. Therefore, the number of taps should be kept as small as the channel property allows so as to reduce the circuit costs and the power dissipation of the transceiver circuitry. Figure 13 shows the proposed data transmission system using THP. The data symbols a_i are modulated at the transmitter. The output data of THP b_i is

$$b_i = \left( a_i - \sum_{k=1}^{n} h_k\, b_{i-k} \right) \bmod N$$

and therefore the magnitude of b_i is limited between −N/2 and +N/2. At the receiver, the data after transmission is modulo-N reduced, and THP can cancel the ISI completely, resulting in ISI-free communication. However, as shown in Fig. 13, a Digital to Analog Converter (DAC) is necessary at the transmitter to convert the output data b_i to analog signals, and an Analog to Digital Converter (ADC) is required at the receiver to demodulate the signal. In high-speed data transmission using THP, high-speed DACs and ADCs are therefore important factors, and this is an issue for THP implementation in VLSI.
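The complete chain (precoding, channel, modulo-N reduction at the receiver) can be sketched as follows. This is a minimal noise-free model, assuming a channel normalized so that H(z) = 1 + h_1 z^{-1} + ... + h_n z^{-n} and a precoder of the form b_i = (a_i − Σ h_k b_{i−k}) mod N. The post-cursor values, the symbol pattern, and all function names are hypothetical, and DAC/ADC quantization is ignored.

```python
def modulo_n(x, n):
    """Fold x into the half-open interval [-n/2, +n/2)."""
    return (x + n / 2.0) % n - n / 2.0


def thp_precode(symbols, post_cursors, n):
    """Compute b_i = (a_i - sum_k h_k * b_{i-k}) mod N for each input symbol."""
    out = []
    for a in symbols:
        isi = sum(
            h_k * out[-k]
            for k, h_k in enumerate(post_cursors, start=1)
            if k <= len(out)
        )
        out.append(modulo_n(a - isi, n))
    return out


def channel(tx, post_cursors):
    """Apply H(z) = 1 + h_1 z^-1 + ... + h_n z^-n (symbol-spaced, no noise)."""
    taps = [1.0] + list(post_cursors)
    return [
        sum(taps[k] * tx[i - k] for k in range(len(taps)) if i - k >= 0)
        for i in range(len(tx))
    ]


def receive(rx, n):
    """Modulo-N reduction at the receiver."""
    return [modulo_n(y, n) for y in rx]


N = 2.0                               # modulo base for 2-PAM levels of +/-0.5
post_cursors = [0.75, 0.5]            # hypothetical h_1, h_2
symbols = [0.5, -0.5, -0.5, 0.5, 0.5, -0.5, 0.5, -0.5]

tx = thp_precode(symbols, post_cursors, N)
rx = channel(tx, post_cursors)
recovered = receive(rx, N)
print(tx)
print(recovered)
```

In this example the fourth precoder output wraps around, so the raw received sample is −1.5 rather than +0.5; the receiver's modulo-N reduction maps it back to +0.5. The transmitted values stay within [−1, +1) throughout.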
Binary Data Communication Using THP
To confirm the principle, THP is applied to binary data communication on MSLs. Figure 14 shows transmitter signaling using THP under various conditions. Figure 14 (a) is the THP output signaling at 5 Gbps on the 1 m MSL, and Figs. 14 (b) and 14 (c) are the THP output signaling at 2 Gbps and 5 Gbps on the 2 m MSL. The numbers of THP taps are 7, 9, and 11, respectively. At 5 Gbps, as shown in Figs. 14 (a) and 14 (c), the amplitude is limited between −1 and 1 by modulo arithmetic. At 2 Gbps (Fig. 14 (b)), however, the output waveform is similar to that of pre-emphasis, because the THP coefficients are stable in this condition. Figure 15 shows simulated eye diagrams of 2-PAM with THP. Figures 15 (a) and 15 (b) are for a 2 Gbps data rate on the 1 m and 2 m MSLs, respectively, and Fig. 15 (c) is for a 5 Gbps data rate on the 2 m MSL. After modulo-N reduction using an ADC, THP removes the ISI, and the eyes are clearly open, as shown in Fig. 15.
Here, 4-PAM at 2 Gsps corresponds to 4 Gbps binary data transmission. This means that the operating frequency of 4-PAM THP circuitry can be halved compared to that of the binary one, which contributes to low-power operation. As shown in these figures, THP can provide a clear eye diagram in multiple-valued data communication, and the 4-valued data symbols can be detected correctly at the receiver. Note that the ADC/DAC-based THP transceiver circuits for binary and 4-PAM are the same except for the required number of taps. Since 4-valued signaling lowers the symbol rate, ISI effects are also mitigated. Therefore, 4-PAM THP can use fewer taps than binary THP, which decreases hardware costs.
However, this simulation does not consider AWGN in transmission lines. To evaluate more realistic transmission on MSLs, it is necessary to consider AWGN effects that are determined by the channel environment. 
Discussion of Implementation Costs of THP
The achievable data rate of serial links using THP is mainly limited by the operating frequency of the transceiver circuitry, especially the ADCs and DACs. For instance, to achieve a 2 Gbps data rate in 2-PAM, high-performance circuits that can operate at 2 GHz are required. In contrast, multiple-valued signaling, which reduces the symbol rate and hence the operating frequency of the circuits, is very effective for achieving high-speed serial links in VLSI systems. Figure 17 shows the eye diagram of 4-PAM at 2.5 Gsps (5 Gbps) with THP. Compared with binary signaling at the same data rate of 5 Gbps (Fig. 15 (a)), a similar eye diagram is obtained. These waveforms are fed to the same ADCs to perform modulo-N reduction. Therefore, 4-PAM THP offers two advantages over binary THP: the operating frequency is halved, and the number of THP taps is reduced. For instance, in these simulations, binary THP needs to operate at 5 GHz using 7 taps, whereas 4-PAM THP operates at 2.5 GHz using 5 taps. As a result, the reduction of both the operating frequency and the number of taps (Table 1) using 4-PAM greatly contributes to decreasing the power consumption of THP. Moreover, since THP consists of digital-rich ADC/DAC-based architectures, the performance of THP can be improved by CMOS scaling.
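As a rough illustration of how the two reductions combine, the figures quoted above can be plugged into a simple product of tap count and operating frequency. This product is only a crude proxy introduced here for illustration; it is not a cost metric used in the paper and does not model actual power consumption.

```python
def tap_updates_per_second(num_taps: int, operating_frequency_hz: float) -> float:
    """Crude workload proxy (an assumption): taps evaluated per second."""
    return num_taps * operating_frequency_hz


# Figures quoted in the text for 5 Gbps on the simulated channel
binary_thp = tap_updates_per_second(7, 5e9)    # 2-PAM: 7 taps at 5 GHz
pam4_thp = tap_updates_per_second(5, 2.5e9)    # 4-PAM: 5 taps at 2.5 GHz

ratio = pam4_thp / binary_thp
print(f"4-PAM / binary workload proxy: {ratio:.2f}")
```

Under this proxy the 4-PAM precoder evaluates roughly a third as many taps per second as the binary one, which is consistent in direction with the power reduction argued in the text.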
To reduce the hardware costs, bit resolution for the THP operation is important. As shown in Fig. 18 , bit resolution of THP coefficients is fixed at L = 8-bit. converter is required for THP implementation in this channel environment.
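Fixing the coefficient resolution at L bits can be pictured as rounding each THP coefficient to a fixed-point grid. The sketch below assumes a uniform signed quantizer over [−1, +1) and uses hypothetical coefficient values; the quantization scheme actually used for Fig. 18 is not specified in the text.

```python
def quantize(value: float, bits: int, full_scale: float = 1.0) -> float:
    """Round value to a signed fixed-point grid of 2**bits steps over
    [-full_scale, +full_scale); out-of-range inputs are clipped."""
    step = 2.0 * full_scale / (2 ** bits)
    code = round(value / step)
    code = max(-(2 ** (bits - 1)), min(2 ** (bits - 1) - 1, code))
    return code * step


L = 8                                   # coefficient resolution from the text
coefficients = [0.3, -0.12, 0.05]       # hypothetical THP coefficients
quantized = [quantize(c, L) for c in coefficients]
errors = [abs(q - c) for q, c in zip(quantized, coefficients)]
print(quantized)
print(max(errors))
```

With L = 8 the grid step is 2/256, so the rounding error of any in-range coefficient is at most half a step (about 0.0039).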
Conclusion
This paper evaluated high-speed data transmission systems using Tomlinson-Harashima Precoding. First, transmission characteristics of a MSL were measured and simulated, and eye diagrams of 2-PAM and 4-PAM were evaluated using numerical simulation. Next, the feasibility of the data transmission on a MSL with THP was demonstrated. To evaluate hardware costs, the required bit resolution of THP implementation was simulated.
The simulation results show that THP can remove ISI effects, and thus high-speed data transmission on a MSL can be achieved. By using multiple-valued signaling with THP, the operating frequency and implementation cost can be reduced. As a result, 4-PAM THP has the potential to realize high-speed serial links for VLSI systems, such as backplanes. Furthermore, compared to conventional analog pre-emphasis, THP can change the parameters easily using digital calibration. Further simulations are needed to evaluate various transmission lines.
