Abstract: This paper proposes the use of orthogonal coding to allow simultaneous communication over a digital link without the use of a modulation scheme. The interconnect consists of a complex, lossy and noisy channel (consistent with designs in a Hewlett-Packard computer system) and a novel transceiver.
I Introduction
Simultaneous bi-directional signaling has conventionally been used to effectively double the bandwidth of a link while conserving resources [4, 5]. This has been achieved by means of a subtraction mechanism that compares the received signal with the transmitted signal from the near-transmitter and thus obtains the far-transmitter's signal. To do this accurately, one must build a replica driver so that the near-transmitter's signal seen by the receiver is an exact copy of the component of this signal on the actual line; this is very difficult for complex channels owing to the mismatches and discontinuities present. Another issue that arises is reverse-channel crosstalk [2, 6, 7].
0-7803-7451-7/02/$17.00 © 2002 IEEE
In this paper we present an alternate design for a simultaneous bi-directional link that does not implement a subtraction circuit. Both transmitters are allowed to access the line simultaneously, but each uses a coded signal. This coded signal is obtained by hashing the data stream with a code stream that is generated in the transmitter. The two codes (one for each transmitter) are orthogonal (or pseudo-orthogonal), which means that they have zero (or nearly zero) cross-correlation [6, 7]. Many codes could be constructed to perform the desired function [6, 7]. The longer the code, the higher the switching rate (or coded-data-rate) and thus the greater the losses (due to dielectric losses in the channel). The simplest Walsh code [2, 6, 8] doubles the data-rate (for one transmitter there is no change in data-rate) and also provides for very simple synchronization. This code (used in our design) is a 2-bit code (per data bit) for each of the two transmitters; that is, Code1 = [1 0] and Code2 = [1 1]. The data is assumed to swing between 1 and 0. It is useful to note that one code does nothing to alter the data, while still providing separability. This implies that no extra hardware is required for coding in one of the transmitters and no code-synchronization is required in one of the receivers. Very interestingly, these two codes are orthogonal no matter what the phase offset between them is.
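The properties of the two codes can be checked with a short sketch. It is illustrative only and is not taken from the design itself: the function names are hypothetical, coding is modeled as a bitwise XNOR of each data bit with the code, and orthogonality is evaluated on bipolar levels (0 mapped to -1, 1 mapped to +1), which is an assumption about how the correlation is defined.

```python
CODE1 = [1, 0]  # alters the data: one data bit becomes [d, not d]
CODE2 = [1, 1]  # leaves the data unchanged: one data bit becomes [d, d]

def xnor_spread(bit, code):
    """Code one data bit by XNORing it with every bit of the code."""
    return [1 - (bit ^ c) for c in code]

def to_bipolar(bits):
    """Map logic levels 0/1 to -1/+1 (assumed signal representation)."""
    return [2 * b - 1 for b in bits]

def cyclic_cross_correlation(a, b, shift):
    """Correlate a with b after cyclically shifting b by `shift` chips."""
    n = len(a)
    return sum(a[i] * b[(i + shift) % n] for i in range(n))

# Coded chips for data bit 0 and data bit 1 under each code.
coded = {bit: (xnor_spread(bit, CODE1), xnor_spread(bit, CODE2)) for bit in (0, 1)}

# Cross-correlation of the two codes for every cyclic phase offset.
correlations = [
    cyclic_cross_correlation(to_bipolar(CODE1), to_bipolar(CODE2), s)
    for s in range(len(CODE1))
]

print(coded)         # {0: ([0, 1], [0, 0]), 1: ([1, 0], [1, 1])}
print(correlations)  # [0, 0]
```

Because Code2 is constant over the bit period, shifting it has no effect, which is why the cross-correlation stays at zero for every offset in this model.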
There are several advantages in using orthogonal codes to allow simultaneous bi-directional signaling. While maintaining an amount of hardware consistent with the previous implementations [4, 5] (i.e. the subtractor circuit is replaced by a correlator in one receiver and an integrator in the other), it provides for reception over very high losses. This is possible due to the high amount of separability that is guaranteed by the coding. The issue of accurately replicating the transmitted signal is overcome because reception is done by correlation rather than subtraction. This means no replica driver is needed. Reverse-channel crosstalk is no longer an issue because the codes used in this link remain orthogonal over arbitrary time-shifts (introduced by reflections over the line) as well as mismatches introduced by the driver. This provides for a multi-gigabit link over a complex interconnect with high losses and discontinuities, which can effectively perform simultaneous bi-directional signaling. The rest of this paper deals with the design and functional level implementation of such a link.
[Fig. 1. Channel diagram. Recovered labels: Stripline (board), 5", AMP, 20".]
II The Lossy Channel
The channel used for simulating the link is an interconnection between two CPUs on two different PC boards that are connected to each other through a backplane, each board being connected to the backplane by means of a connector (Fig. 1). The lines on the boards and the backplane are made of copper and the dielectric surrounding them is FR-4. The whole channel has been modeled using S-parameters and also includes an aggressor-victim model that models the coupling of all nearby parallel lines in the interconnect [10] and thus simulates coupling noise. This channel has extremely high losses. At gigabit data-rates the dielectric losses and skin-effect losses distort the signal heavily [11]. The effect of the losses in the channel due to skin-effect, and more predominantly dielectric losses, is significant attenuation and distortion. The received signal loses 60% of its amplitude at 1 GHz over a 40" line. Pre-emphasis or equalization can provide only limited benefit for such lossy channels [2, 5]. The channel is also subject to discontinuities introduced by the connectors and by mismatches along the line.
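For reference, the quoted 60% amplitude loss can be expressed in decibels. The following is a minimal arithmetic sketch; the function name is hypothetical and the conversion assumes the figure refers to voltage amplitude (hence the factor of 20).

```python
import math

def amplitude_loss_to_db(fraction_lost):
    """Convert a fractional loss of voltage amplitude to a gain in dB."""
    remaining = 1.0 - fraction_lost
    return 20.0 * math.log10(remaining)

# 60% amplitude loss at 1 GHz over the 40" line, as stated in the text.
loss_db = amplitude_loss_to_db(0.60)
print(round(loss_db, 2))  # -7.96
```

That is, about 8 dB of attenuation at 1 GHz over the full 40" channel under this assumption.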
III The Transceiver
A detailed functional level [9] block diagram of the transceiver is shown in Fig. 2. One of the transmitters drives its data directly onto the line (corresponding to Code2). The other transmitter uses an XNOR gate to effectively multiply the data with its code (Code1) and drives the line with the coded data. The data clock (for edge-triggered circuits) can be used as Code1. Both transmitters drive the line simultaneously from both ends. The driver can be a voltage-mode driver [4] or a current-mode driver [5]. In this implementation the driver has been modeled as a voltage-mode driver. To compensate for the distortion in the channel, pre-emphasis can be done by means of a simple FIR filter [5]. This can be tuned to optimize performance for a particular channel. In our implementation the pre-emphasis is incorporated in the behavioral model of the driver and is tuned to the chosen channel. The termination for the interconnect (at both ends) is a resistive split parallel (Thevenin) matched termination. This means that although reflections are minimized, the voltage at the end of the line is halved, reducing the detection margin. The capacitive loads have been lumped into a 2 pF capacitor at both ends to simulate the effect of the I/O pad.
The receiver mainly consists of a correlator that correlates the received signal with the code of the far-end transmitter. This is done by means of a multiplier, which multiplies the received signal with the correct code, followed by an integrator, which integrates this waveform over one data bit-period. The integrator resets itself to zero after every data bit-period. Correlation with Code2 effectively requires no multiplier, thus saving hardware and improving linearity. The multiplier and integrator (together called the correlator) have been implemented as functional switched-capacitor circuits (Fig. 3) [12, 13, 14] with realistic parameter values (a dc gain of 500, a unity gain bandwidth of 1 GHz, a tail current of 10 mA, an output resistance of 500, an output slew of 1 GV/s, an output swing of 1.2 V and an input offset voltage of 50 mV) for their behavioral operational amplifiers. The multiplier could also be implemented as a Gilbert cell. Alternatively, the entire correlator could be implemented as a digital circuit if a high precision ADC [15] is used to convert the received signal into a multi-level digital waveform. This would require stringent design of the ADCs for lossy channels, and would be inadvisable with limited hardware. The receiver also contains a comparator followed by a flip-flop to provide digital voltage swings and retimed signals respectively.
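The transmit and receive operations described above can be illustrated with an idealized sketch. It is not the behavioral model used in this work: it assumes a lossless, delay-free line on which the two driven signals simply add, bipolar line levels (-1/+1), perfect synchronization, and hypothetical function names and data patterns.

```python
CODE1 = [1, 0]  # applied with an XNOR gate in one transmitter
CODE2 = [1, 1]  # equivalent to driving the data directly

def xnor_spread(data_bits, code):
    """XNOR each data bit with every code bit, giving len(code) chips per bit."""
    return [1 - (d ^ c) for d in data_bits for c in code]

def to_bipolar(chips):
    """Map logic levels 0/1 to line levels -1/+1 (assumed signaling)."""
    return [2 * c - 1 for c in chips]

def correlate(line, code):
    """Multiply by the code, integrate over one bit period, reset, then slice."""
    ref = to_bipolar(code)
    bits = []
    for start in range(0, len(line), len(code)):
        acc = 0  # integrator is reset at every bit boundary
        for k in range(len(code)):
            acc += line[start + k] * ref[k]
        bits.append(1 if acc > 0 else 0)  # comparator decision
    return bits

d1 = [0, 1, 1, 1, 1, 1, 0, 1]  # example data for the Code1 transmitter
d2 = [1, 0, 0, 1, 0, 1, 1, 0]  # example data for the Code2 transmitter

# Both transmitters drive the line at once; the ideal line carries their sum.
line = [a + b for a, b in zip(to_bipolar(xnor_spread(d1, CODE1)),
                              to_bipolar(xnor_spread(d2, CODE2)))]

rx_d1 = correlate(line, CODE1)  # multiplier plus integrator
rx_d2 = correlate(line, CODE2)  # multiplying by all ones: integrator only
print(rx_d1 == d1, rx_d2 == d2)  # True True
```

In this model the contribution of the unwanted transmitter integrates to zero over each bit period, so each receiver recovers the far-end data without subtracting a replica of its own transmitted signal.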
[Fig. 2. Transceiver block diagram. Recovered labels: D2, Received. Caption, partially recovered: "... Fig. 1 is a single block). TB is the bit-period. The retiming is achieved by recovering the clock by synchronizing it by means of a DLL (which is indicated in the right block). Pre-emphasis is part of the driver."]
The receiver needs to be synchronized to the transmitter to provide accurate correlation. A phase offset between the received signal and the code will result in lower correlation with the useful component of the received signal and thus reduce performance. For a parallel bus between two CPUs, source synchronization is the easiest way to synchronize the link. A DLL [4] can provide synchronization for the whole bus if one line is used to send the data-clock. This clock is recovered and synchronized to the local clock at the receiver. The recovered clock can now be used as the code in the case of Code1 and also be used to retime the signal (Fig. 2) by clocking the flip-flops. The DLL [4] has been implemented as a behavioral model in Spectre from Cadence.
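The sensitivity of the correlator to a phase offset can be illustrated with a simple sketch. It rests on strong simplifying assumptions that are not part of the design: ideal square chips, no channel distortion, and a constant data bit so that the Code1-coded waveform is periodic and the offset can be modeled as a cyclic shift. The function names and the oversampling factor are hypothetical.

```python
def code1_waveform(samples_per_chip):
    """One bit period of the Code1-coded waveform for a data bit of 1."""
    return [1] * samples_per_chip + [-1] * samples_per_chip

def normalized_correlation(samples_per_chip, offset_samples):
    """Correlate the waveform with a copy of itself shifted by offset_samples."""
    w = code1_waveform(samples_per_chip)
    n = len(w)
    total = sum(w[i] * w[(i + offset_samples) % n] for i in range(n))
    return total / n

# Offsets of 0, 1/8 and 1/4 of a bit period with 8 samples per chip.
results = {k: normalized_correlation(8, k) for k in (0, 2, 4)}
print(results)  # {0: 1.0, 2: 0.5, 4: 0.0}
```

Under these assumptions the useful correlation falls linearly with the offset and vanishes at a quarter of a bit period, which is why the local code must be aligned to the recovered clock.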
The entire link, including the transceiver and the S-parameter representation of the channel, was constructed as a mixed-mode circuit using Hewlett-Packard's internal tools and simulated in Cadence Spectre and HPSPICE over varying lengths of the channel (10"-40") and varying data-rates (1 Gbps-3 Gbps). Figures 4 and 5 show the inputs to the line and the outputs of the correlator for the case when the length of the line is 30" and the data-rate is 1 Gbps (both ways). The input data is a random bit-stream (D1 = 01111101110...) and Fig. 5 shows its 
IV Summary
A simultaneous bi-directional I/O has been described. It has been shown that using DS-CDMA to simultaneously transmit signals on a bi-directional line is feasible, and a novel transceiver has been built.
The performance of this transceiver was shown over a high-loss channel with many discontinuities. The interconnect (transceiver and channel) was simulated for various lengths of the channel and various data-rates. Gigabit transmission was shown to be successful. The advantages of such a scheme are that it avoids errors in replicating the driver, eliminates reverse-channel crosstalk and provides high user-separability. This form of baseband CDMA could also be used to send multi-point-to-point signals on a single link and to provide asynchronous bus access over a multi-drop architecture.
