Abstract-This paper reports on studies concerning the feasibility of large-scale integrated realization of the circuits needed to provide hybrid-mode full-duplex digital transmission at 80 kbits/s or higher rates over standard local telephone loops. Alternative means of achieving the required 60 dB or so of echo cancellation have been studied in detail. The conclusion is that a combination of analog and digital circuit techniques permits practical MOS LSI realization of the complete modem, including filters, echo canceller, timing recovery, and A/D and D/A converters, without need for external circuit elements, trimming, or adjustments.
I. INTRODUCTION
THE trend towards digital networks in the telephone system and the need to provide digital transmission capability for the subscriber have prompted the investigation of the local loop as a means of transmitting digital data. For economic reasons it is desirable to transmit in both directions over the same pair of wires. Two techniques have emerged as promising for this purpose, namely, the burst-mode or ping-pong method, and the hybrid balancing method. The latter uses a hybrid transformer (or the electronic equivalent) to provide directional isolation, as in voice transmission. Adaptive echo cancellation is used to improve channel separation in order to achieve an adequate error rate.
Advantages of the hybrid balancing method over the ping-pong method are the smaller total transmitted signal bandwidth and the fact that long loops can be repeatered without accumulating delays. Transhybrid near-end crosstalk and echoes, not a significant problem in a ping-pong system, must be minimized by echo cancellation when using the hybrid technique. If adequate cancellation can minimize the effect of near-end crosstalk and echo, the smaller signal bandwidth of the hybrid technique will result in a greater range of transmission than the ping-pong method for a given error rate. VLSI techniques will likely reduce the cost of the greater complexity involved in echo cancellation, making the hybrid system cost-competitive with ping-pong.

Manuscript received February 19, 1982. This paper was presented in part at the International Symposium on Subscriber Loops and Services, Toronto, Ont., Canada, September 1982. This work was supported in part by the National Science Foundation under Grant ENG78-11397 and in part by a grant from TRW Vidar. The authors are with the Department of Electrical Engineering and Computer Sciences and the Electronics Research Laboratory, University of California, Berkeley, CA 94720.
The objective of this work is to establish the feasibility of the large-scale integration of full-duplex two-wire baseband modems using the hybrid method working at 80 kbits/s or higher data rates. The 80 kbit/s rate would permit, for instance, simultaneous use of a 64 kbit/s PCM channel for voice communication, plus up to 16 kbits/s of simultaneous data transmission. Switching equipment could be designed to permit independent connections for the voice and data portions of the 80 kbit/s channel.
The purpose of this paper is to give detailed consideration to several alternative methods of implementation of the echo canceller in MOS LSI. A couple of these methods have been implemented for experimental verification. Since the echo canceller operates in the context of the entire modem, a modem was implemented as described in Section II. Section III then addresses the performance which can be expected from a MOS LSI realization of specifically the echo canceller, as obtained from theory, simulation, and experimentation.
Section IV presents the conclusions.
II. SYSTEM DESCRIPTION
Fig. 1 shows a block diagram of the system under consideration. Two baseband modems communicate at 80 kbits/s on a single pair of wires. One of them is located at the central office, the other at the subscriber set. Typical specifications include a transhybrid loss of 10 dB, a line attenuation of up to 40-45 dB for a 5 km subscriber loop (measured at half the data rate, where the spectral density peaks for the bipolar coded signal), and a required echo cancellation of 50-55 dB for 20 dB signal-to-noise ratio after cancellation. Even greater cancellation of the echo would be desirable if it could be achieved.
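The 50-55 dB figure follows from the other numbers in this paragraph. The sketch below is a minimal check, assuming that both modems transmit at the same level and that residual echo is the only impairment counted in the 20 dB signal-to-noise ratio; the function name is illustrative and not taken from the paper.

```python
def required_echo_cancellation_db(line_attenuation_db, target_snr_db, transhybrid_loss_db):
    """Cancellation needed so the residual echo sits target_snr_db below the far-end signal.

    Levels are taken relative to the local transmit level (assumed equal at both ends):
      echo before cancellation : -transhybrid_loss_db
      far-end signal           : -line_attenuation_db
    """
    return line_attenuation_db + target_snr_db - transhybrid_loss_db


for attenuation_db in (40, 45):
    needed = required_echo_cancellation_db(attenuation_db, 20, 10)
    print(f"line attenuation {attenuation_db} dB -> cancellation {needed} dB")
```

With 40 dB and 45 dB of line attenuation this gives 50 dB and 55 dB respectively, matching the range quoted in the text.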
While the primary goal was to obtain a monolithic realization of the echo canceller, it was felt important to consider the design in the context of the entire system. The system configuration which was chosen is shown in greater detail in Fig. 2. The following sections discuss the design choices which were made.
A. Scrambler and Descrambler
Scrambling the incoming data ensures that the transmitted sequence is random (or pseudorandom) even during idle or repetitive data patterns [3]. A random data sequence is required for the convergence of the echo canceller, to avoid placing discrete spectral components on the line (that would cause RFI and crosstalk interference), as well as to aid timing recovery in the receiver. The particular scrambler chosen is recommended by CCITT and performs a modulo 2 division by the polynomial $1 + x^3 + x^{20}$ [4], generating a pseudorandom sequence of length $2^{20} - 1$. The descrambler, which is self-synchronizing, performs a modulo 2 multiplication by the same polynomial.
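The division and multiplication by $1 + x^3 + x^{20}$ can be sketched in a few lines. This is an illustration only, assuming that $x$ denotes a one-bit delay and that both shift registers start at zero; the function names are hypothetical, and the paper's hardware realization is not described at this level.

```python
def scramble(bits, taps=(3, 20)):
    """Modulo-2 division by 1 + x^3 + x^20: s[k] = d[k] ^ s[k-3] ^ s[k-20]."""
    history = [0] * max(taps)      # history[0] holds s[k-1], history[19] holds s[k-20]
    out = []
    for d in bits:
        s = d
        for t in taps:
            s ^= history[t - 1]    # feedback from previously transmitted bits
        out.append(s)
        history = [s] + history[:-1]
    return out


def descramble(bits, taps=(3, 20)):
    """Modulo-2 multiplication by the same polynomial: d[k] = s[k] ^ s[k-3] ^ s[k-20]."""
    history = [0] * max(taps)      # holds previously received bits only (feed-forward)
    out = []
    for s in bits:
        d = s
        for t in taps:
            d ^= history[t - 1]
        out.append(d)
        history = [s] + history[:-1]
    return out


idle = [1] * 200                   # a repetitive (idle) input pattern
line_bits = scramble(idle)
assert descramble(line_bits) == idle
# Self-synchronization: a descrambler that joins 7 bits late is correct after 20 bits.
late = descramble(line_bits[7:])
assert late[20:] == idle[27:]
```

Because the descrambler is feed-forward, its output depends only on the last 20 received bits, which is why it recovers without any initialization handshake.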
B. Bipolar Coder
Alternative line codes have recently received attention in the literature [5] and will not be discussed here. "Bipolar" or "alternate mark inversion (AMI)" coding has been chosen in this design in view of its simplicity and robustness; however, most of the techniques used here are readily adaptable to other codes. The bipolar transmitted signal is three-level, and it was desired to avoid building a canceller which accepted three-level data. One possibility is to input the same data to the line coder and the echo canceller. However, since the coder is then located in the echo path, that echo path would then be nonlinear if true bipolar coding were utilized. A modified technique is therefore used here, which consists simply of differentiating the input binary signal using an analog differentiator with a time constant much shorter than the baud period. A three-level signal is thus created which obeys the same constraint as the standard bipolar signal; namely, that the transmitted signal has no dc content. In addition, since the line coding is obtained through a linear operation, it is consistent with linear echo cancellation. In practice the analog differentiator is incorporated in the output filter, making it into a bandpass instead of a low-pass filter. Thus, the block named "bipolar coder" has conceptual rather than physical existence in the hardware implementation.
It should also be noted that the coding could be made into true bipolar by performing a modulo 2 running sum of the data sequence before entering it into the canceller and the transmit filter with differentiator, as suggested in [12]. This additional complexity was not included, however.
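The relation between the differentiating coder and true bipolar coding can be illustrated in discrete time. The sketch below is a simplification that replaces the analog differentiator by a first difference on bits valued 0 and 1; the function names are illustrative.

```python
def differentiate(bits):
    """Discrete stand-in for the analog differentiator: y[k] = c[k] - c[k-1]."""
    previous = 0
    out = []
    for c in bits:
        out.append(c - previous)
        previous = c
    return out


def running_sum_mod2(bits):
    """Modulo-2 running sum precoder: p[k] = p[k-1] XOR b[k]."""
    acc = 0
    out = []
    for b in bits:
        acc ^= b
        out.append(acc)
    return out


def marks_alternate(symbols):
    """True when successive nonzero symbols have opposite signs (no dc build-up)."""
    marks = [y for y in symbols if y != 0]
    return all(a == -b for a, b in zip(marks, marks[1:]))


data = [1, 1, 0, 1, 0, 0, 1, 1, 1, 0]

# Coder used in the paper: a linear operation on the data seen by the canceller.
modified = differentiate(data)
assert set(modified) <= {-1, 0, 1} and marks_alternate(modified)

# With the precoder, a mark is sent exactly where the data bit is 1 (true bipolar).
true_bipolar = differentiate(running_sum_mod2(data))
assert [abs(y) for y in true_bipolar] == data and marks_alternate(true_bipolar)
```

In both cases the three-level sequence is a linear function of the binary sequence that would be fed to the canceller, which is the property the text relies on.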
C. Transmit and Receive Filters
Minimal intersymbol interference filters [6] were used at the transmitter output and the receiver input. The transmit filter shapes the pulse placed on the line to minimize high-frequency components that would unnecessarily increase crosstalk and radio frequency interference. Care must be exercised in its design to keep nonlinear distortion components (which a linear echo canceller cannot compensate) lower than approximately -60 dB relative to the transmitted signal level. This also requires that the input pulses be symmetrical to a similar degree. Perfectly square pulses would be desirable, but unfortunately the output of a logic gate departs significantly from that ideal due to rising and falling transients that are not equal in general. Satisfactory pulses have been obtained by clamping a CMOS gate with diodes as shown in Fig. 3.
Because of the high degree of symmetry between positive and negative pulses which is required, oscilloscope measurements in the time domain are generally not sufficient to evaluate the quality of the pulse generating circuitry. A better evaluation can be made by measuring the output signal with a spectrum analyzer. For a bipolar signal, spectral zeros are to be expected at multiples of the data rate. It is shown in the Appendix that in the presence of nonlinear distortion the departure from zero of the spectral density at multiples of the data rate provides a good indication of the transmitted pulse symmetry. A more quantitative result can be obtained by driving the circuit with a sequence of alternating 1's and 0's. A periodic waveform with a fundamental frequency at half the data rate results at the output. The even harmonic content of that signal measures the nonlinear pulse asymmetry distortion.
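The alternating-pattern test can be illustrated numerically. The sketch below is not the paper's measurement: it assumes half-sine pulses, models asymmetry as a small amplitude mismatch between the positive and negative pulses, and uses illustrative names throughout.

```python
import cmath
import math


def harmonic_magnitude(samples, k):
    """Magnitude of the k-th harmonic of one period of a periodic waveform (direct DFT)."""
    n_total = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_total)
              for n, x in enumerate(samples))
    return abs(acc) / n_total


def alternating_pattern_period(asymmetry, samples_per_baud=64):
    """One period (two bauds) of the line signal for an alternating 1, 0, 1, 0 drive.

    The negative pulse is the positive pulse scaled by -(1 - asymmetry).
    """
    m = samples_per_baud
    positive = [math.sin(math.pi * n / m) for n in range(m)]   # assumed pulse shape
    negative = [-(1.0 - asymmetry) * p for p in positive]
    return positive + negative


def even_harmonic_ratio_db(asymmetry):
    """Second harmonic relative to the fundamental (which lies at half the data rate)."""
    period = alternating_pattern_period(asymmetry)
    fundamental = harmonic_magnitude(period, 1)
    second = harmonic_magnitude(period, 2)
    return 20 * math.log10(max(second, 1e-300) / fundamental)  # floor avoids log(0)


for eps in (0.0, 0.001, 0.01):
    print(eps, round(even_harmonic_ratio_db(eps), 1))
```

With perfectly symmetric pulses the second harmonic vanishes to within rounding error, and under these assumptions it grows in proportion to the mismatch, which is why a spectrum analyzer resolves asymmetries that are invisible on an oscilloscope.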
A significant consideration in the design of the modem is timing recovery. It is possible, although perhaps foolhardy, to derive timing in a half-duplex mode during initialization. It was decided to take a more conservative design approach in which the canceller is capable of deriving timing properly when the data transmission in the two directions is asynchronous. This allows for proper recovery of timing during startup or after loss of timing, but also requires that the sampling be done at twice the transmitted pulse rate. One of the functions of the receive filter, in addition to removing the out-of-band noise, is thus to remove all potentially aliasing frequency components above the data rate. Aliasing distortion would impair the operation of the timing recovery circuits, although it has no adverse consequences on the echo cancellation and data detection. The receive filter used here provides only 27 dB of attenuation at the data rate. However, the signal has a spectral zero at that frequency, which further decreases the high-frequency components. Experimental evaluation has shown that this filter is completely satisfactory.
D. Equalizer
A single-zero manually adjusted equalizer was used in this system. As in a standard automatic line build-out (ALBO) used in T-carrier systems, this equalizer had a single adjustable zero. Performance proved satisfactory under most circumstances. The adjustment of the zero frequency could be made automatic using standard techniques. Use of adaptive decision feedback equalization (DFE) after the echo cancellation would further improve the eye opening, particularly in the presence of bridge taps, but was not included. Eight taps of DFE or fewer should be completely sufficient.
E. Echo Canceller
Two different versions of the echo canceller have been constructed. They are described in Section III-C-1, after several alternative realizations are discussed in detail in Section III-A.
F. Detector
The output of the echo canceller is the far-end transmitted data signal, which has a three-level eye. Detection requires slicing it at two thresholds and conversion to a binary signal by the inverse of the differentiation introduced in the transmitter. This is done with logic circuits in this implementation. The binary signal is then descrambled with the circuit described in Section II-A.
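The two detection steps can be sketched as follows. This is a simplified discrete-time model, assuming bits valued 0 and 1, unit-amplitude symbols at the slicer, and thresholds at plus and minus half the symbol amplitude; none of these values is specified in the paper, and the names are illustrative.

```python
def differentiate(bits):
    """Transmit-side coding assumed here: y[k] = c[k] - c[k-1]."""
    previous = 0
    out = []
    for c in bits:
        out.append(c - previous)
        previous = c
    return out


def slice_three_level(samples, threshold=0.5):
    """Two-threshold slicer mapping each sample to -1, 0, or +1."""
    return [1 if x > threshold else -1 if x < -threshold else 0 for x in samples]


def integrate(symbols):
    """Inverse of the differentiation: c[k] = c[k-1] + y[k], starting from 0."""
    acc = 0
    out = []
    for y in symbols:
        acc += y
        out.append(acc)
    return out


data = [1, 0, 1, 1, 0, 0, 1, 0]
sent = differentiate(data)
# Deterministic disturbance of +/-0.2 standing in for residual echo and noise.
received = [y + 0.2 * (-1) ** k for k, y in enumerate(sent)]
assert integrate(slice_three_level(received)) == data
```

A running sum propagates a slicing error into later bits, so a practical logic realization would presumably bound or reset the accumulator; that detail is outside what the text states.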
G. Reconstruction Filter
This filter reconstructs a continuous-time waveform from the samples at the output of the echo canceller to be used in the subsequent timing recovery circuit. Since a nonlinear operation will follow in the timing recovery circuit, and this will create high-frequency components not present in the original spectrum, it is necessary to remove first the aliases generated by the sampling. The design of the filter is somewhat simplified using a bandpass instead of a low-pass characteristic. A fourth-order maximally flat filter with a center frequency at half the data rate and a bandwidth of 0.4 times the center frequency has been used. It is shown in [7] that use of this filter minimizes the jitter in the timing waveform.
H. Full-Wave Rectifier
Full-wave rectification creates a discrete line [7] in the spectrum at the data rate, even when the data signal has a continuous spectrum band limited to less than the data rate (typically between $0.5 f_B$ and $f_B$).
I. Bandpass Filter
This filter recovers the spectral line created by the full-wave rectification and removes other spectral components whose presence would increase the jitter in the timing signal. A second-order filter with a Q = 100 has been used. The output of this filter is a nearly sinusoidal waveform with a significant amount of jitter.
J. Frequency Multiplier
The phase lock loop (PLL) locks to the timing tone obtained in the previous stage and, by using a very large loop time constant ($\tau = 0.1$ s), reduces the jitter. It simultaneously performs a frequency multiplication by a factor of 40, in order to obtain the processor clock at 3.2 MHz (in the second implementation of the echo canceller the processor clock frequency was 1.6 MHz, but the same PLL was used in both receivers, adding a divide-by-2 stage in the second).
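As a check on these figures, assuming the recovered timing tone lies at the 80 kHz baud rate (the symbols $f_B$ and $f_{\text{clk}}$ are introduced here only for this check):

$$f_{\text{clk}} = 40 \, f_B = 40 \times 80\ \text{kHz} = 3.2\ \text{MHz}, \qquad \frac{f_{\text{clk}}}{2} = 1.6\ \text{MHz}.$$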
Phase locking occurs after the echo canceller has converged, since an echo-free signal is required to recover the timing. Synchronous operation starts after an initial period during which first the echo canceller is allowed to converge and then the PLL locks. In the subscriber modem, the derived timing clock is also used as the transmit data clock. This synchronous operation is not required for the proper operation of this type of echo canceller [2], but is a requirement of the digital switch with which the modem must interface.
III. TECHNIQUES FOR LARGE-SCALE INTEGRATION OF THE SYSTEM
The implementation complexity of hybrid mode digital subscriber loops would result in a high manufacturing cost for any realization using off-the-shelf components. Thus, from an economic point of view the hybrid mode technique cannot compete with the burst mode technique, in spite of the recognized technical advantages of the former [1]. The high cost is mainly associated with two sections of the echo canceller, namely 1) the digital processor implementing the canceller adaptation and/or transversal filtering, and 2) the analog-to-digital and/or digital-to-analog converters. Large-scale integration techniques have proven very effective in decreasing the cost of digital circuits. Thus, a monolithic implementation of this system will clearly overcome the cost factor associated with 1). However, the requirement of on-chip high-performance data converters poses additional problems that have to be solved before the system can be fully integrated. This section will address the problems associated with the data conversion in an integrated hybrid-method digital subscriber loop. Switched-capacitor filters are now a proven technique that can be used for the implementation of the various filters described in Section II, and will not be considered here. In Section III-A we analyze available alternatives for the implementation of the echo canceller, in view of the problems associated with the data conversion. Three of the more attractive of these alternatives are explored in detail in Section III-B by means of analysis and computer simulation.
An MOS monolithic chip built to experimentally evaluate the feasibility of the implementation of an echo canceller with on-chip data converters is described in Section III-C. Means for integration of the complete system are proposed in Section III-D. Each of these alternatives will be discussed in the following four subsections.
1) Analog Echo Canceller: Fig. 4(a) shows the structure of a completely analog echo canceller using switched-capacitor techniques. The coefficients of the transversal filter are stored in integrators, and the binary weighting is done using switched capacitors. A summing amplifier performs the convolution sum. The adaptation algorithm is also implemented by a switched-capacitor circuit that adds a correction term to the coefficients stored in the integrators. The desired correction term is obtained by multiplying the cancellation error coming from the output of the summing amplifier by either +1 or -1, corresponding to the data bit at the corresponding tap. This technique is adequate to implement synchronous sampling echo cancellers, whose sample rate is synchronous to the data rate. These echo cancellers are usually combined with simultaneous decision feedback equalization [8].
Use of an AGC structure as described in [9] permits subtraction of all components associated with the far-end signal from the error signal used to adapt the coefficients. This requires use of a four-phase clock. The first two phases are used to perform the transversal filtering operation associated with the echo cancellation and simultaneous decision feedback equalization. A decision is made regarding the received bit. An AGC tap is then weighted by the present bit and subtracted, yielding at the output of the summing amplifier the true cancellation error, which is used during the last phase to correct the coefficients. Since the error is free from received signal components, a high gain with corresponding rapid convergence can be used in the adaptation algorithm. A value of $\alpha = 0.03$ has been successfully used in a breadboard implementation of a 16 tap echo canceller combined with a 16 tap equalizer we built using this technique.
Since no well-proven techniques exist to perform timing recovery when the canceller samples synchronously at the received pulse rate, one must consider the asynchronous echo canceller as described in Section II. In this instance, the far-end data signal cannot be removed from the error signal driving the canceller, and as a result an extremely low gain is required in the adaptive algorithm. This low gain cannot be reliably implemented with analog techniques. Thus, we have given no further consideration to the analog approach.
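The link between the adaptation gain and the far-end signal can be seen in a small simulation of the tap update described above (error multiplied by +1 or -1 for each tap). This is a sketch under stated assumptions: the echo response, the number of taps, the binary far-end signal, and all names are hypothetical, and baud-rate sampling is used for simplicity.

```python
import random


def residual_echo_power(alpha, far_end_amplitude, n_taps=8, n_iter=20000, seed=1):
    """Mean squared residual echo over the last quarter of the run."""
    rng = random.Random(seed)
    echo_path = [0.5 ** n for n in range(n_taps)]   # hypothetical linear echo response
    taps = [0.0] * n_taps
    data = [0] * n_taps                             # near-end bits, most recent first
    n_avg = n_iter // 4
    acc = 0.0
    for k in range(n_iter):
        data = [rng.choice((-1, 1))] + data[:-1]
        far_end = far_end_amplitude * rng.choice((-1, 1))
        echo = sum(h * c for h, c in zip(echo_path, data))
        replica = sum(a * c for a, c in zip(taps, data))
        error = echo + far_end - replica            # far-end data cannot be removed here
        if k >= n_iter - n_avg:
            acc += (echo - replica) ** 2
        # Correction term: error times the data bit (+1 or -1) at each tap.
        taps = [a + alpha * error * c for a, c in zip(taps, data)]
    return acc / n_avg


print(residual_echo_power(0.03, 0.0))    # error free of far-end signal: large gain is fine
print(residual_echo_power(0.03, 1.0))    # far-end signal present, same gain
print(residual_echo_power(0.001, 1.0))   # far-end signal present, much lower gain
```

When the far-end signal is absent from the error, the residual echo falls to the numerical floor even with the large gain. With the far-end signal present, the residual echo settles at a level that scales roughly with the gain, which is why a very small gain is needed in the asynchronous case.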
2) Fully Digital Echo Canceller: In this approach, shown in Fig. 4(b), ... Fig. 4(c), as will be discussed in Section III-B.
3) Digital Echo Canceller with Analog Cancellation: This implementation, shown in Fig. 4(c), uses an all-digital implementation of the canceller and converts the echo replica to analog using a D/A converter prior to cancellation in the analog domain. Resolution requirements for the data converters are reported in [2], where it is shown that at least 12 bits are required. Not explicitly mentioned in that reference is the fact that 1/2 LSB integral linearity is also required in the D/A. MOS monolithic D/A converters of the required speed and resolution have been demonstrated [10]; however, the integral linearity of these converters cannot be guaranteed to be better than 7 or 8 bits unless laser trimming is used. Nonlinear distortion in the D/A greatly degrades the echo cancellation, as shown in Section III-B. A low-cost implementation of the system precludes the use of laser trimming, making this alternative not directly applicable to the monolithic implementation of the echo canceller. A technique in which nonlinear distortion introduced by the D/A and other sources is corrected digitally by a nonlinear echo cancellation algorithm is the subject of another paper [11] and will not be discussed here.
Since the A/D is used only to generate a feedback signal for the adaptation algorithm, it must be monotonic, but not necessarily linear. Furthermore, a resolution of only 8 bits is required [2]. The MOS monolithic implementation of these converters poses no major problems.
4) Analog/Digital Echo Canceller: An architecture in which the convolution sum and the echo cancellation are performed with analog techniques is shown in Fig. 4(d). The adaptation algorithm is performed digitally, and the tap weights are converted to analog using a single time-shared D/A converter. The multiplications and additions are then performed in the charge domain, where the nonlinear distortion can be kept very small without special circuit design effort.
Since a large dynamic range and slow time constant are still required in the adaptation algorithm, it cannot be implemented by analog means. Thus, a digital processor is used to perform the adaptation, while a switched-capacitor analog transversal filter computes the convolution sum and cancels the echo in the received signal. The coefficients of the transversal filter are stored in sample-and-hold (S/H) capacitors (
converter in this approach represents a significant advantage, since slow operation could be traded off for increased resolution and reduced die area and power consumption. The design of the A/D converter for the cancellation error, as in the realization of Fig. 4(c), represents no particular difficulty.
The effect of the inevitable nonlinearity in a monolithic D/A (or A/D) converter without laser trimming will be carefully considered in the next section for the architectures of Fig. 4(b), (c), and (d). It will be shown that Fig. 4(d) is the least sensitive to converter nonlinearity. This is due to the ability of the adaptation algorithm to compensate for any nonlinearity which is continuous and monotonic.
B. Effect of D/A Nonlinearity
In this section we analyze both theoretically and by computer simulation the effect of D/A (or A/D) nonlinearity on the echo canceller configurations of Fig. 4(b), (c), and (d). In the process we will also be able to characterize the effect of any nonlinearities in the echo response, such as are introduced by transmitted pulse asymmetry.
It is natural to assume that the current echo sample, or alternatively the current sample at the canceller output, is given by some time-invariant but otherwise arbitrary nonlinear function of the current and last $N - 1$ transmitted bits. Let $c_k$ de- This expansion can be used to represent both the echo response and the canceller output in the presence of arbitrary time-invariant nonlinearities. As a simplification of notation it is useful to define the "augmented transmitted symbol vector"
$$\mathbf{c}_k = (1, c_k, \ldots, c_{k-N+1}, c_k c_{k-1}, c_k c_{k-2}, \ldots, c_k c_{k-1} \cdots c_{k-N+1})^T$$
where the superscript $T$ denotes transpose. There are $2^N$ components in this vector, one for each of the terms in (1). Also define the $2^N$-dimensional vector where the tap-weights in the canceller summation are the $a(n)$.
Since (6) is a general nonlinear function of the form of (1), it can be written in the form where u is the $N$-dimensional tap-weight vector and D[u] is a $2^N$-dimensional vector induced by the nonlinear function $d(\cdot)$. We thus see that the effect of the D/A nonlinearity is to add the terms of order 2 and higher in the expansion of the form of (1) into the echo replica. Those added terms are uncorrelated to the echo when the echo response is assumed to be linear, and contribute together with the noise and far-end data to the residual mean squared cancellation error.
The cancellation error is
(3) where $s_k$ is the far-end signal and $n_k$ is the noise. Since all the Then the expansion of (1) When the data digits $c_k$ are mutually independent and assume the values $+1$ and $-1$ with equal probability, it is easy to show that the components of the vector $\mathbf{c}_k$ are uncorrelated. This fact will be used when calculating the effect of nonlinearities in the canceller.
This nonlinear expansion can be used to determine the effect of nonlinearities in the echo response. For the general nonlinear case, if it is assumed that the current echo sample is a (nonlinear) function of the current and last $N - 1$ data bits, then it can be written in the form where $E$ stands for expected value. The condition for minimum MSE is to choose u so as to minimize the first term in (9). Because u only has $N$ components, in general it is not possible to force this term to zero, even when the echo response is linear, except, of course, when $d(\cdot)$ is linear. The minimization of the MSE is further discussed in [11], where it is shown that in the case of small nonlinearity the familiar gradient technique applies.
The net effect of nonlinear distortion is to increase the residual error after cancellation, since it causes the first term in (8)
The representation of the canceller output is different for each of the three cases of Fig. 4(c), (d), and (b), so that they will be considered separately.
1) Nonlinear Distortion in Fig. 4(c): For the configuration of Fig. 4(c), it is assumed that the D/A converter has an inherent undesired nonlinearity $d(\cdot)$. Since techniques are available to implement D/A converters which are inherently monotonic and have no discontinuities, it is reasonable to assume that the function $d(\cdot)$ is monotonically increasing and continuous. This implies that the inverse function $d^{-1}(\cdot)$ exists.
The canceller can then be modeled as a linear combination of the $N$ data bits, followed by the nonlinear function $d(\cdot)$, where the notation [a] denotes the $n$th component of the vector. The first term in this residual cancellation error is a linear distortion which will be present due to the nonlinearity in the D/A. The minimum MSE is not necessarily achieved when the first term vanishes, since the second and third terms of (10) are also functions of u. The second term is a dc offset due to the nonlinearity, while $u_k$ represents all the accumulated nonlinear distortion terms (defined as terms containing a product of two or more data bits). It is perhaps surprising that all these terms are uncorrelated with one another.
The representations of (8)-(10) are useful for gaining insight into the nature of the distortion introduced by the D/A nonlinearity. When it is desired to compute the MSE of the cancellation residual, however, it is more convenient to use the simpler relation
$$\ldots + E[s_k^2] + E[n_k^2].$$
The additional residual echo introduced by D/A nonlinearity is illustrated by the computer simulations of the adaptive canceller shown in Fig. 6. A second-order nonlinearity $b x^2$ in the D/A transfer function was considered here. The ratio of the mean squared error (averaged over 1000 samples) to the mean squared echo is plotted for $b$ = 0, 0.01, 0.005, and 0.001. In order to achieve 60 dB of cancellation $b$ must be less than 0.001, which requires 12 bit integral linearity in the D/A.

Fig. 5. By allowing for nonlinearity and offset in the S/H's, their design can be simplified and chip area can be saved. Moreover, as we shall see later, the adaptation algorithm can easily compensate for these impairments in the structure of Fig. 4(d).
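A simulation of the kind behind Fig. 6 can be sketched as follows. This is not the authors' program: the echo response, tap count, gain, and run length are assumptions chosen for illustration, the far-end signal and noise are omitted, and only the D/A model $d(x) = x + b x^2$ and the 1000-sample averaging are taken from the text.

```python
import math
import random


def cancellation_db(b, alpha=0.002, n_taps=8, n_iter=30000, seed=1):
    """Residual-to-echo power ratio in dB for the structure of Fig. 4(c)."""
    rng = random.Random(seed)
    echo_path = [0.5 * 0.6 ** n for n in range(n_taps)]   # hypothetical linear echo
    taps = [0.0] * n_taps
    data = [1] * n_taps
    n_avg = 1000                                          # averaging window, as in the text
    err_acc = echo_acc = 0.0
    for k in range(n_iter):
        data = [rng.choice((-1, 1))] + data[:-1]
        echo = sum(h * c for h, c in zip(echo_path, data))
        x = sum(a * c for a, c in zip(taps, data))        # digital convolution sum
        replica = x + b * x * x                           # D/A transfer d(x) = x + b x^2
        error = echo - replica
        if k >= n_iter - n_avg:
            err_acc += error ** 2
            echo_acc += echo ** 2
        taps = [a + alpha * error * c for a, c in zip(taps, data)]
    ratio = max(err_acc / echo_acc, 1e-30)                # floor avoids log of zero at b = 0
    return 10 * math.log10(ratio)


for b in (0.01, 0.005, 0.001):
    print(b, round(cancellation_db(b), 1))
```

For this particular echo response the residual tracks $20 \log_{10} b$ closely, so a cancellation near 60 dB is reached only around $b = 0.001$. The exact figures depend on the assumed echo amplitude and should not be read as a reproduction of Fig. 6.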
2) Nonlinear Distortion in Fig. 4(d): Let $d_n(\cdot)$ be the nonlinear transfer function of the D/A combined with the nonlinear transfer function of the $n$th S/H in
The echo replica in this instance is . In this instance the condition for minimization of the MSE is much simpler; namely, from (9) which is valid whether or not the echo response is linear. As before, the assumption of the existence of $d_n^{-1}(\cdot)$ is reasonable. While nonlinearity in the echo response does not affect the optimal tap-weight vector in (12), it will cause excess MSE in the minimized MSE as seen from (9). When the echo response is linear, the first term in (9) can be forced to zero, and there is no excess residual echo. This represents a significant advantage of the technique of Fig. 4(d) over Fig. 4(c).
The MSE can be minimized adaptively using the gradient technique. The gradient of p is Because $d_n(\cdot)$ is monotonic, $\partial d_n(a(n)) / \partial a(n)$ is never zero, and hence the only minimum of p is associated with the solution (13), so that the gradient technique will be able to find the minimum. Furthermore, since for practical situations $\partial d_n(a(n)) / \partial a(n)$ is close to unity (because the nonlinear distortion is small), the nonlinearity does not significantly slow convergence. As usual, the gradient is replaced by a noisy estimate, and the algorithm becomes where a subscript $k$ has been added to the tap-weights to indicate that they are now functions of time.
A further modification of the algorithm is introduced by the fact that in Fig. 5 the error is computed using the values of the coefficients stored in the S/H, which are not the latest versions since only one S/H is refreshed each pulse period (although the latest versions of all the coefficients are always available in the memory of the digital processor). The refreshing is cyclic; which of the $N$ possible cyclic permutations it is depends on the initialization of the selector circuit, but no attempt is made to control that parameter. Since $\alpha$ is very small, the coefficients change very slowly and this modification does not appreciably alter the convergence.
The preceding analysis has been verified by computer simulation. Fig. 7 shows the convergence transient of the echo canceller when no external signal is present and an infinite resolution A/D converter is used. DAC's with 10 to 13 bits resolution and with infinite resolution were considered. It is interesting to observe that the speed of convergence is still experimentally determined to be as predicted in [1], in spite of the modifications introduced in the algorithm. Although one particular case of nonlinearity is shown, we experimented with different nonlinearities and even with independent nonlinear functions for each tap (in anticipation of possibly different S/H nonlinearities), with results very similar to those of Fig. 7. We conclude that this technique is extremely insensitive to nonlinear distortion in the D/A converter. It cannot, however, compensate for nonlinearities in the echo response, as can the technique described in [11].
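The insensitivity of the Fig. 4(d) structure can be illustrated with a sketch in which every tap has its own nonlinearity. The per-tap model $d_n(a) = a + b_n a^2$, the echo response, and all names are assumptions made for this example; converter quantization and the delayed S/H refresh discussed above are ignored, and no far-end signal is present.

```python
import random


def adapt_fig4d(b_per_tap, alpha=0.01, n_iter=20000, seed=1):
    """Adapt digital tap weights whose analog values pass through d_n before weighting."""
    rng = random.Random(seed)
    n_taps = len(b_per_tap)
    echo_path = [0.5 * 0.6 ** n for n in range(n_taps)]   # hypothetical linear echo

    def held(n, a):
        # d_n(a): D/A plus n-th S/H, monotonic over the range of values used here.
        return a + b_per_tap[n] * a * a

    taps = [0.0] * n_taps
    data = [1] * n_taps
    for _ in range(n_iter):
        data = [rng.choice((-1, 1))] + data[:-1]
        echo = sum(h * c for h, c in zip(echo_path, data))
        # Each analog coefficient is weighted by +1 or -1 and summed.
        replica = sum(held(n, taps[n]) * data[n] for n in range(n_taps))
        error = echo - replica
        taps = [a + alpha * error * c for a, c in zip(taps, data)]
    effective = [held(n, taps[n]) for n in range(n_taps)]
    return effective, echo_path


effective, echo_path = adapt_fig4d([0.05, -0.03, 0.02, -0.01, 0.04, 0.0, -0.05, 0.03])
print(max(abs(e - h) for e, h in zip(effective, echo_path)))
```

The digital tap weights settle at values whose analog images $d_n(a(n))$ equal the echo response, so the residual vanishes even though the nonlinearity differs from tap to tap. This works because the nonlinearity acts on each coefficient before the weighting by the data bit, not on the sum.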
3) Nonlinear Distortion in Fig. 4(b): The effect of nonlinear distortion in the all-digital organization of Fig. 4(b) is somewhat more involved than in the previous two cases, which is why it was relegated to the end of this discussion. Since in Fig. 4(b) the nonlinear distortion introduced by the A/D affects the input signal before echo cancellation, intermodulation components between the far-end signal and the echo will be created. To see this in detail, let $\{x_k\}$ now be the sequence of trans- Fig. 4(b), the input to the receiver is
In general, all components of B[g] may be different from zero, even when the echo response is linear. The terms of (17) which are linear functions of the near-end transmitted bits represent a linear echo. The terms which are linear functions of the far-end transmitted bits represent a linear received signal. The nonlinear terms in (10) include an offset term and, in addition, 1) nonlinear echo components consisting of nonlinear interactions among the local transmitted data bits, 2) nonlinear far-end signal components consisting of nonlinear interactions among the far-end transmitted data bits, and 3) intermodulation terms consisting of nonlinear interactions among local and far-end transmitted signal bits.
All these nonlinear terms represent an uncancellable noise for a linear echo canceller.
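The three kinds of terms can be made concrete with a simple special case. Assume, purely for illustration, a second-order A/D characteristic $d(x) = x + b x^2$ acting on the sum of the echo $e_k$ and the far-end signal $s_k$ (noise omitted; the symbol $e_k$ is introduced here only for this example):

$$d(e_k + s_k) = e_k + s_k + b\,e_k^2 + b\,s_k^2 + 2b\,e_k s_k .$$

The term $b\,e_k^2$ contributes an offset plus products of local data bits, as in 1); $b\,s_k^2$ contributes an offset plus products of far-end bits, as in 2); and $2b\,e_k s_k$ consists entirely of products of local and far-end bits, as in 3).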
A nonlinear echo canceller as described in [11] could remove component 1). In order to remove 2) and 3), a generalized transversal filter structure which is a nonlinear version of Mueller's combined echo cancellation and decision feedback equalization [8] could be used. This transversal filter would generate the most general nonlinear function of the last $N$ transmitted and the last $M$ received bits. However, such a structure would be impractical because of the requirement for synchronous operation between the near-end and the far-end systems, as discussed in Section III-A-1.
C. An Experimental Integrated Circuit Echo Canceller
From the analysis of Section III-B, it is clear that the configuration of Fig. 4(d) is the most attractive of those presented for MOS realization. (Another attractive approach is discussed in [11].) This approach has been experimentally evaluated by fabricating an MOS chip that implements the functions shown within dotted lines in Fig. 5. Two 8-tap programmable transversal filters operate in time-interleaved fashion, sampling at twice the data rate.
A circuit schematic is shown in Fig. 8. The data bits are shifted through the dynamic shift register formed by $E_n^{(1)}$ and $E_n^{(2)}$, where the subscript $n$ identifies the tap number and the superscript (1, 2) indicates which one of the two time-interleaved sections is being considered. Capacitors C, ,n(i) are the sample-and-hold capacitors, with the associated sampling switch $M_{3,n}^{(i)}$ and source follower buffer $M_{1,n}^{(i)}$, $M_{2,n}^{(i)}$.
Summing capacitors CB,n(i) are initialized to ground when the corresponding data bit is 0, and to the S/H coefficient when the bit is 1. At the same time the input signal is sampled by capacitor C, . During the next clock phase the positions of the switches associated with the summing capacitors and with capacitor C, are reversed, and the quantity
is computed in the charge domain and appears at the output of the summing amplifier. While Section III-B concentrated on the effect of the D/A nonlinearity, in the context of Fig. 4(d) the following additional factors are worth considering.
1) Linearity requirements on the sample/hold (S/H) circuits.
2) Linearity and gain requirements on the summing amplifier.
3) Linearity requirements on the sampling capacitors.
With respect to 1), the S/H's are implemented as capacitors buffered by source followers. No attempt was made to linearize them or compensate their offset, since their effect is the same as D/A nonlinearity and is therefore taken care of by the adaptation algorithm, as shown in Section III-B. By keeping the S/H simple, much die area can be saved, since a total of 16 are implemented on-chip.
To consider points 2) and 3), refer to Fig. 9, and assume that voltages V1(2) - V1(1) and V2(2) - V2(1) (19) are to be added using a switched capacitor summing amplifier as shown. The arguments 1 and 2 refer to the sampling and charge redistribution phases. This serves as a model of a single-tap transversal filter, with V1(1) equal to the input signal with echo, V1(2) = 0 and either
depending on whether a weight of +1 or -1 is given to the tap coefficient (as determined by the corresponding data bit). The conclusions extracted from this simple model can be easily extended to a multitap transversal filter. Let also
be the (nonlinear) transfer function of the summing amplifier. It is not assumed here that the gain of the amplifier is high. Finally, let
define the (possibly nonlinear) capacitors of Fig. 9. Using charge conservation,
In the case of linear capacitances
and (23) becomes
where
Solving (25) for V-(2), using the fact that V-(1) is a constant (the offset of the amplifier), and also using (21), the output V0(2) can be found as a function of the linear combination of (26). This shows that in the case of linear capacitors, the nonlinear distortion of the summing amplifier affects the total sum, which for the echo canceller of Fig. 5 pacitors. Assume that
(27)
Then, substituting in (23) and collecting terms, it can be seen that the linear combination of (26) (29) is satisfied.
Fig. 10 shows a photograph of the chip, fabricated in an NMOS silicon gate process with 7 μm design rules. Chip size is 3 mm on each side (between centers of scribe lines). The area of the active parts is 1.5 mm², and can be decreased using smaller S/H capacitors (instead of the conservative 20 pF used here) and a more advanced process.
1) Experimental Results: An experimental breadboard modem has been built as described in Section II. The first version used a completely digital processor built using 2901 bit-slice microprocessors, with a 12-bit analog-to-digital converter at the input and a digital-to-analog converter at the output [Fig. 4(b)]. This D/A converter had adequate linearity for this application, but for the reasons described earlier could not be realized in MOSLSI without trimming. The design of the digital processor followed criteria given elsewhere [2]. Internal arithmetic precision of the entire processor was 24 bits. The gain factor a of [2, eq. (16)] could be chosen in our implementation over the range 2-to 2-'.
The modem has received extensive laboratory as well as field testing, with line attenuations up to 44 dB. The field test was performed in Ora Loma, CA, a small (300 line) local office where the crosstalk and impulse noise impairments were minimal. Subscriber loops were remotely looped back so that both ends of the subscriber loop were available in the office. Because of the relatively ideal nature of this environment, these tests can be considered as indicative of the capabilities of the canceller itself in a realistic echo response environment, but not representative of the effect of crosstalk and impulse noise in a more severe environment. This is consistent with our goal of studying the effect of impairments due to the echo canceller implementation.
The degree of cancellation achieved was determined by measuring the SNR at the input and the output of the canceller. Specifically, both SNR's were determined by measuring the peak far-end signal voltage (with the local transmitter turned off) and the peak echo signal (with the far-end signal turned off). By this measurement technique, the achieved degree of cancellation was 63 dB. Error rate measurements showed a BER (bit error rate) lower than 10-' with line attenuations at 40 kHz up to 40 dB. BER increased to 10^-4 when the line attenuation was 44 dB. This increase was attributed not to the echo canceller, but rather to the noise from the digital circuitry and the effect of equalization errors due to the very simple equalizer that was used. Fig. 11 shows the eye diagram at the output of the transmitter. Fig. 12(a) shows the eye diagram at the input of the echo canceller with the local transmitter turned off, while Fig. 12(b) shows the signal at the same point with the local transmitter on and set to the nominal output level of 600 mV peak. Note the different scales in the two pictures. Prior to echo cancellation, the echo was about 33 dB higher than the far-end signal. The eye diagram of the echo signal is closed because the equalizer has been adjusted to open the received signal eye, not the eye for the echo. Fig. 13 shows the eye diagram at the output of the echo canceller when operating with a line attenuation of 40 dB. The quantization error as well as the sampled nature of the signal can be clearly observed.
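As a rough check on how the figures quoted above combine, the following sketch subtracts the pre-cancellation echo-to-signal ratio from the achieved cancellation. It assumes, which the text does not state explicitly, that the 33 dB echo level and the 63 dB cancellation refer to the same operating condition. The function names are illustrative only.

```python
def signal_to_residual_echo_db(echo_over_signal_db, cancellation_db):
    """Far-end signal to residual echo ratio (dB) at the canceller output."""
    return cancellation_db - echo_over_signal_db


def db_to_voltage_ratio(db):
    """Convert a level difference in dB to a voltage (amplitude) ratio."""
    return 10.0 ** (db / 20.0)


margin_db = signal_to_residual_echo_db(33.0, 63.0)   # 30.0 dB
echo_reduction = db_to_voltage_ratio(63.0)           # roughly a factor of 1400 in voltage
```

Under that assumption the residual echo would sit about 30 dB below the far-end signal at the canceller output.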
Impulse noise was the dominant source of errors we observed in both laboratory and field testing. Impulse noise almost always caused a single bit error on the line, which expanded to three errors in passing through the scrambler. Such errors are inconsequential for voice communication. For data communication, error detection followed by retransmission might well provide a satisfactory solution. Alternatively, a simple form of burst error-correcting code would be highly effective in overcoming the impulse noise errors we observed. Digital circuitry for such encoding could be included on the same chip as the functions described above.
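The expansion of one line error into three can be reproduced with a self-synchronizing scrambler that has two feedback taps. The paper does not give the scrambler polynomial, so the delays of 18 and 23 bits used below are an assumption chosen only for illustration, as are the function names. In this sketch the multiplication takes place in the descrambling operation at the receiver.

```python
import random

TAPS = (18, 23)  # assumed delays; the scrambler polynomial is not given in the paper


def scramble(bits, taps=TAPS):
    state = [0] * max(taps)           # previous scrambled bits, newest first
    out = []
    for b in bits:
        s = b ^ state[taps[0] - 1] ^ state[taps[1] - 1]
        out.append(s)
        state = [s] + state[:-1]
    return out


def descramble(bits, taps=TAPS):
    state = [0] * max(taps)           # previous received bits, newest first
    out = []
    for r in bits:
        out.append(r ^ state[taps[0] - 1] ^ state[taps[1] - 1])
        state = [r] + state[:-1]
    return out


rng = random.Random(0)
data = [rng.randint(0, 1) for _ in range(200)]
line = scramble(data)
line[50] ^= 1                          # a single bit error on the line
received = descramble(line)
error_positions = [i for i, (a, b) in enumerate(zip(data, received)) if a != b]
# error_positions == [50, 68, 73]: the error itself plus one per feedback tap
```

Because the three errors fall at known spacings, a simple burst error-correcting code of the kind mentioned above would only need to span the longest tap delay.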
Measurements were made to determine the magnitude of near-end crosstalk from other (unsynchronized) channels at 80 kbits/s on adjacent pairs in typical cables. Of course, such crosstalk could not be reduced by the echo canceller. Measured crosstalk was at -70 to -80 dB relative to the transmitted signal level for a single interferer. At these levels, near-end crosstalk is not a limitation.
A custom chip has been built to evaluate the effect of nonlinear distortion in the digital-to-analog converter using the favorable technique of Fig. 4(d). The chip performs only the transversal filtering operation, while the adaptation is performed off-chip by a digital processor, again using 2901 bit-slice microprocessors. The A/D and D/A were also implemented off-chip.
Fig. 14 compares the eye diagrams of the breadboard and the custom chip when operating with a line attenuation of 26 dB. The maximum measured echo cancellation of the modem using the custom chip (measured as previously described) was 40 dB, with the limiting factor being not nonlinear distortion but digital noise picked up by the analog circuits. In fact, the nonlinearity of the sample and hold circuits alone would have limited the cancellation to approximately 20 dB, were the adaptation not able to compensate for this nonlinearity. We are confident that this 40 dB of cancellation can be substantially improved with additional effort in designing circuits with better common mode and power supply rejection ratios. In addition, monolithic integration of the entire modem would reduce the problems of digital noise pickup experienced in both implementations.
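The way adaptation absorbs S/H nonlinearity can be sketched in a few lines. The model below is an assumption throughout: the S/H transfer curve (`sample_hold`, with an offset, a gain error, and mild curvature), the LMS-style update, the step size, and the four-tap echo are hypothetical and are not the circuit or algorithm parameters of the experimental chip. The point it illustrates is that the correction is driven by the error measured after the nonlinear element, so the stored voltages settle wherever the buffered outputs match the echo taps.

```python
import random


def sample_hold(v):
    # Hypothetical S/H plus source-follower transfer: offset, gain error, curvature.
    return 0.05 + 0.9 * v + 0.05 * v * v


def adapt(echo_taps, n_steps=20000, mu=0.05, seed=1):
    rng = random.Random(seed)
    n = len(echo_taps)
    stored = [0.0] * n                # voltages written onto the S/H capacitors
    bits = [0] * n                    # recent data bits (0 or 1), newest first
    err = 0.0
    for _ in range(n_steps):
        bits = [rng.randint(0, 1)] + bits[:-1]
        echo = sum(h * b for h, b in zip(echo_taps, bits))
        # Each tap contributes ground (bit 0) or the buffered S/H output (bit 1).
        replica = sum(sample_hold(w) * b for w, b in zip(stored, bits))
        err = echo - replica
        # Correct only the taps that contributed to this output sample.
        stored = [w + mu * err * b for w, b in zip(stored, bits)]
    return stored, err


echo_taps = [0.5, -0.2, 0.1, 0.05]
stored, last_err = adapt(echo_taps)
mismatch = max(abs(sample_hold(w) - h) for w, h in zip(stored, echo_taps))
```

In this noiseless toy model the stored voltages differ from the echo taps, yet the buffered outputs match them, which is why the S/H circuits need not be linearized or offset-compensated.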
D. Integration of a Full-Duplex Hybrid Modem
We have made rough estimates of the die size and complexity of a complete MOSLSI realization of the system described in this paper. Fig. 15 shows approximate areas that would be required by the different functions to be implemented on-chip. The essential element of such a chip would be the analog-digital echo canceller discussed in previous sections of this paper. Switched-capacitor filters would be used at the input and output and for timing recovery. Phase-locked loop design in MOSLSI is described elsewhere [13]. Area would be about 30 mm², power dissipation 300 mW, and transistor count about 6000, based upon use of a silicon-gate NMOS process with a minimum feature size of 5 μm. Density would be increased and power reduced through use of a more advanced fabrication process.
IV. CONCLUSIONS
The design of the echo canceller for a hybrid-mode full-duplex data modem has been considered based upon analysis, simulation, and experimental studies. This paper has focused on the effect of impairments introduced by the implementation of the echo canceller itself in MOSLSI. Techniques have been proposed which should eliminate these implementation-induced impairments as limitations on the modem performance, so that performance can approach fundamental limits determined by noise, crosstalk, and cable attenuation.
An experimental system was built and tried out in a field environment. Again, the emphasis was on checking the performance of the echo canceller itself, rather than other system impairments. Measured performance of the experimental system was excellent.
It may well be possible to increase the transmitted bit rate to 144 kbits/s for a system of this design. Measurements on real loops confirmed that attenuation is proportional to the square root of frequency. A very large fraction of all loops would show attenuation of 40 dB or less at 144 kbits/s. Most importantly, circuit speed for the implementation approach proposed in this paper is certainly no barrier to 144 kbits/s operation.
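The square-root law lets attenuation figures at one bit rate be translated to another. The sketch below assumes that the attenuation is expressed in dB and that the frequency at which it is quoted scales in proportion to the bit rate. Both are assumptions made for this illustration, as is the function name.

```python
from math import sqrt


def scaled_attenuation_db(att_db, rate_from_kbps, rate_to_kbps):
    """Scale attenuation (dB) assuming it grows as the square root of frequency
    and that the reference frequency is proportional to the bit rate."""
    return att_db * sqrt(rate_to_kbps / rate_from_kbps)


# A loop with 40 dB at the 80 kbits/s reference frequency, moved to 144 kbits/s:
att_at_144 = scaled_attenuation_db(40.0, 80.0, 144.0)      # about 53.7 dB
# The 80 kbits/s attenuation of a loop that shows 40 dB at 144 kbits/s:
att_at_80 = scaled_attenuation_db(40.0, 144.0, 80.0)       # about 29.8 dB
```

Under these assumptions, the loops with 40 dB or less at 144 kbits/s are those with roughly 30 dB or less at 80 kbits/s.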
APPENDIX
This Appendix will establish that the components of the transmitted data signal at multiples of the bit rate are a direct and useful measure of transmitted pulse asymmetry. As is well known, the bipolar encoded signal has zeros in the power spectrum at all harmonics of the bit rate. This Appendix will show that in the presence of pulse asymmetry the transmitted signal in fact has a line component at the bit frequency.
Assume that the transmitted data bits assume the values $c_k = +1$ and $c_k = -1$, that the differences of successive data values are used to modulate a train of pulses, and that there is an asymmetry in the pulses such that a positive pulse has shape $h_+(t)$ and a negative pulse has shape $h_-(t)$. Then the transmitted signal can be represented as
$$s(t) = \sum_k \left( f_+(c_k, c_{k-1})\, h_+(t - kT) + f_-(c_k, c_{k-1})\, h_-(t - kT) \right) \tag{A.1}$$
where $f_+$ is 1 when $c_k = 1$, $c_{k-1} = -1$, and 0 otherwise, and similarly $f_-$ is 1 only when $c_k = -1$, $c_{k-1} = 1$. Since $f_+$ and $f_-$ are nonlinear functions of $c_k$ and $c_{k-1}$, the expansion of (1) is valid and it is simple to show that (A.3) which demonstrates the nature of the nonlinearity introduced when there is pulse asymmetry: nonlinear distortion in the form of second-order products of adjacent bits is introduced into the transmitted signal. This demonstrates the terms which would have to be added to the nonlinear canceller in [11] to compensate for pulse asymmetry.
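The expansion referred to above can be checked by enumeration. The polynomial forms used in the sketch follow directly from the definitions of the two indicator functions, but they are a reconstruction and may not match the notation of the equations in the original Appendix. The pulse amplitudes 1.0 and -0.9 are hypothetical values chosen to represent an asymmetric pulse pair at one sampling instant.

```python
from itertools import product


def f_plus(ck, ck_prev):
    return 1 if (ck, ck_prev) == (1, -1) else 0


def f_minus(ck, ck_prev):
    return 1 if (ck, ck_prev) == (-1, 1) else 0


def f_plus_poly(ck, ck_prev):
    return (1 + ck - ck_prev - ck * ck_prev) / 4


def f_minus_poly(ck, ck_prev):
    return (1 - ck + ck_prev - ck * ck_prev) / 4


def pulse_terms(ck, ck_prev, h_plus, h_minus):
    """Split one pulse amplitude into constant, linear, and second-order parts."""
    constant = (h_plus + h_minus) / 4
    linear = (ck - ck_prev) * (h_plus - h_minus) / 4
    second_order = -ck * ck_prev * (h_plus + h_minus) / 4
    return constant, linear, second_order


H_PLUS, H_MINUS = 1.0, -0.9            # hypothetical asymmetric pulse amplitudes
pairs = list(product((-1, 1), repeat=2))
polys_match = all(
    f_plus(a, b) == f_plus_poly(a, b) and f_minus(a, b) == f_minus_poly(a, b)
    for a, b in pairs
)
split_matches = all(
    abs(f_plus(a, b) * H_PLUS + f_minus(a, b) * H_MINUS
        - sum(pulse_terms(a, b, H_PLUS, H_MINUS))) < 1e-12
    for a, b in pairs
)
symmetric_terms = pulse_terms(1, -1, 1.0, -1.0)
```

With symmetric pulses the constant and second-order parts vanish and only the linear bipolar term remains. With asymmetric pulses the constant part repeats every bit period, which is consistent with a line component at the bit frequency, and the second-order part is a product of adjacent bits.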
