Proposed implementation of a near-far resistant multiuser detector without matrix inversion using Delta-Sigma modulation by Magana, Mario E.
AN ABSTRACT OF THE THESIS OF
Timothy F. Myers for the degree of Master of Science in Electrical and Computer
Engineering presented on April 29, 1992 .
Title: Proposed Implementation of a Near-Far Resistant Multiuser Detector
without Matrix Inversion using Delta-Sigma Modulation
Redacted for Privacy
Abstract Approved:
Dr. Mario E. Magaña
A new algorithm is proposed which provides a sub-optimum near-far resistant
pattern for correlation with a known signal in a spread-spectrum multiple access
environment with additive white Gaussian noise (AWGN). Only the patterns and
respective delays of the K-1 interfering users are required. The technique does not
require the inversion of a cross-correlation matrix, and it can be easily
extended to as many users as desired using a simple recursion equation. The
computational complexity is O(K^2) for each user to be decoded. It is shown that this
method provides the same results as the "one-shot" method proposed by Verdu and
Lupas.
Also shown is a new array architecture for implementing this new solution
using delta-sigma modulation and a correlator for non-binary patterns that takes
advantage of the digitized ΔΣ signals. Simulation results are presented which show
the algorithm and correlator to be implementable in VLSI technology. This
approach allows processing of the received signal in real time with a delay of O(K)
bit periods per user. A modification of the algorithm is examined which allows
further reduction of complexity at the expense of reduced performance.

© Copyright by Timothy F. Myers
April 29, 1992
All Rights Reserved

Proposed Implementation of a Near-Far Resistant Multiuser Detector without
Matrix Inversion using Delta-Sigma Modulation
by
Timothy F. Myers
A THESIS
submitted to
Oregon State University
in partial fulfillment of
the requirements for the
degree of
Master of Science
Completed April 29, 1992
Commencement June 1992

APPROVED:
Redacted for Privacy
Assistant Professor of Electrical and Computer Engineering in charge of major
Redacted for Privacy
Head of department of Electrical and Computer Engineering
Redacted for Privacy
Dean of Graduate School
Date thesis is presented: April 29, 1992
Typed by Tim Myers for Timothy F. Myers

TABLE OF CONTENTS
1. INTRODUCTION
Definition of Terms
Correlation Receiver Theory
Review of Asymptotic Efficiency
Decorrelating Detector Review
2. MATHEMATICAL BASIS OF PROPOSED SOLUTION
Extension for Three or More Users
Reduction of Computational Complexity
Simulation Results
3. IMPLEMENTATION OF PROPOSED SOLUTION
Simulation Results
Analysis of Error Contribution due to A/D Quantization
Analysis of Error due to Delay Resolution
4. DELTA-SIGMA CORRELATOR
Introduction
Block Diagram of System
Timing Used in System
Modulators Examined
2nd Order Noise SNR Predictions
Simulations
SNR at Input to Detector
N x 1 vs. 1 x 1 Multiplication
Review of Musts and Wants
SUMMARY
BIBLIOGRAPHY
APPENDIX A
SECTION 1. Introduction
One-Shot Decorrelating Theory of Operation
Block Diagram of Systolic One-Shot
Ad-Hoc Approach
SECTION 2. Systolic Synthesis of One-Shot
Correlation Matrix
Correlation Matrix Inversion
Correlators
Appendix Summary

LIST OF FIGURES
Figure 1.1 - Block diagram of typical correlator and detector
Figure 1.2 - Block diagram of orthonormal basis correlator and detector
Figure 1.3 - Block diagram of one-shot decorrelating detector
Figure 2.1 - Illustration of one shot break-up of interfering user
Figure 2.2 - Three user case bit interval
Figure 2.3 - Normalized output from standard and modified GS Correlators
Figure 3.1 - Block diagram of one-shot correlator and detector
Figure 3.2 - Block diagram of GSOSG
Figure 3.3 - Shaded module block diagram
Figure 3.4 - Non-shaded module block diagram
Figure 3.5 - Modified GS block diagram
Figure 3.6 - Simulation results of entire system for three users
Figure 3.7 - A/D resolution required for three user case
Figure 3.8 - A/D resolution required for five user case
Figure 3.9 - A/D resolution required for ten user case
Figure 3.10 - A/D resolution required for twenty user case
Figure 3.11 - Normalized output of correlator for one chip interval
Figure 3.12 - Output of correlator with interfering user as input
Figure 3.13 - SNR vs. delay resolution for proposed and standard detectors
Figure 4.1 - Block diagram
Figure 4.2 - Illustration of timing used in system
Figure 4.3 - Power spectrum of noise transfer functions used in project
Figure 4.4 - Closer view of power spectrum in range of 0 to π/4
Figure 4.5 - Simulation results of SNR for modulators
Figure 4.6 - Graph of SNR at detector input
Figure 4.7 - Proj_16 modulator output
Figure 4.8 - Output of multiplier
Figure 4.9 - Frequency spectrum with sine wave input
Figure 4.10 - Frequency spectrum of message before modulator
Figure 4.11 - Frequency spectrum of message after the modulator
Figure 4.12 - Frequency spectrum for sine wave after modulator and /8 decimator
Figure 4.13 - Frequency spectrum for sine wave after modulator and /16 decimator
Figure 4.14 - Correlation of bandlimited input with binary PNG pattern
Figure 4.15 - Correlation of bandlimited input with bandlimited PNG pattern
Figure 4.16 - Correlation of input after delta-sigma modulation with binary PNG pattern
Figure 4.17 - Correlation of bandlimited input after delta-sigma modulator with bandlimited PNG pattern
Figure 4.18 - Ideal correlation of input signal with PNG pattern
Figure 4.19 - Correlation of input signal after delta-modulation with binary PNG pattern using 1x1 multiplier with higher oversampling

LIST OF TABLES
Table 2.1 - Simulation results
Table 3.1 - Results of simulation
Table 4.1 - Musts and wants of a multi-level correlator
Table 4.2 - Comparison of SS correlator to delta-sigma modulator
Table 4.3 - Poles and zeros of error transfer function for modulators
Table 4.4 - Comparison of calculated to simulated SNR for 2nd Order modulator

LIST OF APPENDIX FIGURES
A1 - Block diagram for systolic one-shot decorrelating detector
A2 - Timing for one-shot decorrelating detector
A3 - Processor layout diagram for one-shot systolic array with two users
A4 - Dependency graph for correlation matrix
A5 - Processor layout diagram for correlation matrix
A6 - Processor layout diagram for correlation matrix with fewer processors
A7 - Mapping of search in time correlator

PROPOSED IMPLEMENTATION OF A NEAR-FAR RESISTANT
MULTIUSER DETECTOR WITHOUT MATRIX INVERSION USING
DELTA-SIGMA MODULATION
CHAPTER 1
INTRODUCTION
As the need for more communication continues to grow, capacity is limited by
the finite amount of bandwidth available to support all projected
users. In order to allow more users to share the same channel, different
multiplexing methods have been devised. One of the most promising
methods is called code-division multiple access (CDMA). This approach allows
multiple users to share the same channel simultaneously. This is accomplished via
the use of separate codes to modulate the carrier. Generally, the codes used are
designed to be as orthogonal as possible to each other. This type of separation
allows for a gradual decay of the network as more users try to access it. It does not
require the precise timing on the part of the individual users that
time-division multiplexing does. As more users access the CDMA network, the error
rate increases, causing users to leave the network or to stay on with some loss of
performance.
While CDMA works well when all users have about the same power levels, a
known problem which limits its use in mobile and other environments is the
"near-far" problem. In this case, a transmitting user which is close to the desired receiver
will overwhelm the signal of a user which is transmitting from some distance. Since
the codes are not completely orthogonal, there remains some cross-correlation
between the transmitting users.
Several approaches have been taken to try to lessen the "near-far" problem.
The first is to make the codes as close to orthogonal as possible. This requires
more complicated circuitry to produce the patterns, and there is also a limit on the
number of patterns that can be generated for a given code sequence length. Another
approach, similar to that used in cellular phones, is to allow a base station to limit the
transmitting power levels of the users. This method tends to degrade the network to
the performance of the weakest user. New cellular technology will attempt to solve
the problem by placing the base station a far distance from the transmitters so all
received power levels are equivalent. This is easily achieved by using
satellites in orbit as the receivers. However, the number of satellites is limited and
this method is very costly to implement.
Recent work has focused on improving the detector in the receiver so that it is
"near-far" resistant. Work in this area was previously ignored as it was assumed that
the optimum detector for multiple users was close to the detector used for single
users. In [2], Verdu showed that an optimum detector for multiple users exists and
that its performance is vastly superior to the single user detector in a multi-user
environment. Verdu showed that this detector could be implemented using a Viterbi
algorithm whose computation is exponential in the number of users and which has a
variable decoding delay time. Since this revelation, research has been directed at
reducing the complexity required to implement a "near-far" resistant detector.
In [3], Poor and Verdu examined single user detectors for multi-user channels
and concluded that significant reduction in complexity could be achieved by limiting
the detection of each symbol to only the interval in which it is received. They also
showed that by accounting only for the subset of interfering users with significant
power levels, the amount of computation could be reduced further.
In [4] and [5] linear algorithms are presented to decode the desired signals by
processing the output of a bank of filters matched to each user. These approaches
require that the power levels of the received signals be known.
In [1], Lupas shows that the optimum linear detector is the inverse of the
cross-correlation matrix of the users in the channel. It is further shown that this can be
implemented as a bank of filters after the matched filters as the length of the
transmitted sequence gets larger. This approach is equivalent to correlating the
received signal with a pattern that is orthogonal to the interfering users' sub-space.
Although the approach proposed by Lupas is linear in complexity with the
number of users, inverting the cross-correlation matrix becomes quite cumbersome
as the number of users increases due to the need to use multiple time intervals. By
restricting attention to the "one-shot" or single bit interval as suggested by Verdu,
Lupas shows that the "one-shot" detector is sub-optimum and is upper bounded by the
decorrelating detector in performance. This detector still exhibits good near-far
resistance. In this approach, the received signal is now correlated with a new pattern.
The generation of this new pattern still requires the inversion of a simpler
cross-correlation matrix.
This thesis examines current correlation receiver theory and performance
measures for CDMA channels in chapter 1. A new method of generating the pattern
for the "one-shot" detector is proposed in chapter 2 and it is shown that it is
equivalent to the one proposed by Verdu and Lupas, yet it does not require the
inversion of a matrix. Chapter 3 describes the implementation of the new solution
which takes advantage of delta-sigma (ΔΣ) modulation. Simulation shows that this
approach allows VLSI implementation so that the detector can operate in real time.
A study of implementing a non-binary correlator using ΔΣ modulation is presented in
chapter 4. Appendix A looks at the blocks necessary to implement the "one-shot"
detector of Verdu-Lupas using systolic methods.
Definition of Terms
The message of a code-division multiple access (CDMA) direct sequence
spread spectrum communication system is characterized as

$$r(t) = S(t,b) + n(t)$$
$$S(t,b) = \sum_{i=-M}^{M} \sum_{k=1}^{K} b_k(i) \sqrt{w_k(i)}\, s_k(t - iT - \tau_k)$$

n(t) = AWGN, psd = $\sigma^2$
N = 2M + 1 = length of message
K = number of users in channel
$s_k(t)$ is the normalized signature waveform of user k
and is zero outside of interval [0, T]
$\tau_k$ = relative delay of user k from start of bit interval
$b_k(i)$ is the i th message bit of user k
$w_k(i)$ is the received energy of user k in the i th time slot
s(t,b) is the normalized receiver input signal with unit energy
psd is power spectral density
Correlation Receiver Theory
There are several methods used to provide coherent detection of signals. This
thesis examines the role of a near-far resistant detector used in code-division
multiple access direct sequence spread spectrum communication channels. Although
matched filters can be used, this thesis focuses on the correlator receiver and thus
requires knowledge of the signals under consideration and the delays
between the multiple users' bit sequences. The most common form of correlator
receiver for multiple users (Figure 1.1) is composed of K active correlators, each
containing a pattern matched to that used to send each user's signal. The sign of
each individual correlator's output is then checked at the appropriate time interval
to determine the bit level.
[Figure: correlators with patterns $s_1(t)$ through $s_K(t)$, each followed by a SGN() decision.]
Figure 1.1 - Block diagram of typical correlator and detector
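The receiver of Figure 1.1 and the near-far problem it suffers from can be illustrated with a minimal sketch. The two 7-chip signatures, the bit-synchronous timing and the omission of noise are assumptions made only to keep the example short; they are not taken from the thesis.

```python
import numpy as np

def correlator_detect(r, patterns):
    """Correlate r with each user's pattern and take the sign of each output."""
    y = patterns @ r                 # one correlator output per user
    return np.sign(y), y

# Two hypothetical 7-chip signatures, normalized to unit energy.
s1 = np.array([1, 1, 1, -1, -1, 1, -1]) / np.sqrt(7)
s2 = np.array([1, -1, 1, 1, 1, -1, -1]) / np.sqrt(7)
patterns = np.vstack([s1, s2])

def received(b, w):
    """Noise-free r = sum_k b_k sqrt(w_k) s_k over one bit interval."""
    return (np.asarray(b) * np.sqrt(w)) @ patterns

bits = [1, 1]
equal_power, _ = correlator_detect(received(bits, [1.0, 1.0]), patterns)
near_far, y = correlator_detect(received(bits, [1.0, 100.0]), patterns)
```

With equal energies both bits are recovered. When the second user is 20 dB stronger, its cross-correlation term outweighs the desired term in the first correlator and the decision for user 1 flips, which is the behaviour a near-far resistant detector is meant to avoid.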
Another approach that has been used is to make use of the fact that each signal
can be represented by a set of basis functions $\phi_j(t)$ as

$$s_i(t) = \sum_{j=1}^{N} s_{ij}\, \phi_j(t)$$

The ith signal can be detected by using correlators that contain the basis function
patterns and weighting the outputs of the correlators by the corresponding
coefficients $s_{ij}$. The basis vectors can be formed by using the Gram-Schmidt
procedure. The block diagram for this method is shown in Figure 1.2.
[Figure: correlators with basis patterns $\phi_1(t)$ through $\phi_N(t)$, followed by a SGN() decision.]
Figure 1.2 - Block diagram of orthonormal basis correlator and detector
A new method, the "one-shot decorrelating detector", was proposed by Verdu
and Lupas [1] which provides resistance to the near-far problem. The received signal
for this approach (Figure 1.3) is correlated with a set of patterns which are each
orthogonal to the other interfering users. This approach increases the bit error rate
for AWGN but only marginally in high SNR conditions. The approach by Verdu and
Lupas requires that the interfering patterns be broken up into sub-intervals. A
matrix containing the cross-correlations of these waveforms is inverted and the
orthogonal pattern is created by multiplying the appropriate row of the matrix by the
desired user and m=2K-1 sub-divided interfering users' patterns.
[Figure: correlators with patterns through $p_m(t)$ and outputs through $y_m$, followed by a SGN() decision.]
Figure 1.3 - Block diagram of one-shot decorrelating detector
This paper builds on the Verdu-Lupas method by proposing a new method to
derive the orthogonal patterns without inverting a matrix. This new approach is then
examined along with the Verdu-Lupas method to see how it can be implemented
using systolic methods. Also presented in this paper is a correlator circuit which can
perform the multi-level correlation process needed for the new detectors. This
correlator uses delta-sigma modulation to provide a serial bit stream which can easily
be processed using standard VLSI digital logic and analog circuitry.
Review of Asymptotic Efficiency
One of the most important measures of performance in a communication
channel is its bit error rate (BER). For channels that are characterized by additive
white Gaussian noise (AWGN), this BER can be calculated using the "Q function",
represented by

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-y^2/2}\, dy$$

where x is a function of the SNR at the detector. Efficiency as used in [1] is defined
as the ratio between the effective SNR and the actual SNR. The effective SNR is
the SNR defined with only AWGN in the channel. The actual SNR includes
additional contributions due to the noise from other sources. The efficiency is always
non-negative and upper bounded by 1 since the actual SNR is the best possible SNR
in the channel. The efficiency definition provides a method by which we can analyze
communication systems.
One measure of multiuser system performance uses asymptotic efficiency,
which was defined by Verdu. This measure is defined as the limit of the efficiency as
the background noise level goes to zero. It provides a means to calculate the
performance loss due to other active users using the same channel as the user of
interest.
For the type of channel examined in this paper (i.e. binary, antipodal, AWGN),
the probability of error for an optimum detector using a correlator pattern matched
to the desired signal and followed by a SGN() decision is $Q(\sqrt{w}/\sigma)$. When K
multiple users are also in the channel, the kth user's energy ($w_k$) and the error
probability ($P_k$) are used to create the effective energy ($e_k$) where

$$P_k(\sigma) = Q\left(\sqrt{e_k(\sigma)}\,/\,\sigma\right)$$
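As a small numerical aid, the Q function can be evaluated through the complementary error function, using the identity Q(x) = erfc(x/√2)/2. The helper names below are illustrative and do not come from the thesis.

```python
import math

def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def ber_single_user(w: float, sigma: float) -> float:
    """Single-user error probability Q(sqrt(w) / sigma) for antipodal signalling."""
    return q_function(math.sqrt(w) / sigma)
```

For example, `ber_single_user(4.0, 1.0)` evaluates Q(2), which is smaller than `ber_single_user(1.0, 1.0)` = Q(1), reflecting the lower error rate at the higher received energy.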
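From [1], "the logarithm of the kth user error probability decays asymptotically
with the slope corresponding to a single user with energy $\eta_k w_k$." The asymptotic
efficiency for the kth user [12] is then

$$\eta_k = \lim_{\sigma \to 0} \frac{e_k(\sigma)}{w_k} = \sup\left\{ 0 \le r \le 1 ;\ \lim_{\sigma \to 0} \frac{P_k(\sigma)}{Q(\sqrt{r w_k}/\sigma)} < \infty \right\}$$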
In order to have an analytical tool for the near-far scenario with multiple
users, a measure is defined in [1] that predicts the performance of detectors which
operate over all received energies. The "near-far resistance" is defined as the worst
case asymptotic efficiency over all possible energies of interfering users. This
measure is defined as

$$\bar{\eta}_k = \inf_{w_j \ge 0,\ j \ne k} \eta_k$$

A detector is defined as near-far resistant for user k if the near-far resistance of user
k is non-zero.
Decorrelating Detector Review
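From [13] the sampled output of a normalized matched filter for the ith bit of
the kth user, i = -M...M, is

$$y_k(i) = \int_{iT+\tau_k}^{iT+T+\tau_k} r(t)\, s_k(t - iT - \tau_k)\, dt = \int_{-\infty}^{\infty} S(t,b)\, s_k(t - iT - \tau_k)\, dt + \int_{-\infty}^{\infty} n(t)\, s_k(t - iT - \tau_k)\, dt$$

since the signals are zero outside of [0,T].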
The linear detector for bit i of user k is characterized as $v^{k,i} \in L$, where L is
the Hilbert space of square-integrable functions. The decision of the detector is
given by the polarity of the inner product of $v^{k,i}$ and the vector y of the matched filter
outputs, which is equal to

$$\sum_{l=-M}^{M} \sum_{j=1}^{K} v_j^{k,i}(l)\, y_j(l) = \int S(t,b)\, s(t, v^{k,i})\, dt + n_{k,i} = \left\langle s(t, \sqrt{w}\,b),\ s(t, v^{k,i}) \right\rangle + n_{k,i}$$
$n_{k,i}$ is the noise component at the output of the cascade of the matched filter,
sampler and detector. It is a Gaussian zero-mean random variable with a variance of

$$E[n_{k,i}^2] = \sigma^2 \sum_{l,j} \sum_{m,p} v_j^{k,i}(l)\, v_p^{k,i}(m) \int s_j(t - lT - \tau_j)\, s_p(t - mT - \tau_p)\, dt = \sigma^2 \left\| s(t, v^{k,i}) \right\|^2$$
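The receiver decision on the ith bit of user k is

$$\hat{b}_k(i) = \mathrm{sgn}\left( \sum_{l=-M}^{M} \sum_{j=1}^{K} v_j^{k,i}(l)\, y_j(l) \right) = \mathrm{sgn}\left( \left\langle s(t, \sqrt{w}\,b),\ s(t, v^{k,i}) \right\rangle + n_{k,i} \right)$$

By restricting the time interval to just that of the bit we are trying to decode, then

$$\hat{b}_k(i) = \mathrm{sgn}\left( \left\langle s(\sqrt{w}\,b),\ s(v^{k}) \right\rangle + n_k \right)$$

This restriction results in the "one-shot" scenario. The goal now is to generate a
pattern $s(v^{k,i})$ which is insensitive to the energies of the interfering users.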
CHAPTER 2
MATHEMATICAL BASIS OF PROPOSED SOLUTION
The key point made in Lupas' paper [1] is that rather than correlating the
received signal with the same pattern used to send the signal, the matched filter at
the receiver is modified such that its response is orthogonal to the known interfering
users. This filter and detector were shown to be optimum in
terms of near-far resistance. Due to the mathematical complexity in generating the
response of the filter for a large number of users, Lupas investigated the "one-shot"
detector proposed by Verdu [3] and found that while it was sub-optimum, its near-far
resistance was closely comparable to the optimum. In this case, the basis vectors
used to despread the signal are derived by inverting the cross-correlation matrix of
the interfering patterns and the signal of interest. In order for this method to be
stable, i.e. for the matrix to be invertible, the following conditions must be met:
a. The signals must be linearly independent of each other in the bit interval
under consideration.
b. The transmission of bits of an interfering user and the desired user must
not be synchronous (i.e. arrive at the same time instant).
The first condition can be easily met by selection of the proper spreading
sequences. The second condition is violated if the bit delay time of one or more of
the interfering users coincides with the start of the bit of the user we are trying to decode. If
this occurs, then the left sub-bit is discarded and only the right sub-bit is used.
In practice, this approach would be difficult to implement in hardware since the
matrix inversion circuitry would have to allow for differing numbers of interfering
users.
Another method which avoids inverting a matrix but which can also be used to
generate the basis vector orthogonal to the interfering users is the Gram-Schmidt (GS)
procedure. Normally, the GS procedure is used to form orthonormal basis vectors
from a known set of original non-orthogonal vectors. Each of the original vectors can
be created by linearly combining the basis vectors. Let the original vectors be
represented as $\{v_1, v_2, v_3, \ldots, v_k\}$ and the basis vectors as $\{w_1, w_2, \ldots, w_k\}$. Each
original vector is then constructed by:

$$v_i = \sum_{j=1}^{i} a_{ij}\, w_j \quad \text{where } a_{ij} = \langle w_j, v_i \rangle$$
The GS procedure used to generate the basis vectors uses the following steps:
a) $w_1 = v_1 / \|v_1\|$, where $\|\cdot\|$ is the Euclidean norm
b) while $2 \le i \le k$
$z_i = v_i - \sum_{j=1}^{i-1} \langle v_i, w_j \rangle\, w_j$
$w_i = z_i / \|z_i\|$
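The final set of basis vectors is related to the original vectors by:

$$\begin{bmatrix} a_{11} & 0 & 0 & \cdots & 0 \\ a_{21} & a_{22} & 0 & \cdots & 0 \\ a_{31} & a_{32} & a_{33} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{k1} & a_{k2} & a_{k3} & \cdots & a_{kk} \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ \vdots \\ w_k \end{bmatrix} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \\ \vdots \\ v_k \end{bmatrix}$$

It is evident from the above formulas that the last basis vector $w_k$ formed by
the GS approach is orthogonal to the original vectors (except $v_k$), since the
coefficients $a_{ik} = 0$ except for $a_{kk}$.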
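The GS steps and the triangular relation above can be checked numerically. The following is a minimal sketch of the classical procedure; the three 4-element test vectors are arbitrary values chosen for illustration.

```python
import numpy as np

def gram_schmidt(vectors):
    """Return orthonormal rows w_1..w_k from linearly independent rows v_1..v_k."""
    basis = []
    for v in vectors:
        z = v.astype(float)
        for w in basis:
            z = z - np.dot(v, w) * w       # z_i = v_i - sum_j <v_i, w_j> w_j
        basis.append(z / np.linalg.norm(z))
    return np.array(basis)

v = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0, 1.0]])
w = gram_schmidt(v)
a = v @ w.T        # a[i, j] = <w_j, v_i>, a lower triangular matrix
```

The matrix `a` reproduces the original vectors through `a @ w`, and the last row of `w` is orthogonal to every input vector except the last one, which is the property the detector relies on.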
In order to use this method to find the basis vector which is orthogonal to the
interfering users, $v_k$ (i.e. the last vector in the process) in the above formulas must be
the original signal of the user of interest. The resulting basis vector $z_k$ is then the
same vector as that generated by the correlation matrix inversion approach taken in
[1].
As an example, the two-user case presented in [1] is re-examined here using the
proposed GS procedure. Using the terminology of [1], the two users' signals can
be represented as three users: one is the signal of interest, and the other two are the
left and right sub-bits of the interfering user, as seen in Figure 2.1.
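[Figure: one bit interval from 0 to T; $v_3$ spans the whole interval, while $v_1$ and $v_2$ occupy the sub-intervals on either side of $\tau_2$.]
$v_1(t)$ is non-zero for $0 < t < \tau_2$ and zero for $\tau_2 < t < T$
$v_2(t)$ is non-zero for $\tau_2 < t < T$ and zero for $0 < t < \tau_2$
$v_3(t)$ is non-zero for $0 < t < T$
$\tau_2$ = delay of user 2 with respect to user 1
Figure 2.1 - Illustration of one shot break-up of interfering user

let $\rho_{21} = \langle s_2^L, s_1 \rangle$, $\rho_{12} = \langle s_2^R, s_1 \rangle$, $e_2 = \langle s_2^L, s_2^L \rangle$, $1 - e_2 = \langle s_2^R, s_2^R \rangle$
let $v_1 = s_2^L$, $v_2 = s_2^R$, $v_3 = s_1$

$$w_1 = v_1 / \|v_1\| = s_2^L / \langle s_2^L, s_2^L \rangle^{1/2}$$
$$w_2 = z_2 / \|z_2\|, \quad z_2 = v_2 - \langle v_2, w_1 \rangle w_1 = v_2 \ \text{ since } \langle v_2, w_1 \rangle = 0$$
$$w_2 = v_2 / \|v_2\| = s_2^R / \langle s_2^R, s_2^R \rangle^{1/2}$$
$$w_3 = z_3 / \|z_3\|, \quad z_3 = v_3 - \left\langle v_3, \frac{v_1}{\|v_1\|} \right\rangle \frac{v_1}{\|v_1\|} - \left\langle v_3, \frac{v_2}{\|v_2\|} \right\rangle \frac{v_2}{\|v_2\|}$$
$$z_3 = s_1 - \frac{\langle s_1, s_2^L \rangle}{\langle s_2^L, s_2^L \rangle} s_2^L - \frac{\langle s_1, s_2^R \rangle}{\langle s_2^R, s_2^R \rangle} s_2^R$$
$$w_3 = \frac{s_1 - \frac{\rho_{21}}{e_2} s_2^L - \frac{\rho_{12}}{1 - e_2} s_2^R}{\|z_3\|}$$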
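The numerator in the final expression for $w_3$ is recognized as eq. 4.152 in [1],
which is the new basis pattern. The denominator $\|z_3\|$ is the square root of both the
efficiency and the near-far resistance as defined in [1]. It can be calculated as
follows:

$$w_3 = z_3 / \|z_3\|$$
$$\|z_3\| = \sqrt{\langle z_3, z_3 \rangle}$$
$$= \sqrt{\left\langle s_1 - \frac{\rho_{21}}{e_2} s_2^L - \frac{\rho_{12}}{1-e_2} s_2^R,\ s_1 - \frac{\rho_{21}}{e_2} s_2^L - \frac{\rho_{12}}{1-e_2} s_2^R \right\rangle}$$
$$= \sqrt{\langle s_1, s_1 \rangle - \frac{\rho_{21}}{e_2} \langle s_1, s_2^L \rangle - \frac{\rho_{12}}{1-e_2} \langle s_1, s_2^R \rangle}$$
$$= \sqrt{1 - \frac{\rho_{21}^2}{e_2} - \frac{\rho_{12}^2}{1-e_2}}$$

The normalized power in the waveform is the square of the above result. The
efficiency is $1 - \rho_{21}^2/e_2 - \rho_{12}^2/(1-e_2)$, which is recognized as equation 4.150 in [1].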
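The two-user result can be checked numerically against the matrix inversion route. The sketch below assumes chip-rate sampling, two hypothetical 7-chip signatures and a delay of two chips; none of these values come from the thesis.

```python
import numpy as np

chips = 7
s1 = np.array([1, 1, 1, -1, -1, 1, -1]) / np.sqrt(chips)   # user of interest
s2 = np.array([1, -1, 1, 1, 1, -1, -1]) / np.sqrt(chips)   # interfering user
tau = 2                                                     # delay of user 2 in chips

# Left and right sub-bits of the interferer inside user 1's bit interval.
s2_left = np.zeros(chips)
s2_left[:tau] = s2[chips - tau:]
s2_right = np.zeros(chips)
s2_right[tau:] = s2[:chips - tau]

rho21 = s2_left @ s1
rho12 = s2_right @ s1
e2 = s2_left @ s2_left

# Gram-Schmidt pattern with the user of interest entered last.
z3 = s1 - (rho21 / e2) * s2_left - (rho12 / (1 - e2)) * s2_right
efficiency = 1 - rho21**2 / e2 - rho12**2 / (1 - e2)

# Matrix inversion route: last row of the inverse cross-correlation matrix.
V = np.vstack([s2_left, s2_right, s1])
R_inv = np.linalg.inv(V @ V.T)
z3_inv = (R_inv[2] @ V) / R_inv[2, 2]
```

For these values the squared norm of `z3` equals the efficiency expression (24/35), and `z3` coincides with the pattern obtained from the inverted cross-correlation matrix once both are scaled to have a unit coefficient on `s1`.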
Extension for Three or More Users
In the two user case, the derivation of the orthonormal basis vectors was
made easier by the fact that the interfering user is split into two parts which are
automatically orthogonal to each other. This section derives the basis vector for the
3 user case and then shows how this can be expanded to N users. Also presented is
a simplified formula which ignores terms that have little effect on the final waveform
and efficiency. This new algorithm reduces the computational complexity
required to implement the GS system.
In order to make the terminology easier to follow, each interfering user's sub-bit
will be numbered as a separate user. The normalization process is rearranged so
that square root operations are avoided. This approach will ease the computational
complexity of the circuits used later to implement this algorithm. The delays
between users will be shown as increasing without loss of generality. The derivation
assumes that the delay of each user is independent of the others and is equally
probable on the interval [0,T]. A diagram showing the setup for the 3 user case is
shown in Figure 2.2.
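[Figure: one bit interval from 0 to T with the delays $\tau_2$ and $\tau_3$ marked; User 1 carries $v_5$, and the sub-bits of User 2 and User 3 are labeled $v_1$ through $v_4$.]
Figure 2.2 - Three user case bit interval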
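The steps used to compute the basis vectors using the GS algorithm are:

let $\rho_{ij} = \langle v_i, z_j \rangle$ and $e_j = \langle z_j, z_j \rangle$
$z_1 = v_1$
$z_2 = v_2 - \rho_{21} z_1 / e_1$
$z_3 = v_3 - \rho_{31} z_1 / e_1 - \rho_{32} z_2 / e_2$
$z_4 = v_4 - \rho_{41} z_1 / e_1 - \rho_{42} z_2 / e_2 - \rho_{43} z_3 / e_3$
$z_5 = v_5 - \rho_{51} z_1 / e_1 - \rho_{52} z_2 / e_2 - \rho_{53} z_3 / e_3 - \rho_{54} z_4 / e_4$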
In the case where one or more of the interfering users' bit delays is zero, the
terms with the discarded bits are just set to zero (the hardware circuits can treat
the 0/0 division as 0). As can be seen from the formula for $z_5$, as the number of
users diminishes, the resulting pattern used to correlate the received signal
approaches the desired user's original pattern. The above formula for the 3 user case
can easily be expanded to multiple users using the recurrence equation

$$z_i = v_i - \sum_{j<i} \rho_{ij}\, z_j / e_j$$

The key point in implementation is that the user of interest
must be the last vector in the GS process. The interfering users can be input in any
order, though normally the left and right sub-intervals would be paired together.
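The recurrence can be sketched in a few lines. The function name `gs_patterns`, the helper `sub_bits`, the 7-chip signatures and the chip delays below are all hypothetical choices made for illustration; an all-zero (discarded) sub-bit is handled by skipping its term, mirroring the 0/0 = 0 convention described above.

```python
import numpy as np

def gs_patterns(patterns):
    """z_i = v_i - sum_{j<i} rho_ij z_j / e_j, rho_ij = <v_i, z_j>, e_j = <z_j, z_j>."""
    z_list, e_list = [], []
    for v in patterns:
        z = np.array(v, dtype=float)
        for z_j, e_j in zip(z_list, e_list):
            if e_j > 0.0:                          # discarded sub-bit: treat 0/0 as 0
                z -= (np.dot(v, z_j) / e_j) * z_j
        z_list.append(z)
        e_list.append(float(np.dot(z, z)))
    return np.array(z_list)

def sub_bits(s, tau):
    """Left and right sub-bits of an interferer delayed by tau chips."""
    left, right = np.zeros(len(s)), np.zeros(len(s))
    left[:tau] = s[len(s) - tau:]
    right[tau:] = s[:len(s) - tau]
    return left, right

chips = 7
s1 = np.array([1, 1, 1, -1, -1, 1, -1]) / np.sqrt(chips)   # user of interest
s2 = np.array([1, -1, 1, 1, 1, -1, -1]) / np.sqrt(chips)
s3 = np.array([1, 1, -1, 1, -1, -1, 1]) / np.sqrt(chips)

# Interferers first (paired sub-bits), user of interest last.
patterns = np.array([*sub_bits(s2, 2), *sub_bits(s3, 4), s1])
z5 = gs_patterns(patterns)[-1]
```

No square roots are taken, and the last vector `z5` is orthogonal to all four interfering sub-bit patterns. Calling `gs_patterns` with a zero-delay interferer, whose left sub-bit is all zeros, still returns finite values.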
Reduction of Computational Complexity
If the above formulas for $z_i$ are examined more closely, some of the terms
contribute little or nothing to the final pattern. The most obvious is the last term in
the equation for $z_2$. Since the $v_1$ and $v_2$ patterns do not overlap (by definition, if the
interfering users are entered paired), $\rho_{21}$ will always be zero and the term does not
need to be computed. The last term in $z_3$ appears to be equal to 0 by a similar
argument, but had the delay for user 3 been less than that of user 2, $\rho_{32}$ would not be 0,
so this term cannot be deleted. For

$$\rho_{43} = \left\langle v_4, \left( v_3 - \rho_{31} z_1 / e_1 - \rho_{32} z_2 / e_2 \right) \right\rangle = \langle v_4, v_3 \rangle - \frac{\rho_{31} \langle v_4, z_1 \rangle}{e_1} - \frac{\rho_{32} \langle v_4, z_2 \rangle}{e_2}$$

$\langle v_4, v_3 \rangle = 0$, and either $\rho_{41}$ or $\rho_{32}$ will be 0. In the case shown in Figure 2.2, $\rho_{32} = 0$
and $\rho_{31}\rho_{41}$ form a second order product of cross correlations which, with typical
spreading patterns, will contribute little to the generation of the basis vector.
Simulation of a set of maximal patterns has shown that this term can be ignored. The
new reduced equations (ignoring relative delays) are then

$z_1 = v_1$
$z_2 = v_2$
$z_3 = v_3 - \rho_{31} z_1 / e_1 - \rho_{32} z_2 / e_2$
$z_4 = v_4 - \rho_{41} z_1 / e_1 - \rho_{42} z_2 / e_2$
$z_5 = v_5 - \rho_{51} z_1 / e_1 - \rho_{52} z_2 / e_2 - \rho_{53} z_3 / e_3 - \rho_{54} z_4 / e_4$
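This reduced approach can also be extended to multiple users. In the case of 4 users,
the above equations hold and

$z_6 = v_6 - \rho_{61} z_1 / e_1 - \rho_{62} z_2 / e_2 - \rho_{63} z_3 / e_3 - \rho_{64} z_4 / e_4$
$z_7 = v_7 - \rho_{71} z_1 / e_1 - \rho_{72} z_2 / e_2 - \rho_{73} z_3 / e_3 - \rho_{74} z_4 / e_4 - \rho_{75} z_5 / e_5 - \rho_{76} z_6 / e_6$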
If the restriction is added that the sub-interval patterns are input into the circuit
in order of increasing delay (with each interfering user's sub-intervals paired),
then the number of computational blocks can be reduced further. This reduction is
due to the cross correlation of $z_1$ and $v_{even}$ always being zero (i.e. $\rho_{even,1} = 0$). As the
number of users increases, the cross correlation of $z_i$ and $v_j$, where i is odd and j is
greater than i, is also zero.
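The effect of dropping the paired term can be illustrated with a small sketch. The `reduced` flag below is one reading of the reduced equations (each right sub-bit skips the projection onto its own left sub-bit); the signatures and delays are hypothetical 7-chip values and are far shorter than the sequences used in the thesis.

```python
import numpy as np

def gs_patterns(patterns, reduced=False):
    """Square-root-free GS; reduced=True drops each right sub-bit's paired term."""
    z_list, e_list = [], []
    last = len(patterns) - 1
    for i, v in enumerate(patterns):
        z = np.array(v, dtype=float)
        for j, (z_j, e_j) in enumerate(zip(z_list, e_list)):
            paired = reduced and i != last and i % 2 == 1 and j == i - 1
            if e_j > 0.0 and not paired:
                z -= (np.dot(v, z_j) / e_j) * z_j
        z_list.append(z)
        e_list.append(float(np.dot(z, z)))
    return np.array(z_list)

def sub_bits(s, tau):
    """Left and right sub-bits of an interferer delayed by tau chips."""
    left, right = np.zeros(len(s)), np.zeros(len(s))
    left[:tau] = s[len(s) - tau:]
    right[tau:] = s[:len(s) - tau]
    return left, right

s1 = np.array([1, 1, 1, -1, -1, 1, -1]) / np.sqrt(7)    # user of interest
s2 = np.array([1, -1, 1, 1, 1, -1, -1]) / np.sqrt(7)
s3 = np.array([1, 1, -1, 1, -1, -1, 1]) / np.sqrt(7)
patterns = np.array([*sub_bits(s2, 2), *sub_bits(s3, 4), s1])

def leak(z):
    """Largest interferer response relative to the desired user's response."""
    return np.max(np.abs(patterns[:4] @ z)) / abs(z @ s1)

leak_std = leak(s1)                                   # standard correlator
leak_full = leak(gs_patterns(patterns)[-1])           # full GS
leak_red = leak(gs_patterns(patterns, reduced=True)[-1])
```

In this toy case the full GS pattern rejects the interferers completely, the reduced pattern leaves a small residual, and the standard correlator leaves a residual several times larger. The magnitudes depend on the chosen sequences and should not be compared with the simulation figures reported in the thesis.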
Simulation Results
In order to determine how ignoring the last term for $z_3$ affects the modified GS
approach, simulation was performed using three 31-chip maximal sequences. The
interfering users were each delayed from 0 to 30 chip intervals within each bit
interval, and the output of the normalized correlator was sampled at the time of
maximum SNR. Figure 2.3 shows four graphs detailing the results. The standard
correlator is a correlator matched only to the pattern of the user of interest.
The modified GS correlator uses a signal closely orthogonal to the interfering
users. This signal is recalculated for each relative shift of all interfering users. The
3D plots show the relative delays of the interfering users in the X and Y directions.
The Z direction is the relative output from the correlator. The lower graphs show
actual levels looking left into the 3D graphs. Simulation of the full GS correlator was
also performed and its output was flat at a value of 1 for all relative delays. Table 2.1
shows the average SNR and worst case peak SNR at the output of the correlator.
Correlator    Average SNR    Lowest SNR    σ²
Standard      14.6 dB        6.9 dB        34E-3
Modified      44.4 dB        24.3 dB       36E-6
Table 2.1 - Simulation results
From this data we see that on average, the total power of the interfering users
can be increased by about 30 dB, though for the worst case delay the increase is only
17 dB. If more rejection of the interfering users is required, the full GS approach can be used.
[Figure: four panels. The upper panels are 3D plots titled "Standard Correlator" and "Modified GS Correlator"; each lower panel is titled "Side view of Above". Vertical axes run from 0.2 to 1.6 and horizontal axes from 0 to 40.]
Figure 2.3 - Normalized output from standard and modified GS Correlators.
The X and Y axes of the 3D plots are the relative chip delays (1-31) of the two
interfering users' patterns with respect to the desired user's pattern.
CHAPTER 3
IMPLEMENTATION OF PROPOSED SOLUTION
This section examines how the near-far resistant "one-shot" correlator and
detector could be implemented. Appendix A contains a study of the "one-shot"
proposal by Verdu and Lupas. The main difficulty found in their algorithm is the
circuitry required to invert the correlation matrix. Since the solution proposed in this
thesis does not require matrix inversion, this roadblock is avoided. Both the full GS
method and the modified GS method are presented and the tradeoffs compared.
Figure 3.1 shows a block diagram containing the top level functions. The purpose of the GS
Orthogonal Signal Generator (GSOSG) is to receive the known patterns of
the interfering users and the user of interest from the synchronizer and pattern
generators. Each $v_i$ pattern consists of +/- l, where l is a constant, in the
sub-interval of interference and 0 in the other sub-interval. For the user of interest,
the pattern will always be +/- 1. Better performance can be achieved by using
bandlimited pattern signals since the received waveform will also be bandlimited due
to constraints on the channel and receiver circuitry. In the latter case, the $v_i$ would
be analog signals.
[Figure: patterns V1 through V5 enter the Gram-Schmidt Orthogonal Signal Generator; r(t) passes through a FIFO delay to the correlator and detector.]
Figure 3.1 - Block diagram of one-shot correlator and detector
The received signal r(t) is processed by an oversampling A/D and digitized into
a one-bit sequence. This digital sequence is then delayed by the interval
required by the GSOSG to create the near-far resistant pattern. The digitized
received signal and the near-far resistant pattern are correlated using a "Delta-Sigma
Correlator" (multiply, add and decimate portion) that is described in the next
chapter. This new correlator is also used in the GSOSG block.
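The multiply, add and decimate idea can be sketched with a simple software model. The first-order modulator, the oversampling ratio of 64 samples per chip, the binary 7-chip patterns and the noise-free input below are all assumptions made for illustration; the modulators actually examined in the thesis are described in chapter 4.

```python
import numpy as np

def delta_sigma_1bit(x):
    """First-order delta-sigma modulator (an assumed topology).

    Maps samples with |x| <= 1 to a +/-1 stream whose running sum tracks
    the running sum of x to within a bounded error.
    """
    v = 0.0                                  # integrator state
    y = np.empty(len(x))
    for n, sample in enumerate(x):
        v += sample                          # integrate the input
        y[n] = 1.0 if v >= 0.0 else -1.0     # 1-bit quantizer
        v -= y[n]                            # feed the output back
    return y

def correlate_bitstream(bits, pattern, osr):
    """Multiply by the chip pattern, add, and decimate to one value per bit."""
    p = np.repeat(pattern, osr)              # hold each chip for osr samples
    return float(np.sum(bits * p)) / len(p)

osr = 64                                     # samples per chip (hypothetical)
s1 = np.array([1, 1, 1, -1, -1, 1, -1], dtype=float)
s2 = np.array([1, -1, 1, 1, 1, -1, -1], dtype=float)
r = np.repeat(0.4 * s1 + 0.3 * s2, osr)      # noise-free received samples

direct = float(np.mean(r * np.repeat(s1, osr)))
one_bit = correlate_bitstream(delta_sigma_1bit(r), s1, osr)
```

Because the integrator state of this modulator stays within [-1, 1], the accumulated error over any one chip is at most 2 samples, so the one-bit correlation differs from the direct correlation by at most 2/osr. Raising the oversampling ratio tightens that bound, at the cost of a faster sample clock.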
Figure 3.2 - Block diagram of GSOSG
Figure 3.2 shows the block diagram for the GSOSG, which uses all of the terms
in the equations. The shaded blocks, whose functions are shown in Figure 3.3, are
responsible for determining the energy in the sub-interval and delaying the pattern
by one bit cycle. The energy is obtained by multiplying the signal by itself and
summing the results through the entire bit interval. The resulting desired N bits
are latched at the end of the bit interval. This result is then used to control the
attenuation of a programmable amplifier which operates on the delayed digitized
input signal.
[Figure: block labels In, Correlator, Decimator, Latch, FIFO (1 Bit Delay), Attenuator, and output Zi/e.]
Figure 3.3 - Shaded module block diagram
The non-shaded blocks shown in Figure 3.4 perform a similar function. The
input signal is converted to a 1 bit digital signal using the delta-sigma A/D convertor.
This signal is then fed, along with the decimated and digitized z_i, into the
delta-sigma correlator.
[Diagram label: FIFO (1 bit delay)]
Figure 3.4 - Non-shaded module block diagram
This correlator is simplified since it only needs to be an N bit by one bit
multiply. The output of the correlator forms the correlation coefficient between the
input signal and z_i. The result is latched at the end of the bit interval and controls
the gain of the amplifier, which forms the correspondingly scaled pattern. This
product is then added to the delayed input signal, summed and low pass filtered to
remove the undesirable noise due to the convertor. This forms an analog output
signal which is fed into the next block. This technique allows for variable chip delay
intervals between interfering users.
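The roles of the two block types can be illustrated in software. The following is a minimal sketch of ordinary Gram-Schmidt orthogonalization on sampled patterns, and it is not the thesis recursion or its hardware. The function names, the 7-chip patterns and the tolerance are hypothetical. The energy computation plays the part of the shaded block, and each coefficient rho plays the part of the correlator output in the non-shaded block.

```python
def dot(a, b):
    """Correlation of two sampled patterns over one bit interval."""
    return sum(x * y for x, y in zip(a, b))


def orthogonal_pattern(desired, interferers, tol=1e-12):
    """Return `desired` with its components along every interferer removed."""
    basis = []
    for v in interferers:
        w = list(v)
        for e in basis:
            rho = dot(w, e)                      # correlator output
            w = [wi - rho * ei for wi, ei in zip(w, e)]
        energy = dot(w, w)                       # pattern multiplied by itself, summed
        if energy > tol:                         # skip linearly dependent patterns
            scale = energy ** -0.5
            basis.append([wi * scale for wi in w])
    z = list(desired)
    for e in basis:
        rho = dot(z, e)
        z = [zi - rho * ei for zi, ei in zip(z, e)]
    return z


# Hypothetical 7-chip example: each interferer is non-zero in one sub-interval only.
user = [1, -1, 1, 1, -1, 1, -1]
v2 = [1, 1, -1, 1, 0, 0, 0]
v3 = [0, 0, 0, -1, 1, 1, -1]
z = orthogonal_pattern(user, [v2, v3])
print([round(dot(z, v), 12) for v in (v2, v3)])
```

The resulting pattern takes several non-binary levels, which is why a multi-level correlator is needed downstream.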
Implementing the full GS algorithm requires N(2N-1)-1 total processing blocks.
It also requires a time delay of 2N bit intervals plus the time required for the delay
through the decimators and filters. This approach, however, does allow real-time,
on-the-fly processing of the incoming waveform. This total processing unit decodes
only one user, so decoding N users requires N such processing units.
As shown previously, some of the computational complexity could be removed
at a cost of reduced performance. Although the reduction in the number of
processing elements is minimal, the organization of the block diagram can be
changed (Figure 3.5) to allow some parallel processing and thus a reduction in the
time complexity to N bit delays for N users. The detailed processing blocks are
similar to those shown in Figures 3.4 and 3.3 for the blank and shaded elements,
respectively.
Figure 3.5 - Modified GS block diagram
Although these functional blocks were derived to use delta-sigma modulation,
conventional A/D techniques could also be used. The delta-sigma approach was
taken to allow easier VLSI implementation of the circuits.
Another approach that could be taken to implement the GSOSG is to use
digital signal processors at each processing block. This would keep all of the
intermediate results in digital form and improve the accuracy of the final result. The
amount of delay required for r(t) would depend on the number of DSPs used and
how fast they operate.
Simulation Results
In order to implement the GSOSG block, the precision from the delta-sigma
A/D convertors must be better than that required in correlating the final signal in the
delta-sigma correlator examined in the next chapter. In the final correlation case, the
object is to determine the polarity of the message bit. In the GSOSG case, the goal is
to generate a new pattern which is orthogonal to the interfering users. Any error
that is introduced will tend to cause the generated signal to not be truly orthogonal.
Of course, the final pattern needs to be near-far resistant only over the range of
the front-end circuitry.
To get an idea of the accuracy required, simulations were performed which
investigated some of the delta-sigma modulators from Chapter 4 and various
decimation strategies. It was found, for a randomly chosen pair of relative delays,
that the 2ndOrder modulator with triangular decimation (~9 bits) provides good
near-far resistance for the three user case. A summary of the results is shown in
Table 3.1. The multiplication and division operations were done with full floating
point precision. Quantization of these operations will cause additional error in the
generation of the orthogonal pattern.
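The decimation strategies compared in the table can be pictured as a weighted average of the 1 bit modulator output over a window. The sketch below is illustrative only: the window definitions are the common textbook forms, normalized to unit sum, and may differ from those used in the thesis simulations. The function names and the window length of 128 are assumptions.

```python
import math


def window(kind, length):
    """Return a decimation window normalized so that its weights sum to one."""
    if kind == "boxcar":
        w = [1.0] * length
    elif kind == "triangular":
        half = (length + 1) / 2.0
        w = [1.0 - abs((n + 1) - half) / half for n in range(length)]
    elif kind == "blackman":
        w = [0.42
             - 0.5 * math.cos(2.0 * math.pi * n / (length - 1))
             + 0.08 * math.cos(4.0 * math.pi * n / (length - 1))
             for n in range(length)]
    else:
        raise ValueError("unknown window: " + kind)
    total = sum(w)
    return [x / total for x in w]


def decimate(bits, kind):
    """Collapse a block of +/-1 modulator outputs into one multi-level sample."""
    weights = window(kind, len(bits))
    return sum(b * c for b, c in zip(bits, weights))


# A hypothetical bit stream whose average level is 0.5 (three +1 for every -1).
stream = [1, 1, 1, -1] * 32
for kind in ("boxcar", "triangular", "blackman"):
    print(kind, decimate(stream, kind))
```

Tapered windows weight the middle of the block more heavily, which reduces the contribution of the shaped high-frequency quantization noise compared with the boxcar.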
Oversampling  Modulator        Window for   m.s.e.       ρ12        ρ13       ρ11
Ratio                          Decimator    (σ²/chip)
128           k=.25 2ndOrder   Boxcar       1.0E-6       0.002      0.0065    0.8865
128           2ndOrder         Triangular   2.2E-8      -1.55E-4   -0.0033    0.8908
128           2ndOrder         Blackman     4.1E-8       1.01E-4    0.0123    0.8939
128           Proj_32          Boxcar       6.78E-7     -0.0019    -0.0092    0.8915
128           Proj_32          Triangular   1.1E-7      -0.0027    -0.0039    0.8865
128           Proj_32          Blackman     8.05E-8     -3.8E-6    -0.0061    0.8880
64            k=.25 2ndOrder   Boxcar       3.925E-6    -0.0024     0.0404    0.8880
64            2ndOrder         Triangular   6.29E-7     -0.0034     0.0287    0.8976
64            2ndOrder         Blackman     3.26E-7     -0.0012    -0.0164    0.8825
64            Proj_32          Boxcar       4.0E-6      -0.0025     0.0079    0.8825
64            Proj_32          Triangular   5.269E-7     0.0189     0.0047    0.8902
64            Proj_32          Blackman     1.55E-7      0.0011     0.0113    0.8902
Table 3.1 - Results of simulation
The mean square error is the average variance of the difference between the
ideal pattern and the actual output from the simulation for each chip interval. The
correlation coefficients between the simulated output pattern and the three users are
listed as ρ12, ρ13 and ρ11 in the table. Proj_32 is a fifth order delta-sigma modulator
which is optimized for minimum noise in the region of 0 to π/32 (2π is the sampling
frequency).
The 2ndOrder modulator with triangular decimation at an oversampling rate of
128 provides good performance and is relatively implementable. Further simulations
were performed with these selections over all possible chip delays of the interfering
users. The results are shown in the graphs of Figure 3.6. The top left graph shows a
three dimensional picture with the relative delays on the x and y axes and the
normalized output on the z axis. A side view of this graph is shown in the lower left
plot. The lower right graph is a typical output pattern from the GSOSG.
[Figure 3.6 panels: "Output of Correlator vs Delays" (top left), "Side View of
Above" (lower left), "Histogram of Output of Correlator" (top right), "Sample
Pattern from GSOSG" (lower right)]
Figure 3.6 - Simulation results of entire system for three users
Note the various levels for each chip interval. This requires that a non-binary
correlator be used. The top right graph is a histogram of the normalized output from
all the delays, showing the distribution due to the error introduced by the delta-sigma
modulators. The variance of the noise caused by the interfering users for all relative
delays is 1.6E-5, which suggests an SNR due to the multiple users of about 48 dB.
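The 48 dB figure follows directly from the quoted variance when the correlator output is normalized to unity. A short check, assuming a signal power of 1.0 (the function name is hypothetical):

```python
import math


def snr_db(signal_power, noise_variance):
    """Signal-to-noise ratio in dB from a signal power and a noise variance."""
    return 10.0 * math.log10(signal_power / noise_variance)


# Normalized correlator output (power 1.0) and the variance quoted in the text.
print(round(snr_db(1.0, 1.6e-5), 1))  # prints 48.0
```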
Analysis of Error Contribution due to A/D Quantization
This section examines how well the system SNR performs with respect to the
number of users in the channel (K), the number of chips per message bit (N), and the
quantization of the internal signals due to the ΔΣ A/D convertors. In most practical
systems N will be greater than 31 and the number of users will be larger than three.
These values were used in the simulations of the system in order to limit the amount
of computation. The following derivations give some idea of the bit resolution
requirements on the A/D convertor for other values of K, N and SNR.
When the patterns are converted from analog to digital form in the various
blocks, the ΔΣ modulator, decimator and other functions introduce an error with a
variance of σ_ds^2. If the assumption is made that the errors from each block are
independent and uncorrelated, and that there is no correlation between adjacent chip
intervals, then the total variance of the signal at the output of the system is
$$\sigma_{output}^2 = K(2K-1)\,N\,\sigma_{ds}^2$$
where K is the number of users, K(2K-1) is the number of processing blocks, and N is
the number of chips per message bit.
In Figure 3.6 it was shown that the error distribution was approximately
gaussian. To make the estimate less complicated, it will be assumed that the error at
the output is uniformly distributed. The formula relating the variance of a uniform
distribution to its quantizing level Δ is
$$\sigma^2 = \frac{\Delta^2}{12}$$
Substituting this into the previous equation and solving for Δ_output gives
$$\frac{\Delta_{output}^2}{12} = K(2K-1)\,N\,\sigma_{ds}^2$$
$$\Delta_{output} = 2\sqrt{3K(2K-1)\,N\,\sigma_{ds}^2}$$
Since the noise generated from this digitization process is random, we can set a
bound on the worst case SNR by limiting the size of Δ_output. Since the error is zero
mean, we need only limit the maximum noise level to Δ_output/2. Assuming
that the output signal has been normalized, the SNR at the output of the system is
$$\mathrm{SNR}_{output} = 20\log\left(\frac{1}{\Delta_{output}/2}\right)$$
Solving for the quantizing step of the A/D convertor
$$\sigma_{ds}^2 = \frac{\Delta_{output}^2}{12K(2K-1)N} = \frac{\Delta_{ds}^2}{12}$$
$$\Delta_{ds} = \sqrt{\frac{\Delta_{output}^2}{K(2K-1)N}}$$
The bit resolution of the A/D convertor is then
$$\#\,\mathrm{bits} = \frac{\log(1/\Delta_{ds})}{\log(2)}$$
A set of graphs using the above equations is shown in Figures 3.7, 3.8, 3.9 and
3.10 for 3, 5, 10, and 20 users respectively, for the worst case SNR desired. In the 3
user case examined previously with the 2ndOrder modulator and triangular
decimation, the worst case SNR was 35.23 dB. From Figure 3.7, the equivalent
resolution required of the A/D is 9 bits, which agrees with the modulator/decimator
pair. Since the 2ndOrder modulator is capable of more bits of resolution when a
better decimator and bandlimiting of the input signals are used, the SNR could be
improved. With a 16 bit resolution, it is possible to handle 20 users with a 1023 chip
sequence.
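The chain of equations in this section can be collected into a small calculation. The sketch below assumes the following reading of the derivation: a normalized output, SNR = 20 log10(2/Δ_output), Δ_ds = Δ_output / sqrt(K(2K-1)N), and a resolution of log2(1/Δ_ds) bits. The function names are hypothetical.

```python
import math


def output_step(snr_db):
    """Output quantizing level for a worst case SNR, with a normalized signal."""
    return 2.0 * 10.0 ** (-snr_db / 20.0)


def adc_bits(users, chips, snr_db):
    """A/D resolution in bits for K users, N chips per bit and a target SNR."""
    blocks = users * (2 * users - 1)            # K(2K-1) processing blocks
    delta_ds = output_step(snr_db) / math.sqrt(blocks * chips)
    return math.log(1.0 / delta_ds) / math.log(2.0)


# Three user case from the text: N = 31 chips, worst case SNR of 35.23 dB.
print(adc_bits(3, 31, 35.23))
for k in (3, 5, 10, 20):
    print(k, [round(adc_bits(k, 1023, snr), 1) for snr in (20, 40, 60, 80)])
```

For K=3, N=31 and 35.23 dB this evaluates to a little over 9 bits, in line with the 9 bits read from Figure 3.7.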
[Figures 3.7 to 3.10 each plot "A/D Bit Resolution for SNR" against "Number of
Chips/Message Bit" (31, 63, 127, 255, 511, 1023), with one curve each for 20 dB,
40 dB, 60 dB and 80 dB SNR.]
Figure 3.7 - A/D resolution required for three user case
Figure 3.8 - A/D resolution required for five user case
Figure 3.9 - A/D resolution required for ten user case
Figure 3.10 - A/D resolution required for twenty user case
Analysis of Error due to Delay Resolution
It was assumed previously that the relative delays between user chip intervals
were known. This section examines how the resolution of the delay intervals affects
the SNR at the output of the correlator. Since the signal will be sampled at discrete
time intervals for processing by the proposed system, the sampling rate will affect the
output of the final correlator.
The output of a correlator using a pattern matched to the user exhibits a
familiar triangular function, as seen in Figure 3.11. Here τ is the relative delay from
the peak output of the correlator. Maximum output from the correlator is at τ=0.
As the delay varies from 0 to T (the chip interval length), the output of the correlator
decays linearly as
$$R_{signal}(\tau) = 1 - \frac{|\tau|}{T}, \quad \text{where } |\tau| < T$$
for a normalized pattern. For the GSOSG pattern the result is similar, so the above
formula will be used.
[Figure 3.11 plot: triangular correlator output over the delay axis from -T to T,
peaking at 0]
Figure 3.11 - Normalized output of correlator for one chip interval
In the situation proposed in this thesis, the received signal r(t) is correlated with
a pattern that is orthogonal to the interfering users at τ=0. At delays other than τ=0
there will be some correlation between the newly generated pattern and the
interfering users, which will reduce the SNR and thus the near-far resistance. For
ease of analysis, it is assumed that one possible output (others exist) of the correlator
using the generated pattern and just one interfering user signal as input is
$$R_{noise}(\tau) = c\,\frac{|\tau|}{T}$$
where c is the maximum value the cross-correlation achieves. This is seen
graphically in Figure 3.12.
[Figure 3.12 plot: R(τ) over the delay axis from -T to T, zero at τ=0 and rising to c]
Figure 3.12 - Output of correlator with interfering user as input
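The SNR at the output of the correlator when r(t) is the input is
$$\mathrm{SNR}(\tau) = \frac{R_{signal}(\tau)}{R_{noise}(\tau)} = \frac{1-|\tau|/T}{c\,|\tau|/T} = \frac{1}{c}\left(\frac{T}{|\tau|}-1\right), \quad \text{where } \tau \neq 0$$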
This result shows that the SNR is directly related to the resolution of the delay τ.
Notice that at τ=0 the SNR will be limited only by other noise sources in the channel.
As |τ| increases, the SNR decreases. This means that the tracking, synchronization
and sampling of the received signal r(t) will directly affect the maximum near-far
resistance possible with the proposed circuit. Figure 3.13 shows the simulated output
of the proposed system for a three user case for a random selection of delays for the
users. The x axis is the chip interval, where 64 is τ=0. On the left hand side of the
graph, the two interfering users' outputs from the correlator add together, and the
effect of sampling the correlator too early causes a dramatic decrease in the SNR.
On the right side of the graph, the two interfering users' outputs from the correlator
tend to cancel, and the SNR decays close to linearly as expected. The dashed line is
the SNR from the output of a correlator using a pattern matched to the desired user
rather than that generated by the GSOSG circuitry.
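The dependence of the SNR on the delay error can be evaluated numerically from the simplified model of this section, a triangular signal correlation divided by a linearly growing interference correlation. In the sketch below the peak cross-correlation c = 0.1 and the worst case timing error of half a sample, T/(2R), are hypothetical illustration values and are not taken from the simulations.

```python
import math


def signal_correlation(tau, T):
    """Triangular correlator output for the desired user, normalized to 1 at tau = 0."""
    return max(0.0, 1.0 - abs(tau) / T)


def interference_correlation(tau, T, c):
    """Assumed linear growth of the interfering user's output, reaching c at |tau| = T."""
    return c * min(abs(tau), T) / T


def snr_ratio(tau, T, c):
    """SNR(tau) as a ratio of correlator outputs; unbounded at tau = 0 in this model."""
    if tau == 0:
        return math.inf
    return signal_correlation(tau, T) / interference_correlation(tau, T, c)


T = 1.0    # chip interval
c = 0.1    # hypothetical peak cross-correlation
for R in (8, 32, 128):
    tau = T / (2 * R)          # assumed worst case timing error: half a sample
    print(R, snr_ratio(tau, T, c))
```

Under these assumptions the ratio grows roughly in proportion to the number of samples per chip, which is the sense in which the sampling rate bounds the near-far resistance.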
Since the resolution of the delays will affect the SNR, the sampling rate will
tend to dominate the maximum near-far resistance of the circuit. With higher
sampling rates, the ΔΣ modulator's resolution will also increase, which will reduce the
error caused by quantizing the signal. The quantizing error will decrease faster with
an increase in the sampling rate than will the delay resolution error. The ΔΣ
modulators can thus be of a lower order, which will lower the complexity of the
circuit. Higher sampling rates will be limited by the speed of the technology used
to implement the system.
[Figure 3.13 plot: "Comparison of SNR in dB for Three User Case"; x axis: "Delay
over chip interval, tau=0 at 64"]
Figure 3.13 - SNR vs. delay resolution for proposed and standard detectors.
The dashed line is the standard detector, the solid line is the proposed detector.
CHAPTER 4
DELTA-SIGMA CORRELATOR
Introduction
This chapter examines the role of delta-sigma modulation in a spread spectrum
(SS) direct sequence (DS) correlator and detector. The motivation for this proposal
is the search for simple, low cost, flexible, implementable and predictable hardware
which can produce such a correlator. Some of the musts and wants for such an SS
correlator are:
Wants                                    Musts
Flexibility in choosing design options   High SNR for low bit error detection and
                                         low false alarm rates
Small chip area                          Work at speeds close to the limits of
                                         technology
Expansibility                            Be able to use patterns outside of the
                                         domain of (-1,+1) or (0,+1)
Minimal external control of settings
Table 4.1 - Musts and wants of a multi-level correlator
The inherent features of the delta-sigma A/D, such as excellent linearity, low
complexity, bit serial output and the ability to control the shape of the noise power
spectrum, lead to its natural selection as a block in a correlator. The delta-sigma
modulator and the correlator for SS have common elements such as the following:
SS DS Correlator                           Delta Sigma Modulator
Bit Serial Data Stream                     Bit Serial Output
Noise Shaped Signal                        Noise Shaped Error Spectrum
Correlator consists of Multiply and Add    Decimator consists of Multiply and Add
Pseudo Random Pattern used for spreading   Noise pattern is Random on Higher Order
                                           Modulators
Table 4.2 - Comparison of SS correlator to delta-sigma modulator
It therefore seems reasonable to expect that some of the operations could be
shared by the correlator and modulator, thus reducing the complexity of the total
design.
Block Diagram of System
The block diagram for the architecture used for this project is shown in Figure
4.1. The message m(t), a binary signal, is multiplied by a pseudo-random pattern,
which generates the signal r(t). In general, r(t) would be used to modulate some
carrier and be demodulated at the receiver back to baseband. In this chapter, it is
assumed that synchronization with r(t) has been accomplished. A low pass filter is
used to limit the bandwidth and to reduce the input level into the modulator to
prevent instability. The final signal has a power of -12.5 dB (~.225 Vrms). The
delta-sigma modulator converts the analog signal into a serial bit stream of {-1, +1}.
The output of the modulator is multiplied by the pseudo-random code (addressed
later in the paper). This code is generated at the receiver and is in sync with the
transmitted signal. In the block diagram it passes through an optional low pass filter
which limits its bandwidth. After the multiplier, the signal is integrated and the
output dumped at intervals of R x the spread code length (31). The signum operation
is then performed on the final signal to convert it back to the original binary signal.
Timing Used in System
Figure 4.2 shows the timing used in this chapter. Each bit of the message is
spread by a 31 'chip' pattern which has pseudo-random properties. The pattern is an
M sequence, which has excellent autocorrelation properties. Each chip is then
oversampled by a factor of R before entering the delta-sigma modulator. For each
case examined, the data sequence consisted of '1001'.
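The signal path and timing described above can be sketched end to end in a few lines. This is a simplified illustration under stated assumptions, not the simulation used in the thesis: the low pass filters are omitted, the modulator is a plain first-order loop, the M sequence comes from the recurrence a[n] = a[n-3] XOR a[n-5] (one possible 31 chip generator), and R = 32 and the 0.225 amplitude are illustration values. All function names are hypothetical.

```python
def m_sequence_31():
    """31 chip maximal-length sequence from a[n] = a[n-3] XOR a[n-5], mapped to +/-1."""
    a = [1, 0, 0, 0, 0]
    while len(a) < 31:
        a.append(a[-3] ^ a[-5])
    return [1 if bit else -1 for bit in a]


def spread(message, chips, R, amplitude):
    """Spread each +/-1 message bit by the chip pattern, holding each chip for R samples."""
    return [amplitude * m * c for m in message for c in chips for _ in range(R)]


def delta_sigma_1st(x):
    """First-order delta-sigma modulator with a 1 bit (+/-1) quantizer."""
    v, y, out = 0.0, 0.0, []
    for sample in x:
        v += sample - y              # integrate input minus the fed-back output
        y = 1.0 if v >= 0.0 else -1.0
        out.append(y)
    return out


def correlate_and_dump(bitstream, chips, R):
    """Multiply by the local PN pattern, integrate over R*31 samples, dump, normalize."""
    local = [c for c in chips for _ in range(R)]
    span = len(local)
    return [sum(b * p for b, p in zip(bitstream[i:i + span], local)) / span
            for i in range(0, len(bitstream), span)]


message = [1, -1, -1, 1]             # the '1001' data sequence
chips = m_sequence_31()
R = 32
r = spread(message, chips, R, 0.225)
dumped = correlate_and_dump(delta_sigma_1st(r), chips, R)
detected = [1 if d >= 0 else -1 for d in dumped]   # signum detector
print(dumped)
print(detected)
```

The dumped values cluster around plus or minus the input amplitude, and the signum of each value recovers the message bit.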
[Figure 4.1 panels: "Block Diagram of Delta Sigma Correlator" (input r(t), LPF,
delta-sigma modulator, multiplier fed by PNG through an optional LPF, integrate and
dump, output m(t)) and "Block Diagram for r(t) Generation" (m(t) multiplied by the
PNG pseudo-noise pattern to form r(t))]
Figure 4.1 - Block diagram
[Figure 4.2 annotations: each message bit is spread by the 31 chip pseudo-random
code; each chip is oversampled by R samples]
Figure 4.2 - Illustration of timing used in system
Modulators Examined
In order to see what effect the noise transfer function has on the operation of
the correlator, six modulators with different properties were used. The
modulators consist of the 1stOrder, 2ndOrder, 3rdOrder-c [17], Example [17],
Proj_32, and Proj_16. The poles and zeros in the Z domain of the noise transfer
function are tabulated in Table 4.3.
Modulator    Zeros                    Poles
1stOrder     1                        0
2ndOrder     1                        0
             1                        0
3rdOrder-c   1                        0.6166
             0.9993+/-j0.038          0.7297+/-j0.3176
Example      1                        0.74545
             0.999650+/-j0.026429     0.778479+/-j0.136363
             0.999011+/-j0.044467     0.880630+/-j0.249593
Proj_32      1                        0.74450
             0.998603+/-j0.052839     0.777593+/-j0.136819
             0.996045+/-j0.088847     0.88001+/-j0.250536
Proj_16      1                        0.74069
             0.994416+/-j0.105531     0.774055+/-j0.138631
             0.984213+/-j0.177699     0.87754+/-j0.254299
Table 4.3 - Poles and zeros of error transfer function for modulators
Example has a corner frequency at π/64, Proj_32 at π/32, and Proj_16 at π/16, where
2π is the sampling frequency. These filters are designed such that the amount of
noise power in the signal band is minimized. Figure 4.3 shows the power spectrum of
the transfer functions. Figure 4.4 shows a closer view in the range of 0 to π/4.
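The power spectra can be regenerated from the tabulated poles and zeros by evaluating the noise transfer function on the unit circle. The sketch below is a minimal illustration: it assumes a unit gain factor and uses the 2ndOrder and Proj_32 rows of Table 4.3, and the helper names are hypothetical.

```python
import cmath
import math


def with_conjugates(real_roots, complex_roots):
    """Expand each listed complex root into a conjugate pair."""
    roots = [complex(r) for r in real_roots]
    for c in complex_roots:
        roots.extend([c, c.conjugate()])
    return roots


def ntf_magnitude(zeros, poles, omega):
    """|H(e^{j*omega})| as the product of distances to the zeros over those to the poles."""
    z = cmath.exp(1j * omega)
    num = 1.0
    for q in zeros:
        num *= abs(z - q)
    den = 1.0
    for p in poles:
        den *= abs(z - p)
    return num / den


SECOND_ORDER = ([1, 1], [0, 0])
PROJ_32 = (
    with_conjugates([1], [complex(0.998603, 0.052839), complex(0.996045, 0.088847)]),
    with_conjugates([0.74450], [complex(0.777593, 0.136819), complex(0.88001, 0.250536)]),
)

for name, (zeros, poles) in (("2ndOrder", SECOND_ORDER), ("Proj_32", PROJ_32)):
    for omega in (math.pi / 64, math.pi / 32, math.pi / 4, math.pi):
        print(name, round(omega, 4), ntf_magnitude(zeros, poles, omega) ** 2)
```

The Proj_32 zeros sit on the unit circle at angles inside 0 to π/32, which is what places the notches of its noise spectrum within the signal band.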
[Figures 4.3 and 4.4 plots: power spectra |H|^2 of the modulators used in the project]
Figure 4.3 - Power spectrum of noise transfer functions used in project
Figure 4.4 - Closer view of power spectrum in range of 0 to π/4
2ndOrder Noise SNR Predictions
In order to verify the simulation results, the noise spectrum of the 2ndOrder
modulator was derived. Multiplying the output of the modulator by the PNG pattern
is equivalent to convolution in the frequency domain, and so N_o^2, the
noise power, will change. The modulators which have the zeros of their noise
transfer function spread out over the interval from 0 to π/R_filter are more difficult
to analyze, but they can be estimated. The results show that the noise power is
increased by a factor of 2.6. The formula for N_o^2 (assuming σ_e^2 = 1/3 in the
modulator) is:
$$N_o^2 = \frac{0.548}{R^5}$$
The calculated predictions and simulation results for the SNR for the
2ndOrder modulator are listed in Table 4.4 below:
Oversampling   Calculated   Simulation
Factor         Results      Results
R=2            5.199 dB     12.73 dB
R=8            35.30 dB     34.19 dB
R=16           50.35 dB     52.41 dB
R=32           65.4 dB      64.01 dB
R=64           80.5 dB      69.55 dB
Table 4.4 - Comparison of calculated to simulated SNR for 2ndOrder modulator
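The calculated column can be approximately reproduced from the noise power formula. The sketch below assumes that the signal power is the -12.5 dB level quoted for the modulator input and that the noise power is 0.548/R^5. Both the pairing of these two numbers and the function names are assumptions made for illustration.

```python
import math

SIGNAL_POWER_DB = -12.5          # assumed: input level quoted in the block diagram section


def noise_power_2nd_order(R):
    """In-band noise power of the 2ndOrder modulator after despreading, 0.548 / R^5."""
    return 0.548 / R ** 5


def predicted_snr_db(R):
    """Predicted SNR at the detector input for oversampling factor R."""
    return SIGNAL_POWER_DB - 10.0 * math.log10(noise_power_2nd_order(R))


for R in (2, 8, 16, 32, 64):
    print(R, round(predicted_snr_db(R), 2))
```

Under these assumptions the values agree with the calculated column of Table 4.4 to within about 0.1 dB, and each doubling of R adds about 15 dB.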
Simulations
SNR at Input to Detector
The first simulations done were to feed r(t) into each of the modulators and to
perform a rectangular decimation on the output from the modulator. The pattern
used to multiply the output of the modulator is band limited. The decimation used
consists of a rectangular window of length R*31. Other decimation methods could
be used to reduce the noise level further. The output of the decimator is a signal at
the message bit rate. This is the signal that would be sent to the detector, which is just
the signum function. The output from the decimator was examined to determine the
SNR (10*log10(m^2/σ^2)) going into the detector. The results are tabulated in Figure
4.5 and the SNR at the detector input is graphed in Figure 4.6.
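The SNR statistic can be illustrated with the four decimator outputs listed for the 2ndOrder modulator at R=2. The sketch assumes that the level m is the root-mean-square of the output magnitudes and that σ^2 is the mean squared deviation of the magnitudes from that level. This reading reproduces the tabulated mean, variance and SNR for that column, but the exact statistics used in the original simulation are not stated, so it should be treated as an assumption. The function name is hypothetical.

```python
import math


def detector_input_snr(outputs):
    """Return (level, variance, SNR in dB) for a list of decimator outputs."""
    mags = [abs(v) for v in outputs]                 # remove the data polarity
    level = math.sqrt(sum(m * m for m in mags) / len(mags))
    variance = sum((m - level) ** 2 for m in mags) / len(mags)
    if variance == 0.0:
        return level, variance, math.inf             # no measurable noise
    return level, variance, 10.0 * math.log10(level ** 2 / variance)


# Decimator outputs for the data sequence '1001', 2ndOrder modulator, R=2.
level, variance, snr = detector_input_snr([0.19355, -0.25807, -0.12903, 0.19355])
print(round(level, 5), variance, round(snr, 2))
```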
Due to the tonal nature of the noise shaping of the 1stOrder modulator, its
results are difficult to predict, but the other modulators are more predictable. (If the
noise consists of tones that fall on or near the zeros of the decimation filter, then it
is possible to have a high SNR, as seen in the case where R=8. The tones are very
dependent on the input signal level into the ΔΣ modulator.) At each R, the
modulators with the least noise in the signal band have the higher SNRs. At R=8, it
is expected that the 2ndOrder would have the highest SNR since it has the lowest
power, and the results confirm this hypothesis. At R=64, it is expected that Example
would have the highest SNR since its transfer function indicates the least noise, and
this is also verified. From the graph, it is seen that as the sampling rate increases
the SNR increases. Thus there is a distinct tradeoff between SNR and oversampling
ratio. For the detector to have a decent (101 error rate, only about 12 dB is required.
Adding 6 dB for additional margin would bring the minimum SNR required to
18 dB. From Figure 4.6, an oversampling ratio of 8 provides enough
MODULATOR     R=64        R=33        R=16        R=8         R=2         R=1
1stOrder:
              0.22395     0.22321     0.22225     0.19618     0.12903     0.35484
             -0.22415    -0.22601    -0.22137    -0.19618    -0.25937    -0.22581
             -0.22406    -0.22575    -0.22236    -0.19618    -0.19355    -0.22581
              0.22419     0.22307     0.22227     0.19018     0.19355     0.61290
  Mean        0.22409     0.22451     0.22206     0.19619     0.19885     0.38844
  Variance    8.93E-09    1.89E-06    1.63E-07    0.00E+00    2.11E-03    2.61E-02
  SNR dB      67.50       44.26       54.82       100.00      12.73       7.62
2ndOrder:
              0.22503     0.22532     0.22853     0.23749     0.19355    -0.03226
             -0.22517    -0.22560    -0.22742    -0.23995    -0.25807    -0.02258
             -0.22505    -0.22567    -0.22881    -0.23064    -0.12903    -0.09677
              0.22521     0.22566     0.22964     0.24330     0.19355     0.22581
  Mean        0.22512     0.22556     0.22835     0.23789     0.19885     0.12440
  Variance    5.62E-09    2.02E-08    2.99E-07    2.16E-05    2.11E-03    7.48E-03
  SNR dB      69.55       64.01       52.41       34.19       12.73       3.16
3rdOrder-c:
              0.22492     0.22462     0.22123     0.22816     0.19355     0.22581
             -0.22502    -0.22479    -0.22277    -0.21158    -0.19355    -0.35484
             -0.22492    -0.22460    -0.22206    -0.22071    -0.12903    -0.35484
              0.22506     0.22480     0.22285     0.20641     0.19355     0.22581
  Mean        0.22498     0.22470     0.22232     0.21688     0.17980     0.29740
  Variance    3.97E-09    8.33E-09    3.84E-07    6.99E-05    7.85E-04    4.21E-03
  SNR dB      71.06       67.83       51.09       28.28       16.14       13.22
Example:
              0.22498     0.22492     0.22539     0.20108     0.19355     0.35484
             -0.22503    -0.22512    -0.22124    -0.20607    -0.25807    -0.35484
             -0.22503    -0.22520    -0.22063    -0.18451    -0.19355    -0.29032
              0.22499     0.22527     0.22568     0.20758     0.32258     0.29032
  Mean        0.22503     0.22513     0.22330     0.20002     0.24778     0.32419
  Variance    3.08E-10    1.69E-06    5.10E-06    8.39E-05    2.90E-03    1.04E-03
  SNR dB      82.16       64.76       39.90       26.78       13.26       20.03
Proj_32:
              0.22493     0.22521     0.22750     0.20675     0.38710     0.22581
             -0.22500     0.22538    -0.22449    -0.20388    -0.32258    -0.61290
             -0.22495    -0.22523    -0.22129    -0.21302    -0.38710    -0.22581
              0.22505     0.22559     0.22498     0.19267     0.19355     0.35484
  Mean        0.22499     0.22535     0.22458     0.20573     0.33212     0.38844
  Variance    2.25E-01    2.37E-08    4.89E-06    6.05E-05    6.33E-03    2.61E-02
  SNR dB      73.51       63.31       40.14       28.05       12.41       7.62
Proj_16:
              0.22497     0.22523     0.22290     0.19301     0.19355     0.22581
             -0.22516    -0.22552    -0.22431    -0.19783    -0.25807    -0.48387
             -0.22515    -0.22508    -0.22153    -0.20675    -0.25807    -0.35484
              0.22518     0.22519     0.22340     0.20823     0.19355     0.35484
  Mean        0.22511     0.22526     0.22304     0.20155     0.22810     0.36638
  Variance    7.39E-09    2.71E-08    1.01E-06    3.96E-05    1.05E-03    8.46E-03
  SNR dB      68.36       62.73       46.91       30.11       16.97       12.01
Figure 4.5 - Simulation results of SNR for modulators
[Figure 4.6 plot: "ECE 619 Project - SS Del-Sig Correlator: Signal/Noise Power ratio
at Detector"; x axis: Oversampling Amount (R=64, R=33, R=16, R=8, R=2, R=1);
y axis: 0 to 100; curves: 1stOrder, 2ndOrder, 3rdOrder-c, Example, Proj_32, Proj_16]
margin for all modulators tested. Even though higher SNRs could be achieved by
higher R's, the additional benefits would probably be outweighed by other noise
degradations of the signal in an actual system. This data of course assumes that a
bandlimited pattern is doing the correlation.
The next set of graphs shows the output of the Proj_16 modulator (Figure 4.7)
and the multiplier (Figure 4.8). After the modulator, the spreading of the message
and the noise shaping of the delta-sigma modulator are both evident. After the
multiplier, the signal is despread and the noise band is spread by the pseudo-noise
spreading pattern. Figure 4.9 shows the frequency spectrum of the output of the
Proj_16 modulator with a sine input. The point here is that although the input signal
has different spectral components (but the same energy), the noise spectrum is very
close to that of the output of the multiplier. This seems to validate the comparisons
made here.
The next figures compare the frequency spectrum of the message before it
enters the circuit (Figure 4.10) with the output spectrum after the decimator
(integrate and dump) (Figure 4.11). The scale difference is the .225 caused by a
gain reduction of 4 and bandlimiting of the input signal. Notice that the noise is
only noticeable in the higher frequency components, which will be eliminated
after the signum( ) function. Figures 4.12 and 4.13 show the noise in the spectrum
for the sine input after the decimation stage. Here the noise is shifted to the higher
frequencies, whereas in the previous case the noise has been whitened by the
spreading pattern.
[Figure 4.7 plot: "Freq. Output of proj_16 Modulator"]
Figure 4.7 - Proj_16 modulator output
[Figure 4.8 plot: "Freq. Output of Multiplier, proj_16 modulator"]
Figure 4.8 - Output of multiplier
[Figure 4.9 plot: "Freq. Spectrum of Output of proj_16 Modulator, Sine input"]
Figure 4.9 - Frequency spectrum with sine wave input
[Figure 4.10 plot: "Freq. Spectrum of Data"]
Figure 4.10 - Frequency spectrum of message before modulator
[Figure 4.11 plot: "Output after /16 Decimator"]
Figure 4.11 - Frequency spectrum of message after the modulator
[Figure 4.12 plot: "Freq. Spectrum of Output of /8 decimator, Sine input"]
Figure 4.12 - Frequency spectrum for sine wave after modulator and /8 decimator
[Figure 4.13 plot: "Freq. Spectrum of Output of /16 decimator, Sine input"]
Figure 4.13 - Frequency spectrum for sine wave after modulator and /16 decimator
N x 1 vs. 1 x 1 Multiplication
This section on simulation looks at how complicated the multiplier needs to be
in comparison to the sampling rate. Normally, in a direct sequence correlator, the
input is multiplied by a single one bit stream of the pseudo-noise generated (PNG)
pattern. Since the output of the delta-sigma modulator is bit serial, we have two
options. We can multiply the two bit serial streams with a simple exclusive or, or we
can choose to change the spreading pattern to a multibit sequence multiplied by the
bit serial output of the modulator. The main purpose in going to a multibit pattern
is that we are able to attack other areas such as the near-far problem. It is possible
to use a pattern, different from the spreading sequence, that provides good near-far
resistance but may be less optimal for some energy inputs.
The next set of graphs shows the correlation, searching in time, between the
output of the modulator and other patterns. Previously, it was determined that an
oversampling of 8 would provide enough SNR for adequate detection. In trying to
acquire initial sync, we would also want to limit the number of false alarms from the
synchronizer (correlator searching in time). Looking first at R=8, the correlation of
the bandlimited input r(t) with a binary PNG pattern (Figure 4.14) with no
delta-sigma modulation shows the typical ideal response. By bandlimiting the PNG
pattern and correlating with r(t) (Figure 4.15) it is apparent that nothing is gained.
This is why most systems do not modify the decorrelating pattern.
The next set of graphs shows what happens if the input r(t) is processed by the
delta-sigma modulator and multiplied with the bit serial PNG pattern (1 x 1 multiply,
Figure 4.16). We can see that the peaks maintain the proper levels, but the output of
[Figure 4.14 plot: "Correlation of BW limited signal and PNG pattern, R=8"]
Figure 4.14 - Correlation of bandlimited input with binary PNG pattern
[Figure 4.15 plot: "Correlation of BW limited signal and BW limited PNG pattern"]
Figure 4.15 - Correlation of bandlimited input with bandlimited PNG pattern
[Figure 4.16 plot: "Correlation of Output of 2ndOrder Modulator and PNG pattern, R=8"]
Figure 4.16 - Correlation of input after delta-sigma modulation with binary
PNG pattern
[Figure 4.17 plot: "Correlation of 2ndOrder Modulator and BW limited PNG pattern, R=8"]
Figure 4.17 - Correlation of bandlimited input after delta-sigma modulator with
band limited PNG pattern
[Figure 4.18 plot: "Correlation of BW limited signal and PNG pattern R=33"]
Figure 4.18 - Ideal correlation of input signal with PNG pattern
[Figure 4.19 plot: "Correlation of Output of 2ndOrder Modulator and PNG pattern R=33"]
Figure 4.19 - Correlation of input signal after delta-sigma modulation with binary
PNG pattern using 1x1 multiplier with higher oversampling
the decimator has high peaks in between the correct ones. By bandlimiting the PNG
pattern so that it does not add additional noise when it is convolved with the
delta-sigma bitstream (Figure 4.17), the output of the decimator is very close to the
ideal, with only some small noise. In order to use an oversampling ratio of 8, the
decorrelating pattern should be band limited. This filtering can be achieved in a
number of ways which result in an N bits/chip pattern. These N bits could be
multiplied by the output of the delta-sigma modulator by simply 2's complementing,
or not 2's complementing, the N bit PNG pattern.
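The two multiplier options reduce to very small logic operations, which can be sketched as follows. The bit encoding (+1 as logic 0, -1 as logic 1), the word width and the function names are assumptions chosen so that the 1 x 1 product is a plain exclusive or. The sketch illustrates the arithmetic and is not a description of the proposed circuit.

```python
def encode(level):
    """Map a +/-1 level to a logic bit: +1 -> 0, -1 -> 1."""
    return 0 if level > 0 else 1


def decode(bit):
    """Map a logic bit back to a +/-1 level."""
    return 1 if bit == 0 else -1


def multiply_1x1(mod_bit, pn_bit):
    """1 x 1 multiply: with this encoding the +/-1 product is the exclusive or."""
    return mod_bit ^ pn_bit


def multiply_nx1(pattern_word, mod_bit, width):
    """N x 1 multiply: pass the N bit pattern word, or 2's complement it."""
    mask = (1 << width) - 1
    if mod_bit == 0:                       # modulator output is +1
        return pattern_word & mask
    return (~pattern_word + 1) & mask      # modulator output is -1


def to_signed(word, width):
    """Interpret an unsigned word as a 2's complement value."""
    return word - (1 << width) if word & (1 << (width - 1)) else word


for a in (1, -1):
    for b in (1, -1):
        print(a, b, decode(multiply_1x1(encode(a), encode(b))))
print(to_signed(multiply_nx1(5, encode(-1), 8), 8))
```

One caveat of the 2's complement form is that the most negative word (for example -128 with 8 bits) has no positive counterpart, so a pattern generator would need to avoid that code.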
If a 1 x 1 multiplier is still required, one can oversample the signal r(t) by more
than 8 and reduce the amount of noise. The next graphs show the correlation
outputs for R=33. In Figure 4.18, the bandwidth limited input signal is multiplied by
the bit serial PNG pattern to show the ideal output achievable. The bottom graph,
Figure 4.19, shows the result of multiplying the output of the delta-sigma modulator
with the bit serial PNG pattern (1 x 1 mult.). Now it is seen that the noise is smaller
and more spread out, thus providing good SNR for a very low false alarm rate. The
designer of the system can therefore trade off complexity for performance.
Review of Musts and Wants
Comparing the earlier musts and wants with the results of the simulations
shows that the use of delta-sigma modulation is a good choice. There is a wide range
of options in the design process. The chip area should be minimal since only
rectangular decimation is required, but other decimation techniques could be used to
enhance the performance. The architecture lends itself to being expandable, which is
required to develop the search in time correlator. Other than selection of the
operating frequency, correlating pattern and detection level, no tuning is required.
There are design tradeoffs between SNR for bit detection, false alarm rates and
oversampling ratios. Most important, the approach allows the use of a non-binary
correlation pattern, which allows optimization of detection for the near-far problem.
SUMMARY
The main contributions of this paper have been the formulation of a new
method to generate the "one-shot" near-far resistant pattern and an architecture by
which to implement it using delta-sigma modulation. It was shown that this approach
can be easily extended to multiple users. Simulation was performed to determine the
resolution required both for the A/D conversion process and the sampling intervals.
The sampling rate was shown to be the most critical, since any correlation of the
received signal r(t) and the near-far resistant orthogonal pattern at other than τ=0 is
directly proportional to the resolution of τ. Higher sampling rates can reduce the
uncertainty of τ, which allows for less complicated ΔΣ modulators. Higher rates,
however, will require more memory in order to store the intermediate results.
Current technology limits the sampling speed and thus the rate of data transmission
possible.
Due to drifts in the carrier frequency because of doppler effects, clock drift,
and tracking, the synchronization portion of the system must be carefully designed.
Making the synchronizer near-far resistant still remains an open problem due to the
difficulty of generating a pattern which is orthogonal to the users at all time intervals.
Since the performance of the system will be limited by the synchronizing circuitry,
the modified Gram-Schmidt approach presented can reduce the complexity required
to implement the GSOSG and still maintain good near-far resistance.
The main purpose in choosing delta-sigma modulation is that the signal can be
digitized into a serial bit stream. This action provides for a lossless delay of the signal
by using standard digital memory elements. It also can allow reduction in the amount
of complexity required to implement the multiplying circuitry in the correlators.
Since the incoming signal r(t) must be resolved with a high degree of time resolution,
the high sampling rates required of ΔΣ modulation work to an advantage in this
approach.
BIBLIOGRAPHY
1. R. Lupas, "Near-far resistant linear multi-user detection," Ph.D. dissertation,
Princeton University, Jan. 1989.
2. S. Verdu, "Minimum probability of error for asynchronous Gaussian multiple-
access channels," IEEE Trans. Inform. Theory, vol. IT-32, pp. 85-96, Jan. 1986.
3. H.V. Poor and S. Verdu, "Single-user detectors for multiuser channels," IEEE
Trans. Commun., vol. COM-36, pp. 50-60, Jan. 1988.
4. C.K. Rushforth, Z. Xie, and R.T. Short, "Method and apparatus for decoding
multiple bit sequences that are transmitted simultaneously in a single channel," U.S.
patent number 4,908,836, March 13, 1990.
5. M.K. Varanasi and B. Aazhang, "Multistage detection in asynchronous code-
division multiple-access systems," IEEE Trans. Commun., vol. COM-38, pp. 509-519,
Apr. 1990.
6. P. Quinton and Y. Robert, Systolic Algorithms and Architectures, Prentice Hall,
1991.
7. W. Vetterling et al., Numerical Recipes in C, Cambridge University Press, 1988.
8. H.T. Kung, "Why systolic architectures?", Computer, Jan. 1982.
9. M.W. Hauser, "Principles of oversampling A/D conversion," J. Audio Eng. Soc.,
vol. 39, no. 1/2, Jan.-Feb. 1991.
10. R.C. Dixon, Spread Spectrum Systems, 2nd Ed., Wiley, 1984.
11. G.R. Cooper and C.D. McGillem, Modern Communications and Spread
Spectrum, McGraw Hill, 1986.
12. S. Verdu, "Optimum multiuser asymptotic efficiency," IEEE Trans. Commun.,
vol. COM-34, no. 9, Sept. 1986.
13. R. Lupas and S. Verdu, "Near-far resistance of multiuser detectors in
asynchronous channels," IEEE Trans. Commun., vol. COM-38, pp. 496-508, April
1990.
14. J.C. Candy and G.C. Temes, Oversampling Delta-Sigma Data Converters: Theory,
Design and Simulation, IEEE Press, 1992.
15. B.W. Dickinson, Systems: Analysis, Design and Computation, Prentice Hall,
1991.
16. M.K. Varanasi and B. Aazhang, "Optimally near-far resistant multiuser detection
in differentially coherent synchronous channels," IEEE Trans. Inform. Theory, vol. 37,
no. 4, July 1991.
17. R. Schreier, Class lecture notes, ECE619, Fall 91, Oregon State University.
APPENDIX A
SECTION 1
Introduction
This section investigates the systolic implementation of the "One-Shot
Decorrelating Detector" [1]. This circuit is used in code division multiple
access (CDMA) spread spectrum communication to separate out multiple messages
which occupy the same frequency spectrum. In [1] the one-shot was shown to be an
easier to compute alternative to the optimum near-far decorrelating detector. The
main benefits of such a detector are its memorylessness, linear complexity, and
performance which is relatively independent of the energies of interfering users,
which provides its near-far resistance [1]. Since the data needs to be processed in
real time at high speeds, the systolic computation approach appears to be the most
plausible since it leads readily to implementation in a VLSI structure.
This appendix first looks at the 2 user case, both from an ad-hoc approach and
then from a systolic synthesis method which allows extension to N users. Some
different alternatives are examined and their features are compared in relation to
number of processors, timing and signal flow.
One-Shot Decorrelating Theory of Operation
The message received by a 2 user CDMA system consists of a combination of two
signals, each modulated by a spreading code s_i(t). The combined received signal
is

$$ r(t) = b_1 m_1(t)\, s_1(t + \tau_1) + b_2 m_2(t)\, s_2(t + \tau_2) $$

where
m_1(t) = data message of user 1
m_2(t) = data message of user 2
s_1(t) = user 1 spreading code
s_2(t) = user 2 spreading code
b_1 = energy of user 1
b_2 = energy of user 2
For the case of the one-shot, the idea is to restrict attention to one bit interval
at a time. This requires that we look at each interfering user as consisting of two
consecutive bits which overlap with the bit of the user we are trying to decode. Thus
in the two user case, we split the second user's spreading code into two distinct
patterns so that the received waveforms are now {s_1(t), s_i^L(t), s_i^R(t), i = 2,...,K},
where

$$ s_2^L(t) = \begin{cases} s_2(t + T - |\tau_2 - \tau_1|), & 0 \le t \le |\tau_2 - \tau_1| \\ 0, & |\tau_2 - \tau_1| \le t \le T \end{cases} $$

$$ s_2^R(t) = \begin{cases} 0, & 0 \le t \le |\tau_2 - \tau_1| \\ s_2(t + T - |\tau_2 - \tau_1|), & |\tau_2 - \tau_1| \le t \le T \end{cases} $$
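The split of the interfering code can be sketched in discrete time. This is a minimal illustration that assumes one sample per chip, an integer delay d = |τ2 - τ1| in chips, and a spreading code that repeats every bit, so that s2(t + T - d) is read modulo T. The function name and the 7-chip code are hypothetical.

```python
import numpy as np


def split_interferer(s2, d):
    """Return (s2_left, s2_right) as seen inside the desired user's bit interval."""
    n = len(s2)
    s2_left = np.zeros(n)
    s2_right = np.zeros(n)
    s2_left[:d] = s2[n - d:]    # tail of the interferer's previous bit
    s2_right[d:] = s2[:n - d]   # head of the interferer's current bit
    return s2_left, s2_right


# Hypothetical unit-energy code and delay.
s2 = np.array([1, -1, 1, 1, -1, -1, 1], dtype=float) / np.sqrt(7)
s2_left, s2_right = split_interferer(s2, d=3)
e2 = float(s2_left @ s2_left)   # energy of the left part
```

Because the two parts occupy disjoint sub-intervals, their cross-correlation is zero and their energies sum to the energy of the whole code.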
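From the above waveforms, the one-shot correlation matrix has the form

$$ R = \begin{bmatrix} 1 & \rho_{21} & \rho_{12} \\ \rho_{21} & e_2 & 0 \\ \rho_{12} & 0 & 1 - e_2 \end{bmatrix} $$

where e_2 is the energy in waveform s_2^L. From [1], the new pattern which is
correlated with r(t) to decode user 1 is the first row of R^{-1} applied to
[s_1 s_2^L s_2^R]^T, which is proportional to

$$ s_1(t) - \frac{\rho_{21}}{e_2}\, s_2^L(t) - \frac{\rho_{12}}{1 - e_2}\, s_2^R(t) $$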
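As a numerical check of the expressions above, the following sketch builds R for two hypothetical 7-chip codes, takes the first row of its inverse, and compares the result with the closed form. The codes, the delay and the variable names are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical 7-chip codes, normalised to unit energy, and a 3-chip delay.
n, d = 7, 3
s1 = np.array([1, 1, 1, -1, 1, -1, -1], dtype=float) / np.sqrt(n)
s2 = np.array([1, -1, 1, 1, -1, -1, 1], dtype=float) / np.sqrt(n)

# Split the interferer into its left and right parts over user 1's bit.
s2_left = np.concatenate([s2[n - d:], np.zeros(n - d)])
s2_right = np.concatenate([np.zeros(d), s2[:n - d]])

# One-shot correlation matrix of [s1, s2_left, s2_right].
waveforms = np.vstack([s1, s2_left, s2_right])
R = waveforms @ waveforms.T

# Pattern from the first row of the inverse.
first_row = np.linalg.inv(R)[0]
pattern_inv = first_row @ waveforms

# Closed form, which differs from pattern_inv only by the scalar first_row[0].
rho21, rho12, e2 = R[0, 1], R[0, 2], R[1, 1]
pattern_closed = s1 - (rho21 / e2) * s2_left - (rho12 / (1 - e2)) * s2_right
scale = first_row[0]
```

The resulting pattern is orthogonal to both parts of the interferer, which is what makes the correlation output independent of the interferer's energy.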
Block Diagram of Systolic One-Shot
Figure A1 shows the block diagram used for the approaches taken in this
section in trying to implement the systolic one-shot. The received signal r(t) must
first pass through two correlators which determine the proper sync signals for the two
messages. The systolic implementation of these correlators is examined in the
synthesis portion of this appendix. The sync signals from the correlators then start
the pseudo-noise pattern generators, which are matched to the
patterns used to spread the respective messages. The outputs of these pattern
generators are then used to determine the correlations and energies of the time
intervals shown in Figure A2. This processing is done in the block labeled "one-shot
systolic array proc.". The FIFO (first in, first out) blocks are used to delay the
received signal for the time required to compute the new pattern. The new pattern is
then correlated with the delayed r(t) in the final correlator and detector, which
outputs the original message m_1(t). In this block diagram it is assumed that the
analog signal r(t) is at baseband and has been digitized by a delta-sigma A/D into a
serial bit stream [2] in the systolic synchronizer correlator. Therefore all signals in
the block diagram are digital levels, thus allowing the use of standard VLSI digital
cells.
[Figure A1: block diagram. Blocks shown: two systolic sync correlators (sync pat. 1
and sync pat. 2) driving PRNG 1 and PRNG 2, which generate s1(t) and s2(t); the
one-shot systolic array processor, which outputs the new patterns s1'(t) and s2'(t);
FIFO delays producing r(t+τ1) and r(t+τ2); and two systolic correlator and detector
blocks with outputs m1(t) and m2(t).]
Figure A1 - Block diagram for systolic one-shot decorrelating detector
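[Figure A2: timing diagram. One bit time of s1(t) runs from the 1st sync to the
1st sync + mN, and one bit time of s2(t) from the 2nd sync to the 2nd sync + mN. The
sub-intervals are labeled with the energies ea1, ea2 and eb2, eb1 and the
cross-correlations ρ21, ρ12.]
Figure A2 - Timing for one-shot decorrelating detector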
Ad-Hoc Approach
From the timing, the following procedure can be used to determine the values
of the correlation matrix and then to compute the final pattern.
Procedure:
0) initialize circuit
1) wait for the first sync pulse
For (m=0; inf; m++) {
2) From the 1st sync to the 2nd sync:
   a) ea1 = Σ sL1^2(t)
   b) eb2 = Σ sL2^2(t)
   c) ρ21 = Σ sL1(t) sL2(t)
3) at 2nd sync + mN (N = number of clock periods in one bit time)
   a) s'2(t) = s2(t) - (ρ21/ea1) sL1(t) - (ρ12/ea2) sR1(t)
4) From 2nd sync to 1st sync + mN
   a) ea2 = Σ sR1^2(t)
   b) eb1 = Σ sR2^2(t)
   c) ρ12 = Σ sR1(t) sR2(t)
5) at 1st sync + mN
   a) s'1(t) = s1(t) - (ρ21/eb2) sL2(t) - (ρ12/eb1) sR2(t)
}
The processor layout diagram for the above procedure is shown in Figure A3.
There are nine processing units required. The far left multiply and accumulate
processors must operate at the chip clock rate. At any sync pulse, the accumulators
are output to the register pipeline and reset. The register contents then move to the
divide processor, which divides the cross correlation by the energy in the signal. The
output of the divider is then latched, multiplied by the pattern of the other
user and subtracted from the user's pattern. While the diagram shows the original
pseudo-noise patterns being delayed by a bank of shift registers, the pattern could be
replicated by a delayed clock as well. This would be more economical in VLSI real
estate if the amount of time required for the divider is long.
[Figure A3: processor layout diagram; the legend identifies the shift register delay
elements.]
Figure A3 - Processor layout diagram for one-shot systolic array with two users
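The accumulate, divide, multiply and subtract sequence of the ad-hoc procedure can be sketched in software for user 1. This is a minimal illustration, assuming one value per clock period, an integer delay d between the two sync pulses, and a code for user 2 that repeats every bit; the function name and the example codes are hypothetical. Only the quantities needed for s'1(t) (eb2, eb1, ρ21, ρ12) are accumulated.

```python
def one_shot_pattern_user1(s1, s2, d):
    """Chip-serial sketch of the accumulate/divide/subtract steps for user 1.

    s1, s2 : spreading codes, one value per clock period (length N)
    d      : clock periods between the 1st sync and the 2nd sync (0 < d < N)
    """
    n = len(s1)
    # Interferer as seen inside user 1's bit: tail of its previous bit, then
    # the head of its current bit (s2 is assumed to repeat every bit).
    seen2 = [s2[(t - d) % n] for t in range(n)]

    eb2 = eb1 = rho21 = rho12 = 0.0
    for t in range(n):
        if t < d:                          # 1st sync -> 2nd sync
            eb2 += seen2[t] * seen2[t]
            rho21 += s1[t] * seen2[t]
        else:                              # 2nd sync -> 1st sync + N
            eb1 += seen2[t] * seen2[t]
            rho12 += s1[t] * seen2[t]

    # At the sync pulse: divide, multiply by the interferer pattern, subtract.
    return [s1[t] - (rho21 / eb2 if t < d else rho12 / eb1) * seen2[t]
            for t in range(n)]


# Hypothetical 7-chip codes with a 3-chip offset between the users.
s1 = [1, 1, 1, -1, 1, -1, -1]
s2 = [1, -1, 1, 1, -1, -1, 1]
new1 = one_shot_pattern_user1(s1, s2, 3)
```

The pattern for user 2 would follow the same steps with the roles of the two codes exchanged and the bit frame starting at the 2nd sync.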
Since the two user case's formulae were relatively simple, it was fairly easy to
derive a systolic version of them. When the number of users is increased, it is very
difficult to use an ad-hoc approach. Therefore the next section tries to come up with
recurrence equations and architectures that can be easily expanded.
SECTION 2
Systolic Synthesis of One-Shot
The difficult areas in extending the architecture to N users are the generation
of the correlation matrix and its inversion. Another concern is the size of the
synchronizing correlators, since one is required for each user. Each of these areas is
examined in this section. In order to make the circuits realizable, the solution with
the minimal number of processors is considered preferable to other possible solutions.
Correlation Matrix
The correlation matrix can be derived by letting the rows of a matrix S contain
the pseudo-noise patterns of each of the users and multiplying S by S^T. S is an n x m
matrix where n is the number of users and m is the number of chips (or smaller time
intervals if using delta-sigma modulation) per bit. After S is multiplied by its
transpose, the result is a symmetric matrix of size n x n. Therefore only the upper
or lower triangular portion of the matrix needs to be computed. The uniform
recurrence equations are derived as:

$$ R_{ij} = \sum_{k=1}^{M} S_{ik}\, S'_{kj}, \quad i = \{1,\dots,N\},\; j = \{i,\dots,N\},\; S' = S^T $$

FOR i = 1 to N
  FOR j = i to N
    FOR k = 1 to M
      1) s(i,j,k) = s(i,j-1,k)
      2) s'(i,j,k) = s'(i-1,j,k), j >= i
      3) R(i,j,k) = R(i,j,k-1) + s(i,j,k) s'(i,j,k)
initial conditions:
  s(i,0,k) = S(i,k)
  s'(0,j,k) = S'(k,j)
  R(i,j,0) = 0
Final output:
  R(i,j) = R(i,j,M+1)
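The recurrence can be checked with a short sequential sketch that accumulates along k and fills only the upper triangle, using S'(k,j) = S(j,k). The function names and the 3 x 4 example matrix are hypothetical, and the loops run serially rather than as a systolic array.

```python
def correlation_upper(S):
    """Upper triangle of R = S S^T, accumulated along k as in recurrence 3."""
    n, m = len(S), len(S[0])
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):               # j >= i: upper triangle only
            acc = 0.0                       # R(i,j,0) = 0
            for k in range(m):
                acc += S[i][k] * S[j][k]    # S'(k,j) equals S(j,k)
            R[i][j] = acc
    return R


def mirror(R):
    """Fill the lower triangle from the upper one (R is symmetric)."""
    n = len(R)
    return [[R[min(i, j)][max(i, j)] for j in range(n)] for i in range(n)]


# Hypothetical 3-user, 4-chip pattern matrix.
S = [[1, 1, -1, 1],
     [1, -1, -1, 1],
     [1, 1, 1, -1]]
R_full = mirror(correlation_upper(S))
```

Only N(N+1)/2 inner products are formed, which is the saving exploited by the smaller of the two arrays discussed below the dependency graph.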
All the variables are localized. If we take advantage of the fact that s'_ji = s_ij, and
change the direction in which s' propagates in the dependency graph (DG), then the
following DG emerges:
[Figure A4: dependency graph drawn for the planes k=1, k=2 and k=3, with the
outputs R11; R12, R22; and R13, R23, R33 taken on the k axis.]
Figure A4 - Dependency graph for correlation matrix
After changing the direction of s', equation 2 becomes s'(i,j,k) = s'(i+1,j,k).
Now the final array in the space-time domain can be derived from the above DG.
Since the correlation coefficients will be used in the next sub-block of the design, the
allocation vector should be orthogonal to R_ij. It should also be orthogonal to s_ij since
it needs to be input to the array. Two arrays are presented here and compared.
The first array sets d = [-1 0 0]^T and s = [-1 1 1]^T. The final allocation matrix is

$$ A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -1 & 1 & 1 \end{bmatrix} $$

The processor layout diagram is shown in Figure A5 for N,M=3. The number
of processors required is NM. Total throughput time is M+N clock cycles with 1
clock delay between successive outputs for a pipeline rate of 1.
[Figure A5: processor layout diagram. Labels shown: inputs s11 through s33 and
outputs R11, R12, R13, R22, R23, R33.]
Figure A5 - Processor layout diagram for correlation matrix
The above implementation has the nice feature that the correlation coefficients
leave the array at one boundary of the array. It has the disadvantage that it has
many more processors than if we leave the R_ij's in the computation cells. The results
will then have to be output via a separate bus for each column of processors. This
smaller processor array is formed by choosing d = [0 0 1]^T and s = [-1 1 1]^T.
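The allocation function is then

$$ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 1 & 1 \end{bmatrix} $$

The processor layout graph is shown in Figure A6.
[Figure A6: processor layout diagram. Labels shown: inputs s11, s12, s13.]
Figure A6 - Processor layout diagram for correlation matrix with fewer
processors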
This allocation provides for approximately N^2/2 processing elements with a
total computation time of M+N and a pipeline rate of 1.
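The processor counts of the two mappings can be checked by applying each allocation matrix to every node (i, j, k) of the dependency graph and counting the distinct processor coordinates. This sketch assumes the first two rows of each matrix give the processor coordinates and the third row gives the schedule time; the function names are hypothetical.

```python
def map_index_space(alloc, n, m):
    """Map every (i, j, k) node with j >= i to a (processor, time) pair."""
    placed = set()
    for i in range(1, n + 1):
        for j in range(i, n + 1):          # upper triangle only
            for k in range(1, m + 1):
                row = [a * i + b * j + c * k for a, b, c in alloc]
                placed.add((tuple(row[:2]), row[2]))
    return placed


def summarize(alloc, n, m):
    """Return (number of processors, number of schedule steps spanned)."""
    placed = map_index_space(alloc, n, m)
    processors = {p for p, _ in placed}
    times = [t for _, t in placed]
    return len(processors), max(times) - min(times) + 1


first = [(0, 1, 0), (0, 0, 1), (-1, 1, 1)]   # d = [-1 0 0]^T: processor = (j, k)
second = [(1, 0, 0), (0, 1, 0), (-1, 1, 1)]  # d = [0 0 1]^T: processor = (i, j)

first_summary = summarize(first, 3, 3)
second_summary = summarize(second, 3, 3)
```

For N = M = 3 the first mapping uses NM = 9 processors and the second uses N(N+1)/2 = 6. Both span N+M-1 schedule steps over this index set, which is of the same order as the M+N clock cycles quoted above.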
Correlation Matrix Inversion
This is the area where there is the most difficulty. It may be possible to exploit
the symmetry of the correlation matrix by using the fact that its inverse is also
symmetric. This would mean that only the upper or lower triangular portion of the
inverse would need to be computed. The first row would consist of the minors of
each element, and the determinant could easily be found by multiplying the first row's
new elements by the previous first row elements and summing them. A good
solution extendable to N users is found in [6] and [7]. This approach is known as
Gauss-Jordan elimination.
The Jordan diagonalization method is described in section 4.3 of [6] and
section 2.1 of [7]. This method can compute the inverse of R in 5N-2 clock cycles
and requires N(N+1) processors.
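For reference, Gauss-Jordan elimination itself can be sketched sequentially as follows. This is a textbook version with partial pivoting, not the systolic schedule of [6]; the function name, the singularity threshold and the example matrix are assumptions.

```python
def gauss_jordan_inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment [R | I] and reduce the left half to the identity.
    aug = [list(map(float, row)) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:    # assumed threshold
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical one-shot correlation matrix with e2 = 0.4.
R_example = [[1.0, 0.2, 0.3],
             [0.2, 0.4, 0.0],
             [0.3, 0.0, 0.6]]
R_inv = gauss_jordan_inverse(R_example)
```

The inverse of the symmetric example comes out symmetric, which is the property the paragraph above suggests exploiting.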
One problem with the inversion of R deals with its stability. If the delay
between users is small or zero, the row in R for that sub-interval is nearly zero or is
zero. This will not allow the Gauss-Jordan method to find the inverse of R. In
implementation, the inversion block should have the ability to invert a variable size
R matrix from which the zero rows have been removed.
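One way to picture the variable size inversion is to drop every sub-interval whose energy (the diagonal element of R) is at or near zero before inverting. The sketch below assumes that test and a hypothetical tolerance; the function name and the example matrix are illustrative only.

```python
import numpy as np


def drop_empty_subintervals(R, tol=1e-9):
    """Remove rows and columns of R whose diagonal energy is at most tol.

    Returns the reduced matrix and the indices of the waveforms that were kept.
    """
    R = np.asarray(R, dtype=float)
    keep = np.flatnonzero(np.diag(R) > tol)
    return R[np.ix_(keep, keep)], keep


# Hypothetical case with zero delay: the left part of the interferer is empty,
# so its row and column of R are zero.
R_zero_delay = [[1.0, 0.0, 0.5],
                [0.0, 0.0, 0.0],
                [0.5, 0.0, 1.0]]
R_reduced, kept = drop_empty_subintervals(R_zero_delay)
R_reduced_inv = np.linalg.inv(R_reduced)
```

The reduced matrix is 2 x 2 and invertible, whereas the original 3 x 3 matrix is singular.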
Correlators
This portion describes the systolic implementation of the correlators in the
block diagram. The one-shot needs two types of correlators. The first correlator
must search in time, trying to match the incoming signals with a sync pattern. The
second correlator has a much easier job (since sync with the input is known), as it is
only required to multiply its inputs and sum up M samples of them, dump the results
and reset. Since this second block has been evaluated in chapter 4, its function is not
described in this portion.
The search in time correlator equations are basically the same as convolution,
except in this case it is always assumed that the summation encompasses the entire
length of the bit sequence. In practice, most correlators have used the approach in
Figure 5 of [8], where the pattern is stored in the cells, the input moves systolically
and is multiplied by the pattern weights, and the output is formed by fanning in the
results from all the cells. This approach was used since the multiply was only of two
bits and the summation was performed using currents. This resulted in a simple
architecture that was easily implemented. However, to make the synchronizer near-
far resistant, we need to multiply the incoming signal by a multi-bit unit in each
multiplier. By using a systolic implementation we can remove the global
interconnections and perform the summing digitally. It would help if the
pattern, incoming signal and output were all moving through the cells.
A promising architecture is shown in Figure A7. In this mapping, the outputs
stay in the processing cells, but since they are completed at different time intervals,
the data can be transmitted out on a bus that is parallel to the array. The architecture
has a latency of 2N-1 and a pipeline rate of 1. It requires N cells (N being the length
of the sync pattern). It is shown in chapter 4 that since the incoming data is a one bit
stream from a delta-sigma A/D, the multiplication in the cells could be performed by
simply 2's complementing or not the sync pattern, which is a multi-bit data stream. The
summation is a simple multi-bit digital adder. Since each processor is only used on
every other timing cycle, two adjacent processors could be combined to share the
same adder, with the addition of temporary holding registers and some control logic.
This technique allows for longer sync patterns by compacting the array.
[Figure A7: linear array of cells. Labels shown: input s_i(t), sync pattern, output
bus, correlation output.]
Figure A7 - Mapping of search in time correlator
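The function of the search in time correlator can be sketched sequentially as a sliding correlation of a one-bit stream with a multi-bit sync pattern. The sketch does not model the cell timing of Figure A7. It assumes a Barker-13 sign sequence with hypothetical magnitudes as the sync pattern and an alternating background in place of real data; all names are illustrative.

```python
def search_in_time(bits, pattern):
    """Correlate every window of a +1/-1 stream with a multi-bit pattern.

    Each product is formed by adding the pattern word or its negation,
    so no multiplier is needed.
    """
    n = len(pattern)
    outputs = []
    for start in range(len(bits) - n + 1):
        acc = 0
        for b, w in zip(bits[start:start + n], pattern):
            acc += w if b > 0 else -w
        outputs.append(acc)
    return outputs


signs = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]   # Barker-13 signs
weights = [3, 1, 2, 1, 3, 2, 1, 1, 2, 3, 1, 2, 1]      # hypothetical magnitudes
pattern = [s * w for s, w in zip(signs, weights)]

offset = 6                                             # assumed sync position
background = [1 if t % 2 == 0 else -1 for t in range(32)]
bits = background[:offset] + signs + background[offset + 13:]

outputs = search_in_time(bits, pattern)
peak = max(range(len(outputs)), key=outputs.__getitem__)
```

The peak of `outputs` falls at the window where the stream matches the sign of every pattern word, which is the sync position a detection threshold would flag.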
Appendix Summary
The results of this investigation show that the One-Shot detector can be
implemented using a systolic approach. The synthesis methods provide a far easier
tool to work with than an ad-hoc type approach. The complexity in terms of the
number of processors, however, is proportional to N^2. This is due to the generation
of the correlation matrix and its inversion. Also, in order to limit the number of
processing elements, not all data flows systolically, e.g. some must stay in the P.E.s.
This requires that there be busses and additional control circuitry to unload the P.E.s.
It was shown that some reduction of P.E.s is possible by exploiting the symmetry in
the correlation matrix and by grouping adjacent cells in the correlators. Further
work may be able to exploit the symmetry in the inversion of R and reduce the
complexity. Further study should also focus on how to reduce the number of P.E.s so
that it is proportional to N and how to interconnect the processing sub-blocks.