Hardware emulation of wireless communication fading channels by Ren, Fei
Scholars' Mine 
Doctoral Dissertations Student Theses and Dissertations 
Fall 2011 
Hardware emulation of wireless communication fading channels 
Fei Ren 
Department: Electrical and Computer Engineering 
Recommended Citation 
Ren, Fei, "Hardware emulation of wireless communication fading channels" (2011). Doctoral 
Dissertations. 1912. 
https://scholarsmine.mst.edu/doctoral_dissertations/1912 
This thesis is brought to you by Scholars' Mine, a service of the Missouri S&T Library and Learning Resources. This 
work is protected by U. S. Copyright Law. Unauthorized use including reproduction for redistribution requires the 
permission of the copyright holder. For more information, please contact scholarsmine@mst.edu. 





Presented to the Faculty of the Graduate School of the
MISSOURI UNIVERSITY OF SCIENCE AND TECHNOLOGY













This dissertation consists of the following five published or accepted papers,
formatted in the style used by the Missouri University of Science and Technology,
listed as follows:
Paper 1, F. Ren and Y. R. Zheng, “Hardware Emulation of Wideband Correlated
Multiple-Input Multiple-Output Fading Channels,” has been accepted for
publication in the Journal of Signal Processing Systems, Jun. 2011. Pages 11-34
Paper 2, F. Ren and Y. R. Zheng, “A Novel Emulator for Discrete-time MIMO
Triply-selective Fading Channels,” has been published in IEEE Trans. Circuits and
Systems I: Regular Papers, vol. 57, pp. 2542-2551, Sep. 2010. Pages 35-65
Paper 3, F. Ren and Y. R. Zheng, “Hardware Implementation of Triply Selective
Rayleigh Fading Channel Simulators,” has been published in Proc. International
Conference on Acoustics, Speech, and Signal Processing (ICASSP10), March 2010.
Pages 66-77
Paper 4, F. Ren and Y. R. Zheng, “A Low-complexity Hardware Implementation
of Discrete-time Frequency-selective Rayleigh Fading Channels,” has been
published in Proc. IEEE International Symposium on Circuits and Systems (ISCAS09),
May 2009. Pages 78-89
Paper 5, S. Subedi, H. Lou, F. Ren, M. Wang, and Y. R. Zheng, “Validation
of the Triply Selective Fading Channel Model Through a MIMO Test Bed
and Experimental Results,” has been accepted for publication in Proc. International
Conference for Military Communications (MILCOM11), Nov. 2011. Pages 90-104
ABSTRACT
This dissertation investigates several main challenges in implementing hardware-based
wireless fading channel emulators, with emphasis on incorporating accurate
correlation properties. Multiple-input multiple-output (MIMO) fading channels are
usually triply-selective with three types of correlation: temporal correlation, inter-tap
correlation, and spatial correlation. The proposed emulators implement the
triply-selective fading channel impulse response (CIR) by incorporating the three types
of correlation into multiple uncorrelated frequency-flat Rayleigh fading waveforms
while meeting real-time requirements for high data-rate, large-sized MIMO, and/or
long CIR channels. Specifically, mixed parallel-serial computational structures are
implemented for Kronecker products of the correlation matrices, which make the
best tradeoff between computational speed and hardware usage. Five practical
fading channel examples are implemented for RF or underwater acoustic MIMO
applications. The performance of the hardware emulators is verified with an Altera
Field-Programmable Gate Array (FPGA) platform, and the results match the software
simulators in terms of statistical and correlation properties.
The dissertation also contributes to the development of a 2-by-2 MIMO
transceiver testbench that is used to measure real-world fading channels. Intensive
channel measurements are performed for indoor fixed mobile-to-mobile channels, and the
estimated CIRs demonstrate the triply-selective correlation properties.
ACKNOWLEDGMENTS
I would like to express my gratitude to all the people who have helped and
supported me in my Ph.D. study.
First and foremost, I thank my advisor, Dr. Yahong Rosa Zheng, who advised
my M.S. thesis and then provided me with the opportunity of Ph.D. study. With her
enthusiasm and inspiration, she thoughtfully guided me in research attitude, specific
knowledge, and technical writing. Throughout my five years of study, she provided
me with encouragement, sound advice, good teaching, many good ideas, and financial
support. I would not be where I am today without her help.
Next, I would also like to express my deep gratitude to Dr. Chengshan Xiao,
for his guidance and support in several joint research projects. I also acknowledge
the support of the Office of Naval Research and the National Science Foundation for
sponsoring these research projects.
I would like to thank the members of my advisory committee, Drs. Jagannathan
Sarangapani, Steve Grant, Randy H. Moss, and Maggie Chen, for their guidance
in my Ph.D. studies and their suggestions on my dissertation.
I wish to thank all my colleagues at Missouri S&T and friends in Rolla for
their kind assistance in my research, study, and rural life during the past five years.
Last but not least, I wish to express my special thanks to my family for their
love, encouragement, and support. Particularly, I would like to thank my parents,
who not only have raised and supported me through the years, but also are taking
care of my baby now. I also thank my parents-in-law for their understanding and
support in my most difficult time. I would also like to gratefully thank my wife, Jing




PUBLICATION DISSERTATION OPTION . . . . . . . . . . . . . . . . . . . . iii
ABSTRACT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
ACKNOWLEDGMENTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
LIST OF ILLUSTRATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x
LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
SECTION
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 BACKGROUND . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 PROBLEM STATEMENT AND DESIGN APPROACH . . . . . . 5
1.3 SUMMARY OF CONTRIBUTIONS . . . . . . . . . . . . . . . . . 8
1.4 REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
PAPER
I. HARDWARE EMULATION OF WIDEBAND CORRELATED
MULTIPLE-INPUT MULTIPLE-OUTPUT FADING CHANNELS . . . . . . 11
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2 THE MATHEMATIC MODEL . . . . . . . . . . . . . . . . . . . . . . 14
3 HARDWARE IMPLEMENTATION METHOD . . . . . . . . . . . . . 15
3.1 The FRFG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.2 Ping-Pong Buffers . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.3 Correlation Multiplier Module . . . . . . . . . . . . . . . . . . . . 20
3.4 Interpolator Module . . . . . . . . . . . . . . . . . . . . . . . . . . 22
4 IMPLEMENTATION EXAMPLES . . . . . . . . . . . . . . . . . . . . 24
4.1 Implementation Example I - Underwater Acoustic Channel . . . . 24
4.2 Implementation Example II - WiMAX Channel . . . . . . . . . . . 26
5 PERFORMANCE EVALUATION . . . . . . . . . . . . . . . . . . . . . 28
5.1 Performance Comparison of Serial and Mixed P-S Methods . . . . 28
5.2 Parameter Specifications and Hardware Usage . . . . . . . . . . . . 29
5.3 Interfacing with Digital Up-Convertor and Down-Convertor . . . . 31
6 CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
7 REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
II. A NOVEL EMULATOR FOR DISCRETE-TIME MIMO
TRIPLY-SELECTIVE FADING CHANNELS . . . . . . . . . . . . . . . . . . 35
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2 DISCRETE-TIME TRIPLY SELECTIVE FADING MODEL . . . . . . 40
3 HARDWARE IMPLEMENTATION METHOD . . . . . . . . . . . . . 43




$\mathbf{C}_{ISI}^{1/2}$ Generator Module . . . . . . . . . . . . . . . . . . . . . . . . 47
3.3 Correlation Multiplier Module . . . . . . . . . . . . . . . . . . . . 49
3.4 Interpolator Module . . . . . . . . . . . . . . . . . . . . . . . . . . 50




$\mathbf{C}_{ISI}^{1/2}$ Generator Performance Evaluation . . . . . . . . . . . . . . . 51
4.2 KP Module Memory Usage Evaluation . . . . . . . . . . . . . . . . 52
4.3 Frequency Selective Fading Channel Example . . . . . . . . . . . . 54
4.4 Triply Selective Fading Channel Example . . . . . . . . . . . . . . 55
4.5 Evaluation of Flat Rayleigh Fading Generators . . . . . . . . . . . 57
4.6 Parameter Specifications and Hardware Usage . . . . . . . . . . . . 59
5 CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6 REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
III. HARDWARE IMPLEMENTATION OF TRIPLY SELECTIVE RAYLEIGH
FADING CHANNEL SIMULATORS . . . . . . . . . . . . . . . . . . . . . . . 66
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2 DISCRETE-TIME MIMO TRIPLY SELECTIVE RAYLEIGH FADING
MODEL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3 HARDWARE IMPLEMENTATION METHOD . . . . . . . . . . . . . 69
4 EXAMPLES AND PERFORMANCE EVALUATION . . . . . . . . . . 72
5 CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6 REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
IV. A LOW-COMPLEXITY HARDWARE IMPLEMENTATION OF DISCRETE-
TIME FREQUENCY-SELECTIVE RAYLEIGH FADING CHANNELS . . . . 78
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
2 DISCRETE-TIME FREQUENCY-SELECTIVE FADING CHANNEL
MODELS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3 FPGA IMPLEMENTATION . . . . . . . . . . . . . . . . . . . . . . . . 83
4 IMPLEMENTATION EXAMPLE AND PERFORMANCE EVALUA-
TION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5 CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
6 REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
V. VALIDATION OF THE TRIPLY SELECTIVE FADING CHANNEL MODEL
THROUGH A MIMO TEST BED AND EXPERIMENTAL RESULTS . . . . 90
Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
1 INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
2 DISCRETE-TIME TRIPLY SELECTIVE FADING MODEL . . . . . . 91
3 TESTBED AND EXPERIMENT . . . . . . . . . . . . . . . . . . . . . 93
4 PROCEDURE, RESULTS AND ANALYSIS . . . . . . . . . . . . . . . 97
4.1 Channel Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2 Estimation of the Channel Coefficient Covariance Matrix . . . . . 98
4.3 Decomposition of the Kronecker Product . . . . . . . . . . . . . . 99
4.4 Estimation of Intertap Covariance Matrix and Spatial Correlation
Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5 CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6 REFERENCES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
SECTION
2 CONCLUSIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
3 PUBLICATIONS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

LIST OF ILLUSTRATIONS
PAPER I




1 Block diagram of proposed correlated MIMO fading channels emulator. . . 16
2 Implementation blocks of the FRFG module. . . . . . . . . . . . . . . . . . 18
3 Hardware implementation of the Ping-Pong buffer module. This diagram
shows the data buffer for the real part $Z_{ci}(Rk)$. The imaginary part uses a
similar buffer structure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4 Hardware implementation of CM module using the mixed P-S method. In
this design, $(MN)$ coefficients of matrix $\mathbf{E}$ are output in parallel per clock
cycle, and one row of $\mathbf{E}$ is output in every $L$ clock cycles. . . . . . . . . . 23
5 Implementation of the interpolator module. . . . . . . . . . . . . . . . . . 23
6 The placement of transmit elements and hydrophones of the underwater
communication system. This is a 2-by-6 MIMO underwater acoustic
communication system where the speed of the acoustic carrier is 1500 m/s and
the frequency of the carrier is 15 kHz. The carrier wavelength is $\lambda = 10$ cm. 25
7 Performance of underwater acoustic fading channel emulator. Auto-correlation
of $h_{1,1}(75,k)$, cross-correlation between $h_{1,1}(75,k)$ and $h_{1,1}(76,k)$, and
cross-correlation between $h_{1,1}(75,k)$ and $h_{2,1}(75,k)$. The channel index is
according to (3). The results are based on hardware outputs of 200 trials
with $2 \times 10^3$ samples per subchannel per trial. . . . . . . . . . . . . . . . . 26
8 Performance of the WiMAX fading channel emulator. Auto-correlation
of $h_{1,1}(0,k)$, cross-correlation between $h_{1,1}(0,k)$ and $h_{1,1}(1,k)$, and
cross-correlation between $h_{1,1}(0,k)$ and $h_{2,1}(1,k)$. The channel index is according
to (3). The results are based on hardware outputs of 50 trials with $2.8 \times 10^4$
samples per subchannel per trial. . . . . . . . . . . . . . . . . . . . . . . . 28
9 Performance comparison for generating one correlated fading complex
response using the serial method and the proposed mixed P-S method. . . . 30
PAPER II
1 Block diagram of discrete-time MIMO triply selective fading emulator. . . 43
2 Implementation blocks of the RNG and FRFG modules. . . . . . . . . . . 46
3 Implementation blocks of the $\mathbf{C}_{ISI}^{1/2}$ generator module. . . . . . . . . . . . . 48
4 Implementation of the correlation multiplier module. . . . . . . . . . . . . 51
5 Implementation of the interpolator module. . . . . . . . . . . . . . . . . . 52
6 The normalized PDPs for the typical urban channel model presented in
3GPP standard [26]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53




hardware fixed-point and Matlab floating-point. Note that the scales of y
axes are different in sub-figures. . . . . . . . . . . . . . . . . . . . . . . . . 53
8 The comparison of memory usage between the pre-compute and store method
and the proposed KP method. . . . . . . . . . . . . . . . . . . . . . . . . . 55
9 Performance of the doubly selective fading channel emulator. Auto-correlation
of $h_{1,1}(k,0)$, cross-correlation between $h_{1,1}(k,0)$ and $h_{1,1}(k,1)$ and between
$h_{1,1}(k,0)$ and $h_{1,1}(k,2)$. The results were based on hardware outputs of 50
trials with $2.8 \times 10^4$ samples per sub-channel per trial. . . . . . . . . . . . 56
10 Performance of the triply selective channel emulator. Auto-correlation of
$h_{1,1}(k,1)$, cross-correlation between $h_{1,1}(k,0)$ and $h_{1,1}(k,1)$, and between
$h_{1,1}(k,0)$ and $h_{2,1}(k,1)$. The numbers of trials and samples are the same as
those in Fig. 9. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
11 Performance of the triply selective channel emulator. Cross-correlation between
$h_{1,1}(k,-1)$ and $h_{1,1}(k,1)$, and between $h_{1,1}(k,-1)$ and $h_{1,1}(k,2)$. Note the
change of scale in the y-axis. . . . . . . . . . . . . . . . . . . . . . . . . . . 58
12 PDF of $Z_{ci}$ and $Z_{si}$. The numbers of trials and samples are the same as in the
previous figure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
13 PDF of $|Z_i|$, where $Z_i = Z_{ci} + jZ_{si}$. The numbers of trials and samples are
the same as in the previous figure. . . . . . . . . . . . . . . . . . . . . . . . 62
14 LCR of $|Z_i|$. The numbers of trials and samples are the same as in the previous
figure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
PAPER III
1 Hardware implementation block diagram of the triply selective fading sim-
ulator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
2 The datapath of the $\mathbf{C}_{ISI}^{1/2}$ generator module. . . . . . . . . . . . . . . . . . 73
3 The datapath of the CM module. . . . . . . . . . . . . . . . . . . . . . . . 74
4 The memory usage comparison of the pre-compute and store method and
the proposed KP method. . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5 The auto-correlation of $h_{1,1}(1,k)$, the cross-correlation between $h_{1,1}(0,k)$
and $h_{1,1}(1,k)$, and cross-correlation between $h_{1,1}(0,k)$ and $h_{2,1}(1,k)$. The
channel index is according to (2). The results are based on hardware outputs
of 50 trials with $2.8 \times 10^4$ samples in each channel per trial. . . . . . . . . 76
PAPER IV
1 (a) A typical urban channel PDP with multiple WSSUS rays. (b) Average
power/tap of $T_s$-spaced discrete-time channel response. . . . . . . . . . . . 82
2 Bandpass filter of the $i$-th ray sampled at $T_{sym}$, where the delay $\tau_i$ is a
fraction of $T_{sym}$. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3 Block diagram of FPGA implementation of the frequency-selective Rayleigh
fading simulator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4 FPGA implementation of the flat fading generator module. . . . . . . . . . 85
5 FPGA implementation of frequency-selective fading generator module. . . 86
6 Autocorrelation and cross-correlation of the $i$-th flat fading ray sampled at
$T_{sym}$ interval. The normalized Doppler frequency was $f_d T_{sym} = 0.0008$
and $f_d = 125$ Hz. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7 Cross-correlation between $H_c(l,k)$ and $H_c(l+1,k)$ of the frequency-selective
channel simulator. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
PAPER V
1 Transmitter setup architecture. . . . . . . . . . . . . . . . . . . . . . . . . 94
2 Receiver setup architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3 Floorplan used for the experiment. . . . . . . . . . . . . . . . . . . . . . . 96
4 Magnitudes of channel impulse responses for four subchannels. . . . . . . . 98
5 Magnitude of estimated channel coefficient covariance matrix. . . . . . . . 99
6 Magnitudes of intertap covariance matrices for each subchannel. . . . . . . 102
7 Magnitude of averaged intertap covariance matrix, $\Psi_{Tap}$. . . . . . . . . . . . 103

LIST OF TABLES
PAPER I
1 Parameter ranges of the proposed emulator with $F_{clk}$ = 50 MHz. . . . . . . 30
2 Resource usage of two examples on Stratix III EP3SL150F1152C2N FPGA
with $F_{clk}$ = 50 MHz. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Resource usage comparisons of related fading channel emulators. . . . . . . 32
PAPER II
1 Parameter ranges of the proposed emulator with $F_{clk}$ = 50 MHz and $T_s = 3.69$
$\mu$s. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2 Resource usage of the MIMO triply selective fading emulator on Stratix III
EP3SL150F1152C2N FPGA with $F_{clk}$ = 50 MHz. . . . . . . . . . . . . . . . 60
3 Performance comparisons of related Rayleigh fading channel emulators. . . 61
PAPER III
1 Hardware usage of the simulator on a Stratix III EP3SL150F1152C2N FPGA
chip. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
PAPER V
1 Comparison of correlation matrices using CMD. . . . . . . . . . . . . . . . 101
1 INTRODUCTION
1.1 BACKGROUND
Wireless fading channel modeling, along with its software simulation and hardware
emulation, is an important topic in wireless communications because it provides
the basis for verifying new algorithm designs, testing transceiver performance,
and analyzing channel capacity. Compared with field tests, wireless fading
channel simulation and emulation are more cost-effective and time-efficient, and
they provide repeatable and reproducible results. Therefore, they have been widely
adopted in academia and industry.
Two types of channel modeling approaches are usually employed: the site-specific
wave propagation approach and the statistical channel modeling approach. The
site-specific wave propagation approach uses Maxwell's equations to simulate wave
propagation through the communication medium, and it requires detailed knowledge
of the physical environment, geometry, and dielectric properties [1]. This approach
can provide channel models for specific sites but requires high computational power
for simulation. In contrast, the statistical channel modeling approach ignores the
physical details and only generates fading channel impulse responses (CIRs) with
accurate statistical properties [2,3,4]. Well-designed statistical models can match
the statistical properties of real-world fading channels, such as the probability
density function (PDF), power delay profile, and auto/cross-correlation. Their
computational complexity is much lower than that of the wave propagation approach,
and they have therefore gained widespread use in the last two decades.
This dissertation takes the statistical modeling approach to multiple-input
multiple-output (MIMO) fading channel modeling and focuses on hardware emulation.
The input-output relationship of a MIMO fading channel in the discrete-time
domain can be modeled by using a finite impulse response (FIR) filter method. The
output vector is given by
$$\mathbf{y}(k) = \sum_{l=-L_1}^{L_2} \mathbf{H}_l(k) \cdot \mathbf{x}(k-l) + \mathbf{v}(k) \tag{1.1}$$
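where $\mathbf{x}(k) = [x_1(k), x_2(k), \ldots, x_N(k)]^t$, $\mathbf{y}(k) = [y_1(k), y_2(k), \ldots, y_M(k)]^t$, and
$\mathbf{v}(k) = [v_1(k), v_2(k), \ldots, v_M(k)]^t$ are the input vector, output vector, and noise vector at time
instant $k$, respectively. The parameters $M$, $N$, $L_1$, and $L_2$ are the number of receiver
(Rx) elements, the number of transmitter (Tx) elements, and the low and high indices of
the channel taps per sub-channel, respectively. The matrix $\mathbf{H}_l(k)$ is the channel matrix
at time instant $k$ and delay tap $l$:
$$\mathbf{H}_l(k) = \begin{pmatrix} h_{1,1}(k,l) & \cdots & h_{1,N}(k,l) \\ \vdots & \ddots & \vdots \\ h_{M,1}(k,l) & \cdots & h_{M,N}(k,l) \end{pmatrix} \tag{1.2}$$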
where $h_{m,n}(k,l)$ is the instantaneous channel coefficient at time instant $k$ and delay
tap $l$ for the sub-channel of the $m$-th Rx element and $n$-th Tx element. The
symbol duration is denoted as $T_s$. For the convenience of hardware emulation,
the matrix $\mathbf{H}_l(k)$ is reshaped to a MIMO channel coefficient vector $\mathbf{h}_{vec}(k)$ as follows:
$$\begin{aligned}
\mathbf{h}_{vec}(k) &= [\mathbf{h}_{1,1}(k), \ldots, \mathbf{h}_{1,N}(k) \mid \cdots \mid \mathbf{h}_{M,1}(k), \ldots, \mathbf{h}_{M,N}(k)]^t \\
\mathbf{h}_{m,n}(k) &= [h_{m,n}(k,-L_1), \ldots, h_{m,n}(k,L_2)]^t
\end{aligned} \tag{1.3}$$
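As an illustration of (1.1) and (1.3), the following Python/NumPy sketch applies a time-varying MIMO FIR channel to an input sequence and flattens the tap matrices of one time instant into the coefficient vector. It is a floating-point software sketch only and is not part of the hardware design; the function names, the array layout `H[k, l + L1]`, and the treatment of input samples outside the observation window as zero are assumptions made for this example.

```python
import numpy as np

def apply_mimo_fir_channel(H, x, L1, noise=None):
    """Apply y(k) = sum_{l=-L1}^{L2} H_l(k) x(k-l) + v(k), as in (1.1).

    H     : complex array, shape (K, L, M, N); H[k, l + L1] holds H_l(k)
    x     : complex array, shape (K, N); x[k] is the input vector x(k)
    L1    : number of taps with negative index (l runs from -L1 to L2)
    noise : optional complex array, shape (K, M); the noise vector v(k)
    Input samples outside 0..K-1 are treated as zero (assumption).
    """
    K, L, M, N = H.shape
    y = np.zeros((K, M), dtype=complex)
    for k in range(K):
        for tap in range(L):
            l = tap - L1                  # actual tap index in [-L1, L2]
            if 0 <= k - l < K:
                y[k] += H[k, tap] @ x[k - l]
    if noise is not None:
        y = y + noise
    return y

def to_h_vec(Hk):
    """Flatten the taps of one time instant, shape (L, M, N), as in (1.3).

    Ordering: Rx index m outermost, then Tx index n, then tap index l.
    """
    return np.transpose(Hk, (1, 2, 0)).reshape(-1)
```

With this ordering, the $L$ taps of each sub-channel are contiguous in the flattened vector, which matches the grouping by sub-channel shown in (1.3).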
A practical MIMO fading channel exhibits three types of correlation. Spatial
correlation models space-selectivity, temporal correlation describes
time-selectivity, and inter-tap correlation captures frequency-selectivity. Such a
channel is therefore referred to as a MIMO triply-selective fading channel [2].
Spatial correlation is usually measured and predefined according to the properties
of the multiple antennas, and is denoted by the Rx correlation matrix $\Psi_{Rx}$ and
the Tx correlation matrix $\Psi_{Tx}$. The inter-tap correlation is denoted by the
inter-tap correlation matrix $\mathbf{C}_{ISI}$. With these correlation matrices, the MIMO
channel coefficient vector can be generated as
$$\mathbf{h}_{vec}(k) = \left(\Psi_{Rx}^{1/2} \otimes \Psi_{Tx}^{1/2} \otimes \mathbf{C}_{ISI}^{1/2}\right) \cdot \Phi(k) \tag{1.4}$$
where the vector $\Phi(k) = [Z_1(k), \ldots, Z_{MNL}(k)]^t$ consists of the CIRs of multiple
frequency-flat fading sub-channels at time instant $k$, and $L = L_1 + L_2 + 1$ is the
number of taps per sub-channel. The operators “$\otimes$” and “$(\cdot)^{1/2}$” are the
Kronecker product and the square root of a matrix, respectively. More details about this
equation can be found in [2].
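The coloring step described around (1.4) can be sketched in floating point as follows. The sketch assumes that (1.4) applies the Kronecker product of the three matrix square roots to $\Phi(k)$, and it uses an eigendecomposition-based Hermitian square root, which is one valid choice of matrix square root and not necessarily the one used in the hardware. The function names are hypothetical.

```python
import numpy as np

def psd_sqrt(A):
    """Hermitian square root of a Hermitian positive semidefinite matrix."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)          # guard against tiny negative round-off
    return (V * np.sqrt(w)) @ V.conj().T

def color_channel(psi_rx, psi_tx, c_isi, phi):
    """Return (Psi_Rx^{1/2} (x) Psi_Tx^{1/2} (x) C_ISI^{1/2}) @ phi.

    phi has shape (M*N*L, K): one column of uncorrelated flat-fading
    samples per time instant k.
    """
    root = np.kron(np.kron(psd_sqrt(psi_rx), psd_sqrt(psi_tx)), psd_sqrt(c_isi))
    return root @ phi
```

By the mixed-product property of the Kronecker product, the square of this root equals $\Psi_{Rx} \otimes \Psi_{Tx} \otimes \mathbf{C}_{ISI}$, so unit-variance uncorrelated inputs acquire the desired spatial and inter-tap covariance.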
The CIRs of the $i$-th frequency-flat fading sub-channel, $Z_i(k)$, exhibit temporal
correlation, and can be generated by flat Rayleigh fading generators (FRFG) using the
sum-of-sinusoids (SoS) method in [3]. The SoS method is described by the following
equations:
$$\begin{aligned}
Z_i(k) &= Z_{ci}(k) + j Z_{si}(k), \\
Z_{ci}(k) &= \sqrt{\frac{2}{M}} \sum_{m=1}^{M} \cos(2\pi(f_d k T_s \cos\alpha_{m,i} + \phi_{m,i})), \\
Z_{si}(k) &= \sqrt{\frac{2}{M}} \sum_{m=1}^{M} \cos(2\pi(f_d k T_s \sin\alpha_{m,i} + \varphi_{m,i})), \\
\alpha_{m,i} &= \frac{\pi(m - 0.5 + \theta_i)}{2M}, \quad m = 1, 2, \cdots, M,
\end{aligned} \tag{1.5}$$
where $Z_i(k)$ is the complex CIR of the frequency-flat fading channel at time instant $k$, $f_d$
is the maximum Doppler frequency, $M$ is the total number of sinusoids, and $j = \sqrt{-1}$.
The angle of arrival $\alpha_{m,i}$ is randomized by the variable $\theta_i$. The random variables
$\phi_{m,i}$ and $\varphi_{m,i}$ are the random phases of the in-phase and quadrature components,
respectively. The random variables $\phi_{m,i}$, $\varphi_{m,i}$, and $\theta_i$ are statistically independent
and uniformly distributed on $[-0.5, 0.5]$.
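A floating-point sketch of an SoS generator of this kind is given below. It follows the angle-of-arrival and phase definitions of (1.5); the amplitude normalization $\sqrt{2/M}$, which gives each of the in-phase and quadrature components unit average power, is an assumption of this sketch, as are the function name, the default number of sinusoids, and the use of NumPy's random generator in place of a hardware random number generator.

```python
import numpy as np

def sos_flat_fading(num_channels, num_samples, fd, Ts, M=8, rng=None):
    """Generate flat Rayleigh fading waveforms Z_i(k) by a sum of sinusoids.

    fd : maximum Doppler frequency in Hz
    Ts : sample (symbol) period in seconds
    M  : number of sinusoids per component
    Returns a complex array of shape (num_channels, num_samples).
    """
    rng = np.random.default_rng() if rng is None else rng
    k = np.arange(num_samples)
    m = np.arange(1, M + 1)[:, None]                      # shape (M, 1)
    Z = np.empty((num_channels, num_samples), dtype=complex)
    for i in range(num_channels):
        theta = rng.uniform(-0.5, 0.5)                    # theta_i
        phi_c = rng.uniform(-0.5, 0.5, size=(M, 1))       # in-phase phases
        phi_s = rng.uniform(-0.5, 0.5, size=(M, 1))       # quadrature phases
        alpha = np.pi * (m - 0.5 + theta) / (2 * M)       # angles of arrival
        arg_c = 2 * np.pi * (fd * Ts * k * np.cos(alpha) + phi_c)
        arg_s = 2 * np.pi * (fd * Ts * k * np.sin(alpha) + phi_s)
        Zc = np.sqrt(2.0 / M) * np.cos(arg_c).sum(axis=0)
        Zs = np.sqrt(2.0 / M) * np.cos(arg_s).sum(axis=0)
        Z[i] = Zc + 1j * Zs
    return Z
```

Because the random variables are drawn independently for each index $i$, the resulting waveforms are mutually uncorrelated while each one carries the temporal correlation set by $f_d T_s$.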
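The inter-tap correlation matrix $\mathbf{C}_{ISI}$ is related to the channel power delay profile
(PDP), the transceiver shaping/matching filters, and the symbol rate. According to [2], the
coefficient $c(l_1, l_2)$ of $\mathbf{C}_{ISI}$, with $-L_1 \le l_1, l_2 \le L_2$, is given by
$$\begin{aligned}
c(l_1, l_2) &= \int_{-\infty}^{\infty} R_{P_T P_R}(l_1 T_s - \tau) \, R^{*}_{P_T P_R}(l_2 T_s - \tau) \, G(\tau) \, d\tau, \\
G(\tau) &= \sum_{i=1}^{K} \sigma_i^2 \delta(\tau - \tau_i)
\end{aligned} \tag{1.6}$$
where $G(\tau)$ is the power delay profile, $K$ is the total number of resolvable paths,
$\sigma_i^2$ is the power of the $i$-th path with delay $\tau_i$, and $R_{P_T P_R}(\varepsilon)$ is the convolution
of the Tx shaping filter and the Rx matching filter.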
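For a discrete PDP of the form in (1.6), the inter-tap correlation coefficients of [2] reduce to a finite sum over the $K$ paths, $c(l_1, l_2) = \sum_i \sigma_i^2 R_{P_T P_R}(l_1 T_s - \tau_i) R^{*}_{P_T P_R}(l_2 T_s - \tau_i)$. The sketch below evaluates this sum under that formulation. It uses a raised-cosine pulse as a stand-in for $R_{P_T P_R}$, which corresponds to root-raised-cosine filters at both ends; the pulse choice, the roll-off value, and the function names are assumptions of this example.

```python
import numpy as np

def raised_cosine(t, Ts, beta=0.5):
    """Raised-cosine pulse (roll-off beta > 0), a stand-in for R_{P_T P_R}."""
    t = np.asarray(t, dtype=float) / Ts
    denom = 1.0 - (2.0 * beta * t) ** 2
    singular = np.isclose(denom, 0.0)
    safe = np.where(singular, 1.0, denom)
    out = np.sinc(t) * np.cos(np.pi * beta * t) / safe
    # Limit value at t = +/- Ts / (2 * beta), where the formula is 0/0
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * beta))
    return np.where(singular, limit, out)

def c_isi(powers, delays, Ts, L1, L2, beta=0.5):
    """Inter-tap correlation matrix for a discrete power delay profile.

    powers : path powers sigma_i^2
    delays : path delays tau_i, in the same unit as Ts
    Returns an (L1 + L2 + 1) x (L1 + L2 + 1) matrix.
    """
    taps = np.arange(-L1, L2 + 1)
    delays = np.asarray(delays, dtype=float)
    powers = np.asarray(powers, dtype=float)
    # R[l, i] = R_{P_T P_R}(l * Ts - tau_i)
    R = raised_cosine(taps[:, None] * Ts - delays[None, :], Ts, beta)
    return (R * powers[None, :]) @ R.conj().T
```

A single path at zero delay yields a matrix with one non-zero entry at tap $l = 0$, because the raised-cosine pulse vanishes at all other symbol-spaced instants; fractional delays spread power across neighbouring taps and create the inter-tap correlation.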
Ignoring the correlation properties results in inaccurate MIMO channel models,
leads to incorrect CIRs, and may cause transceiver designs to fail. For example,
temporal correlation, caused by Doppler shift, can affect the selection of coding length
and receiver performance. Spatial correlation, caused by insufficient spacing between
antenna elements, can reduce channel capacity drastically. Inter-tap correlation can
affect channel gains at different frequencies, and thus influences equalizer design, adaptive
modulation selection, and multi-user channelization. Therefore, considering all
three types of correlation is a key, as well as difficult, aspect of accurate MIMO
channel modeling.
Several software-based fading channel simulators have been developed to generate
CIRs of MIMO fading channels employing general-purpose processors and
floating-point algorithms. A discrete-time MIMO triply-selective fading channel simulator has
been proposed in [2]. It computes the inter-tap correlation matrix according to the
power delay profile, and then incorporates the inter-tap and spatial correlation
matrices into multiple uncorrelated frequency-flat fading CIRs via the Kronecker product.
Another simulator, proposed in [4], synthesizes correlated vector channels (with a
user-specified correlation function) using the autoregressive modeling method to shape the
spectrum of uncorrelated white Gaussian processes. Software-based fading channel
simulators are widely used for testing research algorithms in non-real-time settings.
For real-time testing and instrumentation, however, hardware-based wireless
fading channel emulators are required to generate analog fading waveforms or digital
CIRs in real-time. Implementing hardware-based emulators for MIMO fading chan-
nels with accurate correlation properties is more difficult than software simulation
due to timing and resource constraints. Existing hardware-based channel emulators
in commercial products or academic papers often simplify their design by ignoring
some of the correlation functions in MIMO channels. For example, existing commer-
cial MIMO emulators, such as NoiseCom MP-2500, Agilent N5115B, Azimuth ACE-
MX/440B, etc., only implement the temporal correlation function of fading channels.
Many so-called MIMO emulators in the literature [5] and [6] only implement multiple
uncorrelated frequency-flat fading sub-channels or consider only spatial or temporal
correlation.
1.2 PROBLEM STATEMENT AND DESIGN APPROACH
This dissertation investigates several main challenges in implementing hardware-based
triply-selective MIMO fading channel emulators with accurate correlation properties. These
challenges include incorporating correlation matrices into channel models, generating
CIR outputs in real time, and making a tradeoff between processing speed and hardware
resource usage.
The first challenge is to implement matrix computation modules in hardware.
To incorporate the three types of correlation in (1.4) into MIMO fading channels
in hardware emulators, extensive matrix computations such as the Kronecker product,
matrix square root, and matrix multiplication are required. Implementing them
effectively in hardware is challenging because of the two-dimensional nature of matrix
computations. In particular, computing the correlation coefficients in (1.6) and the
matrix square roots are the most complex tasks for hardware implementation.
The second challenge is to meet the real-time output requirements of different
fading channels. The Kronecker product, matrix square root, and matrix multiplication
require a large number of multiplications and additions, especially when the
matrices are large. In practical RF and underwater acoustic wireless channels,
MIMO sizes, $M \times N$, range from $2 \times 2$ to $8 \times 12$. The CIR length, $L$, can vary
from ten to several hundred. The resulting matrix, $\mathbf{C}_h^{1/2}(0)$, can have a size as large
as $1000 \times 1000$. The implementation challenge lies in how to speed up the
computation via parallel processing and pre-computing, and how to reduce the
computational load algorithmically.
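To give a sense of scale, the short sketch below counts the multiply-accumulate (MAC) operations of one dense matrix-vector product of size $MNL$ and converts them to clock cycles and time. The model is idealized and hypothetical: it assumes one MAC per computational path per clock cycle and uses a 50 MHz clock, the clock frequency quoted in the tables of Papers I and II; it is not a measured figure for any of the emulators.

```python
def dense_update_cycles(M, N, L, paths=1):
    """Clock cycles for one (MNL x MNL) matrix-vector product, assuming each
    of `paths` parallel paths performs one multiply-accumulate per cycle."""
    size = M * N * L
    macs = size * size
    return -(-macs // paths)            # ceiling division

def update_time_seconds(M, N, L, paths=1, f_clk_hz=50e6):
    """Time for one update at clock frequency f_clk_hz (idealized model)."""
    return dense_update_cycles(M, N, L, paths) / f_clk_hz
```

For example, a channel with $MNL = 1000$ needs $10^6$ MACs per update, which is 20 ms on a single serial path at 50 MHz under this model; 16 parallel paths would reduce this to 62500 cycles.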
The third challenge is to achieve a good balance between processing speed and
hardware resource usage. For example, pre-computation can
eliminate the huge computational load completely and achieve high processing speed,
but this approach requires a huge amount of memory to store the pre-computed data
and places stringent requirements on hardware memories. Another example is the
selection between parallel and serial structures. Using fully parallel processing for all
MIMO sub-channels can increase processing speed but again requires large hardware
resources that grow linearly with the size of the channel. Using fully serial
processing saves hardware resources but requires a large amount of processing time.
Analyzing the real-time requirements of different MIMO channels and designing balanced
implementations are the challenges as well as the contributions of this dissertation.
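The memory side of this tradeoff can be illustrated with a simple count of stored coefficients. The sketch compares storing a full $MNL \times MNL$ matrix against storing the three Kronecker factors of (1.4) separately; this is only an illustration of why the factored form is attractive, under the assumption that the factors have sizes $M \times M$, $N \times N$, and $L \times L$, and the function name is hypothetical.

```python
def stored_coefficients(M, N, L):
    """Return (full, factored) coefficient counts.

    full     : one pre-computed (MNL x MNL) matrix
    factored : the M x M, N x N, and L x L factors stored separately
    """
    full = (M * N * L) ** 2
    factored = M * M + N * N + L * L
    return full, factored
```

For a 2-by-2 channel with $L = 10$, the full matrix has 1600 coefficients while the three factors have 108 in total; the gap widens quadratically as the channel grows, at the price of forming the Kronecker product at run time.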
The approach to dealing with the hardware challenges utilizes four techniques:
1. [...] $\mathbf{C}_{ISI}^{1/2}$. This technique achieves a good tradeoff between
processing speed and memory usage.
2. Develop a mixed parallel-serial (mixed P-S) computational structure, which
employs different numbers of computational paths to compute the Kronecker
product and vector multiplication in parallel. The computational speed of this
mixed P-S structure is flexible and adjustable to meet the various real-time
requirements of different triply-selective fading channels.
3. [...] $\mathbf{C}_{ISI}^{1/2}$, and can rebuild the other half from the stored half. When long
CIR channels with a large-sized $\mathbf{C}_{ISI}^{1/2}$ are emulated, this module can save a
large amount of memory.
4. Efficiently reuse the FRFG of (1.5) to generate the multiple independent signals
required in $\Phi(k)$ of (1.4). The proposed method employs one FRFG to generate
up to hundreds of frequency-flat fading sub-channels in parallel. It makes use of
Ping-Pong buffers to buffer the outputs of the FRFG and to synchronize the speed of
these outputs with the mixed P-S computational paths. A large amount of hardware
resources is saved by reducing multiple FRFGs to only one.
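The idea of combining on-the-fly Kronecker coefficients with a configurable number of parallel paths can be sketched in software as follows. This is a behavioral model only: the assignment of output rows to parallel paths and of columns to serial clock cycles is an assumption of this example and does not reproduce the datapath of the proposed hardware, and the function names are hypothetical. The factors are assumed to be square.

```python
import numpy as np

def kron_entry(A, B, C, row, col):
    """Entry (row, col) of A (x) B (x) C, computed from the small factors."""
    nb, nc = B.shape[0], C.shape[0]
    ra, rem = divmod(row, nb * nc)
    rb, rc = divmod(rem, nc)
    ca, rem = divmod(col, nb * nc)
    cb, cc = divmod(rem, nc)
    return A[ra, ca] * B[rb, cb] * C[rc, cc]

def mixed_ps_multiply(A, B, C, x, paths):
    """Compute y = (A (x) B (x) C) x with `paths` parallel MAC paths.

    Each pass of the column loop models one clock cycle in which every
    path forms one Kronecker coefficient and accumulates one product.
    Returns y and the number of modelled clock cycles.
    """
    size = A.shape[0] * B.shape[0] * C.shape[0]
    y = np.zeros(size, dtype=complex)
    cycles = 0
    for row_start in range(0, size, paths):
        rows = range(row_start, min(row_start + paths, size))
        for col in range(size):           # serial over columns
            for row in rows:              # parallel across paths in hardware
                y[row] += kron_entry(A, B, C, row, col) * x[col]
            cycles += 1
    return y, cycles
```

In this model the cycle count is $\lceil MNL / P \rceil \cdot MNL$ for $P$ paths, so $P = 1$ gives the fully serial structure and $P = MNL$ the fully parallel one; intermediate values trade hardware for speed without ever storing the full Kronecker product.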
The proposed MIMO triply-selective emulators are implemented and tested
on an FPGA platform. The Altera Stratix III EP3SL150F1152C2N FPGA/DSP
development kit is employed as the hardware platform. This development kit contains
a Stratix III EP3SL150F1152C2N FPGA chip, 72 MB SRAM, 16 MB flash memory,
display LEDs, push-buttons, DIP switches, and data conversion high-speed mezzanine
daughter cards. The Stratix III FPGA chip features 142000 logic elements (LEs), 5499
Kbits of memory, 384 multiplier blocks, eight phase-locked loops (PLLs), 16 global
clock networks, and 736 user I/Os. Altera Quartus II v9.1, DSP Builder, and Matlab
Simulink are used for hardware development. Several fading channel examples
are implemented on the development kit, and their output results are analyzed and
verified in Matlab.
In addition, a hardware wireless 2-by-2 MIMO channel testbench is developed in
this dissertation. This testbench can be used to study the CIRs and correlation
properties of real-world MIMO channels. Experimental results have verified the spatial and
temporal correlation properties for indoor fixed mobile-to-mobile MIMO channels.
1.3 SUMMARY OF CONTRIBUTIONS
Research of this dissertation addresses the technical challenges in hardware
implementation of triply-selective MIMO fading channel emulators and the wireless
MIMO channel testbench. My work results in one journal publications, one journal
submission, and four conference publications. The complete publication list can be
found in Section 3. The technical contributions of the dissertation are.




generator is proposed and successfully incorporated into triply-selective fading
channel emulators. So far, none of the other existing emulators implement all
three types of correlation. The algorithm for computing $\mathbf{C}_{ISI}^{1/2}$ is based on (1.6)
and the matrix square root. The $\mathbf{C}_{ISI}^{1/2}$ generator employs a LUT scheme and a serial
computational structure to generate the coefficients of $\mathbf{C}_{ISI}^{1/2}$ one by one. In order
to compute the matrix square root, the $\mathbf{C}_{ISI}^{1/2}$ generator implements the Jacobi
algorithm for singular-value decomposition, the square root calculation for each




$\mathbf{C}_{ISI}^{1/2}$ generator can be run once at the beginning of each simulation trial, and
its results can be stored and used for the entire trial.
2. The mixed P-S computational structure is proposed and incorporated, together
with a fully serial FRFG, into the triply-selective fading channel emulators.
Compared to the serial and parallel structures, the mixed P-S structure makes the
best tradeoff between computational speed and hardware usage for extremely
time-consuming modules such as the Kronecker product and vector multiplication.
It employs parallel computational paths to generate multiple results of the
Kronecker product and vector multiplication in one clock period. The number of
computational paths can be adjusted to meet the real-time requirements of
different fading channels. Our testing shows that the mixed P-S structure can
handle MIMO channels with (푀푁퐿) up to 1600. According to equation (1.5), the
serial FRFG takes the Doppler frequency and symbol period as inputs, and generates
multiple frequency-flat fading sub-channels in parallel. The linear feedback shift
register (LFSR) random number generators (RNG) and an accurate LUT scheme are
employed to generate the results of cos / sin(⋅) in equation (1.5) at a precision
level of $6.1 \times 10^{-5}$. Sub-channels generated by this FRFG are shown to have
accurate statistical properties and temporal correlation.
3. The hardware implementation of the wireless MIMO channel testbench includes
the transmitter and receiver design. On the transmitter side, the frame assembly
module and digital up convertor (DUC) are implemented on the FPGA development
kit. On the receiver side, the bandpass sampler, digital down convertor
(DDC), frame synchronization module, carrier phase detection and compensation
module, and frame extraction module are implemented. Based on real-world
measurement experiments, this testbench can provide experimental results
for CIR estimation and correlation matrix estimation. Experimental
results demonstrate that the discrete-time triply selective fading channel can be
expressed as separable temporal, inter-tap, and spatial correlations. The spatial
and inter-tap correlation matrices can be estimated through the decomposition
of the channel coefficient covariance matrix.
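The matrix square root step mentioned in contribution 1 can be illustrated with a minimal Python sketch. It assumes a real symmetric positive semidefinite input, for which the singular-value decomposition coincides with the eigendecomposition, and it uses floating-point arithmetic rather than the serial fixed-point hardware described above. The function names `jacobi_eigh` and `sqrtm_psd`, the sweep limit, and the tolerance are illustrative choices and are not taken from the dissertation.

```python
import math


def jacobi_eigh(a, sweeps=50, tol=1e-24):
    """Cyclic Jacobi eigendecomposition of a real symmetric matrix.

    Returns (eigenvalues, V) with a = V * diag(eigenvalues) * V^T.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p, q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rows p, q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def sqrtm_psd(x):
    """Symmetric square root S of a PSD matrix x, so that S * S^T = x."""
    eigvals, v = jacobi_eigh(x)
    n = len(x)
    roots = [math.sqrt(max(lam, 0.0)) for lam in eigvals]
    return [[sum(v[i][k] * roots[k] * v[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]
```

Because the result only depends on the correlation matrix, such a routine needs to run once per simulation trial, which is consistent with the remark above that the generator's results can be stored and reused.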
1.4 REFERENCES
[1] A. Goldsmith, Wireless Communications. Cambridge, England: Cambridge Uni-
versity Press, 2005.
[2] C. Xiao, J. X. Wu, S.-Y. Leong, Y. R. Zheng, and K. B. Letaief, “A Discrete-
time Model for Triply Selective MIMO Rayleigh Fading Channels,” IEEE Trans.
Wireless Commun., vol. 3, no. 5, pp. 1678-1688, Sep. 2004.
[3] Y. R. Zheng and C. Xiao, “Simulation Models with Correct Statistical Properties
for Rayleigh Fading Channels,” IEEE Trans. Commun., vol. 51, no. 6, pp. 920-
928, Jun. 2003.
[4] B. E. Baddour and N. C. Beaulieu, “Accurate Simulation of Multiple Cross-
correlated Rician Fading Channels,” IEEE Trans. Commun., vol. 52, no. 11, pp.
1980-1987, Nov. 2004.
[5] M. Cui, H. Murata, and K. Araki, “FPGA Implementation of 4×4 MIMO Test-
bed for Spatial Multiplexing Systems,” in Proc. IEEE Int. Symp. Pers., Indoor,
Mobile Radio Commun., Barcelona, Spain, Sep. 2004, pp. 3045-3048.
[6] A. Alimohammad, S. F. Fard, B. F. Cockburn, and C. Schlegel, “A Novel Tech-
nique For Efficient Hardware Simulation of Spatiotemporally Correlated MIMO
Fading Channels,” in Proc. IEEE Int. Conf. Commun., Beijing, China, May
2008, pp. 718-724.
[7] P. L’Ecuyer, “Tables of Maximally Equidistributed Combined LFSR Generators,”
Math. Comput., vol. 68, no. 225, pp. 261-269, Jan. 1999.
[8] Y. R. Zheng, “A Non-isotropic Model for Mobile-to-Mobile Fading Channel Sim-
ulations,” in Proc. IEEE MILCOM, Washington DC, USA, Oct. 2006, pp. 1-7.
PAPER
I. HARDWARE EMULATION OF WIDEBAND
CORRELATED MULTIPLE-INPUT
MULTIPLE-OUTPUT FADING CHANNELS
Fei Ren and Yahong Rosa Zheng
Abstract—A low-complexity hardware emulator is proposed for wideband, corre-
lated, multiple-input multiple-output (MIMO) fading channels. The proposed emu-
lator generates multiple discrete-time channel impulse responses (CIR) at the symbol
rate and incorporates three types of correlation functions of the subchannels via Kro-
necker product: the spatial correlation between transmit or receive elements, tem-
poral correlation due to Doppler shifts, and inter-tap correlation due to multipaths.
The Kronecker product is implemented by a novel mixed parallel-serial (mixed P-S)
matrix multiplication method to reduce memory storage and to meet the real-time
requirement in high data-rate, large MIMO size, or long CIR systems. We present two
practical MIMO channel examples implemented on an Altera Stratix III EP3SL150F
FPGA DSP development kit: a 2-by-2 MIMO WiMAX channel with a symbol rate of
1.25 million symbols/second and a 2-by-6 MIMO underwater acoustic channel with
100-tap CIR. Both examples meet real-time requirement using only 12–14 percent of
hardware resources of the FPGA.
1 INTRODUCTION
Fading channel emulators provide a fast and low-cost method for testing
and verifying new algorithm design, transceiver performance, and channel capacity
analysis [1, 2, 3]. Many products are available commercially for emulating single-
input single-output (SISO) or multiple-input multiple-output (MIMO) fading chan-
nels. For example, the NoiseCom MP-2500 multipath fading emulator can emulate
SISO frequency-selective fading channels with up to 12 delay paths. The Agilent
N5115B baseband studio test set is featured with standards-based fading configura-
tions and can support fading channels with up to 48 delay paths. The Rohde&Schwarz
ABFS simulator offers two independent six-path baseband fading channels with pre-
programmed fading models in mobile radio standards [4]. The Azimuth ACE-400WB
supports up to 4-by-4 MIMO fading channels in real time with antenna correla-
tion [5]. The Elektrobit’s Propsim F8 RF channel emulator can support up to 16
MIMO fading channels with various radio interfaces such as 802.11n, 3GPP LTE,
WiMAX, and Wi-Fi [6]. Although most of them are equipped with advanced features
such as fading channel profiles specified by current standards, bi-directional channel
modeling, RF interfaces, etc., these existing emulators only provide multiple indepen-
dent fading subchannels with the temporal correlation function implemented through
Doppler spectrum filtering or Sum of Sinusoids (SoS). However, practical MIMO fad-
ing channels usually exhibit all three types of correlation functions, referred to as
triply-selective channels: time-selectivity due to Doppler (described by temporal cor-
relation), frequency-selectivity due to multipath (described by inter-tap correlation),
and space-selectivity (associated with the spatial correlation of transmitters and re-
ceivers) [3]. It has been shown in [1] that these correlation functions have significant
impact on channel capacity, bit error rate (BER), and transceiver design. Ignoring
these correlation functions will lead to impractical testing results.
Incorporating correlation functions into fading subchannels is the key but
difficult aspect of accurately generating correlated MIMO fading channels. Many
software-based channel simulators [1,2,3,7,8,9,10] have successfully simulated doubly-
selective or triply-selective correlated fading channels, and they provide the
theoretical foundation for hardware-based channel emulators. Recently, research on
hardware-based channel emulators for doubly-selective fading channels has been reported
in [11,12,13,14,15], where [11,12,13] propose frequency-selective SISO fading channel
emulators, and [14,15] report MIMO fading channel emulators without spatial and/or
inter-tap correlation. Recently, we developed a hardware-based MIMO fading channel
emulator [16] incorporating all three types of correlation functions based on the
software simulation method in [3]. This emulator [16] computes the three correlation
matrices in the hardware and can emulate a baseband MIMO triply-selective fading
channel with (푀 ×푁 ×퐿)=160, where 푀 is the number of receive elements, 푁 is the
number of transmit elements, and 퐿 is the number of taps. It is more challenging for
such a correlated MIMO fading channel emulator to meet the real-time requirement
in high data-rate, large MIMO size, or long channel impulse response (CIR) fading
channels.
In this paper, we improve the MIMO fading channel emulator in [16] through
a novel mixed parallel-serial (mixed P-S) multiplication structure and two sets of
Ping-Pong buffers to achieve real-time implementations of large-dimension MIMO
channels. The new emulator is capable of generating MIMO baseband equivalent
fading channels with up to (푀 × 푁 × 퐿)=1600. This is equivalent to either 1600
independent frequency-flat fading channels, or 16 SISO frequency-selective fading
channels with 100 taps each, or an 푁-by-푀 (푀푁 ≤ 16) triply-selective fading channel
with 100 taps per subchannel. To demonstrate the capability and accuracy of the
emulator, two typical MIMO fading channel examples: a 2-by-2 WiMAX channel with
a short symbol duration time 0.8 휇푠 and a 2-by-6 underwater acoustic channel with
100 taps CIRs, are implemented on a Stratix III EP3SL150F FPGA DSP development
kit, and their outputs are proved to have accurate correlation properties. Less than
15 percent of the hardware resource is required in these two examples and real-
time requirements are met. The proposed MIMO channel emulators are tested via
Hardware-in-Loop (HIL) models in Simulink.
2 THE MATHEMATIC MODEL
The mathematic model of the proposed emulator is the discrete-time MIMO
triply selective fading model in [3]. Consider a MIMO channel with 푁 transmit and
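푀 receive elements. The input-output relationship of the channel in the discrete-time
domain is
$$\mathbf{y}(k) = \sum_{l=-L_1}^{L_2} \mathbf{H}(k,l) \cdot \mathbf{x}(k-l) + \mathbf{v}(k), \qquad (1)$$
where $\mathbf{x}(k) = [x_1(k), x_2(k), ..., x_N(k)]^t$ is the transmitted signal vector,
$\mathbf{y}(k) = [y_1(k), y_2(k), ..., y_M(k)]^t$ is the received signal vector, the superscript
$(\cdot)^t$ is the transpose operator of a matrix or vector, and
$\mathbf{v}(k) = [v_1(k), v_2(k), ..., v_M(k)]^t$ is the background white Gaussian noise.
The symbol duration is assumed to be $T_s$. The variables $L_1$ and $L_2$ are nonnegative
integers representing the range of delay taps, so the total channel length is
$L = (L_1 + L_2 + 1)$ taps. The MIMO channel matrix at time index $k$ and delay tap $l$
is the $M \times N$ matrix
$$\mathbf{H}(k,l) = \begin{bmatrix} h_{1,1}(k,l) & \cdots & h_{1,N}(k,l) \\ \vdots & \ddots & \vdots \\ h_{M,1}(k,l) & \cdots & h_{M,N}(k,l) \end{bmatrix}. \qquad (2)$$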
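For the convenience of description, we reshape the matrix $\mathbf{H}(k,l)$ into the
$(MNL) \times 1$ coefficient vector
$$\mathbf{h}_{vec}(k) = [\mathbf{h}_{1,1}(k), ..., \mathbf{h}_{1,N}(k) \mid ... \mid \mathbf{h}_{M,1}(k), ..., \mathbf{h}_{M,N}(k)]^t \qquad (3)$$
where $\mathbf{h}_{m,n}(k)$ is the complex coefficient vector of the $(m,n)$-th subchannel at time
index $k$, given by $\mathbf{h}_{m,n}(k) = [h_{m,n}(k,-L_1), ..., h_{m,n}(k,L_2)]$. Based on the software
simulation method in [3], the channel coefficient vector is generated by
$$\mathbf{h}_{vec}(k) = \left(\Psi_{Rx}^{1/2} \otimes \Psi_{Tx}^{1/2} \otimes \mathbf{C}_{ISI}^{1/2}\right) \cdot \Phi(k), \qquad (4)$$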
where ⊗ denotes the Kronecker product and $\mathbf{X}^{1/2}$ is the square root of matrix $\mathbf{X}$ such
that $\mathbf{X} = \mathbf{X}^{1/2} \cdot (\mathbf{X}^{1/2})^h$, with the superscript $(\cdot)^h$ being the Hermitian operator. The
spatial correlation matrices $\Psi_{Rx}$ and $\Psi_{Tx}$ are determined by properties of the receive
and transmit elements, respectively, and are usually known in advance by users. The inter-tap
covariance matrix $\mathbf{C}_{ISI}$ is computed according to the power delay profile using (17)
in [3]. The $(MNL) \times 1$ vector $\Phi(k)$ is defined as $\Phi(k) = [Z_1(k), Z_2(k), ..., Z_{MNL}(k)]^t$.
Each complex coefficient $Z_i(k) = Z_{ci}(k) + jZ_{si}(k)$ ($i = 1, 2, ..., MNL$) represents one
of multiple uncorrelated Rayleigh fading waveforms and can be efficiently simulated
by the sum of sinusoids (SoS) method in [17, 18].
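The Kronecker-product construction referred to as (4) can be sketched in a few lines of Python. This is a direct, memory-hungry software illustration that forms the full $(MNL) \times (MNL)$ matrix, which is exactly what the hardware design later avoids. The helper names `kron`, `matvec`, and `correlate` are hypothetical, and the ordering receive, transmit, inter-tap is assumed from the ordering of the coefficient vector in (3).

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


def matvec(m, v):
    """Matrix-vector product; works for real or complex entries."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def correlate(psi_rx_sqrt, psi_tx_sqrt, c_isi_sqrt, phi):
    """Apply spatial and inter-tap correlation to uncorrelated samples phi."""
    e = kron(kron(psi_rx_sqrt, psi_tx_sqrt), c_isi_sqrt)
    return matvec(e, phi)
```

With identity matrices for all three square roots, `correlate` returns `phi` unchanged, which is a convenient sanity check before substituting measured or modeled correlation matrices.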
3 HARDWARE IMPLEMENTATION METHOD
For the convenience of describing hardware implementations, we define three
















(0). The coefficients of the channel vector $\mathbf{h}_{vec}(k)$ in (3) are rearranged as
$\mathbf{h}_{vec}(k) = [H(1,k), H(2,k), ..., H(MNL,k)]^t$,
and $H(w,k) = H_c(w,k) + jH_s(w,k)$, where $w = 1, 2, ..., (MNL)$.
The proposed MIMO fading channel emulator outputs h푣푒푐(푘) for 푁 -by-푀
subchannels with 퐿 taps per subchannel within a symbol period. Its hardware im-
plementation consists of five modules: a flat Rayleigh fading generator (FRFG), two
Ping-Pong buffers, a correlation multiplier (CM) module, and an interpolation mod-
ule, as shown in Fig. 1. The FRFG module serially generates (푀푁퐿) uncorrelated
flat Rayleigh fading waveforms 푍푖(푅푘) (for 푖 = 1, 2, ..., (푀푁퐿)) with proper symbol
duration 푇푠, maximum Doppler frequency 푓푑, and decimation rate 푅. Its outputs are






















Figure 1. Block diagram of proposed correlated MIMO fading channels emulator.
The Ping-Pong buffers save the serial outputs of the FRFG and convert them
into parallel outputs that are required by the following CM module. Utilizing the
Ping-Pong buffer ensures that only a single FRFG module is needed to provide all
(푀푁퐿) uncorrelated Rayleigh fading channel waveforms. The Ping buffer and Pong
buffer work alternately to temporarily store (푀푁퐿) uncorrelated fading channel
responses. Two sets of Ping-Pong buffers are employed to buffer the real and imaginary
parts of the uncorrelated complex fading channel responses separately. The design
parameter 푅 is carefully chosen to meet the real-time requirements.
The CM module incorporates three types of correlation functions into the
uncorrelated fading channel responses via the Kronecker product and vector multiplication
in (4). An all-parallel structure is memory demanding, whereas an all-serial
structure is time consuming, especially when the variables 푁, 푀, and 퐿 are
large. The proposed CM module employs a mixed P-S method to implement matrices
D and E, thus drastically reducing the memory requirement and processing time. We also




and employ a symmetric storage submodule
to save approximately half of the memory space.
Then the interpolator module linearly interpolates the samples with an interpolation
rate 푅 (the same as the decimation rate) to output symbol-rate fading waveforms.
The hardware implementations of the FRFG and interpolation modules are similar
to those in [19], whereas the Ping-Pong buffers and CM module are new structures
developed in this work. A brief review of the FRFG and interpolation modules and
detailed structures of the Ping-Pong buffers and CM module are given in the next
few subsections.
3.1 The FRFG
One FRFG module is utilized to generate (MNL) independent flat Rayleigh
fading coefficients 푍푖(푅푘) in series with a downsampling factor 푅. The SoS method
[18] is employed to implement the flat Rayleigh fading waveforms via random number
generator, LUT for sine and cosine functions, and multipliers and adders, as shown
in Fig. 2. The SoS method generates the real and imaginary parts of the coefficients
by sum of 푃 sinusoids














$$\cos\left(2\pi(f_d k T_s \sin\alpha_{p,i} + \varphi_{p,i})\right), \qquad \alpha_{p,i} = \frac{\pi(p - 0.5 + \theta_i)}{2P}, \quad p = 1, 2, \cdots, P,$$
where $f_d$ is the maximum Doppler frequency, $P$ is the total number of sinusoids, and
$j = \sqrt{-1}$. The angle of arrival $\alpha_{p,i}$ is randomized by $\theta_i$. The random variables
$\phi_{p,i}$ and $\varphi_{p,i}$ are the random phases of the in-phase and quadrature components,
respectively. The random variables $\phi_{p,i}$, $\varphi_{p,i}$, and $\theta_i$ are statistically independent and
























































































Figure 2. Implementation blocks of the FRFG module.
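A floating-point sketch of one SoS waveform may help to fix ideas. Only the quadrature term and the angle formula survive in the text above, so several details here are assumptions: the in-phase term is taken to use $\cos\alpha_{p,i}$ in place of $\sin\alpha_{p,i}$, the scale factor is chosen as $\sqrt{1/P}$ to give unit average power, the phases are drawn uniformly from $[0, 1)$, and $\theta_i$ is drawn uniformly from $[-0.5, 0.5)$. The function name `sos_waveform` and the default $P = 8$ are also illustrative. The hardware uses LFSR random numbers and LUTs instead of `random` and `math.cos`.

```python
import math
import random


def sos_waveform(num_samples, fd_ts, num_sinusoids=8, seed=0):
    """Generate one complex flat-fading waveform Z(k) = Zc(k) + j*Zs(k).

    fd_ts is the normalized Doppler frequency f_d * T_s.
    """
    rng = random.Random(seed)
    p_total = num_sinusoids
    theta = rng.uniform(-0.5, 0.5)                   # assumed distribution
    phi_c = [rng.random() for _ in range(p_total)]   # in-phase phases
    phi_s = [rng.random() for _ in range(p_total)]   # quadrature phases
    alpha = [math.pi * (p - 0.5 + theta) / (2 * p_total)
             for p in range(1, p_total + 1)]
    scale = math.sqrt(1.0 / p_total)                 # assumed normalization
    out = []
    for k in range(num_samples):
        zc = scale * sum(
            math.cos(2 * math.pi * (fd_ts * k * math.cos(a) + ph))
            for a, ph in zip(alpha, phi_c))
        zs = scale * sum(
            math.cos(2 * math.pi * (fd_ts * k * math.sin(a) + ph))
            for a, ph in zip(alpha, phi_s))
        out.append(complex(zc, zs))
    return out
```

Calling the function with different seeds gives different, independently parameterized waveforms, which mirrors how a single generator can serve many subchannels by changing only its random parameters.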
3.2 Ping-Pong Buffers
The Ping-Pong buffers synchronize the FRFG module with the CM module
and make it possible for the single FRFG to continuously provide multiple
uncorrelated Rayleigh fading channel responses for the CM module. They perform a
serial-to-parallel data conversion by properly buffering and outputting data. Two identical
sets of Ping-Pong buffers are needed to buffer the real part 푍푐푖(푅푘) and the imaginary
part 푍푠푖(푅푘) separately. Each Ping-Pong buffer contains two banks of RAMs named
푃푖푛푔푅퐴푀푠 and 푃표푛푔푅퐴푀푠. The block diagram of the Ping-Pong buffer storing
the real part of the coefficients is shown in Fig. 3.
Each bank of the Ping-Pong buffer contains (푀푁) RAM units, and each RAM holds
퐿 words. The inputs 푍푐푖(푅푘) (where 푖 = 1, 2, ..., (푀푁퐿)) are fed to the Ping-Pong
buffer in the following format. In a macro period of (푀푁퐿퐿) clock cycles, the serial
sequence 푍푐1(푅푘), ..., 푍푐푀푁퐿(푅푘) is input sequentially in the first (푀푁퐿) clock
cycles, and then all zeros are input in the remaining (푀푁퐿(퐿 − 1)) clock cycles. In
the next macro period, the variable 푘 is increased by one and an updated
sequence is input in the same format. The demultiplexer DEMUX and the up-counter
Counter Sel 1 work together to distribute the coefficients 푍푐푖(푅푘) into different RAM
units. Counter Sel 1 increases by one every 퐿 clock cycles to select
one of the (푀푁) output ports of the DEMUX. Another up-counter, Counter Addr 1,
generates write/read addresses for all RAMs. A pulse with a length of 퐿 clock cycles,
generated by a periodic pulse generator, is used as the control signal “wren”
that enables the write/read operations of the RAMs. The pulse is delayed by (푖 − 1)퐿
clock cycles for the 푖-th Ping RAM unit and by (푀푁퐿퐿+(푖−1)퐿) clock
cycles for the 푖-th Pong RAM unit. In Fig. 3, some connecting lines between delay
blocks and their corresponding “wren” ports are omitted to keep the figure
readable.
Totally, (푀푁) multiplexers named MUX are used to select Ping RAMs or
Pong RAMs to be connected to the (푀푁) parallel output ports named 푍푐 표푢푡 1 ∼
푍푐 표푢푡 푀푁 . These multiplexers are controlled by the selection signal generated by
the up-counter, Counter Sel 2. Each output port sequentially outputs real parts of




























































Figure 3. Hardware implementation of the Ping-Pong buffer module. This diagram
shows the data buffer for the real part 푍푐푖(푅푘). The imaginary part uses a similar
buffer structure.
clock cycles, the output port 푍푐 표푢푡 푖 serially outputs the sequence $Z_{c,L(i-1)+1}(Rk)$,
$Z_{c,L(i-1)+2}(Rk)$, ..., $Z_{c,Li}(Rk)$ a total of (푀푁퐿) times. In the next period, the variable 푘
is increased by one and an updated sequence is output in the same format.
These outputs are fed to the CM module to be multiplied with the coefficients
of matrix E.
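The serial-to-parallel behavior described in this subsection can be summarized with a small behavioral model. It is a sketch at the data level only: counters, the “wren” pulses, and zero padding of the input stream are not modeled, and the class name `PingPongBuffer` and its methods are hypothetical. One bank is written while the most recently filled bank is replayed on (푀푁) parallel ports.

```python
class PingPongBuffer:
    """Behavioral model: two banks, each with MN RAMs of L words."""

    def __init__(self, mn, taps):
        self.mn = mn
        self.taps = taps
        self.banks = [None, None]  # index 0 = Ping, index 1 = Pong
        self.write_bank = 0

    def write_block(self, serial_block):
        """Store MN*L serial samples; samples L*(i-1)+1 .. L*i go to RAM i."""
        if len(serial_block) != self.mn * self.taps:
            raise ValueError("expected MN*L samples")
        self.banks[self.write_bank] = [
            list(serial_block[i * self.taps:(i + 1) * self.taps])
            for i in range(self.mn)
        ]
        self.write_bank ^= 1  # the next block goes to the other bank

    def read_cycles(self):
        """Yield one tuple of MN parallel port values per clock cycle."""
        bank = self.banks[self.write_bank ^ 1]  # most recently filled bank
        for _ in range(self.mn * self.taps):    # each RAM is replayed MNL times
            for addr in range(self.taps):
                yield tuple(ram[addr] for ram in bank)
```

For 푀푁 = 2 and 퐿 = 3, writing the block `[1, 2, 3, 4, 5, 6]` makes port 1 replay `1, 2, 3` and port 2 replay `4, 5, 6`, for a total of 푀푁퐿퐿 = 18 read cycles per macro period.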
3.3 Correlation Multiplier Module
The proposed CM module is implemented by the mixed P-S method, as shown
in Fig. 4. It employs 3(푀푁) multipliers, five adders, and two accumulators, all capable
of outputting results within one clock cycle. Two memory banks, 푅퐴푀퐶 and 푅퐴푀퐷푖,
store the pre-computed coefficients of matrices C and D, respectively. If the size
of matrix C is large, which is often the case in wideband systems, then only its
diagonal and upper-triangular elements are stored to save memory space, thanks to
its symmetric property [3]. The 푗-th row of matrix C is stored in 푅퐴푀퐶 with
(퐿 − 푗 + 1) coefficients. The addresses of 푅퐴푀퐶 are sequentially allocated, ranging
from 1 to $\frac{(L+1)L}{2}$. Two up-counters, Counter 2 and Counter 3, are used to generate
the proper row and column indices of matrix C, and an address convertor converts
these indices into the corresponding read addresses of 푅퐴푀퐶. The address
convertor computes the read addresses by:
$$\text{Read Address} = (\min\{l_r, l_c\} - 1)\left(L - \frac{\min\{l_r, l_c\}}{2}\right) + \max\{l_r, l_c\},$$
where 푙푟 is the row index; 푙푐 is the column index; min{} and max{} find the minimum
and maximum values of their arguments, respectively. The address convertor and
푅퐴푀퐶 build up a storage submodule that implements a symmetric storage method.
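A minimal sketch of the symmetric storage scheme follows, assuming row-major packing of the upper triangle as described above (row 푗 holds 퐿 − 푗 + 1 coefficients and addresses start at 1). The closed-form row offset in `read_address` is derived from that layout and may differ in form from the expression used by the hardware address convertor. The names `read_address` and `pack_upper` are hypothetical.

```python
def read_address(l_r, l_c, taps):
    """1-based RAM address of element (l_r, l_c) of a symmetric L x L matrix."""
    lo, hi = min(l_r, l_c), max(l_r, l_c)
    # Words used by rows 1 .. lo-1: sum of (L - j + 1) for j = 1 .. lo-1.
    row_offset = (lo - 1) * (2 * taps - lo + 2) // 2
    # Position inside row lo, which starts at its diagonal element.
    return row_offset + (hi - lo + 1)


def pack_upper(c):
    """Pack the diagonal and upper triangle of c row by row (RAM contents)."""
    taps = len(c)
    return [c[r][col] for r in range(taps) for col in range(r, taps)]
```

Reading element (푙푟, 푙푐) and element (푙푐, 푙푟) returns the same word, so only 퐿(퐿+1)/2 words are needed instead of 퐿².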
The size of matrix D is often small, and the coefficients of each column are stored
in 푅퐴푀퐷1 through 푅퐴푀퐷푀푁 separately. The up-counter 퐶표푢푛푡푒푟1 and an adder
generate the read address for 푅퐴푀퐷푖 to output (푀푁) coefficients simultaneously. If
the size (푀푁) is large, then a memory scheme similar to that of 푅퐴푀퐶 may be adopted for
푅퐴푀퐷푖. In every clock cycle, the output of 푅퐴푀퐶 is multiplied with the outputs
of 푅퐴푀퐷1 ∼ 푅퐴푀퐷푀푁 to obtain (푀푁) coefficients of matrix E in parallel.
The vector multiplication in (4) is implemented by multiplying the (푀푁)
coefficients of matrix E with the real and imaginary parts of the (푀푁) uncorrelated
Rayleigh channel responses stored in the Ping-Pong buffers. The results are added
together for the real and imaginary parts separately, and the two sums are sent to
the two accumulators. In a period of 퐿 clock cycles, each accumulator sums its inputs
from the previous 퐿 clock cycles to obtain a single output 퐻푐(푤,푅푘) or 퐻푠(푤,푅푘).
The outputs of the accumulators are down-sampled by a factor of 퐿
before being passed to the interpolation module. Finally, the interpolation module
takes 퐻푐/푠(푤,푅(푘 − 1)) and 퐻푐/푠(푤,푅푘) to produce all coefficients of h푣푒푐(푘) in real
time. It is worth noting that the Kronecker product can be computed alternatively












⊗D. The proposed mixed P-S method
can implement this case by simply switching the contents of 푅퐴푀퐷 and 푅퐴푀퐶.













, and use 푅퐴푀퐷 for the Kronecker product of the
other two matrices.
In contrast to the mixed P-S method, the emulator in [16] employed a serial













. The emulator can meet the real-time requirement only for a small
value of (푀푁퐿). The serial method cannot compute fast enough to meet the real-time
requirement when the channel has long CIRs and/or the symbol duration is short.
The mixed P-S method solves this problem. It employs (푀푁) parallel computational
paths and can compute the Kronecker product (푀푁) times faster than the
serial method does. It also requires significantly less memory space and fewer
multipliers than a pure parallel method, which would be faster still. Therefore, the mixed
P-S method achieves the best tradeoff between computational speed and hardware
resource utilization.
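The data flow of the mixed P-S computation can be sketched in software as follows. The sketch assumes that E is the Kronecker product of D (size 푀푁 × 푀푁) with C (size 퐿 × 퐿), that each inner-loop iteration corresponds to one clock cycle in which (푀푁) products are formed in parallel, and that the accumulator is read out every 퐿 cycles. The function names are hypothetical, and complex samples are handled directly rather than as separate real and imaginary paths.

```python
def mixed_ps_multiply(d, c, z):
    """Compute (D kron C) * z without ever storing the full matrix E.

    Returns (outputs, cycles), where cycles counts emulated clock cycles.
    """
    mn, taps = len(d), len(c)
    outputs, cycles = [], 0
    for a in range(mn):              # row index read from the RAM_D units
        for r in range(taps):        # row of C
            acc = 0
            for col in range(taps):  # one emulated clock cycle per column of C
                c_val = c[r][col]    # single read from RAM_C
                # MN coefficients of E formed in parallel from MN RAM_D outputs
                e_coeffs = [d[a][b] * c_val for b in range(mn)]
                # port b of the Ping-Pong buffer supplies z[b*L + col]
                acc += sum(e_coeffs[b] * z[b * taps + col] for b in range(mn))
                cycles += 1
            outputs.append(acc)      # accumulator is read out every L cycles
    return outputs, cycles


def reference_multiply(d, c, z):
    """Fully serial reference: one multiply-accumulate per element of E."""
    mn, taps = len(d), len(c)
    size = mn * taps
    out = []
    for w in range(size):
        a, r = divmod(w, taps)
        out.append(sum(d[a][v // taps] * c[r][v % taps] * z[v]
                       for v in range(size)))
    return out
```

The mixed P-S loop needs 푀푁 · 퐿 · 퐿 cycles for all (푀푁퐿) outputs, whereas the serial reference performs (푀푁퐿)² multiply-accumulate steps, which reflects the factor of (푀푁) stated above.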
3.4 Interpolator Module
The interpolator module performs a linear interpolation with a rate 푅 to
generate fading coefficients at the symbol rate. The structure of the interpolator module
is shown in Fig. 5, where the inputs of the real and imaginary parts from the
correlation module, 퐻푐(푤,푅푘) and 퐻푠(푤,푅푘), are processed separately in parallel through a
common control logic. In every (푀푁퐿) BCPs (basic clock periods), the enable control
block controls the counter to increase from 0 to (푅 − 1) in the first 푅 BCPs and to
hold at (푅 − 1) in the remaining (푀푁퐿 − 푅) BCPs. The counter output is
normalized by 1/푅. The real part input, 퐻푐(푤,푅푘), is delayed by $(MNL)^2$ BCPs and
then subtracted from the original input. The result is multiplied with the normalized
counter output and then added to the delayed input 퐻푐(푤,푅(푘 − 1)) to obtain the


















































































Figure 4. Hardware implementation of CM module using the mixed P-S method. In
this design, (푀푁) coefficients of matrix E are output in parallel per clock cycle, and































Figure 5. Implementation of the interpolator module.
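The arithmetic of the interpolator described in Section 3.4 can be illustrated with a short sketch: the difference between the current and the delayed input is scaled by the normalized counter value 푗/푅 and added to the delayed input. The function names are hypothetical, and the timing (counter hold and delay line) is not modeled.

```python
def linear_interpolate(prev, curr, rate):
    """Return R samples going from prev toward curr (counter = 0 .. R-1)."""
    return [prev + (j / rate) * (curr - prev) for j in range(rate)]


def interpolate_stream(samples, rate):
    """Interpolate a decimated sequence of one coefficient up to symbol rate."""
    out = []
    for prev, curr in zip(samples, samples[1:]):
        out.extend(linear_interpolate(prev, curr, rate))
    return out
```

For example, `linear_interpolate(0.0, 1.0, 4)` gives `[0.0, 0.25, 0.5, 0.75]`; the value `1.0` itself appears as the first sample of the next interpolation interval.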
4 IMPLEMENTATION EXAMPLES
The proposed MIMO fading channel emulator was implemented on an Altera
Stratix III EP3SL150F1152C2N FPGA/DSP development kit. The clock frequency
in this implementation was 퐹푐푙푘=50 MHz, which gives a clock period of 20 ns. We
used Quartus II version 9.0, DSP Builder version 9.0, and Matlab Simulink for this
development, and the hardware-in-the-loop (HIL) method for testing. The emulator
examples can be found on the author’s website at http://web.mst.edu/~zhengyr/ as a free
download.
Two MIMO fading channel examples were implemented on the emulator to
evaluate accuracy and capability of this emulator. The first example demonstrated
feasibility of the emulator in underwater communications by emulating a 2-by-6 un-
derwater acoustic channel with long CIRs 퐿=100 and a long symbol duration 푇푠=250
휇s. The second example emulated a WiMAX 2-by-2 fading channel with a short sym-
bol duration 푇푠=0.8 휇s and short CIRs 퐿=5, and proved the emulator to be suitable
for high data rate communication channels. In order to evaluate accuracy of this em-
ulator, the auto/cross-correlation functions of its output waveforms were computed
and compared to theoretical ones.
4.1 Implementation Example I - Underwater Acoustic Channel
The 2-by-6 underwater acoustic channel was implemented using the following
configuration. This underwater communication system consisted of two transmit
elements and six hydrophones placed as shown in Fig. 6. The angle of arrival and
the angular spread were 90∘ and 10∘ respectively. The 100-tap power delay profile
linearly ramped up from 0.2 to 1.8 in the first 40 taps, and then fell down from 1.8 to
0.27 in the 40-100 taps. Its total power was normalized to one. The Tx and Rx filters













Figure 6. The placement of transmit elements and hydrophones of the underwater
communication system. This is a 2-by-6 MIMO underwater acoustic communication
system where the speed of the acoustic carrier is 1500 m/s and the frequency of the
carrier is 15 kHz. The carrier wavelength is 휆=10 cm.
Other implementation parameters were selected as 푀=6, 푁=2, 퐿=100, 푇푠=250
휇s, 푓푑=40 Hz, and 푅=10. The square roots of the correlation coefficient matrices Ψ푇푥






0.9793 −0.1418 0.0926 −0.0725 0.0616 −0.0563
−0.1418 0.9716 −0.1378 0.0901 −0.0711 0.0616
0.0926 −0.1378 0.9697 −0.1369 0.0901 −0.0725
−0.0725 0.0901 −0.1369 0.9697 −0.1378 0.0926
0.0616 −0.0711 0.0901 −0.1378 0.9716 −0.1418













Based on the outputs of the emulator, auto/cross-correlation functions of sev-
eral subchannels, including the auto-correlation of ℎ1,1(75, 푘), the cross-correlation
between ℎ1,1(75, 푘) and ℎ1,1(76, 푘), and the cross-correlation between ℎ1,1(75, 푘) and
ℎ2,1(75, 푘), were computed offline and plotted in Fig. 7. According to (19) in [3],
their theoretical correlation functions were 0.7155, 0.1177, and −0.1774 multiplied
by 퐽0[2휋푓푑(푘1 − 푘2)푇푠], respectively. As can be seen, the hardware outputs
closely matched the theoretical results.
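The theoretical curves used for this comparison are scaled copies of the zeroth-order Bessel function of the first kind. A stdlib-only sketch is given here; `bessel_j0` uses the standard power series, which is adequate for the moderate arguments covered by such plots but is not a general-purpose implementation. The function names and the number of series terms are illustrative choices, while the scale factors and parameters in the usage note are those quoted above.

```python
import math


def bessel_j0(x, terms=40):
    """J0(x) from its power series: sum of (-1)^m (x/2)^(2m) / (m!)^2."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((m + 1) ** 2)
    return total


def theoretical_correlation(scale, fd, ts, lag):
    """scale * J0(2*pi*fd*lag*Ts) for a time lag of `lag` symbols."""
    return scale * bessel_j0(2.0 * math.pi * fd * lag * ts)
```

For the underwater example, `theoretical_correlation(0.7155, 40.0, 250e-6, lag)` gives the reference auto-correlation of ℎ1,1(75, 푘) at a lag of `lag` symbols, starting from 0.7155 at zero lag.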








[Plot, x-axis: Normalized Time Lag 푘푓푑푇푠. Curves: theoretical and hardware output results for the auto-correlation of ℎ1,1(75, 푘), the cross-correlation of ℎ1,1(75, 푘) and ℎ1,1(76, 푘), and the cross-correlation of ℎ1,1(75, 푘) and ℎ2,1(75, 푘).]
Figure 7. Performance of the underwater acoustic fading channel emulator. Auto-
correlation of ℎ1,1(75, 푘), cross-correlation between ℎ1,1(75, 푘) and ℎ1,1(76, 푘), and
cross-correlation between ℎ1,1(75, 푘) and ℎ2,1(75, 푘). The channel index is according
to (3). The results are based on hardware outputs of 200 trials with $2 \times 10^3$ samples
per subchannel per trial.
4.2 Implementation Example II - WiMAX Channel
The proposed emulator also implemented the WiMAX 2-by-2 fading chan-
nel example. The implementation parameters were selected as 푀=푁=2, 푇푠=0.8휇푠,
푓푑푇푠=0.001 and 퐿=5. The angle of arrival, the angular spread, and the Tx and Rx
filters were the same as those used in the underwater example. The distances between
two transmit elements and two receive elements were 12휆 and 0.5휆, respectively. The
power delay profile contained three taps and was given by the SUI-3 model [20], which































$$\begin{pmatrix}
0.0005 & 0.0015 & -0.0013 & -0.0013 & 0.0036 \\
0.0015 & 0.0044 & -0.0043 & -0.0123 & 0.0099 \\
-0.0013 & -0.0043 & 0.8776 & 0.0512 & -0.0056 \\
-0.0013 & -0.0123 & 0.0512 & 0.3601 & 0.0048 \\
0.0036 & 0.0099 & -0.0056 & 0.0048 & 0.0252
\end{pmatrix}$$
The short symbol duration imposed a stricter real-time requirement:
$2.5 \times 10^7$ complex responses had to be generated per second. The short CIRs
reduced the computational time of the Kronecker product and thus relaxed the real-time
requirement to some extent. Taking the short symbol duration and CIRs into
consideration, we set 푅=3 to meet the real-time requirement.
Auto/cross-correlation functions of several subchannels, including the auto-
correlation of ℎ1,1(0, 푘), the cross-correlation between ℎ1,1(0, 푘) and ℎ1,1(1, 푘), and the
cross-correlation between ℎ1,1(0, 푘) and ℎ2,1(1, 푘), were computed offline and plotted
in Fig. 8. Their theoretical correlation functions were 0.7728, 0.0634, and −0.0193
multiplied by 퐽0[2휋푓푑(푘1 − 푘2)푇푠], respectively. As can be seen, the auto/cross-correlation
functions of the hardware outputs matched the theoretical ones very well.








[Plot, x-axis: Normalized Time Lag 푘푓푑푇푠. Curves: theoretical and hardware output results for the auto-correlation of ℎ1,1(0, 푘), the cross-correlation of ℎ1,1(0, 푘) and ℎ1,1(1, 푘), and the cross-correlation of ℎ1,1(0, 푘) and ℎ2,1(1, 푘).]
Figure 8. Performance of the WiMAX fading channel emulator. Auto-correlation
of ℎ1,1(0, 푘), cross-correlation between ℎ1,1(0, 푘) and ℎ1,1(1, 푘), and cross-correlation
between ℎ1,1(0, 푘) and ℎ2,1(1, 푘). The channel index is according to (3). The results
are based on hardware outputs of 50 trials with $2.8 \times 10^4$ samples per subchannel per
trial.
5 PERFORMANCE EVALUATION
In addition to accuracy, we evaluated other performance aspects of the proposed
emulator, including speed and hardware usage. The speed of this emulator was compared
to that of the emulator in [16], which employed a serial method. Moreover, the multiplier and
memory utilization of the mixed P-S and serial methods were analyzed and
compared. Finally, parameter specifications and detailed hardware usage of the proposed
emulator are presented.
5.1 Performance Comparison of Serial and Mixed P-S Methods





generate correlated fading complex responses much faster than its counterpart in
[16] with the serial method. The cost was higher hardware utilization, especially of
multipliers, which were used to construct multiple computational paths. The speed
comparison for typical values of 푀, 푁, and 퐿 is shown in Fig. 9(a), which clearly
demonstrates that the mixed P-S method saves a large amount of time. The y-axis
indicates the number of clock cycles required to generate one correlated
fading complex response. As can be seen, when the two methods were set to the same
values of 푀, 푁, and 퐿, the mixed P-S method was (푀푁) times faster
than the serial method. Note that the serial method required more clock cycles when
either (푀 × 푁) or 퐿 increased, whereas the mixed P-S method demanded more clock
cycles only when 퐿 increased.
The mixed P-S method used more multipliers to construct parallel computational
paths in the CM module, while the serial method used a small constant number
of multipliers. The multiplier utilization of the two methods in the CM module is
shown in Fig. 9(b). The serial method employed seven multipliers to implement one
serial computational path irrespective of the values of 푀, 푁, and 퐿. The mixed P-S
method employed a variable number of multipliers, equal to (3푀푁).










. The full storage method needed $L^2$ words, whereas the symmetric storage
method needed only $\frac{L(L+1)}{2}$ words, saving approximately half of the words.
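The scaling relations discussed in this subsection can be collected in a small cost model. The multiplier counts (seven for the serial method, 3푀푁 for the mixed P-S method) and the storage sizes are stated in the text. The absolute cycle counts per response are an inference from the description (one multiply-accumulate per cycle for the serial method, (푀푁) per cycle for the mixed P-S method) and should be read as an assumption, as should the function names.

```python
def cycles_per_response(mn, taps, method):
    """Clock cycles to produce one correlated complex response (inferred)."""
    if method == "serial":
        return mn * taps  # one product of the MNL-long inner product per cycle
    if method == "mixed":
        return taps       # MN products per cycle
    raise ValueError("method must be 'serial' or 'mixed'")


def cm_multipliers(mn, method):
    """Multipliers used in the CM module."""
    if method == "serial":
        return 7
    if method == "mixed":
        return 3 * mn
    raise ValueError("method must be 'serial' or 'mixed'")


def ram_c_words(taps, symmetric):
    """Words needed to store the L x L matrix C."""
    return taps * (taps + 1) // 2 if symmetric else taps * taps
```

For the underwater example (푀푁 = 12, 퐿 = 100) this model gives 1200 versus 100 cycles per response, 7 versus 36 multipliers, and 10000 versus 5050 words for the storage of C.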
5.2 Parameter Specifications and Hardware Usage
The proposed MIMO fading channel emulator is flexible in parameter selec-
tion and can be customized to simulate channel scenarios other than the examples
presented here. Table 1 shows the parameter ranges of the emulator with the FPGA
chip clock 퐹푐푙푘=50 MHz.
According to Table 1, the proposed emulator can emulate any MIMO antenna
array combination of Rx and Tx up to (푀푁)=16, including 2×2, 2×8, 3×3, 4×4 and





















(a) Numbers of clock cycles required.

































(b) Numbers of multipliers needed.
Figure 9. Performance comparison for generating one correlated fading complex re-
sponse using the serial method and the proposed mixed P-S method.
Table 1. Parameter ranges of the proposed emulator with 퐹푐푙푘=50 MHz.
Number of Rx and Tx    Number of taps    Normalized Doppler 푇푠푓푑         Output speed (samples/sec)
(푀푁) ≤ 16              퐿 ≤ 100           $1.9 \times 10^{-6} \sim 1$     $50 \times 10^6 \times R / L$
so on. The maximum number of channel taps, $L=100$, covers most practical fading
channels with long CIRs, including underwater acoustic channels. The proposed emulator
stores the normalized Doppler frequency $T_sf_d$ in the Q1.19 format, which gives a
resolution of $1/2^{19} \approx 1.9 \times 10^{-6}$. The emulator can generate $F_{clk}R/L$
complex samples per second. Each complex sample consists of real and imaginary parts
represented in the Q4.14 format. For the underwater acoustic channel with $T_s=250\ \mu$s, the
real-time requirement can be met by setting $R=10$. For the WiMAX channel with
$T_s=0.8\ \mu$s, the real-time requirement can be met by setting $R=3$. For channels with
smaller symbol durations, the real-time requirement can be met by increasing the
clock frequency and $R$.
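The two figures quoted above follow from simple arithmetic, sketched below. The function names are hypothetical, and the tap counts in the loop are illustrative values only, not the settings of the two examples.

```python
F_CLK_HZ = 50e6  # FPGA clock frequency quoted in Table 1


def q_format_resolution(fractional_bits: int) -> float:
    """Smallest step representable with the given number of fractional bits."""
    return 2.0 ** -fractional_bits


def output_rate(f_clk_hz: float, interp_rate: int, num_taps: int) -> float:
    """Complex samples per second, F_clk * R / L, as stated in the text."""
    return f_clk_hz * interp_rate / num_taps


print("Q1.19 resolution:", q_format_resolution(19))
for taps in (10, 100):  # illustrative tap counts (assumption)
    print(taps, "taps, R = 3:", output_rate(F_CLK_HZ, 3, taps), "samples/sec")
```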
The hardware usage of the two preceding implementation examples is summarized
in Table 2, where ALUT, DLR, BM, DSP, and LU denote adaptive look-up table,
dedicated logic register, block memory, DSP block (high-speed 18-bit multiplier), and
overall logic utilization, respectively. Compared to the WiMAX example, the underwater
example employs more hardware resources. In particular, it uses approximately
twice as many BMs and DSPs because of the Ping-Pong buffers, the large
RAM C, and the parallel computational paths. Note that the total logic utilizations of
the WiMAX and underwater examples are only 12% and 14% of the whole FPGA chip,
respectively. The low hardware utilization makes it possible to implement other
functional modules on the same FPGA chip.
Table 2. Hardware usage of the two implementation examples.
Example      ALUT          DLR         BM              DSP        LU
Underwater   14845 (13%)   4042 (4%)   1241613 (22%)   78 (20%)   14%
WiMAX        11926 (10%)   3301 (3%)   659407 (12%)    35 (9%)    12%
The capability and hardware usage of the proposed emulator are compared
with those of the existing emulators in Table 3. The numbers of LE, memory block,
and DSP elements are based on the WiMAX channel emulator with 푀푁퐿 = 160 for
the proposed emulator and the one in [19]. The (푀푁퐿) for other emulators are listed
in the table. The proposed emulator offers higher capability than the existing
ones, whereas its hardware usage remains low.
5.3 Interfacing with Digital Up-Convertor and Down-Convertor
Although the proposed MIMO fading channel emulator was tested with the HIL
modules via Simulink, it can be easily integrated with digital up-convertors and
down-convertors to generate intermediate frequency (IF) channel waveforms. The
IF channel waveforms can be further converted via analog mixers to generate RF
channel waveforms. Altera provides several ready-made digital IF convertors for the
Stratix III DSP development kit as DSP Builder Simulink models [21]. The Stratix III
DSP development kit has two HSMC interfaces that can connect to two daughter
boards, each having two ADCs and two DACs; thus, a 4-by-4 MIMO channel with IF
waveforms can be easily integrated.
Table 3. Resource usage comparisons of related fading channel emulators.
                        Proposed   Emulator   Emulator   Emulator   Emulator
Block Memory            659407     Unknown    1920089    Unknown    440960
DSP Element             35         Unknown    194        Unknown    136
Rx×Tx                   M×N ¹      1×1        M×N ²      4×4        4×4
퐼푆퐼 Calculator          No         No         Yes        No         No
Inter-tap Correlation   Yes        Yes ³      Yes ⁴      No         Unclear
Spatial Correlation     Yes        No         Yes        No         Unclear
Note: 1. The numbers of Rx, Tx, and taps satisfy the relationship $(MNL) \le 1600$.
2. The numbers of Rx, Tx, and taps satisfy the relationship $(MNL) \le 160$.
3. The inter-tap correlation is implemented by upsampling to pass band.




is calculated on chip.
6 CONCLUSIONS
A wideband MIMO fading channel emulator with accurate correlation properties
has been proposed. The emulator employs a novel mixed P-S method to increase
the speed of incorporating correlation functions. This improvement makes the
emulator capable of emulating MIMO fading channels with a high data rate, large MIMO
size, and long CIRs in real time. Two MIMO fading channel examples, underwater
acoustic and WiMAX, have been implemented on one Altera Stratix III FPGA/DSP
development kit and evaluated in terms of accuracy, speed, and hardware usage.
Results show that the proposed emulator employs few hardware resources and can
generate accurate MIMO fading channel responses in real time.
7 REFERENCES
[1] P. Hoeher, “A statistical discrete-time model for the WSSUS multipath channel,”
IEEE Trans. Veh. Technol., vol. 41, no. 4, pp. 461-468, Nov. 1992.
[2] K.-W. Yip and T.-S. Ng, “Efficient simulation of digital transmission over WS-
SUS channels,” IEEE Trans. Commun., vol. 43, no. 12, pp. 2907-2913, Dec. 1995.
[3] C. Xiao, J.X. Wu, S-Y. Leong, Y.R. Zheng, and K. B. Letaief, “A discrete-
time model for triply selective MIMO Rayleigh fading channels,” IEEE Trans.
Wireless Commun., vol. 3, no. 5, pp. 1678-1688, Sep. 2004.
[4] Rohde&Schwarz, “R&S ABFS baseband fading simulator,” Online:
http://www2.rohde-schwarz.com/product/abfs.html, May 2009.
[5] Azimuth Systems, “ACE 400WB MIMO channel emulator,” Online:
http://www.azimuthsystems.com, May 2009.
[6] Elektrobit, “EB Propsim F8-advanced RF channel emulator,” Online:
http://www.elektrobit.com, Jun. 2009.
[7] A. Anastasopoulos and K.M. Chugg, “An efficient method for simulation of fre-
quency selective isotropic Rayleigh fading,” in Proc. IEEE Veh. Technol. Conf.,
Phoenix, AZ, May. 1997, pp. 2084-2088.
[8] E. Chiavaccini and G.M. Vitetta, “GQR models for multipath Rayleigh fading
channels,” IEEE Journal Selected Areas in Communications, vol. 19, no. 6, pp.
1009-1018, Jun. 2001.
[9] K. E. Baddour and N. C. Beaulieu, “Accurate simulation of multiple cross-
correlated Rician fading channels,” IEEE Trans. Commun., vol. 52, no. 11, pp.
1980-1987, Nov. 2004.
[10] K.E. Baddour and N.C. Beaulieu, “Autoregressive models for fading channel
simulation,” IEEE Trans. Wireless Commun., vol. 4, no. 4, pp. 1650-1662, July.
2005.
[11] M.A. Wickert and J. Papenfuss, “Implementation of a real-time frequency-
selective RF channel simulator using a hybrid DSP-FPGA architecture,” IEEE
Trans. Microw. Theory Tech., vol. 49, no. 8, pp. 1390-1397, Aug. 2001.
[12] M. Kahrs and C. Zimmer, “Digital signal processing in a real-time propagation
simulator,” IEEE Trans. Instrum. Meas., vol. 55, no. 1, pp. 197-205, Feb. 2006.
[13] Fei Ren and Y.R. Zheng, “A low-complexity hardware implementation of
discrete-time frequency-selective Rayleigh fading channels,” in Proc. IEEE IS-
CAS, Taipei, Taiwan, May 2009, pp. 1759-1762.
[14] M. Cui, H. Murata, and K. Araki, “FPGA implementation of 4×4 MIMO test-
bed for spatial multiplexing systems,” in Proc. IEEE International Symposium
on Personal, Indoor and Mobile Radio Communications, Barcelona, Spain, Sep.
2004, pp. 3045-3048.
[15] A. Alimohammad, S.F. Fard, B.F. Cockburn, and C. Schlegel, “A novel technique
for efficient hardware simulation of spatiotemporally correlated MIMO fading
channels,” in Proc. IEEE Int. Conf. Commun., Beijing, China, May 2008, pp.
718-724.
[16] F. Ren and Y.R. Zheng, “Hardware implementation of triply selective Rayleigh
fading channel simulators,” in Proc. IEEE International Conference on Acous-
tics, Speech and Signal Processing, Dallas, Texas, Mar. 2010, pp. 1498-1501.
[17] Y. R. Zheng and C. Xiao, “Improved models for the generation of multiple un-
correlated Rayleigh fading waveforms,” IEEE Trans. Commun., vol. 6, no. 6, pp.
256-258, Nov. 2002.
[18] Y. R. Zheng and C. Xiao, “Simulation models with correct statistical properties
for Rayleigh fading channels,” IEEE Trans. Commun., vol. 51, no. 6, pp. 920-928,
Jun. 2003.
[19] F. Ren and Y.R. Zheng, “A novel emulator for discrete-time MIMO triply-
selective fading channels,” IEEE Trans. Circuit Systems, Part-I, vol. 57, no.
9, pp. 2542-2551, Nov. 2010.
[20] J. Milanovic, S. Rimac-Drlje, and K. Bejuk, “Comparison of propagation models
accuracy for WiMAX on 3.5 GHz,” in Proc. 14th IEEE International Confer-
ence on Electronics, Circuits and Systems, Marrakech, Morocco, Dec. 2007, pp.
111-114.
[21] Altera Corporation, “DSP Builder-advanced blockset with timing-driven




II. A NOVEL EMULATOR FOR DISCRETE-TIME MIMO
TRIPLY-SELECTIVE FADING CHANNELS
Fei Ren and Yahong Rosa Zheng
Abstract—Hardware implementation of discrete-time triply selective Rayleigh fad-
ing channel emulators is proposed for multiple-input multiple-output (MIMO) com-
munications. The proposed work differs from existing ones in that it incorporates
temporal correlation, inter-tap correlation, and spatial correlation matrices into mul-
tiple uncorrelated frequency-flat Rayleigh fading waveforms to obtain a triply selective
fading channel. The flat fading waveforms with temporal correlation or Doppler spec-
trum are generated using a Sum-of-Sinusoids (SoS) method. The inter-tap correlation
matrix associated with multipath delay spread is computed according to the chan-
nel power delay profile and transmit/receive filters. The spatial correlation matrices,
including the transmit correlation and receive correlation matrices, are predefined in-
puts associated with antenna arrangements. The square roots of the three correlation
matrices are computed via Singular Value Decomposition (SVD) and then combined
in real time via Kronecker product with the flat fading waveforms. Several fading
channel examples are implemented on an Altera Stratix III EP3SL150F FPGA DSP
development kit with fixed-point arithmetic. A 4×4 MIMO triply-selective channel
with 10 correlated delay taps per sub-channel utilizes one third of the hardware
resources of the FPGA chip. The statistical and correlation properties of the emulated
fading waveforms match those of the software-based simulators and the theoretical




Wireless fading channel modeling, simulation, and emulation are important
topics in communications because they provide a fast and low-cost method for test-
ing and verifying new algorithm design, transceiver performance, and channel ca-
pacity analysis [1, 2, 3, 4]. To generate correct and realistic fading waveforms, it is
important that channel models reproduce accurate properties of actual propagation
environments. One significant statistical property for wireless fading channel models
is the correlation of fading channel waveforms. For a multiple-input multiple-output
(MIMO) system, three types of correlation functions, namely temporal correlation,
inter-tap correlation, and spatial correlation, need to be taken into consideration since
a practical MIMO fading channel is usually time-selective (described by the temporal
correlation), frequency-selective (exhibiting inter-tap correlation), and space-selective
(associated with transmitter and/or receiver spatial correlation). This is referred to
as the MIMO triply selective fading channel [5]. Incorporating the three types of
correlation functions into channel simulators or emulators is the key and yet difficult
aspect of accurately simulating MIMO fading channels. The frequency-flat Rayleigh
fading channel and the frequency-selective fading channels are two special cases of
the MIMO triply-selective fading model [5], when only the temporal correlation, or
both the temporal and inter-tap correlation functions are involved, respectively. It
is worth noting that the most commonly used Wide-Sense Stationary Uncorrelated
Scattering (WSSUS) channel model [1] assumes uncorrelated scatterers in the pass
band, which leads to inter-tap correlated fading waveforms in the baseband equivalent
channel due to the bandlimited nature of wireless systems [6].
Software-based channel simulators usually employ general-purpose processors
and floating-point arithmetic to generate fading channel impulse responses. The flat
Rayleigh fading channels can be simulated by one of the two methods: the spectrum
filtering method [7, 8, 9] and the Sum-of-Sinusoids (SoS) method [10, 11, 12, 13]. The
spectrum filtering method shapes the spectrum of a white Gaussian waveform using
a filter that has a transfer function equal to the square root of the power spectrum
density (PSD) of the desired fading process. Doppler spectrum filtering may be im-
plemented by FIR filters [7] or IIR filters [8,9]. It can simulate fading channels having
various PSD shapes and reach accurate statistical properties in every trial. The SoS
method sums a finite number of sinusoidal waveforms having amplitudes, frequen-
cies, and phases that are appropriately selected to reproduce the desired statistical
properties and Doppler spectra. It is computationally efficient and flexible in
parameter reconfiguration (such as the maximum Doppler frequency). Frequency-selective
fading channels incorporating inter-tap correlation are more difficult to simulate
because the channel is modeled as a time-varying system with a 2-D scattering function.
The Doppler frequency is often much smaller than the symbol rate, whereas the WSSUS
delays are often fractionally spaced relative to the symbol interval. Many approaches
to discrete-time frequency-selective channel simulation are found in the literature,
including the delay-weight-and-sum method [6,14], the correlation matrix multiplica-
tion method [15,5], and the 2-D filtering method [16]. The first one uses fractionally
delayed transmit/receive filter taps to weigh and delay multiple uncorrelated flat fad-
ing waveforms and sums them together according to the power delay profile (PDP).
The correlation matrix multiplication method first computes the inter-tap correlation
matrix according to the PDP and then multiplies the square root of the correlation
matrix with multiple uncorrelated flat fading waveforms. The 2-D filtering method
filters multiple white Gaussian processes by an approximating filter of the
delay-Doppler spread function using Gaussian quadrature rules. Simulating a MIMO triply
selective fading channel is even more challenging because the transmit
and receive spatial correlation functions must be considered in addition to the
temporal and inter-tap correlation functions. The discrete-time triply-selective channel
model proposed in [5] computes the symbol-spaced inter-tap correlation matrix according
to the power delay profile, and incorporates the inter-tap and spatial correlation
matrices into multiple uncorrelated SoS flat fading CIRs via the Kronecker product.
Another MIMO channel simulator, proposed in [17, 8], synthesizes correlated vector
channels (with a user-specified correlation function) using the autoregressive (AR)
modeling method to shape the spectrum of uncorrelated white Gaussian processes.
Research on software-based channel simulators provides the basis for the de-
sign of hardware-based channel emulators. Several hardware emulators for frequency-
flat and doubly-selective fading channels have been reported in the literature. For
frequency-flat Rayleigh fading channels, the emulator in [18] is based on the SoS
method with a modified random phase variable and its hardware implementation
uses reduced rate sinusoids to achieve high speed and low hardware usage. Several
other hardware emulators [19, 20] are based on the spectrum filtering method where
FIR or IIR filtering with single or multiple interpolation stages is used to improve
the accuracy of the narrowband U-shaped Doppler spectrum. For doubly-selective
fading channels, the emulator in [21] uses the spectrum filtering method to generate
multiple uncorrelated flat fading waveforms, converts them to pass band signals
(with upsampling), and then combines them as uncorrelated scatterers according to
the power delay profile. Another emulator [22] generates one baseband complex
Gaussian process with Doppler filtering and filters it again in the delay-spread domain to
generate multipaths. Recently, we proposed a hardware emulator [23] based on the
software simulation model in [14], which uses the SoS method to generate multiple
uncorrelated flat fading waveforms, upsamples them to fractionally spaced baseband
signals, and then combines them using weight-delay-and-sum. However, hardware
implementation of MIMO triply selective fading channels still presents challenges
in accurately incorporating correlation functions among the multiple sub-channels.
Currently, many so-called MIMO emulators only implement multiple uncorrelated
flat fading channels in parallel. For example, the emulator in [24] outputs multiple
uncorrelated flat Rayleigh fading channels without considering inter-tap and spatial
correlation; the emulator in [25] attempts to incorporate only the spatial correlation
matrices into multiple frequency-flat fading waveforms. To the best of our knowledge,
no hardware-based channel emulator in the literature or among commercial products has
properly implemented all three types of correlation functions of MIMO channels.
In this paper, we propose a hardware implementation method for discrete-
time MIMO triply selective channel emulators. The proposed method implements
the three types of correlation functions of the triply selective channel based on the
software simulator in [5]. The emulator consists of five major functional modules:
a random number generator (RNG), a frequency-flat Rayleigh fading generator (FRFG),
the square-root inter-tap correlation matrix generator ($\mathbf{C}_{ISI}^{1/2}$ generator),
a correlation multiplier (CM), and an interpolator module. The use of the Kronecker
product in the CM module saves a large amount of hardware memory storage at
the expense of slightly increased computational complexity. This method achieves
the best tradeoff between hardware resources and simulation speed. In addition,
mixed parallel-serial architecture is used to meet real-time requirements while reduc-
ing hardware area. For example, the SoS computation for a single flat Rayleigh fading
waveform is performed in parallel, and the generation of multiple flat Rayleigh fading
waveforms and correlation combination are performed in series. The proposed emula-
tor is implemented on a Stratix III EP3SL150F FPGA DSP development kit. Several
fading channel examples are provided to demonstrate the accuracy and statistical
properties of the emulator.
The rest of the paper is organized as follows. Section 2 reviews the software-
based model of discrete-time MIMO triply selective fading channels. Section 3 pro-
poses the hardware implementation method for this model. Section 4 presents several
FPGA hardware implementation examples and evaluates their performance. Section
5 draws the conclusion.
2 DISCRETE-TIME TRIPLY SELECTIVE FADING MODEL
We choose the discrete-time MIMO triply selective fading model in [5] as the
basis for our hardware implementation. Consider a MIMO channel with 푃 transmit
and $O$ receive antennas. The input-output relationship of the channel in the
discrete-time domain is
$$\mathbf{y}(k) = \sum_{q=-Q_1}^{Q_2} \mathbf{H}(k,q)\cdot\mathbf{x}(k-q) + \mathbf{v}(k), \tag{1}$$
where the superscript $(\cdot)^t$ is the transpose operator of a matrix or vector,
$\mathbf{x}(k) = [x_1(k), x_2(k), \ldots, x_P(k)]^t$ is the transmitted signal vector,
$\mathbf{y}(k) = [y_1(k), y_2(k), \ldots, y_O(k)]^t$ is the received signal vector,
$\mathbf{v}(k) = [v_1(k), v_2(k), \ldots, v_O(k)]^t$ is the background white
Gaussian noise, $k$ is the time index, and $Q_1$ and $Q_2$ are nonnegative integers
representing the range of delay taps, yielding a total channel length of $Q = Q_1+Q_2+1$.
Note we assume the symbol interval is 푇푠. The MIMO channel coefficient matrix












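For the convenience of implementation, we reshape the matrix $\mathbf{H}(k,q)$ into an
$(OPQ)\times 1$ coefficient vector
$$\mathbf{h}_{vec}(k) = [\mathbf{h}_{1,1}(k), \ldots, \mathbf{h}_{1,P}(k) \mid \cdots \mid \mathbf{h}_{O,1}(k), \ldots, \mathbf{h}_{O,P}(k)]^t,$$
where $\mathbf{h}_{o,p}(k)$ is the coefficient vector of the $(o,p)$-th sub-channel given by
$\mathbf{h}_{o,p}(k) = [h_{o,p}(k,-Q_1), \ldots, h_{o,p}(k,Q_2)]$. Following [5], the coefficient
vector is generated as
$$\mathbf{h}_{vec}(k) = \left(\Psi_{Rx}^{1/2} \otimes \Psi_{Tx}^{1/2} \otimes \mathbf{C}_{ISI}^{1/2}\right)\cdot\Phi(k), \tag{2}$$
where $\otimes$ denotes the Kronecker product and $\mathbf{X}^{1/2}$ is the square root of matrix $\mathbf{X}$
such that $\mathbf{X} = \mathbf{X}^{1/2}\cdot(\mathbf{X}^{1/2})^h$, with the superscript $(\cdot)^h$ being the Hermitian operator.
The spatial correlation matrices, $\Psi_{Rx}$ and $\Psi_{Tx}$, are associated with the receiver and
transmitter antennas, respectively; $\mathbf{C}_{ISI}$ is the inter-tap covariance matrix, which
causes intersymbol interference (ISI); and $\Phi(k)$ is an $(OPQ)\times 1$ vector for each time
index $k$.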
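Define $\Phi(k) = [Z_1(k), Z_2(k), \ldots, Z_{(OPQ)}(k)]^t$, where $Z_i(k)$ is one of the
uncorrelated flat Rayleigh fading waveforms. The multiple uncorrelated flat Rayleigh fading
waveforms $Z_i(k)$ can be efficiently simulated by one of the SoS models [12], and a
typical one is
$$\begin{aligned}
Z_i(k) &= Z_{c,i}(k) + jZ_{s,i}(k),\\
Z_{c,i}(k) &= \sqrt{\frac{2}{M}}\sum_{m=1}^{M}\cos\left(2\pi(f_dkT_s\cos\alpha_{m,i} + \phi_{m,i})\right),\\
Z_{s,i}(k) &= \sqrt{\frac{2}{M}}\sum_{m=1}^{M}\cos\left(2\pi(f_dkT_s\sin\alpha_{m,i} + \varphi_{m,i})\right),\\
\alpha_{m,i} &= \frac{\pi(m-0.5+\theta_i)}{2M}, \quad m = 1, 2, \cdots, M,
\end{aligned} \tag{3}$$
where $f_d$ is the maximum Doppler frequency, $M$ is the total number of sinusoids, and
$j = \sqrt{-1}$. The angle of arrival $\alpha_{m,i}$ is randomized by $\theta_i$. The random variables
$\phi_{m,i}$ and $\varphi_{m,i}$ are the random phases of the in-phase and quadrature components,
respectively. The random variables $\phi_{m,i}$, $\varphi_{m,i}$, and $\theta_i$ are statistically independent
and uniformly distributed on $[-0.5, 0.5)$ for all $m$.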
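A floating-point sketch of the SoS model (3) may help when checking a fixed-point implementation against a software reference. It assumes the $\sqrt{2/M}$ scaling and NumPy's default random generator; the function name `sos_flat_fading` and the parameter values are hypothetical.

```python
import numpy as np


def sos_flat_fading(num_samples, fd_ts, num_sinusoids=8, rng=None):
    """Generate one flat Rayleigh fading waveform Z_i(k) with the SoS model (3).

    fd_ts is the normalized Doppler frequency f_d * T_s.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = np.arange(1, num_sinusoids + 1)
    theta = rng.uniform(-0.5, 0.5)                      # randomizes the arrival angles
    phi_c = rng.uniform(-0.5, 0.5, size=num_sinusoids)  # in-phase random phases
    phi_s = rng.uniform(-0.5, 0.5, size=num_sinusoids)  # quadrature random phases
    alpha = np.pi * (m - 0.5 + theta) / (2 * num_sinusoids)
    k = np.arange(num_samples)[:, None]                 # column vector of time indices
    scale = np.sqrt(2.0 / num_sinusoids)
    z_c = scale * np.cos(2 * np.pi * (fd_ts * k * np.cos(alpha) + phi_c)).sum(axis=1)
    z_s = scale * np.cos(2 * np.pi * (fd_ts * k * np.sin(alpha) + phi_s)).sum(axis=1)
    return z_c + 1j * z_s


z = sos_flat_fading(num_samples=1000, fd_ts=0.01, rng=np.random.default_rng(0))
print(z.shape, np.mean(np.abs(z) ** 2))
```

Because the random variables are drawn once per call, each call corresponds to one simulation trial; calling the function $(OPQ)$ times with independent draws yields the entries of $\Phi(k)$.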
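The spatial correlation matrices, $\Psi_{Rx}$ and $\Psi_{Tx}$, are usually known and
specified by users. The inter-tap covariance matrix, $\mathbf{C}_{ISI}$, is computed according to the
PDP and the transmit and receive filters as
$$\mathbf{C}_{ISI} = \begin{pmatrix} c(-Q_1,-Q_1) & \cdots & c(-Q_1,Q_2) \\ \vdots & \ddots & \vdots \\ c(Q_2,-Q_1) & \cdots & c(Q_2,Q_2) \end{pmatrix} \tag{5}$$
with coefficients
$$c(q_1,q_2) = \sum_{n=1}^{N} \sigma_n^2 R_{P_TP_R}(q_1T_s-\tau_n) R^*_{P_TP_R}(q_2T_s-\tau_n) \tag{6}$$
and $R_{P_TP_R}(\xi)$ is the convolution of the transmit and receive filters. Note $N$ is the
number of total resolvable paths in the PDP. The superscript $(\cdot)^*$ is the conjugate
operator. Parameters $\sigma_n$ and $\tau_n$ come from the PDPs, $G(\tau)$, which are often specified
as
$$G(\tau) = \sum_{n=1}^{N} \sigma_n^2\delta(\tau-\tau_n). \tag{7}$$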
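Equations (5) and (6) can be evaluated in a few lines of floating-point code, which is useful as a reference for the hardware generator. The sketch below assumes that the combined transmit/receive response $R_{P_TP_R}(\xi)$ is an ideal, untruncated raised-cosine pulse with the group delay removed (the matched pair of square-root raised cosine filters); the three-path PDP is hypothetical, and the function names are illustrative.

```python
import numpy as np


def raised_cosine(t, ts, beta):
    """Ideal raised-cosine pulse with symbol interval ts and roll-off beta > 0."""
    x = np.asarray(t, dtype=float) / ts
    denom = 1.0 - (2.0 * beta * x) ** 2
    singular = np.isclose(denom, 0.0)
    safe = np.where(singular, 1.0, denom)          # avoid division by zero
    value = np.sinc(x) * np.cos(np.pi * beta * x) / safe
    limit = (np.pi / 4.0) * np.sinc(1.0 / (2.0 * beta))  # value at |t| = ts / (2 beta)
    return np.where(singular, limit, value)


def c_isi(sigma2, tau, ts, q1, q2, beta=0.3):
    """Inter-tap covariance matrix of (5), with coefficients from (6)."""
    q = np.arange(-q1, q2 + 1)
    # r[a, n] = R(q_a * Ts - tau_n)
    r = raised_cosine(q[:, None] * ts - np.asarray(tau)[None, :], ts, beta)
    return (r * np.asarray(sigma2)[None, :]) @ r.conj().T


ts = 3.69e-6
sigma2 = np.array([0.6, 0.3, 0.1])        # hypothetical path powers
tau = np.array([0.0, 0.5, 1.2]) * ts      # hypothetical path delays
c = c_isi(sigma2, tau, ts, q1=1, q2=2)
print(c.shape)
print(np.round(c, 4))
```

By construction the result has the form $\mathbf{R}\,\mathrm{diag}(\sigma_n^2)\,\mathbf{R}^h$, so it is symmetric and positive semidefinite, which is what the matrix square root step relies on.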
3 HARDWARE IMPLEMENTATION METHOD
Our hardware implementation of the discrete-time MIMO triply selective fading
emulator outputs $\mathbf{h}_{vec}(k)$ for $O\times P$ sub-channels in parallel with $Q$ taps per
sub-channel. For convenience of description, we give the elements of $\mathbf{h}_{vec}(k)$ new indices
by defining $H(l,k) = h_{o,p}(k,q)$, where $l = Q\cdot[(o-1)\cdot P+(p-1)]+(q+Q_1+1)$.
Therefore, the channel vector $\mathbf{h}_{vec}(k)$ is converted to $[H(1,k), H(2,k), \ldots, H((OPQ),k)]^t$,
where $H(l,k)$ is the complex fading coefficient with $H(l,k) = H_c(l,k) + jH_s(l,k)$.
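The index mapping above can be checked with a short sketch. The function name `tap_index` and the small antenna and tap counts are hypothetical.

```python
def tap_index(o, p, q, num_tx, num_taps, q1):
    """Linear index l = Q * [(o - 1) * P + (p - 1)] + (q + Q1 + 1), 1-based."""
    return num_taps * ((o - 1) * num_tx + (p - 1)) + (q + q1 + 1)


O, P, Q1, Q2 = 2, 2, 1, 1      # hypothetical configuration
Q = Q1 + Q2 + 1
indices = [tap_index(o, p, q, P, Q, Q1)
           for o in range(1, O + 1)
           for p in range(1, P + 1)
           for q in range(-Q1, Q2 + 1)]
print(indices)
```

Enumerating the receive antenna in the outer loop, the transmit antenna in the middle loop, and the tap in the inner loop visits every index from 1 to $(OPQ)$ exactly once, in increasing order.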
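The proposed implementation scheme consists of five major modules: a random
number generator (RNG) module, a flat Rayleigh fading generator (FRFG) module,
a $\mathbf{C}_{ISI}^{1/2}$ generator, a correlation multiplier (CM) module, and an interpolator
module, as shown in Fig. 1.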






































Figure 1. Block diagram of discrete-time MIMO triply selective fading emulator.
The RNG module is a bank of pseudo-random number generators, which generate
the uniform random variables used in the SoS channel model (3). Note that a set of
random variables is generated at the beginning of each trial; the values are stored
and used for all time indices $k$.
The FRFG module serially generates a large number of uncorrelated flat
Rayleigh fading waveforms with proper temporal correlation (or Doppler spectrum)
each at a low sampling rate. It takes the random variables from the RNG module as
inputs and computes multiple uncorrelated flat Rayleigh fading responses according
to (3), but with a decimation factor, 푅, of the symbol interval. This implementation
takes advantage of the fact that the maximal Doppler frequency of typical fading chan-
nels is often much smaller than the symbol rate and fading variation within channel
coherence time is small. This technique reduces the computational complexity while




The $\mathbf{C}_{ISI}^{1/2}$ generator computes the coefficients of the inter-tap correlation
matrix using (5) and the square root of $\mathbf{C}_{ISI}$ using the Jacobi algorithm for SVD [27].
Note that this module is also used only once at the beginning of each simulation trial,
and the square root matrix $\mathbf{C}_{ISI}^{1/2}$ is stored, together with $\Psi_{Tx}^{1/2}$ and
$\Psi_{Rx}^{1/2}$, for the entire trial.
The CM module incorporates the three square-root matrices with the multiple
uncorrelated flat Rayleigh fading channel responses using the Kronecker product and
vector multiplication. Note that the three square root matrices are saved in the
on-chip memory of the FPGA development board.
The interpolator module linearly interpolates the triply selective fading chan-
nel waveforms into symbol-spaced samples with an interpolation rate 푅. This module




The $\mathbf{C}_{ISI}^{1/2}$ generator and CM module are novel FPGA hardware implementations
proposed in this paper. Instead of storing a large square root matrix of size
$(OPQ)\times(OPQ)$, this architecture stores the coefficients of three small matrices of
sizes $O\times O$, $P\times P$, and $Q\times Q$, respectively. The memory saving comes at the
cost of a slight increase in the computational complexity associated with the Kronecker
product calculation. The other three modules used in the proposed scheme are similar to the
ones in [20, 25, 29] with slight modifications. The following subsections describe
each module in detail, with emphasis on the $\mathbf{C}_{ISI}^{1/2}$ generator and CM modules.
3.1 Random Number Generator and Flat Rayleigh Fading Generator
The RNG and FRFG modules work together to generate $(OPQ)$ uncorrelated
flat Rayleigh fading channels using (3). The data path of the RNG and FRFG modules
is shown in Fig. 2. The FRFG module has a mixed parallel-serial structure,
which generates $(2M)$ sinusoids in parallel and $Z_1(Rk) \sim Z_{(OPQ)}(Rk)$ serially. The




required by a serial structure, whereas a serial structure is a better choice for generating
$Z_1(Rk) \sim Z_{(OPQ)}(Rk)$ because $(OPQ)$ is a large and reconfigurable number. The serial
structure outputs $Z_1(Rk) \sim Z_{(OPQ)}(Rk)$ sequentially, which matches the CM module's
need for serial inputs.
The RNG module generates all uniform random variables, including $\phi_{m,i}$, $\varphi_{m,i}$,
and $\theta_i$, where $m$ ranges from 1 to $M$ and $i$ ranges from 1 to $(OPQ)$. It consists of
a set of $(2M+1)$ RNGs: $(2M)$ of them for $\phi_{m,i}$ and $\varphi_{m,i}$, and one for $\theta_i$. To
provide sufficient accuracy and randomness, we employ the combined linear feedback
shift register (LFSR) RNG [29], which has a longer recurrence period and better
randomness and correlation properties than conventional
LFSR RNGs. In our implementation, the outputs of the RNGs are scaled and shifted to
meet the range requirement before being stored in buffers of the FRFG module.
The FRFG module involves a large set of cos and sin functions, whose accuracy
affects the performance of the emulator significantly. We propose a simplified
but accurate look-up-table (LUT) scheme to compute $\cos\alpha_{m,i}$ and $\sin\alpha_{m,i}$. Since



























































































). We build a set of $(2M)$ LUTs named $C_1, C_2, \ldots, C_M$,
$S_1, S_2, \ldots, S_M$, each of which has $D_1$ non-overlapping entries. The entries in the














). The outputs of the RNG for 휃푖 are rounded and scaled
to the range of [1, 퐷1]. Taking these outputs as the read addresses, the set of LUTs
output the desired cos훼푚,푖 and sin훼푚,푖 values. This LUT scheme achieves a very




) or $\sin(0) \sim \sin(\pi/2)$. The format of the entries in the LUTs is the fixed-point
Q2.14 format, which is sufficient to meet the accuracy requirement.
In the FRFG module, a generator block outputs $(f_dRkT_s)$: the time index $k$ is
produced by an incrementing counter with a proper updating period and is
then multiplied by the constants $f_d$, $R$, and $T_s$. The outputs of this block
are multiplied by $\cos\alpha_{m,i}$ (or $\sin\alpha_{m,i}$) and added to $\phi_{m,i}$ (or $\varphi_{m,i}$) to obtain
$(f_dRkT_s\cos\alpha_{m,i} + \phi_{m,i})$ (or $(f_dRkT_s\sin\alpha_{m,i} + \varphi_{m,i})$). A set of modulo operators
extracts the fractional parts of the inputs and converts them into the read addresses
of the $(2M)$ LUTs named COS$_1$, COS$_2$, ..., COS$_{2M}$. These LUTs are employed
to compute cos(2휋(푓푑푅푘푇푠 cos훼푚,푖 + 휙푚,푖)) and cos(2휋(푓푑푅푘푇푠 sin훼푚,푖 + 휑푚,푖)), re-






outputs of these LUTs are summed by two accumulators and multiplied by $\sqrt{2/M}$ to
obtain $Z_{c,i}(Rk)$ and $Z_{s,i}(Rk)$. Taking the corresponding parameters of the $(OPQ)$
sub-channels, the FRFG module is re-used to generate $(OPQ)$ uncorrelated flat Rayleigh












consists of two submodules: the C퐼푆퐼 generator module which computes the coeffi-
cients of C퐼푆퐼 according to (4) and (5), and the matrix square root (MSR) module




shown in Fig. 3. The $q_1$ and $q_2$ counters range from $-Q_1$ to $Q_2$ (assuming $Q_1 \le Q_2$).
The $q_1$ counter increases by one every $(NQ)$ basic clock periods (BCPs), whereas the
$q_2$ counter increases by one every $N$ BCPs. Two buffers store the values of $\sigma_n^2$ and $\tau_n$
and sequentially output them while $n$ increases from 0 to $N$. The outputs of the two































































and $R^*_{P_TP_R}(\xi)$ are implemented using a LUT scheme. Since $R_{P_TP_R}(\xi)$ is a real and
even function, the size of the LUT can be reduced by storing only the values
corresponding to $\xi \ge 0$. In our implementation, the LUTs $R_1$ and $R_2$ have the same
$D_3$ entries: the results of $R_{P_TP_R}(\xi)$ where $\xi = (0 : \frac{Q_1T_s}{D_3-1} : Q_1T_s + \mathrm{MAX}(\tau_n))$. The
read addresses of $R_1$ and $R_2$ are computed from $|q_1T_s - \tau_n|$ and $|q_2T_s - \tau_n|$, and the
outputs are multiplied together and then by the corresponding $\sigma_n^2$. The accumulator
following the two multipliers sums the $N$ inputs to obtain one coefficient $c(q_1,q_2)$
every $N$ BCPs. The $\mathbf{C}_{ISI}$ generator module sequentially outputs the coefficients in
row-wise order.
The MSR module employs the EigenValue Decomposition (EVD) method [27]
to find the matrix square root of C퐼푆퐼 . According to (4) and (5), C퐼푆퐼 is always
a symmetric positive definite matrix, whose eigenvalues are equal to its singular




, where D퐶퐼푆퐼 is diagonal with its diagonal elements being the eigen-








The MSR module is implemented by an EVD submodule and two matrix multipliers.
The coefficients of C퐼푆퐼 are sequentially fed via a buffer to the EVD module and the
eigenvalues are computed by the Jacobi rotation algorithm [27] with implementation













$\mathbf{V}^{-1}_{C_{ISI}}$, or $\mathbf{C}_{ISI}^{1/2}$, are computed using two
matrix multipliers and are output sequentially in row-wise order.
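The square-root computation can be mimicked in floating point with a cyclic Jacobi rotation scheme. This is only a software sketch of the general technique, assuming a real symmetric positive semidefinite input; it does not reproduce the fixed-point datapath, and the function names and the test matrix are hypothetical. For an orthogonal eigenvector matrix $\mathbf{V}$, $\mathbf{V}^{-1} = \mathbf{V}^t$, which the code uses.

```python
import numpy as np


def jacobi_eigh(matrix, max_sweeps=50, tol=1e-12):
    """Eigen-decomposition of a real symmetric matrix by cyclic Jacobi rotations.

    Returns (eigenvalues, V) such that matrix ~= V @ diag(eigenvalues) @ V.T.
    """
    a = np.array(matrix, dtype=float)
    n = a.shape[0]
    v = np.eye(n)
    for _ in range(max_sweeps):
        off_norm = np.sqrt(np.sum(np.tril(a, -1) ** 2))
        if off_norm < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p, q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p, q] after the similarity transform.
                theta = 0.5 * np.arctan2(2.0 * a[p, q], a[q, q] - a[p, p])
                c, s = np.cos(theta), np.sin(theta)
                rot = np.eye(n)
                rot[p, p] = rot[q, q] = c
                rot[p, q], rot[q, p] = s, -s
                a = rot.T @ a @ rot
                v = v @ rot
    return np.diag(a).copy(), v


def sqrt_psd(matrix):
    """Matrix square root V * D^(1/2) * V^t of a symmetric PSD matrix."""
    eigenvalues, v = jacobi_eigh(matrix)
    eigenvalues = np.clip(eigenvalues, 0.0, None)  # guard against tiny negative round-off
    return v @ np.diag(np.sqrt(eigenvalues)) @ v.T


c_example = np.array([[2.0, 1.0, 0.0],
                      [1.0, 2.0, 1.0],
                      [0.0, 1.0, 2.0]])  # hypothetical symmetric positive definite matrix
root = sqrt_psd(c_example)
print(np.allclose(root @ root.T, c_example))
```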
3.3 Correlation Multiplier Module
The CM module incorporates inter-tap and spatial correlation matrices into
the multiple uncorrelated flat Rayleigh fading waveforms generated by the FRFG. It

















$\mathbf{C}_h^{1/2}(0)\cdot\Phi(k)$. Although the coefficients of $\mathbf{C}_h^{1/2}(0)$ are fixed values that can be
pre-computed by software and stored in hardware memory, the pre-compute-and-store
method consumes a large amount of hardware memory for storing $\mathbf{C}_h^{1/2}(0)$, especially




$\mathbf{C}_h^{1/2}(0)$ in real time, and its results are passed to the next module without being stored. The
datapath of the CM module is shown in Fig. 4.
The KP module consists of three random access memories (RAMs), six coun-
ters and a few multipliers and adders. The RAM A, RAM B, and RAM C store the










in row-wise order. The counters, multipliers, and adders work together to generate
the proper read addresses for the three RAMs. The counters have different clock periods
(integer multiples of one BCP) and modulo values $O$, $P$, and $Q$. Two multipliers
multiply the outputs of the three RAMs together. Their results are the
coefficients of the matrix $\mathbf{C}_h^{1/2}(0)$ in row-wise order.
In the VM module, two buffers storing $Z_{c,i}(Rk)$ and $Z_{s,i}(Rk)$ output their contents in
the proper order to be multiplied by the corresponding coefficients of $\mathbf{C}_h^{1/2}(0)$. Taking the buffer
storing $Z_{c,i}(Rk)$ as an example, it repeatedly outputs the sequence $Z_{c,1}(Rk), Z_{c,2}(Rk), \ldots,$




is aligned with the outputs of the two buffers to ensure correct multiplication. The
accumulators sum the $(OPQ)$ results every $(OPQ)$ BCPs, and the final results are
down-sampled by the same rate to yield $H_c(l,Rk)$ or $H_s(l,Rk)$. Therefore, for each
time index $k$, it takes $(OPQ)$ BCPs to output one single $H_c(l,Rk)$ and $H_s(l,Rk)$, and
$(OPQ)^2$ BCPs to output all results of $H_c(l,Rk)$ and $H_s(l,Rk)$ for $l = 1, \cdots, OPQ$.
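The serial Kronecker-product and vector-multiplication flow can be emulated in software as follows. This is a sketch under the assumption that the three stored matrices are square, with random matrices standing in for the actual square-root correlation matrices; the names `kron3_entry` and `correlate` are hypothetical. Each coefficient is formed from three small-matrix entries with two multiplications, and one output needs $(OPQ)$ multiply-accumulate steps.

```python
import numpy as np


def kron3_entry(a, b, c, row, col):
    """Entry (row, col), 0-based, of kron(kron(a, b), c) read from the small matrices."""
    p_dim, q_dim = b.shape[0], c.shape[0]
    ro, rp, rq = row // (p_dim * q_dim), (row // q_dim) % p_dim, row % q_dim
    co, cp, cq = col // (p_dim * q_dim), (col // q_dim) % p_dim, col % q_dim
    return a[ro, co] * b[rp, cp] * c[rq, cq]   # two multiplications per coefficient


def correlate(a, b, c, phi):
    """Row-by-row product of the Kronecker matrix with phi, never storing the big matrix."""
    size = a.shape[0] * b.shape[0] * c.shape[0]
    out = np.zeros(size, dtype=complex)
    for row in range(size):
        acc = 0.0 + 0.0j
        for col in range(size):                # (OPQ) accumulations per output
            acc += kron3_entry(a, b, c, row, col) * phi[col]
        out[row] = acc
    return out


rng = np.random.default_rng(1)
O, P, Q = 2, 2, 3                              # hypothetical sizes
a = rng.normal(size=(O, O))
b = rng.normal(size=(P, P))
c = rng.normal(size=(Q, Q))
phi = rng.normal(size=O * P * Q) + 1j * rng.normal(size=O * P * Q)

h = correlate(a, b, c, phi)
reference = np.kron(np.kron(a, b), c) @ phi
print(np.allclose(h, reference))
print("stored words:", O * O + P * P + Q * Q, "versus", (O * P * Q) ** 2)
```

The last line shows the memory tradeoff discussed in the text: only the three small matrices are stored, at the cost of recomputing each coefficient when it is needed.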
3.4 Interpolator Module
The interpolator module performs a linear interpolation with a rate $R$ to meet
the real-time requirements. The datapath of the interpolator module is shown in
Fig. 5. The inputs from the CM module, $H_c(l,Rk)$ and $H_s(l,Rk)$, are delayed by
$(OPQ)^2$ BCPs and become $H_c(l,R(k-1))$ and $H_s(l,R(k-1))$, respectively. An enable
control block holds “HIGH” for $R$ BCPs and then changes to “LOW” for $(OPQ-R)$
BCPs. Therefore, the output of the counter increases from 0 to $(R-1)$ in the first
$R$ BCPs and holds $(R-1)$ for the remaining $(OPQ-R)$ BCPs in every $(OPQ)$ BCPs.
The counter output is multiplied by $1/R$ as well as by $H_c(l,Rk)-H_c(l,R(k-1))$ and
$H_s(l,Rk)-H_s(l,R(k-1))$, respectively. The results are added to $H_c(l,R(k-1))$
and $H_s(l,R(k-1))$, respectively, to obtain $H_c(l,k)$ and $H_s(l,k)$ as the final outputs.
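The interpolation rule described above amounts to previous sample plus (counter/$R$) times the difference between consecutive decimated samples. A minimal software sketch, with the hypothetical function name `linear_interpolate`, is:

```python
def linear_interpolate(decimated, rate):
    """Insert (rate - 1) linearly interpolated points between consecutive samples."""
    out = []
    for prev, curr in zip(decimated[:-1], decimated[1:]):
        step = (curr - prev) / rate        # difference scaled by 1/R
        for counter in range(rate):        # counter runs from 0 to R - 1
            out.append(prev + counter * step)
    return out


samples = [0.0, 4.0, 8.0]                  # hypothetical decimated values H(l, Rk)
print(linear_interpolate(samples, rate=4))
```

The same function works for complex inputs, so the in-phase and quadrature parts can be interpolated together in software even though the hardware processes them on separate paths. The last decimated sample is not emitted because it starts the next interpolation segment.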
4 EXAMPLES AND PERFORMANCE EVALUATION
The proposed discrete-time MIMO triply selective fading channel emulator was implemented on an Altera Stratix III EP3SL150F1152C2N FPGA development kit. The basic clock frequency of the FPGA chip was $F_{clk} = 50$ MHz, which provided BCP = 20 ns. We used Quartus II version 8.0, DSP Builder version 5.0, and Matlab Simulink for this development.
Figure 4. Implementation of the correlation multiplier module.
4.1 $\mathbf{C}_{ISI}^{1/2}$ Generator Performance Evaluation
As an example, the $\mathbf{C}_{ISI}^{1/2}$ generator module was implemented to compute the coefficients for the typical urban channel model with a 12-tap PDP, as shown in Fig. 6.





























Figure 5. Implementation of the interpolator module.
The transmit and receive filters were square-root raised cosine (SRC) filters with a roll-off factor of 0.3 and a group delay of $3T_s$, where $T_s = 3.69$ $\mu$s. We configured $Q_1 = 4$ and $Q_2 = 5$; therefore, the size of $\mathbf{C}_{ISI}^{1/2}$








and compared to those of Matlab floating-point computation. The squared error per coefficient and the mean squared error (MSE) are shown in Fig. 7, where the x-axis is the coefficient index. All three fixed-point formats had MSEs below −30 dB. The outputs with Q2.10 and Q2.14 resulted in similar MSEs, but the former consumed fewer hardware resources. Therefore, the Q2.10 format was selected as a good tradeoff between performance and cost.
4.2 KP Module Memory Usage Evaluation


















Figure 6. The normalized PDPs for the typical urban channel model presented in the 3GPP standard [26].
Figure 7. [...] hardware fixed-point and Matlab floating-point (x-axis: coefficient index of $\mathbf{C}_{ISI}^{1/2}$). Note that the scales of the y-axes differ between sub-figures.
[...] $\mathbf{C}_{ISI}^{1/2}$, and computes the Kronecker product using a serial structure with five multipliers, six adders, and six counters. The memory usage of the pre-compute and store method is $(OPQ)^2$ words, a quadratic function of the matrix size, while the proposed KP method requires only $(O^2 + P^2 + Q^2)$ words. The comparison for typical values of $O$, $P$, and $Q$ is shown in Fig. 8, which clearly demonstrates that the memory required by the pre-compute and store method becomes prohibitively high when the matrix size $(OPQ)$ is large, making the method impractical. A large amount of hardware memory can be saved by the proposed KP method with only a slight increase in hardware resource usage, making it a better choice for triply-selective channel emulation.
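To make the two memory formulas concrete, the short Python sketch below evaluates both word counts for a given configuration. The function names are illustrative, and the counts are in words exactly as stated in the text (word width is not modeled).

```python
def precompute_words(O, P, Q):
    """Words needed to store the full (OPQ) x (OPQ) matrix."""
    return (O * P * Q) ** 2


def kp_words(O, P, Q):
    """Words needed to store the three small square-root matrices."""
    return O * O + P * P + Q * Q
```

For the configuration $O = P = 4$ and $Q = 10$ used later in this section, this gives 25600 words for the pre-compute and store method against 132 words for the KP method.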
4.3 Frequency Selective Fading Channel Example
A doubly-selective channel emulator was implemented using the proposed method by setting $\Psi_{Rx}$ and $\Psi_{Tx}$ to identity matrices, so that it simultaneously generates $(O \times P)$ doubly-selective channels. The implementation parameters were $M = 16$ (number of sinusoids), $T_s = 3.69$ $\mu$s (symbol duration), $f_d T_s = 0.001$ (normalized Doppler), $Q = 10$ (channel taps), and $R = 140$ (interpolation rate). The PDPs and transmit/receive filters were the same as those in Section 4.1.
The performance of this emulator was evaluated by the auto/cross-correlation of its output waveforms, as shown in Fig. 9. The theoretical autocorrelation function of $h_{1,1}(k, l)$ is given by $c(l, l) \cdot J_0[2\pi f_d (k_1 - k_2) T_s]$, while the theoretical cross-correlation function of $h_{1,1}(k, l_1)$ and $h_{1,1}(k, l_2)$ is given by $c(l_1, l_2) \cdot J_0[2\pi f_d (k_1 - k_2) T_s]$, where the correlation coefficients were $c(0, 0) = 0.7794$, $c(0, 1) = 0.1551$, and $c(0, 2) = -0.0544$. The auto/cross-correlation functions of the emulator outputs were computed offline using 50 trials with $2.8 \times 10^4$ samples per sub-channel per trial. As can be seen, the hardware outputs closely matched the theoretical results.
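The theoretical curves have the form $c(l_1, l_2) \cdot J_0[2\pi f_d (k_1 - k_2) T_s]$ and can be reproduced with a small Python sketch. The Bessel function is evaluated here from its integral representation $J_0(x) = \frac{1}{\pi}\int_0^{\pi} \cos(x \sin\theta)\, d\theta$ using a midpoint rule; the function names and the number of integration steps are illustrative assumptions rather than anything specified in the text.

```python
import math


def bessel_j0(x, steps=2000):
    """Zeroth-order Bessel function of the first kind via a midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(x * math.sin((i + 0.5) * h)) for i in range(steps))
    return total * h / math.pi


def theoretical_corr(c, fd_ts, lag):
    """c * J0(2*pi*fd*Ts*lag), with lag = k1 - k2 measured in symbol periods."""
    return c * bessel_j0(2.0 * math.pi * fd_ts * lag)
```

With the parameters of this example, `theoretical_corr(0.7794, 0.001, lag)` would trace the theoretical autocorrelation curve of the tap with $c(0, 0) = 0.7794$ as `lag` varies.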
Figure 8. The comparison of memory usage between the pre-compute and store method and the proposed KP method.
4.4 Triply Selective Fading Channel Example
Figure 9. Performance of the doubly selective fading channel emulator: auto-correlation of $h_{1,1}(k, 0)$, cross-correlation between $h_{1,1}(k, 0)$ and $h_{1,1}(k, 1)$, and between $h_{1,1}(k, 0)$ and $h_{1,1}(k, 2)$, plotted against the normalized time lag $k f_d T_s$. The results were based on hardware outputs of 50 trials with $2.8 \times 10^4$ samples per sub-channel per trial.
The proposed emulator also implemented the MIMO triply selective channel example presented in [5]. The sizes of $\Psi_{Rx}$, $\Psi_{Tx}$, and $\mathbf{C}_{ISI}$ were $O = P = 2$, and













The PDP was $G(\tau) = A \exp(-\tau/\mu\text{s})$ for $0 \le \tau \le 5$ $\mu$s and zero elsewhere. The transmit filter was a linearized Gaussian filter with a time-bandwidth product equal to 0.3, and the receive filter was an SRC filter with a roll-off factor of 0.3. The matrix $\mathbf{C}_{ISI}$ was
\[
\mathbf{C}_{ISI} =
\begin{pmatrix}
0.0091 & 0.0426 & 0.0178 & -0.0016 \\
0.0426 & 0.3664 & 0.3407 & 0.0367 \\
0.0178 & 0.3407 & 0.5583 & 0.1414 \\
-0.0016 & 0.0367 & 0.1414 & 0.0602
\end{pmatrix}
\]
The parameters $M$, $f_d$, $T_s$, and $R$ were the same as those used in the doubly selective fading example. The auto/cross-correlations of several sub-channels were computed and compared with the theoretical ones and with the software simulation results reported in [5], as depicted in Fig. 10 and Fig. 11. The theoretical auto/cross-correlations are given by $c_0(l_1, l_2) J_0[2\pi f_d (k_1 - k_2) T_s]$, where $c_0(l_1, l_2)$ is the $(l_1, l_2)$-th coefficient of $\mathbf{C}_h(0)$. The results of the emulator outputs were also based on 50 trials, and they matched the theoretical ones very well.
4.5 Evaluation of Flat Rayleigh Fading Generators
Figure 10. Performance of the triply selective channel emulator. Auto-correlation of $h_{1,1}(k, 1)$, cross-correlation between $h_{1,1}(k, 0)$ and $h_{1,1}(k, 1)$, and between $h_{1,1}(k, 0)$ and $h_{2,1}(k, 1)$, plotted against the normalized time lag $k f_d T_s$. The numbers of trials and samples are the same as those in Fig. 9.
Figure 11. Performance of the triply selective channel emulator. Cross-correlation between $h_{1,1}(k, -1)$ and $h_{1,1}(k, 1)$, and between $h_{1,1}(k, -1)$ and $h_{1,1}(k, 2)$, plotted against the normalized time lag $k f_d T_s$. Note the change of scale in the y-axis.
The performance of the flat Rayleigh fading generator was analyzed through the statistical properties of the FRFG outputs. The FRFG module had the same parameters $M$, $T_s$, $f_d$, and $R$ as those of the previous examples. The probability density function (PDF) of the real/imaginary part of the outputs, the PDF of the envelope, and the level crossing rate (LCR) are compared with the theoretical ones in Figs. 12–14. The PDF curves of the hardware outputs matched the theoretical ones very well. The LCR of the emulator outputs had slightly lower values than the theoretical ones at lower rates because the number of simulated samples was too limited to provide an accurate LCR count. These results indicated that the hardware implementation of the FRFG module had good accuracy.
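For readers who want to repeat this kind of check offline, the sketch below gives the textbook Rayleigh envelope PDF, the textbook level crossing rate for isotropic (Clarke-type) scattering normalized by the maximum Doppler frequency, and a simple counter of upward crossings in a sampled envelope. The text does not spell out which expressions were used for the theoretical curves, so treating these standard formulas as the reference is an assumption, and all function names are hypothetical.

```python
import math


def rayleigh_pdf(r, sigma):
    """Rayleigh PDF of the envelope; sigma**2 is the variance per quadrature."""
    return (r / sigma ** 2) * math.exp(-r * r / (2.0 * sigma ** 2))


def normalized_lcr(rho):
    """Theoretical LCR divided by f_d; rho is the level over the RMS envelope."""
    return math.sqrt(2.0 * math.pi) * rho * math.exp(-rho * rho)


def count_upward_crossings(envelope, level):
    """Count sample pairs where the envelope rises through the given level."""
    return sum(1 for a, b in zip(envelope, envelope[1:]) if a < level <= b)
```

Dividing the crossing count by the observation time and by $f_d$ gives a measured value comparable to `normalized_lcr`; at levels that are rarely crossed, the count is small and the estimate becomes noisy, consistent with the limited-sample effect noted above.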
4.6 Parameter Specifications and Hardware Usage
The proposed MIMO triply-selective fading emulator is flexible in parameter selection and can be customized to simulate channel scenarios other than the examples presented here. Table 1 shows the parameter ranges of the emulator with the chip clock $F_{clk} = 50$ MHz and symbol duration $T_s = 3.69$ $\mu$s. The emulator can generate triply-selective channels with any PDPs specified in [26]. For systems with smaller symbol durations, the real-time requirement can be met by increasing the clock frequency $F_{clk}$ or the interpolation rate $R$ such that $(OPQ)^2/(F_{clk} R) < T_s$. This ensures that the number of output coefficients is $OPQ/T_s$ complex samples per second. The range of the normalized Doppler $T_s f_d$ covers most practical channel scenarios, where it is often on the order of $10^{-6}$ to $10^{-2}$. The product $(OPQ)$ is also limited by $F_{clk}$ in this case. Increasing $F_{clk}$ to 200 MHz while keeping $T_s$ and $R$ unchanged leads to $\max(OPQ) = 320$, and the on-chip memory is adequate for channels with $O^2 + P^2 + Q^2 \le 10^4$. The proposed emulator stores the value of $T_s f_d$ in the Q1.19 format to ensure high accuracy. Each output sample is a complex value whose real and imaginary parts, $H_c(l, k)$ and $H_s(l, k)$, are represented in the Q4.14 format. This is sufficient to avoid overflow and to provide an accuracy of $10^{-4}$.
Table 1. Parameter ranges of the proposed emulator with $F_{clk}$ = 50 MHz and $T_s$ = 3.69 $\mu$s.
Number of Rx, Tx, and Taps    Normalized Doppler $T_s f_d$    Complex Samples/s    Output Accuracy
$(OPQ) \le 160$               $1/2^{19} \sim 1$               $OPQ/T_s$            $10^{-4}$
The hardware usage of the MIMO triply-selective fading emulator with $O = P = 4$ and $Q = 10$ is summarized in Table 2, where ALUT denotes adaptive look-up table, DLR denotes dedicated logic register, BM denotes block memory, and DSP denotes the DSP blocks (high-speed 18-bit multipliers). The ALUT, DLR, and BM usage was each roughly one third of the total resources of the Stratix III FPGA chip, and the DSP multiplier usage was about half of the total. It is noted that the $\mathbf{C}_{ISI}^{1/2}$ generator consumes the most hardware resources. If the $\mathbf{C}_{ISI}$ coefficients and the matrix square root were computed externally by software, the usage of ALUT, DLR, BM, and DSP would drop dramatically to 11%, 2%, 13%, and 13% of the total resources, respectively.
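The real-time condition $(OPQ)^2/(F_{clk} R) < T_s$ stated in this subsection can be checked numerically. The following Python sketch, with hypothetical function names, returns the largest integer product $OPQ$ that satisfies the strict inequality and the resulting output rate $OPQ/T_s$.

```python
import math


def max_opq(f_clk_hz, R, ts_s):
    """Largest integer n with n**2 / (f_clk * R) < ts."""
    limit = f_clk_hz * R * ts_s
    n = math.isqrt(int(limit))
    while n * n >= limit:      # enforce the strict inequality
        n -= 1
    return n


def samples_per_second(opq, ts_s):
    """Complex output coefficients per second, OPQ / Ts."""
    return opq / ts_s
```

With $F_{clk} = 50$ MHz, $R = 140$, and $T_s = 3.69$ $\mu$s this evaluates to 160, matching Table 1. At 200 MHz the same bound evaluates to 321, in line with the value of roughly 320 quoted in the text.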
Table 2. Resource usage of the MIMO triply selective fading emulator on Stratix III EP3SL150F1152C2N FPGA with $F_{clk}$ = 50 MHz.
Module                                   ALUT     DLR      BM        DSP
$\mathbf{C}_{ISI}^{1/2}$ Generator       22636    36586    1194944   143
RNG & FRFG                               11988    231      618743    16
CM                                       648      1111     10304     10
Other                                    120      429      96098     25
Total                                    35392    38357    1920089   194
Percentage                               (31%)    (34%)    (34%)     (51%)
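The figures in Table 2 can be cross-checked with a few lines of Python. The sketch sums the per-module rows and then estimates what share of the chip would remain in use without the $\mathbf{C}_{ISI}^{1/2}$ generator by scaling the reported total percentages; because those percentages are themselves rounded, the result is only an approximation. The variable names and the column order (ALUT, DLR, BM, DSP) are assumptions made for this illustration.

```python
GENERATOR = "C_ISI^(1/2) generator"

# Columns: ALUT, DLR, BM, DSP (values copied from Table 2).
usage = {
    GENERATOR: (22636, 36586, 1194944, 143),
    "RNG & FRFG": (11988, 231, 618743, 16),
    "CM": (648, 1111, 10304, 10),
    "Other": (120, 429, 96098, 25),
}
total_pct = (31, 34, 34, 51)  # reported share of the chip, in percent

# Column-wise totals over all modules.
totals = tuple(sum(col) for col in zip(*usage.values()))

# Usage left over if the generator were moved off chip.
without_gen = tuple(t - g for t, g in zip(totals, usage[GENERATOR]))

# Scale the reported percentages by the remaining fraction of each resource.
pct_without_gen = tuple(
    round(p * w / t) for p, w, t in zip(total_pct, without_gen, totals)
)
```

The totals reproduce the "Total" row, and the scaled percentages come out as 11, 2, 13, and 13, consistent with the reduction stated in the text.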
The proposed emulator is also compared to four other related emulators [21, 23, 24, 25], as shown in Table 3, where LE denotes logic elements in Altera FPGAs and LC denotes logic cells in Xilinx FPGAs. The LE count is converted from the ALUT count by ALUT ≈ 1.25 LE [32]. Although LE and LC are different, one LE is considered approximately equivalent to one LC. Even though it requires slightly more hardware, the proposed emulator implements significantly more functionality than the other emulators, including the on-chip $\mathbf{C}_{ISI}^{1/2}$ calculation and the CM module, which incorporates three correlation matrices into the MIMO fading waveforms.
Table 3. Comparison of the proposed emulator with the emulators of [21], [23], [24], and [25].
                                          Proposed       [21]       [23]       [24]       [25]
[...]
Block Memory                              1920089        Unknown    822484     Unknown    440960
DSP Element                               194            Unknown    Unknown    Unknown    136
Rx×Tx                                     $O \times P$ (1)   1×1        1×1        4×4        4×4
[...]
$\mathbf{C}_{ISI}^{1/2}$ Calculator       Yes            No         No         No         No
[...]
Inter-tap Correlation                     Yes (2)        Yes (3)    Yes (4)    No         Unclear
Spatial Correlation                       Yes            No         No         No         Unclear
Note: 1. The numbers of Rx, Tx, and taps meet the relationship $(OPQ) \le 160$. 2. The [...] is calculated on chip. 3. The inter-tap correlation is implemented by upsampling to pass band. 4. The inter-tap correlation is implemented using baseband upsampling.


















Figure 12. PDF of $Z_{ci}$ and $Z_{si}$. The numbers of trials and samples are the same as in the previous figure.
Figure 13. PDF of $|Z_i|$, where $Z_i = Z_{ci} + jZ_{si}$ (hardware output compared with the theoretical Rayleigh PDF). The numbers of trials and samples are the same as in the previous figure.
Figure 14. LCR of $|Z_i|$ (hardware output compared with the theoretical result). The numbers of trials and samples are the same as in the previous figure.
5 CONCLUSIONS
A hardware implementation scheme has been proposed for discrete-time MIMO triply selective fading emulators, which utilizes a mixed parallel-serial structure to achieve the best tradeoff between hardware usage and output speed. The proposed method is capable of simulating MIMO triply selective fading channels by combining the inter-tap and spatial correlation matrices with uncorrelated flat Rayleigh fading waveforms. The proposed emulator has been implemented on an Altera Stratix III FPGA development kit and meets the real-time requirement. The hardware outputs exhibit accurate correlation properties closely matching the theoretical results.
6 REFERENCES
[1] R. H. Clarke, “A statistical theory of mobile-radio reception,” Bell Syst. Tech.
J., vol. 47, no. 6, pp. 957-1000, Jul./Aug. 1968.
[2] H. C. Park, H. S. Lim, and J. W. Yu, “Performance analysis of single-frequency
CW signal-based I/Q regeneration in five-port junction-based direct receivers on
Rayleigh fading channels,” IEEE Trans. Circuits and Systems II: Express Briefs,
vol. 55, no. 6, pp. 561-565, Jun. 2008.
[3] L. Wang, C. Zhang, and G. Chen, “Performance of an SIMO FM-DCSK com-
munication system,” IEEE Trans. Circuits and Systems II: Express Briefs, vol.
55, no. 5, pp. 457-461, May 2008.
[4] Y. Xia, C.K. Tse, and F.C.M. Lau, “Performance of differential chaos-shift-
keying digital communication systems over a multipath fading channel with delay
spread,” IEEE Trans. Circuits and Systems II: Express Briefs, vol. 51, no. 12,
pp. 680-684, Dec. 2004.
[5] C. Xiao, J.X. Wu, S-Y. Leong, Y.R. Zheng, and K. B. Letaief, “A discrete-
time model for triply selective MIMO Rayleigh fading channels,” IEEE Trans.
Wireless Commun., vol. 3, no. 5, pp. 1678-1688, Sep. 2004.
[6] P. Hoeher, “A statistical discrete-time model for the WSSUS multipath channel,”
IEEE Trans. Veh. Technol., vol. 41, no. 4, pp. 461-468, Nov. 1992.
[7] J.I. Smith, “A computer generated multipath fading simulation for mobile radio,”
IEEE Trans. Veh. Technol., vol. 24, no. 3, pp. 39-40, Aug. 1975.
[8] K.E. Baddour and N.C. Beaulieu, “Autoregressive models for fading channel
simulation,” IEEE Trans. Wireless Commun., vol. 4, no. 4, pp. 1650-1662, July.
2005.
[9] C. Komninakis, “A fast and accurate Rayleigh fading simulator,” in Proc. IEEE
Global Telecommun. Conf., San Francisco, CA, Dec. 2003, pp. 3306-3310.
[10] W. C. Jakes, Microwave Mobile Communications, Piscataway, NJ: Wiley-IEEE
Press, 1994.
[11] M. Patzold, U. Killat, F. Laue, and Yingchun Li, “On the statistical properties of
deterministic simulation models for mobile fading channels,” IEEE Trans. Veh.
Technol., vol. 47, no. 1, pp. 254-269, Feb. 1998.
[12] Y. R. Zheng and C. Xiao, “Simulation models with correct statistical properties
for Rayleigh fading channels,” IEEE Trans. Commun., vol. 51, no. 6, pp. 920-928,
Jun. 2003.
[13] C.S. Patel, G.L. Stuber, and T.G. Pratt, “Comparative analysis of statistical
models for the simulation of Rayleigh faded cellular channels,” IEEE Trans.
Commun., vol. 53, no. 6, pp. 1017-1026, Jun. 2005.
[14] K.-W. Yip and T.-S. Ng, “Efficient simulation of digital transmission over WSSUS
channels,” IEEE Trans. Commun., vol. 43, no. 12, pp. 2907-2913, Dec. 1995.
[15] A. Anastasopoulos and K.M. Chugg, “An efficient method for simulation of fre-
quency selective isotropic Rayleigh fading,” in Proc. IEEE Veh. Technol. Conf.,
Phoenix, AZ, May. 1997, pp. 2084-2088.
[16] E. Chiavaccini and G.M. Vitetta, “GQR models for multipath Rayleigh fading
channels,” IEEE Journal Selected Areas in Communications, vol. 19, no. 6, pp.
1009-1018, Jun. 2001.
[17] K. E. Baddour and N. C. Beaulieu, “Accurate simulation of multiple cross-correlated
Rician fading channels,” IEEE Trans. Commun., vol. 52, no. 11, pp.
1980-1987, Nov. 2004.
[18] A. Alimohammad, S.F. Fard, B.F. Cockburn, and C. Schlegel, “Compact
Rayleigh and Rician fading simulator based on random walk processes,” Com-
munications, IET, vol. 3, no. 8, pp. 1333-1342, Aug. 2009.
[19] R.A. Goubran, H.M. Hafez, and A.U.H. Sheikh, “Implementation of a real-time
mobile channel simulator using a DSP chip,” IEEE Trans. Instrum. Meas., vol.
40, no. 4, pp. 709-714, Aug. 1991.
[20] A. Alimohammad, S.F. Fard, B.F. Cockburn, and C. Schlegel, “A compact single-
FPGA fading-channel simulator,” IEEE Trans. Circuits and Systems II: Express
Briefs, vol. 55, no. 1, pp. 84-88, Jan. 2008.
[21] M.A. Wickert and J. Papenfuss, “Implementation of a real-time frequency-
selective RF channel simulator using a hybrid DSP-FPGA architecture,” IEEE
Trans. Microw. Theory Tech., vol. 49, no. 8, pp. 1390-1397, Aug. 2001.
[22] M. Kahrs and C. Zimmer, “Digital signal processing in a real-time propagation
simulator,” IEEE Trans. Instrum. Meas., vol. 55, no. 1, pp. 197-205, Feb. 2006.
[23] Fei Ren and Y.R. Zheng, “A low-complexity hardware implementation of
discrete-time frequency-selective Rayleigh fading channels,” in Proc. IEEE IS-
CAS, Taipei, Taiwan, May. 2009, pp. 1759-1762.
[24] M. Cui, H. Murata, and K. Araki, “FPGA implementation of 4×4 MIMO test-
bed for spatial multiplexing systems,” in Proc. IEEE International Symposium
on Personal, Indoor and Mobile Radio Communications, Barcelona, Spain, Sep.
2004, pp. 3045-3048.
[25] A. Alimohammad, S.F. Fard, B.F. Cockburn, and C. Schlegel, “A novel technique
for efficient hardware simulation of spatiotemporally correlated MIMO fading
channels,” in Proc. IEEE Int. Conf. Commun., Beijing, China, May. 2008, pp.
718-724.
[26] ETSI, “Radio transmission and reception,” GSM 05.05, ETSI EN 300910 V8.5.1,
2000.
[27] G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore, MD: Johns
Hopkins Univ. Press, 1996.
[28] A. Alimohammad and B.F. Cockburn, “Modeling and hardware implementation
aspects of fading channel simulators,” IEEE Trans. Veh. Technol., vol. 57, no. 4,
pp. 2055-2069, Jul. 2008.
[29] P. L’Ecuyer, “Tables of maximally equidistributed combined LFSR generators,”
Math. of Comp., vol. 68, no. 225, pp. 261-269, Jan. 1999.
[30] A. Ahmedsaid, A. Amira, and A. Bouridane, “Efficient systolic array for singular
value and eigenvalue decomposition,” in Proc. IEEE International Symposium
on Micro-NanoMechatronics and Human Science, Nagoya, Japan, Oct. 2003, pp.
835-838.
[31] I. Bravo, P. Jimenez, M. Mazo, J.L. Lazaro, and A. Gardel, “Implementation in
FPGAs of Jacobi method to solve the eigenvalue and eigenvector problem,” in
Proc. International Conference on Field Programmable Logic and Applications,
Madrid, Spain, Aug. 2006, pp. 1-4.
[32] Altera Corporation, “Nios II performance benchmarks,” Online:
http://www.altera.com/literature/ds/ds_nios2_perf.pdf, Oct. 2009.
PAPER
III. HARDWARE IMPLEMENTATION OF TRIPLY SELECTIVE
RAYLEIGH FADING CHANNEL SIMULATORS
Fei Ren and Yahong Rosa Zheng
Abstract—In this paper, we implement a real-time hardware triply selective Rayleigh fading simulator. This simulator incorporates the inter-tap and spatial correlation matrices into multiple uncorrelated frequency-flat Rayleigh fading waveforms (including temporal correlation) to simulate a multiple-input multiple-output (MIMO) triply selective Rayleigh fading channel. In the correlation incorporation procedure, this simulator uses a Kronecker product method to save a large amount of hardware memory. Occupying 34% of the hardware resources of one Stratix III FPGA chip, this simulator can simulate 4 × 4 MIMO fading channels with 10 correlated delay taps per subchannel in real time for a symbol duration of 3.69 $\mu$s. The accuracy of this simulator is verified by comparing the statistical properties of its outputs to the corresponding theoretical values, and they match closely.
1 INTRODUCTION
Wireless fading channel modeling and simulation are very useful for testing and verifying communication algorithm designs, transceiver products, and channel capacity analyses. However, software simulators based on general-purpose processors are slow and have difficulty meeting real-time simulation requirements. A hardware simulator based on low-cost FPGA and DSP chips is a preferred solution for real-time fading channel simulation.
One significant statistical property of wireless fading channel models is the correlation of the fading channel waveforms. The subchannels of a MIMO Rayleigh fading channel are time-selective (described by temporal correlation), frequency-selective (exhibiting inter-tap correlation), and space-selective (associated with spatial correlation of transmitters and receivers). This is referred to as the triply selective fading channel, which contains three types of correlation.
A discrete-time MIMO triply selective Rayleigh fading channel model and its software simulation are proposed in [1]. However, hardware implementation of MIMO triply selective simulators presents challenges in accurately computing the three types of correlation and incorporating them into the discrete-time model. Currently reported hardware MIMO simulators do not implement all three types of correlation and may therefore produce inaccurate channel characteristics. For example, the simulator in [2] outputs multiple uncorrelated frequency-flat Rayleigh fading waveforms as MIMO subchannels, while the simulator in [3] attempts to incorporate the inter-tap and spatial correlation matrices into multiple frequency-flat Rayleigh fading waveforms.
In this paper, we propose a hardware implementation method for the discrete-time MIMO triply selective fading simulator on a Stratix III FPGA DSP development kit. This simulator implements all three types of correlation of triply selective channels. The frequency-flat Rayleigh fading waveforms with temporal correlation, or Doppler spectrum, are generated using a Sum-of-Sinusoids (SOS) method. The inter-tap correlation matrix associated with the multipath delay spread is computed according to a channel power delay profile (PDP) and the transmit/receive filters. The spatial correlation matrices, including the transmit and receive correlation matrices, are pre-defined inputs associated with the antenna arrangements. The matrix square roots of the correlation matrices are calculated using an eigenvalue decomposition (EVD) method. They are then combined with multiple uncorrelated frequency-flat Rayleigh fading waveforms using the Kronecker product and vector multiplication. The results of the Kronecker product are computed in real time to save hardware memory. Statistical properties of the simulator outputs are analyzed and compared to the corresponding theoretical ones for performance evaluation.
2 DISCRETE-TIME MIMO TRIPLY SELECTIVE RAYLEIGH
FADING MODEL
With accurate statistical properties and computational efficiency for hardware implementation, the discrete-time MIMO triply selective fading model in [1] is chosen as the basis of our hardware implementation. In [1], the MIMO channel matrix at time instant $k$ and delay tap $q$ can be represented as an $(OPQ) \times 1$ coefficient vector $\mathbf{h}_{vec}(k)$, which is defined as
\[
\mathbf{h}_{vec}(k) = [\mathbf{h}_{1,1}(k), \ldots, \mathbf{h}_{1,P}(k) \mid \cdots \mid \mathbf{h}_{O,1}(k), \ldots, \mathbf{h}_{O,P}(k)]^t \tag{1}
\]
where $P$ and $O$ are the numbers of transmit and receive antennas, respectively (note that we assume the sampling interval to be $T_s$). The vector $\mathbf{h}_{o,p}(k)$ is the $(o, p)$-th subchannel FIR coefficient vector at time instant $k$, which is given by
\[
\mathbf{h}_{o,p}(k) = [h_{o,p}(-Q_1, k), \ldots, h_{o,p}(q, k), \ldots, h_{o,p}(Q_2, k)] \tag{2}
\]
where $Q_1$ and $Q_2$ are nonnegative integers representing the range of $q$, and $Q = Q_1 + Q_2 + 1$.














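Following [1], the coefficient vector can be generated as
\[
\mathbf{h}_{vec}(k) = \mathbf{C}_h^{1/2}(0)\,\Phi(k) = \left(\Psi_{Rx}^{1/2} \otimes \Psi_{Tx}^{1/2} \otimes \mathbf{C}_{ISI}^{1/2}\right) \Phi(k) \tag{3}
\]
where $\otimes$ denotes the Kronecker product; $\mathbf{X}^{1/2}$ is the square root of matrix $\mathbf{X}$, with $\mathbf{X} = \mathbf{X}^{1/2} \cdot (\mathbf{X}^{1/2})^h$; the matrices $\Psi_{Rx}$ and $\Psi_{Tx}$ are the spatial correlation matrices determined by the receive and transmit antennas, respectively; $\mathbf{C}_{ISI}$ is the inter-tap correlation matrix; and $\Phi(k)$ is an $(OPQ) \times 1$ vector, $\Phi(k) = [Z_1(k), Z_2(k), \ldots, Z_{(OPQ)}(k)]^t$, where $Z_i(k) = Z_{ci}(k) + jZ_{si}(k)$ is one of multiple uncorrelated frequency-flat Rayleigh fading waveforms. Each frequency-flat Rayleigh fading waveform $Z_i(k)$ can be efficiently simulated by the SOS method proposed in [4].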
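For the proposed simulator, the square roots of spatial correlation matrices $\Psi_{Rx}$ [...]
\[
\mathbf{C}_{ISI} =
\begin{pmatrix}
c(-Q_1, -Q_1) & \cdots & c(-Q_1, Q_2) \\
\vdots & \ddots & \vdots \\
c(Q_2, -Q_1) & \cdots & c(Q_2, Q_2)
\end{pmatrix} \tag{4}
\]
\[
c(q_1, q_2) = \sum_{n=1}^{N} \sigma_n^2 \, R_{P_T P_R}(q_1 T_s - \tau_n) \, R^*_{P_T P_R}(q_2 T_s - \tau_n) \tag{5}
\]
where $R_{P_T P_R}(\xi)$ is the convolution of the transmit filter and the receive filter; $N$ is the total number of resolvable paths in the PDP; and $*$ is the conjugate operator. The parameters $\sigma_n$ [...] $\sum_{n=1}^{N} \sigma_n^2 \delta(\tau - \tau_n)$, which are often specified by communication standards such as [5].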
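As an illustration of the sum in (5), the Python sketch below builds the $Q \times Q$ inter-tap correlation matrix from a list of path powers and delays. The sinc pulse is only a hypothetical stand-in for the combined transmit/receive filter response $R_{P_T P_R}(\xi)$, which is not reproduced here, and a real-valued pulse is assumed so that the conjugate has no effect; all function and parameter names are illustrative.

```python
import math


def sinc_pulse(xi, ts):
    """Hypothetical real-valued stand-in for the combined filter response."""
    x = xi / ts
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def c_isi(q1_max, q2_max, ts, powers, delays, pulse=sinc_pulse):
    """Matrix of c(q1, q2) for q1, q2 = -q1_max .. q2_max, following (5).

    powers[n] is sigma_n**2 and delays[n] is tau_n for path n.
    """
    taps = range(-q1_max, q2_max + 1)
    return [
        [
            sum(p * pulse(q1 * ts - tau, ts) * pulse(q2 * ts - tau, ts)
                for p, tau in zip(powers, delays))
            for q2 in taps
        ]
        for q1 in taps
    ]
```

With a single unit-power path at zero delay and this sinc pulse, only the entry $c(0, 0)$ is non-negligible, which gives a quick sanity check of the indexing.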
3 HARDWARE IMPLEMENTATION METHOD
Our proposed hardware simulator can output $\mathbf{h}_{vec}(k)$ in real time. For convenience of description, we re-index the elements of $\mathbf{h}_{vec}(k)$ as
\[
H(l, k) = h_{o,p}(q, k) \tag{6}
\]
where $l = Q \cdot [(o - 1) \cdot P + (p - 1)] + (q + Q_1 + 1)$. Therefore, the vector $\mathbf{h}_{vec}(k)$ can be described as
\[
\mathbf{h}_{vec}(k) = [H(1, k), H(2, k), \ldots, H((OPQ), k)]^t \tag{7}
\]
where $H(l, k)$ is a complex fading coefficient and $H(l, k) = H_c(l, k) + jH_s(l, k)$.
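The index mapping in (6) is easy to get wrong by one, so a small Python sketch of the forward and inverse mapping may help. The function names are hypothetical; antenna indices $o$, $p$ and the flat index $l$ are one-based, and the tap index $q$ runs from $-Q_1$ to $Q_2$, as in the text.

```python
def flat_index(o, p, q, P, Q1, Q2):
    """l = Q * [(o - 1) * P + (p - 1)] + (q + Q1 + 1), with Q = Q1 + Q2 + 1."""
    Q = Q1 + Q2 + 1
    return Q * ((o - 1) * P + (p - 1)) + (q + Q1 + 1)


def unflatten(l, P, Q1, Q2):
    """Inverse mapping: recover (o, p, q) from the one-based flat index l."""
    Q = Q1 + Q2 + 1
    block, tap = divmod(l - 1, Q)
    o, p = divmod(block, P)
    return o + 1, p + 1, tap - Q1
```

For $O = P = 2$, $Q_1 = 1$, and $Q_2 = 2$, the first element $(o, p, q) = (1, 1, -1)$ maps to $l = 1$ and the last element $(2, 2, 2)$ maps to $l = 16 = OPQ$.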
The proposed hardware simulator consists of four major modules, as shown in Fig. 1. The flat Rayleigh fading generator (FRFG) module generates multiple uncorrelated frequency-flat Rayleigh fading waveforms. The $\mathbf{C}_{ISI}^{1/2}$ generator module computes the inter-tap correlation matrix and its square root. The correlation multiplier (CM) module implements the Kronecker product and vector multiplication in hardware to perform (3). The interpolator module linearly interpolates samples with an interpolation rate $R$ to increase the output speed and meet the real-time requirement.
Among the four modules, the $\mathbf{C}_{ISI}^{1/2}$ generator and the CM module are novel hardware implementations proposed in this paper. The FRFG and interpolator modules are similar [...]
The $\mathbf{C}_{ISI}^{1/2}$ generator module consists of two submodules: the $\mathbf{C}_{ISI}$ generator module, which computes the coefficients of $\mathbf{C}_{ISI}$ according to (4) and (5), and the matrix square root (MSR) [...] module is shown in Fig. 2. Counters $q_1$ and $q_2$ are up counters with the same output range, from $-Q_1$ to $Q_2$. Counter $q_1$ increases by one every $(NQ)$ basic clock periods (BCPs), while Counter $q_2$ increases by one every $N$ BCPs. Two buffers store $\sigma_n^2$ and [...] computed using a lookup table (LUT) scheme. We note that $R_{P_T P_R}(\xi)$ is always real-valued, so that $R_{P_T P_R}(\xi) = R^*_{P_T P_R}(\xi)$. In addition, $R_{P_T P_R}(\xi)$ is an even function, so the size of the LUT can be halved by storing only the values of $R_{P_T P_R}(\xi)$ for nonnegative $\xi$. In our implementation, the LUT $R_1$ is a $D_3$-entry LUT that stores the values of $R_{P_T P_R}(\xi)$ for $\xi = (0 : \frac{4T_s}{D_3 - 1} : 4T_s)$. The LUT $R_2$ is a copy of $R_1$. If $|q_1 T_s - \tau_n|$ and $|q_2 T_s - \tau_n|$ are larger than $4T_s$, $R_1$ and $R_2$ output zeros. If not, they are converted into the proper read addresses of $R_1$ and $R_2$. The outputs of $R_1$ and $R_2$ and the corresponding $\sigma_n^2$ are multiplied together. Every $N$ BCPs, the accumulator sums the $N$ previous inputs to obtain one coefficient of $\mathbf{C}_{ISI}$, $c(q_1, q_2)$. The $\mathbf{C}_{ISI}$ generator module sequentially outputs the coefficients of $\mathbf{C}_{ISI}$ in row-wise order.
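The LUT scheme described above (half-size table using even symmetry, zero output beyond $4T_s$) can be mimicked in software as follows. The Gaussian-shaped pulse is a hypothetical placeholder for $R_{P_T P_R}(\xi)$, and rounding to the nearest table address is an assumption, since the text does not state how the read address is quantized.

```python
import math


def gaussian_pulse(xi):
    """Hypothetical even, real-valued placeholder for the filter response."""
    return math.exp(-xi * xi)


def build_lut(pulse, ts, entries):
    """Sample pulse(xi) on xi = 0, 4*ts/(entries-1), ..., 4*ts."""
    step = 4.0 * ts / (entries - 1)
    return [pulse(i * step) for i in range(entries)]


def lut_read(lut, xi, ts):
    """Look up pulse(xi) using even symmetry; return zero beyond 4*ts."""
    mag = abs(xi)                      # even function: only |xi| matters
    if mag > 4.0 * ts:
        return 0.0
    step = 4.0 * ts / (len(lut) - 1)
    return lut[round(mag / step)]      # nearest-address read (assumed)
```

Here `entries` plays the role of $D_3$; a larger table reduces the quantization error of the address at the cost of more memory.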
The MSR module employs the EVD method to find the matrix square root of $\mathbf{C}_{ISI}$ [6]. We note that $\mathbf{C}_{ISI}$ is a symmetric positive definite matrix, whose eigenvalues are always positive. The coefficients of $\mathbf{C}_{ISI}$ are stored in a buffer and sent to the EVD module, which performs the EVD. We employ the Jacobi rotation algorithm to perform the EVD because it is a well-known and accurate method for hardware implementation, the details of which are [...]. The coefficients of $\mathbf{D}_{C_{ISI}}$ sequentially pass through a square [...] $\mathbf{V}^{-1}_{C_{ISI}}$ are computed using two matrix multiplier modules. Eventually, the MSR module sequentially outputs them in row-wise order.
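A floating-point sketch of the same idea, an EVD by Jacobi rotations followed by a square root of the eigenvalues and two matrix products, is given below. It is a generic cyclic Jacobi routine written for illustration, not the hardware algorithm of the paper; it assumes a real symmetric positive semidefinite input, uses $\mathbf{V}^{-1} = \mathbf{V}^t$ for the orthogonal eigenvector matrix, and the function names and tolerances are arbitrary choices.

```python
import math


def jacobi_evd(A, sweeps=50, tol=1e-20):
    """Eigen-decompose a real symmetric matrix with cyclic Jacobi rotations.

    Returns (eigenvalues, V) with the eigenvectors in the columns of V.
    """
    n = len(A)
    A = [row[:] for row in A]
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-300:
                    continue
                diff = A[q][q] - A[p][p]
                theta = (math.pi / 4 if diff == 0.0
                         else 0.5 * math.atan(2.0 * A[p][q] / diff))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):      # A <- A J (columns p and q)
                    akp, akq = A[k][p], A[k][q]
                    A[k][p], A[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):      # A <- J^t A (rows p and q)
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k], A[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):      # V <- V J (accumulate eigenvectors)
                    vkp, vkq = V[k][p], V[k][q]
                    V[k][p], V[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [A[i][i] for i in range(n)], V


def matrix_sqrt(A):
    """Symmetric square root V * diag(sqrt(lambda)) * V^t."""
    lam, V = jacobi_evd(A)
    n = len(A)
    roots = [math.sqrt(max(x, 0.0)) for x in lam]  # clamp rounding noise
    return [[sum(V[i][k] * roots[k] * V[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]
```

Multiplying the returned matrix by itself should reproduce the input up to rounding, which is a convenient check before comparing against fixed-point hardware outputs.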
The CM module incorporates the inter-tap and spatial correlation matrices into multiple uncorrelated frequency-flat Rayleigh fading waveforms. It consists of two submodules [...] $\mathbf{C}_{ISI}^{1/2}$ to obtain [...] $\mathbf{C}_h^{1/2}(0)$, but employs the KP module to compute it in real time. The datapath of the CM [...] $\Psi_{Tx}^{1/2}$ $(P \times P)$, and $\mathbf{C}_{ISI}^{1/2}$ $(Q \times Q)$ in row-wise order. Several counters, multipliers, and adders work together to generate the proper read addresses for the three RAMs. The clock periods of Counters 1−6 are integer multiples of one BCP, while their moduli are related to $O$, $P$, and $Q$. Two multipliers are employed to multiply the outputs of the three RAMs together. Their results are the coefficients of the matrix $\mathbf{C}_h^{1/2}(0)$ in row-wise order.
The VM module takes the multiple uncorrelated Rayleigh fading waveforms $Z_{ci}(Rk)$ and $Z_{si}(Rk)$ from the FRFG module and rearranges their order using two buffers. Taking the buffer storing $Z_{ci}(Rk)$ as an example, the buffer stores the sequence $Z_{c1}(Rk)$, $Z_{c2}(Rk)$, ..., $Z_{c(OPQ)}(Rk)$, repeatedly outputs it $(OPQ)$ times, and then does the same for the next sequence, $Z_{c1}(R(k+1))$, ..., $Z_{c(OPQ)}(R(k+1))$. The outputs of the two buffers are separately multiplied by the coefficients of $\mathbf{C}_h^{1/2}(0)$. Every $(OPQ)$ BCPs, the accumulator sums the $(OPQ)$ previous inputs to obtain one single $H_c(l, Rk)$ or $H_s(l, Rk)$. Therefore, it takes $(OPQ)$ BCPs to generate one single $H_c(l, Rk)$ or $H_s(l, Rk)$, and $(OPQ)^2$ BCPs to generate all $H_c(l, Rk)$ or $H_s(l, Rk)$, where $l$ ranges from 1 to $(OPQ)$ and $k$ is fixed.
Figure 1. Hardware implementation block diagram of the triply selective fading simulator.
4 EXAMPLES AND PERFORMANCE EVALUATION
The discrete-time MIMO triply selective fading simulator was implemented on an
Altera Stratix III EP3SL150F1152C2N FPGA DSP development kit. We used Quartus II
version 8.0, DSP Builder version 5.0, and Matlab Simulink for this development.




[...] $\mathbf{C}_h^{1/2}(0)$ in real time without storing it. An alternative is the pre-compute and store method, in which the matrix $\mathbf{C}_h^{1/2}(0)$ is pre-computed by software and stored in hardware memory for later access. The memory usage of the two methods is compared in Fig. 4. The y-axis represents the memory usage on a logarithmic scale, and the x-axis represents the size of $\mathbf{C}_{ISI}^{1/2}$, $Q$. For fixed $O$ and $P$, the KP method occupies much less hardware memory than the pre-compute and store method, and its memory usage increases slowly as $Q$ increases.
The performance of the simulator was evaluated through a hardware implementation
example with specified parameters. The Matlab simulation and theoretical results with
identical parameters have been reported by [1]. We analyzed the statistical properties of
hardware outputs and compared them to the theoretical ones. The size of Ψ푇푥, Ψ푅푥, and















Figure 3. The datapath of the CM module.
Figure 4. The memory usage comparison of the pre-compute and store method and the proposed KP method.
The PDP was an exponential function for $0 \le \tau_n \le 5$ $\mu$s. The transmit filter was a linearized Gaussian filter with a time-bandwidth product of 0.3, and the receive filter was an SRC filter with a roll-off factor of 0.3. Other implementation parameters were $F_{clock} = 50$ MHz, $T_s = 3.69$ $\mu$s, $f_d T_s = 0.001$, and the interpolation rate $R = 140$.
The proposed hardware simulator met the real-time requirement and output $4.34 \times 10^6$ correlated complex fading coefficients per second. When simulated in Matlab, this fading channel scenario took approximately 1 second to output these coefficients. All outputs have the fixed-point format Q4.14, which is long enough to provide high accuracy and avoid overflow. Based on the hardware outputs, the auto/cross-correlation between several triply selective channels was computed and is depicted in Fig. 5. The matrix $\mathbf{C}_{ISI}$ with $q_1 = -1, 0, 1, 2$ and $q_2 = -1, 0, 1, 2$ was
\[
\mathbf{C}_{ISI} =
\begin{pmatrix}
0.0091 & 0.0426 & 0.0178 & -0.0016 \\
0.0426 & 0.3664 & 0.3407 & 0.0367 \\
0.0178 & 0.3407 & 0.5583 & 0.1414 \\
-0.0016 & 0.0367 & 0.1414 & 0.0602
\end{pmatrix}
\]
Therefore, the three theoretical curves were 0.5583, 0.3407, and −0.1036 multiplied by $J_0[2\pi f_d (k_1 - k_2) T_s]$, respectively. As can be seen, the correlation curves of the hardware outputs matched them very well.
We evaluated the hardware resource usage using an implementation example with $O = P = 4$ and $Q = 10$. The hardware usage is summarized in Table 1, where ALUT denotes adaptive look-up table, DLR denotes dedicated logic register, BM denotes block memory, and DSP denotes the DSP blocks (high-speed 18-bit multipliers). Roughly one third of the ALUT, DLR, and BM resources of one Stratix III FPGA chip were used, and slightly more than half of the DSP multipliers were utilized.
Figure 5. The auto-correlation of $h_{1,1}(1, k)$, the cross-correlation between $h_{1,1}(0, k)$ and $h_{1,1}(1, k)$, and the cross-correlation between $h_{1,1}(0, k)$ and $h_{2,1}(1, k)$, plotted against the normalized time lag $k f_d T_s$. The channel index follows (2). The results are based on hardware outputs of 50 trials with $2.8 \times 10^4$ samples in each channel per trial.
Table 1. Hardware usage of the simulator on a Stratix III EP3SL150F1152C2N FPGA
chip.




퐼푆퐼 Generator 22636 36586 1194944 143
FRFG 11988 231 618743 16
CM 648 1111 10304 10
Other 120 429 96098 25
Total 35392 38357 1920089 194
(31%) (34%) (34%) (51%)
77
5 CONCLUSIONS
A hardware discrete-time MIMO triply selective Rayleigh fading simulator has been implemented on an Altera Stratix III FPGA DSP development kit. The simulator is capable of simulating MIMO triply selective fading channels with all three types of correlation in real time. Evaluation of the simulator outputs shows that they have the expected statistical properties.
6 REFERENCES
[1] C. Xiao, J.X. Wu, S-Y. Leong, Y.R. Zheng, and K. B. Letaief, “A discrete-time model
for triply selective MIMO Rayleigh fading channels,” IEEE Trans. Wireless Commun.,
vol. 3, no. 5, pp. 1678-1688, Sep. 2004.
[2] M. Cui, H. Murata, and K. Araki, “FPGA implementation of 4×4 MIMO test-bed for
spatial multiplexing systems,” in Proc. IEEE ISPIMRC., Barcelona, Spain, Sep. 2004,
pp. 3045-3048.
[3] A. Alimohammad, S.F. Fard, B.F. Cockburn, and C. Schlegel, “A novel technique for
efficient hardware simulation of spatiotemporally correlated MIMO fading channels,”
in Proc. IEEE ICC., Beijing, China, May 2008, pp. 718-724.
[4] Y. R. Zheng and C. Xiao, “Simulation models with correct statistical properties for
Rayleigh fading channels,” IEEE Trans. Commun., vol. 51, no. 6, pp. 920-928, Jun.
2003.
[5] ETSI, “Radio transmission and reception,” GSM 05.05, ETSI EN 300910 V8.5.1, 2000.
[6] G. H. Golub and C. F. Van Loan, Matrix Computations, Baltimore, MD: Johns Hopkins
Univ. Press, 1996.
[7] I. Bravo, P. Jimenez, M. Mazo, J.L. Lazaro, and A. Gardel, “Implementation in FPGAs
of Jacobi method to solve the eigenvalue and eigenvector problem,” in Proc. Interna-




IV. A LOW-COMPLEXITY HARDWARE IMPLEMENTATION
OF DISCRETE-TIME FREQUENCY-SELECTIVE
RAYLEIGH FADING CHANNELS
Fei Ren and Yahong Rosa Zheng
Abstract—A low-complexity hardware implementation method is proposed for discrete-
time frequency-selective Rayleigh fading channels. The proposed method first employs the
Sum-of-Sinusoids method to generate multiple independent flat fading channel responses,
then utilizes a simple weight-delay-sum filtering method to incorporate the fractionally-
delayed multipath rays into inter-tap correlated tap gains. It thus achieves accurate corre-
lation properties in both inter-tap correlation and temporal correlation (or Doppler spec-
trum). The proposed method is implemented by an Altera Stratix II FPGA development
kit and the results show excellent performance match with those by MATLAB software
simulations.
1 INTRODUCTION
Wireless fading channel modeling and simulation provide a low-cost means for testing and verifying transceiver products, designing new algorithms, and analyzing channel capacity. One of the most commonly used models is the Rayleigh fading wide-sense stationary uncorrelated scattering (WSSUS) channel, which is often simulated by one of two methods: the Sum-of-Sinusoids (SoS) method and the Doppler spectrum filtering method [1]. Hardware and software implementations of frequency-flat fading channels have been well studied and reported in, for example, [1, 2, 3, 4] and the references therein. Software implementation of frequency-selective fading channels has also been well investigated [5, 6, 7]. However, hardware simulation of frequency-selective fading channels still presents challenges in computational complexity and simulation accuracy [8, 9]. The most difficult aspect of frequency-selective fading channel simulation is to accurately compute and incorporate the cross-correlation between multiple channel taps in the discrete-time model. Although the WSSUS model assumes multiple uncorrelated rays, the sampled discrete-time channel taps are often correlated due to the bandpass nature of wireless communication systems. Many current hardware implementations fail to consider these correlations, resulting in inaccurate channel characteristics.
In this paper, we propose a simple and elegant method to incorporate inter-tap correlation for hardware implementation of discrete-time frequency-selective fading channels. The proposed method employs the weight-delay-sum filtering method [10] to implement the fractional delays of the multiple WSSUS rays. It combines the weight-delay-sum method with SoS flat fading simulators and ensures low complexity for real-time hardware implementation. The proposed simulation method is implemented on Altera's Stratix II Field Programmable Gate Array (FPGA) development kit. The results closely match those of the MATLAB software implementation. Compared with existing hardware implementation methods, the proposed method offers low computational complexity, a fast data rate, and more accurate waveforms and correlation properties.
2 DISCRETE-TIME FREQUENCY-SELECTIVE FADING
CHANNEL MODELS
The frequency-selective Rayleigh fading channel model is often expressed as the channel impulse response (CIR)
$$h(t,\tau) = \sum_{i=1}^{I} P_i\, g(\tau - \tau_i) \exp[-j(\omega_i (t - \tau_i) - \phi_i)], \tag{1}$$
where $P_i$, $\omega_i$, and $\tau_i$ are the $i$-th multipath gain, angular Doppler frequency, and relative delay, respectively. The pulse shaping filter $g(\tau)$ is a bandpass filter often implemented by a raised cosine filter [1]. The multipath gains $P_i$ are normalized to yield unit total power of the response. It is commonly assumed that the multiple rays in (1) are wide-sense stationary uncorrelated scattering (WSSUS).
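When the delay spread $\tau_{max} - \tau_{min}$ is much smaller than the symbol interval $T_{sym}$, the channel is frequency-flat and the CIR reduces to
$$h(t) = \sum_{i=1}^{I} P_i\, g(t) \exp[-j(\omega_i t - \phi_i)]. \tag{2}$$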
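If sampled at the $T_{sym}$ interval, the discrete-time flat fading channel can be efficiently simulated by several SoS models [1, 3]; a typical one is
$$Z(k) = Z_c(k) + j Z_s(k),$$
$$Z_c(k) = \sqrt{\frac{2}{M}} \sum_{n=1}^{M} \cos(\omega_d k \cos\alpha_n + \phi_n),$$
$$Z_s(k) = \sqrt{\frac{2}{M}} \sum_{n=1}^{M} \cos(\omega_d k \sin\alpha_n + \varphi_n), \tag{3}$$
$$\alpha_n = \frac{2n\pi - \pi + \theta}{4M}, \quad n = 1, 2, \cdots, M,$$
where $\omega_d$ is the maximum angular Doppler frequency, $M$ is the total number of sinusoids, and $j = \sqrt{-1}$. The angle of arrival $\alpha_n$ is randomized by a uniformly distributed $\theta$, and $\phi_n$ and $\varphi_n$ are the random phases of the in-phase and quadrature components, respectively. The random variables $\phi_n$, $\varphi_n$, and $\theta$ are statistically independent and uniformly distributed on $[-\pi, \pi)$ for all $n$.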
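A software sketch of one SoS flat fading ray may help make the roles of $\theta$, $\phi_n$, and $\varphi_n$ concrete. The code below is illustrative only: the $\sqrt{2/M}$ normalization and the $4M$ denominator of $\alpha_n$ are assumptions taken from the SoS model of [3], and the function name `sos_flat_fading` and the per-sample Doppler argument `omega_d` (that is, $2\pi f_d T_s$) are hypothetical.

```python
import math
import random

def sos_flat_fading(num_samples, omega_d, num_sinusoids=16, rng=None):
    """Generate Z(k) = Zc(k) + j*Zs(k), k = 0..num_samples-1, for one flat fading ray.

    omega_d is the maximum angular Doppler frequency per sample (2*pi*fd*Ts).
    """
    rng = rng or random.Random()
    m = num_sinusoids
    theta = rng.uniform(-math.pi, math.pi)
    phi = [rng.uniform(-math.pi, math.pi) for _ in range(m)]      # in-phase phases
    varphi = [rng.uniform(-math.pi, math.pi) for _ in range(m)]   # quadrature phases
    alpha = [(2.0 * math.pi * n - math.pi + theta) / (4.0 * m) for n in range(1, m + 1)]
    cos_a = [math.cos(a) for a in alpha]
    sin_a = [math.sin(a) for a in alpha]
    scale = math.sqrt(2.0 / m)   # assumed normalization: unit power per component
    samples = []
    for k in range(num_samples):
        zc = scale * sum(math.cos(omega_d * k * c + p) for c, p in zip(cos_a, phi))
        zs = scale * sum(math.cos(omega_d * k * s + p) for s, p in zip(sin_a, varphi))
        samples.append(complex(zc, zs))
    return samples

z = sos_flat_fading(20000, 2.0 * math.pi * 0.05, rng=random.Random(1))
power = sum(abs(v) ** 2 for v in z) / len(z)
print(round(power, 2))
```

Precomputing $\cos\alpha_n$ and $\sin\alpha_n$ once per ray mirrors the idea of storing these quantities rather than recomputing them for every sample.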
When the delay spread is comparable to or larger than the symbol interval, the fading channel is frequency-selective and inter-symbol interference often spans multiple symbols. The discrete-time channel taps are then
$$H(l,k) = \sum_{i=1}^{I} P_i\, g(l T_s - \tau_i)\, Z_i(k), \tag{4}$$
where $Z_i(k)$, $i = 1, \cdots, I$, are independent flat fading CIRs generated by (3). However, the multipath delays $\tau_i$ are often fractions of the symbol interval. Sampling the fractional delays at $T_{sym}$ (or at $T_s = T_{sym}/U$, where the upsampling rate is typically $U \in [1, 10]$) results in inter-tap correlation:
$$\cdots\; P_i P_k\, g(l_1 T_s - \tau_i)\, g^{\dagger}(l_2 T_s - \tau_k), \tag{5}$$
Note that $R_{gg}(\xi) = E[g(\tau) g^{\dagger}(\tau + \xi)]$ is the autocorrelation of the bandpass filter $g(\tau)$. The resulting discrete-time power delay profile is shown in Fig. 1.
Several methods have been proposed to incorporate the inter-tap correlation in
frequency-selective channel modeling including the spectrum factorization method [7] and
the correlation matrix factorization method [5, 6]. It has been shown that these meth-
ods yield accurate channel models with low computational complexity in software-based
simulation. However, the evaluation of correlation coefficients, and the spectrum and/or
correlation matrix factorization are costly in hardware implementation. Therefore, we pro-
pose a simple weight-delay-sum filtering method [10] to implement the fractional delays,









where $l_i = \lfloor \tau_i / T_s \rfloor$, and $E_{l,i} = g(l T_s - \tau_i)$ are $T_s$-spaced samples of the delayed bandpass filter, as shown in Fig. 2, where the raised cosine pulse is truncated to $\pm L_g T_s$ with $L_g = 3$.
The simple weight-delay-sum method captures the inter-tap correlation of frequency-selective channels with very low computational complexity. The tradeoff is that it requires $I$ independent flat fading waveforms rather than the $L = 2L_g + 1 + \lceil \tau_{max}/T_s \rceil$ required in the correlation matrix factorization method [5]. In practice, the number of multipath rays $I$ is often slightly larger than the total number of taps $L$.
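The weight-delay-sum idea can be sketched in a few lines of software. This is not the authors' hardware design: it assumes that each tap is formed as $\sum_i P_i E_{l,i} Z_i(k)$ with $E_{l,i} = g(lT_s - \tau_i)$, in line with (4), and the roll-off factor 0.3, the truncation to $\pm L_g T_{sym}$, and the function names are illustrative choices.

```python
import math

def raised_cosine(t, t_sym, beta):
    """Raised cosine pulse g(t) with roll-off beta and zero crossings at multiples of t_sym."""
    x = t / t_sym
    sinc = 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)
    denom = 1.0 - (2.0 * beta * x) ** 2
    if abs(denom) < 1e-12:               # removable singularity at |t| = t_sym / (2*beta)
        return (math.pi / 4.0) * sinc
    return sinc * math.cos(math.pi * beta * x) / denom

def tap_weights(delays, gains, tap_indices, t_s, t_sym, l_g=3, beta=0.3):
    """weights[l][i] = P_i * E_{l,i}, with E_{l,i} = g(l*Ts - tau_i) truncated to +/- l_g*t_sym."""
    weights = []
    for l in tap_indices:
        row = []
        for tau, p in zip(delays, gains):
            t = l * t_s - tau
            e = raised_cosine(t, t_sym, beta) if abs(t) <= l_g * t_sym else 0.0
            row.append(p * e)
        weights.append(row)
    return weights

def weight_delay_sum(weights, z):
    """Tap gains H(l, k) at one time index k from the flat fading samples z[i] = Z_i(k)."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in weights]

# Two illustrative rays: one on the sampling grid, one delayed by 0.7*T_sym.
w = tap_weights(delays=[0.0, 0.7], gains=[0.8, 0.6], tap_indices=range(-3, 5),
                t_s=1.0, t_sym=1.0)
h = weight_delay_sum(w, z=[1 + 1j, 0.5 - 0.2j])
```

A ray that falls exactly on the sampling grid contributes to a single tap, while the fractionally delayed ray leaks into several neighboring taps; this sharing of one $Z_i(k)$ across taps is what produces the inter-tap correlation.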




























































Figure 1. (a) A typical urban channel PDP with multiple WSSUS rays. (b) Average power per tap of the $T_s$-spaced discrete-time channel response.










[Figure 2 plot: transmitter pulse shaping filter and the waveform after a right shift of $\tau_i = 0.7\,T_{sym}$]




For real-time hardware implementation, frequency-selective channel waveforms must be generated fast enough to compute the channel output
$$y(k) = \sum_{l} H(l,k) \cdot x(k - l) + v(k), \tag{7}$$
where $x(k)$ is the transmitted signal and $v(k)$ is the background white Gaussian noise. If the symbol interval is $T_{sym} = 1\ \mu s$ and the upsampling rate is $U = 10$, then $L$ samples of $H(l,k)$ are needed every $T_s = 0.1\ \mu s$, where $L$ is on the order of tens. This requirement is stringent for sample-by-sample processing. However, in modern communication systems, block transmission is often employed and the channel response is often slowly time-varying. We exploit this feature and propose an efficient implementation with block processing.
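The input-output relation (7) is a time-varying FIR filter: at every sample a fresh set of tap gains is applied to the most recent inputs. A minimal sketch follows, assuming causal taps indexed from 0 and zero input before the first sample; the function name `apply_channel` is illustrative.

```python
def apply_channel(taps, x, noise=None):
    """Compute y(k) = sum_l taps[k][l] * x(k - l) + v(k).

    taps[k] holds the tap gains H(l, k) for l = 0..L-1 at time index k;
    inputs before k = 0 are treated as zero.
    """
    y = []
    for k in range(len(x)):
        acc = 0j
        for l, gain in enumerate(taps[k]):
            if k - l >= 0:
                acc += gain * x[k - l]
        if noise is not None:
            acc += noise[k]
        y.append(acc)
    return y

# A two-tap channel held constant over three samples.
y = apply_channel([[1.0, 0.5]] * 3, [1.0, 2.0, 3.0])
```

With $T_s = 0.1\ \mu s$ the inner loop would have to finish $L$ multiply-accumulates every 100 ns, which illustrates why sample-by-sample generation of all taps is demanding.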
The proposed hardware implementation scheme consists of three major blocks: a parameter generator bank, a flat fading generator, and a selective fading generator module, as shown in Fig. 3, where MUX is a multiplexer. The parameter generator bank generates and stores all random variables needed for each of the $I$ WSSUS rays. These include the random phase vectors $\Phi_i = [\phi_{1,i}, \phi_{2,i}, \cdots, \phi_{M,i}]$ and $\Psi_i = [\varphi_{1,i}, \varphi_{2,i}, \cdots, \varphi_{M,i}]$, the maximum Doppler frequencies $\omega_{d,i}$, the random phases $\theta_i$, and the power delay profile vectors $\mathbf{P} = \{P_i\}$ and $\mathbf{D} = \{\tau_i\}$. The parameter generator bank also computes and stores the quantities $\cos\alpha_{n,i}$ and $\sin\alpha_{n,i}$ for all $n$ and $i$. The multiplexer selects the parameters of the $i$-th ray and sends them to the flat fading generator in series. The flat fading generator generates the real and imaginary components of the $i$-th flat fading channel response according to (3) and outputs $Z_{ci}(k)$ and $Z_{si}(k)$ to two buffers of the selective fading generator. When the $k$-th flat fading samples of all $I$ rays are ready in the buffers, the selective fading generator processes them with the weight-delay-sum filtering method according to (6).
Figure 3. Block diagram of FPGA implementation of the frequency-selective Rayleigh
fading simulator.
The implementation of the parameter generator bank is straightforward: it uses several uniform random number generators, and the sine and cosine functions are generated by look-up tables (LUTs). The flat fading generator is implemented as in Fig. 4, where $M$ cosine functions are summed in series to generate the real/imaginary component of the fading response. Different fixed-point data formats are used for different parameters according to the precision they require. For example, the random phase/Doppler parameters use the format (3:20), the number $M$ uses (2:10), the time index $k$ uses (21:30), and the channel responses use (3:20). Thus, the accuracy of the output can reach $2^{-20} \approx 10^{-6}$.
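The accuracy figure can be checked with a small quantization sketch. The text does not define the (3:20) notation, so the code assumes it means 3 integer bits (including the sign bit) and 20 fractional bits with saturation; both the interpretation and the function name `quantize` are assumptions.

```python
def quantize(value, int_bits=3, frac_bits=20):
    """Round value to a signed fixed-point grid with frac_bits fractional bits, saturating."""
    scale = 1 << frac_bits
    lowest = -(1 << (int_bits - 1 + frac_bits))        # most negative code
    highest = (1 << (int_bits - 1 + frac_bits)) - 1    # most positive code
    code = max(lowest, min(highest, round(value * scale)))
    return code / scale

step = 2.0 ** -20                       # one least significant bit
error = abs(quantize(0.123456789) - 0.123456789)
print(step, error <= step / 2)
```

Under these assumptions the representable range is $[-4, 4)$ and the rounding error never exceeds half a step, consistent with an accuracy on the order of $10^{-6}$.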
The selective fading generator is the core module of the simulator and its structure is shown in Fig. 5. The $i$-th flat fading channel response is multiplied by its gain $P_i$ according to the PDP specifications prior to being stored in the buffers. The weights $E_{l,i} = g(l T_s - \tau_i)$ are computed through multiple LUTs, which store the raised cosine pulse for $\tau = [-L_g T_{sym} : L_g T_{sym}]$ at a high resolution. The LUTs take the delay parameter $D_i = \tau_i$ as input and output the corresponding weights $E_{l,i}$ to the multipliers (MUL). Multiple MULs are used to weight the corresponding flat fading rays in parallel. The accumulators implement the summation of (6) and output a block of $H_c(l,k)$ and $H_s(l,k)$ in parallel.
Figure 4. FPGA implementation of the flat fading generator module.
4 IMPLEMENTATION EXAMPLE AND PERFORMANCE
EVALUATION
The proposed frequency-selective fading channel simulator was implemented on an Altera Stratix II FPGA/DSP development kit. We used Quartus II version 8.0 and DSP Builder version 5.0 for this development. DSP Builder provides a convenient interface between the FPGA hardware and MATLAB Simulink, so the channel specification parameters were easily input to the channel simulator and the simulator outputs were logged to data files in Simulink.
Figure 5. FPGA implementation of frequency-selective fading generator module.
As an example, results for a typical urban channel model with 20 WSSUS rays are presented here. The implementation parameters were: the number of sinusoids $M = 16$, the upsampling rate $U = 10$, an output block size of 10 × 1 per accumulator, and the channel length $L = 60$ (in terms of $T_s$). With the clock period of the FPGA chip set to 20 ns, the simulator meets the real-time requirement for a symbol interval of $T_{sym} = 6.4\ \mu s$. The logic utilization of the single FPGA chip was 33%, including 15704 (31%) combinational ALUTs and 1383 (2%) dedicated logic registers. The total block memory occupied was 822484 bits (32%). The proposed low-complexity hardware implementation occupies less than one third of the resources of the single FPGA chip.
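One possible reading of the real-time figure, not stated explicitly in the text, is a clock-cycle budget: a 6.4 μs symbol interval at a 20 ns clock leaves 320 cycles per symbol, which equals the number of cosine terms if the 20 rays and 16 sinusoids are processed serially at one term per cycle. The short check below works through that arithmetic; the one-term-per-cycle assumption is hypothetical.

```python
CLOCK_PERIOD_NS = 20      # FPGA clock period from the text
T_SYM_NS = 6400           # symbol interval of 6.4 microseconds
NUM_RAYS = 20             # WSSUS rays in the example
NUM_SINUSOIDS = 16        # sinusoids per ray (M)

cycles_per_symbol = T_SYM_NS // CLOCK_PERIOD_NS
serial_cosine_terms = NUM_RAYS * NUM_SINUSOIDS    # assumes one cosine term per cycle

print(cycles_per_symbol, serial_cosine_terms)
```

If this reading is right, the in-phase and quadrature sums would need to run in parallel, since each of them alone consumes the whole budget.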
The performance of the hardware simulator was evaluated by its output waveforms. First, the auto- and cross-correlations of the flat fading generator outputs $Z_{ci}(k)$ and $Z_{si}(k)$ were computed by averaging over five trials, each of which generated $2 \times 10^{6}$ samples. The results are shown in Fig. 6.
The cross-correlations between $H_c(l,k)$ and $H_c(l + (1, 2, 4), k)$ are shown in Fig. 7. When the accuracy of the MATLAB simulations is set to $10^{-6}$, the same as the accuracy of the FPGA outputs, all FPGA outputs match the MATLAB simulations very well.
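Correlation curves of this kind are typically obtained as time averages over each trial and then averaged across trials. The sketch below shows one such estimator; the exact estimator used for the figures is not given in the text, and the function names are illustrative.

```python
import cmath

def correlation(a, b, max_lag):
    """Time-average estimate of E[a(k + lag) * conj(b(k))] for lag = 0..max_lag."""
    estimates = []
    for lag in range(max_lag + 1):
        terms = [a[k + lag] * b[k].conjugate() for k in range(len(a) - lag)]
        estimates.append(sum(terms) / len(terms))
    return estimates

def average_over_trials(per_trial):
    """Average per-trial correlation estimates lag by lag."""
    count = len(per_trial)
    return [sum(trial[lag] for trial in per_trial) / count
            for lag in range(len(per_trial[0]))]

# Sanity check on a complex exponential, whose autocorrelation is exp(j*w*lag).
tone = [cmath.exp(1j * 0.1 * k) for k in range(100)]
r = correlation(tone, tone, 5)
r_avg = average_over_trials([r, r])
```

Passing the same sequence twice gives an auto-correlation, and passing two different sequences gives a cross-correlation.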
















Figure 6. Autocorrelation and cross-correlation of the $i$-th flat fading ray sampled at the $T_{sym}$ interval. The normalized Doppler frequency was $f_d T_{sym} = 0.0008$ and $f_d = 125$ Hz.









[Figure 7 plot: cross-correlation between $H_c(l,k)$ and $H_c(l+1,k)$, $H_c(l+2,k)$, and $H_c(l+4,k)$ versus normalized time lag $k f_d T_s$]




A low-complexity FPGA implementation of frequency-selective Rayleigh fading channels has been proposed, which employs simple weight-delay-sum processing to incorporate the inter-tap correlation of discrete-time channel models. The proposed simulator has been implemented on Altera's Stratix II development kit. The results of the hardware simulator match those of the software simulation. The advantages of the proposed simulator include its flexibility for parameter changes and its simple, compact implementation.
6 REFERENCES
[1] W.C. Jakes, Microwave Mobile Communications, Piscataway, NJ: Wiley-IEEE Press,
1994.
[2] C.S. Patel, G.L. Stuber, and T.G. Pratt, “Comparative analysis of statistical models
for the simulation of Rayleigh faded cellular channels,” IEEE Trans. Commun., vol.
53, no. 6, pp. 1017-1026, Jun. 2005.
[3] Y.R. Zheng and C. Xiao, “Improved models for the generation of multiple uncorrelated
Rayleigh fading waveforms,” IEEE Commun. Lett., vol. 6, no. 6, pp.
256-258, Jun. 2002.
[4] A. Alimohammad, S.F. Fard, B.F. Cockburn, and C. Schlegel, “A compact single-
FPGA fading-channel simulator,” IEEE Trans. Circuits and Systems II: Express Briefs,
vol. 55, no. 1, pp. 84-88, Jan. 2008.
[5] C. Xiao, J.X. Wu, S-Y. Leong, Y.R. Zheng, and K.B. Letaief, “A discrete-time model
for triply selective MIMO Rayleigh fading channels,” IEEE Trans. Wireless Commun.,
vol. 3, no. 5, pp. 1678-1688, Sep. 2004.
[6] K.-W. Yip and T.-S. Ng, “Efficient simulation of digital transmission over WSSUS
channels,” IEEE Trans. Commun., vol. 43, no. 12, pp. 2907-2913, Dec. 1995.
[7] A. Abdi, “Stochastic modeling and simulation of multiple-input multiple-output chan-
nels: A unified approach,” in Proc. IEEE Intl. Symp. Antennas Propagat., Monterey,
CA, Jun. 2004, pp. 3673-3676.
[8] M. Kahrs and C. Zimmer, “Digital signal processing in a real-time propagation sim-
ulator,” IEEE Trans. Instrumentation and Measurement, vol. 55, no. 1, pp. 197-205,
Feb. 2006.
[9] M.A. Wickert and J. Papenfuss, “Implementation of a real-time frequency-selective RF
channel simulator using a hybrid DSP-FPGA architecture,” IEEE Trans. Microwave
Theory and Techniques, vol. 49, no. 8, pp. 1390-1397, Aug. 2001.
[10] T.I. Laakso, V. Valimaki, M. Karjalainen, and U.K Laine, “Splitting the unit delay




V. VALIDATION OF THE TRIPLY SELECTIVE FADING CHANNEL
MODEL THROUGH A MIMO TEST BED
AND EXPERIMENTAL RESULTS
Saurav Subedi, Huang Lou, Fei Ren, Mingxi Wang, and Y. R. Zheng
Abstract—Multiple-input multiple-output (MIMO) channel is often triply selective, mean-
ing that it has spatial, temporal and inter-tap correlation. The temporal correlation is well
characterized by its Doppler spectrum, but spatial and inter-tap correlation and their im-
pact on MIMO channels are less studied in the literature. A MIMO testbed has been
established to measure the impulse response of MIMO channels and an estimation method
is developed to quantitatively measure the correlation matrices from experimental data.
1 INTRODUCTION
The multiple-input multiple-output (MIMO) channel is analyzed as a triply selective fading channel in the existing literature [1], [2]. This model accounts for the space-selective, time-selective, and frequency-selective nature of MIMO channels. It is shown in [1] that the correlation between channel coefficients of the discrete-time MIMO channel can be written as a Kronecker product of the temporal correlation, inter-tap correlation, and spatial correlations. It is argued in [2] that this model is not accurate and that the Kronecker product for the spatial correlations does not, in general, hold for frequency-selective channels. The underlying assumptions in [1] are clarified in [3], which draws firm conclusions affirming the accuracy of this discrete-time model for MIMO triply selective fading channels. A general space-time cross-correlation function incorporating a wide range of parameters of the MIMO fading channel is proposed in [4]. In [5], vector autoregressive (AR) stochastic models are proposed to simulate multiple cross-correlated Rician fading channels. The joint effect of spatial and temporal correlation is studied in [6], and an analysis of the ergodic capacity of a MIMO channel is presented based on the transmit and receive antenna correlation matrices.
This paper validates the triply selective fading channel model through experimental results. We verify the results through the decomposition of the channel coefficient covariance matrix into its Kronecker factors. Approaches for decomposing a Kronecker product into its components are suggested in [7]. However, those methods are applicable only to real matrices. In this paper, we propose a method for approximating the factors of a Kronecker product, real or complex. Experimental data from a MIMO testbed is used to estimate the channel impulse response (CIR) and to quantitatively estimate the spatial and inter-tap correlation matrices.
This paper is organized as follows. Section 2 reviews the discrete-time MIMO triply selective fading channel model. Section 3 describes the MIMO testbed and the experimental setup. Section 4 details the channel estimation procedure, the correlation matrix estimation procedure, and the results and analysis. Finally, Section 5 concludes the paper.
2 DISCRETE-TIME TRIPLY SELECTIVE FADING MODEL





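The input-output relationship of the discrete-time MIMO channel is
$$\mathbf{y}(k) = \sum_{q=-Q_1}^{Q_2} \mathbf{H}(k,q) \cdot \mathbf{x}(k - q) + \mathbf{v}(k), \tag{1}$$
where $k$ is the time index, $Q_1$ and $Q_2$ are non-negative integers representing the range of delay taps, yielding the total channel length $Q = Q_1 + Q_2 + 1$, $\mathbf{x}(k) = [x_1(k), x_2(k), \ldots, x_P(k)]^t$ is the transmitted signal vector, $\mathbf{y}(k) = [y_1(k), y_2(k), \ldots, y_O(k)]^t$ is the received vector, and $\mathbf{v}(k) = [v_1(k), v_2(k), \ldots, v_O(k)]^t$ is the additive white Gaussian noise. The superscript $(\cdot)^t$ denotes the matrix transpose. The channel matrix is
$$\mathbf{H}(k,q) = \begin{pmatrix} h_{1,1}(k,q) & \cdots & h_{1,P}(k,q) \\ \vdots & \ddots & \vdots \\ h_{O,1}(k,q) & \cdots & h_{O,P}(k,q) \end{pmatrix}. \tag{2}$$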
We reshape the matrix $\mathbf{H}(k,q)$ into an $(OPQ) \times 1$ coefficient vector as
$$\mathbf{h}_{vec}(k) = [\mathbf{h}_{1,1}(k), \ldots, \mathbf{h}_{1,P}(k) \mid \cdots \mid \mathbf{h}_{O,1}(k), \ldots, \mathbf{h}_{O,P}(k)]^t, \tag{3}$$
where $\mathbf{h}_{o,p}(k)$ is the coefficient vector of the $(o,p)$-th sub-channel, given by $\mathbf{h}_{o,p}(k) = [h_{o,p}(-Q_1, k), \ldots, h_{o,p}(Q_2, k)]$.
It is stated in [1] that the stochastic fading channel coefficient vector $\mathbf{h}_{vec}(k)$ is zero-mean Gaussian distributed and that its covariance matrix $\mathbf{R}$ is given by
$$\mathbf{R} = E[\mathbf{h}_{vec}(k_1) \cdot \mathbf{h}_{vec}^{H}(k_2)] = (\Psi_{Rx} \otimes \Psi_{Tx} \otimes \Psi_{Tap}) \cdot J_0[2\pi f_d (k_1 - k_2) T_s], \tag{4}$$
where $(\cdot)^H$ denotes the Hermitian operator, $\otimes$ denotes the Kronecker product, $\Psi_{Rx}$ and $\Psi_{Tx}$ are the spatial correlation matrices at the receiver and transmitter, respectively, and $\Psi_{Tap}$ is the intertap covariance matrix. These matrices are defined in (5), (6), and (7). The factor $J_0[2\pi f_d (k_1 - k_2) T_s]$ describes the temporal correlation, where $f_d$ is the maximum Doppler frequency.





























$$\Psi_{Tap} = \begin{pmatrix} \psi(-Q_1, -Q_1) & \cdots & \psi(-Q_1, Q_2) \\ \vdots & \ddots & \vdots \\ \psi(Q_2, -Q_1) & \cdots & \psi(Q_2, Q_2) \end{pmatrix}, \tag{7}$$
In (5) and (6), $\rho_{Rx}(m,p)$ is the receive correlation coefficient between receive antennas $m$ and $p$. Similarly, $\rho_{Tx}(n,q)$ is the transmit correlation coefficient between transmit antennas $n$ and $q$. The elements of the intertap covariance matrix are determined according to the power delay profile.
This paper focuses on the validation of the triply selective fading channel model using (4) through the estimation of the spatial correlation matrices and the intertap covariance matrix.
3 TESTBED AND EXPERIMENT
A 2×2 MIMO-OFDM testbed has been developed using the Altera Stratix III EP3SL150F field-programmable gate array (FPGA) DSP development kit. The discrete-time MIMO triply selective fading channel model in [1] is the basis for the design of this testbed. A hardware implementation of discrete-time MIMO triply selective fading channel emulators is proposed in [8].
On the transmitter side, two independent data streams are generated in the Stratix III development kit. The outputs are up-converted to an intermediate frequency (IF) of 17.5 MHz and then fed into the radio frequency (RF) block, RF2-3000UCV1, to be transmitted at 915 MHz. An MPA-10-40 is used for power amplification. The AFG3252 and FS725 devices are the clock sources for all other devices. The transmitter setup architecture is shown in Fig. 1.
Figure 1. Transmitter setup architecture.
On the receiver side, RF signals are first down-converted to a 70 MHz IF by the down-converter block, RF200-2500RV1. Baseband data streams are then generated and recorded in the Stratix III development kit and transferred to a PC. The AFG3252 and FS725 devices provide the clock sources. The receiver setup architecture is shown in Fig. 2.
Figure 2. Receiver setup architecture.
A bandwidth configuration of 3.90 MHz is used in this testbed. The number of OFDM subcarriers is 256 and a cyclic prefix length of 64 samples is used. The experiment has been carried out using three different modulation schemes: QPSK, 8PSK, and 16QAM. These subcarriers are used for channel sounding. Although BPSK is sufficient for channel sounding, the transceiver was originally designed for MIMO communications, so QPSK, 8PSK, and 16QAM are used. Measurements are made for two different experimental setups: one with both the transmitter and receiver located in the same room (room 208), and the other with the transmitter and receiver located in two different rooms (208 and 212) across a hallway, as shown in the floor plan in Fig. 3.
Figure 3. Floorplan used for the experiment.
Experimental data from this testbed is used for channel estimation and subsequent
analysis.
4 PROCEDURE, RESULTS AND ANALYSIS
4.1 Channel Estimation
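The time-domain least squares (LS) method is used to estimate the channel impulse response (CIR) of each subchannel of the 2×2 MIMO system based on the known training sequence,
$$\hat{\mathbf{h}} = (\mathbf{X}^{H} \mathbf{X})^{-1} \mathbf{X}^{H} \mathbf{y}, \tag{8}$$
where $(\cdot)^H$ and $(\cdot)^{-1}$ represent the Hermitian and inverse operations, respectively, and $\mathbf{X}$ is the training-symbol matrix with rows of the form
$$\begin{pmatrix} x_Q & \cdots & x_1 & x_0 \end{pmatrix}, \tag{9}$$
where Q is the number of channel taps and P is the number of pilot data for each antenna.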
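To make the LS step concrete, the sketch below estimates a short single-antenna CIR from a known training sequence by solving the normal equations $(\mathbf{X}^H\mathbf{X})\hat{\mathbf{h}} = \mathbf{X}^H\mathbf{y}$. It illustrates the general technique rather than the testbed code: the layout of the training matrix, the QPSK training symbols, the three-tap channel, and all function names are assumptions.

```python
import random

def convolution_matrix(x, num_taps):
    """Rows [x[n], x[n-1], ..., x[n-num_taps+1]] for n = num_taps-1 .. len(x)-1."""
    return [[x[n - q] for q in range(num_taps)] for n in range(num_taps - 1, len(x))]

def solve(a, b):
    """Solve the square system a * h = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    h = [0j] * n
    for r in range(n - 1, -1, -1):
        partial = sum(m[r][c] * h[c] for c in range(r + 1, n))
        h[r] = (m[r][n] - partial) / m[r][r]
    return h

def ls_estimate(x, y, num_taps):
    """Least squares CIR estimate h = (X^H X)^{-1} X^H y."""
    big_x = convolution_matrix(x, num_taps)
    y_used = y[num_taps - 1:]          # received samples aligned with the rows of X
    rows = range(len(big_x))
    gram = [[sum(big_x[r][i].conjugate() * big_x[r][j] for r in rows)
             for j in range(num_taps)] for i in range(num_taps)]
    rhs = [sum(big_x[r][i].conjugate() * y_used[r] for r in rows)
           for i in range(num_taps)]
    return solve(gram, rhs)

rng = random.Random(0)
x = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(40)]
h_true = [1 + 0.5j, -0.3j, 0.2 + 0j]
y = [sum(h_true[q] * x[n - q] for q in range(3) if n - q >= 0) for n in range(len(x))]
h_hat = ls_estimate(x, y, 3)
```

In this noiseless example the estimate reproduces the assumed taps; with measured data the same computation would be repeated for each estimation window.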
A long probing sequence is transmitted and the CIRs are estimated progressively using cascading windows of size $N_p = 120$ symbols. An example of 30-tap CIRs of the four subchannels is shown in Fig. 4, where 80 cascading windows are used across the length of the transmitted data sequence. Although the signal bandwidth is only 3.9 MHz, the baseband equivalent channel did experience a multipath delay spread spanning 30 taps. This is because both the transmitter and receiver antennas were placed very low, only a meter above the floor. This differs from the case where one end is placed very high, as at a base station, where multipath may not be significant. This demonstrates the difference between the mobile-to-mobile channel and the base-station-to-mobile channel. The number of windows can be increased, or overlapping windows can be used, for the estimation of CIRs of highly scattering channels.
































































































Figure 4. Magnitudes of channel impulse responses for four subchannels.
4.2 Estimation of the Channel Coefficient Covariance Matrix
The channel coefficient covariance matrix is calculated using the estimated channel coefficients. The $(OPQ \times OPQ)$ covariance matrix $\mathbf{R}$ is calculated using (4). The estimated covariance matrix is shown in Fig. 5.



















Figure 5. Magnitude of estimated channel coefficient covariance matrix.
4.3 Decomposition of the Kronecker Product
The Kronecker product of two matrices $\mathbf{A}$ and $\mathbf{B}$ is defined as
$$\mathbf{C} = \mathbf{A} \otimes \mathbf{B} = \begin{pmatrix} a_{11}\mathbf{B} & \cdots & a_{1n}\mathbf{B} \\ \vdots & \ddots & \vdots \\ a_{m1}\mathbf{B} & \cdots & a_{mn}\mathbf{B} \end{pmatrix}, \tag{10}$$
where $\mathbf{A}$ is an $(m \times n)$ matrix, $\mathbf{B}$ is a $(p \times q)$ matrix, and $\mathbf{C}$, the resultant Kronecker product, is of size $(mp \times nq)$.
The problem at hand is to find estimates of $\mathbf{A}$ and $\mathbf{B}$ from a given Kronecker product $\mathbf{C}$. Consider the first block of elements of $\mathbf{C}$, say $\mathbf{C}_{11}$, which is a $(p \times q)$ matrix given by
$$\mathbf{C}_{11} = a_{11}\mathbf{B}. \tag{11}$$
If we average all the elements of $\mathbf{C}_{11}$, the result is the product of the scalar $a_{11}$ and the mean of all the elements of $\mathbf{B}$, as shown in (12):
$$E[\mathbf{C}_{11}] = a_{11} E[\mathbf{B}]. \tag{12}$$
This isolates the first element of $\mathbf{A}$ from the Kronecker product. We repeat the same process to obtain the other elements of $\mathbf{A}$. The resulting estimate of $\mathbf{A}$ is therefore a scaled version of the actual $\mathbf{A}$ and retains its spatial properties.
In this paper, we estimate the spatial correlation matrix $\Psi_{Trx}$ from the channel coefficient covariance matrix $\mathbf{R}$ using the method explained in (12).
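The block-averaging step in (11) and (12) can be sketched as follows. The example matrices are arbitrary illustrative values, and the function names `kron` and `estimate_left_factor` are hypothetical; the code only demonstrates that averaging each block recovers $\mathbf{A}$ up to the common scale factor $E[\mathbf{B}]$.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def estimate_left_factor(c, m, n, p, q):
    """Average each (p x q) block of c; block (i, j) averages to a[i][j] * mean(B)."""
    estimate = []
    for i in range(m):
        row = []
        for j in range(n):
            block = [c[i * p + k][j * q + l] for k in range(p) for l in range(q)]
            row.append(sum(block) / len(block))
        estimate.append(row)
    return estimate

a = [[1.0, 0.5], [0.5, 1.0]]     # illustrative left factor
b = [[2.0, 1.0], [1.0, 3.0]]     # illustrative right factor; its mean is 1.75
c = kron(a, b)
a_est = estimate_left_factor(c, 2, 2, 2, 2)
```

The estimate is only defined up to scale, and the approach presumes that the mean of the elements of $\mathbf{B}$ is not close to zero; otherwise every block average collapses toward zero.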
4.4 Estimation of Intertap Covariance Matrix and Spatial Correlation Matrix
We estimate the $(Q \times Q)$ intertap covariance matrices for each subchannel. Using the correlation matrix distance (CMD) as a metric [10], we show that these intertap covariance matrices have identical spatial structure. The CMD, the distance between two correlation matrices $\mathbf{R}_1$ and $\mathbf{R}_2$, is defined as
$$d_{corr}(\mathbf{R}_1, \mathbf{R}_2) = 1 - \frac{tr\{\mathbf{R}_1 \mathbf{R}_2\}}{\|\mathbf{R}_1\|_f \|\mathbf{R}_2\|_f}, \tag{13}$$
where $tr\{\cdot\}$ represents the trace of the matrix and $\|\cdot\|_f$ is the Frobenius norm. The CMD becomes zero if the correlation matrices are equal up to a scaling factor and one if they differ to the maximum extent. A small value thus verifies that the matrices are spatially identical. Results are summarized in Table 1 for data obtained from different experimental setups.
These results comply with the assumption in [3] that the power delay profile of the physical channel model is identical for all transmit and receive antenna indices. We compute an average intertap covariance matrix and use it as one of the Kronecker factors of the channel coefficient covariance matrix to estimate the spatial correlation matrix.
Table 1. Comparison of correlation matrices using CMD.
Experi.       Atten.   CMD       CMD       CMD       CMD       CMD       CMD
Setup         (dB)     R11,R12   R11,R21   R11,R22   R21,R12   R21,R22   R,Rverify
In 208        22       0.0259    0.0129    0.0959    0.0259    0.0713    0.0196
In 208        26       0.0255    0.0169    0.5789    0.0255    0.0744    0.0966
In 208        30       0.0159    0.0192    0.3133    0.0159    0.0647    0.0465
208 and 212   2        0.2550    0.0822    0.1003    0.2550    0.0908    0.0942
208 and 212   6        0.0751    0.0230    0.0785    0.0751    0.0569    0.0355
208 and 212   10       0.1208    0.0244    0.0690    0.1208    0.1433    0.0403
The spatial similarity among the intertap covariance matrices of the four subchannels can be observed in Fig. 6. Fig. 7 shows the average intertap covariance matrix, $\Psi_{Tap}$.
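The CMD of (13) is straightforward to compute. The sketch below is a direct transcription of the definition for small matrices stored as lists of rows; the example matrices are arbitrary and the function names are illustrative.

```python
import math

def frobenius_norm(r):
    """Frobenius norm: square root of the sum of squared magnitudes of all entries."""
    return math.sqrt(sum(abs(v) ** 2 for row in r for v in row))

def cmd(r1, r2):
    """Correlation matrix distance 1 - tr{R1 R2} / (||R1||_f ||R2||_f)."""
    n = len(r1)
    trace = sum(r1[i][j] * r2[j][i] for i in range(n) for j in range(n))
    return 1.0 - complex(trace).real / (frobenius_norm(r1) * frobenius_norm(r2))

r = [[2.0, 0.5 + 0.5j], [0.5 - 0.5j, 1.0]]            # Hermitian example matrix
scaled = [[3 * v for v in row] for row in r]          # same structure, different scale
print(cmd(r, scaled), cmd([[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]))
```

The two calls cover the extreme cases described in the text: a matrix compared with a scaled copy of itself, and two matrices with no common structure.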
The elements of the spatial correlation matrix are estimated from the channel coefficient covariance matrix $\mathbf{R}$. The process in (12) gives a matrix spatially identical to $\Psi_{Trx}$. We then calculate the Kronecker product of the estimated spatial correlation matrix $\Psi_{Trx}$ and the average intertap covariance matrix $\Psi_{Tap}$ using
$$\mathbf{R}_{verify} = \Psi_{Trx} \otimes \Psi_{Tap} \tag{14}$$
to validate the approach used for the decomposition of the Kronecker product. Since the transmitter-receiver setup in this experiment was static, the temporal correlation does not have a significant impact on the results. The CMD metric is used to compare the similarity between the channel coefficient covariance matrices calculated using (11) and (14). Results for six different instances are shown in Table 1. The matrices $\mathbf{R}_{ij}$ are the correlation matrices of the $ij$-th subchannel. Fig. 8 shows the channel coefficient covariance matrix estimated using (14). $\Psi_{Trx}$, the $(4 \times 4)$ spatial correlation matrix,






















































































Figure 6. Magnitudes of intertap covariance matrices for each subchannel.
5 CONCLUSIONS
In this paper, we validated the triply selective fading channel model through a MIMO testbed and experimental results. The experimental results demonstrate that the discrete-time triply selective fading channel can be expressed as separable temporal, inter-tap, and spatial correlations. Using the correlation matrix distance as a metric, we show that the intertap correlations of all the subchannels are spatially identical. This permits the estimation of the spatial correlation matrices through the decomposition of the Kronecker product; the approach is checked by forming the Kronecker product of the estimated correlation matrices and comparing the result with the covariance matrix obtained directly from the estimated channel coefficients.
Figure 8. Kronecker product of estimated $\Psi_{Trx}$ and $\Psi_{Tap}$.
6 REFERENCES
[1] C. Xiao, J.X. Wu, S-Y. Leong, Y.R. Zheng, and K. B. Letaief, “A discrete-
time model for triply selective MIMO Rayleigh fading channels,” IEEE Trans.
Wireless Commun., vol. 3, no. 5, pp. 1678-1688, Sep. 2004.
[2] J. Mietzner and P. Hoeher, “A rigorous analysis of the statistical properties of
the discrete-time triply-selective MIMO Rayleigh fading channel model,” IEEE
Trans. Wireless Commun., vol. 6, no. 12, pp. 4199-4203, 2007.
[3] J. Mietzner, C. Xiao, P. Hoeher, and K. Ben Letaief, “A note on discrete-time
triply-selective MIMO Rayleigh fading channel models,” IEEE Trans. Wireless
Commun., vol. 7, no. 3, pp. 837, 2008.
[4] A. Abdi and M. Kaveh, “A space-time correlation model for multielement an-
tenna systems in mobile fading channels,” IEEE Journal. Selected Areas in Com-
munications, vol. 20, no. 3, pp. 550-560, Apr. 2002.
[5] B. E. Baddour and N. C. Beaulieu, “Accurate simulation of multiple cross-
correlated Rician fading channels,” IEEE Trans. Commun., vol. 52, no. 11, pp.
1980-1987, Nov. 2004.
[6] G. Byers and F. Takawira, “Spatially and temporally correlated MIMO channels:
modeling and capacity analysis,” IEEE Trans. Veh. Technol., vol. 53, no. 3, pp.
634-643, May 2004.
[7] C. F. Van Loan and N. Pitsianis, Approximation with Kronecker products, Kluwer
Publications, 1993.
[8] F. Ren and Y.R. Zheng, “A novel emulator for discrete-time MIMO triply-
selective fading channels,” IEEE Trans. Circuits and Systems I: Regular Papers, vol. 57, no.
9, pp. 2542-2551, Sep. 2010.
[9] S. Kay, “Fundamentals of Statistical Signal Processing: Estimation Theory,” On-
line: http://books.google.com/books?id=aFwESQAACAAJ, Mar. 2010.
[10] M. Herdin, N. Czink, H. Ozcelik, and E. Bonek, “Correlation matrix distance,
a meaningful measure for evaluation of nonstationary MIMO channels,” in Proc.




This dissertation focuses on hardware-based wireless fading channel emulators
and addresses the main challenges in their hardware implementation. Hardware
implementation methods are proposed for triply selective fading channel emulators
with accurate correlation properties. On-chip FRFGs and inter-tap correlation
matrix generators are implemented. A mixed P-S computational structure, which
incorporates three types of correlation into sub-channels, is proposed to balance
processing speed against hardware usage. The proposed algorithms and designs have
been verified both in software simulation and in implementation on FPGA
development kit platforms. The emulated channels exhibit statistical and
correlation properties that closely match the theoretical ones. This dissertation
also validates the triply selective fading channel model through a MIMO testbed
and experimental results, which demonstrate that the discrete-time triply
selective fading channel can be expressed in terms of separable temporal,
inter-tap, and spatial correlations.
The contributions of my Ph.D. research are summarized in two journal papers and
five conference papers, of which the two journal papers and three of the
conference papers are included in this dissertation. The complete publication
list is given in Section 3.
3 PUBLICATIONS
[1] F. Ren and Y. R. Zheng, “Hardware Emulation of Wideband Correlated
Multiple-Input Multiple-Output Fading Channels,” Journal of Signal Processing
Systems, accepted for publication, Jun. 2011.
[2] F. Ren and Y. R. Zheng, “A Novel Emulator for Discrete-time MIMO Triply-
selective Fading Channels,” IEEE Trans. Circuits and Systems I: Regular Papers,
vol. 57, no. 9, pp. 2542-2551, Sep. 2010.
[3] F. Ren and Y. R. Zheng, “Incorporating Correlation Matrices into Hardware
Triple Selective Fading Channel Emulators Using Kronecker Product,” in Proc.
Vehicular Technology Conference (VTC2010-Fall), Ottawa, Canada, Sep. 2010,
pp. 1-5.
[4] F. Ren and Y. R. Zheng, “Hardware Implementation of Triply Selective Rayleigh
Fading Channel Simulators,” in Proc. International Conference on Acoustics,
Speech, and Signal Processing (ICASSP), Dallas, Texas, Mar. 2010,
pp. 1498-1501.
[5] F. Ren and Y. R. Zheng, “A Low-complexity Hardware Implementation of
Discrete-time Frequency-selective Rayleigh Fading Channels,” in Proc. IEEE
International Symposium on Circuits and Systems (ISCAS), Taipei, Taiwan, May
2009, pp. 1759-1762.
[6] F. Ren, Y. R. Zheng, and J. Sarangapani, “Incorporating Forward Error
Correction Codes into FlexRay Communications,” in Proc. Sensor, Signal and
Information Processing (SenSIP), Sedona, AZ, May 2008.
[7] S. Subedi, H. Lou, F. Ren, M. Wang, and Y. R. Zheng, “Validation of the Triply
Selective Fading Channel Model Through a MIMO Test Bed and Experimental
Results,” in Proc. International Conference for Military Communications
(MILCOM11), accepted for publication.
VITA
Fei Ren was born May 8, 1983, in Chongqing, China. He received his B.S.
degree in 2005 in Electrical Engineering from the University of Electronic Science
and Technology of China, Chengdu, Sichuan Province, China. He received his M.S.
degree in 2007 in Electrical Engineering from the University of Missouri-Rolla,
Rolla, MO, USA. He began his Ph.D. study in January 2008 in the Department of
Electrical and Computer Engineering at Missouri University of Science and
Technology. His research interests include wireless channel emulators, VLSI, and
hardware implementation of wireless communication systems. He is expected to
receive his Ph.D. degree in Electrical Engineering from Missouri University of
Science and Technology in December 2011.
