63 research outputs found
Soft-Decision-Driven Channel Estimation for Pipelined Turbo Receivers
We consider channel estimation specific to turbo equalization for
multiple-input multiple-output (MIMO) wireless communication. We develop a
soft-decision-driven sequential algorithm geared to the pipelined turbo
equalizer architecture operating on orthogonal frequency division multiplexing
(OFDM) symbols. One interesting feature of the pipelined turbo equalizer is
that multiple soft-decisions become available at various processing stages. A
tricky issue is that these multiple decisions from different pipeline stages
have varying levels of reliability. This paper establishes an effective
strategy for the channel estimator to track the target channel, while dealing
with observation sets with different qualities. The resulting algorithm is
basically a linear sequential estimation algorithm and, as such, is
Kalman-based in nature. The main difference here, however, is that the proposed
algorithm employs puncturing on observation samples to effectively deal with
the inherent correlation among the multiple demapper/decoder module outputs
that cannot easily be removed by the traditional innovations approach. The
proposed algorithm continuously monitors the quality of the feedback decisions
and incorporates it in the channel estimation process. The proposed channel
estimation scheme shows clear performance advantages relative to existing
channel estimation techniques.
Comment: 11 pages; IEEE Transactions on Communications 201
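The idea of a sequential, Kalman-style estimator that weights feedback decisions by their reliability can be illustrated with a minimal sketch. This is not the algorithm from the paper: the scalar flat-fading channel, the random-walk channel model, the BPSK symbols, and the names `track_channel`, `noise_var`, `process_var`, and `punctured` are all assumptions made for illustration. In particular, the `punctured` flag simply drops an observation and does not reproduce the paper's puncturing rule for correlated demapper/decoder outputs.

```python
import random

def track_channel(observations, noise_var=0.01, process_var=1e-4):
    """Scalar Kalman-style tracker for y = h * x + n (illustrative only).

    observations: iterable of (y, x_soft, x_var, punctured), where x_soft is
    the soft symbol estimate, x_var its variance (0 = fully reliable), and
    punctured marks an observation to be skipped (hypothetical flag).
    """
    h_est = 0j   # current channel estimate
    p = 1.0      # error variance of the estimate
    history = []
    for y, x_soft, x_var, punctured in observations:
        p += process_var  # predict step: random-walk channel model (assumed)
        if not punctured:
            # Symbol uncertainty inflates the effective observation noise,
            # so unreliable decisions receive a smaller gain.
            r = noise_var + x_var * (abs(h_est) ** 2 + p)
            s = abs(x_soft) ** 2 * p + r
            gain = p * x_soft.conjugate() / s
            h_est += gain * (y - h_est * x_soft)
            p *= 1.0 - (gain * x_soft).real
        history.append(h_est)
    return history

def demo(seed=0, n=400):
    """Track a static channel where every fourth decision is unreliable."""
    rng = random.Random(seed)
    h_true = 0.8 - 0.6j
    observations = []
    for k in range(n):
        x = rng.choice([1 + 0j, -1 + 0j])  # BPSK symbol
        noise = complex(rng.gauss(0, 0.07), rng.gauss(0, 0.07))
        y = h_true * x + noise
        if k % 4 == 0:
            # Weak decision: correct sign with probability 0.6, so the
            # posterior mean is 0.2 * guess and the variance is 1 - 0.2**2.
            guess = x if rng.random() < 0.6 else -x
            x_soft, x_var = 0.2 * guess, 1.0 - 0.2 ** 2
        else:
            x_soft, x_var = x, 0.0
        observations.append((y, x_soft, x_var, False))
    return h_true, track_channel(observations)[-1]

if __name__ == "__main__":
    h_true, h_final = demo()
    print("true channel:", h_true, "final estimate:", h_final)
```

Folding the decision variance into the observation noise is one common way to let a linear estimator discount weak feedback; a MIMO-OFDM version would replace the scalars with vectors and matrices.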
A variable-rate Viterbi decoder in 130-nm CMOS: design, measurements, and cost of flexibility
This paper discusses the design and measurements of a flexible Viterbi decoder fabricated in 130-nm digital CMOS. Flexibility was incorporated by providing various code rates and modulation schemes to adjust to varying channel conditions. Based on previous trade-off studies, flexible building blocks were carefully designed to cause as little area penalty as possible. The chip runs down to a minimal core supply of 0.8 V. It turns out that striving for more modulation schemes is beneficial in terms of power consumption once the price is paid for accepting different code rates, viz. radices, in the trellis and survivor path units.
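For readers unfamiliar with the trellis and survivor path units mentioned above, the following is a minimal software sketch of hard-decision Viterbi decoding. It assumes a textbook rate-1/2, constraint-length-3 convolutional code with generators (7, 5) in octal, which is not stated in the abstract, and it says nothing about the chip's actual architecture, code rates, or modulation schemes. The names `encode` and `viterbi_decode` are illustrative.

```python
G = (0b111, 0b101)  # assumed generator polynomials (7, 5) octal, K = 3

def parity(v):
    """Return the XOR of all bits of v."""
    return bin(v).count("1") & 1

def encode(bits):
    """Rate-1/2 convolutional encoder; state holds the two previous inputs."""
    state = 0
    out = []
    for b in bits:
        reg = (b << 2) | state              # newest bit in the top position
        out.extend(parity(reg & g) for g in G)
        state = reg >> 1                    # shift: drop the oldest bit
    return out

def viterbi_decode(received):
    """Hard-decision Viterbi decoder over the 4-state trellis."""
    n_states = 4
    inf = float("inf")
    metrics = [0] + [inf] * (n_states - 1)  # encoder starts in state 0
    paths = [[] for _ in range(n_states)]   # survivor path per state
    for i in range(0, len(received), 2):
        sym = received[i:i + 2]
        new_metrics = [inf] * n_states
        new_paths = [None] * n_states
        for state in range(n_states):
            if metrics[state] == inf:
                continue                    # state not reachable yet
            for b in (0, 1):
                reg = (b << 2) | state
                expected = [parity(reg & g) for g in G]
                # Branch metric: Hamming distance to the received pair.
                branch = sum(e != r for e, r in zip(expected, sym))
                nxt = reg >> 1
                cand = metrics[state] + branch
                if cand < new_metrics[nxt]:  # add-compare-select
                    new_metrics[nxt] = cand
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    best = min(range(n_states), key=lambda s: metrics[s])
    return paths[best]

if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]  # two tail zeros flush the encoder
    coded = encode(message)
    coded[3] ^= 1                             # inject a single bit error
    print(viterbi_decode(coded) == message)
```

Storing a full survivor path per state keeps the sketch short; hardware decoders typically use register-exchange or traceback memories instead.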
Optimization and implementation of a Viterbi decoder under flexibility constraints
This paper discusses the impact of flexibility when designing a Viterbi decoder for both convolutional and TCM codes. Different trade-offs have to be considered in choosing the right architecture for the processing blocks, and the resulting hardware penalty is evaluated. We study the impact of symbol quantization, which degrades performance and affects the wordlength of the rate-flexible trellis datapath. A radix-2-based architecture for this datapath relaxes the hardware requirements on the branch metric and survivor path blocks substantially. The cost of flexibility in terms of cell area and power consumption is explored by an investigation of synthesized designs that provide different transmission rates. Two designs are fabricated in a digital 0.13-µm CMOS process. Based on post-layout simulations, a symbol rate of 168 Mbaud is achieved in TCM mode, equivalent to a maximum throughput of 840 Mbit/s using a 64-QAM constellation.
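The relation between the quoted symbol rate and throughput can be checked with a few lines of arithmetic. The two figures come from the abstract; reading the result as one redundant bit per 64-QAM symbol is an assumption based on how TCM schemes are commonly constructed, not something the abstract states.

```python
import math

symbol_rate_mbaud = 168        # from the abstract
throughput_mbit_s = 840        # from the abstract
constellation_size = 64        # 64-QAM

# Information bits carried per transmitted symbol.
info_bits_per_symbol = throughput_mbit_s / symbol_rate_mbaud

# Total bits a 64-QAM symbol can label.
coded_bits_per_symbol = math.log2(constellation_size)

# Assumed interpretation: the difference is coding redundancy per symbol.
redundant_bits = coded_bits_per_symbol - info_bits_per_symbol

print(info_bits_per_symbol, coded_bits_per_symbol, redundant_bits)
```

Under that assumption, each symbol carries 5 information bits out of 6 coded bits.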
Spectrally Efficient Iterative MU-MIMO Receiver for SC-FDMA Based on EP
A novel multiuser multiple-input multiple-output (MU-MIMO) receiver for single-carrier frequency-division multiple-access (SC-FDMA) is proposed within the expectation propagation (EP) framework. Recent advances in iterative receiver design show the ability of low-complexity EP-based iterative algorithms to achieve performance close to maximum a posteriori (MAP) detection. In this paper, based on the exact derivation of frequency-domain EP-based turbo receivers, we propose a novel spectrally and computationally efficient soft interference canceller (IC) with a doubly-iterative architecture. Asymptotic and finite-length analyses show that our proposal outperforms existing schemes with comparable complexity, in various spatial-multiplexing configurations.
Datacenter Design for Future Cloud Radio Access Network.
Cloud radio access network (C-RAN), an emerging cloud service that combines the traditional radio access network (RAN) with cloud computing technology, has been proposed as a solution to handle the growing energy consumption and cost of the traditional RAN. Through aggregating baseband units (BBUs) in a centralized cloud datacenter, C-RAN reduces energy and cost, and improves wireless throughput and quality of service. However, designing a datacenter for C-RAN has not yet been studied. In this dissertation, I investigate how a datacenter for C-RAN BBUs should be built on commodity servers.
I first design WiBench, an open-source benchmark suite containing the key signal processing kernels of many mainstream wireless protocols, and study its characteristics. The characterization study shows that there is abundant data level parallelism (DLP) and thread level parallelism (TLP). Based on this result, I then develop high performance software implementations of C-RAN BBU kernels in C++ and CUDA for both CPUs and GPUs. In addition, I generalize the GPU parallelization techniques of the Turbo decoder to the trellis algorithms, an important family of algorithms that are widely used in data compression and channel coding.
Then I evaluate the performance of commodity CPU servers and GPU servers. The study shows that the datacenter with GPU servers can meet the LTE standard throughput with 4× to 16× fewer machines than with CPU servers. A further energy and cost analysis shows that GPU servers can save on average 13× more energy and 6× more cost. Thus, I propose the C-RAN datacenter be built using GPUs as a server platform.
Next I study resource management techniques to handle the temporal and spatial traffic imbalance in a C-RAN datacenter. I propose a “hill-climbing” power management technique that combines powering off GPUs and DVFS to match the temporal C-RAN traffic pattern. Under a practical traffic model, this technique saves 40% of the BBU energy in a GPU-based C-RAN datacenter. For spatial traffic imbalance, I propose three workload distribution techniques to improve load balance and throughput. Among the three techniques, pipelining packets yields the largest throughput improvement, at 10% and 16% for balanced and unbalanced loads, respectively.
PhD
Computer Science and Engineering
University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/120825/1/qizheng_1.pd