122 research outputs found
Low-Complexity Precoding Design for Massive Multiuser MIMO Systems Using Approximate Message Passing
A practical challenge in the precoding design of massive multiuser multiple-input multiple-output (MIMO) systems is to facilitate hardware-friendly implementation. To this end, we propose a low peak-to-average power ratio (PAPR) precoding scheme based on an approximate message passing (AMP) algorithm that minimizes multiuser interference (MUI) in massive multiuser MIMO systems. The proposed approach exhibits fast convergence and low complexity. Simulation results demonstrate that, compared with conventional constant-envelope precoding and annulus-constrained precoding, the proposed AMP precoding is superior in terms of both computational complexity and average running time. In addition, the proposed AMP precoding achieves a highly desirable tradeoff between MUI suppression and PAPR reduction. These findings indicate that the proposed AMP precoding is a suitable candidate for hardware implementation, which is very appealing for massive MIMO systems.
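The abstract does not spell out the precoder itself, but the PAPR metric it targets is standard. A minimal numpy sketch (the function name `papr_db` is ours, not the paper's) illustrating why constant-envelope signals sit at the 0 dB extreme of the tradeoff:

```python
import numpy as np

def papr_db(x):
    """Peak-to-average power ratio of a complex signal vector, in dB."""
    p = np.abs(x) ** 2
    return 10 * np.log10(p.max() / p.mean())

# A constant-envelope signal has 0 dB PAPR by construction:
rng = np.random.default_rng(0)
ce = np.exp(1j * rng.uniform(0, 2 * np.pi, 128))  # unit-modulus entries
print(papr_db(ce))  # ~0 dB: every entry carries the same power
```

Annulus-constrained and AMP-based precoders relax the unit-modulus constraint slightly, trading a small PAPR increase for better MUI suppression.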
FPGA Implementation of Real-Time Compressive Sensing with Partial Fourier Dictionary
This paper presents a novel real-time compressive sensing (CS) reconstruction approach that employs a high-density field-programmable gate array (FPGA) for hardware acceleration. Traditionally, CS is implemented using a high-level computer language on a personal computer (PC) or on multicore platforms such as graphics processing units (GPUs) and digital signal processors (DSPs). However, reconstruction algorithms are computationally demanding, and software implementations are slow and power-hungry. In this paper, the orthogonal matching pursuit (OMP) algorithm is refined to solve the sparse decomposition optimization for a partial Fourier dictionary, which is widely adopted in radar imaging and detection applications. OMP reconstruction can be divided into two main stages: a correlation optimization, which finds the most closely correlated atoms, and a least-squares problem. For a large-scale dictionary, the correlation stage is time-consuming, since it requires a large number of matrix multiplications, and solving the least-squares problem requires a scalable matrix decomposition. To address these problems efficiently, the correlation optimization is implemented via the fast Fourier transform (FFT), and the large-scale least-squares problem is solved with the conjugate gradient (CG) method. The proposed method is verified through an FPGA (Xilinx Virtex-7 XC7VX690T) realization, revealing its effectiveness in real-time applications.
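The FFT shortcut for OMP's correlation stage can be illustrated directly: for Fourier atoms, correlating the residual against every dictionary column collapses to a single FFT. A hedged numpy sketch (full rather than partial dictionary for brevity; all variable names are ours):

```python
import numpy as np

N = 64
n, k = np.arange(N), np.arange(N)
# Full Fourier dictionary (a partial dictionary would keep a column subset)
D = np.exp(2j * np.pi * np.outer(n, k) / N) / np.sqrt(N)

rng = np.random.default_rng(1)
r = rng.standard_normal(N) + 1j * rng.standard_normal(N)  # current OMP residual

corr_naive = D.conj().T @ r             # O(N^2) matrix-vector product
corr_fft = np.fft.fft(r) / np.sqrt(N)   # O(N log N) equivalent
print(np.allclose(corr_naive, corr_fft))  # True
best_atom = np.argmax(np.abs(corr_fft))   # OMP's greedy selection step
```

This is why the correlation stage maps so naturally onto an FPGA FFT core, while the remaining least-squares step is handed to a CG solver.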
Leveraging Signal Transfer Characteristics and Parasitics of Spintronic Circuits for Area and Energy-Optimized Hybrid Digital and Analog Arithmetic
While Internet of Things (IoT) sensors offer numerous benefits in diverse applications, they are limited by stringent constraints in energy, processing area and memory. These constraints are especially challenging within applications such as Compressive Sensing (CS) and Machine Learning (ML) via Deep Neural Networks (DNNs), which require dot product computations on large data sets. A solution to these challenges has been offered by the development of crossbar array architectures, enabled by recent advances in spintronic devices such as Magnetic Tunnel Junctions (MTJs). Crossbar arrays offer a compact, low-energy and in-memory approach to dot product computation in the analog domain by leveraging intrinsic signal-transfer characteristics of the embedded MTJ devices. The first phase of this dissertation research seeks to build on these benefits by optimizing resource allocation within spintronic crossbar arrays. A hardware approach to non-uniform CS is developed, which dynamically configures sampling rates by deriving necessary control signals using circuit parasitics. Next, an alternate approach to non-uniform CS based on adaptive quantization is developed, which reduces circuit area in addition to energy consumption. Adaptive quantization is then applied to DNNs by developing an architecture allowing for layer-wise quantization based on relative robustness levels. The second phase of this research focuses on extension of the analog computation paradigm by development of an operational amplifier-based arithmetic unit for generalized scalar operations. This approach allows for 95% area reduction in scalar multiplications, compared to the state-of-the-art digital alternative. Moreover, analog computation of enhanced activation functions allows for significant improvement in DNN accuracy, which can be harnessed through triple modular redundancy to yield 81.2% reduction in power at the cost of only 4% accuracy loss, compared to a larger network. 
Together, these results substantiate promising approaches to several challenges facing the design of future IoT sensors within the targeted applications of CS and ML.
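As a rough behavioral model only (not the MTJ device physics studied in the dissertation), the crossbar's in-memory dot product amounts to Kirchhoff current summation over a conductance matrix:

```python
import numpy as np

def crossbar_dot(G, v):
    """Ideal crossbar matrix-vector multiply: column current i_j = sum_i G[i, j] * v[i].

    G models device conductances (e.g. MTJ states), v the input voltages.
    Real arrays deviate from this ideal via circuit parasitics, which the
    work above exploits to derive sampling-control signals.
    """
    return v @ G  # Kirchhoff current summation along each column

# 3 input rows driving 2 output columns (illustrative values)
G = np.array([[1.0, 0.5],
              [0.2, 0.1],
              [0.3, 0.4]])       # conductances, siemens
v = np.array([0.1, 0.2, 0.3])   # input voltages, volts
print(crossbar_dot(G, v))       # per-column output currents
```

The appeal is that the multiply-accumulate happens in the analog domain "for free", so area and energy are spent only on conversion and control.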
Low-Complexity Blind Parameter Estimation in Wireless Systems with Noisy Sparse Signals
Baseband processing algorithms often require knowledge of the noise power, signal power, or signal-to-noise ratio (SNR). In practice, these parameters are typically unknown and must be estimated. Furthermore, the mean-square error (MSE) is a desirable metric to be minimized in a variety of estimation and signal recovery algorithms. However, the MSE cannot directly be used as it depends on the true signal that is generally unknown to the estimator. In this paper, we propose novel blind estimators for the average noise power, average receive signal power, SNR, and MSE. The proposed estimators can be computed at low complexity and solely rely on the large-dimensional and sparse nature of the processed data. Our estimators can be used (i) to quickly track some of the key system parameters while avoiding additional pilot overhead, (ii) to design low-complexity nonparametric algorithms that require such quantities, and (iii) to accelerate more sophisticated estimation or recovery algorithms. We conduct a theoretical analysis of the proposed estimators for a Bernoulli complex Gaussian (BCG) prior, and we demonstrate their efficacy via synthetic experiments. We also provide three application examples that deviate from the BCG prior in millimeter-wave multi-antenna and cell-free wireless systems, for which we develop nonparametric denoising algorithms that improve channel-estimation accuracy with a performance comparable to denoisers that assume perfect knowledge of the system parameters.
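The estimators themselves are not given in the abstract; the following is an illustrative stand-in (not the paper's method) for how sparsity enables blind noise-power estimation: when most entries of a large vector are noise-only, a median-based statistic is barely affected by the few large signal entries.

```python
import numpy as np

def blind_noise_power(y):
    """Estimate the average noise power N0 of a noisy sparse complex vector.

    Illustrative estimator (ours, not the paper's): for circular complex
    Gaussian noise, |z| is Rayleigh with median sqrt(N0 * ln(4) / 2), and on
    sparse data the median of |y| is robust to the signal support.
    """
    med = np.median(np.abs(y))
    return 2 * med**2 / np.log(4)

rng = np.random.default_rng(2)
N, N0 = 10_000, 0.5
noise = np.sqrt(N0 / 2) * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
x = np.zeros(N, complex)
support = rng.choice(N, N // 10, replace=False)   # 10% nonzero entries
x[support] = 3 * (rng.standard_normal(N // 10) + 1j * rng.standard_normal(N // 10))
est = blind_noise_power(x + noise)
print(est)  # close to the true N0 = 0.5, with a slight upward bias from the signal entries
```

No pilots are needed: the estimate comes entirely from the statistics of the observed data, which is the property points (i)–(iii) above exploit.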
Gaussian Belief Propagation on a Field-Programmable Gate Array for Solving Linear Equation Systems
Solving Linear Equation Systems (LESs) is a common problem in numerous fields of science. Although the problem is well studied and powerful solvers are available nowadays, solving LESs remains a computation-time bottleneck in many numerical applications. This issue especially pertains to applications in mobile robotics constrained by real-time requirements, where power consumption and weight additionally play an important role. This paper provides a general framework to approximately solve large LESs by Gaussian Belief Propagation (GaBP), which is highly suitable for parallelization and hardware implementation on a Field-Programmable Gate Array (FPGA). We derive the simple update rules of the message-passing algorithm for GaBP and show how to implement the approach efficiently on a System on a Programmable Chip (SoPC). In particular, multiple dedicated co-processors handle the recurring computations in GaBP. Exploiting multiple Direct Memory Access (DMA) controllers in scatter-gather mode and the available arithmetic logic slices for numerical calculations further accelerates the algorithm. The presented evaluations demonstrate that the approach not only provides an accurate approximate solution of the LES, but also outperforms traditional solvers in computation time for certain LESs.
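The GaBP update rules referred to above are well documented in the literature; a dense-matrix sketch in Python (sequential message schedule, our variable names) shows their shape and recovers the exact solution on a small diagonally dominant system:

```python
import numpy as np

def gabp_solve(A, b, iters=50):
    """Solve A x = b by Gaussian Belief Propagation.

    Dense-matrix sketch of the standard GaBP message updates; A should be
    symmetric and e.g. diagonally dominant for convergence. An FPGA version
    would parallelize the per-edge updates across co-processors.
    """
    n = len(b)
    P = np.zeros((n, n))    # precision messages P[i, j]: node i -> node j
    mu = np.zeros((n, n))   # mean messages mu[i, j]: node i -> node j
    Pii = np.diag(A).copy()
    mii = b / Pii
    mask = (A != 0) & ~np.eye(n, dtype=bool)   # edges of the factor graph
    for _ in range(iters):
        for i in range(n):
            for j in np.flatnonzero(mask[i]):
                # Aggregate all incoming messages at i except the one from j
                P_i = Pii[i] + P[:, i].sum() - P[j, i]
                m_i = (Pii[i] * mii[i] + P[:, i] @ mu[:, i]
                       - P[j, i] * mu[j, i]) / P_i
                P[i, j] = -A[i, j] ** 2 / P_i
                mu[i, j] = P_i * m_i / A[i, j]
    # Marginal means are the solution estimate
    Ptot = Pii + P.sum(axis=0)
    return (Pii * mii + np.einsum('ki,ki->i', P, mu)) / Ptot

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x = gabp_solve(A, b)
print(np.allclose(x, np.linalg.solve(A, b)))  # True
```

Each inner update touches only one node's neighborhood, which is what makes the algorithm amenable to the dedicated co-processors and scatter-gather DMA transfers described in the paper.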