Robust and Sparse M-Estimation of DOA
A robust and sparse Direction of Arrival (DOA) estimator is derived for array
data that follows a Complex Elliptically Symmetric (CES) distribution with
zero mean and finite second-order moments. The derivation allows the loss
function to be chosen, and four loss functions are discussed in detail: the
Gauss loss, which is the Maximum-Likelihood (ML) loss for the circularly
symmetric complex Gaussian distribution; the ML loss for the complex
multivariate t-distribution (MVT) with ν degrees of freedom; and the Huber and
Tyler loss functions. For Gauss loss, the method reduces to Sparse Bayesian
Learning (SBL). The root mean square DOA error of the derived estimators is
discussed for Gaussian, MVT, and ε-contaminated data. The robust SBL
estimators perform well in all cases and nearly identically to classical SBL
for Gaussian noise.
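The loss choices above differ only in the weight they assign to the squared Mahalanobis distance t = x^H R^{-1} x of each array snapshot. A minimal numerical sketch of the corresponding weight functions (the function names, the default ν, and the Huber threshold c2 are illustrative assumptions, not the paper's notation):

```python
import numpy as np

# Weight functions u(t) applied to t = x^H R^{-1} x (N = number of sensors).
# Each corresponds to one of the loss functions discussed above.

def gauss_weight(t, N):
    # Gauss loss (Gaussian ML): every snapshot gets unit weight.
    return np.ones_like(t)

def mvt_weight(t, N, nu=2.1):
    # ML weight for the complex multivariate t-distribution with nu degrees
    # of freedom: large t is downweighted, giving robustness to heavy tails.
    return (2 * N + nu) / (nu + 2 * t)

def huber_weight(t, N, c2=1.0):
    # Huber loss: Gaussian (unit) weight below the threshold c2, 1/t decay above.
    return np.where(t <= c2, 1.0, c2 / t)

def tyler_weight(t, N):
    # Tyler loss: fully scale-invariant weight N / t.
    return N / t
```

The downweighting of large t is what makes the MVT, Huber, and Tyler variants resistant to outliers and heavy-tailed noise, while the constant Gauss weight recovers ordinary SBL.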
Robust L1-norm Singular-Value Decomposition and Estimation
Singular-Value Decomposition (SVD) is a ubiquitous data analysis method in engineering, science, and statistics. Singular-value estimation, in particular, is of critical importance in an array of engineering applications, such as channel estimation in communication systems, EMG signal analysis, and image compression, to name just a few. Conventional SVD of a data matrix coincides with standard Principal-Component Analysis (PCA). The L2-norm (sum of squared values) formulation of PCA emphasizes peripheral data points and thus makes PCA sensitive to outliers. Naturally, SVD inherits this outlier sensitivity. In this work, we present a novel robust method for SVD based on an L1-norm (sum of absolute values) formulation, namely L1-norm compact Singular-Value Decomposition (L1-cSVD). We then propose a closed-form algorithm to solve this problem and find the robust singular values. Accordingly, the proposed method demonstrates sturdy resistance against outliers, especially for singular-value estimation, and can facilitate more reliable data analysis and processing in a wide range of engineering applications.
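The outlier sensitivity inherited from the L2 formulation is easy to reproduce; a small sketch (the matrix sizes and the single-entry corruption model are illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Exactly rank-2 data matrix, 10 x 50
X = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 50))
s_clean = np.linalg.svd(X, compute_uv=False)

# A single grossly corrupted entry
X_corr = X.copy()
X_corr[0, 0] += 100.0
s_corr = np.linalg.svd(X_corr, compute_uv=False)

# Because the L2 objective squares every residual, one outlier inflates the
# leading singular value and destroys the clean rank-2 spectrum.
print(s_clean[:3], s_corr[:3])
```

A robust (L1-based) singular-value estimator aims to keep the estimated spectrum close to `s_clean` even when `X_corr` is observed.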
EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis
Data clustering has received a lot of attention and numerous methods,
algorithms and software packages are available. Among these techniques,
parametric finite-mixture models play a central role due to their interesting
mathematical properties and to the existence of maximum-likelihood estimators
based on expectation-maximization (EM). In this paper we propose a new mixture
model that associates a weight with each observed point. We introduce the
weighted-data Gaussian mixture and we derive two EM algorithms. The first one
considers a fixed weight for each observation. The second one treats each
weight as a random variable following a gamma distribution. We propose a model
selection method based on a minimum message length criterion, provide a weight
initialization strategy, and validate the proposed algorithms by comparing them
with several state-of-the-art parametric and non-parametric clustering
techniques. We also demonstrate the effectiveness and robustness of the
proposed clustering technique in the presence of heterogeneous data, namely in
audio-visual scene analysis. Comment: 14 pages, 4 figures, 4 tables
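A compact sketch of the fixed-weight variant for an isotropic Gaussian mixture: the only change relative to standard EM is that every sufficient statistic in the M-step is scaled by the observation weight w_n. The deterministic initialization and isotropic covariances are simplifications for brevity, not the paper's exact model:

```python
import numpy as np

def weighted_em_gmm(X, w, K, n_iter=50):
    """EM for a Gaussian mixture in which each point x_n carries a fixed
    weight w_n. Isotropic covariances for brevity."""
    n, d = X.shape
    # Simple deterministic init: spread the means along the first coordinate.
    order = np.argsort(X[:, 0])
    mu = X[order[np.linspace(0, n - 1, K).astype(int)]]
    var = np.full(K, X.var())
    pi = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities r[n, k] (log-domain for stability).
        logp = (-0.5 * ((X[:, None, :] - mu[None]) ** 2).sum(-1) / var
                - 0.5 * d * np.log(2 * np.pi * var) + np.log(pi))
        logp -= logp.max(1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(1, keepdims=True)
        # M-step: every sufficient statistic is scaled by the data weight w_n.
        rw = r * w[:, None]
        Nk = rw.sum(0)
        pi = Nk / Nk.sum()
        mu = (rw.T @ X) / Nk[:, None]
        var = np.array([(rw[:, k] * ((X - mu[k]) ** 2).sum(1)).sum()
                        for k in range(K)]) / (d * Nk)
    return pi, mu, var
```

With all weights equal to one this reduces to ordinary EM; downweighting a point (e.g., an unreliable audio observation) shrinks its influence on every component.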
Dynamic Algorithms and Asymptotic Theory for Lp-norm Data Analysis
The focus of this dissertation is the development of outlier-resistant stochastic algorithms for Principal Component Analysis (PCA) and the derivation of novel asymptotic theory for Lp-norm Principal Component Analysis (Lp-PCA). Modern machine learning and signal processing applications employ sensors that collect large volumes of data measurements, stored in the form of data matrices that are often massive and need to be processed efficiently so that machine learning algorithms can perform effective underlying pattern discovery. One such commonly used matrix analysis technique is PCA. Over the past century, PCA has been extensively used in areas such as machine learning, deep learning, pattern recognition, and computer vision, just to name a few. PCA's popularity can be attributed to its intuitive formulation on the L2-norm, the availability of an elegant solution via the singular-value decomposition (SVD), and asymptotic convergence guarantees. However, PCA has been shown to be highly sensitive to faulty measurements (outliers) because of its reliance on the outlier-sensitive L2-norm. Arguably, the most straightforward approach to impart robustness against outliers is to replace the outlier-sensitive L2-norm with the outlier-resistant L1-norm, thus formulating what is known as L1-PCA. Exact and approximate solvers have been proposed for L1-PCA in the literature. On the other hand, in this big-data era, the data matrix may be very large and/or the data measurements may arrive in streaming fashion. Traditional L1-PCA algorithms are not suitable in this setting. In order to efficiently process streaming data while remaining resistant to outliers, we propose a stochastic L1-PCA algorithm that computes the dominant principal component (PC) with formal convergence guarantees. We further generalize our stochastic L1-PCA algorithm to find multiple components by proposing a new PCA framework that maximizes the recently proposed Barron loss.
Leveraging the Barron loss yields a stochastic algorithm with a tunable robustness parameter that allows the user to control the amount of outlier-resistance required in a given application. We demonstrate the efficacy and robustness of our stochastic algorithms on synthetic and real-world datasets. Our experimental studies include online subspace estimation, classification, video surveillance, and image conditioning, among others. Last, we focus on the development of asymptotic theory for Lp-PCA. In general, Lp-PCA for p < 2 has been shown to outperform PCA in the presence of outliers owing to its outlier resistance. However, unlike PCA, Lp-PCA is perceived as a "robust heuristic" by the research community due to the lack of theoretical asymptotic convergence guarantees. In this work, we strive to shed light on the topic by developing asymptotic theory for Lp-PCA. Specifically, we show that, for a broad class of data distributions, the Lp-PCs span the same subspace as the standard PCs asymptotically; moreover, we prove that the Lp-PCs are specific rotated versions of the PCs. Finally, we demonstrate the asymptotic equivalence of PCA and Lp-PCA with a wide variety of experimental studies.
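The Barron loss referenced above is the general robust loss of Barron (CVPR 2019), whose shape parameter α interpolates between L2-like behavior and strong outlier resistance; a minimal sketch of the published formula (the scale c and the special-case handling follow that paper, but this is not the dissertation's code):

```python
import numpy as np

def barron_loss(x, alpha, c=1.0):
    """Barron's general robust loss. alpha = 2 recovers a scaled L2 loss,
    alpha = 0 the Cauchy/Lorentzian loss; smaller alpha = more robustness."""
    z = (x / c) ** 2
    if alpha == 2.0:
        return 0.5 * z
    if alpha == 0.0:
        return np.log(0.5 * z + 1.0)
    a = abs(alpha - 2.0)
    return (a / alpha) * ((z / a + 1.0) ** (alpha / 2.0) - 1.0)
```

The tunable robustness parameter mentioned above corresponds to α: lowering it makes large residuals contribute progressively less to the objective.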
Theory and Algorithms for Reliable Multimodal Data Analysis, Machine Learning, and Signal Processing
Modern engineering systems collect large volumes of data measurements across diverse sensing modalities. These measurements can naturally be arranged in higher-order arrays of scalars, commonly referred to as tensors. Tucker decomposition (TD) is a standard method for tensor analysis with applications in diverse fields of science and engineering. Despite its success, TD exhibits severe sensitivity to outliers, i.e., heavily corrupted entries that appear sporadically in modern datasets. We study L1-norm TD (L1-TD), a reformulation of TD that promotes robustness. For 3-way tensors, we show, for the first time, that L1-TD admits an exact solution via combinatorial optimization, and we present algorithms for its solution. We propose two novel algorithmic frameworks for approximating the exact solution to L1-TD for general N-way tensors. We also propose a novel algorithm for dynamic L1-TD, i.e., efficient and joint analysis of streaming tensors. Principal-Component Analysis (PCA), a special case of TD, is also sensitive to outliers. We consider Lp-quasinorm PCA (Lp-PCA).
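For reference, plain L2 Tucker decomposition of a 3-way tensor via truncated higher-order SVD (HOSVD) can be sketched as follows; L1-TD replaces the L2-norm objective behind these per-mode SVDs with an L1-norm one (this sketch is standard HOSVD, not the paper's L1 algorithm):

```python
import numpy as np

def hosvd(X, ranks):
    """Truncated higher-order SVD (a standard L2 Tucker solver).
    Returns the core tensor G and one factor matrix per mode."""
    factors = []
    for mode, r in enumerate(ranks):
        # Mode-n unfolding: bring `mode` to the front, flatten the rest.
        Xm = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)
        U, _, _ = np.linalg.svd(Xm, full_matrices=False)
        factors.append(U[:, :r])
    # Core tensor: project X onto the factor subspaces, mode by mode.
    G = X
    for mode, U in enumerate(factors):
        G = np.moveaxis(np.tensordot(U.T, np.moveaxis(G, mode, 0), axes=1), 0, mode)
    return G, factors

def tucker_reconstruct(G, factors):
    X = G
    for mode, U in enumerate(factors):
        X = np.moveaxis(np.tensordot(U, np.moveaxis(X, mode, 0), axes=1), 0, mode)
    return X
```

With full ranks the decomposition is exact; truncating the ranks gives the compressed Tucker model whose L2 fit is what outliers corrupt.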
Terahertz-Band Channel and Beam Split Estimation via Array Perturbation Model
Owing to its ultra-wide bandwidth and pencil-beamforming capability, the
terahertz (THz) band has been envisioned as one of the key enabling
technologies for sixth-generation networks. However, the acquisition of the
THz channel entails several unique challenges, such as severe path loss and
beam-split. Prior works usually employ ultra-massive arrays and additional
hardware components comprised of time-delayers to compensate for these losses.
In order to provide a cost-effective solution, this paper introduces a
sparse-Bayesian-learning (SBL) technique for joint channel and beam-split
estimation. Specifically, we first model the beam-split as an array
perturbation inspired by array signal processing. Next, a low-complexity
approach is developed by exploiting the line-of-sight-dominant feature of the
THz channel to reduce the computational complexity involved in the proposed SBL
technique for channel estimation (SBCE). Additionally, based on federated
learning, we complement the proposed model-based SBCE solution with a
model-free technique. Further, we examine the near-field characteristics of the
THz channel and introduce the range-dependent near-field
beam-split. The theoretical performance bounds, i.e., Cramér-Rao lower
bounds, are derived both for near- and far-field parameters, e.g., user
directions, beam-split and ranges. Numerical simulations demonstrate that SBCE
outperforms the existing approaches at lower hardware cost. Comment: Accepted paper in the IEEE Open Journal of the Communications Society
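The beam-split effect itself admits a one-line description: an array steered to physical direction θ at carrier f_c actually points to a frequency-dependent direction at each subcarrier f_m. A sketch using the classic wideband relation sin(θ_m) = (f_m/f_c) sin(θ) (an illustrative helper, not the paper's array-perturbation model):

```python
import numpy as np

def beam_split_direction(theta, f_m, f_c):
    """Spatial direction observed at subcarrier f_m when a wideband array is
    steered to physical direction theta (radians) at carrier f_c.
    Clipping guards arcsin against rounding outside [-1, 1]."""
    return np.arcsin(np.clip((f_m / f_c) * np.sin(theta), -1.0, 1.0))
```

At f_m = f_c there is no deviation; the deviation grows with the fractional bandwidth, which is why THz systems with tens of GHz of bandwidth are particularly affected.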
Gridless Evolutionary Approach for Line Spectral Estimation with Unknown Model Order
Gridless methods show great superiority in line spectral estimation. These
methods need to solve an atomic ℓ0-norm (i.e., the continuous analog of the
ℓ0 norm) minimization problem to estimate frequencies and model order. Since
this problem is NP-hard to compute, relaxations of the atomic ℓ0 norm, such as
nuclear norm and reweighted atomic norm, have been employed for promoting
sparsity. However, the relaxations give rise to a resolution limit,
subsequently leading to biased model order and convergence error. To overcome
the above shortcomings of relaxation, we propose a novel idea of simultaneously
estimating the frequencies and model order by means of the atomic ℓ0 norm.
To accomplish this, we build a multiobjective optimization model. The
measurement error and the atomic ℓ0 norm are taken as the two optimization
objectives. The proposed model directly exploits the model order via the atomic
ℓ0 norm, thus breaking the resolution limit. We further design a
variable-length evolutionary algorithm to solve the proposed model, which
includes two innovations. One is a variable-length coding and search strategy.
It flexibly codes and interactively searches diverse solutions with different
model orders. These solutions act as stepping stones that help fully explore
the variable, open-ended frequency search space and provide extensive
potential for reaching the optima. Another innovation is a model-order pruning
mechanism, which heuristically prunes less contributive frequencies within the
solutions, thus significantly enhancing convergence and diversity. Simulation
results confirm the superiority of our approach in both frequency estimation
and model order selection. Comment: This work has been submitted to the IEEE
for possible publication. Copyright may be transferred without notice, after
which this version may no longer be accessible.
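The two objectives of the multiobjective model can be sketched for a candidate variable-length frequency set as follows (here the model order, i.e., the number of frequencies, stands in for the atomic ℓ0 norm, and the least-squares amplitude fit is an illustrative simplification):

```python
import numpy as np

def objectives(y, freqs):
    """Measurement error and model order for one candidate solution.
    y: length-n samples of a sum of complex sinusoids;
    freqs: candidate frequency list (variable length)."""
    n = len(y)
    A = np.exp(2j * np.pi * np.outer(np.arange(n), freqs))  # atom matrix
    c, *_ = np.linalg.lstsq(A, y, rcond=None)               # best amplitudes
    err = np.linalg.norm(y - A @ c)
    return err, len(freqs)
```

A variable-length evolutionary search then trades these two objectives off across candidates with different numbers of frequencies, which is what lets the method estimate the model order directly.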
Sparse Bayesian Learning Approach for Discrete Signal Reconstruction
This study addresses the problem of discrete signal reconstruction from the
perspective of sparse Bayesian learning (SBL). Generally, it is intractable to
perform the Bayesian inference with the ideal discretization prior under the
SBL framework. To overcome this challenge, we introduce a novel discretization
enforcing prior to exploit the knowledge of the discrete nature of the
signal-of-interest. By integrating the discretization enforcing prior into the
SBL framework and applying the variational Bayesian inference (VBI)
methodology, we devise an alternating update algorithm to jointly characterize
the finite alphabet feature and reconstruct the unknown signal. When the
measurement matrix is i.i.d. Gaussian per component, we further embed the
generalized approximate message passing (GAMP) into the VBI-based method, so as
to directly adopt the ideal prior and significantly reduce the computational
burden. Simulation results demonstrate substantial performance improvement of
the two proposed methods over existing schemes. Moreover, the GAMP-based
variant outperforms the VBI-based method when the measurement matrix is i.i.d.
Gaussian, but it fails to work for non-i.i.d. Gaussian matrices. Comment: 13
pages, 7 figures
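For context, the generic (real-valued, continuous-prior) SBL iteration that the discretization-enforcing prior builds on can be sketched as follows; this is standard EM-based SBL, not the paper's VBI or GAMP variant, and the noise variance and numerical floor are illustrative choices:

```python
import numpy as np

def sbl(A, y, sigma2=1e-2, n_iter=100):
    """Minimal EM-style Sparse Bayesian Learning for y = A x + noise.
    Each coefficient x_i has prior variance gamma_i, learned from the data;
    gammas of irrelevant coefficients shrink toward zero, inducing sparsity."""
    m, n = A.shape
    gamma = np.ones(n)
    for _ in range(n_iter):
        # Posterior covariance and mean of x given the current gammas.
        Sigma = np.linalg.inv(A.T @ A / sigma2 + np.diag(1.0 / gamma))
        mu = Sigma @ A.T @ y / sigma2
        # EM update of the hyperparameters (floored for numerical safety).
        gamma = np.maximum(mu ** 2 + np.diag(Sigma), 1e-10)
    return mu, gamma
```

The paper's contribution replaces the continuous Gaussian prior implicit here with a discretization-enforcing prior, so that the posterior mean is driven toward the finite alphabet of the signal.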