714 research outputs found

    Robust and Sparse M-Estimation of DOA

    A robust and sparse Direction of Arrival (DOA) estimator is derived for array data that follows a Complex Elliptically Symmetric (CES) distribution with zero mean and finite second-order moments. The derivation allows the loss function to be chosen, and four loss functions are discussed in detail: the Gauss loss, which is the Maximum-Likelihood (ML) loss for the circularly symmetric complex Gaussian distribution; the ML loss for the complex multivariate $t$-distribution (MVT) with $\nu$ degrees of freedom; and the Huber and Tyler loss functions. For Gauss loss, the method reduces to Sparse Bayesian Learning (SBL). The root mean square DOA error of the derived estimators is discussed for Gaussian, MVT, and $\epsilon$-contaminated data. The robust SBL estimators perform well in all cases and nearly identically to classical SBL for Gaussian noise.

    Robust L1-norm Singular-Value Decomposition and Estimation

    Singular-Value Decomposition (SVD) is a ubiquitous data analysis method in engineering, science, and statistics. Singular-value estimation, in particular, is of critical importance in an array of engineering applications, such as channel estimation in communication systems, EMG signal analysis, and image compression, to name just a few. Conventional SVD of a data matrix coincides with standard Principal-Component Analysis (PCA). The L2-norm (sum of squared values) formulation of PCA promotes peripheral data points and, thus, makes PCA sensitive to outliers. Naturally, SVD inherits this outlier sensitivity. In this work, we present a novel robust method for SVD based on an L1-norm (sum of absolute values) formulation, namely L1-norm compact Singular-Value Decomposition (L1-cSVD). We then propose a closed-form algorithm to solve this problem and find the robust singular values with cost $\mathcal{O}(N^3K^2)$. Accordingly, the proposed method demonstrates sturdy resistance against outliers, especially for singular-value estimation, and can facilitate more reliable data analysis and processing in a wide range of engineering applications.

    EM Algorithms for Weighted-Data Clustering with Application to Audio-Visual Scene Analysis

    Data clustering has received a lot of attention, and numerous methods, algorithms, and software packages are available. Among these techniques, parametric finite-mixture models play a central role due to their interesting mathematical properties and to the existence of maximum-likelihood estimators based on expectation-maximization (EM). In this paper we propose a new mixture model that associates a weight with each observed point. We introduce the weighted-data Gaussian mixture and we derive two EM algorithms. The first one considers a fixed weight for each observation. The second one treats each weight as a random variable following a gamma distribution. We propose a model selection method based on a minimum message length criterion, provide a weight initialization strategy, and validate the proposed algorithms by comparing them with several state-of-the-art parametric and non-parametric clustering techniques. We also demonstrate the effectiveness and robustness of the proposed clustering technique in the presence of heterogeneous data, namely audio-visual scene analysis. Comment: 14 pages, 4 figures, 4 tables

    Dynamic Algorithms and Asymptotic Theory for Lp-norm Data Analysis

    The focus of this dissertation is the development of outlier-resistant stochastic algorithms for Principal Component Analysis (PCA) and the derivation of novel asymptotic theory for Lp-norm Principal Component Analysis (Lp-PCA). Modern machine learning and signal processing applications employ sensors that collect large volumes of data measurements that are stored in the form of data matrices, which are often massive and need to be efficiently processed in order to enable machine learning algorithms to perform effective underlying pattern discovery. One such commonly used matrix analysis technique is PCA. Over the past century, PCA has been extensively used in areas such as machine learning, deep learning, pattern recognition, and computer vision, just to name a few. PCA's popularity can be attributed to its intuitive formulation on the L2-norm, availability of an elegant solution via the singular-value decomposition (SVD), and asymptotic convergence guarantees. However, PCA has been shown to be highly sensitive to faulty measurements (outliers) because of its reliance on the outlier-sensitive L2-norm. Arguably, the most straightforward approach to impart robustness against outliers is to replace the outlier-sensitive L2-norm by the outlier-resistant L1-norm, thus formulating what is known as L1-PCA. Exact and approximate solvers are proposed for L1-PCA in the literature. On the other hand, in this big-data era, the data matrix may be very large and/or the data measurements may arrive in streaming fashion. Traditional L1-PCA algorithms are not suitable in this setting. In order to efficiently process streaming data, while being resistant against outliers, we propose a stochastic L1-PCA algorithm that computes the dominant principal component (PC) with formal convergence guarantees. We further generalize our stochastic L1-PCA algorithm to find multiple components by proposing a new PCA framework that maximizes the recently proposed Barron loss. Leveraging Barron loss yields a stochastic algorithm with a tunable robustness parameter that allows the user to control the amount of outlier-resistance required in a given application. We demonstrate the efficacy and robustness of our stochastic algorithms on synthetic and real-world datasets. Our experimental studies include online subspace estimation, classification, video surveillance, and image conditioning, among other things. Last, we focus on the development of asymptotic theory for Lp-PCA. In general, Lp-PCA for p<2 has been shown to outperform PCA in the presence of outliers owing to its outlier resistance. However, unlike PCA, Lp-PCA is perceived as a "robust heuristic" by the research community due to the lack of theoretical asymptotic convergence guarantees. In this work, we strive to shed light on the topic by developing asymptotic theory for Lp-PCA. Specifically, we show that, for a broad class of data distributions, the Lp-PCs span the same subspace as the standard PCs asymptotically and, moreover, we prove that the Lp-PCs are specific rotated versions of the PCs. Finally, we demonstrate the asymptotic equivalence of PCA and Lp-PCA with a wide variety of experimental studies.

    Theory and Algorithms for Reliable Multimodal Data Analysis, Machine Learning, and Signal Processing

    Modern engineering systems collect large volumes of data measurements across diverse sensing modalities. These measurements can naturally be arranged in higher-order arrays of scalars which are commonly referred to as tensors. Tucker decomposition (TD) is a standard method for tensor analysis with applications in diverse fields of science and engineering. Despite its success, TD exhibits severe sensitivity against outliers, i.e., heavily corrupted entries that appear sporadically in modern datasets. We study L1-norm TD (L1-TD), a reformulation of TD that promotes robustness. For 3-way tensors, we show, for the first time, that L1-TD admits an exact solution via combinatorial optimization and present algorithms for its solution. We propose two novel algorithmic frameworks for approximating the exact solution to L1-TD, for general N-way tensors. We propose a novel algorithm for dynamic L1-TD, i.e., efficient and joint analysis of streaming tensors. Principal-Component Analysis (PCA) (a special case of TD) is also outlier responsive. We consider Lp-quasinorm PCA (Lp-PCA) for

    Terahertz-Band Channel and Beam Split Estimation via Array Perturbation Model

    For the demonstration of ultra-wideband bandwidth and pencil-beamforming, the terahertz (THz)-band has been envisioned as one of the key enabling technologies for the sixth generation networks. However, the acquisition of the THz channel entails several unique challenges such as severe path loss and beam-split. Prior works usually employ ultra-massive arrays and additional hardware components comprised of time-delayers to compensate for these losses. In order to provide a cost-effective solution, this paper introduces a sparse-Bayesian-learning (SBL) technique for joint channel and beam-split estimation. Specifically, we first model the beam-split as an array perturbation inspired by array signal processing. Next, a low-complexity approach is developed by exploiting the line-of-sight-dominant feature of the THz channel to reduce the computational complexity involved in the proposed SBL technique for channel estimation (SBCE). Additionally, based on federated learning, we implement a model-free technique alongside the proposed model-based SBCE solution. Furthermore, we examine the near-field considerations of the THz channel and introduce the range-dependent near-field beam-split. The theoretical performance bounds, i.e., Cramér-Rao lower bounds, are derived for both near- and far-field parameters, e.g., user directions, beam-split, and ranges. Numerical simulations demonstrate that SBCE outperforms the existing approaches and exhibits lower hardware cost. Comment: Accepted Paper in IEEE Open Journal of Communications Society

    Gridless Evolutionary Approach for Line Spectral Estimation with Unknown Model Order

    Gridless methods show great superiority in line spectral estimation. These methods need to solve an atomic $l_0$ norm (i.e., the continuous analog of the $l_0$ norm) minimization problem to estimate frequencies and model order. Since this problem is NP-hard to compute, relaxations of the atomic $l_0$ norm, such as the nuclear norm and the reweighted atomic norm, have been employed for promoting sparsity. However, the relaxations give rise to a resolution limit, subsequently leading to biased model order and convergence error. To overcome the above shortcomings of relaxation, we propose a novel idea of simultaneously estimating the frequencies and model order by means of the atomic $l_0$ norm. To accomplish this idea, we build a multiobjective optimization model. The measurement error and the atomic $l_0$ norm are taken as the two optimization objectives. The proposed model directly exploits the model order via the atomic $l_0$ norm, thus breaking the resolution limit. We further design a variable-length evolutionary algorithm to solve the proposed model, which includes two innovations. One is a variable-length coding and search strategy. It flexibly codes and interactively searches diverse solutions with different model orders. These solutions act as stepping stones that help fully explore the variable and open-ended frequency search space and provide extensive potential towards the optima. Another innovation is a model order pruning mechanism, which heuristically prunes less contributive frequencies within the solutions, thus significantly enhancing convergence and diversity. Simulation results confirm the superiority of our approach in both frequency estimation and model order selection. Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

    Sparse Bayesian Learning Approach for Discrete Signal Reconstruction

    This study addresses the problem of discrete signal reconstruction from the perspective of sparse Bayesian learning (SBL). Generally, it is intractable to perform Bayesian inference with the ideal discretization prior under the SBL framework. To overcome this challenge, we introduce a novel discretization-enforcing prior to exploit the knowledge of the discrete nature of the signal of interest. By integrating the discretization-enforcing prior into the SBL framework and applying the variational Bayesian inference (VBI) methodology, we devise an alternating update algorithm to jointly characterize the finite alphabet feature and reconstruct the unknown signal. When the measurement matrix is i.i.d. Gaussian per component, we further embed generalized approximate message passing (GAMP) into the VBI-based method, so as to directly adopt the ideal prior and significantly reduce the computational burden. Simulation results demonstrate substantial performance improvement of the two proposed methods over existing schemes. Moreover, the GAMP-based variant outperforms the VBI-based method with an i.i.d. Gaussian measurement matrix, but it fails to work for non-i.i.d. Gaussian matrices. Comment: 13 pages, 7 figures