31,073 research outputs found
Low Rank Approximation in the Presence of Outliers
We consider the problem of principal component analysis (PCA) in the presence of outliers. Given a matrix A (d x n) and parameters k, m, the goal is to remove a set of at most m columns of A (outliers), so as to minimize the rank-k approximation error of the remaining matrix (inliers). While much of the work on this problem has focused on recovery of the rank-k subspace under assumptions on the inliers and outliers, we focus on the approximation problem. Our main result shows that sampling-based methods developed in the outlier-free case give non-trivial guarantees even in the presence of outliers. Using this insight, we develop a simple algorithm that has bi-criteria guarantees. Further, unlike similar formulations for clustering, we show that bi-criteria guarantees are unavoidable for the problem, under appropriate complexity assumptions.
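To make the objective in this abstract concrete, the following is a minimal NumPy sketch of how the rank-k error of the inlier columns could be evaluated once a candidate outlier set is removed. The function names are hypothetical, and `greedy_residual_removal` is only a naive baseline chosen for illustration; it is not the sampling-based algorithm the abstract refers to and carries no guarantee.

```python
import numpy as np

def rank_k_error(A, k):
    """Squared Frobenius error of the best rank-k approximation of A."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s[k:] ** 2))

def error_after_removal(A, k, outliers):
    """Rank-k error of the columns that remain after dropping `outliers`."""
    keep = np.setdiff1d(np.arange(A.shape[1]), np.asarray(outliers, dtype=int))
    return rank_k_error(A[:, keep], k)

def greedy_residual_removal(A, k, m):
    """Naive baseline (an assumption, not the paper's method):
    fit a rank-k subspace to all columns, then flag the m columns
    with the largest residual norm as outliers."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    Uk = U[:, :k]
    residual = A - Uk @ (Uk.T @ A)
    norms = np.linalg.norm(residual, axis=0)
    n = A.shape[1]
    return np.argsort(norms)[n - m:]
```

A baseline like this can be misled when an outlier column is large enough to pull the fitted subspace toward itself, which is one reason the approximation problem is harder than it first appears.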
Robust Structured Low-Rank Approximation on the Grassmannian
Over the past years Robust PCA has been established as a standard tool for
reliable low-rank approximation of matrices in the presence of outliers.
Recently, the Robust PCA approach via nuclear norm minimization has been
extended to matrices with linear structures which appear in applications such
as system identification and data series analysis. At the same time it has been
shown how to control the rank of a structured approximation via matrix
factorization approaches. The drawbacks of these methods either lie in the lack
of robustness against outliers or in their static nature of repeated
batch-processing. We present a Robust Structured Low-Rank Approximation method
on the Grassmannian that on the one hand allows for fast re-initialization in
an online setting due to subspace identification with manifolds, and that is
robust against outliers due to a smooth approximation of the $\ell_p$-norm cost
function on the other hand. The method is evaluated in online time series
forecasting tasks on simulated and real-world data.
Bayesian Robust Tensor Factorization for Incomplete Multiway Data
We propose a generative model for robust tensor factorization in the presence
of both missing data and outliers. The objective is to explicitly infer the
underlying low-CP-rank tensor capturing the global information and a sparse
tensor capturing the local information (also considered as outliers), thus
providing the robust predictive distribution over missing entries. The
low-CP-rank tensor is modeled by multilinear interactions between multiple
latent factors on which the column sparsity is enforced by a hierarchical
prior, while the sparse tensor is modeled by a hierarchical view of Student-t
distribution that associates an individual hyperparameter with each element
independently. For model learning, we develop an efficient closed-form
variational inference under a fully Bayesian treatment, which can effectively
prevent the overfitting problem and scales linearly with data size. In contrast
to existing related works, our method can perform model selection automatically
and implicitly without need of tuning parameters. More specifically, it can
discover the groundtruth of CP rank and automatically adapt the sparsity
inducing priors to various types of outliers. In addition, the tradeoff between
the low-rank approximation and the sparse representation can be optimized in
the sense of maximum model evidence. Extensive experiments and comparisons
with many state-of-the-art algorithms on both synthetic and real-world datasets
demonstrate the superiority of our method from several perspectives. Comment: in IEEE Transactions on Neural Networks and Learning Systems, 201
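As a small illustration of the "multilinear interactions between multiple latent factors" mentioned in this abstract, the sketch below rebuilds a 3-way tensor from CP factor matrices. The function name and the optional `weights` argument are assumptions made for this example; the Bayesian priors and variational inference described in the abstract are not modeled here.

```python
import numpy as np

def cp_reconstruct(factors, weights=None):
    """Rebuild a 3-way tensor from CP factors A (I x R), B (J x R), C (K x R).

    Entry (i, j, k) is sum_r w[r] * A[i, r] * B[j, r] * C[k, r].
    """
    A, B, C = factors
    R = A.shape[1]
    w = np.ones(R) if weights is None else np.asarray(weights, dtype=float)
    return np.einsum("r,ir,jr,kr->ijk", w, A, B, C)
```

Setting a weight to zero removes the corresponding rank-one component, which gives some intuition for how sparsity over factor columns can lower the effective CP rank.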
A new Kernel Regression approach for Robustified Boosting
We investigate boosting in the context of kernel regression. Kernel
smoothers, in general, lack appealing traits like symmetry and positive
definiteness, which are critical not only for understanding theoretical aspects
but also for achieving good practical performance. We consider a
projection-based smoother (Huang and Chen, 2008) that is symmetric, positive
definite, and shrinking. Theoretical results based on the orthonormal
decomposition of the smoother reveal additional insights into the boosting
algorithm. In our asymptotic framework, we may replace the full-rank smoother
with a low-rank approximation. We demonstrate that the smoother's low-rank
() is bounded above by , where is the bandwidth. Our
numerical findings show that, in terms of prediction accuracy, low-rank
smoothers may outperform full-rank smoothers. Furthermore, we show that the
boosting estimator with low-rank smoother achieves the optimal convergence
rate. Finally, to improve the performance of the boosting algorithm in the
presence of outliers, we propose a novel robustified boosting algorithm which
can be used with any smoother discussed in the study. We investigate the
numerical performance of the proposed approaches using simulations and a
real-world case.
Robust Rotation Synchronization via Low-rank and Sparse Matrix Decomposition
This paper deals with the rotation synchronization problem, which arises in
global registration of 3D point-sets and in structure from motion. The problem
is formulated in an unprecedented way as a "low-rank and sparse" matrix
decomposition that handles both outliers and missing data. A minimization
strategy, dubbed R-GoDec, is also proposed and evaluated experimentally against
state-of-the-art algorithms on simulated and real data. The results show that
R-GoDec is the fastest among the robust algorithms. Comment: The material contained in this paper is part of a manuscript
submitted to CVI
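For readers unfamiliar with "low-rank and sparse" decompositions, here is a minimal sketch of a generic GoDec-style alternation: a truncated SVD for the low-rank part, followed by hard thresholding for the sparse part. It is an assumption-laden illustration, not R-GoDec itself; it uses an exact SVD, ignores missing data, and the names `low_rank_plus_sparse`, `rank`, `card` and `n_iter` are hypothetical.

```python
import numpy as np

def low_rank_plus_sparse(X, rank, card, n_iter=50):
    """Approximate X by L + S with rank(L) <= rank and at most `card`
    nonzero entries in S, by alternating two exact projections."""
    X = np.asarray(X, dtype=float)
    S = np.zeros_like(X)
    L = np.zeros_like(X)
    for _ in range(n_iter):
        # Low-rank step: best rank-`rank` approximation of X - S.
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Sparse step: keep only the `card` largest-magnitude residual entries.
        R = X - L
        S = np.zeros_like(X)
        if card > 0:
            idx = np.argpartition(np.abs(R).ravel(), -card)[-card:]
            S.flat[idx] = R.flat[idx]
    return L, S
```

Each step minimizes the Frobenius residual over one block while the other is fixed, so the residual of `X - L - S` does not increase from one iteration to the next.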
Robust Orthogonal Complement Principal Component Analysis
Recently, the robustification of principal component analysis has attracted
lots of attention from statisticians, engineers and computer scientists. In
this work we study the type of outliers that are not necessarily apparent in
the original observation space but can seriously affect the principal subspace
estimation. Based on a mathematical formulation of such transformed outliers, a
novel robust orthogonal complement principal component analysis (ROC-PCA) is
proposed. The framework combines the popular sparsity-enforcing and low rank
regularization techniques to deal with row-wise outliers as well as
element-wise outliers. A non-asymptotic oracle inequality guarantees the
accuracy and high breakdown performance of ROC-PCA in finite samples. To tackle
the computational challenges, an efficient algorithm is developed on the basis
of Stiefel manifold optimization and iterative thresholding. Furthermore, a
batch variant is proposed to significantly reduce the cost in ultra high
dimensions. The paper also points out a pitfall of a common practice of SVD
reduction in robust PCA. Experiments show the effectiveness and efficiency of
ROC-PCA in both synthetic and real data.
Distributed Nonparametric Sequential Spectrum Sensing under Electromagnetic Interference
A nonparametric distributed sequential algorithm for quick detection of
spectral holes in a Cognitive Radio setup is proposed. Two or more local nodes
make decisions and inform the fusion centre (FC) over a reporting Multiple
Access Channel (MAC), which then makes the final decision. The local nodes use
energy detection and the FC uses mean detection in the presence of fading,
heavy-tailed electromagnetic interference (EMI) and outliers. The statistics of
the primary signal, channel gain or the EMI are not known. Different
nonparametric sequential algorithms are compared to choose appropriate
algorithms to be used at the local nodes and the FC. A modification of a recently
developed random walk test is selected for the local nodes for energy detection
as well as at the fusion centre for mean detection. It is shown via simulations
and analysis that the nonparametric distributed algorithm developed performs
well in the presence of fading and EMI, and is robust to outliers. The algorithm is
iterative in nature, making the computation and storage requirements minimal. Comment: 8 pages; 6 figures; Version 2 has the proofs for the theorems.
Version 3 contains a new section on approximation analysis