132 research outputs found
Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization
Principal component analysis (PCA) is widely used for dimensionality
reduction, with well-documented merits in various applications involving
high-dimensional data, including computer vision, preference measurement, and
bioinformatics. In this context, the fresh look advocated here permeates
benefits from variable selection and compressive sampling, to robustify PCA
against outliers. A least-trimmed squares estimator of a low-rank bilinear
factor analysis model is shown closely related to that obtained from an
-(pseudo)norm-regularized criterion encouraging sparsity in a matrix
explicitly modeling the outliers. This connection suggests robust PCA schemes
based on convex relaxation, which lead naturally to a family of robust
estimators encompassing Huber's optimal M-class as a special case. Outliers are
identified by tuning a regularization parameter, which amounts to controlling
sparsity of the outlier matrix along the whole robustification path of (group)
least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its
neat ties to robust statistics, the developed outlier-aware PCA framework is
versatile to accommodate novel and scalable algorithms to: i) track the
low-rank signal subspace robustly, as new data are acquired in real time; and
ii) determine principal components robustly in (possibly) infinite-dimensional
feature spaces. Synthetic and real data tests corroborate the effectiveness of
the proposed robust PCA schemes, when used to identify aberrant responses in
personality assessment surveys, as well as unveil communities in social
networks, and intruders from video surveillance data.Comment: 30 pages, submitted to IEEE Transactions on Signal Processin
Robust causal structure learning with some hidden variables
We introduce a new method to estimate the Markov equivalence class of a
directed acyclic graph (DAG) in the presence of hidden variables, in settings
where the underlying DAG among the observed variables is sparse, and there are
a few hidden variables that have a direct effect on many of the observed ones.
Building on the so-called low rank plus sparse framework, we suggest a
two-stage approach which first removes the effect of the hidden variables, and
then estimates the Markov equivalence class of the underlying DAG under the
assumption that there are no remaining hidden variables. This approach is
consistent in certain high-dimensional regimes and performs favourably when
compared to the state of the art, both in terms of graphical structure recovery
and total causal effect estimation
Sample Complexity of Dictionary Learning and other Matrix Factorizations
Many modern tools in machine learning and signal processing, such as sparse
dictionary learning, principal component analysis (PCA), non-negative matrix
factorization (NMF), -means clustering, etc., rely on the factorization of a
matrix obtained by concatenating high-dimensional vectors from a training
collection. While the idealized task would be to optimize the expected quality
of the factors over the underlying distribution of training vectors, it is
achieved in practice by minimizing an empirical average over the considered
collection. The focus of this paper is to provide sample complexity estimates
to uniformly control how much the empirical average deviates from the expected
cost function. Standard arguments imply that the performance of the empirical
predictor also exhibit such guarantees. The level of genericity of the approach
encompasses several possible constraints on the factors (tensor product
structure, shift-invariance, sparsity \ldots), thus providing a unified
perspective on the sample complexity of several widely used matrix
factorization schemes. The derived generalization bounds behave proportional to
w.r.t.\ the number of samples for the considered matrix
factorization techniques.Comment: to appea
Non-Convex Rank Minimization via an Empirical Bayesian Approach
In many applications that require matrix solutions of minimal rank, the
underlying cost function is non-convex leading to an intractable, NP-hard
optimization problem. Consequently, the convex nuclear norm is frequently used
as a surrogate penalty term for matrix rank. The problem is that in many
practical scenarios there is no longer any guarantee that we can correctly
estimate generative low-rank matrices of interest, theoretical special cases
notwithstanding. Consequently, this paper proposes an alternative empirical
Bayesian procedure build upon a variational approximation that, unlike the
nuclear norm, retains the same globally minimizing point estimate as the rank
function under many useful constraints. However, locally minimizing solutions
are largely smoothed away via marginalization, allowing the algorithm to
succeed when standard convex relaxations completely fail. While the proposed
methodology is generally applicable to a wide range of low-rank applications,
we focus our attention on the robust principal component analysis problem
(RPCA), which involves estimating an unknown low-rank matrix with unknown
sparse corruptions. Theoretical and empirical evidence are presented to show
that our method is potentially superior to related MAP-based approaches, for
which the convex principle component pursuit (PCP) algorithm (Candes et al.,
2011) can be viewed as a special case.Comment: 10 pages, 6 figures, UAI 2012 pape
Weighted Sparse Partial Least Squares for Joint Sample and Feature Selection
Sparse Partial Least Squares (sPLS) is a common dimensionality reduction
technique for data fusion, which projects data samples from two views by
seeking linear combinations with a small number of variables with the maximum
variance. However, sPLS extracts the combinations between two data sets with
all data samples so that it cannot detect latent subsets of samples. To extend
the application of sPLS by identifying a specific subset of samples and remove
outliers, we propose an -norm constrained weighted sparse
PLS (-wsPLS) method for joint sample and feature selection,
where the -norm constrains are used to select a subset of
samples. We prove that the -norm constrains have the
Kurdyka-\L{ojasiewicz}~property so that a globally convergent algorithm is
developed to solve it. Moreover, multi-view data with a same set of samples can
be available in various real problems. To this end, we extend the
-wsPLS model and propose two multi-view wsPLS models for
multi-view data fusion. We develop an efficient iterative algorithm for each
multi-view wsPLS model and show its convergence property. As well as numerical
and biomedical data experiments demonstrate the efficiency of the proposed
methods
- âŠ