Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization
Principal component analysis (PCA) is widely used for dimensionality
reduction, with well-documented merits in various applications involving
high-dimensional data, including computer vision, preference measurement, and
bioinformatics. In this context, the fresh look advocated here carries over
benefits from variable selection and compressive sampling to robustify PCA
against outliers. A least-trimmed squares estimator of a low-rank bilinear
factor analysis model is shown to be closely related to that obtained from an
$\ell_0$-(pseudo)norm-regularized criterion encouraging sparsity in a matrix
explicitly modeling the outliers. This connection suggests robust PCA schemes
based on convex relaxation, which lead naturally to a family of robust
estimators encompassing Huber's optimal M-class as a special case. Outliers are
identified by tuning a regularization parameter, which amounts to controlling
sparsity of the outlier matrix along the whole robustification path of (group)
least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its
neat ties to robust statistics, the developed outlier-aware PCA framework is
versatile enough to accommodate novel and scalable algorithms to: i) track the
low-rank signal subspace robustly, as new data are acquired in real time; and
ii) determine principal components robustly in (possibly) infinite-dimensional
feature spaces. Synthetic and real data tests corroborate the effectiveness of
the proposed robust PCA schemes, when used to identify aberrant responses in
personality assessment surveys, as well as unveil communities in social
networks, and intruders from video surveillance data. Comment: 30 pages, submitted to IEEE Transactions on Signal Processing
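The outlier-sparsity idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a cost of the form ||Y - L - O||_F^2 + lam * sum_n ||o_n||_2, where L is a rank-constrained fit and each row o_n of O models one possibly outlying sample, and it alternates between ordinary PCA and row-wise (group) soft-thresholding. The function name and all parameters are hypothetical.

```python
import numpy as np

def robust_pca_outlier_sparse(Y, rank, lam, n_iter=50):
    """Illustrative alternating scheme (an assumption, not the paper's code).

    Alternates a rank-`rank` PCA fit of Y - O with row-wise (group)
    soft-thresholding of the residual, which updates the outlier matrix O.
    """
    Y = np.asarray(Y, dtype=float)
    O = np.zeros_like(Y)
    for _ in range(n_iter):
        # Low-rank step: ordinary PCA on the outlier-compensated data.
        X = Y - O
        mean = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
        L = mean + (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Outlier step: minimize ||R - O||_F^2 + lam * sum_n ||o_n||_2,
        # whose solution shrinks each residual row toward zero.
        R = Y - L
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        shrink = np.maximum(1.0 - lam / (2.0 * np.maximum(norms, 1e-12)), 0.0)
        O = shrink * R
    return L, O
```

Rows of O that remain nonzero are the samples flagged as outliers. Increasing `lam` flags fewer rows, which mirrors the role of the regularization parameter described in the abstract.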
Projected Randomized Smoothing for Certified Adversarial Robustness
Randomized smoothing is the current state-of-the-art method for producing
provably robust classifiers. While randomized smoothing typically yields robust
$\ell_2$-ball certificates, recent research has generalized provable robustness
to different norm balls as well as anisotropic regions. This work considers a
classifier architecture that first projects onto a low-dimensional
approximation of the data manifold and then applies a standard classifier. By
performing randomized smoothing in the low-dimensional projected space, we
characterize the certified region of our smoothed composite classifier back in
the high-dimensional input space and prove a tractable lower bound on its
volume. We show experimentally on CIFAR-10 and SVHN that classifiers without
the initial projection are vulnerable to perturbations that are normal to the
data manifold and yet are captured by the certified regions of our method. We
compare the volume of our certified regions against various baselines and show
that our method improves on the state of the art by many orders of magnitude. Comment: Transactions on Machine Learning Research (TMLR) 202
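The project-then-smooth architecture can be sketched as follows. This is a rough illustration only: it assumes a linear projection onto an orthonormal basis, isotropic Gaussian noise in the projected space, and a Monte Carlo majority vote. It does not compute any certificate, and the function and parameter names are hypothetical.

```python
import numpy as np

def smoothed_predict(x, basis, base_classifier, sigma, n_samples=1000, seed=0):
    """Majority vote of a base classifier under Gaussian noise added in the
    projected space. `basis` is assumed to have orthonormal columns."""
    rng = np.random.default_rng(seed)
    z = basis.T @ x  # low-dimensional coordinates of the input
    noise = rng.normal(scale=sigma, size=(n_samples, z.size))
    # base_classifier is assumed to return a non-negative integer label.
    labels = np.array([base_classifier(z + e) for e in noise])
    counts = np.bincount(labels)
    return int(counts.argmax()), counts
```

Because the noise is added after projection, any perturbation orthogonal to the columns of `basis` leaves `z`, and therefore the vote, unchanged. This is the intuition behind the abstract's remark about perturbations normal to the data manifold.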
Timescale effect estimation in time-series studies of air pollution and health: A Singular Spectrum Analysis approach
A wealth of epidemiological data suggests an association between
mortality/morbidity from pulmonary and cardiovascular adverse events and air
pollution, but despite the abundance of data, uncertainty remains as to the
extent implied by those associations. In this paper we describe an
approach based on Singular Spectrum Analysis (SSA) to decompose the
time-series of particulate matter concentration into a set of exposure
variables, each one representing a different timescale. We implement our
methodology to investigate both acute and long-term effects of
exposure on morbidity from respiratory causes within the urban area of Bari,
Italy. Comment: Published at http://dx.doi.org/10.1214/07-EJS123 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
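A basic SSA decomposition of the kind referred to in this abstract can be sketched in a few lines. This is a generic textbook version (embedding, SVD, diagonal averaging), not the authors' implementation; the function name and the `window` parameter are assumptions made for illustration.

```python
import numpy as np

def ssa_components(x, window):
    """Decompose a series into elementary SSA components.

    Returns an array with one reconstructed series per singular value,
    plus the singular values themselves.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    X = np.column_stack([x[j:j + window] for j in range(k)])
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    comps = []
    for i in range(s.size):
        Xi = s[i] * np.outer(U[:, i], Vt[i])
        # Diagonal averaging: average each anti-diagonal back into a series.
        comps.append(np.array([Xi[::-1].diagonal(d).mean()
                               for d in range(-window + 1, k)]))
    return np.array(comps), s
```

Summing all components recovers the original series. Grouping components by their dominant period would give exposure variables at different timescales, although how the grouping is done is a modeling choice not specified in the abstract.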
Linear, Deterministic, and Order-Invariant Initialization Methods for the K-Means Clustering Algorithm
Over the past five decades, k-means has become the clustering algorithm of
choice in many application domains primarily due to its simplicity, time/space
efficiency, and invariance to the ordering of the data points. Unfortunately,
the algorithm's sensitivity to the initial selection of the cluster centers
remains its most serious drawback. Numerous initialization methods have
been proposed to address this drawback. Many of these methods, however, have
time complexity superlinear in the number of data points, which makes them
impractical for large data sets. On the other hand, linear methods are often
random and/or sensitive to the ordering of the data points. These methods are
generally unreliable in that the quality of their results is unpredictable.
Therefore, it is common practice to perform multiple runs of such methods and
take the output of the run that produces the best results. Such a practice,
however, greatly increases the computational requirements of the otherwise
highly efficient k-means algorithm. In this chapter, we investigate the
empirical performance of six linear, deterministic (non-random), and
order-invariant k-means initialization methods on a large and diverse
collection of data sets from the UCI Machine Learning Repository. The results
demonstrate that two relatively unknown hierarchical initialization methods due
to Su and Dy outperform the remaining four methods with respect to two
objective effectiveness criteria. In addition, a recent method due to Erisoglu
et al. performs surprisingly poorly. Comment: 21 pages, 2 figures, 5 tables, Partitional Clustering Algorithms
(Springer, 2014). arXiv admin note: substantial text overlap with
arXiv:1304.7465, arXiv:1209.196
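To make the notion of a linear, deterministic, order-invariant initialization concrete, here is a hedged sketch of a variance-based divisive scheme. It is written in the spirit of hierarchical initialization methods but is an assumption for illustration, not a faithful reproduction of any method evaluated in the chapter. It repeatedly splits the cluster with the largest sum of squared errors along its highest-variance coordinate, at that coordinate's mean.

```python
import numpy as np

def variance_split_init(X, k):
    """Deterministic k-means seeding by recursive variance-based splitting.

    Assumes k does not exceed the number of distinct points in X.
    """
    clusters = [np.asarray(X, dtype=float)]
    while len(clusters) < k:
        # Choose the cluster with the largest sum of squared errors.
        sse = [((c - c.mean(axis=0)) ** 2).sum() for c in clusters]
        c = clusters.pop(int(np.argmax(sse)))
        # Split it along the coordinate of largest variance, at the mean.
        j = int(np.argmax(c.var(axis=0)))
        mask = c[:, j] <= c[:, j].mean()
        clusters += [c[mask], c[~mask]]
    # The initial centers are the centroids of the resulting groups.
    return np.array([c.mean(axis=0) for c in clusters])
```

No random numbers are drawn, and every decision depends only on means and variances, so reordering the rows of `X` does not change the centers (up to floating-point rounding and ties).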
Sparse recovery by reduced variance stochastic approximation
In this paper, we discuss the application of iterative Stochastic Optimization
routines to the problem of sparse signal recovery from noisy observations. Using
the Stochastic Mirror Descent algorithm as a building block, we develop a
multistage procedure for recovery of sparse solutions to a Stochastic
Optimization problem under assumptions of smoothness and quadratic minoration on
the expected objective. An interesting feature of the proposed algorithm is the
linear convergence of the approximate solution during the preliminary phase of
the routine, when the component of the stochastic error in the gradient observation
that is due to a bad initial approximation of the optimal solution is larger
than the "ideal" asymptotic error component owing to observation noise "at the
optimal solution." We also show how one can straightforwardly enhance the
reliability of the corresponding solution by using Median-of-Means-like
techniques. We illustrate the performance of the proposed algorithms in
application to classical problems of recovery of sparse and low-rank signals in
the linear regression framework. We show, under rather weak assumptions on the
regressor and noise distributions, how they lead to parameter estimates that
obey (up to factors logarithmic in the problem dimension and confidence
level) the best accuracy bounds known to us.
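The Median-of-Means idea mentioned in this abstract can be illustrated on the simplest possible case, a scalar mean estimate. This is a generic sketch under that simplifying assumption, not the aggregation procedure used in the paper; the function name is hypothetical.

```python
import numpy as np

def median_of_means(samples, n_blocks):
    """Split the samples into blocks, average each block, and return the
    median of the block means."""
    samples = np.asarray(samples, dtype=float)
    blocks = np.array_split(samples, n_blocks)
    return float(np.median([b.mean() for b in blocks]))
```

A single wild observation can corrupt at most one block mean, so the median stays close to the bulk of the data while the plain average does not. For example, with nine samples equal to 1 and one equal to 1000, five blocks give an estimate of 1.0 whereas the ordinary mean is 100.9.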
Applied Harmonic Analysis and Sparse Approximation
Efficiently analyzing functions, in particular multivariate functions, is a key problem in applied mathematics. The area of applied harmonic analysis has a significant impact on this problem by providing methodologies both for theoretical questions and for a wide range of applications in technology and science, such as image processing. Approximation theory, in particular the theory of sparse approximations, is closely intertwined with this area, with many exciting recent developments at the intersection of the two. Research topics typically also involve related areas such as convex optimization, probability theory, and Banach space geometry. The workshop was the continuation of a first event in 2012 and was intended to bring together world-leading experts in these areas, to report on recent developments, and to foster new developments and collaborations.