Estimating Sparse Signals Using Integrated Wideband Dictionaries
In this paper, we introduce a wideband dictionary framework for estimating
sparse signals. By formulating integrated dictionary elements spanning bands of
the considered parameter space, one may efficiently find and discard large
parts of the parameter space not active in the signal. After each iteration,
the zero-valued parts of the dictionary may be discarded to allow a refined
dictionary to be formed around the active elements, resulting in a zoomed
dictionary to be used in the following iterations. Implementing this scheme
allows for more accurate estimates, at a much lower computational cost, as
compared to directly forming a larger dictionary spanning the whole parameter
space or performing a zooming procedure using standard dictionary elements.
Different from traditional dictionaries, the wideband dictionary allows for the
use of dictionaries with fewer elements than the number of available samples
without loss of resolution. The technique may be used on both one- and
multi-dimensional signals, and may be exploited to refine several traditional
sparse estimators, here illustrated with the LASSO and the SPICE estimators.
Numerical examples illustrate the improved performance
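
The band-and-zoom idea in this abstract can be illustrated with a minimal sketch. It makes several assumptions not stated above: the signal is a single complex sinusoid, the parameter space is normalized frequency on [0, 1), and an "integrated" dictionary element is taken to be the integral of exp(2j*pi*f*t) over one frequency band. The names `band_atoms`, `most_active_band`, and `refine_edges` are hypothetical, and a plain correlation score stands in for the LASSO and SPICE estimators mentioned in the abstract.

```python
import numpy as np

def band_atoms(n_samples, edges):
    """Integrate exp(2j*pi*f*t) over each band [edges[k], edges[k+1]].

    Assumption: sinusoidal atoms sampled at t = 0, 1, ..., n_samples - 1.
    Returns an (n_samples, n_bands) complex matrix, one column per band.
    """
    edges = np.asarray(edges, dtype=float)
    f_lo, f_hi = edges[:-1], edges[1:]
    t = np.arange(1, n_samples, dtype=float)[:, None]  # t = 1..N-1 as a column
    atoms = np.empty((n_samples, f_lo.size), dtype=complex)
    atoms[0] = f_hi - f_lo  # at t = 0 the integrand is 1
    # Closed-form integral of exp(2j*pi*f*t) with respect to f for t > 0.
    atoms[1:] = (np.exp(2j * np.pi * f_hi * t)
                 - np.exp(2j * np.pi * f_lo * t)) / (2j * np.pi * t)
    return atoms

def most_active_band(y, edges):
    """Index of the band whose integrated atom correlates most with y.

    A greedy stand-in for a sparse estimator, not the paper's method.
    """
    scores = np.abs(band_atoms(len(y), edges).conj().T @ y)
    return int(np.argmax(scores))

def refine_edges(edges, k, n_sub):
    """Zoom step: split band k into n_sub narrower bands."""
    return np.linspace(edges[k], edges[k + 1], n_sub + 1)

n = 64
y = np.exp(2j * np.pi * 0.25 * np.arange(n))  # one sinusoid at f = 0.25
edges = np.linspace(0.0, 1.0, 11)             # ten bands of width 0.1
k = most_active_band(y, edges)                # band covering [0.2, 0.3]
zoomed = refine_edges(edges, k, 10)           # finer grid inside that band
```

Here ten band atoms stand in for a much finer grid of point frequencies, and only the selected band is subdivided for the next pass, which is the source of the computational saving the abstract describes.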
Noisy Subspace Clustering via Thresholding
We consider the problem of clustering noisy high-dimensional data points into
a union of low-dimensional subspaces and a set of outliers. The number of
subspaces, their dimensions, and their orientations are unknown. A
probabilistic performance analysis of the thresholding-based subspace
clustering (TSC) algorithm introduced recently in [1] shows that TSC succeeds
in the noisy case, even when the subspaces intersect. Our results reveal an
explicit tradeoff between the allowed noise level and the affinity of the
subspaces. We furthermore find that the simple outlier detection scheme
introduced in [1] provably succeeds in the noisy case.
Comment: Presented at the IEEE Int. Symp. Inf. Theory (ISIT) 2013, Istanbul,
Turkey. The version posted here corrects a minor error in the published
version. Specifically, the exponent -c n_l in the success probability of
Theorem 1 and in the corresponding proof outline has been corrected to
-c(n_l-1
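
As a rough illustration of clustering by thresholding correlations, the sketch below keeps, for each normalized data point, the q other points with the largest absolute inner product, and then groups points by connected components of the resulting graph. This is a simplification under stated assumptions, not the TSC algorithm as analyzed in the paper: connected components replace the clustering step applied to the thresholded graph, outlier detection is omitted, and the function name `tsc_sketch` and the parameter `q` are hypothetical.

```python
import numpy as np

def tsc_sketch(X, q):
    """Group the columns of X (shape d x n) by thresholded correlations.

    Each point keeps edges to its q largest |<x_i, x_j>| neighbours; the
    labels are the connected components of the symmetrized graph.
    """
    Xn = X / np.linalg.norm(X, axis=0, keepdims=True)  # unit-norm columns
    C = np.abs(Xn.T @ Xn)                              # |inner products|
    np.fill_diagonal(C, -1.0)                          # exclude self-matches
    n = C.shape[0]
    nearest = np.argsort(-C, axis=1)[:, :q]            # q best per point
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), q), nearest.ravel()] = True
    A = A | A.T                                        # undirected graph

    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] >= 0:
            continue
        labels[start] = current
        stack = [start]
        while stack:                                   # depth-first search
            i = stack.pop()
            for j in np.flatnonzero(A[i]):
                if labels[j] < 0:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Noise-free toy data: six points in span{e1, e2}, six in span{e3, e4}.
rng = np.random.default_rng(0)
basis = np.eye(4)
X = np.hstack([basis[:, :2] @ rng.standard_normal((2, 6)),
               basis[:, 2:] @ rng.standard_normal((2, 6))])
labels = tsc_sketch(X, q=5)
```

The toy data uses orthogonal subspaces without noise so that the grouping is unambiguous; the abstract's results concern the harder noisy and intersecting cases, which this sketch does not attempt to reproduce.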
Robust computation of linear models by convex relaxation
Consider a dataset of vector-valued observations that consists of noisy
inliers, which are explained well by a low-dimensional subspace, along with
some number of outliers. This work describes a convex optimization problem,
called REAPER, that can reliably fit a low-dimensional model to this type of
data. This approach parameterizes linear subspaces using orthogonal projectors,
and it uses a relaxation of the set of orthogonal projectors to reach the
convex formulation. The paper provides an efficient algorithm for solving the
REAPER problem, and it documents numerical experiments which confirm that
REAPER can dependably find linear structure in synthetic and natural data. In
addition, when the inliers lie near a low-dimensional subspace, there is a
rigorous theory that describes when REAPER can approximate this subspace.
Comment: Formerly titled "Robust computation of linear models, or How to find
a needle in a haystack
Multiscale principal component analysis
Principal component analysis (PCA) is an important tool in exploring data.
The conventional approach to PCA leads to a solution which favours the
structures with large variances. This is sensitive to outliers and could
obfuscate interesting underlying structures. One of the equivalent definitions
of PCA is that it seeks the subspaces that maximize the sum of squared pairwise
distances between data projections. This definition opens up more flexibility
in the analysis of principal components which is useful in enhancing PCA. In
this paper we introduce scales into PCA by maximizing only the sum of pairwise
distances between projections for pairs of datapoints with distances within a
chosen interval of values [l,u]. The resulting principal component
decompositions in Multiscale PCA depend on the point (l,u) in the plane, and for
each point we define projectors onto principal components. Cluster analysis of
these projectors reveals the structures in the data at various scales. Each
structure is described by the eigenvectors at the medoid point of the cluster
which represents the structure. We also use the distortion of projections as a
criterion for choosing an appropriate scale, especially for data with outliers.
This method was tested on both artificially generated data and real data.
For data with multiscale structures, the method was able to reveal the
different structures of the data and also to reduce the effect of outliers in
the principal component analysis.
Comment: 24 pages, 22 figure
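
The pairwise-distance view of PCA described in this abstract can be sketched as follows. The sketch assumes the squared-distance form of the objective: it sums the outer products of the differences x_i - x_j over pairs whose distance lies in [l, u] and takes the leading eigenvectors of that matrix. The function name `multiscale_pca` and its signature are hypothetical, and the clustering of projectors over the (l,u) plane is not included.

```python
import numpy as np

def multiscale_pca(X, l, u, k):
    """Leading k directions from pairs with l <= ||x_i - x_j|| <= u.

    X has shape (n, d). Assumption: the objective is the sum of squared
    projected pairwise distances over the selected pairs, which equals
    trace(V.T @ S @ V) for the scatter matrix S built below.
    """
    i, j = np.triu_indices(X.shape[0], k=1)      # all pairs with i < j
    diffs = X[i] - X[j]
    dists = np.linalg.norm(diffs, axis=1)
    selected = diffs[(dists >= l) & (dists <= u)]
    S = selected.T @ selected                    # d x d pairwise scatter
    eigvals, eigvecs = np.linalg.eigh(S)         # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]        # keep the k largest
    return eigvals[order], eigvecs[:, order]

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 3)) * np.array([3.0, 1.0, 0.3])

# With every pair included, the result should match ordinary PCA.
vals_all, vecs_all = multiscale_pca(X, 0.0, np.inf, 1)
cov_vals, cov_vecs = np.linalg.eigh(np.cov(X, rowvar=False))
alignment = abs(float(vecs_all[:, 0] @ cov_vecs[:, -1]))

# Restricting to short distances emphasizes small-scale structure.
vals_small, vecs_small = multiscale_pca(X, 0.0, 1.0, 1)
```

Setting the interval to [0, inf) recovers the conventional principal directions, because the sum of pairwise difference outer products is a multiple of the sample covariance; narrower intervals change which pairs contribute and hence which structures dominate.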