lp-Recovery of the Most Significant Subspace among Multiple Subspaces with Outliers

A Bargiela; DL Donoho; DL Donoho; E Arias-Castro; E Arias-Castro; EJ Candès; EJ Candès; EJ Candès; F Qi; G David; G Golub; G Lerman; GA Watson; Gilad Lerman; H Nyquist; H Späth; HL Harter; J Yan; M Fischler; M McCoy; M Soltanolkotabi; MR Osborne; P Mattila; PHS Torr; PJ Huber; PJ Rousseeuw; RA Maronna; SJ Szarek; T Zhang; Teng Zhang; WE Deming; Y Dodge; Y-C Wong



Publication date: 13 January 2014
Publisher: 'Springer Science and Business Media LLC'

Abstract

We assume data sampled from a mixture of d-dimensional linear subspaces with spherically symmetric distributions within each subspace and an additional outlier component with a spherically symmetric distribution within the ambient space (for simplicity we may assume that all distributions are uniform on their corresponding unit spheres). We also assume mixture weights for the different components. We say that one of the underlying subspaces of the model is most significant if its mixture weight is higher than the sum of the mixture weights of all other subspaces. We study the recovery of the most significant subspace by minimizing the lp-averaged distances of data points from d-dimensional subspaces, where p>0. Unlike other lp minimization problems, this minimization is non-convex for all p>0 and thus requires different methods for its analysis. We show that if 0<p<=1, then for any fraction of outliers the most significant subspace can be recovered by lp minimization with overwhelming probability (which depends on the generating distribution and its parameters). We also show that when small noise is added around the underlying subspaces, the most significant subspace can be nearly recovered by lp minimization for any 0<p<=1, with an error proportional to the noise level. On the other hand, if p>1 and there is more than one underlying subspace, then with overwhelming probability the most significant subspace cannot be recovered or nearly recovered. This last result does not require spherically symmetric outliers.

Comment: This is a revised version of the part of 1002.1994 that deals with single subspace recovery. V3: Improved estimates (in particular for Lemma 3.1 and for estimates relying on it), asymptotic dependence of probabilities and constants on D and d, and further clarifications; for simplicity it assumes uniform distributions on spheres. V4: minor revision for the published version.
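The quantity being minimized can be illustrated with a small numerical sketch. The code below is not the paper's method or experiment. It only evaluates the lp-averaged distance at a few candidate subspaces on synthetic data drawn as the abstract describes (uniform on unit spheres); it does not attempt the non-convex minimization itself. The function names, the dimensions D = 10 and d = 2, the choice of coordinate planes as the underlying subspaces, and the sample counts (mixture weights 0.5, 0.2 and 0.3 for outliers) are all assumptions made for illustration.

```python
import numpy as np

def sample_sphere_in_subspace(rng, basis, n):
    """Draw n points uniformly from the unit sphere of span(basis).

    basis is a D x k matrix with orthonormal columns (hypothetical helper).
    """
    coeffs = rng.standard_normal((n, basis.shape[1]))
    coeffs /= np.linalg.norm(coeffs, axis=1, keepdims=True)
    return coeffs @ basis.T

def lp_energy(points, basis, p):
    """Average of dist(x, L)^p over the rows x of points, with L = span(basis)."""
    residual = points - (points @ basis) @ basis.T  # component orthogonal to L
    return np.mean(np.linalg.norm(residual, axis=1) ** p)

rng = np.random.default_rng(0)
D, d = 10, 2  # assumed ambient and subspace dimensions

# Two assumed underlying subspaces: orthogonal coordinate planes.
L1 = np.eye(D)[:, 0:2]   # most significant: weight 0.5 > 0.2
L2 = np.eye(D)[:, 2:4]

X = np.vstack([
    sample_sphere_in_subspace(rng, L1, 500),         # inliers of L1
    sample_sphere_in_subspace(rng, L2, 200),         # points on the other subspace
    sample_sphere_in_subspace(rng, np.eye(D), 300),  # outliers on the ambient sphere
])

# A random d-dimensional candidate, via reduced QR of a Gaussian matrix.
L_rand, _ = np.linalg.qr(rng.standard_normal((D, d)))

for p in (0.5, 1.0):
    print(f"p={p}: "
          f"L1={lp_energy(X, L1, p):.3f}  "
          f"L2={lp_energy(X, L2, p):.3f}  "
          f"random={lp_energy(X, L_rand, p):.3f}")
```

In this setup the energy at L1 comes out lower than at L2, since half of the points lie exactly on L1 and contribute zero distance. Comparing a handful of candidates this way says nothing about the global minimizer, which is what the recovery results in the abstract concern.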


Crossref

info:doi/10.1007%2Fs00365-014-...

Last time updated on 03/12/2019