Robust Online Covariance and Sparse Precision Estimation Under Arbitrary Data Corruption
Gaussian graphical models are widely used to represent correlations among
entities but remain vulnerable to data corruption. In this work, we introduce a
modified trimmed-inner-product algorithm to robustly estimate the covariance in
an online scenario even in the presence of arbitrary and adversarial data
attacks. At each time step, data points arrive, nominally drawn independently and
identically from a multivariate Gaussian distribution; however, a certain
fraction of these points may have been arbitrarily corrupted. We
propose an online algorithm to estimate the sparse inverse covariance (i.e.,
precision) matrix despite this corruption. We provide error bounds and
convergence guarantees for the estimates to the true precision matrix under our
algorithms.
Comment: 9 pages, 4 figures, 62nd IEEE Conference on Decision and Control (CDC)
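
The abstract does not spell out the trimming rule or the online update, so the following is only a minimal batch sketch, in Python, of a trimmed-inner-product covariance estimate; the median centering, the magnitude-based trimming, and the assumption that the corruption fraction eps is known are illustrative choices, not the paper's exact algorithm.

import numpy as np

def trimmed_inner_product(u, v, eps):
    # Average the elementwise products after discarding the eps*n
    # largest-magnitude ones, limiting the leverage of corrupted samples.
    n = u.shape[0]
    prods = u * v
    k = int(np.floor(eps * n))
    if k > 0:
        prods = prods[np.argsort(np.abs(prods))[: n - k]]
    return prods.mean()

def robust_covariance(X, eps):
    # Entrywise robust covariance estimate for an (n, p) sample matrix X.
    n, p = X.shape
    Xc = X - np.median(X, axis=0)  # robust centering via coordinatewise medians
    S = np.empty((p, p))
    for i in range(p):
        for j in range(i, p):
            S[i, j] = S[j, i] = trimmed_inner_product(Xc[:, i], Xc[:, j], eps)
    return S

A sparse precision estimate could then be obtained by handing such a robustified covariance matrix to a graphical-lasso-style solver; an online variant would instead maintain running trimmed statistics as new batches arrive.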
Robust Regression via Hard Thresholding
We study the problem of Robust Least Squares Regression (RLSR) where several
response variables can be adversarially corrupted. More specifically, for a
data matrix X \in R^{p \times n} and an underlying model w*, the response vector
is generated as y = X^T w* + b, where b \in R^n is the corruption vector
supported over at most C·n coordinates. Existing exact recovery results for RLSR focus
solely on L1-penalty based convex formulations and impose relatively strict
model assumptions such as requiring the corruptions b to be selected
independently of X.
In this work, we study a simple hard-thresholding algorithm called TORRENT,
which, under mild conditions on X, can recover w* exactly even if b corrupts
the response variables in an adversarial manner, i.e., both the support and
entries of b are selected adversarially after observing X and w*. Our results
hold under deterministic assumptions which are satisfied if X is sampled from
any sub-Gaussian distribution. Finally, unlike existing results that apply only
to a fixed w* generated independently of X, our results are universal and hold
for any w* \in R^p.
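
As a concrete illustration of the hard-thresholding idea, here is a minimal Python sketch in the spirit of TORRENT's fully corrective variant; the iteration count, the convention that rows of X are samples (i.e., X^T in the notation above), and the assumption that an upper bound beta on the corruption fraction is known are all assumptions of this sketch.

import numpy as np

def torrent_fc(X, y, beta, n_iters=50):
    # X: (n, p) design with rows as samples; y: (n,) responses;
    # beta: assumed upper bound on the fraction of corrupted responses.
    n, p = X.shape
    keep = int(np.ceil((1 - beta) * n))  # number of points treated as clean
    active = np.arange(n)
    w = np.zeros(p)
    for _ in range(n_iters):
        # Fully corrective step: least squares on the current active set.
        w, *_ = np.linalg.lstsq(X[active], y[active], rcond=None)
        # Hard thresholding: keep the points with the smallest residuals.
        active = np.argsort(np.abs(y - X @ w))[:keep]
    return w

The thresholding step is what tolerates an adversarially chosen support of b: points with large residuals are simply dropped from the next fit, whatever their origin.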
Next, we propose gradient-descent-based extensions of TORRENT that can scale
efficiently to large-scale problems, such as high-dimensional sparse recovery,
and we prove similar recovery guarantees for these extensions. Empirically, we
find that TORRENT, and more so its extensions, offers significantly faster recovery
than the state-of-the-art L1 solvers. For instance, even on moderate-sized
datasets (with p = 50K) with around 40% corrupted responses, a variant of our
proposed method called TORRENT-HYB is more than 20x faster than the best L1
solver.
Comment: 24 pages, 3 figures
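
The abstract gives no details of the gradient-descent extensions, so the following is only a plausible sketch in which the full least-squares solve of torrent_fc above is replaced by a single gradient step per iteration; the step-size rule is an assumption.

import numpy as np

def torrent_gd(X, y, beta, n_iters=200):
    # Hypothetical gradient variant: one gradient step on the active set
    # per iteration instead of a full least-squares solve.
    n, p = X.shape
    keep = int(np.ceil((1 - beta) * n))
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # conservative 1/||X||_2^2 step size
    active = np.arange(n)
    w = np.zeros(p)
    for _ in range(n_iters):
        grad = X[active].T @ (X[active] @ w - y[active])
        w = w - step * grad
        active = np.argsort(np.abs(y - X @ w))[:keep]
    return w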