
    A note on sparse least-squares regression

    We compute a \emph{sparse} solution to the classical least-squares problem $\min_x \|Ax - b\|$, where $A$ is an arbitrary matrix. We describe a novel algorithm for this sparse least-squares problem. The algorithm operates in two stages: first, it selects columns from $A$; then, it solves a least-squares problem using only the selected columns. The column selection algorithm that we use is known to perform well for the well-studied column subset selection problem. The contribution of this article is to show that it gives favorable results for sparse least-squares as well. Specifically, we prove that the solution vector obtained by our algorithm is close to the solution vector obtained via what is known as the "SVD-truncated regularization approach".
    Comment: Information Processing Letters, to appear
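    A minimal sketch of the two-stage approach described in this abstract, assuming NumPy/SciPy. The paper's actual column selection routine is not given here, so column-pivoted QR serves as a stand-in selector, and the function name sparse_least_squares and sparsity parameter k are illustrative.

        import numpy as np
        from scipy.linalg import qr

        def sparse_least_squares(A, b, k):
            """Two-stage sketch: select k columns of A, then solve least-squares
            restricted to those columns, yielding x with at most k nonzeros."""
            # Stage 1: column selection. Column-pivoted QR is a stand-in for
            # the paper's column subset selection algorithm (an assumption).
            _, _, piv = qr(A, mode="economic", pivoting=True)
            cols = piv[:k]
            # Stage 2: ordinary least-squares on the selected columns only.
            x_sub, *_ = np.linalg.lstsq(A[:, cols], b, rcond=None)
            # Embed the k coefficients back into a length-n solution vector.
            x = np.zeros(A.shape[1])
            x[cols] = x_sub
            return x

        # Example: a 5-sparse solution to a random 100 x 30 problem.
        rng = np.random.default_rng(0)
        A = rng.standard_normal((100, 30))
        b = rng.standard_normal(100)
        x = sparse_least_squares(A, b, k=5)
        print(np.count_nonzero(x), np.linalg.norm(A @ x - b))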

    Provable Deterministic Leverage Score Sampling

    We explain theoretically a curious empirical phenomenon: "Approximating a matrix by deterministically selecting a subset of its columns with the corresponding largest leverage scores results in a good low-rank matrix surrogate". To obtain provable guarantees, previous work requires randomized sampling of the columns with probabilities proportional to their leverage scores. In this work, we provide a novel theoretical analysis of deterministic leverage score sampling. We show that such deterministic sampling can be provably as accurate as its randomized counterparts, if the leverage scores follow a moderately steep power-law decay. We support this power-law assumption with empirical evidence that such decay laws are abundant in real-world data sets. We then demonstrate empirically the performance of deterministic leverage score sampling, which often matches or outperforms the state-of-the-art techniques.
    Comment: 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
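    A minimal sketch of deterministic leverage score sampling, assuming the standard definition of rank-$k$ column leverage scores as squared row norms of the top-$k$ right singular vector matrix; the rank parameter k, sample size c, and function name are illustrative.

        import numpy as np

        def deterministic_leverage_sampling(A, k, c):
            """Keep the c columns of A with the largest rank-k leverage scores."""
            # Columns of A correspond to rows of V_k (top-k right singular vectors).
            _, _, Vt = np.linalg.svd(A, full_matrices=False)
            Vk = Vt[:k, :].T                       # shape (n, k)
            # Rank-k leverage score of column j: squared norm of row j of V_k.
            scores = np.sum(Vk ** 2, axis=1)
            # Deterministic step: take the c largest scores, no randomized sampling.
            cols = np.argsort(scores)[::-1][:c]
            return A[:, cols], cols

        # Example on a matrix whose column scales (and leverage scores) decay.
        rng = np.random.default_rng(1)
        A = rng.standard_normal((200, 80)) * (0.8 ** np.arange(80))
        C, cols = deterministic_leverage_sampling(A, k=10, c=20)
        print(C.shape, cols[:5])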

    Optimal CUR Matrix Decompositions

    The CUR decomposition of an $m \times n$ matrix $A$ finds an $m \times c$ matrix $C$ with a subset of $c < n$ columns of $A$, together with an $r \times n$ matrix $R$ with a subset of $r < m$ rows of $A$, as well as a $c \times r$ low-rank matrix $U$ such that the matrix $CUR$ approximates the matrix $A$; that is, $\|A - CUR\|_F^2 \le (1+\epsilon) \|A - A_k\|_F^2$, where $\|\cdot\|_F$ denotes the Frobenius norm and $A_k$ is the best $m \times n$ matrix of rank $k$ constructed via the SVD. We present input-sparsity-time and deterministic algorithms for constructing such a CUR decomposition with $c = O(k/\epsilon)$, $r = O(k/\epsilon)$, and $\mathrm{rank}(U) = k$. Up to constant factors, our algorithms are simultaneously optimal in $c$, $r$, and $\mathrm{rank}(U)$.
    Comment: small revision in Lemma 4
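    A minimal sketch of assembling $C$, $U$, $R$ once column and row subsets are fixed, using the standard choice $U = C^+ A R^+$, which minimizes the Frobenius error for fixed $C$ and $R$; the paper's construction achieving $\mathrm{rank}(U) = k$ in input-sparsity time is more involved, and the index choices below are illustrative.

        import numpy as np

        def cur_from_indices(A, col_idx, row_idx):
            """Assemble a CUR approximation from chosen column and row indices."""
            C = A[:, col_idx]
            R = A[row_idx, :]
            # For fixed C and R, U = pinv(C) @ A @ pinv(R) minimizes ||A - CUR||_F.
            U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
            return C, U, R

        # Example: a rank-12 matrix is recovered exactly from 15 rows and columns.
        rng = np.random.default_rng(2)
        A = rng.standard_normal((60, 12)) @ rng.standard_normal((12, 50))
        C, U, R = cur_from_indices(A, np.arange(15), np.arange(15))
        print(np.linalg.norm(A - C @ U @ R, "fro") / np.linalg.norm(A, "fro"))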