Search CORE

14 research outputs found

Conditioning of Leverage Scores and Computation by QR Decomposition

Authors: John T. Holodnak, Ilse C. F. Ipsen, Thomas A. Wentworth
Publication date: 01/01/2015

The leverage scores of a full-column rank matrix A are the squared row norms of any orthonormal basis for range(A). We show that corresponding leverage scores of two matrices A and A + \Delta A are close in the relative sense, if they have large magnitude and if all principal angles between the column spaces of A and A + \Delta A are small. We also show three classes of bounds that are based on perturbation results of QR decompositions. They demonstrate that relative differences between individual leverage scores strongly depend on the particular type of perturbation \Delta A. The bounds imply that the relative accuracy of an individual leverage score depends on: its magnitude and the two-norm condition number of A, if \Delta A is a general perturbation; the two-norm condition number of A, if \Delta A is a perturbation with the same norm-wise row-scaling as A; (to first order) neither condition number nor leverage score magnitude, if \Delta A is a component-wise row-scaled perturbation. Numerical experiments confirm the qualitative and quantitative accuracy of our bounds.

Comment: This version has been accepted to SIMAX but has not yet gone through copy editing.
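The definition in this abstract (squared row norms of any orthonormal basis for range(A)) can be illustrated with a minimal NumPy sketch. The function name `leverage_scores` and the random test matrix are assumptions made for illustration and do not come from the paper; the sketch only checks that a QR basis and an SVD basis give the same scores.

```python
import numpy as np

def leverage_scores(A):
    """Squared row norms of an orthonormal basis for range(A).

    Assumes A has full column rank; the basis comes from a thin QR factorization.
    """
    Q, _ = np.linalg.qr(A, mode="reduced")
    return np.sum(Q**2, axis=1)

# Hypothetical test data: a tall random matrix with 4 columns.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 4))
scores = leverage_scores(A)

# Any orthonormal basis gives the same scores, e.g. the left singular vectors.
U, _, _ = np.linalg.svd(A, full_matrices=False)
scores_svd = np.sum(U**2, axis=1)
```

Each score lies between 0 and 1, and the scores sum to the number of columns of A, since they are the diagonal of the orthogonal projector onto range(A).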

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Iterative Row Sampling

Authors: Mu Li, Gary L. Miller, Richard Peng
Publication date: 01/01/2013

There has been significant recent interest and progress in algorithms that solve regression problems involving tall and thin matrices in input-sparsity time. These algorithms find a shorter equivalent of an n \times d matrix where n \gg d, which allows one to solve a poly(d)-sized problem instead. In practice, the best performance is often obtained by invoking these routines in an iterative fashion. We show that these iterative methods can be adapted to give theoretical guarantees comparable to, and better than, the current state of the art. Our approaches are based on computing the importances of the rows, known as leverage scores, in an iterative manner. We show that alternating between computing a short matrix estimate and finding more accurate approximate leverage scores leads to a series of geometrically smaller instances. This gives an algorithm that runs in O(nnz(A) + d^{\omega + \theta} \epsilon^{-2}) time for any \theta > 0, where the d^{\omega + \theta} term is comparable to the cost of solving a regression problem on the small approximation. Our results are built upon the close connection between randomized matrix algorithms, iterative methods, and graph sparsification.

Comment: 26 pages, 2 figures
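The idea of replacing a tall matrix by a shorter one through row sampling can be sketched as follows. This is a single round of sampling with exact leverage scores, not the iterative scheme of the paper; the function name `sample_rows_by_leverage`, the sample size, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def sample_rows_by_leverage(A, b, s, rng):
    """Draw s rows with probability proportional to leverage score and rescale them."""
    Q, _ = np.linalg.qr(A, mode="reduced")
    lev = np.sum(Q**2, axis=1)
    p = lev / lev.sum()
    idx = rng.choice(A.shape[0], size=s, replace=True, p=p)
    # Rescaling by 1/sqrt(s * p_i) keeps the sampled Gram matrix unbiased.
    scale = 1.0 / np.sqrt(s * p[idx])
    return A[idx] * scale[:, None], b[idx] * scale

# Hypothetical regression problem with n >> d.
rng = np.random.default_rng(1)
n, d = 5000, 5
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

SA, Sb = sample_rows_by_leverage(A, b, s=500, rng=rng)
x_small, *_ = np.linalg.lstsq(SA, Sb, rcond=None)  # solve the short problem
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)     # reference solution

res_small = np.linalg.norm(A @ x_small - b)
res_full = np.linalg.norm(A @ x_full - b)
```

Comparing `res_small` with `res_full` shows how much residual is lost by solving the sampled problem. Computing exact scores by QR already costs O(nd^2), which is why the paper works with approximate scores that are refined iteratively.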

arXiv.org e-Print Archive

Randomized Dimensionality Reduction for k-means Clustering

Authors: Christos Boutsidis, Petros Drineas, Michael W. Mahoney, Anastasios Zouzias
Publication date: 01/01/2013

We study the topic of dimensionality reduction for k-means clustering. Dimensionality reduction encompasses the union of two approaches: feature selection and feature extraction. A feature selection based algorithm for k-means clustering selects a small subset of the input features and then applies k-means clustering on the selected features. A feature extraction based algorithm for k-means clustering constructs a small set of new artificial features and then applies k-means clustering on the constructed features. Despite the significance of k-means clustering as well as the wealth of heuristic methods addressing it, provably accurate feature selection methods for k-means clustering are not known. On the other hand, two provably accurate feature extraction methods for k-means clustering are known in the literature; one is based on random projections and the other is based on the singular value decomposition (SVD). This paper makes further progress towards a better understanding of dimensionality reduction for k-means clustering. Namely, we present the first provably accurate feature selection method for k-means clustering and, in addition, we present two feature extraction methods. The first feature extraction method is based on random projections and it improves upon the existing results in terms of time complexity and number of features needed to be extracted. The second feature extraction method is based on fast approximate SVD factorizations and it also improves upon the existing results in terms of time complexity. The proposed algorithms are randomized and provide constant-factor approximation guarantees with respect to the optimal k-means objective value.

Comment: IEEE Transactions on Information Theory, to appear
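Feature extraction by random projection followed by k-means can be sketched as below. This is an illustrative pipeline only: the Gaussian projection, the plain Lloyd iteration, the helper names `lloyd_labels` and `kmeans_cost`, and the synthetic data are all assumptions, and the sketch does not reproduce the specific constructions or guarantees of the paper.

```python
import numpy as np

def lloyd_labels(X, k, rng, iters=50):
    """Plain Lloyd iterations with random initial centers; returns cluster labels."""
    centers = X[rng.choice(X.shape[0], size=k, replace=False)]
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans_cost(X, labels, k):
    """k-means objective of a partition, measured with the cluster means in X's space."""
    cost = 0.0
    for j in range(k):
        members = X[labels == j]
        if len(members) > 0:
            cost += ((members - members.mean(axis=0)) ** 2).sum()
    return cost

# Hypothetical data: three well-separated groups in 200 dimensions.
rng = np.random.default_rng(2)
k, d, r = 3, 200, 20
means = 10.0 * rng.standard_normal((k, d))
X = np.vstack([m + rng.standard_normal((100, d)) for m in means])

# Feature extraction: project the d features onto r random directions.
P = rng.standard_normal((d, r)) / np.sqrt(r)
Y = X @ P

labels_proj = lloyd_labels(Y, k, rng)         # cluster in the reduced space
cost_proj = kmeans_cost(X, labels_proj, k)    # evaluate in the original space
cost_full = kmeans_cost(X, lloyd_labels(X, k, rng), k)
total_ss = ((X - X.mean(axis=0)) ** 2).sum()  # cost of a single cluster
```

The partition is computed on the r extracted features but scored on the original features, which is the comparison the approximation guarantees in the abstract refer to.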

arXiv.org e-Print Archive

CiteSeerX

Algorithmic and Statistical Perspectives on Large-Scale Data Analysis

Author: Michael W. Mahoney
Publication date: 08/10/2010

In recent years, ideas from statistics and scientific computing have begun to interact in increasingly sophisticated and fruitful ways with ideas from computer science and the theory of algorithms to aid in the development of improved worst-case algorithms that are useful for large-scale scientific and Internet data analysis problems. In this chapter, I will describe two recent examples---one having to do with selecting good columns or features from a (DNA Single Nucleotide Polymorphism) data matrix, and the other having to do with selecting good clusters or communities from a data graph (representing a social or information network)---that drew on ideas from both areas and that may serve as a model for exploiting complementary algorithmic and statistical perspectives in order to solve applied large-scale data analysis problems.

Comment: 33 pages. To appear in Uwe Naumann and Olaf Schenk, editors, "Combinatorial Scientific Computing," Chapman and Hall/CRC Press, 201

arXiv.org e-Print Archive

CiteSeerX

Optimal CUR Matrix Decompositions

Authors: C. Boutsidis, P. Drineas, M. Gu, V. Guruswami, S. Wang
Publication date: 16/07/2014

The CUR decomposition of an m \times n matrix A finds an m \times c matrix C with a subset of c < n columns of A, together with an r \times n matrix R with a subset of r < m rows of A, as well as a c \times r low-rank matrix U such that the matrix C U R approximates the matrix A, that is, || A - CUR ||_F^2 \le (1+\epsilon) || A - A_k||_F^2, where ||.||_F denotes the Frobenius norm and A_k is the best m \times n matrix of rank k constructed via the SVD. We present input-sparsity-time and deterministic algorithms for constructing such a CUR decomposition where c=O(k/\epsilon) and r=O(k/\epsilon) and rank(U) = k. Up to constant factors, our algorithms are simultaneously optimal in c, r, and rank(U).

Comment: small revision in lemma 4.
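The shapes involved in a CUR decomposition can be made concrete with a simple baseline. The sketch below samples columns and rows by rank-k leverage scores and sets U = C^+ A R^+; this is a common textbook construction chosen here as an assumption, not the optimal algorithms of the paper, and it does not enforce rank(U) = k. The function name `cur_sketch` and the test matrix are hypothetical.

```python
import numpy as np

def cur_sketch(A, k, c, r, rng):
    """Baseline CUR: leverage-score sampling of c columns and r rows, U = C^+ A R^+."""
    U_full, _, Vt_full = np.linalg.svd(A, full_matrices=False)
    U_k, Vt_k = U_full[:, :k], Vt_full[:k, :]
    col_p = np.sum(Vt_k**2, axis=0)
    row_p = np.sum(U_k**2, axis=1)
    col_p, row_p = col_p / col_p.sum(), row_p / row_p.sum()
    cols = rng.choice(A.shape[1], size=c, replace=False, p=col_p)
    rows = rng.choice(A.shape[0], size=r, replace=False, p=row_p)
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

# Hypothetical input: a 60 x 40 matrix that is close to rank 5.
rng = np.random.default_rng(3)
m, n, k = 60, 40, 5
A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
A += 0.01 * rng.standard_normal((m, n))

C, U, R = cur_sketch(A, k=k, c=15, r=15, rng=rng)
err_cur = np.linalg.norm(A - C @ U @ R, "fro")

# Best rank-k error from the SVD, for comparison with the bound in the abstract.
s = np.linalg.svd(A, compute_uv=False)
err_best = np.sqrt(np.sum(s[k:] ** 2))
```

The ratio `err_cur**2 / err_best**2` plays the role of the factor (1+\epsilon) in the abstract's bound, although this baseline carries no guarantee that the factor is small for a given c and r.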

arXiv.org e-Print Archive

CiteSeerX

Crossref

Fast approximation of matrix coherence and statistical leverage

Authors: David P. Woodruff, Malik Magdon-Ismail, Mehryar Mohri, Michael W. Mahoney, Petros Drineas
Publication date: 01/01/2011

The statistical leverage scores of a matrix A are the squared row-norms of the matrix containing its (top) left singular vectors and the coherence is the largest leverage score. These quantities are of interest in recently-popular problems such as matrix completion and Nyström-based low-rank matrix approximation as well as in large-scale statistical data analysis applications more generally; moreover, they are of interest since they define the key structural nonuniformity that must be dealt with in developing fast randomized matrix algorithms. Our main result is a randomized algorithm that takes as input an arbitrary n \times d matrix A, with n \gg d, and that returns as output relative-error approximations to all n of the statistical leverage scores. The proposed algorithm runs (under assumptions on the precise values of n and d) in O(n d \log n) time, as opposed to the O(nd^2) time required by the naïve algorithm that involves computing an orthogonal basis for the range of A. Our analysis may be viewed in terms of computing a relative-error approximation to an underconstrained least-squares approximation problem, or, relatedly, it may be viewed as an application of Johnson-Lindenstrauss type ideas. Several practically-important extensions of our basic result are also described, including the approximation of so-called cross-leverage scores, the extension of these ideas to matrices with n \approx d, and the extension to streaming environments.

Comment: 29 pages; conference version is in ICML; journal version is in JMLR
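The contrast between the naïve O(nd^2) computation and a sketch-based approximation can be illustrated as follows. This is a simplified variant under stated assumptions: it uses a dense Gaussian sketch instead of a fast structured transform and omits the second Johnson-Lindenstrauss projection, so it shows the approximation idea but not the O(n d \log n) running time. The function names and sizes are hypothetical.

```python
import numpy as np

def exact_leverage(A):
    """Naive computation: orthonormal basis via thin QR, then squared row norms."""
    Q, _ = np.linalg.qr(A, mode="reduced")
    return np.sum(Q**2, axis=1)

def approx_leverage(A, sketch_rows, rng):
    """Approximate scores as squared row norms of A R^{-1}, with R from a sketch of A."""
    n = A.shape[0]
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    _, R = np.linalg.qr(S @ A, mode="reduced")
    # Solve X R = A for X without forming R^{-1} explicitly.
    X = np.linalg.solve(R.T, A.T).T
    return np.sum(X**2, axis=1)

# Hypothetical tall matrix with n >> d.
rng = np.random.default_rng(4)
n, d = 2000, 5
A = rng.standard_normal((n, d))

lev = exact_leverage(A)
lev_hat = approx_leverage(A, sketch_rows=1000, rng=rng)

coherence = lev.max()  # the coherence is the largest leverage score
max_rel_err = np.max(np.abs(lev_hat - lev) / lev)
```

The quantity `max_rel_err` measures the relative-error quality that the abstract refers to; increasing `sketch_rows` tightens it at the cost of a larger sketch.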

arXiv.org e-Print Archive

CiteSeerX