Search CORE

3,334 research outputs found

Online Row Sampling

Author: Cohen Michael B.
Musco Cameron
Pachocki Jakub
Publication venue
Publication date: 01/01/2016
Field of study

Finding a small spectral approximation for a tall

n \times d

matrix

A

is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of

A

. Row sampling improves interpretability, saves space when

A

is sparse, and preserves row structure, which is especially important, for example, when

A

represents a graph. However, correctly sampling rows from

A

can be costly when the matrix is large and cannot be stored and processed in memory. Hence, a number of recent publications focus on row sampling in the streaming setting, using little more space than what is required to store the outputted approximation [KL13, KLM+14]. Inspired by a growing body of work on online algorithms for machine learning and data analysis, we extend this work to a more restrictive online setting: we read rows of

A

one by one and immediately decide whether each row should be kept in the spectral approximation or discarded, without ever retracting these decisions. We present an extremely simple algorithm that approximates

A

up to multiplicative error

\epsilon

and additive error

\delta

using

O(d \log d \log(\epsilon||A||_2/\delta)/\epsilon^2)

online samples, with memory overhead proportional to the cost of storing the spectral approximation. We also present an algorithm that uses

O(d^2

) memory but only requires

O(d\log(\epsilon||A||_2/\delta)/\epsilon^2)

samples, which we show is optimal. Our methods are clean and intuitive, allow for lower memory usage than prior work, and expose new theoretical properties of leverage score based matrix approximation

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Fast filtering and animation of large dynamic networks

Author: Aiello Luca Maria
Grabowicz Przemyslaw A.
Menczer Filippo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/10/2014
Field of study

Detecting and visualizing what are the most relevant changes in an evolving network is an open challenge in several domains. We present a fast algorithm that filters subsets of the strongest nodes and edges representing an evolving weighted graph and visualize it by either creating a movie, or by streaming it to an interactive network visualization tool. The algorithm is an approximation of exponential sliding time-window that scales linearly with the number of interactions. We compare the algorithm against rectangular and exponential sliding time-window methods. Our network filtering algorithm: i) captures persistent trends in the structure of dynamic weighted networks, ii) smoothens transitions between the snapshots of dynamic network, and iii) uses limited memory and processor time. The algorithm is publicly available as open-source software.Comment: 6 figures, 2 table

arXiv.org e-Print Archive

Springer - Publisher Connector