
249 research outputs found

Quilting Stochastic Kronecker Product Graphs to Generate Multiplicative Attribute Graphs

Authors: Vishwanathan S. V. N., Yun Hyokun
Publication date: 09/02/2012

We describe the first sub-quadratic sampling algorithm for the Multiplicative Attribute Graph Model (MAGM) of Kim and Leskovec (2010). We exploit the close connection between MAGM and the Kronecker Product Graph Model (KPGM) of Leskovec et al. (2010), and show that to sample a graph from a MAGM it suffices to sample a small number of KPGM graphs and "quilt" them together. Under a restricted set of technical conditions our algorithm runs in $O((\log_2(n))^3 |E|)$ time, where $n$ is the number of nodes and $|E|$ is the number of edges in the sampled graph. We demonstrate the scalability of our algorithm via extensive empirical evaluation; we can sample a MAGM graph with 8 million nodes and 20 billion edges in under 6 hours.
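
The connection between MAGM and KPGM mentioned in the abstract can be illustrated with a small sketch. It assumes the usual definitions of the two models: in MAGM each node carries d binary attributes and the edge probability is a product of per-attribute affinity entries, and KPGM is the special case in which node i's attributes are the bits of i. The function names and the affinity values are hypothetical, and the sampler shown is the naive quadratic baseline, not the quilting algorithm of the paper.

```python
import random
from itertools import product

def edge_prob(fi, fj, thetas):
    """MAGM edge probability: product over attributes k of thetas[k][fi[k]][fj[k]]."""
    p = 1.0
    for theta, a, b in zip(thetas, fi, fj):
        p *= theta[a][b]
    return p

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

def sample_naive(attrs, thetas, rng):
    """Quadratic-time baseline: one Bernoulli draw per ordered node pair."""
    n = len(attrs)
    return [(i, j) for i in range(n) for j in range(n)
            if rng.random() < edge_prob(attrs[i], attrs[j], thetas)]

d = 3
theta = [[0.9, 0.5], [0.5, 0.2]]  # illustrative affinity matrix, not from the paper
thetas = [theta] * d

# KPGM as a special case: node i's attributes are the d bits of i (most significant first).
attrs = list(product([0, 1], repeat=d))
magm_probs = [[edge_prob(fi, fj, thetas) for fj in attrs] for fi in attrs]
kpgm_probs = kron(kron(theta, theta), theta)

edges = sample_naive(attrs, thetas, random.Random(0))
```

With these attribute vectors, `magm_probs` agrees entry by entry with the Kronecker power `kpgm_probs`. The naive sampler visits all n^2 node pairs, which is the cost a sub-quadratic method avoids.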

arXiv.org e-Print Archive

CiteSeerX

DFacTo: Distributed Factorization of Tensors

Authors: Choi Joon Hee, Vishwanathan S. V. N.
Publication date: 17/06/2014

We present a technique for significantly speeding up Alternating Least Squares (ALS) and Gradient Descent (GD), two widely used algorithms for tensor factorization. By exploiting properties of the Khatri-Rao product, we show how to efficiently address a computationally challenging sub-step of both algorithms. Our algorithm, DFacTo, requires only two sparse matrix-vector products and is easy to parallelize. DFacTo is not only scalable but also on average 4 to 10 times faster than competing algorithms on a variety of datasets. For instance, DFacTo takes only 480 seconds on 4 machines to perform one iteration of the ALS algorithm and 1,143 seconds to perform one iteration of the GD algorithm on a 6.5 million x 2.5 million x 1.5 million dimensional tensor with 1.2 billion non-zero entries.

Comment: Under review for NIPS 201
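
The abstract does not name the sub-step it accelerates. A common bottleneck in ALS for tensor factorization, assumed here for illustration, is the product of a matricized sparse tensor with a Khatri-Rao product of factor matrices. The sketch below contrasts forming the Khatri-Rao matrix explicitly with accumulating over the non-zero entries only. It is not DFacTo itself, and all function names and data are hypothetical.

```python
def khatri_rao(c, b):
    """Column-wise Kronecker product: row (k * J + j) holds c[k][r] * b[j][r]."""
    rank = len(b[0])
    return [[c_row[r] * b_row[r] for r in range(rank)]
            for c_row in c for b_row in b]

def mttkrp_explicit(entries, shape, b, c):
    """Mode-1 product that materializes the full (J*K) x R Khatri-Rao matrix."""
    n_i, n_j, _ = shape
    rank = len(b[0])
    kr = khatri_rao(c, b)
    out = [[0.0] * rank for _ in range(n_i)]
    for (i, j, k), x in entries.items():
        row = kr[k * n_j + j]
        for r in range(rank):
            out[i][r] += x * row[r]
    return out

def mttkrp_sparse(entries, shape, b, c):
    """Same result, touching only the non-zeros and never forming the Khatri-Rao matrix."""
    rank = len(b[0])
    out = [[0.0] * rank for _ in range(shape[0])]
    for (i, j, k), x in entries.items():
        for r in range(rank):
            out[i][r] += x * b[j][r] * c[k][r]
    return out

# Tiny 2 x 3 x 2 tensor in coordinate form, with rank-2 factor matrices (made-up values).
shape = (2, 3, 2)
entries = {(0, 0, 0): 1.0, (1, 2, 1): 2.0, (0, 1, 1): 3.0}
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
c = [[1.0, 0.0], [2.0, 1.0]]

result = mttkrp_sparse(entries, shape, b, c)
```

The explicit version needs memory proportional to J*K*R, which is infeasible at the tensor sizes quoted in the abstract, while the sparse version scales with the number of non-zero entries.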

arXiv.org e-Print Archive

CiteSeerX

Distributed Stochastic Optimization of the Regularized Risk

Authors: Matsushima Shin, Vishwanathan S. V. N., Yun Hyokun, Zhang Xinhua
Publication date: 09/06/2015

Many machine learning algorithms minimize a regularized risk, and stochastic optimization is widely used for this task. When working with massive data, it is desirable to perform stochastic optimization in parallel. Unfortunately, many existing stochastic optimization algorithms cannot be parallelized efficiently. In this paper we show that one can rewrite the regularized risk minimization problem as an equivalent saddle-point problem, and propose an efficient distributed stochastic optimization (DSO) algorithm. We prove the algorithm's rate of convergence; remarkably, our analysis shows that the algorithm scales almost linearly with the number of processors. We also verify with empirical evaluations that the proposed algorithm is competitive with other parallel, general-purpose stochastic and batch optimization algorithms for regularized risk minimization.
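
The abstract does not spell out the saddle-point form. One standard route to such a rewrite, assumed here, replaces each convex loss by its conjugate representation, loss(u) = max over alpha of (alpha * u - conj(alpha)), so that minimizing over w becomes a min over w of a max over alpha. The sketch below checks that identity numerically for the hinge loss, whose conjugate equals alpha on the interval [-1, 0]. The function names are hypothetical and this is not the DSO algorithm.

```python
def hinge(u):
    """Hinge loss on the margin u = y * <w, x>."""
    return max(0.0, 1.0 - u)

def hinge_via_dual(u, grid=1001):
    """Inner maximization: max over alpha in [-1, 0] of alpha * u - alpha."""
    alphas = [-1.0 + t / (grid - 1) for t in range(grid)]
    return max(a * u - a for a in alphas)

# The two forms should agree (up to rounding) at every margin.
margins = [-2.0, 0.0, 0.5, 1.0, 3.0]
gaps = [abs(hinge(u) - hinge_via_dual(u)) for u in margins]
```

Because the inner objective is linear in alpha, the maximum sits at an endpoint of the interval, which is why a coarse grid already recovers the hinge loss.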

arXiv.org e-Print Archive