On Spectral Graph Embedding: A Non-Backtracking Perspective and Graph Approximation
Graph embedding has been proven to be efficient and effective in facilitating
graph analysis. In this paper, we present a novel spectral framework called
NOn-Backtracking Embedding (NOBE), which offers a new perspective that
organizes graph data at a deep level by tracking the flow along the
edges with backtracking prohibited. Further, by analyzing the non-backtracking
process, a technique called graph approximation is devised, which provides a
channel to transform the spectral decomposition on an edge-to-edge matrix to
that on a node-to-node matrix. Theoretical guarantees are provided by bounding
the difference between the corresponding eigenvalues of the original graph and
its graph approximation. Extensive experiments conducted on various real-world
networks demonstrate the efficacy of our methods on both macroscopic and
microscopic levels, including clustering and structural hole spanner detection.
Comment: SDM 2018 (Full version including all proofs)
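The "edge-to-edge matrix" mentioned in the abstract can be illustrated with the standard non-backtracking (Hashimoto) operator on directed edges. The sketch below is only an illustration under that assumption: the function name `non_backtracking_matrix` is hypothetical, the input is assumed to be a simple undirected graph with each edge listed once, and the exact matrix used by NOBE may differ.

```python
def non_backtracking_matrix(edges):
    """Build the non-backtracking matrix of a simple undirected graph.

    Each undirected edge {u, v} yields two directed edges (u, v) and (v, u).
    Entry B[i][j] is 1 when directed edge i = (u, v) can be followed by
    directed edge j = (v, y) with y != u, i.e. without stepping straight back.
    """
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))

    size = len(directed)
    B = [[0] * size for _ in range(size)]
    for i, (u, v) in enumerate(directed):
        for j, (x, y) in enumerate(directed):
            # Edge j must start where edge i ends, and must not return to u.
            if v == x and y != u:
                B[i][j] = 1
    return directed, B


# Illustrative input: a triangle on nodes 0, 1, 2.
triangle = [(0, 1), (1, 2), (0, 2)]
directed_edges, B = non_backtracking_matrix(triangle)
```

For a graph with m undirected edges the matrix is 2m x 2m, which is typically much larger than the n x n node-to-node matrices; this size gap is what makes a reduction to a node-level decomposition, as the abstract describes, attractive.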
Distributed Machine Learning via Sufficient Factor Broadcasting
Matrix-parametrized models, including multiclass logistic regression and
sparse coding, are used in machine learning (ML) applications ranging from
computer vision to computational biology. When these models are applied to
large-scale ML problems starting at millions of samples and tens of thousands
of classes, their parameter matrix can grow at an unexpected rate, resulting in
high parameter synchronization costs that greatly slow down distributed
learning. To address this issue, we propose a Sufficient Factor Broadcasting
(SFB) computation model for efficient distributed learning of a large family of
matrix-parametrized models, which share the following property: the parameter
update computed on each data sample is a rank-1 matrix, i.e., the outer product
of two "sufficient factors" (SFs). By broadcasting the SFs among worker
machines and reconstructing the update matrices locally at each worker, SFB
improves communication efficiency (communication costs are linear in the
parameter matrix's dimensions, rather than quadratic) without affecting
computational correctness. We present a theoretical convergence analysis of
SFB, and empirically corroborate its efficiency on four different
matrix-parametrized ML models.
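The rank-1 property described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the helper names (`outer`, `apply_full_update`, `apply_sf_update`, `floats_sent`), the learning rate, and the toy dimensions are all assumptions, and the broadcasting between workers is not modeled.

```python
def outer(u, v):
    """Full rank-1 update matrix: the outer product of u and v."""
    return [[ui * vj for vj in v] for ui in u]


def apply_full_update(W, delta, lr):
    """Apply an update that was transmitted as a full J x D matrix."""
    return [[w - lr * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


def apply_sf_update(W, u, v, lr):
    """Apply the same update from the two sufficient factors alone.

    The receiving worker rebuilds each entry u[i] * v[j] locally, so only
    the two vectors need to be communicated.
    """
    return [[w - lr * ui * vj for w, vj in zip(w_row, v)]
            for w_row, ui in zip(W, u)]


def floats_sent(u, v):
    """Numbers communicated when only the sufficient factors are sent."""
    return len(u) + len(v)


# Toy example: J = 3 rows, D = 4 columns (hypothetical sizes).
W = [[0.0] * 4 for _ in range(3)]
u = [1.0, 2.0, 3.0]
v = [1.0, 0.0, -1.0, 2.0]

W_full = apply_full_update(W, outer(u, v), lr=0.5)
W_sf = apply_sf_update(W, u, v, lr=0.5)
```

In this toy case both paths produce the same parameter matrix, while the factor-based path communicates J + D = 7 numbers instead of J x D = 12; the gap widens as the matrix dimensions grow.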