
On Optimizing Distributed Tucker Decomposition for Dense Tensors

Author: Chakaravarthy Venkatesan T
Choi Jee W
Joseph Douglas J
Liu Xing
Murali Prakash
Sabharwal Yogish
Sreedhar Dheeraj
Publication date: 18/07/2017

The Tucker decomposition expresses a given tensor as the product of a small core tensor and a set of factor matrices. Apart from providing data compression, the construction is useful in performing analysis such as principal component analysis (PCA) and finds applications in diverse domains such as signal processing, computer vision and text analytics. Our objective is to develop an efficient distributed implementation for the case of dense tensors. The implementation is based on the HOOI (Higher Order Orthogonal Iterator) procedure, wherein the tensor-times-matrix product forms the core routine. Prior work has proposed heuristics for reducing the computational load and communication volume incurred by the routine. We study the two metrics in a formal and systematic manner, and design strategies that are optimal under the two fundamental metrics. Our experimental evaluation on a large benchmark of tensors shows that the optimal strategies provide significant reduction in load and volume compared to prior heuristics, and provide up to 7x speed-up in the overall running time. Comment: Preliminary version of the paper appears in the proceedings of IPDPS'1
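The abstract names the tensor-times-matrix (TTM) product as the core routine of HOOI. The following is a minimal sequential sketch of that product for a dense tensor, included only to make the operation concrete. It is not the distributed implementation the paper describes. The function names `flat_index` and `ttm`, the flat row-major storage, and the argument layout are all assumptions made for illustration.

```python
from itertools import product


def flat_index(idx, shape):
    """Row-major offset of a multi-index within a dense tensor."""
    off = 0
    for i, n in zip(idx, shape):
        off = off * n + i
    return off


def ttm(data, shape, matrix, mode):
    """Multiply a dense tensor by `matrix` along dimension `mode`.

    data   : flat list of tensor entries in row-major order (assumed layout)
    shape  : tuple of dimension lengths
    matrix : list of k rows, each of length shape[mode]
    Returns (new_data, new_shape), where new_shape[mode] == k.
    """
    k = len(matrix)
    new_shape = shape[:mode] + (k,) + shape[mode + 1:]
    out = [0.0] * (len(data) // shape[mode] * k)
    for idx in product(*(range(n) for n in new_shape)):
        row = matrix[idx[mode]]
        total = 0.0
        # Contract the chosen dimension of the tensor against one matrix row.
        for j in range(shape[mode]):
            src = idx[:mode] + (j,) + idx[mode + 1:]
            total += row[j] * data[flat_index(src, shape)]
        out[flat_index(idx, new_shape)] = total
    return out, new_shape


# A 2 x 2 tensor multiplied by a 1 x 2 matrix along dimension 0
# shrinks that dimension from 2 to 1.
small, small_shape = ttm([1.0, 2.0, 3.0, 4.0], (2, 2), [[1.0, 1.0]], 0)
```

When the matrix has fewer rows than the length of the chosen dimension, the result is smaller than the input along that dimension, which is how repeated TTM products lead to a small core tensor.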

arXiv.org e-Print Archive

Crossref

Implicit Decomposition for Write-Efficient Connectivity Algorithms

Author: Ben-David Naama
Blelloch Guy E.
Fineman Jeremy T.
Gibbons Phillip B.
Gu Yan
McGuffey Charles
Shun Julian
Publication date: 07/10/2017

The future of main memory appears to lie in the direction of new technologies that provide strong capacity-to-performance ratios, but have write operations that are much more expensive than reads in terms of latency, bandwidth, and energy. Motivated by this trend, we propose sequential and parallel algorithms to solve graph connectivity problems using significantly fewer writes than conventional algorithms. Our primary algorithmic tool is the construction of an o(n)-sized "implicit decomposition" of a bounded-degree graph G on n nodes, which combined with read-only access to G enables fast answers to connectivity and biconnectivity queries on G. The construction breaks the linear-write "barrier", resulting in costs that are asymptotically lower than conventional algorithms while adding only a modest cost to querying time. For general non-sparse graphs on m edges, we also provide the first parallel algorithms for connectivity and biconnectivity that use o(m) writes and O(m) operations. These algorithms provide insight into how applications can efficiently process computations on large graphs in systems with read-write asymmetry.
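For context on the linear-write cost mentioned in the abstract, the sketch below shows a conventional breadth-first labeling of connected components that counts its own label writes. It is the kind of baseline the abstract contrasts against, not the implicit decomposition itself, whose construction the abstract does not spell out. The names `label_components` and `connected`, the adjacency-list input, and the decision to count only label writes (ignoring initialization and queue traffic) are assumptions made for illustration.

```python
from collections import deque


def label_components(adj):
    """Conventional BFS component labeling.

    adj : adjacency list, adj[u] is a list of neighbors of vertex u
    Returns (labels, writes), where writes counts stores to `labels`
    made during the search (initialization and queue traffic are
    deliberately not counted in this simplified accounting).
    """
    n = len(adj)
    labels = [-1] * n
    writes = 0
    for s in range(n):
        if labels[s] != -1:
            continue
        labels[s] = s
        writes += 1
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if labels[v] == -1:
                    labels[v] = s  # one label write per newly reached vertex
                    writes += 1
                    queue.append(v)
    return labels, writes


def connected(labels, u, v):
    """Answer a connectivity query using reads only."""
    return labels[u] == labels[v]


# Two components: {0, 1, 2} and {3, 4}.
graph = [[1], [0, 2], [1], [4], [3]]
labels, writes = label_components(graph)
```

Every vertex receives exactly one label write here, so the write count grows linearly with n. Queries afterwards need only reads, which is the trade-off the abstract targets from the other direction: fewer writes up front at a modest extra cost per query.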

arXiv.org e-Print Archive

Crossref

DSpace@MIT