High Performance Parallel Algorithms for the Tucker Decomposition of Sparse Tensors
We investigate an efficient parallelization of a class of algorithms for the well-known Tucker decomposition of general N-dimensional sparse tensors. The targeted algorithms are iterative and use the alternating least squares method. At each iteration, for each dimension of an N-dimensional input tensor, the following operations are performed: (i) the tensor is multiplied with (N − 1) matrices (TTMc step); (ii) the product is then converted to a matrix; and (iii) a few leading left singular vectors of the resulting matrix are computed (TRSVD step) to update one of the matrices for the next TTMc step. We propose an efficient parallelization of these algorithms for current parallel platforms with multicore nodes. We discuss a set of preprocessing steps that takes all computational decisions out of the main iteration of the algorithm and provides intuitive shared-memory parallelism for the TTMc and TRSVD steps. We propose a coarse-grain and a fine-grain parallel algorithm in a distributed-memory environment, investigate data dependencies, and identify efficient communication schemes. We demonstrate how the computation of singular vectors in the TRSVD step can be carried out efficiently following the TTMc step. Finally, we develop a hybrid MPI-OpenMP implementation of the overall algorithm and report scalability results on up to 4096 cores on 256 nodes of an IBM BlueGene/Q supercomputer.
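The alternating scheme described above (TTMc, matricization, TRSVD) can be sketched densely with NumPy. This is a minimal sequential illustration of the iteration being parallelized, not the paper's sparse distributed implementation; the function name, the initialization, and the fixed iteration count are assumptions:

```python
import numpy as np

def hooi(X, ranks, iters=5):
    """Minimal dense sketch of the alternating scheme: for each mode n,
    multiply the tensor by all other factor matrices (TTMc), matricize the
    result, and take its leading left singular vectors (TRSVD) to update
    the factor for mode n."""
    N = X.ndim
    # initialize each factor with leading left singular vectors of the unfolding
    U = [np.linalg.svd(np.moveaxis(X, n, 0).reshape(X.shape[n], -1),
                       full_matrices=False)[0][:, :ranks[n]] for n in range(N)]
    for _ in range(iters):
        for n in range(N):
            Y = X
            for m in range(N):
                if m != n:
                    # TTMc: contract mode m of Y with U[m] (i.e. multiply by U[m]^T)
                    Y = np.moveaxis(np.tensordot(Y, U[m], axes=(m, 0)), -1, m)
            Yn = np.moveaxis(Y, n, 0).reshape(X.shape[n], -1)  # matricize
            # TRSVD: a few leading left singular vectors update factor n
            U[n] = np.linalg.svd(Yn, full_matrices=False)[0][:, :ranks[n]]
    core = X
    for m in range(N):
        core = np.moveaxis(np.tensordot(core, U[m], axes=(m, 0)), -1, m)
    return core, U
```

For an exactly low-rank tensor, `core` and `U` reproduce the input up to numerical error.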
A medium-grain method for fast 2D bipartitioning of sparse matrices
We present a new hypergraph-based method, the medium-grain method, for solving the sparse matrix partitioning problem. This problem arises when distributing data for parallel sparse matrix-vector multiplication. In the medium-grain method, each matrix nonzero is assigned to either a row group or a column group, and these groups are represented by vertices of the hypergraph. For an m × n sparse matrix, the resulting hypergraph has m + n vertices and m + n hyperedges.
Furthermore, we present an iterative refinement procedure for improvement of a given partitioning, based on the medium-grain method, which can be applied as a cheap but effective postprocessing step after any partitioning method.
The medium-grain method is able to produce fully two-dimensional bipartitionings, but its computational complexity equals that of one-dimensional methods. Experimental results for a large set of sparse test matrices show that the medium-grain method with iterative refinement produces bipartitionings with lower communication volume compared to current state-of-the-art methods, and is faster at producing them.
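The nonzero-to-group assignment and the resulting hypergraph can be sketched as follows. The rule used here to pick the row or column group (join the sparser of the two) is an illustrative assumption, not the paper's exact rule, and the function and vertex names are likewise hypothetical:

```python
from collections import defaultdict

def medium_grain_hypergraph(nonzeros):
    """Build a medium-grain-style hypergraph from a list of (i, j) nonzero
    coordinates. Each nonzero joins row group ("R", i) or column group
    ("C", j); groups are the vertices, and each row/column contributes one
    hyperedge (net) spanning the groups that hold its nonzeros."""
    row_nnz, col_nnz = defaultdict(int), defaultdict(int)
    for i, j in nonzeros:
        row_nnz[i] += 1
        col_nnz[j] += 1
    row_net, col_net = defaultdict(set), defaultdict(set)
    for i, j in nonzeros:
        # assignment heuristic (an assumption): join the sparser of the two
        g = ("R", i) if row_nnz[i] <= col_nnz[j] else ("C", j)
        row_net[i].add(g)   # net of row i: all groups holding its nonzeros
        col_net[j].add(g)   # net of column j, symmetrically
    for i in row_net:
        row_net[i].add(("R", i))   # each net also spans its own group vertex
    for j in col_net:
        col_net[j].add(("C", j))
    return row_net, col_net
```

Cut nets of this hypergraph then model the vector entries that must be communicated in parallel sparse matrix-vector multiplication.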
Parallel image restoration
In this thesis, we are concerned with the image restoration problem, which has been formulated in the literature as a system of linear inequalities. With this formulation, the resulting constraint matrix is an unstructured sparse matrix, and even with small images we end up with huge matrices. To solve the restoration problem, we have therefore used surrogate constraint methods, which work efficiently for large problems and are amenable to parallel implementation. Among the surrogate constraint methods, the basic method considers all of the violated constraints in the system and performs a single block projection in each step. The parallel method, on the other hand, considers a subset of the constraints and makes simultaneous block projections. Using several partitioning strategies and adopting different communication models, we have realized several parallel implementations of the two methods. We have used hypergraph-partitioning-based decomposition methods in order to minimize the communication costs while ensuring load balance among the processors. The implementations are evaluated on both per-iteration and overall performance. In addition, the effects of different partitioning strategies on the speed of convergence are investigated. The experimental results reveal that the proposed parallelization schemes have practical use in the restoration problem and in many other real-world applications that can be modeled as a system of linear inequalities.
Malas, Tahir (M.S. thesis)
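One step of the basic surrogate constraint method can be sketched as follows. The equal weighting of violated rows is an assumption for illustration (the weights are a tunable choice in surrogate methods), as is the function name:

```python
import numpy as np

def surrogate_step(A, b, x, lam=1.0):
    """One step of the basic surrogate constraint method for Ax <= b:
    combine all currently violated rows into a single surrogate constraint
    and project x onto its halfspace (relaxation parameter lam)."""
    r = A @ x - b
    violated = r > 1e-12
    if not violated.any():
        return x, True                      # already feasible
    w = violated / violated.sum()           # equal weights on violated rows
    a = w @ A                               # surrogate constraint row
    step = lam * (w @ r) / (a @ a)          # relaxed projection length
    return x - step * a, False
```

The parallel variant described in the abstract applies the same projection simultaneously to several constraint subsets, one per processor.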
Recent Advances in Graph Partitioning
We survey recent trends in practical algorithms for balanced graph partitioning together with applications and future research directions.
Web-site-based partitioning techniques for efficient parallelization of the PageRank computation
Web search engines use ranking techniques to order Web pages in query results. PageRank is an important technique that orders Web pages according to the linkage structure of the Web. The efficiency of the PageRank computation is important since the constantly evolving nature of the Web requires this computation to be repeated many times. PageRank computation involves repeated iterative sparse matrix-vector multiplications. Due to the enormous size of the Web matrix to be multiplied, PageRank computations are usually carried out on parallel systems. However, efficiently parallelizing PageRank is not an easy task, because of the irregular sparsity pattern of the Web matrix. Graph- and hypergraph-partitioning-based techniques are widely used for efficiently parallelizing matrix-vector multiplications. Recently, a hypergraph-partitioning-based decomposition technique for fast parallel computation of PageRank was proposed. This technique aims to minimize the communication overhead of the parallel matrix-vector multiplication. However, it has a high preprocessing time, which makes it impractical. In this work, we propose 1D (rowwise and columnwise) and 2D (fine-grain and checkerboard) decomposition models using web-site-based graph- and hypergraph-partitioning techniques. The proposed models minimize the communication overhead of parallel PageRank computations with a reasonable preprocessing time. The models encapsulate not only the matrix-vector multiplication but the overall iterative algorithm. Conducted experiments show that the proposed models achieve fast PageRank computation with low preprocessing time compared with those in the literature.
Cevahir, Ali (M.S. thesis)
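The iterative kernel that these decomposition models distribute can be sketched as a sequential power iteration over an edge list. This is a minimal single-process illustration (with conventional damping and dangling-page handling, which are standard but not quoted from the thesis), not the parallel implementation:

```python
import numpy as np

def pagerank(links, n, d=0.85, tol=1e-10):
    """Power-iteration PageRank over a list of (source, target) links.
    The loop body is the sparse matrix-vector product whose communication
    the partitioning models are designed to minimize."""
    out = np.zeros(n)
    for u, _ in links:
        out[u] += 1                         # out-degree of each page
    r = np.full(n, 1.0 / n)
    while True:
        nr = np.full(n, (1.0 - d) / n)      # teleportation term
        nr += d * r[out == 0].sum() / n     # dangling-page mass, spread evenly
        for u, v in links:                  # the SpMV kernel: nr += d * M @ r
            nr[v] += d * r[u] / out[u]
        if np.abs(nr - r).sum() < tol:
            return nr
        r = nr
```

On a symmetric cycle, for example, the iteration converges to the uniform distribution.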
Open Problems in (Hyper)Graph Decomposition
Large networks are useful in a wide range of applications. Sometimes problem
instances are composed of billions of entities. Decomposing and analyzing these
structures helps us gain new insights about our surroundings. Even if the final
application concerns a different problem (such as traversal, finding paths,
trees, and flows), decomposing large graphs is often an important subproblem
for complexity reduction or parallelization. This report is a summary of
discussions that happened at Dagstuhl seminar 23331 on "Recent Trends in Graph
Decomposition" and presents currently open problems and future directions in
the area of (hyper)graph decomposition.
ParPaToH: A 2D-parallel hypergraph partitioning tool
Hypergraph partitioning is used to find solutions to optimization problems in various areas, including parallel volume rendering, parallel information retrieval, and VLSI circuit design. While current partitioning methods are adequate for hypergraphs up to a certain size, they start to fail once the problem size exceeds this threshold.
In this thesis we introduce ParPaToH, a parallel p-way hypergraph partitioning tool that uses a 2D decomposition to reduce communication overhead and implements a parallel-computing-friendly version of the accepted multi-level partitioning paradigm to generate its partitioning. We present new concepts in hypergraph partitioning that lead to a coarse-grain parallel solution. Finally, we discuss the implementation of the tool in detail and present experimental results that demonstrate its effectiveness.
Karaca, Evren (M.S. thesis)
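The multi-level paradigm mentioned above (coarsen, partition the coarsest instance, then uncoarsen with refinement) can be illustrated with a toy graph version. This is a sketch of the paradigm only, not of ParPaToH or of hypergraph partitioning proper; the matching, initial split, and balance rule are all simplifying assumptions:

```python
import random

def multilevel_bisect(adj, limit=4, seed=0):
    """Toy multilevel bisection of an undirected graph given as an
    adjacency dict {node: set(neighbors)}: coarsen by contracting a random
    matching, split the coarsest graph in half, then uncoarsen level by
    level with a greedy refinement pass."""
    rnd = random.Random(seed)
    levels = []
    while len(adj) > limit:
        nodes = list(adj)
        rnd.shuffle(nodes)
        matched, merge = set(), {}
        for u in nodes:                      # contract a random matching
            if u in matched:
                continue
            free = [v for v in adj[u] if v not in matched]
            v = free[0] if free else u       # unmatched vertices self-merge
            matched |= {u, v}
            merge[u] = merge[v] = (u, v)
        coarse = {}
        for u in adj:                        # build the coarser graph
            coarse.setdefault(merge[u], set())
            for v in adj[u]:
                if merge[v] != merge[u]:
                    coarse[merge[u]].add(merge[v])
        levels.append((adj, merge))
        adj = coarse
    nodes = sorted(adj, key=str)             # split coarsest graph in half
    part = {u: int(i >= len(nodes) / 2) for i, u in enumerate(nodes)}
    for fine, merge in reversed(levels):     # uncoarsen + greedy refinement
        part = {u: part[merge[u]] for u in fine}
        for u in fine:
            gain = sum(1 if part[v] != part[u] else -1 for v in fine[u])
            side = sum(1 for p in part.values() if p == part[u])
            if gain > 0 and side > len(fine) // 3:   # move u if it cuts fewer
                part[u] = 1 - part[u]                # edges and keeps balance
    return part
```

ParPaToH distributes all three phases of this scheme; the sketch shows only their sequential structure.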