4,047 research outputs found
High-Performance Solvers for Dense Hermitian Eigenproblems
We introduce a new collection of solvers - subsequently called EleMRRR - for
large-scale dense Hermitian eigenproblems. EleMRRR solves various types of
problems: generalized, standard, and tridiagonal eigenproblems. Among these,
the last is of particular importance as it is a solver on its own right, as
well as the computational kernel for the first two; we present a fast and
scalable tridiagonal solver based on the Algorithm of Multiple Relatively
Robust Representations - referred to as PMRRR. Like the other EleMRRR solvers,
PMRRR is part of the freely available Elemental library, and is designed to
fully support both message-passing (MPI) and multithreading parallelism (SMP).
As a result, the solvers can equally be used in pure MPI or in hybrid MPI-SMP
fashion. We conducted a thorough performance study of EleMRRR and ScaLAPACK's
solvers on two supercomputers. Such a study, performed with up to 8,192 cores,
provides precise guidelines to assemble the fastest solver within the ScaLAPACK
framework; it also indicates that EleMRRR outperforms even the fastest solvers
built from ScaLAPACK's components
Perron vector optimization applied to search engines
In the last years, Google's PageRank optimization problems have been
extensively studied. In that case, the ranking is given by the invariant
measure of a stochastic matrix. In this paper, we consider the more general
situation in which the ranking is determined by the Perron eigenvector of a
nonnegative, but not necessarily stochastic, matrix, in order to cover
Kleinberg's HITS algorithm. We also give some results for Tomlin's HOTS
algorithm. The problem consists then in finding an optimal outlink strategy
subject to design constraints and for a given search engine.
We study the relaxed versions of these problems, which means that we should
accept weighted hyperlinks. We provide an efficient algorithm for the
computation of the matrix of partial derivatives of the criterion, that uses
the low rank property of this matrix. We give a scalable algorithm that couples
gradient and power iterations and gives a local minimum of the Perron vector
optimization problem. We prove convergence by considering it as an approximate
gradient method.
We then show that optimal linkage stategies of HITS and HOTS optimization
problems verify a threshold property. We report numerical results on fragments
of the real web graph for these search engine optimization problems.Comment: 28 pages, 5 figure
Controllability Metrics, Limitations and Algorithms for Complex Networks
This paper studies the problem of controlling complex networks, that is, the
joint problem of selecting a set of control nodes and of designing a control
input to steer a network to a target state. For this problem (i) we propose a
metric to quantify the difficulty of the control problem as a function of the
required control energy, (ii) we derive bounds based on the system dynamics
(network topology and weights) to characterize the tradeoff between the control
energy and the number of control nodes, and (iii) we propose an open-loop
control strategy with performance guarantees. In our strategy we select control
nodes by relying on network partitioning, and we design the control input by
leveraging optimal and distributed control techniques. Our findings show
several control limitations and properties. For instance, for Schur stable and
symmetric networks: (i) if the number of control nodes is constant, then the
control energy increases exponentially with the number of network nodes, (ii)
if the number of control nodes is a fixed fraction of the network nodes, then
certain networks can be controlled with constant energy independently of the
network dimension, and (iii) clustered networks may be easier to control
because, for sufficiently many control nodes, the control energy depends only
on the controllability properties of the clusters and on their coupling
strength. We validate our results with examples from power networks, social
networks, and epidemics spreading
Minimizing Communication for Eigenproblems and the Singular Value Decomposition
Algorithms have two costs: arithmetic and communication. The latter
represents the cost of moving data, either between levels of a memory
hierarchy, or between processors over a network. Communication often dominates
arithmetic and represents a rapidly increasing proportion of the total cost, so
we seek algorithms that minimize communication. In \cite{BDHS10} lower bounds
were presented on the amount of communication required for essentially all
-like algorithms for linear algebra, including eigenvalue problems and
the SVD. Conventional algorithms, including those currently implemented in
(Sca)LAPACK, perform asymptotically more communication than these lower bounds
require. In this paper we present parallel and sequential eigenvalue algorithms
(for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms
that do attain these lower bounds, and analyze their convergence and
communication costs.Comment: 43 pages, 11 figure
Preconditioned Spectral Clustering for Stochastic Block Partition Streaming Graph Challenge
Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is
demonstrated to efficiently solve eigenvalue problems for graph Laplacians that
appear in spectral clustering. For static graph partitioning, 10-20 iterations
of LOBPCG without preconditioning result in ~10x error reduction, enough to
achieve 100% correctness for all Challenge datasets with known truth
partitions, e.g., for graphs with 5K/.1M (50K/1M) Vertices/Edges in 2 (7)
seconds, compared to over 5,000 (30,000) seconds needed by the baseline Python
code. Our Python code 100% correctly determines 98 (160) clusters from the
Challenge static graphs with 0.5M (2M) vertices in 270 (1,700) seconds using
10GB (50GB) of memory. Our single-precision MATLAB code calculates the same
clusters at half time and memory. For streaming graph partitioning, LOBPCG is
initiated with approximate eigenvectors of the graph Laplacian already computed
for the previous graph, in many cases reducing 2-3 times the number of required
LOBPCG iterations, compared to the static case. Our spectral clustering is
generic, i.e. assuming nothing specific of the block model or streaming, used
to generate the graphs for the Challenge, in contrast to the base code.
Nevertheless, in 10-stage streaming comparison with the base code for the 5K
graph, the quality of our clusters is similar or better starting at stage 4 (7)
for emerging edging (snowballing) streaming, while the computations are over
100-1000 faster.Comment: 6 pages. To appear in Proceedings of the 2017 IEEE High Performance
Extreme Computing Conference. Student Innovation Award Streaming Graph
Challenge: Stochastic Block Partition, see
http://graphchallenge.mit.edu/champion
- …