4,047 research outputs found

    High-Performance Solvers for Dense Hermitian Eigenproblems

    We introduce a new collection of solvers - subsequently called EleMRRR - for large-scale dense Hermitian eigenproblems. EleMRRR solves various types of problems: generalized, standard, and tridiagonal eigenproblems. Among these, the last is of particular importance, as it is a solver in its own right as well as the computational kernel for the first two; we present a fast and scalable tridiagonal solver based on the Algorithm of Multiple Relatively Robust Representations - referred to as PMRRR. Like the other EleMRRR solvers, PMRRR is part of the freely available Elemental library, and is designed to fully support both message-passing (MPI) and multithreading (SMP) parallelism. As a result, the solvers can be used equally well in pure MPI or in hybrid MPI-SMP fashion. We conducted a thorough performance study of EleMRRR and ScaLAPACK's solvers on two supercomputers. This study, performed with up to 8,192 cores, provides precise guidelines for assembling the fastest solver within the ScaLAPACK framework; it also indicates that EleMRRR outperforms even the fastest solvers built from ScaLAPACK's components.
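For readers who want to try a tridiagonal MRRR-based solver on a single node, the sketch below uses SciPy's `eigh_tridiagonal` with the LAPACK `stemr` driver, which is LAPACK's sequential MRRR routine. It is not EleMRRR or PMRRR, and the test matrix (the 1D discrete Laplacian, whose eigenvalues are known in closed form) and the helper name `solve_laplacian_tridiagonal` are assumptions chosen only for illustration.

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal


def solve_laplacian_tridiagonal(n):
    """Eigen-decompose the n x n tridiagonal matrix with 2 on the diagonal
    and -1 on the off-diagonals, using LAPACK's MRRR driver (stemr)."""
    d = np.full(n, 2.0)        # main diagonal
    e = np.full(n - 1, -1.0)   # sub/super diagonal
    w, v = eigh_tridiagonal(d, e, lapack_driver="stemr")

    # Apply T to the eigenvectors without forming the dense matrix.
    tv = d[:, None] * v
    tv[:-1] += e[:, None] * v[1:]
    tv[1:] += e[:, None] * v[:-1]
    residual = np.linalg.norm(tv - v * w)

    # Closed-form eigenvalues of this matrix, in ascending order.
    exact = 2.0 - 2.0 * np.cos(np.arange(1, n + 1) * np.pi / (n + 1))
    max_error = np.max(np.abs(w - exact))
    return w, v, residual, max_error


w, v, residual, max_error = solve_laplacian_tridiagonal(200)
```

The residual and the deviation from the closed-form eigenvalues give a quick sanity check on accuracy; parallel scaling, which is the focus of the abstract, is outside what this sequential sketch can show.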

    Perron vector optimization applied to search engines

    In recent years, Google's PageRank optimization problems have been extensively studied. In that case, the ranking is given by the invariant measure of a stochastic matrix. In this paper, we consider the more general situation in which the ranking is determined by the Perron eigenvector of a nonnegative, but not necessarily stochastic, matrix, in order to cover Kleinberg's HITS algorithm. We also give some results for Tomlin's HOTS algorithm. The problem then consists in finding an optimal outlink strategy for a given search engine, subject to design constraints. We study the relaxed versions of these problems, in which weighted hyperlinks are allowed. We provide an efficient algorithm for computing the matrix of partial derivatives of the criterion, which exploits the low-rank property of this matrix. We give a scalable algorithm that couples gradient and power iterations and yields a local minimum of the Perron vector optimization problem. We prove convergence by treating it as an approximate gradient method. We then show that optimal linkage strategies of HITS and HOTS optimization problems satisfy a threshold property. We report numerical results on fragments of the real web graph for these search engine optimization problems. Comment: 28 pages, 5 figures
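As a rough illustration of the ranking step described above, the following sketch computes the Perron eigenvector of a small nonnegative matrix by plain power iteration. It covers only the ranking computation, not the paper's coupled gradient and power iterations; the function name `perron_vector`, the tolerance, and the 3x3 example matrix are assumptions made for this example, and convergence is assumed to hold because the example matrix is primitive.

```python
def perron_vector(matrix, tol=1e-12, max_iter=10_000):
    """Power iteration for a nonnegative square matrix (list of rows).

    Returns (eigenvalue, vector) with the vector normalized to sum to 1.
    Assumes the matrix is primitive so that the iteration converges.
    """
    n = len(matrix)
    x = [1.0 / n] * n  # start from the uniform distribution
    eigenvalue = 0.0
    for _ in range(max_iter):
        y = [sum(row[j] * x[j] for j in range(n)) for row in matrix]
        # Because sum(x) == 1, sum(M x) estimates the Perron eigenvalue.
        eigenvalue = sum(y)
        if eigenvalue == 0.0:
            raise ValueError("iteration collapsed to the zero vector")
        y = [value / eigenvalue for value in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return eigenvalue, y
        x = y
    return eigenvalue, x


# Hypothetical nonnegative, non-stochastic matrix (weighted links, 3 pages).
M = [[0.0, 2.0, 1.0],
     [1.0, 0.0, 3.0],
     [1.0, 1.0, 0.0]]
eigenvalue, ranking = perron_vector(M)
```

The rows of `M` do not sum to one, which is the non-stochastic setting the abstract refers to; for PageRank the same loop would run on a stochastic matrix and the eigenvalue would be 1.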

    Controllability Metrics, Limitations and Algorithms for Complex Networks

    This paper studies the problem of controlling complex networks, that is, the joint problem of selecting a set of control nodes and of designing a control input to steer a network to a target state. For this problem (i) we propose a metric to quantify the difficulty of the control problem as a function of the required control energy, (ii) we derive bounds based on the system dynamics (network topology and weights) to characterize the tradeoff between the control energy and the number of control nodes, and (iii) we propose an open-loop control strategy with performance guarantees. In our strategy we select control nodes by relying on network partitioning, and we design the control input by leveraging optimal and distributed control techniques. Our findings show several control limitations and properties. For instance, for Schur stable and symmetric networks: (i) if the number of control nodes is constant, then the control energy increases exponentially with the number of network nodes, (ii) if the number of control nodes is a fixed fraction of the network nodes, then certain networks can be controlled with constant energy independently of the network dimension, and (iii) clustered networks may be easier to control because, for sufficiently many control nodes, the control energy depends only on the controllability properties of the clusters and on their coupling strength. We validate our results with examples from power networks, social networks, and epidemic spreading.
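One common way to quantify control energy, assumed here for illustration and not necessarily identical to the paper's exact definition, is through the finite-horizon controllability Gramian of the discrete-time system x(t+1) = A x(t) + B u(t): the worst-case energy needed to reach a unit-norm target is the reciprocal of the Gramian's smallest eigenvalue. The sketch below compares one control node against controlling every node on a small, symmetric, Schur stable chain network; the network weights, the horizon, and the function names are hypothetical.

```python
import numpy as np


def controllability_gramian(A, B, horizon):
    """W = sum_{k=0}^{horizon-1} A^k B B^T (A^T)^k for x(t+1) = A x(t) + B u(t)."""
    n = A.shape[0]
    W = np.zeros((n, n))
    AkB = B.copy()
    for _ in range(horizon):
        W += AkB @ AkB.T
        AkB = A @ AkB
    return W


def worst_case_energy(A, control_nodes, horizon=50):
    """Reciprocal of the smallest Gramian eigenvalue: the energy needed to
    reach the hardest unit-norm target state from the origin."""
    n = A.shape[0]
    B = np.eye(n)[:, control_nodes]  # one input per selected node
    W = controllability_gramian(A, B, horizon)
    return 1.0 / np.linalg.eigvalsh(W)[0]


# Hypothetical 5-node chain with symmetric weights; its eigenvalues lie in
# (-0.25, 0.75), so the network is Schur stable.
n = 5
A = (0.25 * np.eye(n)
     + np.diag(np.full(n - 1, 0.25), 1)
     + np.diag(np.full(n - 1, 0.25), -1))

energy_one_node = worst_case_energy(A, [0])
energy_all_nodes = worst_case_energy(A, list(range(n)))
```

Adding control nodes can only add positive semidefinite terms to the Gramian, so the worst-case energy cannot increase as more nodes are controlled; rerunning the comparison for growing `n` with a single control node gives a feel for the energy growth the abstract describes.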

    Minimizing Communication for Eigenproblems and the Singular Value Decomposition

    Algorithms have two costs: arithmetic and communication. The latter represents the cost of moving data, either between levels of a memory hierarchy, or between processors over a network. Communication often dominates arithmetic and represents a rapidly increasing proportion of the total cost, so we seek algorithms that minimize communication. In \cite{BDHS10} lower bounds were presented on the amount of communication required for essentially all $O(n^3)$-like algorithms for linear algebra, including eigenvalue problems and the SVD. Conventional algorithms, including those currently implemented in (Sca)LAPACK, perform asymptotically more communication than these lower bounds require. In this paper we present parallel and sequential eigenvalue algorithms (for pencils, nonsymmetric matrices, and symmetric matrices) and SVD algorithms that do attain these lower bounds, and analyze their convergence and communication costs. Comment: 43 pages, 11 figures

    Preconditioned Spectral Clustering for Stochastic Block Partition Streaming Graph Challenge

    Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is demonstrated to efficiently solve eigenvalue problems for graph Laplacians that appear in spectral clustering. For static graph partitioning, 10-20 iterations of LOBPCG without preconditioning result in ~10x error reduction, enough to achieve 100% correctness for all Challenge datasets with known truth partitions, e.g., for graphs with 5K/0.1M (50K/1M) vertices/edges in 2 (7) seconds, compared to over 5,000 (30,000) seconds needed by the baseline Python code. Our Python code determines 98 (160) clusters with 100% correctness from the Challenge static graphs with 0.5M (2M) vertices in 270 (1,700) seconds using 10GB (50GB) of memory. Our single-precision MATLAB code calculates the same clusters in half the time and memory. For streaming graph partitioning, LOBPCG is initiated with approximate eigenvectors of the graph Laplacian already computed for the previous graph, in many cases reducing the number of required LOBPCG iterations by a factor of 2-3 compared to the static case. Our spectral clustering is generic, i.e., it assumes nothing specific about the block model or the streaming used to generate the graphs for the Challenge, in contrast to the base code. Nevertheless, in a 10-stage streaming comparison with the base code for the 5K graph, the quality of our clusters is similar or better starting at stage 4 (7) for emerging edging (snowballing) streaming, while the computations are over 100-1000 times faster. Comment: 6 pages. To appear in Proceedings of the 2017 IEEE High Performance Extreme Computing Conference. Student Innovation Award Streaming Graph Challenge: Stochastic Block Partition, see http://graphchallenge.mit.edu/champion