Solving large scale linear programming
The interior point method (IPM) is now well established as a competitive technique for solving very large scale linear programming problems. The leading variant of the interior point method is the primal-dual predictor-corrector algorithm due to Mehrotra. The main computational steps of this algorithm are the repeated calculation and solution of a large sparse positive definite system of equations.
We describe an implementation of the predictor-corrector IPM algorithm on MasPar, a massively parallel SIMD computer. At the heart of the implementation is a parallel Cholesky factorization algorithm for sparse matrices. Our implementation uses a new scheme for mapping the matrix onto the processor grid of the MasPar, which results in a more efficient Cholesky factorization than previously suggested schemes.
The IPM implementation uses the parallel unit of MasPar to speed up the factorization and other computationally intensive parts of the IPM. An important part of this implementation is the judicious division of data and computation between the front-end computer, which runs the main IPM algorithm, and the parallel unit. Performanc
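The "positive definite system" step mentioned in this abstract can be illustrated with a small dense sketch. It assumes the usual normal-equations form M = A D A^T, where D is a positive diagonal scaling; the abstract does not spell out this form, so treat it as an assumption. The sketch is serial and dense, whereas the paper describes a sparse, parallel factorization; the function names and the data below are hypothetical.

```python
import math

def cholesky(M):
    """Return lower-triangular L with L L^T = M for a symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = M[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            L[i][j] = (M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve L L^T x = b by forward elimination followed by backward substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def normal_equations(A, d):
    """Form M = A diag(d) A^T for an m-by-n matrix A and positive weights d."""
    m, n = len(A), len(d)
    return [[sum(A[i][k] * d[k] * A[j][k] for k in range(n)) for j in range(m)]
            for i in range(m)]

# Hypothetical data: a 2-by-4 constraint matrix, one iteration's scaling, a right-hand side.
A = [[1.0, 1.0, 1.0, 0.0],
     [1.0, -1.0, 0.0, 1.0]]
d = [2.0, 0.5, 1.0, 4.0]
r = [1.0, 2.0]

M = normal_equations(A, d)        # changes at every IPM iteration because d changes
dy = solve_cholesky(cholesky(M), r)
```

In an actual IPM the scaling `d` changes at every iteration while the sparsity pattern of M stays fixed, which is why the factorization dominates the cost.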
Adapting the interior point method for the solution of LPs on serial, coarse grain parallel and massively parallel computers
In this paper we describe a unified scheme for implementing an interior point algorithm (IPM) over a range of computer architectures. In the inner iteration of the IPM a search direction is computed using Newton's method. Computationally this involves solving a sparse symmetric positive definite (SSPD) system of equations. The choice of direct and indirect methods for the solution of this system, and the design of data structures to take advantage of serial, coarse grain parallel and massively parallel computer architectures, are considered in detail. We put forward arguments as to why integration of the system within a sparse simplex solver is important and outline how the system is designed to achieve this integration
A distributed-memory package for dense Hierarchically Semi-Separable matrix computations using randomization
We present a distributed-memory library for computations with dense
structured matrices. A matrix is considered structured if its off-diagonal
blocks can be approximated by a rank-deficient matrix with low numerical rank.
Here, we use Hierarchically Semi-Separable representations (HSS). Such matrices
appear in many applications, e.g., finite element methods, boundary element
methods, etc. Exploiting this structure allows for fast solution of linear
systems and/or fast computation of matrix-vector products, which are the two
main building blocks of matrix computations. The compression algorithm that we
use, which computes the HSS form of an input dense matrix, relies on randomized
sampling with a novel adaptive sampling mechanism. We discuss the
parallelization of this algorithm and also present the parallelization of
structured matrix-vector product, structured factorization and solution
routines. The efficiency of the approach is demonstrated on large problems from
different academic and industrial applications, on up to 8,000 cores.
This work is part of a more global effort, the STRUMPACK (STRUctured Matrices
PACKage) software package for computations with sparse and dense structured
matrices. Hence, although useful in their own right, the routines also
represent a step in the direction of a distributed-memory sparse solver
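The idea of compressing an off-diagonal block by randomized sampling can be sketched on a single block. The sketch below is a plain fixed-sample range finder: multiply the block by a random matrix, orthonormalize the result, and project. It is not the adaptive mechanism or the hierarchical HSS construction the abstract refers to, and it does not use the STRUMPACK API; all names and the test block are hypothetical.

```python
import random

def matmul(A, B):
    """Dense product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def orthonormal_columns(Y, tol=1e-10):
    """Modified Gram-Schmidt on the columns of Y; nearly dependent columns are dropped."""
    basis = []
    for col in transpose(Y):
        v = list(col)
        scale = sum(x * x for x in v) ** 0.5
        for q in basis:
            c = sum(a * b for a, b in zip(q, v))
            v = [a - c * b for a, b in zip(v, q)]
        norm = sum(x * x for x in v) ** 0.5
        if scale > 0.0 and norm > tol * scale:
            basis.append([x / norm for x in v])
    return transpose(basis)  # one column per retained basis vector

def compress_block(B, samples, seed=0):
    """Return Q, R with B approximately equal to Q R, using `samples` random probes."""
    rng = random.Random(seed)
    n = len(B[0])
    omega = [[rng.gauss(0.0, 1.0) for _ in range(samples)] for _ in range(n)]
    Q = orthonormal_columns(matmul(B, omega))   # basis for the sampled column space
    R = matmul(transpose(Q), B)                 # coordinates of B in that basis
    return Q, R

# Hypothetical 6-by-5 off-diagonal block with exact rank 2.
B = [[float(i + j + 2) for j in range(5)] for i in range(6)]
Q, R = compress_block(B, samples=4)
approx = matmul(Q, R)
err = max(abs(approx[i][j] - B[i][j]) for i in range(6) for j in range(5))
```

Here the number of retained columns of `Q` plays the role of the numerical rank; an adaptive scheme would instead grow the number of samples until the basis stops improving.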
Solution of partial differential equations on vector and parallel computers
The present status of numerical methods for partial differential equations on vector and parallel computers is reviewed. The relevant aspects of these computers are discussed and a brief review of their development is included, with particular attention paid to those characteristics that influence algorithm selection. Both direct and iterative methods are given for elliptic equations as well as explicit and implicit methods for initial boundary value problems. The intent is to point out attractive methods as well as areas where this class of computer architecture cannot be fully utilized because of either hardware restrictions or the lack of adequate algorithms. Application areas utilizing these computers are briefly discussed
High-performance direct solution of finite element problems on multi-core processors
A direct solution procedure is proposed and developed which exploits the parallelism that exists in current symmetric multiprocessing (SMP) multi-core processors. Several algorithms are proposed and developed to improve the performance of the direct solution of FE problems. A high-performance sparse direct solver is developed which allows experimentation with the newly developed and existing algorithms. The performance of the algorithms is investigated using a large set of FE problems. Furthermore, operation count estimates are developed to further assess the various algorithms. An out-of-core version of the solver is developed to reduce the memory requirements of the solution. I/O is performed asynchronously without blocking the thread that makes the I/O request. Asynchronous I/O allows factorization and triangular solution computations to overlap with I/O. The performance of the developed solver is demonstrated on a large number of test problems. A problem with nearly 10 million degrees of freedom is solved on a low-priced desktop computer using the out-of-core version of the direct solver. Furthermore, the developed solver usually outperforms a commonly used shared memory solver.
Ph.D. Committee Chair: Will, Kenneth; Committee Member: Emkin, Leroy; Committee Member: Kurc, Ozgur; Committee Member: Vuduc, Richard; Committee Member: White, Donal
An Efficient Algorithm For Simulating Fracture Using Large Fuse Networks
The high computational cost of simulating progressive fracture with large
discrete lattice networks stems from the requirement to
solve {\it a new large set of linear equations} every time a new lattice bond
is broken. To address this problem, we propose an algorithm that combines the
multiple-rank sparse Cholesky downdating algorithm with the rank-p inverse
updating algorithm based on the Sherman-Morrison-Woodbury formula for the
simulation of progressive fracture in disordered quasi-brittle materials using
discrete lattice networks. Using the present algorithm, the computational
complexity of solving the new set of linear equations after breaking a bond
reduces to the same order as that of a simple {\it backsolve} (forward
elimination and backward substitution) {\it using the already LU factored
matrix}. That is, the computational cost is $O(nnz({\bf L}))$, where $nnz({\bf L})$ denotes the number of non-zeros of the Cholesky factorization ${\bf L}$ of
the stiffness matrix ${\bf A}$. This algorithm using the direct sparse solver
is faster than the Fourier accelerated preconditioned conjugate gradient (PCG)
iterative solvers, and eliminates the {\it critical slowing down} associated
with the iterative solvers that is especially severe close to the critical
points. Numerical results using random resistor networks substantiate the
efficiency of the present algorithm.
Comment: 15 pages including 1 figure. On page 11407 of the original paper
(J. Phys. A: Math. Gen. 36 (2003) 11403-11412), Eqs. 11 and 12 were
misprinted, which went unnoticed during the proof reading stag
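The reason a broken bond need not trigger a fresh factorization can be seen in the rank-1 case of the Sherman-Morrison-Woodbury formula. Removing a bond of conductance g between nodes i and j changes the matrix from A to A - g v v^T with v = e_i - e_j, and (A - g v v^T)^{-1} b = x0 + g (v^T x0) / (1 - g v^T z) z, where x0 = A^{-1} b and z = A^{-1} v. Both x0 and z cost one backsolve each with the existing factor. The sketch below shows only this rank-1 update with a dense factor; it omits the sparse Cholesky downdating that the abstract combines it with, and the network and function names are hypothetical.

```python
import math

def cholesky(M):
    """Return lower-triangular L with L L^T = M for a symmetric positive definite M."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = M[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            L[i][j] = (M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L

def backsolve(L, b):
    """Solve L L^T x = b: forward elimination, then backward substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def solve_after_bond_removal(L, b, i, j, g):
    """Solve (A - g v v^T) x = b with v = e_i - e_j, reusing the factor L of A."""
    n = len(b)
    v = [0.0] * n
    v[i], v[j] = 1.0, -1.0
    x0 = backsolve(L, b)              # solution on the intact network
    z = backsolve(L, v)               # response to the removed bond
    denom = 1.0 - g * (z[i] - z[j])   # 1 - g v^T A^{-1} v
    if abs(denom) < 1e-14:
        raise ValueError("removing this bond makes the system singular")
    coeff = g * (x0[i] - x0[j]) / denom
    return [x0[k] + coeff * z[k] for k in range(n)]

# Hypothetical grounded network: 3 free nodes, unit conductances,
# bonds 0-1, 1-2, 0-2, plus nodes 0 and 2 tied to ground.
A = [[3.0, -1.0, -1.0],
     [-1.0, 2.0, -1.0],
     [-1.0, -1.0, 3.0]]
b = [1.0, 0.0, 0.0]                   # unit current injected at node 0
L = cholesky(A)
x_new = solve_after_bond_removal(L, b, i=0, j=1, g=1.0)
```

For p broken bonds the scalar `denom` becomes a p-by-p matrix, which is the rank-p form of the same identity.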
An efficient null space inexact Newton method for hydraulic simulation of water distribution networks
Null space Newton algorithms are efficient in solving the nonlinear equations
arising in hydraulic analysis of water distribution networks. In this article,
we propose and evaluate an inexact Newton method that relies on partial updates
of the network pipes' frictional headloss computations to solve the linear
systems more efficiently and with numerical reliability. The update set
parameters are studied to propose appropriate values. Different null space
basis generation schemes are analysed to choose methods for sparse and
well-conditioned null space bases resulting in a smaller update set. The Newton
steps are computed in the null space by solving sparse, symmetric positive
definite systems with sparse Cholesky factorizations. By using the constant
structure of the null space system matrices, a single symbolic factorization in
the Cholesky decomposition is used multiple times, reducing the computational
cost of linear solves. The algorithms and analyses are validated using medium
to large-scale water network models.
Comment: 15 pages, 9 figures, Preprint extension of Abraham and Stoianov, 2015
(https://dx.doi.org/10.1061/(ASCE)HY.1943-7900.0001089), September 2015.
Includes extended exposition, additional case studies and new simulations and
analysi