
Distributed-memory large deformation diffeomorphic 3D image registration

Author: Biros George
Gholami Amir
Mang Andreas
Publication date: 11/08/2016

We present a parallel distributed-memory algorithm for large deformation diffeomorphic registration of volumetric images that produces large isochoric deformations (locally volume preserving). Image registration is a key technology in medical image analysis. Our algorithm uses a partial differential equation constrained optimal control formulation. Finding the optimal deformation map requires the solution of a highly nonlinear problem that involves pseudo-differential operators, biharmonic operators, and pure advection operators both forward and backward in time. A key issue is the time to solution, which demands efficient optimization methods as well as effective utilization of high performance computing resources. To address this problem we use a preconditioned, inexact, Gauss-Newton-Krylov solver. Our algorithm integrates several components: a spectral discretization in space, a semi-Lagrangian formulation in time, analytic adjoints, different regularization functionals (including volume-preserving ones), a spectral preconditioner, a highly optimized distributed Fast Fourier Transform, and a cubic interpolation scheme for the semi-Lagrangian time-stepping. We demonstrate the scalability of our algorithm on images with resolution of up to 1024^3 on the "Maverick" and "Stampede" systems at the Texas Advanced Computing Center (TACC). The critical problem in the medical imaging application domain is strong scaling, that is, solving registration problems of a moderate size of 256^3, a typical resolution for medical images. We are able to solve the registration problem for images of this size in less than five seconds on 64 x86 nodes of TACC's "Maverick" system.

Comment: accepted for publication at SC16 in Salt Lake City, Utah, USA; November 201
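The abstract names a semi-Lagrangian time-stepping scheme with cubic interpolation. As a rough illustration of that idea only, the sketch below advects a 1D periodic signal by tracing each grid point back along the velocity and interpolating the old signal at the departure point. It is not the authors' 3D distributed implementation: the function names, the 1D setting, the single backward Euler trace, and the Lagrange form of the cubic interpolant are all assumptions made for this example.

```python
import math


def cubic_interp_periodic(f, x):
    """Cubic Lagrange interpolation of periodic samples f at grid coordinate x.

    x is measured in units of grid cells; indices wrap around periodically.
    """
    n = len(f)
    i = math.floor(x)
    t = x - i  # fractional offset in [0, 1)
    # Four neighbouring samples at relative nodes -1, 0, 1, 2.
    fm1, f0, f1, f2 = (f[(i + k) % n] for k in (-1, 0, 1, 2))
    # Lagrange basis weights for nodes -1, 0, 1, 2 evaluated at t.
    wm1 = -t * (t - 1) * (t - 2) / 6
    w0 = (t + 1) * (t - 1) * (t - 2) / 2
    w1 = -(t + 1) * t * (t - 2) / 2
    w2 = (t + 1) * t * (t - 1) / 6
    return wm1 * fm1 + w0 * f0 + w1 * f1 + w2 * f2


def semi_lagrangian_step(f, v, dt, h):
    """Advance f by one time step dt under velocity v on a grid with spacing h.

    Assumption: the departure point is found with one backward Euler trace
    using the velocity at the arrival point (first-order accurate in time).
    """
    new_f = []
    for j in range(len(f)):
        x_dep = j - v[j] * dt / h  # departure point in grid units
        new_f.append(cubic_interp_periodic(f, x_dep))
    return new_f
```

Because the update interpolates rather than differencing, the time step is not limited by a CFL condition, which is the usual motivation for semi-Lagrangian schemes. With a constant velocity that moves the signal by exactly one cell per step, the sketch reduces to a periodic shift.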

arXiv.org e-Print Archive

Crossref

A domain decomposing parallel sparse linear system solver

Author: Amestoy
Barrett
Benzi
Berry
Chen
Dongarra
Gravvanis
Karypis
Lawrie
Lawson
Li
Manguoglu
Murat Manguoglu
Polizzi
Sameh
Schenk
Publication venue: Elsevier BV
Publication date: 26/08/2011

The solution of large sparse linear systems is often the most time-consuming part of many science and engineering applications. Computational fluid dynamics, circuit simulation, power network analysis, and material science are just a few examples of the application areas in which large sparse linear systems need to be solved effectively. In this paper we introduce a new parallel hybrid sparse linear system solver for distributed memory architectures that contains both direct and iterative components. We show that by using our solver one can alleviate the drawbacks of direct and iterative solvers, achieving better scalability than with direct solvers and more robustness than with classical preconditioned iterative solvers. Comparisons to well-known direct and iterative solvers on a parallel architecture are provided.

Comment: To appear in Journal of Computational and Applied Mathematic
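The abstract does not spell out how the direct and iterative components are combined. One textbook way to mix the two, shown here purely as an illustration, is a block Jacobi iteration: the unknowns are split into contiguous subdomains, each diagonal block is solved directly, and an outer iteration corrects for the coupling between blocks. This is not the solver from the paper; all function names are hypothetical, and dense Python lists stand in for the sparse distributed data structures a real solver would use.

```python
def solve_dense(A, b):
    """Direct solve by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            factor = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= factor * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):  # back substitution
        s = sum(M[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def block_jacobi_solve(A, b, block_size, tol=1e-12, max_iter=200):
    """Outer iteration with direct solves on the diagonal blocks of A.

    Returns the approximate solution and the number of outer iterations.
    """
    n = len(b)
    blocks = [(s, min(s + block_size, n)) for s in range(0, n, block_size)]
    x = [0.0] * n
    for it in range(max_iter):
        r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
        if max(abs(ri) for ri in r) < tol:
            return x, it
        # Every block uses the same residual, so the block solves are
        # independent of each other and could run in parallel.
        for s, e in blocks:
            sub = [row[s:e] for row in A[s:e]]
            dx = solve_dense(sub, r[s:e])
            for i, d in enumerate(dx):
                x[s + i] += d
    return x, max_iter
```

The block size controls the trade-off the abstract alludes to: one block covering the whole matrix is a pure direct solve, while blocks of size one give the classical point Jacobi iteration. Convergence of this simple scheme is only guaranteed for suitable matrices, for example strictly diagonally dominant ones.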

arXiv.org e-Print Archive

Crossref

OpenMETU (Middle East Technical University)

Recycling BiCGSTAB with an Application to Parametric Model Order Reduction

Author: Ahuja Kapil
Benner Peter
de Sturler Eric
Feng Lihong
Publication date: 01/01/2015

Krylov subspace recycling is a process for accelerating the convergence of sequences of linear systems. Based on this technique, the recycling BiCG algorithm has been developed recently. Here, we now generalize and extend this recycling theory to BiCGSTAB. Recycling BiCG focuses on efficiently solving sequences of dual linear systems, while the focus here is on efficiently solving sequences of single linear systems (assuming non-symmetric matrices for both recycling BiCG and recycling BiCGSTAB). As compared with other methods for solving sequences of single linear systems with non-symmetric matrices (e.g., recycling variants of GMRES), BiCG based recycling algorithms, like recycling BiCGSTAB, have the advantage that they involve a short-term recurrence, and hence, do not suffer from storage issues and are also cheaper with respect to the orthogonalizations. We modify the BiCGSTAB algorithm to use a recycle space, which is built from left and right approximate invariant subspaces. Using our algorithm for a parametric model order reduction example gives good results. We show about 40% savings in the number of matrix-vector products and about 35% savings in runtime.

Comment: 18 pages, 5 figures, Extended version of Max Planck Institute report (MPIMD/13-21
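For orientation, the sketch below shows the standard, non-recycling BiCGSTAB iteration that the abstract takes as its starting point. It illustrates the short-term recurrence (only a fixed handful of vectors is stored, regardless of the iteration count) and counts matrix-vector products, the cost measure quoted above. The recycle space itself is not implemented here; the function names, the dense list-of-lists matrix, and the stopping rule are assumptions made for this example.

```python
import math


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def bicgstab(A, b, x0=None, tol=1e-10, max_iter=200):
    """Unpreconditioned BiCGSTAB for a non-symmetric system A x = b.

    Returns the approximate solution and the number of matrix-vector products.
    """
    n = len(b)
    x = x0[:] if x0 else [0.0] * n
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]
    matvecs = 1
    r_hat = r[:]  # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    stop = tol * math.sqrt(dot(b, b))
    for _ in range(max_iter):
        if math.sqrt(dot(r, r)) <= stop:
            break
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = matvec(A, p)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        matvecs += 1
        if math.sqrt(dot(s, s)) <= stop:
            # Converged after the BiCG half-step; skip the stabilisation step.
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            r = s
            break
        t = matvec(A, s)
        matvecs += 1
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        rho = rho_new
    return x, matvecs
```

Each full iteration costs two matrix-vector products and keeps only the vectors x, r, r_hat, p, v, s and t, in contrast to GMRES-type methods whose basis grows with the iteration count. The sketch omits safeguards against breakdown (for example a vanishing rho or omega) that a production solver would need.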

arXiv.org e-Print Archive

MPG.PuRe