    Distributed-memory large deformation diffeomorphic 3D image registration

    We present a parallel distributed-memory algorithm for large deformation diffeomorphic registration of volumetric images that produces large isochoric (locally volume-preserving) deformations. Image registration is a key technology in medical image analysis. Our algorithm uses a partial differential equation constrained optimal control formulation. Finding the optimal deformation map requires the solution of a highly nonlinear problem that involves pseudo-differential operators, biharmonic operators, and pure advection operators both forward and backward in time. A key issue is the time to solution, which demands efficient optimization methods as well as effective utilization of high performance computing resources. To address this problem we use a preconditioned, inexact, Gauss-Newton-Krylov solver. Our algorithm integrates several components: a spectral discretization in space, a semi-Lagrangian formulation in time, analytic adjoints, different regularization functionals (including volume-preserving ones), a spectral preconditioner, a highly optimized distributed Fast Fourier Transform, and a cubic interpolation scheme for the semi-Lagrangian time-stepping. We demonstrate the scalability of our algorithm on images with resolution of up to 1024^3 on the "Maverick" and "Stampede" systems at the Texas Advanced Computing Center (TACC). The critical problem in the medical imaging application domain is strong scaling, that is, solving registration problems of a moderate size of 256^3, a typical resolution for medical images. We are able to solve the registration problem for images of this size in less than five seconds on 64 x86 nodes of TACC's "Maverick" system. Comment: accepted for publication at SC16 in Salt Lake City, Utah, USA; November 201
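
    The combination of a spectral discretization in space with a semi-Lagrangian scheme and cubic interpolation can be illustrated in one dimension. The following is only a minimal sketch under simplifying assumptions (a periodic 1D domain [0, 2*pi), a stationary velocity, a second-order backtrace of the characteristics); the function names are hypothetical and this is not the paper's distributed 3D implementation.

```python
import numpy as np


def spectral_gradient(f, length=2 * np.pi):
    """Derivative of a periodic 1D signal computed with the FFT."""
    n = f.size
    ik = 2j * np.pi * np.fft.fftfreq(n, d=length / n)  # i * angular wavenumber
    return np.real(np.fft.ifft(ik * np.fft.fft(f)))


def cubic_interp_periodic(f, xq, h):
    """Cubic Lagrange interpolation of periodic samples f (spacing h) at xq."""
    n = f.size
    s = xq / h
    i = np.floor(s).astype(int)
    t = s - i  # offset in [0, 1) relative to grid node i
    fm1, f0, f1, f2 = (f[(i + o) % n] for o in (-1, 0, 1, 2))
    # Lagrange basis weights for the four nodes at -1, 0, 1, 2
    w_m1 = -t * (t - 1) * (t - 2) / 6
    w_0 = (t + 1) * (t - 1) * (t - 2) / 2
    w_1 = -(t + 1) * t * (t - 2) / 2
    w_2 = (t + 1) * t * (t - 1) / 6
    return w_m1 * fm1 + w_0 * f0 + w_1 * f1 + w_2 * f2


def semi_lagrangian_step(m, v, dt, length=2 * np.pi):
    """One step of d_t m + v d_x m = 0: trace characteristics back, interpolate."""
    n = m.size
    h = length / n
    x = h * np.arange(n)
    # Second-order backtrace: Euler predictor, then averaged velocity
    x_star = x - dt * v
    v_star = cubic_interp_periodic(v, x_star, h)
    x_departure = x - 0.5 * dt * (v + v_star)
    # The transported quantity is constant along characteristics
    return cubic_interp_periodic(m, x_departure, h)


if __name__ == "__main__":
    x = 2 * np.pi * np.arange(64) / 64
    m1 = semi_lagrangian_step(np.sin(x), np.ones(64), dt=0.1)
    print("max transport error:", np.max(np.abs(m1 - np.sin(x - 0.1))))
```

    Because the semi-Lagrangian step interpolates at departure points instead of differencing on the grid, its time step is not bound by a CFL condition, which is one common motivation for using it in this setting.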

    Efficient algorithms for geodesic shooting in diffeomorphic image registration

    Diffeomorphic image registration is a common problem in medical image analysis. Here, one searches for a diffeomorphic deformation that maps one image (the moving or template image) onto another image (the fixed or reference image). We can formulate the search for such a map as a PDE constrained optimization problem. These types of problems are computationally expensive, which gives rise to the need for efficient algorithms. After introducing the PDE constrained optimization problem, we derive the first and second order optimality conditions. We discretize the problem using a pseudo-spectral discretization in space and consider Heun's method and the semi-Lagrangian method for the time integration of the PDEs that appear in the optimality system. To solve this optimization problem, we consider an L-BFGS and an inexact Gauss-Newton-Krylov method. To reduce the cost of solving the linear system that arises in Newton-type methods, we investigate different preconditioners. They exploit the structure of the Hessian, and use algorithms to efficiently compute an approximation to its inverse. Further, we build the preconditioners on a coarse grid to further reduce computational costs. The different methods are evaluated for two-dimensional image data (real and synthetic). We study the spectrum of the different building blocks that appear in the Hessian. We demonstrate that low rank preconditioners are able to significantly reduce the number of iterations needed to solve the linear system in Newton-type optimizers. We then compare different optimization methods based on their overall performance, including accuracy and time-to-solution. L-BFGS turns out to be the best method, in terms of runtime, if we solve for large gradient tolerances. If we are interested in computing accurate solutions with a small gradient norm, an inexact Gauss-Newton-Krylov optimizer with the regularization term as preconditioner performs best.
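
    The effect of using the regularization term as a preconditioner can be sketched on a toy problem. The example below assumes a 1D periodic model Hessian of the form beta * (I - Laplacian) + diag(w), where the weight w is only a stand-in for the data term; the operators, the parameter values, and the function names are illustrative assumptions and are not taken from the paper. The regularization operator is diagonal in Fourier space, so its inverse costs two FFTs.

```python
import numpy as np


def make_operators(n=256, beta=1.0):
    """Toy Hessian H = beta * (I - Laplacian) + diag(w) on a periodic grid."""
    x = 2 * np.pi * np.arange(n) / n
    k = np.fft.fftfreq(n, d=1.0 / n)       # integer wavenumbers
    reg_symbol = beta * (1.0 + k**2)       # Fourier symbol of the regularization
    w = np.cos(x) ** 2                     # hypothetical stand-in for the data term

    def apply_hessian(v):
        reg = np.real(np.fft.ifft(reg_symbol * np.fft.fft(v)))
        return reg + w * v

    def apply_reg_inverse(r):
        # Inverse of the regularization operator, applied spectrally
        return np.real(np.fft.ifft(np.fft.fft(r) / reg_symbol))

    return apply_hessian, apply_reg_inverse


def pcg(apply_a, b, apply_minv=None, tol=1e-8, maxiter=2000):
    """Preconditioned conjugate gradients; returns (solution, iterations)."""
    x = np.zeros_like(b)
    r = b.copy()
    z = apply_minv(r) if apply_minv else r.copy()
    p = z.copy()
    rz = r @ z
    b_norm = np.linalg.norm(b)
    for it in range(1, maxiter + 1):
        ap = apply_a(p)
        alpha = rz / (p @ ap)
        x += alpha * p
        r -= alpha * ap
        # A loose tol here is what makes a Newton-Krylov outer loop "inexact"
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = apply_minv(r) if apply_minv else r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter


if __name__ == "__main__":
    hess, reg_inv = make_operators()
    g = np.random.default_rng(0).standard_normal(256)  # mock gradient
    _, it_plain = pcg(hess, -g)
    _, it_prec = pcg(hess, -g, apply_minv=reg_inv)
    print("CG iterations without / with preconditioner:", it_plain, it_prec)
```

    In this toy setting the preconditioned operator is the identity plus a bounded perturbation, so its condition number stays small regardless of the grid size, whereas the condition number of the unpreconditioned operator grows with the largest wavenumber.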

    CLAIRE: Scalable GPU-Accelerated Algorithms for Diffeomorphic Image Registration in 3D

    We present our work on scalable, GPU-accelerated algorithms for diffeomorphic image registration. The associated software package is termed CLAIRE. Image registration is a non-linear inverse problem. It is about computing a spatial mapping from one image of the same object or scene to another. In diffeomorphic image registration, the set of admissible spatial transformations is restricted to maps that are smooth, one-to-one, and have a smooth inverse. We formulate diffeomorphic image registration as a variational problem governed by transport equations. We use an inexact, globalized (Gauss-)Newton-Krylov method for numerical optimization. We consider semi-Lagrangian methods for numerical time integration. Our solver features mixed-precision, hardware-accelerated computational kernels for optimal computational throughput. We use the message-passing interface for distributed-memory parallelism and deploy our code on modern high-performance computing architectures. Our solver allows us to solve clinically relevant problems in under four seconds on a single GPU. It can also be applied to large-scale 3D imaging applications with data that is discretized on meshes with billions of voxels. We demonstrate that our numerical framework yields high-fidelity results in only a few seconds, even if we search for an optimal regularization parameter.
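
    A common numerical check for the one-to-one property is that the determinant of the deformation gradient stays positive everywhere. The sketch below assumes a 2D periodic displacement field u on [0, 2*pi)^2 and the map y(x) = x + u(x), with derivatives computed spectrally; the function name and the test fields are hypothetical, and this is not CLAIRE's API.

```python
import numpy as np


def jacobian_determinant(u1, u2):
    """det(I + grad u) for a periodic 2D displacement field (u1, u2)."""
    n1, n2 = u1.shape
    ik1 = 1j * np.fft.fftfreq(n1, d=1.0 / n1)[:, None]  # derivative along axis 0
    ik2 = 1j * np.fft.fftfreq(n2, d=1.0 / n2)[None, :]  # derivative along axis 1

    def ddx(f, ik):
        return np.real(np.fft.ifft2(ik * np.fft.fft2(f)))

    return (1 + ddx(u1, ik1)) * (1 + ddx(u2, ik2)) - ddx(u1, ik2) * ddx(u2, ik1)


if __name__ == "__main__":
    n = 32
    grid = 2 * np.pi * np.arange(n) / n
    x1, x2 = np.meshgrid(grid, grid, indexing="ij")
    zero = np.zeros_like(x1)
    # Small displacement: determinant is 1 + 0.5 * cos(x1), positive everywhere
    det_ok = jacobian_determinant(0.5 * np.sin(x1), zero)
    # Large displacement: determinant is 1 + 1.5 * cos(x1), negative near x1 = pi
    det_folded = jacobian_determinant(1.5 * np.sin(x1), zero)
    print("min det (small):", det_ok.min(), " min det (large):", det_folded.min())
```

    A determinant equal to one everywhere corresponds to a locally volume-preserving map, and a negative value signals that the grid has folded and the map is no longer invertible.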
