Composing Scalable Nonlinear Algebraic Solvers
Most efficient linear solvers use composable algorithmic components, with the
most common model being the combination of a Krylov accelerator and one or more
preconditioners. A similar set of concepts may be used for nonlinear algebraic
systems, where nonlinear composition of different nonlinear solvers may
significantly improve the time to solution. We describe the basic concepts of
nonlinear composition and preconditioning and present a number of solvers
applicable to nonlinear partial differential equations. We have developed a
software framework to easily explore the possible combinations of
solvers. We show that the performance gains from using composed solvers can be
substantial compared with gains from standard Newton-Krylov methods.
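As a toy illustration of nonlinear composition (a hypothetical sketch, not the paper's framework), the following composes an inner damped fixed-point sweep with an outer Newton step on a small 2x2 system; all functions, constants, and tolerances are invented for the example.

```python
# Minimal sketch of multiplicative nonlinear composition: an inner
# fixed-point sweep is composed with an outer Newton step. The residual
# F, its Jacobian J, and all parameters are illustrative only.
import numpy as np

def F(x):
    # Toy nonlinear residual with exact solution (1, 2).
    return np.array([x[0]**2 + x[1] - 3.0,
                     x[0] + x[1]**2 - 5.0])

def J(x):
    # Analytic Jacobian of F.
    return np.array([[2.0 * x[0], 1.0],
                     [1.0, 2.0 * x[1]]])

def picard_sweep(x):
    # Inner nonlinear solver M(x): one damped fixed-point update.
    return x - 0.1 * F(x)

def composed_newton(x, tol=1e-10, max_it=50):
    # Multiplicative composition: apply the inner sweep first, then
    # take a Newton step from the intermediate iterate.
    for _ in range(max_it):
        if np.linalg.norm(F(x)) < tol:
            break
        y = picard_sweep(x)                   # inner nonlinear solver
        x = y - np.linalg.solve(J(y), F(y))   # outer Newton step
    return x

print(composed_newton(np.array([1.0, 1.0])))  # converges to (1, 2)
```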
Riemannian Acceleration with Preconditioning for symmetric eigenvalue problems
In this paper, we propose a Riemannian Acceleration with Preconditioning
(RAP) method for symmetric eigenvalue problems, one of the most important
geodesically convex optimization problems on Riemannian manifolds, and
establish its acceleration. First, we discuss preconditioning for symmetric
eigenvalue problems from the Riemannian-manifold viewpoint. To obtain the
local geodesic convexity, we develop the leading angle to measure the quality
of the preconditioner for symmetric eigenvalue problems. A new Riemannian
acceleration, called the Locally Optimal Riemannian Accelerated Gradient
(LORAG) method, is proposed to cope with the merely local geodesic convexity of
symmetric eigenvalue problems. With techniques similar to those for RAGD and to the analysis of local
convex optimization in Euclidean space, we analyze the convergence of LORAG.
Combining the local geodesic convexity of symmetric eigenvalue problems under
preconditioning with LORAG, we arrive at the Riemannian Acceleration with
Preconditioning (RAP) method and prove its acceleration. Additionally, when a
Schwarz preconditioner, such as an overlapping or non-overlapping domain
decomposition method, is applied to elliptic eigenvalue problems, we also
obtain the rate of convergence, which involves a constant independent of the
mesh sizes and the eigenvalue gap, the parameter from the stable decomposition,
and the smallest two eigenvalues of the elliptic operator. Numerical results
show the power of Riemannian acceleration and preconditioning.
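To make the setting concrete, here is a hedged sketch (not the paper's RAP or LORAG algorithms) of preconditioned Riemannian minimization of the Rayleigh quotient on the unit sphere, using a locally optimal step in the spirit of LOBPCG; the Jacobi preconditioner and all parameters are illustrative stand-ins for the Schwarz preconditioners discussed above.

```python
# Hedged sketch: preconditioned Riemannian descent for the smallest
# eigenvalue, i.e. Rayleigh-quotient minimization on the unit sphere.
# The "locally optimal" update (Rayleigh-Ritz on span{x, d}) mimics
# LOBPCG; it is NOT the RAP/LORAG method of the paper.
import numpy as np

def smallest_eig(A, M_inv, x0, iters=500):
    x = x0 / np.linalg.norm(x0)
    for _ in range(iters):
        rho = x @ A @ x                     # Rayleigh quotient (||x|| = 1)
        g = A @ x - rho * x                 # Riemannian gradient direction
        d = M_inv(g)                        # preconditioned direction
        V, _ = np.linalg.qr(np.column_stack([x, d]))
        w, U = np.linalg.eigh(V.T @ A @ V)  # 2x2 Rayleigh-Ritz problem
        x = V @ U[:, 0]                     # unit-norm Ritz vector = retraction
    return x @ A @ x, x

rng = np.random.default_rng(0)
B = rng.standard_normal((100, 100))
A = B @ B.T + np.eye(100)                   # symmetric positive definite test
M_inv = lambda g: g / np.diag(A)            # Jacobi preconditioner (illustrative)
lam, _ = smallest_eig(A, M_inv, rng.standard_normal(100))
print(lam, np.linalg.eigvalsh(A)[0])        # compare with the exact minimum
```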
Monolithic Multigrid for Magnetohydrodynamics
The magnetohydrodynamics (MHD) equations model a wide range of plasma physics
applications and are characterized by a nonlinear system of partial
differential equations that strongly couples a charged fluid with the evolution
of electromagnetic fields. After discretization and linearization, the
resulting system of equations is generally difficult to solve due to the
coupling between variables and the heterogeneous coefficients induced by the
linearization process. In this paper, we investigate multigrid preconditioners
for this system based on specialized relaxation schemes that properly address
the system structure and coupling. Three extensions of Vanka relaxation are
proposed and applied to problems with up to 170 million degrees of freedom and
fluid and magnetic Reynolds numbers up to 400 for stationary problems and up to
20,000 for time-dependent problems.
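To fix ideas, the following is a hedged sketch of one additive Vanka sweep on a generic matrix: every patch of coupled unknowns is solved exactly against the current residual, and the overlapping updates are damped. A 1D Laplacian stands in for the MHD system and the patch layout is invented; real Vanka patches group the physical variables attached to each cell.

```python
# Hedged, generic additive Vanka sweep (not the paper's MHD-specific
# relaxation). Each patch's coupled block is inverted exactly against
# the current global residual; overlapping updates are damped by omega.
import numpy as np

def vanka_sweep(A, b, x, patches, omega=0.5):
    r = b - A @ x
    dx = np.zeros_like(x)
    for p in patches:                       # p: DoF indices of one patch
        dx[p] += np.linalg.solve(A[np.ix_(p, p)], r[p])
    return x + omega * dx

n = 64
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1D Laplacian stand-in
b = np.random.default_rng(0).standard_normal(n)
patches = [np.arange(i, min(i + 4, n)) for i in range(0, n, 2)]  # overlapping

x = np.zeros(n)
for sweep in range(10):
    x = vanka_sweep(A, b, x, patches)
    print(sweep, np.linalg.norm(b - A @ x))
# Oscillatory error decays rapidly while smooth error stagnates: Vanka
# acts as a smoother, and the coarse levels of a (monolithic) multigrid
# hierarchy are what remove the remaining smooth components.
```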
Substructured formulations of nonlinear structure problems - influence of the interface condition
We investigate the use of non-overlapping domain decomposition (DD) methods
for nonlinear structural problems. The classic approach combines a global
Newton solver with a linear DD solver for the tangent systems. We propose a
framework where we can swap Newton and DD, so that we solve independent
nonlinear problems for each substructure and linear condensed interface
problems. The objective is to decrease the number of communications between
subdomains and to improve parallelism. Depending on the interface condition, we
derive several formulations which, unlike in the linear case, are not
equivalent. Primal, dual and mixed variants are described and assessed on a simple
plasticity problem.
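A toy illustration of the swapped ordering (a hedged sketch, not the paper's formulations): each subdomain performs its own inner Newton solve for a given interface value, and an outer Newton iteration on the interface enforces equilibrium. A chain of cubic springs stands in for the structural problem; all names and values are invented.

```python
# Hedged toy of primal nonlinear substructuring: nonlinear subdomain
# solves inside, Newton on the condensed interface problem outside.
import numpy as np

def spring(d):
    # Nonlinear spring force for elongation d (illustrative law).
    return d + 0.5 * d**3

def subdomain_reaction(u_left, u_right):
    # Inner nonlinear solve: equilibrate the interior node u of the
    # 3-node chain [u_left, u, u_right] by scalar Newton, then return
    # the tension transmitted at the right end.
    u = 0.5 * (u_left + u_right)
    for _ in range(50):
        r = spring(u - u_left) - spring(u_right - u)
        k = (1 + 1.5 * (u - u_left)**2) + (1 + 1.5 * (u_right - u)**2)
        u -= r / k
    return spring(u_right - u)

def interface_residual(ug):
    # Equilibrium at the interface: tensions from both sides balance.
    left = subdomain_reaction(0.0, ug)      # subdomain 1, fixed end at 0
    right = subdomain_reaction(ug, 1.0)     # subdomain 2, fixed end at 1
    return left - right

ug, h = 0.3, 1e-7
for _ in range(20):                          # outer Newton on the interface
    r = interface_residual(ug)
    drdu = (interface_residual(ug + h) - r) / h   # finite-difference tangent
    ug -= r / drdu
print(ug, interface_residual(ug))            # ug -> 0.5 by symmetry
```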
Distributed-memory parallelization of the aggregated unfitted finite element method
The aggregated unfitted finite element method (AgFEM) is a methodology
recently introduced to address conditioning and stability problems
associated with embedded, unfitted, or extended finite element methods. The
method is based on the removal of basis functions associated with badly cut cells
by introducing carefully designed constraints, which results in well-posed
systems of linear algebraic equations, while preserving the optimal
approximation order of the underlying finite element spaces. The specific goal
of this work is to present the implementation and performance of the method on
distributed-memory platforms aiming at the efficient solution of large-scale
problems. In particular, we show that, by considering AgFEM, the resulting
systems of linear algebraic equations can be effectively solved using standard
algebraic multigrid preconditioners. This is in contrast with previous works
that rely on highly customized preconditioners to enable the use of iterative
solvers in combination with unfitted techniques. Another novelty
with respect to the methods available in the literature is the problem sizes
that can be handled with the proposed approach. While most previous
references discussing linear solvers for unfitted methods are based on serial
non-scalable algorithms, we propose a parallel distributed-memory method able
to efficiently solve problems at large scales. This is demonstrated by means of
a weak scaling test on complex 3D domains with up to 300M degrees of freedom
and one billion cells on 16K CPU cores on the Marenostrum-IV platform. The
parallel implementation of the AgFEM method is available in the large-scale
finite element package FEMPAR.
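The practical claim here, that AgFEM yields systems standard AMG handles well, corresponds to a solver pattern like the hedged PyAMG sketch below: smoothed-aggregation AMG used as a preconditioner for CG. A plain Poisson matrix stands in for an actual AgFEM system, since no unfitted discretization is assembled; the point is only that off-the-shelf AMG suffices once the system is well posed.

```python
# Hedged sketch of the "standard AMG" solver pattern using PyAMG.
# The Poisson model problem is an illustrative stand-in for an
# AgFEM-constrained system.
import numpy as np
import pyamg

A = pyamg.gallery.poisson((200, 200), format='csr')  # SPD model problem
b = np.ones(A.shape[0])

ml = pyamg.smoothed_aggregation_solver(A)            # standard AMG hierarchy
residuals = []
x = ml.solve(b, tol=1e-10, accel='cg', residuals=residuals)
print(len(residuals), np.linalg.norm(b - A @ x))     # iteration count, residual
```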
Iterative methods for heterogeneous media
Large Scale Kernel Methods for Fun and Profit
Kernel methods are among the most flexible classes of machine learning models with strong theoretical guarantees. Wide classes of functions can be approximated arbitrarily well with kernels, and fast convergence and learning rates have been formally shown to hold. However, exact kernel methods are known to scale poorly with increasing dataset size, and we believe that one of the factors limiting their usage in modern machine learning is the lack of scalable and easy-to-use algorithms and software. The main goal of this thesis is to study kernel methods from the point of view of efficient learning, with particular emphasis on large-scale data, but also on low-latency training and user efficiency.

We improve the state of the art for scaling kernel solvers to datasets with billions of points using the Falkon algorithm, which combines random projections with fast optimization. Running it on GPUs, we show how to fully utilize available computing power for training kernel machines. To boost the ease of use of approximate kernel solvers, we propose an algorithm for automated hyperparameter tuning: by minimizing a penalized loss function, a model can be learned together with its hyperparameters, reducing the time needed for user-driven experimentation. In the setting of multi-class learning, we show that, under stringent but realistic assumptions on the separation between classes, a wide set of algorithms needs far fewer data points than in the more general setting (without assumptions on class separation) to reach the same accuracy.

The first part of the thesis develops a framework for efficient and scalable kernel machines. This raises the question of whether our approaches can be used successfully in real-world applications, especially compared to alternatives based on deep learning, which are often deemed hard to beat. The second part investigates this question on two main applications, chosen because of the paramount importance of having an efficient algorithm. First, we consider the problem of instance segmentation of images taken from the iCub robot. Here Falkon is used as part of a larger pipeline, but the efficiency afforded by our solver is essential to ensure smooth human-robot interactions. In the second instance, we consider time-series forecasting of wind speed, analysing the relevance of different physical variables on the predictions themselves, and we investigate different schemes to adapt i.i.d. learning to the time-series setting. Overall, this work aims to demonstrate, through novel algorithms and examples, that kernel methods are up to computationally demanding tasks, and that there are concrete applications in which their use is warranted and more efficient than that of other, more complex, and less theoretically grounded models.
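The computational core that Falkon accelerates is Nyström-restricted kernel ridge regression. The sketch below shows that core in plain NumPy with a direct solve; the real Falkon replaces the solve with a Cholesky-preconditioned conjugate-gradient iteration and GPU kernels, and every parameter here (kernel width, number of centers, regularization) is an illustrative choice.

```python
# Hedged sketch of Nystrom kernel ridge regression, the problem Falkon
# solves at scale. Direct solve for clarity; not the Falkon algorithm.
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

def nystrom_krr_fit(X, y, m=100, lam=1e-6, sigma=1.0):
    n = X.shape[0]
    centers = X[np.random.default_rng(0).choice(n, m, replace=False)]
    Knm = gaussian_kernel(X, centers, sigma)
    Kmm = gaussian_kernel(centers, centers, sigma)
    # Normal equations of the Nystrom-restricted ridge problem:
    # (Knm^T Knm + lam * n * Kmm) alpha = Knm^T y
    alpha = np.linalg.solve(Knm.T @ Knm + lam * n * Kmm, Knm.T @ y)
    return centers, alpha

def predict(X, centers, alpha, sigma=1.0):
    return gaussian_kernel(X, centers, sigma) @ alpha

# Toy usage: regress a noisy sine with 100 inducing points.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, (2000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(2000)
centers, alpha = nystrom_krr_fit(X, y)
print(np.abs(predict(X, centers, alpha) - np.sin(X[:, 0])).mean())
```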