
    Hessian Matrix-Free Lagrange-Newton-Krylov-Schur-Schwarz Methods for Elliptic Inverse Problems

    This study focuses on the solution of inverse problems for elliptic systems. The inverse problem is constructed as a PDE-constrained optimization, where the cost function is the L2 norm of the difference between the measured data and the predicted state variable, and the constraint is an elliptic PDE. Particular examples of the systems considered in this study are groundwater flow and radiation transport. The inverse problems are typically ill-posed due to error in measurements of the data. Regularization methods are employed to partially alleviate this problem. The PDE-constrained optimization is formulated as the minimization of a Lagrangian functional, formed from the regularized cost function and the discretized PDE, with respect to the parameters, the state variables, and the Lagrange multipliers. Our approach is known as an all-at-once method. An algorithm is proposed for an inverse problem that is capable of being extended to large scales. To overcome storage limitations, we develop a parallel preconditioned Newton-Krylov method employed in a Hessian-free manner. The preconditioners have an inner-outer structure, taking the form of a Schur complement (block factorization) at the outer level and Schwarz projections at the inner level. However, building an exact Schur complement is prohibitively expensive. Thus, we use Schur complement approximations, including the identity, probing, the Laplacian, the J operator, and a BFGS operator. For exact data the exact Schur complements are superior to the inexact approximations. However, for data with noise the inexact methods are competitive with or even better than the exact ones in every computational aspect. We also find that nonsymmetric forms of the Karush-Kuhn-Tucker matrices and preconditioners are competitive with or better than the symmetric forms that are commonly used in the optimization community. In this study, iterative Tikhonov and Total Variation regularizations are proposed and compared to the standard regularizations and to each other. For exact data with jump discontinuities the standard and iterative Total Variation regularizations are superior to the standard and iterative Tikhonov regularizations. However, in the case of noisy data the proposed iterative Tikhonov regularizations are superior to the standard and iterative Total Variation methods. We also show that in some cases the iterative regularizations are better than the noniterative ones. To demonstrate the performance of the algorithm, including the effectiveness of the preconditioners and regularizations, synthetic one- and two-dimensional elliptic inverse problems are solved, and we also compare with other methodologies that are available in the literature. The proposed algorithm performs well with regard to robustness, reconstructs the parameter models effectively, and is easily implemented in the framework of the available parallel PDE software PETSc and the automatic differentiation software ADIC. The algorithm is also extendable to three-dimensional problems.
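    As a rough, non-authoritative sketch of the Hessian-free (matrix-free) Newton-Krylov idea described above, the Python snippet below never assembles the Hessian: it supplies Hessian-vector products via finite differences of the gradient through a scipy LinearOperator and solves each Newton step with conjugate gradients. The toy regularized least-squares objective and all names are assumptions for illustration only; the paper's actual implementation works on the full KKT system with PETSc, ADIC, and Schur/Schwarz preconditioners.

        # Minimal Hessian-free Newton-Krylov sketch (illustrative, not the paper's code).
        import numpy as np
        from scipy.sparse.linalg import LinearOperator, cg

        # Toy regularized least-squares problem (assumed for illustration).
        rng = np.random.default_rng(0)
        A = rng.standard_normal((50, 20))
        d = rng.standard_normal(50)
        beta = 1e-2

        def grad(x):
            # gradient of 0.5*||A x - d||^2 + 0.5*beta*||x||^2
            return A.T @ (A @ x - d) + beta * x

        def newton_krylov_step(x, eps=1e-6):
            g = grad(x)
            # Hessian-vector product by forward-differencing the gradient,
            # so the Hessian is never formed or stored.
            def hess_vec(v):
                return (grad(x + eps * v) - g) / eps
            H = LinearOperator((x.size, x.size), matvec=hess_vec)
            p, info = cg(H, -g)    # inner Krylov solve of the Newton system
            return x + p

        x = np.zeros(20)
        for _ in range(5):
            x = newton_krylov_step(x)
        print("gradient norm after 5 Newton-Krylov steps:", np.linalg.norm(grad(x)))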

    Diffusion Approximations for Online Principal Component Estimation and Global Convergence

    In this paper, we propose to adopt diffusion approximation tools to study the dynamics of Oja's iteration, an online stochastic gradient descent method for principal component analysis. Oja's iteration maintains a running estimate of the true principal component from streaming data and enjoys low time and space complexity. We show that Oja's iteration for the top eigenvector generates a continuous-state, discrete-time Markov chain over the unit sphere. We characterize Oja's iteration in three phases using diffusion approximation and weak convergence tools. Our three-phase analysis further provides a finite-sample error bound for the running estimate, which matches the minimax information lower bound for principal component analysis under the additional assumption of bounded samples. Comment: Appeared in NIPS 201
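    For concreteness, here is a minimal NumPy sketch of Oja's iteration for the top eigenvector: each streaming sample triggers one stochastic gradient step followed by projection back onto the unit sphere, which is the Markov chain over the sphere that the abstract analyzes. The step size, the spiked-covariance toy data, and all names are assumptions for illustration, not the paper's setup.

        # Minimal sketch of Oja's iteration for the top principal component
        # from streaming data (step size and data model are assumed).
        import numpy as np

        def oja_top_eigvec(sample_stream, dim, eta=0.01):
            rng = np.random.default_rng(0)
            w = rng.standard_normal(dim)
            w /= np.linalg.norm(w)          # start on the unit sphere
            for x in sample_stream:
                w += eta * x * (x @ w)      # stochastic step: w += eta * (x x^T) w
                w /= np.linalg.norm(w)      # project back onto the unit sphere
            return w

        # Toy stream from a spiked covariance with leading direction e_1 (assumed).
        rng = np.random.default_rng(1)
        d, n = 20, 20000
        spike = np.zeros(d)
        spike[0] = 1.0
        stream = (2.0 * rng.standard_normal() * spike + 0.5 * rng.standard_normal(d)
                  for _ in range(n))
        w_hat = oja_top_eigvec(stream, d, eta=0.005)
        print("alignment with true top PC:", abs(w_hat @ spike))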

    Automatic differentiation in machine learning: a survey

    Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation" as these are encountered more and more in machine learning settings. Comment: 43 pages, 5 figures
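    As a small illustration of one technique the survey covers, the sketch below implements forward-mode automatic differentiation with dual numbers: each arithmetic operation propagates an exact tangent alongside the primal value, which is what distinguishes AD from both symbolic and numerical differentiation. The class and function names are invented for this example and are not taken from the survey.

        # Minimal forward-mode automatic differentiation via dual numbers
        # (illustrative sketch; class and function names are assumed).
        import math

        class Dual:
            def __init__(self, val, dot=0.0):
                self.val, self.dot = val, dot    # primal value and tangent
            def __add__(self, other):
                other = other if isinstance(other, Dual) else Dual(other)
                return Dual(self.val + other.val, self.dot + other.dot)
            __radd__ = __add__
            def __mul__(self, other):
                other = other if isinstance(other, Dual) else Dual(other)
                # product rule: (uv)' = u'v + uv'
                return Dual(self.val * other.val,
                            self.dot * other.val + self.val * other.dot)
            __rmul__ = __mul__

        def sin(x):
            if isinstance(x, Dual):
                return Dual(math.sin(x.val), math.cos(x.val) * x.dot)
            return math.sin(x)

        def derivative(f, x):
            # seed the tangent with 1.0 to obtain df/dx at x
            return f(Dual(x, 1.0)).dot

        f = lambda x: x * x + 3 * sin(x)
        print(derivative(f, 1.5))    # exact derivative: 2*1.5 + 3*cos(1.5)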