78,803 research outputs found

Automatic differentiation in machine learning: a survey

Author: Baydin Atilim Gunes
Pearlmutter Barak A.
Radul Alexey Andreyevich
Siskind Jeffrey Mark
Publication date: 01/01/2018

Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation" as these are encountered more and more in machine learning settings.

Comment: 43 pages, 5 figures

arXiv.org e-Print Archive

MURAL - Maynooth University Research Archive Library

NUI Maynooth Eprint Archive

Maynooth University ePrints and eTheses Archive

Oxford University Research Archive

Automatic linearity detection

Author: Birkisson Asgeir
Driscoll Tobin A.
Publication venue: SICS
Publication date: 01/01/2013

Given a function, or more generally an operator, the question "Is it linear?" seems simple to answer. In many applications of scientific computing it might be worth determining the answer to this question in an automated way; some functionality, such as operator exponentiation, is only defined for linear operators, and in other problems, time saving is available if it is known that the problem being solved is linear. Linearity detection is closely connected to sparsity detection of Hessians, so for large-scale applications, memory savings can be made if linearity information is known. However, implementing such an automated detection is not as straightforward as one might expect. This paper describes how automatic linearity detection can be implemented in combination with automatic differentiation, both for standard scientific computing software, and within the Chebfun software system. The key ingredients for the method are the observation that linear operators have constant derivatives, and the propagation of two logical vectors, $\ell$ and $c$, as computations are carried out. The values of $\ell$ and $c$ are determined by whether output variables have constant derivatives and constant values with respect to each input variable. The propagation of their values through an evaluation trace of an operator yields the desired information about the linearity of that operator.
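The propagation idea can be sketched in a few lines of Python for scalar expressions in several inputs. This is only an illustration under stated assumptions: the class `Tracked`, the helpers `inputs` and `is_linear`, and the specific propagation rules are hypothetical and conservative, and are not taken from the paper or from Chebfun. Each value carries a flag vector `c` (constant with respect to input i) and a flag vector `ell` (derivative with respect to input i is constant).

```python
import math


class Tracked:
    """A value plus two logical vectors, one flag per input variable.

    c[i]   -- True if the value is constant with respect to input i
    ell[i] -- True if the derivative with respect to input i is constant
    """

    def __init__(self, value, c, ell):
        self.value = value
        self.c = list(c)
        self.ell = list(ell)

    @staticmethod
    def lift(x, n):
        # Plain numbers are constant in every input.
        if isinstance(x, Tracked):
            return x
        return Tracked(x, [True] * n, [True] * n)

    def __add__(self, other):
        other = Tracked.lift(other, len(self.c))
        return Tracked(
            self.value + other.value,
            [a and b for a, b in zip(self.c, other.c)],
            [a and b for a, b in zip(self.ell, other.ell)],
        )

    __radd__ = __add__

    def __mul__(self, other):
        other = Tracked.lift(other, len(self.c))
        u_const, v_const = all(self.c), all(other.c)
        # The derivative of u*v stays constant if neither factor depends on
        # input i, or if one factor is a pure constant and the other has a
        # constant derivative with respect to input i.
        ell = [
            (cu and cv) or (u_const and lv) or (v_const and lu)
            for cu, cv, lu, lv in zip(self.c, other.c, self.ell, other.ell)
        ]
        return Tracked(
            self.value * other.value,
            [a and b for a, b in zip(self.c, other.c)],
            ell,
        )

    __rmul__ = __mul__


def sin(u):
    # A nonlinear function has a constant derivative only with respect to
    # inputs its argument does not depend on.
    return Tracked(math.sin(u.value), u.c, u.c)


def inputs(values):
    # Input j depends only on itself and has derivative 1 or 0 everywhere.
    n = len(values)
    return [
        Tracked(v, [i != j for i in range(n)], [True] * n)
        for j, v in enumerate(values)
    ]


def is_linear(f, values):
    out = Tracked.lift(f(*inputs(values)), len(values))
    return all(out.ell)


print(is_linear(lambda x, y: 2.0 * x + 3.0 * y, [1.0, 2.0]))
print(is_linear(lambda x, y: x * y, [1.0, 2.0]))
```

Because the test is "all derivatives are constant", this sketch also accepts affine expressions such as `x + 1.0`; a stricter check would need to track the constant offset separately.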

Oxford University Research Archive

Differentiable Programming Tensor Networks

Author: Liao Hai-Jun
Liu Jin-Guo
Wang Lei
Xiang Tao
Publication venue: American Physical Society (APS)
Publication date: 12/07/2019

Differentiable programming is a fresh programming paradigm which composes parameterized algorithmic components and trains them using automatic differentiation (AD). The concept emerges from deep learning but is not limited to training neural networks. We present theory and practice of programming tensor network algorithms in a fully differentiable way. By formulating the tensor network algorithm as a computation graph, one can compute higher order derivatives of the program accurately and efficiently using AD. We present essential techniques to differentiate through the tensor network contractions, including stable AD for tensor decomposition and efficient backpropagation through fixed point iterations. As a demonstration, we compute the specific heat of the Ising model directly by taking the second order derivative of the free energy obtained in the tensor renormalization group calculation. Next, we perform gradient based variational optimization of infinite projected entangled pair states for the quantum antiferromagnetic Heisenberg model and obtain state-of-the-art variational energy and magnetization with moderate effort. Differentiable programming removes laborious human effort in deriving and implementing analytical gradients for tensor network programs, which opens the door to more innovations in tensor network algorithms and applications.

Comment: Typos corrected, discussion and refs added; revised version accepted for publication in PRX. Source code available at https://github.com/wangleiphy/tensorgra
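The "specific heat from a second derivative" step can be illustrated with a small Python sketch. It makes several assumptions that are not in the abstract: the free energy comes from the exact zero-field 1D Ising chain, $\ln z(\beta) = \ln(2\cosh \beta J)$ per site, as a stand-in for the tensor renormalization group result; units are chosen so that $k_B = 1$; and the class `Jet2` with its helper functions is a hypothetical second-order forward-mode scheme, not the authors' code. With these conventions the specific heat per site is $C = \beta^2\, \partial^2 \ln z / \partial \beta^2$.

```python
import math


class Jet2:
    """Truncated Taylor data (f, f', f'') in a single variable."""

    def __init__(self, v, d1=0.0, d2=0.0):
        self.v, self.d1, self.d2 = v, d1, d2

    @staticmethod
    def lift(x):
        # Plain numbers have zero first and second derivatives.
        return x if isinstance(x, Jet2) else Jet2(x)

    def __mul__(self, other):
        o = Jet2.lift(other)
        # Leibniz rule up to second order.
        return Jet2(
            self.v * o.v,
            self.d1 * o.v + self.v * o.d1,
            self.d2 * o.v + 2.0 * self.d1 * o.d1 + self.v * o.d2,
        )

    __rmul__ = __mul__


def cosh(u):
    ch, sh = math.cosh(u.v), math.sinh(u.v)
    return Jet2(ch, sh * u.d1, ch * u.d1 ** 2 + sh * u.d2)


def log(u):
    r = u.d1 / u.v
    return Jet2(math.log(u.v), r, u.d2 / u.v - r ** 2)


def log_z(beta, J=1.0):
    # Stand-in free energy: exact 1D Ising chain, zero field, per site.
    return log(2.0 * cosh(J * beta))


def specific_heat(beta, J=1.0):
    # Seed beta with derivative 1 and read off the second derivative.
    jet = log_z(Jet2(beta, 1.0, 0.0), J)
    return beta ** 2 * jet.d2


print(specific_heat(0.7))
```

For this stand-in model the result can be checked against the closed form $(\beta J)^2 / \cosh^2(\beta J)$, which is what makes it a convenient test case before moving to a numerically obtained free energy.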

arXiv.org e-Print Archive

Directory of Open Access Journals

Automating embedded analysis capabilities and managing software complexity in multiphysics simulation part I: template-based generic programming

Author: Pawlowski Roger P.
Phipps Eric T.
Salinger Andrew G.
Publication date: 01/01/2012

An approach for incorporating embedded simulation and analysis capabilities in complex simulation codes through template-based generic programming is presented. This approach relies on templating and operator overloading within the C++ language to transform a given calculation into one that can compute a variety of additional quantities that are necessary for many state-of-the-art simulation and analysis algorithms. An approach for incorporating these ideas into complex simulation codes through general graph-based assembly is also presented. These ideas have been implemented within a set of packages in the Trilinos framework and are demonstrated on a simple problem from chemical engineering.

arXiv.org e-Print Archive

Directory of Open Access Journals

ADF95: Tool for automatic differentiation of a FORTRAN code designed for large numbers of independent variables

Author: Adams
Beck
Christian W. Straka
Ehrig
Metcalf
Stamatiadis
Publication venue: Elsevier BV
Publication date: 04/03/2005

ADF95 is a tool to automatically calculate numerical first derivatives for any mathematical expression as a function of user defined independent variables. Accuracy of derivatives is achieved within machine precision. ADF95 may be applied to any FORTRAN 77/90/95 conforming code and requires minimal changes by the user. It provides a new derived data type that holds the value and derivatives and applies forward differencing by overloading all FORTRAN operators and intrinsic functions. An efficient indexing technique leads to a reduced memory usage and a substantially increased performance gain over other available tools with operator overloading. This gain is especially pronounced for sparse systems with a large number of independent variables. A wide class of numerical simulations, e.g., those employing implicit solvers, can profit from ADF95.

Comment: 24 pages, 2 figures, 4 tables, accepted in Computer Physics Communications
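The general pattern described here, a data type that holds a value together with its derivatives and overloads operators and intrinsic functions, can be sketched in Python. This is an assumption-laden illustration rather than ADF95 itself: the language is Python instead of FORTRAN, the class `Dual` and the helper `independent` are hypothetical names, and the dictionary keyed by input index is just one simple way to store only nonzero partial derivatives, not the indexing technique the tool actually uses.

```python
import math


class Dual:
    """A value together with a sparse map {input index: partial derivative}."""

    def __init__(self, value, grad=None):
        self.value = value
        self.grad = dict(grad or {})

    @staticmethod
    def lift(x):
        # Plain numbers carry no derivative entries.
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        grad = dict(self.grad)
        for i, d in other.grad.items():
            grad[i] = grad.get(i, 0.0) + d
        return Dual(self.value + other.value, grad)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule: d(u*v) = v*du + u*dv, touching only stored entries.
        grad = {i: other.value * d for i, d in self.grad.items()}
        for i, d in other.grad.items():
            grad[i] = grad.get(i, 0.0) + self.value * d
        return Dual(self.value * other.value, grad)

    __rmul__ = __mul__


def sin(u):
    # Overloaded intrinsic: chain rule applied to each stored partial.
    u = Dual.lift(u)
    return Dual(
        math.sin(u.value),
        {i: math.cos(u.value) * d for i, d in u.grad.items()},
    )


def independent(values):
    # Each independent variable starts with a single unit partial.
    return [Dual(v, {i: 1.0}) for i, v in enumerate(values)]


x, y, z = independent([2.0, 3.0, 5.0])
f = x * y + sin(x)
print(f.value, f.grad)
```

Since `f` never touches `z`, no entry for index 2 is ever created, which is the kind of saving that matters when there are many independent variables and each result depends on only a few of them.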

arXiv.org e-Print Archive

Crossref