14,640 research outputs found
Automatic Differentiation of Algorithms for Machine Learning
Automatic differentiation---the mechanical transformation of numeric computer
programs to calculate derivatives efficiently and accurately---dates to the
origin of the computer age. Reverse mode automatic differentiation both
antedates and generalizes the method of backwards propagation of errors used in
machine learning. Despite this, practitioners in a variety of fields, including
machine learning, have been little influenced by automatic differentiation, and
make scant use of available tools. Here we review the technique of automatic
differentiation, describe its two main modes, and explain how it can benefit
machine learning practitioners. To reach the widest possible audience our
treatment assumes only elementary differential calculus, and does not assume
any knowledge of linear algebra.Comment: 7 pages, 1 figur
Automatic Differentiation Variational Inference
Probabilistic modeling is iterative. A scientist posits a simple model, fits
it to her data, refines it according to her analysis, and repeats. However,
fitting complex models to large data is a bottleneck in this process. Deriving
algorithms for new models can be both mathematically and computationally
challenging, which makes it difficult to efficiently cycle through the steps.
To this end, we develop automatic differentiation variational inference (ADVI).
Using our method, the scientist only provides a probabilistic model and a
dataset, nothing else. ADVI automatically derives an efficient variational
inference algorithm, freeing the scientist to refine and explore many models.
ADVI supports a broad class of models-no conjugacy assumptions are required. We
study ADVI across ten different models and apply it to a dataset with millions
of observations. ADVI is integrated into Stan, a probabilistic programming
system; it is available for immediate use
Forward-Mode Automatic Differentiation in Julia
We present ForwardDiff, a Julia package for forward-mode automatic
differentiation (AD) featuring performance competitive with low-level languages
like C++. Unlike recently developed AD tools in other popular high-level
languages such as Python and MATLAB, ForwardDiff takes advantage of
just-in-time (JIT) compilation to transparently recompile AD-unaware user code,
enabling efficient support for higher-order differentiation and differentiation
using custom number types (including complex numbers). For gradient and
Jacobian calculations, ForwardDiff provides a variant of vector-forward mode
that avoids expensive heap allocation and makes better use of memory bandwidth
than traditional vector mode. In our numerical experiments, we demonstrate that
for nontrivially large dimensions, ForwardDiff's gradient computations can be
faster than a reverse-mode implementation from the Python-based autograd
package. We also illustrate how ForwardDiff is used effectively within JuMP, a
modeling language for optimization. According to our usage statistics, 41
unique repositories on GitHub depend on ForwardDiff, with users from diverse
fields such as astronomy, optimization, finite element analysis, and
statistics.
This document is an extended abstract that has been accepted for presentation
at the AD2016 7th International Conference on Algorithmic Differentiation.Comment: 4 page
TMB: Automatic Differentiation and Laplace Approximation
TMB is an open source R package that enables quick implementation of complex
nonlinear random effect (latent variable) models in a manner similar to the
established AD Model Builder package (ADMB, admb-project.org). In addition, it
offers easy access to parallel computations. The user defines the joint
likelihood for the data and the random effects as a C++ template function,
while all the other operations are done in R; e.g., reading in the data. The
package evaluates and maximizes the Laplace approximation of the marginal
likelihood where the random effects are automatically integrated out. This
approximation, and its derivatives, are obtained using automatic
differentiation (up to order three) of the joint likelihood. The computations
are designed to be fast for problems with many random effects (~10^6) and
parameters (~10^3). Computation times using ADMB and TMB are compared on a
suite of examples ranging from simple models to large spatial models where the
random effects are a Gaussian random field. Speedups ranging from 1.5 to about
100 are obtained with increasing gains for large problems. The package and
examples are available at http://tmb-project.org
Automatic Differentiation Tools in Optimization Software
We discuss the role of automatic differentiation tools in optimization
software. We emphasize issues that are important to large-scale optimization
and that have proved useful in the installation of nonlinear solvers in the
NEOS Server. Our discussion centers on the computation of the gradient and
Hessian matrix for partially separable functions and shows that the gradient
and Hessian matrix can be computed with guaranteed bounds in time and memory
requirementsComment: 11 page
On the Equivalence of Forward Mode Automatic Differentiation and Symbolic Differentiation
We show that forward mode automatic differentiation and symbolic
differentiation are equivalent in the sense that they both perform the same
operations when computing derivatives. This is in stark contrast to the common
claim that they are substantially different. The difference is often
illustrated by claiming that symbolic differentiation suffers from "expression
swell" whereas automatic differentiation does not. Here, we show that this
statement is not true. "Expression swell" refers to the phenomenon of a much
larger representation of the derivative as opposed to the representation of the
original function
- …