1,473 research outputs found
Automatic differentiation in machine learning: a survey
Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in
machine learning. Automatic differentiation (AD), also called algorithmic
differentiation or simply "autodiff", is a family of techniques similar to but
more general than backpropagation for efficiently and accurately evaluating
derivatives of numeric functions expressed as computer programs. AD is a small
but established field with applications in areas including computational fluid
dynamics, atmospheric sciences, and engineering design optimization. Until very
recently, the fields of machine learning and AD have largely been unaware of
each other and, in some cases, have independently discovered each other's
results. Despite its relevance, general-purpose AD has been missing from the
machine learning toolbox, a situation slowly changing with its ongoing adoption
under the names "dynamic computational graphs" and "differentiable
programming". We survey the intersection of AD and machine learning, cover
applications where AD has direct relevance, and address the main implementation
techniques. By precisely defining the main differentiation techniques and their
interrelationships, we aim to bring clarity to the usage of the terms
"autodiff", "automatic differentiation", and "symbolic differentiation" as
these are encountered more and more in machine learning settings.
Comment: 43 pages, 5 figures
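The abstract's distinction between AD and symbolic or numerical differentiation can be illustrated with a minimal forward-mode sketch based on dual numbers. This is not code from the survey: the names `Dual`, `_lift`, `sin`, `derivative`, and `f` are hypothetical, and only addition, multiplication, and sine are supported.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative (tangent) w.r.t. one chosen input."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        # Sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (u * v)' = u' * v + u * v'
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero tangent)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    # Chain rule for an elementary function: sin(u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.tan)


def derivative(f, x):
    """Evaluate f'(x) by seeding the input tangent with 1."""
    return f(Dual(x, 1.0)).tan


def f(x):
    # An ordinary program; no symbolic expression is ever built.
    return x * x * x + 2 * x + sin(x)


if __name__ == "__main__":
    print(derivative(f, 2.0))  # analytically 3*x**2 + 2 + cos(x) at x = 2
```

The derivative is propagated alongside the value at each elementary operation, so the result is exact up to floating-point rounding and has no truncation error of the kind finite differences introduce.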
Verifying an Effect-Handler-Based Define-By-Run Reverse-Mode AD Library
We apply program verification technology to the problem of specifying and
verifying automatic differentiation (AD) algorithms. We focus on
"define-by-run", a style of AD where the program that must be differentiated
is executed and monitored by the automatic differentiation algorithm. We begin
by asking, "what is an implementation of AD?" and "what does it mean for an
implementation of AD to be correct?'' We answer these questions both at an
informal level, in precise English prose, and at a formal level, using types
and logical assertions. After answering these broad questions, we focus on a
specific implementation of AD, which involves a number of subtle programming
language features, including dynamically allocated mutable state, first-class
functions, and effect handlers. We present a machine-checked proof, expressed
in a modern variant of Separation Logic, of its correctness. We view this
result as an advanced exercise in program verification, with potential future
applications to the verification of more realistic automatic differentiation
systems and of other software components that exploit delimited control
effects.
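The define-by-run style described in this abstract, where the program is executed and monitored, can be sketched in a few lines. This is an illustrative assumption, not the verified library from the paper: it records a graph through operator overloading rather than effect handlers, and the names `Var`, `_lift`, and `gradient` are hypothetical.

```python
class Var:
    """A node recorded while the user's function runs."""

    def __init__(self, val, parents=()):
        self.val = val
        self.parents = parents  # pairs (parent Var, local partial derivative)
        self.grad = 0.0         # accumulated adjoint, filled in backward pass

    def __add__(self, other):
        other = _lift(other)
        return Var(self.val + other.val, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        return Var(self.val * other.val,
                   ((self, other.val), (other, self.val)))

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constant nodes."""
    return x if isinstance(x, Var) else Var(float(x))


def gradient(f, *xs):
    """Run f once on monitored inputs, then propagate adjoints backwards."""
    inputs = [Var(float(x)) for x in xs]
    out = _lift(f(*inputs))  # forward pass: executing f builds the graph

    # Post-order traversal: every node appears after all of its parents.
    order, seen = [], set()

    def visit(v):
        if id(v) in seen:
            return
        seen.add(id(v))
        for parent, _ in v.parents:
            visit(parent)
        order.append(v)

    visit(out)

    # Backward pass: each node is processed before its parents, so its
    # adjoint is complete by the time it is pushed further back.
    out.grad = 1.0
    for v in reversed(order):
        for parent, local in v.parents:
            parent.grad += local * v.grad
    return [v.grad for v in inputs]


if __name__ == "__main__":
    print(gradient(lambda x, y: x * y + x * x + 3 * y, 2.0, 5.0))
```

The mutable `grad` field plays the role of the dynamically allocated mutable state the abstract mentions. The recursive traversal is kept for brevity and would need to be made iterative for deep graphs.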
Higher Order Automatic Differentiation of Higher Order Functions
We present semantic correctness proofs of automatic differentiation (AD). We
consider a forward-mode AD method on a higher order language with algebraic
data types, and we characterise it as the unique structure preserving macro
given a choice of derivatives for basic operations. We describe a rich
semantics for differentiable programming, based on diffeological spaces. We
show that it interprets our language, and we phrase what it means for the AD
method to be correct with respect to this semantics. We show that our
characterisation of AD gives rise to an elegant semantic proof of its
correctness based on a gluing construction on diffeological spaces. We explain
how this is, in essence, a logical relations argument. Throughout, we show how
the analysis extends to AD methods for computing higher order derivatives using
a Taylor approximation.
Comment: 34 pages, 5 figures, submitted at LMCS 2020. arXiv admin note:
substantial text overlap with arXiv:2001.0220
Functorial String Diagrams for Reverse-Mode Automatic Differentiation
We formulate a reverse-mode automatic differentiation (RAD) algorithm for (applied) simply typed lambda calculus in the style of Pearlmutter and Siskind [Barak A. Pearlmutter and Jeffrey Mark Siskind, 2008], using the graphical formalism of string diagrams. Thanks to string diagram rewriting, we are able to formally prove for the first time the soundness of such an algorithm. Our approach requires developing a calculus of string diagrams with hierarchical features in the spirit of functorial boxes, in order to model closed monoidal (and cartesian closed) structure. To give an efficient yet principled implementation of the RAD algorithm, we use foliations of our hierarchical string diagrams.
Characterizations of Linear Suboptimality for Mathematical Programs with Equilibrium Constraints
The paper is devoted to the study of a new notion of linear suboptimality in constrained mathematical programming. This concept differs from conventional notions of solutions to optimization-related problems, yet it seems natural and significant from the viewpoint of modern variational analysis and applications. In contrast to standard notions, it admits complete characterizations via appropriate constructions of generalized differentiation in nonconvex settings. In this paper we mainly focus on various classes of mathematical programs with equilibrium constraints (MPECs), whose principal role has been well recognized in optimization theory and its applications. Based on robust generalized differential calculus, we derive new results giving pointwise necessary and sufficient conditions for linear suboptimality in general MPECs and their important specifications involving variational and quasi-variational inequalities, implicit complementarity problems, etc.
Analytical Differential Calculus with Integration
Differential lambda-calculus was first introduced by Thomas Ehrhard and Laurent Regnier in 2003. Despite more than 15 years of history, little work has been done on a differential calculus with integration. In this paper, we propose a differential calculus with integration from a programming point of view. We show its good correspondence with mathematics, which is manifested by how we construct the reduction rules and how we preserve important mathematical theorems in our calculus. Moreover, we highlight applications of the calculus in incremental computation, automatic differentiation, and computation approximation.
Programming tools for intelligent systems
Programming tools are computer programs which help humans program computers. Tools come in all shapes and forms, from editors and compilers to debuggers and profilers. Each of these tools facilitates a core task in the programming workflow which consumes cognitive resources when performed manually.
In this thesis, we explore several tools that facilitate the process of building intelligent systems, and which reduce the cognitive effort required to design, develop, test and deploy intelligent software systems. First, we introduce an integrated development environment (IDE) for programming Robot Operating System (ROS) applications, called Hatchery (Chapter 2). Second, we describe Kotlin∇, a language and type system for differentiable programming, an emerging paradigm in machine learning (Chapter 3). Third, we propose a new algorithm for automatically testing differentiable programs, drawing inspiration from techniques in adversarial and metamorphic testing (Chapter 4), and demonstrate its empirical efficiency in the regression setting. Fourth, we explore a container infrastructure based on Docker, which enables reproducible deployment of ROS applications on the Duckietown platform (Chapter 5). Finally, we reflect on the current state of programming tools for these applications and speculate what intelligent systems programming might look like in the future (Chapter 6).