1,473 research outputs found

    Automatic differentiation in machine learning: a survey

    Get PDF
    Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation" as these are encountered more and more in machine learning settings.Comment: 43 pages, 5 figure

    Verifying an Effect-Handler-Based Define-By-Run Reverse-Mode AD Library

    Full text link
    We apply program verification technology to the problem of specifying and verifying automatic differentiation (AD) algorithms. We focus on ``define-by-run'', a style of AD where the program that must be differentiated is executed and monitored by the automatic differentiation algorithm. We begin by asking, ``what is an implementation of AD?'' and ``what does it mean for an implementation of AD to be correct?'' We answer these questions both at an informal level, in precise English prose, and at a formal level, using types and logical assertions. After answering these broad questions, we focus on a specific implementation of AD, which involves a number of subtle programming language features, including dynamically allocated mutable state, first-class functions, and effect handlers. We present a machine-checked proof, expressed in a modern variant of Separation Logic, of its correctness. We view this result as an advanced exercise in program verification, with potential future applications to the verification of more realistic automatic differentiation systems and of other software components that exploit delimited control effects

    Higher Order Automatic Differentiation of Higher Order Functions

    Get PDF
    We present semantic correctness proofs of automatic differentiation (AD). We consider a forward-mode AD method on a higher order language with algebraic data types, and we characterise it as the unique structure preserving macro given a choice of derivatives for basic operations. We describe a rich semantics for differentiable programming, based on diffeological spaces. We show that it interprets our language, and we phrase what it means for the AD method to be correct with respect to this semantics. We show that our characterisation of AD gives rise to an elegant semantic proof of its correctness based on a gluing construction on diffeological spaces. We explain how this is, in essence, a logical relations argument. Throughout, we show how the analysis extends to AD methods for computing higher order derivatives using a Taylor approximation.Comment: 34 pages, 5 figures, submitted at LMCS 2020. arXiv admin note: substantial text overlap with arXiv:2001.0220

    Functorial String Diagrams for Reverse-Mode Automatic Differentiation

    Get PDF
    We formulate a reverse-mode automatic differentiation (RAD) algorithm for (applied) simply typed lambda calculus in the style of Pearlmutter and Siskind [Barak A. Pearlmutter and Jeffrey Mark Siskind, 2008], using the graphical formalism of string diagrams. Thanks to string diagram rewriting, we are able to formally prove for the first time the soundness of such an algorithm. Our approach requires developing a calculus of string diagrams with hierarchical features in the spirit of functorial boxes, in order to model closed monoidal (and cartesian closed) structure. To give an efficient yet principled implementation of the RAD algorithm, we use foliations of our hierarchical string diagrams

    Charactarizations of Linear Suboptimality for Mathematical Programs with Equilibrium Constraints

    Get PDF
    The paper is devoted to the study of a new notion of linear suboptimality in constrained mathematical programming. This concept is different from conventional notions of solutions to optimization-related problems, while seems to be natural and significant from the viewpoint of modern variational analysis and applications. In contrast to standard notions, it admits complete characterizations via appropriate constructions of generalized differentiation in nonconvex settings. In this paper we mainly focus on various classes of mathematical programs with equilibrium constraints (MPECs), whose principal role has been well recognized in optimization theory and its applications. Based on robust generalized differential calculus, we derive new results giving pointwise necessary and sufficient conditions for linear suboptimality in general MPECs and its important specifications involving variational and quasi variational inequalities, implicit complementarity problems, etc

    Analytical Differential Calculus with Integration

    Get PDF
    Differential lambda-calculus was first introduced by Thomas Ehrhard and Laurent Regnier in 2003. Despite more than 15 years of history, little work has been done on a differential calculus with integration. In this paper, we shall propose a differential calculus with integration from a programming point of view. We show its good correspondence with mathematics, which is manifested by how we construct these reduction rules and how we preserve important mathematical theorems in our calculus. Moreover, we highlight applications of the calculus in incremental computation, automatic differentiation, and computation approximation

    Programming tools for intelligent systems

    Full text link
    Les outils de programmation sont des programmes informatiques qui aident les humains à programmer des ordinateurs. Les outils sont de toutes formes et tailles, par exemple les éditeurs, les compilateurs, les débogueurs et les profileurs. Chacun de ces outils facilite une tâche principale dans le flux de travail de programmation qui consomme des ressources cognitives lorsqu’il est effectué manuellement. Dans cette thèse, nous explorons plusieurs outils qui facilitent le processus de construction de systèmes intelligents et qui réduisent l’effort cognitif requis pour concevoir, développer, tester et déployer des systèmes logiciels intelligents. Tout d’abord, nous introduisons un environnement de développement intégré (EDI) pour la programmation d’applications Robot Operating System (ROS), appelé Hatchery (Chapter 2). Deuxièmement, nous décrivons Kotlin∇, un système de langage et de type pour la programmation différenciable, un paradigme émergent dans l’apprentissage automatique (Chapter 3). Troisièmement, nous proposons un nouvel algorithme pour tester automatiquement les programmes différenciables, en nous inspirant des techniques de tests contradictoires et métamorphiques (Chapter 4), et démontrons son efficacité empirique dans le cadre de la régression. Quatrièmement, nous explorons une infrastructure de conteneurs basée sur Docker, qui permet un déploiement reproductible des applications ROS sur la plateforme Duckietown (Chapter 5). Enfin, nous réfléchissons à l’état actuel des outils de programmation pour ces applications et spéculons à quoi pourrait ressembler la programmation de systèmes intelligents à l’avenir (Chapter 6).Programming tools are computer programs which help humans program computers. Tools come in all shapes and forms, from editors and compilers to debuggers and profilers. Each of these tools facilitates a core task in the programming workflow which consumes cognitive resources when performed manually. In this thesis, we explore several tools that facilitate the process of building intelligent systems, and which reduce the cognitive effort required to design, develop, test and deploy intelligent software systems. First, we introduce an integrated development environment (IDE) for programming Robot Operating System (ROS) applications, called Hatchery (Chapter 2). Second, we describe Kotlin∇, a language and type system for differentiable programming, an emerging paradigm in machine learning (Chapter 3). Third, we propose a new algorithm for automatically testing differentiable programs, drawing inspiration from techniques in adversarial and metamorphic testing (Chapter 4), and demonstrate its empirical efficiency in the regression setting. Fourth, we explore a container infrastructure based on Docker, which enables reproducible deployment of ROS applications on the Duckietown platform (Chapter 5). Finally, we reflect on the current state of programming tools for these applications and speculate what intelligent systems programming might look like in the future (Chapter 6)
    • …
    corecore