
    An Analysis of Publication Venues for Automatic Differentiation Research

    We present the results of our analysis of publication venues for papers on automatic differentiation (AD), covering academic journals and conference proceedings. Our data are collected from the AD publications database maintained by the autodiff.org community website. The database is purpose-built for the AD field and is expanding via submissions by AD researchers, so it provides a relatively noise-free list of publications relating to the field. It does, however, contain noise in the form of variant spellings of journal and conference names. We handle this by manually correcting and merging these variants under the official names of the corresponding venues. We also share the raw data resulting from these corrections.
    Comment: 6 pages, 3 figures
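
    A minimal sketch of the venue-name cleaning step described above, merging variant spellings under official names; the example names and mapping are hypothetical, not the paper's data:

        # Illustrative sketch of merging variant venue spellings under official names.
        # The variant/official names below are hypothetical examples, not the paper's data.
        from collections import Counter

        CANONICAL = {
            "siam j. sci. comput.": "SIAM Journal on Scientific Computing",
            "siam journal on scientific computing": "SIAM Journal on Scientific Computing",
            "lecture notes in comput. sci.": "Lecture Notes in Computer Science",
        }

        def canonical_venue(raw_name: str) -> str:
            """Map a raw venue string to its official name, falling back to the input."""
            return CANONICAL.get(raw_name.strip().lower(), raw_name.strip())

        def venue_counts(records):
            """Count publications per venue after merging variant spellings."""
            return Counter(canonical_venue(r["venue"]) for r in records)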

    Automatic Differentiation of Algorithms for Machine Learning

    Automatic differentiation---the mechanical transformation of numeric computer programs to calculate derivatives efficiently and accurately---dates to the origin of the computer age. Reverse mode automatic differentiation both antedates and generalizes the method of backwards propagation of errors used in machine learning. Despite this, practitioners in a variety of fields, including machine learning, have been little influenced by automatic differentiation and make scant use of available tools. Here we review the technique of automatic differentiation, describe its two main modes, and explain how it can benefit machine learning practitioners. To reach the widest possible audience, our treatment assumes only elementary differential calculus and does not assume any knowledge of linear algebra.
    Comment: 7 pages, 1 figure
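
    As a minimal illustration of one of the two modes reviewed in the paper, forward mode can be implemented with dual numbers that carry a derivative alongside each value; the Python sketch below is ours and not taken from the paper:

        # Minimal forward-mode AD via dual numbers (illustrative sketch, not from the paper).
        class Dual:
            def __init__(self, value, deriv=0.0):
                self.value = value    # primal value
                self.deriv = deriv    # derivative carried alongside the value

            def __add__(self, other):
                other = other if isinstance(other, Dual) else Dual(other)
                return Dual(self.value + other.value, self.deriv + other.deriv)

            __radd__ = __add__

            def __mul__(self, other):
                other = other if isinstance(other, Dual) else Dual(other)
                return Dual(self.value * other.value,
                            self.value * other.deriv + self.deriv * other.value)

            __rmul__ = __mul__

        def derivative(f, x):
            """Evaluate df/dx at x by seeding the dual part with 1."""
            return f(Dual(x, 1.0)).deriv

        # Example: d/dx of x*x + 3*x at x = 2 is 2*2 + 3 = 7.
        assert derivative(lambda x: x * x + 3 * x, 2.0) == 7.0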

    Evolution of Ideas: A Novel Memetic Algorithm Based on Semantic Networks

    This paper presents a new type of evolutionary algorithm (EA) based on the concept of the "meme", where the individuals forming the population are represented by semantic networks and the fitness measure is defined as a function of the represented knowledge. Our work can be classified as a novel memetic algorithm (MA) because (1) it is the units of culture, or information, that undergo variation, transmission, and selection, very close to the original sense of memetics as introduced by Dawkins; and (2) this differs from existing MAs, where memetics is used as a means of local refinement by individual learning after the classical global sampling of an EA. The individual pieces of information are represented as simple semantic networks, i.e., directed graphs of concepts and binary relations, and undergo variation through memetic versions of operators such as crossover and mutation, which draw on knowledge from commonsense knowledge bases. To evaluate this introductory work with an interesting fitness measure, we focus on using the structure mapping theory of analogical reasoning from psychology to evolve pieces of information that are analogous to a given base information. With other possible fitness measures, the proposed representation and algorithm can serve as a computational tool for modeling memetic theories of knowledge, such as evolutionary epistemology and cultural selection theory.
    Comment: Conference submission, 2012 IEEE Congress on Evolutionary Computation (8 pages, 7 figures)
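
    A highly simplified Python sketch of the evolutionary loop described above, with semantic networks represented as sets of (concept, relation, concept) triples; the operators and names are illustrative stand-ins, not the paper's implementation, which draws on commonsense knowledge bases and structure mapping:

        # Toy sketch of an evolutionary loop over semantic networks represented as sets
        # of (concept, relation, concept) triples. Illustrative only; the paper's
        # operators draw on commonsense knowledge bases and are far richer.
        import random

        def mutate(network, knowledge_base):
            """Replace one random triple with a triple drawn from the knowledge base."""
            child = set(network)
            if child:
                child.remove(random.choice(sorted(child)))
            child.add(random.choice(knowledge_base))
            return child

        def crossover(a, b):
            """Combine a random subset of triples from the two parents."""
            return {t for t in a | b if random.random() < 0.5}

        def evolve(population, knowledge_base, fitness, generations=100):
            """Generational loop: select the fitter half, recombine, and mutate."""
            for _ in range(generations):
                scored = sorted(population, key=fitness, reverse=True)
                parents = scored[: len(scored) // 2] or scored
                population = [
                    mutate(crossover(random.choice(parents), random.choice(parents)),
                           knowledge_base)
                    for _ in range(len(population))
                ]
            return max(population, key=fitness)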

    Automated Generation of Cross-Domain Analogies via Evolutionary Computation

    Analogy plays an important role in creativity and is extensively used in science as well as art. In this paper we introduce a technique for the automated generation of cross-domain analogies based on a novel evolutionary algorithm (EA). Unlike existing work in computational analogy-making, which is restricted to creating analogies between two given cases, our approach can, for a given case, create an analogy along with the novel analogous case itself. Our algorithm is based on the concept of "memes", units of culture, or knowledge, undergoing variation and selection under a fitness measure, and represents evolving pieces of knowledge as semantic networks. Using a fitness function based on Gentner's structure mapping theory of analogies, we demonstrate the feasibility of spontaneously generating semantic networks that are analogous to a given base network.
    Comment: Conference submission, International Conference on Computational Creativity 2012 (8 pages, 6 figures)
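
    A toy sketch of a structure-mapping-style fitness term for the approach above: a candidate network is scored by how many of its relational triples align with the base network under a one-to-one concept mapping. This is our illustration; the paper's fitness function follows Gentner's theory in more detail:

        # Toy structure-mapping-style score: count base relations that also hold in the
        # candidate network under a consistent one-to-one concept mapping.
        # Illustrative only; the paper's fitness follows Gentner's theory in more detail.
        from itertools import permutations

        def analogy_score(base, candidate):
            """Networks are sets of (concept, relation, concept) triples."""
            base_concepts = sorted({c for (a, _, b) in base for c in (a, b)})
            cand_concepts = sorted({c for (a, _, b) in candidate for c in (a, b)})
            best = 0
            # Brute-force over concept mappings; feasible only for tiny networks.
            for mapping in permutations(cand_concepts, len(base_concepts)):
                m = dict(zip(base_concepts, mapping))
                matched = sum((m[a], r, m[b]) in candidate for (a, r, b) in base)
                best = max(best, matched)
            return best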

    Using Synthetic Data to Train Neural Networks is Model-Based Reasoning

    We draw a formal connection between using synthetic training data to optimize neural network parameters and approximate, Bayesian, model-based reasoning. In particular, training a neural network using synthetic data can be viewed as learning a proposal distribution generator for approximate inference in the synthetic-data generative model. We demonstrate this connection in a recognition task where we develop a novel Captcha-breaking architecture and train it using synthetic data, demonstrating both state-of-the-art performance and a way of computing task-specific posterior uncertainty. Using a neural network trained this way, we also demonstrate successful breaking of real-world Captchas currently used by Facebook and Wikipedia. Reasoning from these empirical results and drawing connections with Bayesian modeling, we discuss the robustness of synthetic data results and suggest important considerations for ensuring good neural network generalization when training with synthetic data.
    Comment: 8 pages, 4 figures
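
    A minimal sketch of the connection described above: sampling (latent, data) pairs from a generative model and fitting a network to predict the latent from the data amounts to learning an amortized proposal q(latent | data). The toy generative model and logistic-regression "network" below are hypothetical stand-ins, not the paper's Captcha architecture:

        # Sketch: train a predictor on samples from a generative model, thereby learning
        # an amortized proposal q(label | data). The generative model and "network" are
        # hypothetical stand-ins, not the paper's Captcha system.
        import numpy as np

        def sample_latent_and_data(rng, n):
            """Toy generative model: latent label in {0, 1}, data = label + noise."""
            labels = rng.integers(0, 2, size=n)
            data = labels[:, None] + 0.5 * rng.standard_normal((n, 1))
            return labels, data

        def train_proposal(rng, steps=2000, lr=0.1):
            """Fit a logistic-regression 'network' q(label | data) on synthetic samples."""
            w, b = 0.0, 0.0
            for _ in range(steps):
                y, x = sample_latent_and_data(rng, 64)        # fresh synthetic minibatch
                p = 1.0 / (1.0 + np.exp(-(w * x[:, 0] + b)))  # predicted P(label = 1 | x)
                grad_w = np.mean((p - y) * x[:, 0])           # gradient of cross-entropy loss
                grad_b = np.mean(p - y)
                w, b = w - lr * grad_w, b - lr * grad_b
            return w, b

        rng = np.random.default_rng(0)
        w, b = train_proposal(rng)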

    Towards 3D Retrieval of Exoplanet Atmospheres: Assessing Thermochemical Equilibrium Estimation Methods

    Characterizing exoplanetary atmospheres via Bayesian retrievals requires assuming some chemistry model, such as thermochemical equilibrium or parameterized abundances. The higher-resolution data offered by upcoming telescopes enables more complex chemistry models within retrieval frameworks. Yet, many chemistry codes that model more complex processes like photochemistry and vertical transport are computationally expensive, and directly incorporating them into a 1D retrieval model can result in prohibitively long execution times. Additionally, phase-curve observations with upcoming telescopes motivate 2D and 3D retrieval models, further exacerbating the lengthy runtime for retrieval frameworks with complex chemistry models. Here, we compare thermochemical equilibrium approximation methods based on their speed and accuracy with respect to a Gibbs energy-minimization code. We find that, while all methods offer orders of magnitude reductions in computational cost, neural network surrogate models perform more accurately than the other approaches considered, achieving a median absolute dex error <0.03 for the phase space considered. While our results are based on a 1D chemistry model, our study suggests that higher dimensional chemistry models could be incorporated into retrieval models via this surrogate modeling approach.
    Comment: 22 pages, 14 figures, submitted to PSJ 2022/11/22, revised 2023/3/7, accepted 2023/3/23. Updated to add Zenodo link to Reproducible Research Compendium.
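
    A small sketch of the surrogate-modeling approach the paper finds most accurate: train a neural network on inputs and outputs of the expensive equilibrium code, then call the network inside the retrieval. The solver placeholder, parameter ranges, and network below are illustrative assumptions, not the paper's setup:

        # Sketch of the surrogate-modeling idea: replace an expensive equilibrium-chemistry
        # call with a neural network trained on its outputs. expensive_equilibrium_code()
        # is a hypothetical stand-in for a Gibbs energy-minimization solver.
        import numpy as np
        from sklearn.neural_network import MLPRegressor

        def expensive_equilibrium_code(temperature, log_pressure, metallicity):
            """Placeholder for the slow solver; returns toy log10 abundances for two species."""
            return np.array([
                -3.0 + 1e-3 * temperature - 0.1 * log_pressure + metallicity,
                -7.0 + 5e-4 * temperature + 0.2 * log_pressure,
            ])

        # Build a training set over the (T, log P, [M/H]) phase space of interest.
        rng = np.random.default_rng(0)
        X = np.column_stack([
            rng.uniform(500, 3000, 5000),   # temperature [K]
            rng.uniform(-6, 2, 5000),       # log10 pressure [bar]
            rng.uniform(-1, 2, 5000),       # metallicity [dex]
        ])
        Y = np.array([expensive_equilibrium_code(*row) for row in X])

        surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(X, Y)
        log_abundances = surrogate.predict(X[:1])   # inside a retrieval, this replaces the slow call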

    Automatic differentiation in machine learning: a survey

    Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation" as these are encountered more and more in machine learning settings.
    Comment: 43 pages, 5 figures
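
    To complement the forward-mode dual-number sketch given earlier in this listing, the following minimal tape-based reverse-mode sketch in Python illustrates the other main differentiation mode discussed in the survey; the names and implementation are ours, not the survey's:

        # Minimal reverse-mode AD with a global tape (illustrative, not from the survey).
        TAPE = []   # records every intermediate Var in the order it is created

        class Var:
            def __init__(self, value, parents=()):
                self.value = value
                self.parents = parents   # pairs of (parent Var, local partial derivative)
                self.grad = 0.0
                TAPE.append(self)

            def __add__(self, other):
                return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

            def __mul__(self, other):
                return Var(self.value * other.value,
                           [(self, other.value), (other, self.value)])

        def backward(output):
            """Sweep the tape in reverse, accumulating d(output)/d(node) into .grad."""
            output.grad = 1.0
            for node in reversed(TAPE):
                for parent, local in node.parents:
                    parent.grad += local * node.grad

        x, y = Var(2.0), Var(3.0)
        z = x * y + x                            # z = x*y + x
        backward(z)
        assert x.grad == 4.0 and y.grad == 2.0   # dz/dx = y + 1 = 4, dz/dy = x = 2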

    Tricks from Deep Learning

    The deep learning community has devised a diverse set of methods for making practical the gradient-based optimization, on large datasets, of large and highly complex models with deeply cascaded nonlinearities. Taken as a whole, these methods constitute a breakthrough, allowing computational structures which are quite wide, very deep, and with an enormous number and variety of free parameters to be effectively optimized. The result now dominates much of practical machine learning, with applications in machine translation, computer vision, and speech recognition. Many of these methods, viewed through the lens of algorithmic differentiation (AD), can be seen as either addressing issues with the gradient itself or finding ways of achieving increased efficiency using tricks that are AD-related but not provided by current AD systems. The goal of this paper is to explain not just those methods of most relevance to AD, but also the technical constraints and mindset which led to their discovery. After explaining this context, we present a "laundry list" of methods developed by the deep learning community. Two of these are discussed in further mathematical detail: a way to dramatically reduce the size of the tape when performing reverse-mode AD on a (theoretically) time-reversible process like an ODE integrator, and a new mathematical insight that allows for the implementation of a stochastic Newton's method.
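
    A small sketch of the first of these two methods, as we understand it: for a time-reversible integrator, the backward pass can reconstruct earlier states by inverting each step instead of reading them from a tape. The toy system, names, and adjoint derivation below are illustrative, not taken from the paper:

        # Sketch of the "reversible integrator" trick: instead of taping every state of a
        # symplectic (algebraically invertible) integrator, the backward pass recomputes
        # earlier states by running each step in reverse (exact up to roundoff).
        # Toy system and names are illustrative, not the paper's derivation.

        def f(x):          # force for a toy harmonic oscillator
            return -x

        def df_dx(x):      # its derivative, needed for the adjoint
            return -1.0

        def forward(x, v, h, steps):
            """Symplectic Euler: v <- v + h*f(x); x <- x + h*v. Only the final state is kept."""
            for _ in range(steps):
                v = v + h * f(x)
                x = x + h * v
            return x, v

        def backward(x, v, ax, av, h, steps):
            """Propagate adjoints (ax, av) = (dL/dx, dL/dv) back to the initial state,
            reconstructing earlier states by inverting each step instead of reading a tape."""
            for _ in range(steps):
                # Invert the forward step to recover the previous state.
                x_prev = x - h * v
                v_prev = v - h * f(x_prev)
                # Adjoint of x = x_prev + h*v and v = v_prev + h*f(x_prev).
                av_mid = av + h * ax
                ax = ax + h * df_dx(x_prev) * av_mid
                av = av_mid
                x, v = x_prev, v_prev
            return ax, av

        # Gradient of L = final x with respect to (x0, v0), without storing the trajectory.
        x0, v0, h, steps = 1.0, 0.0, 0.01, 1000
        xT, vT = forward(x0, v0, h, steps)
        dL_dx0, dL_dv0 = backward(xT, vT, 1.0, 0.0, h, steps)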

    DiffSharp: An AD Library for .NET Languages

    DiffSharp is an algorithmic differentiation or automatic differentiation (AD) library for the .NET ecosystem, which is targeted by the C# and F# languages, among others. The library has been designed with machine learning applications in mind, allowing very succinct implementations of models and optimization routines. DiffSharp is implemented in F# and exposes forward and reverse AD operators as general nestable higher-order functions, usable by any .NET language. It provides high-performance linear algebra primitives---scalars, vectors, and matrices, with a generalization to tensors underway---that are fully supported by all the AD operators, and which use a BLAS/LAPACK backend via the highly optimized OpenBLAS library. DiffSharp currently uses operator overloading, but we are developing a transformation-based version of the library using F#'s "code quotation" metaprogramming facility. Work on a CUDA-based GPU backend is also underway.
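
    The nestable higher-order AD operators mentioned above can be illustrated conceptually by a derivative operator whose output is itself differentiable. The Python sketch below (our own, not DiffSharp's F# API) shows the idea with dual numbers whose components may themselves be dual numbers:

        # Conceptual illustration of a nestable derivative operator via dual numbers whose
        # components may themselves be dual numbers. Python sketch only; not DiffSharp's API.
        class Dual:
            def __init__(self, value, deriv=0.0):
                self.value, self.deriv = value, deriv

            # Only + and * are defined here, which is enough for the example below.
            def __add__(self, other):
                return Dual(self.value + other.value, self.deriv + other.deriv)

            def __mul__(self, other):
                return Dual(self.value * other.value,
                            self.value * other.deriv + self.deriv * other.value)

        def diff(f):
            """Return a function that evaluates f'(x); the returned function is itself
            differentiable, so diff can be nested for higher-order derivatives."""
            def fprime(x):
                seed = Dual(1.0) if isinstance(x, Dual) else 1.0
                return f(Dual(x, seed)).deriv
            return fprime

        cube = lambda x: x * x * x
        assert diff(cube)(3.0) == 27.0           # first derivative: 3x^2 at x = 3
        assert diff(diff(cube))(3.0) == 18.0     # second derivative: 6x at x = 3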