    Automatic differentiation in machine learning: a survey

    Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD), also called algorithmic differentiation or simply "autodiff", is a family of techniques similar to but more general than backpropagation for efficiently and accurately evaluating derivatives of numeric functions expressed as computer programs. AD is a small but established field with applications in areas including computational fluid dynamics, atmospheric sciences, and engineering design optimization. Until very recently, the fields of machine learning and AD have largely been unaware of each other and, in some cases, have independently discovered each other's results. Despite its relevance, general-purpose AD has been missing from the machine learning toolbox, a situation slowly changing with its ongoing adoption under the names "dynamic computational graphs" and "differentiable programming". We survey the intersection of AD and machine learning, cover applications where AD has direct relevance, and address the main implementation techniques. By precisely defining the main differentiation techniques and their interrelationships, we aim to bring clarity to the usage of the terms "autodiff", "automatic differentiation", and "symbolic differentiation" as these are encountered more and more in machine learning settings.
    Comment: 43 pages, 5 figures
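
    As a taste of the technique the survey formalises, forward-mode AD can be implemented by overloading arithmetic on "dual numbers" that carry a derivative alongside each value. The sketch below is a minimal illustration, not code from the survey; the Dual class and the function f are assumptions made up for this example:

```python
# Minimal forward-mode AD via dual numbers: each value carries its
# derivative, and the overloaded arithmetic applies the chain rule.
# Illustrative sketch only; Dual, sin, and f are made up for this example.
import math

class Dual:
    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot   # value and d(value)/dx

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)
    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.dot * other.val + self.val * other.dot)
    __rmul__ = __mul__

def sin(x):
    # Chain rule: d(sin u)/dx = cos(u) * du/dx
    return Dual(math.sin(x.val), math.cos(x.val) * x.dot)

def f(x):
    return x * x + sin(x)

x = Dual(2.0, 1.0)    # seed dx/dx = 1
y = f(x)
print(y.val, y.dot)   # f(2) and f'(2) = 4 + cos(2)
```

    Unlike numerical differentiation, the derivative here is exact up to floating-point rounding, and unlike symbolic differentiation it never builds an expression for f'; this is exactly the distinction among "autodiff", "numerical", and "symbolic" that the survey sets out to clarify.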

    The recondite intricacies of Zeeman Doppler mapping

    We present a detailed analysis of the reliability of abundance and magnetic maps of Ap stars obtained by Zeeman Doppler mapping (ZDM). It is shown how they can be adversely affected by the assumption of a mean stellar atmosphere instead of appropriate "local" atmospheres corresponding to the actual abundances in a given region. The essence of the difficulties was already shown by Chandrasekhar's picket-fence model. The results obtained with a suite of Stokes codes written in the Ada programming language and based on modern line-blanketed atmospheres are described in detail. We demonstrate that the high metallicity values claimed to have been found in chemically inhomogeneous Ap star atmospheres would lead to local temperature structures, continuum and line intensities, and line shapes that differ significantly from those predicted by a mean stellar atmosphere. Unfortunately, past applications of ZDM have consistently overlooked these intricate aspects of metallicity and their all-pervading effects. The erroneous assumption of a mean atmosphere for a spotted star can lead to phase-dependent errors of uncomfortably large proportions at varying wavelengths, in both the Stokes I and V profiles, making precise mapping of abundances and magnetic field vectors largely impossible. The relation between the core and wings of the H_beta line changes, too, with possible repercussions on the determination of gravity and effective temperature. Finally, a ZDM analysis of the synthetic Stokes spectra of a spotted star reveals the disturbing differences between the respective abundance maps based on a mean atmosphere on the one hand, and on appropriate "local" atmospheres on the other. We then discuss what this all means for published ZDM results. Our discussion makes it clear that realistic local atmospheres must be used, especially if credible small-scale structures are to be obtained.
    Comment: Accepted for publication in MNRAS

    Automatic differentiation in machine learning: a survey

    Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in machine learning. Automatic differentiation (AD) is a technique for calculating derivatives of numeric functions expressed as computer programs efficiently and accurately, used in fields such as computational fluid dynamics, nuclear engineering, and atmospheric sciences. Despite its advantages and use in other fields, machine learning practitioners have been little influenced by AD and make scant use of available tools. We survey the intersection of AD and machine learning, cover applications where AD has the potential to make a big impact, and report on some recent developments in the adoption of this technique. We aim to dispel some misconceptions that we contend have impeded the use of AD within the machine learning community.
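
    The link to backpropagation is clearest in reverse mode, where one recorded evaluation followed by a single backward sweep yields derivatives with respect to every input at once. Below is a minimal tape-flavoured sketch; the Var class is an assumption made up for this example, not anything from the survey:

```python
# Minimal reverse-mode AD: the forward pass records local partial
# derivatives; one backward sweep accumulates gradients for all inputs.
# Illustrative sketch only; Var is made up for this example.
class Var:
    def __init__(self, val):
        self.val, self.grad, self._parents = val, 0.0, []

    def __add__(self, other):
        out = Var(self.val + other.val)
        out._parents = [(self, 1.0), (other, 1.0)]        # d(out)/d(input)
        return out

    def __mul__(self, other):
        out = Var(self.val * other.val)
        out._parents = [(self, other.val), (other, self.val)]
        return out

    def backward(self):
        # Push (node, upstream gradient) pairs; each path's contribution
        # accumulates into the leaves via the chain rule.
        stack = [(self, 1.0)]
        while stack:
            node, upstream = stack.pop()
            node.grad += upstream
            for parent, local in node._parents:
                stack.append((parent, upstream * local))

x, y = Var(3.0), Var(4.0)
z = x * y + x          # z = xy + x
z.backward()
print(x.grad, y.grad)  # dz/dx = y + 1 = 5.0, dz/dy = x = 3.0
```

    Both gradients fall out of one backward sweep, which is why reverse mode, and hence backpropagation, is the efficient choice when a scalar output is differentiated with respect to many inputs.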

    Performance analysis of memory transfers and GEMM subroutines on NVIDIA Tesla GPU cluster

    A survey of program slicing for software engineering

    This research concerns program slicing, which is used as a tool for the program maintenance of software systems. Program slicing decreases the level of effort required to understand and maintain complex software systems. It was first designed as a debugging aid, but it has since been generalized into various tools and extended to cover program comprehension, module cohesion estimation, requirements verification, dead code elimination, and the maintenance of software systems, with applications including reverse engineering, parallelization, portability, and reuse component generation. This paper defines the relevant terminology and addresses theoretical concepts, program representation, the different program graphs, developments in static slicing and dynamic slicing, and semantics and mathematical models. Applications of conventional slicing are presented, along with a prognosis of future work in this field.
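
    To make the core idea concrete, a backward static slice keeps only the statements that can affect a chosen variable. The toy slicer below handles straight-line assignments only, whereas real slicers build program dependence graphs and handle control flow; it is a sketch under those assumptions, not an algorithm from the survey:

```python
# Toy backward static slicer for straight-line assignments: walk the
# program bottom-up, keeping statements that define a "relevant" variable
# and propagating relevance to the variables they use (data dependence).
# Requires Python 3.9+ for ast.unparse.
import ast

def backward_slice(src, target):
    stmts = ast.parse(src).body
    relevant, kept = {target}, []
    for stmt in reversed(stmts):
        if not isinstance(stmt, ast.Assign):
            continue
        defined = {t.id for t in stmt.targets if isinstance(t, ast.Name)}
        if defined & relevant:
            kept.append(stmt)
            used = {n.id for n in ast.walk(stmt.value)
                    if isinstance(n, ast.Name)}
            relevant = (relevant - defined) | used  # kill the def, add its uses
    return [ast.unparse(s) for s in reversed(kept)]

program = """
a = 1
b = 2
c = a + 5
d = b * 2
e = c + 1
"""
print(backward_slice(program, "e"))  # ['a = 1', 'c = a + 5', 'e = c + 1']
```

    The statements b = 2 and d = b * 2 drop out because no data dependence connects them to e; that pruning is precisely the reduction in comprehension effort described above.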

    An FPGA implementation of an investigative many-core processor, Fynbos: in support of a Fortran autoparallelising software pipeline

    In light of the power, memory, ILP, and utilisation walls facing the computing industry, this work examines the hypothetical many-core approach to finding greater compute performance and efficiency. In order to achieve greater efficiency in an environment in which Moore's law continues but TDP has been capped, a means of deriving performance from dark and dim silicon is needed. The many-core hypothesis is one approach to exploiting these available transistors efficiently. As understood in this work, it involves trading hardware control complexity for hundreds to thousands of parallel simple processing elements, and operating at a clock speed sufficiently low as to allow the efficiency gains of near-threshold-voltage operation. Performance is therefore dependent on exploiting a degree of fine-grained parallelism currently found only in GPGPUs, but in a manner that is less restrictive in its range of application domains. While removing the complex control hardware of traditional CPUs provides space for more arithmetic hardware, a basic level of control is still required. For a number of reasons this work chooses to replace that control largely with static scheduling. This pushes the burden of control primarily onto the software, and specifically the compiler, rather than onto the programmer or an application-specific means of control simplification. An existing legacy tool chain is capable of autoparallelising sequential Fortran code to the degree of parallelism necessary for many-core; this work implements a many-core architecture to match it. Prototyping the design on an FPGA makes it possible to examine the real-world performance of the compiler-architecture system to a greater degree than simulation alone would allow. Comparing theoretical peak performance against real performance in a case study application, the system is found to be more efficient than any other reviewed, but also to underperform significantly relative to current competing architectures. This failing is attributed to taking the need for simple hardware too far, and to an inability to implement tactics that mitigate the costs of static scheduling, owing to a lack of support for such tactics in the compiler.
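
    To illustrate the kind of burden static scheduling shifts onto the compiler, the sketch below list-schedules a dependence DAG onto a fixed number of processing elements, assigning every operation a (PE, cycle) slot before run time. It is a toy with unit-latency operations and is not the Fynbos tool chain's actual algorithm:

```python
# Toy static list scheduler: the "compiler" assigns each op of a
# dependence DAG to a (processing element, cycle) slot ahead of time,
# so the hardware needs no dynamic scheduling logic. Unit-latency ops;
# assumes deps describes an acyclic graph.
def list_schedule(deps, num_pes):
    """deps maps op -> set of ops it depends on; returns op -> (pe, cycle)."""
    schedule, finish = {}, {}
    remaining, cycle = set(deps), 0
    while remaining:
        # Ready = every dependency finished in an earlier cycle.
        ready = sorted(op for op in remaining
                       if all(finish.get(d, cycle) < cycle for d in deps[op]))
        for pe, op in enumerate(ready[:num_pes]):  # fill the available PEs
            schedule[op] = (pe, cycle)
            finish[op] = cycle
            remaining.discard(op)
        cycle += 1
    return schedule

deps = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"a"}, "e": {"c", "d"}}
print(list_schedule(deps, num_pes=2))
# {'a': (0, 0), 'b': (1, 0), 'c': (0, 1), 'd': (1, 1), 'e': (0, 2)}
```

    With the schedule fixed at compile time, each PE simply executes its assigned slot; that is where the transistor savings over dynamic out-of-order control come from, and it also shows why weak compiler support for filling such schedules well translates directly into lost performance.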

    Improving digital image correlation in the TopoSEM Software Package

    TopoSEM is a software package that reconstructs the 3D surface topography of a microscopic sample from a set of 2D Scanning Electron Microscopy (SEM) images. TopoSEM can also produce a stability report on the calibration of the SEM hardware based solely on its output images. A key step in both of these workflows is a Digital Image Correlation (DIC) algorithm, a non-contact imaging technique used to measure full-field displacements between images. A novel DIC implementation fine-tuned for 3D reconstruction was originally developed in MATLAB to satisfy this project's feature requirements. However, near-real-time usability of TopoSEM is paramount for its users, and the main barrier to this goal was the underperforming DIC implementation. This dissertation ported the original MATLAB implementation to sequential C++ and further optimised its performance: (i) by improving memory accesses, (ii) by exploiting the vector extensions available in each core of current multi-core processors to perform computationally intensive operations on vectors and matrices of single- and double-precision floating-point values, and (iii) by further improving execution performance through parallelisation on multi-core devices, using multiple threads with a wavefront propagation scheduler. The initial MATLAB implementation took 3279.4 seconds to compute the full-field displacement of a 2576-by-2086-pixel image on a quad-core laptop. With all of these improvements, the new parallel C++ version lowered the execution time on the same laptop to 1.52 seconds, an overall speedup of 2158.
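
    For readers unfamiliar with DIC, its core step is subset matching: a patch around each point of the reference image is searched for in the deformed image, here scored by zero-normalised cross-correlation at integer-pixel offsets. This generic sketch is an illustration under those assumptions, not TopoSEM's fine-tuned algorithm or its optimised C++ code:

```python
# Generic integer-pixel DIC step: match a subset around (y, x) in the
# reference image against displaced candidates in the current image,
# scoring with zero-normalised cross-correlation (ZNCC).
import numpy as np

def zncc(a, b):
    a, b = a - a.mean(), b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom else 0.0

def match_subset(ref, cur, y, x, half=10, search=5):
    """Return the (dy, dx) displacement maximising ZNCC for one subset."""
    patch = ref[y - half:y + half + 1, x - half:x + half + 1]
    best, best_uv = -2.0, (0, 0)
    for dy in range(-search, search + 1):     # candidate displacements
        for dx in range(-search, search + 1):
            cand = cur[y + dy - half:y + dy + half + 1,
                       x + dx - half:x + dx + half + 1]
            score = zncc(patch, cand)
            if score > best:
                best, best_uv = score, (dy, dx)
    return best_uv

# Synthetic check: shift an image by (2, 3) pixels and recover the shift.
rng = np.random.default_rng(0)
ref = rng.random((64, 64))
cur = np.roll(ref, shift=(2, 3), axis=(0, 1))
print(match_subset(ref, cur, 30, 30))  # (2, 3)
```

    Repeating this search for every point of a 2576-by-2086-pixel image is what makes DIC so expensive, and why the memory-access, vectorisation, and multi-threading optimisations described above pay off so dramatically.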