Automatic Differentiation of Algorithms for Machine Learning
Automatic differentiation---the mechanical transformation of numeric computer
programs to calculate derivatives efficiently and accurately---dates to the
origin of the computer age. Reverse mode automatic differentiation both
antedates and generalizes the method of backwards propagation of errors used in
machine learning. Despite this, practitioners in a variety of fields, including
machine learning, have been little influenced by automatic differentiation, and
make scant use of available tools. Here we review the technique of automatic
differentiation, describe its two main modes, and explain how it can benefit
machine learning practitioners. To reach the widest possible audience our
treatment assumes only elementary differential calculus, and does not assume
any knowledge of linear algebra.
Comment: 7 pages, 1 figure
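As a concrete illustration of the forward mode mentioned above, the following minimal sketch (illustrative Python with made-up names, not tied to any particular AD tool) propagates derivatives alongside values using dual numbers, so the derivative of an ordinary numeric program is obtained exactly rather than by finite differences.

    class Dual:
        """A value paired with its derivative with respect to a chosen input."""
        def __init__(self, value, deriv=0.0):
            self.value = value
            self.deriv = deriv

        def __add__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.value + other.value, self.deriv + other.deriv)
        __radd__ = __add__

        def __mul__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            # product rule: (uv)' = u'v + uv'
            return Dual(self.value * other.value,
                        self.deriv * other.value + self.value * other.deriv)
        __rmul__ = __mul__

    def f(x):
        # an ordinary numeric program: f(x) = x*x + 3*x
        return x * x + 3 * x

    x = Dual(2.0, 1.0)          # seed derivative 1 to differentiate w.r.t. x
    y = f(x)
    print(y.value, y.deriv)     # 10.0 and f'(2) = 7.0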
Automatic differentiation in machine learning: a survey
Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in
machine learning. Automatic differentiation (AD), also called algorithmic
differentiation or simply "autodiff", is a family of techniques similar to but
more general than backpropagation for efficiently and accurately evaluating
derivatives of numeric functions expressed as computer programs. AD is a small
but established field with applications in areas including computational fluid
dynamics, atmospheric sciences, and engineering design optimization. Until very
recently, the fields of machine learning and AD have largely been unaware of
each other and, in some cases, have independently discovered each other's
results. Despite its relevance, general-purpose AD has been missing from the
machine learning toolbox, a situation slowly changing with its ongoing adoption
under the names "dynamic computational graphs" and "differentiable
programming". We survey the intersection of AD and machine learning, cover
applications where AD has direct relevance, and address the main implementation
techniques. By precisely defining the main differentiation techniques and their
interrelationships, we aim to bring clarity to the usage of the terms
"autodiff", "automatic differentiation", and "symbolic differentiation" as
these are encountered more and more in machine learning settings.
Comment: 43 pages, 5 figures
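To make the reverse mode and the "dynamic computational graphs" idea above concrete, the sketch below (illustrative Python, not the API of any existing framework) records each operation's parents during the forward pass and then runs a single backward sweep that accumulates the gradient of one scalar output with respect to every input.

    class Var:
        """A node in a dynamically built computational graph."""
        def __init__(self, value):
            self.value = value
            self.grad = 0.0
            self._backward = lambda: None   # pushes this node's grad to its parents
            self._parents = ()

        def __add__(self, other):
            out = Var(self.value + other.value)
            out._parents = (self, other)
            def _backward():
                self.grad += out.grad
                other.grad += out.grad
            out._backward = _backward
            return out

        def __mul__(self, other):
            out = Var(self.value * other.value)
            out._parents = (self, other)
            def _backward():
                self.grad += other.value * out.grad
                other.grad += self.value * out.grad
            out._backward = _backward
            return out

        def backward(self):
            # visit nodes in topological order, then sweep in reverse
            order, seen = [], set()
            def visit(v):
                if v not in seen:
                    seen.add(v)
                    for p in v._parents:
                        visit(p)
                    order.append(v)
            visit(self)
            self.grad = 1.0
            for v in reversed(order):
                v._backward()

    x, y = Var(2.0), Var(3.0)
    z = x * y + x                # z = xy + x
    z.backward()
    print(x.grad, y.grad)        # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0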
Artificial Intelligence in the Context of Human Consciousness
Artificial intelligence (AI) can be defined as the ability of a machine to learn and make decisions based on acquired information. AI’s development has incited rampant public speculation regarding the singularity theory: a futuristic phase in which intelligent machines are capable of creating increasingly intelligent systems. The theory's implications, combined with the close relationship between humanity and its machines, make understanding both natural and artificial intelligence imperative. Researchers continue to discover the natural processes responsible for essential human skills like decision-making, understanding language, and performing multiple processes simultaneously. Artificial intelligence attempts to simulate these functions through techniques like artificial neural networks, Markov Decision Processes, Human Language Technology, and Multi-Agent Systems, which rely upon a combination of mathematical models and hardware.
Making Presentation Math Computable
This open-access book addresses the issue of translating mathematical expressions from LaTeX to the syntax of Computer Algebra Systems (CAS). Over the past decades, especially in the domain of Science, Technology, Engineering, and Mathematics (STEM), LaTeX has become the de-facto standard for typesetting mathematical formulae in publications. Since scientists are generally required to publish their work, LaTeX has become an integral part of today's publishing workflow. On the other hand, modern research increasingly relies on CAS to simplify, manipulate, compute, and visualize mathematics. However, existing LaTeX import functions in CAS are limited to simple arithmetic expressions and are, therefore, insufficient for most use cases. Consequently, the workflow of experimenting and publishing in the sciences often includes time-consuming and error-prone manual conversions between presentational LaTeX and computational CAS formats. To address the lack of a reliable and comprehensive translation tool between LaTeX and CAS, this thesis makes the following three contributions. First, it provides an approach to semantically enhance LaTeX expressions with sufficient semantic information for translation into CAS syntaxes. Second, it demonstrates the first context-aware LaTeX-to-CAS translation framework, LaCASt. Third, the thesis provides a novel approach to evaluate the performance of LaTeX-to-CAS translations on large-scale datasets with automatic verification of equations in digital mathematical libraries. This is an open access book.
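To make the presentational-LaTeX-versus-CAS gap concrete, the sketch below uses SymPy's experimental LaTeX parser (an assumption chosen purely for illustration; it requires the optional antlr4 runtime and is not the LaCASt framework described in the book) to turn a simple LaTeX formula into a computable expression.

    from sympy import simplify
    from sympy.parsing.latex import parse_latex  # experimental; needs antlr4-python3-runtime

    expr = parse_latex(r"\frac{x^2 - 1}{x - 1}")  # presentational LaTeX in
    print(expr)                                   # (x**2 - 1)/(x - 1), a CAS expression
    print(simplify(expr))                         # x + 1

Such parsers handle simple arithmetic expressions like this one; the context-aware semantic enhancement described above is what is needed beyond that point.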
The design of a neural network compiler
Computer simulation is a flexible and economical way to rapidly prototype and evaluate Neural Network (NN) models. Growing research on NNs has led to the development of several simulation programs, which vary in scope: some support only a fixed network model, while others are more general. Designing a general-purpose NN simulator has become a common goal because of the flexibility and efficiency it offers. A programming language designed specifically for NN models is preferable, since existing high-level languages such as C assume that NN designers have a strong computing background. Translation of NN languages is handled by an interpreter, a compiler, or a combination of the two, and the languages themselves follow various programming styles, such as procedural, functional, descriptive, and object-oriented.
The main focus of this thesis is to study the feasibility of using a compiler approach to develop a general-purpose simulator, NEUCOMP, which compiles a program written as a list of mathematical specifications of a particular NN model and translates it into a chosen target program. The language supported by NEUCOMP is procedural in style: the mathematical statements required by the NN model are written in the program as scalar, vector, and matrix assignments, and NEUCOMP translates these expressions into actual program loops. The aims are for NEUCOMP to compile a simulation program written in the NEUCOMP language for any NN model, to provide graphical facilities such as portraying the NN architecture and displaying a graph of the results during training, and finally to produce a program that can run on a parallel shared-memory multiprocessor system.
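To illustrate the kind of translation the abstract describes, the toy sketch below (illustrative Python, not NEUCOMP itself or its language) shows a matrix-vector assignment of the form y = W x + b being expanded into the explicit program loops a simulator compiler would emit.

    def matvec_assign(W, x, b):
        # y[i] = sum_j W[i][j] * x[j] + b[i], written as the loops a compiler would generate
        n_out, n_in = len(W), len(x)
        y = [0.0] * n_out
        for i in range(n_out):
            acc = 0.0
            for j in range(n_in):
                acc += W[i][j] * x[j]
            y[i] = acc + b[i]
        return y

    W = [[1.0, 2.0], [3.0, 4.0]]
    x = [1.0, 1.0]
    b = [0.5, -0.5]
    print(matvec_assign(W, x, b))   # [3.5, 6.5]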
Robust Computer Algebra, Theorem Proving, and Oracle AI
In the context of superintelligent AI systems, the term "oracle" has two
meanings. One refers to modular systems queried for domain-specific tasks.
Another usage, referring to a class of systems which may be useful for
addressing the value alignment and AI control problems, is a superintelligent
AI system that only answers questions. The aim of this manuscript is to survey
contemporary research problems related to oracles which align with long-term
research goals of AI safety. We examine existing question answering systems and
argue that their high degree of architectural heterogeneity makes them poor
candidates for rigorous analysis as oracles. On the other hand, we identify
computer algebra systems (CASs) as being primitive examples of domain-specific
oracles for mathematics and argue that efforts to integrate computer algebra
systems with theorem provers, systems which have largely been developed
independent of one another, provide a concrete set of problems related to the
notion of provable safety that has emerged in the AI safety community. We
review approaches to interfacing CASs with theorem provers, describe
well-defined architectural deficiencies that have been identified with CASs,
and suggest possible lines of research and practical software projects for
scientists interested in AI safety.
Comment: 15 pages, 3 figures
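As a small, hedged illustration of the trust problem raised above (plain SymPy with illustrative names; an integration with a theorem prover would replace the numerical check with an actual proof), a CAS simplification is treated as a claim and cross-checked independently rather than accepted on faith.

    import random
    from sympy import symbols, simplify, sin, cos, lambdify

    x = symbols('x')
    claim_lhs = sin(x)**2 + cos(x)**2
    claim_rhs = simplify(claim_lhs)      # the CAS claims this equals 1

    # independent (here merely numerical) check of the CAS's claim
    f = lambdify(x, claim_lhs - claim_rhs)
    samples = [random.uniform(-10, 10) for _ in range(1000)]
    assert all(abs(f(t)) < 1e-9 for t in samples), "CAS claim failed a numerical check"
    print(claim_rhs)                     # 1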
Inventing Intelligence: On the History of Complex Information Processing and Artificial Intelligence in the United States in the Mid-Twentieth Century
In the mid-1950s, researchers in the United States melded formal theories of problem solving and intelligence with another powerful new tool for control: the electronic digital computer. Several branches of western mathematical science emerged from this nexus, including computer science (1960s–), data science (1990s–) and artificial intelligence (AI). This thesis offers an account of the origins and politics of AI in the mid-twentieth century United States, which focuses on its imbrications in systems of societal control. In an effort to denaturalize the power relations upon which the field came into being, I situate AI’s canonical origin story in relation to the structural and intellectual priorities of the U.S. military and American industry during the Cold War, circa 1952 to 1961.
This thesis offers a detailed and comparative account of the early careers, research interests, and key outputs of four researchers often credited with laying the foundations for AI and machine learning—Herbert A. Simon, Frank Rosenblatt, John McCarthy and Marvin Minsky. It chronicles the distinct ways in which each sought to formalise and simulate human mental behaviour using digital electronic computers. Rather than assess their contributions as discontinuous with what came before, as in mythologies of AI's genesis, I establish continuities with, and borrowings from, management science and operations research (Simon), Hayekian economics and instrumentalist statistics (Rosenblatt), automatic coding techniques and pedagogy (McCarthy), and cybernetics (Minsky), along with the broadscale mobilization of Cold War-era civilian-led military science generally.
I assess how Minsky’s 1961 paper 'Steps Toward Artificial Intelligence' simultaneously consolidated and obscured these entanglements as it set in motion an initial research agenda for AI in the following two decades. I argue that mind-computer metaphors, and research in complex information processing generally, played an important role in normalizing the small- and large-scale structuring of social behaviour using mathematics in the United States from the second half of the twentieth century onward.