Automatic Differentiation of Algorithms for Machine Learning
Automatic differentiation---the mechanical transformation of numeric computer
programs to calculate derivatives efficiently and accurately---dates to the
origin of the computer age. Reverse mode automatic differentiation both
antedates and generalizes the method of backwards propagation of errors used in
machine learning. Despite this, practitioners in a variety of fields, including
machine learning, have been little influenced by automatic differentiation, and
make scant use of available tools. Here we review the technique of automatic
differentiation, describe its two main modes, and explain how it can benefit
machine learning practitioners. To reach the widest possible audience our
treatment assumes only elementary differential calculus, and does not assume
any knowledge of linear algebra.
Comment: 7 pages, 1 figure
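As a concrete illustration of the forward mode mentioned above, the following minimal sketch (illustrative Python with made-up names, not tied to any particular AD tool) propagates derivatives alongside values using dual numbers, so the derivative of an ordinary numeric program is obtained exactly rather than by finite differences.

    class Dual:
        """A value paired with its derivative with respect to a chosen input."""
        def __init__(self, value, deriv=0.0):
            self.value = value
            self.deriv = deriv

        def __add__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            return Dual(self.value + other.value, self.deriv + other.deriv)
        __radd__ = __add__

        def __mul__(self, other):
            other = other if isinstance(other, Dual) else Dual(other)
            # product rule: (uv)' = u'v + uv'
            return Dual(self.value * other.value,
                        self.deriv * other.value + self.value * other.deriv)
        __rmul__ = __mul__

    def f(x):
        # an ordinary numeric program: f(x) = x*x + 3*x
        return x * x + 3 * x

    x = Dual(2.0, 1.0)          # seed derivative 1 to differentiate w.r.t. x
    y = f(x)
    print(y.value, y.deriv)     # 10.0 and f'(2) = 7.0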
Automatic differentiation in machine learning: a survey
Derivatives, mostly in the form of gradients and Hessians, are ubiquitous in
machine learning. Automatic differentiation (AD), also called algorithmic
differentiation or simply "autodiff", is a family of techniques similar to but
more general than backpropagation for efficiently and accurately evaluating
derivatives of numeric functions expressed as computer programs. AD is a small
but established field with applications in areas including computational fluid
dynamics, atmospheric sciences, and engineering design optimization. Until very
recently, the fields of machine learning and AD have largely been unaware of
each other and, in some cases, have independently discovered each other's
results. Despite its relevance, general-purpose AD has been missing from the
machine learning toolbox, a situation slowly changing with its ongoing adoption
under the names "dynamic computational graphs" and "differentiable
programming". We survey the intersection of AD and machine learning, cover
applications where AD has direct relevance, and address the main implementation
techniques. By precisely defining the main differentiation techniques and their
interrelationships, we aim to bring clarity to the usage of the terms
"autodiff", "automatic differentiation", and "symbolic differentiation" as
these are encountered more and more in machine learning settings.
Comment: 43 pages, 5 figures
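To make the reverse mode and the "dynamic computational graphs" idea above concrete, the sketch below (illustrative Python, not the API of any existing framework) records each operation's parents during the forward pass and then runs a single backward sweep that accumulates the gradient of one scalar output with respect to every input.

    class Var:
        """A node in a dynamically built computational graph."""
        def __init__(self, value):
            self.value = value
            self.grad = 0.0
            self._backward = lambda: None   # pushes this node's grad to its parents
            self._parents = ()

        def __add__(self, other):
            out = Var(self.value + other.value)
            out._parents = (self, other)
            def _backward():
                self.grad += out.grad
                other.grad += out.grad
            out._backward = _backward
            return out

        def __mul__(self, other):
            out = Var(self.value * other.value)
            out._parents = (self, other)
            def _backward():
                self.grad += other.value * out.grad
                other.grad += self.value * out.grad
            out._backward = _backward
            return out

        def backward(self):
            # visit nodes in topological order, then sweep in reverse
            order, seen = [], set()
            def visit(v):
                if v not in seen:
                    seen.add(v)
                    for p in v._parents:
                        visit(p)
                    order.append(v)
            visit(self)
            self.grad = 1.0
            for v in reversed(order):
                v._backward()

    x, y = Var(2.0), Var(3.0)
    z = x * y + x                # z = xy + x
    z.backward()
    print(x.grad, y.grad)        # dz/dx = y + 1 = 4.0, dz/dy = x = 2.0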
Artificial Intelligence in the Context of Human Consciousness
Artificial intelligence (AI) can be defined as the ability of a machine to learn and make decisions based on acquired information. AI’s development has incited rampant public speculation regarding the singularity theory: a futuristic phase in which intelligent machines are capable of creating increasingly intelligent systems. The theory's implications, combined with the close relationship between humanity and its machines, make understanding both natural and artificial intelligence imperative. Researchers continue to discover the natural processes responsible for essential human skills like decision-making, understanding language, and performing multiple processes simultaneously. Artificial intelligence attempts to simulate these functions through techniques like artificial neural networks, Markov Decision Processes, Human Language Technology, and Multi-Agent Systems, which rely upon a combination of mathematical models and hardware.
Making Presentation Math Computable
This open-access book addresses the issue of translating mathematical expressions from LaTeX to the syntax of Computer Algebra Systems (CAS). Over the past decades, especially in the domain of Science, Technology, Engineering, and Mathematics (STEM), LaTeX has become the de-facto standard for typesetting mathematical formulae in publications. Since scientists are generally required to publish their work, LaTeX has become an integral part of today's publishing workflow. On the other hand, modern research increasingly relies on CAS to simplify, manipulate, compute, and visualize mathematics. However, existing LaTeX import functions in CAS are limited to simple arithmetic expressions and are, therefore, insufficient for most use cases. Consequently, the workflow of experimenting and publishing in the sciences often includes time-consuming and error-prone manual conversions between presentational LaTeX and computational CAS formats. To address the lack of a reliable and comprehensive translation tool between LaTeX and CAS, this thesis makes the following three contributions. First, it provides an approach to semantically enhance LaTeX expressions with sufficient semantic information for translation into CAS syntaxes. Second, it demonstrates the first context-aware LaTeX-to-CAS translation framework, LaCASt. Third, the thesis provides a novel approach to evaluate the performance of LaTeX-to-CAS translations on large-scale datasets with automatic verification of equations in digital mathematical libraries. This is an open access book.
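To make the presentational-LaTeX-versus-CAS gap concrete, the sketch below uses SymPy's experimental LaTeX parser (an assumption chosen purely for illustration; it requires the optional antlr4 runtime and is not the LaCASt framework described in the book) to turn a simple LaTeX formula into a computable expression.

    from sympy import simplify
    from sympy.parsing.latex import parse_latex  # experimental; needs antlr4-python3-runtime

    expr = parse_latex(r"\frac{x^2 - 1}{x - 1}")  # presentational LaTeX in
    print(expr)                                   # (x**2 - 1)/(x - 1), a CAS expression
    print(simplify(expr))                         # x + 1

Such parsers handle simple arithmetic expressions like this one; the context-aware semantic enhancement described above is what is needed beyond that point.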
The design of a neural network compiler
Computer simulation is a flexible and economical way to rapidly prototype and evaluate Neural Network (NN) models. Growing research on NNs has led to the development of several simulation programs, which vary in scope: some support only a fixed network model, while others are more general. Designing a general-purpose NN simulator has become a common goal because of the flexibility and efficiency it offers. A programming language designed specifically for NN models is preferable, since existing high-level languages such as C assume that NN designers have a strong computing background. Translation of NN languages is handled by an interpreter, a compiler, or a combination of the two, and the languages themselves follow various programming styles, such as procedural, functional, descriptive, and object-oriented.
The main focus of this thesis is to study the feasibility of using a compiler approach to develop a general-purpose simulator, NEUCOMP, which compiles a program written as a list of mathematical specifications of a particular NN model and translates it into a chosen target program. The language supported by NEUCOMP is procedural in style: the mathematical statements required by the NN model are written in the program as scalar, vector, and matrix assignments, and NEUCOMP translates these expressions into actual program loops. The aims are for NEUCOMP to compile a simulation program written in the NEUCOMP language for any NN model, to provide graphical facilities such as portraying the NN architecture and displaying a graph of the results during training, and finally to produce a program that can run on a parallel shared-memory multiprocessor system.
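To illustrate the kind of translation the abstract describes, the toy sketch below (illustrative Python, not NEUCOMP itself or its language) shows a matrix-vector assignment of the form y = W x + b being expanded into the explicit program loops a simulator compiler would emit.

    def matvec_assign(W, x, b):
        # y[i] = sum_j W[i][j] * x[j] + b[i], written as the loops a compiler would generate
        n_out, n_in = len(W), len(x)
        y = [0.0] * n_out
        for i in range(n_out):
            acc = 0.0
            for j in range(n_in):
                acc += W[i][j] * x[j]
            y[i] = acc + b[i]
        return y

    W = [[1.0, 2.0], [3.0, 4.0]]
    x = [1.0, 1.0]
    b = [0.5, -0.5]
    print(matvec_assign(W, x, b))   # [3.5, 6.5]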
Robust Computer Algebra, Theorem Proving, and Oracle AI
In the context of superintelligent AI systems, the term "oracle" has two
meanings. One refers to modular systems queried for domain-specific tasks.
Another usage, referring to a class of systems which may be useful for
addressing the value alignment and AI control problems, is a superintelligent
AI system that only answers questions. The aim of this manuscript is to survey
contemporary research problems related to oracles which align with long-term
research goals of AI safety. We examine existing question answering systems and
argue that their high degree of architectural heterogeneity makes them poor
candidates for rigorous analysis as oracles. On the other hand, we identify
computer algebra systems (CASs) as being primitive examples of domain-specific
oracles for mathematics and argue that efforts to integrate computer algebra
systems with theorem provers, systems which have largely been developed
independent of one another, provide a concrete set of problems related to the
notion of provable safety that has emerged in the AI safety community. We
review approaches to interfacing CASs with theorem provers, describe
well-defined architectural deficiencies that have been identified with CASs,
and suggest possible lines of research and practical software projects for
scientists interested in AI safety.
Comment: 15 pages, 3 figures
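As a small, hedged illustration of the trust problem raised above (plain SymPy with illustrative names; an integration with a theorem prover would replace the numerical check with an actual proof), a CAS simplification is treated as a claim and cross-checked independently rather than accepted on faith.

    import random
    from sympy import symbols, simplify, sin, cos, lambdify

    x = symbols('x')
    claim_lhs = sin(x)**2 + cos(x)**2
    claim_rhs = simplify(claim_lhs)      # the CAS claims this equals 1

    # independent (here merely numerical) check of the CAS's claim
    f = lambdify(x, claim_lhs - claim_rhs)
    samples = [random.uniform(-10, 10) for _ in range(1000)]
    assert all(abs(f(t)) < 1e-9 for t in samples), "CAS claim failed a numerical check"
    print(claim_rhs)                     # 1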
Inventing Intelligence: On the History of Complex Information Processing and Artificial Intelligence in the United States in the Mid-Twentieth Century
In the mid-1950s, researchers in the United States melded formal theories of problem solving and intelligence with another powerful new tool for control: the electronic digital computer. Several branches of western mathematical science emerged from this nexus, including computer science (1960s–), data science (1990s–) and artificial intelligence (AI). This thesis offers an account of the origins and politics of AI in the mid-twentieth century United States, which focuses on its imbrications in systems of societal control. In an effort to denaturalize the power relations upon which the field came into being, I situate AI’s canonical origin story in relation to the structural and intellectual priorities of the U.S. military and American industry during the Cold War, circa 1952 to 1961.
This thesis offers a detailed and comparative account of the early careers, research interests, and key outputs of four researchers often credited with laying the foundations for AI and machine learning—Herbert A. Simon, Frank Rosenblatt, John McCarthy and Marvin Minsky. It chronicles the distinct ways in which each sought to formalise and simulate human mental behaviour using digital electronic computers. Rather than assess their contributions as discontinuous with what came before, as in mythologies of AI's genesis, I establish continuities with, and borrowings from, management science and operations research (Simon), Hayekian economics and instrumentalist statistics (Rosenblatt), automatic coding techniques and pedagogy (McCarthy), and cybernetics (Minsky), along with the broadscale mobilization of Cold War-era civilian-led military science generally.
I assess how Minsky’s 1961 paper 'Steps Toward Artificial Intelligence' simultaneously consolidated and obscured these entanglements as it set in motion an initial research agenda for AI in the following two decades. I argue that mind-computer metaphors, and research in complex information processing generally, played an important role in normalizing the small- and large-scale structuring of social behaviour using mathematics in the United States from the second half of the twentieth century onward.