Learning The Sequential Temporal Information with Recurrent Neural Networks
Recurrent networks are one of the most powerful and promising artificial
neural network algorithms for processing sequential data such as natural
language, sound, and time series data. Unlike a traditional feed-forward network,
a recurrent network has an inherent feedback loop that allows it to store
temporal context information and pass the state of information along the entire
sequence of events. This helps to achieve state-of-the-art performance in
many important tasks such as language modeling, stock market prediction, image
captioning, speech recognition, machine translation, and object tracking.
However, training a fully connected RNN and managing the gradient flow are
complicated processes. Many studies have been carried out to address these
limitations. This article is intended to provide brief details about recurrent
neurons, their variants, and tips & tricks for training fully recurrent neural
networks. This review work is carried out as part of our IPO studio software
module 'Multiple Object Tracking'.
Comment: 17 pages
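The feedback loop described in this abstract can be illustrated with a minimal sketch, assuming a simple Elman-style update h_t = tanh(W_xh x_t + W_hh h_{t-1} + b). The function and parameter names (`rnn_step`, `run_rnn`, `w_xh`, `w_hh`, `b`) are hypothetical and are not taken from the article.

```python
import math


def rnn_step(x, h_prev, w_xh, w_hh, b):
    """One recurrent step: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b).

    x and h_prev are lists of floats; w_xh and w_hh are lists of rows.
    This is an illustrative Elman-style cell, an assumption of this sketch.
    """
    h_new = []
    for i in range(len(h_prev)):
        total = b[i]
        # Contribution of the current input.
        total += sum(w_xh[i][j] * x[j] for j in range(len(x)))
        # Contribution of the previous hidden state (the feedback loop).
        total += sum(w_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(total))
    return h_new


def run_rnn(sequence, h0, w_xh, w_hh, b):
    """Apply rnn_step across a sequence and return every hidden state."""
    h = h0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_xh, w_hh, b)
        states.append(h)
    return states
```

With a nonzero recurrent weight, the state after a zero input still carries information from the earlier input. With `w_hh` set to zero the cell reduces to a feed-forward layer and that context is lost.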
Memory and attention in deep learning
Intelligence necessitates memory. Without memory, humans fail to perform
various nontrivial tasks such as reading novels, playing games or solving
maths. As the ultimate goal of machine learning is to derive intelligent
systems that learn and act automatically just like humans, memory construction
for machines is inevitable. Artificial neural networks model neurons and
synapses in the brain by interconnecting computational units via weights; they
are a typical class of machine learning algorithms whose structure resembles
memory. Their descendants with more complicated modeling techniques (a.k.a.
deep learning) have been successfully applied to many practical problems and
demonstrated the importance of memory in the learning process of machine
systems. Recent progress on modeling memory in deep learning has revolved
around external memory constructions, which are highly inspired by
computational Turing models and biological neuronal systems. Attention
mechanisms are derived to support acquisition and retention operations on the
external memory. Despite the lack of theoretical foundations, these approaches
have shown promise in helping machine systems reach a higher level of
intelligence. The aim of this thesis is to advance the understanding of memory
and attention in deep learning. Its contributions include: (i) presenting a
collection of taxonomies for memory, (ii) constructing new memory-augmented
neural networks (MANNs) that support multiple control and memory units, (iii)
introducing variability via memory in sequential generative models, (iv)
searching for optimal writing operations to maximise the memorisation capacity
in slot-based memory networks, and (v) simulating the Universal Turing Machine
via Neural Stored-program Memory, a new kind of external memory for neural
networks.
Comment: PhD Thesis
Convolutional Bipartite Attractor Networks
In human perception and cognition, a fundamental operation that brains
perform is interpretation: constructing coherent neural states from noisy,
incomplete, and intrinsically ambiguous evidence. The problem of interpretation
is well matched to an early and often overlooked architecture, the attractor
network, a recurrent neural net that performs constraint satisfaction,
imputation of missing features, and clean up of noisy data via energy
minimization dynamics. We revisit attractor nets in light of modern deep
learning methods and propose a convolutional bipartite architecture with a
novel training loss, activation function, and connectivity constraints. We
tackle larger problems than have been previously explored with attractor nets
and demonstrate their potential for image completion and super-resolution. We
argue that this architecture is better motivated than ever-deeper feedforward
models and is a viable alternative to more costly sampling-based generative
methods on a range of supervised and unsupervised tasks.
An Approximate Backpropagation Learning Rule for Memristor Based Neural Networks Using Synaptic Plasticity
We describe an approximation to the backpropagation algorithm for training deep
neural networks, which is designed to work with synapses implemented with
memristors. The key idea is to represent the values of both the input signal
and the backpropagated delta value with a series of pulses that trigger
multiple positive or negative updates of the synaptic weight, and to use the
min operation instead of the product of the two signals. In computational
simulations, we show that the proposed approximation to backpropagation converges
well and may be suitable for memristor implementations of multilayer
neural networks.
Comment: 21 pages, 6 figures, 1 table, title changed, manuscript thoroughly
rewritten
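The key idea in this abstract, replacing the product of the input signal and the backpropagated delta with a min operation, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the pulse encoding is not modelled, and the sign handling (keeping the sign of the product) is an assumption of this sketch, not a detail confirmed by the abstract.

```python
def product_update(x, delta, lr=1.0):
    """Conventional weight-change term, proportional to x * delta."""
    return lr * x * delta


def min_update(x, delta, lr=1.0):
    """Approximate weight-change term using min instead of the product.

    Magnitude: min(|x|, |delta|).
    Sign: taken from x * delta (an assumption made for this sketch).
    """
    magnitude = min(abs(x), abs(delta))
    product = x * delta
    if product > 0:
        return lr * magnitude
    if product < 0:
        return -lr * magnitude
    return 0.0
```

For signals with magnitude at most 1, the min-based term has the same sign as the product term and is never smaller in magnitude. It is zero whenever either signal is zero.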
Deep Learning in Neural Networks: An Overview
In recent years, deep artificial neural networks (including recurrent ones)
have won numerous contests in pattern recognition and machine learning. This
historical survey compactly summarises relevant work, much of it from the
previous millennium. Shallow and deep learners are distinguished by the depth
of their credit assignment paths, which are chains of possibly learnable,
causal links between actions and effects. I review deep supervised learning
(also recapitulating the history of backpropagation), unsupervised learning,
reinforcement learning & evolutionary computation, and indirect search for
short programs encoding deep and large networks.
Comment: 88 pages, 888 references
Machine learning & artificial intelligence in the quantum domain
Quantum information technologies, and intelligent learning systems, are both
emergent technologies that will likely have a transforming impact on our
society. The respective underlying fields of research -- quantum information
(QI) versus machine learning (ML) and artificial intelligence (AI) -- have
their own specific challenges, which have hitherto been investigated largely
independently. However, in a growing body of recent work, researchers have been
probing the question to what extent these fields can learn and benefit from
each other. QML explores the interaction between quantum computing and ML,
investigating how results and techniques from one field can be used to solve
the problems of the other. Recently, we have witnessed breakthroughs in both
directions of influence. For instance, quantum computing is finding a vital
application in providing speed-ups in ML, critical in our "big data" world.
Conversely, ML already permeates cutting-edge technologies, and may become
instrumental in advanced quantum technologies. Aside from quantum speed-up in
data analysis, or classical ML optimization used in quantum experiments,
quantum enhancements have also been demonstrated for interactive learning,
highlighting the potential of quantum-enhanced learning agents. Finally, works
exploring the use of AI for the very design of quantum experiments, and for
performing parts of genuine research autonomously, have reported their first
successes. Beyond the topics of mutual enhancement, researchers have also
broached the fundamental issue of quantum generalizations of ML/AI concepts.
This deals with questions of the very meaning of learning and intelligence in a
world that is described by quantum mechanics. In this review, we describe the
main ideas, recent developments, and progress in a broad spectrum of research
investigating machine learning and artificial intelligence in the quantum
domain.
Comment: Review paper. 106 pages. 16 figures
Analog Photonics Computing for Information Processing, Inference and Optimisation
This review presents an overview of the current state-of-the-art in photonics
computing, which leverages photons, photons coupled with matter, and
optics-related technologies for effective and efficient computational purposes.
It covers the history and development of photonics computing and modern
analogue computing platforms and architectures, focusing on optimization tasks
and neural network implementations. The authors examine special-purpose
optimizers, mathematical descriptions of photonics optimizers, and their
various interconnections. Disparate applications are discussed, including
direct encoding, logistics, finance, phase retrieval, machine learning, neural
networks, probabilistic graphical models, and image processing, among many
others. The main directions of technological advancement and associated
challenges in photonics computing are explored, along with an assessment of its
efficiency. Finally, the paper discusses prospects and the field of optical
quantum computing, providing insights into the potential applications of this
technology.
Comment: Invited submission by Journal of Advanced Quantum Technologies;
accepted version 5/06/202
The Performance of Associative Memory Models with Biologically Inspired Connectivity
This thesis is concerned with one important question in artificial neural networks, that is, how biologically inspired connectivity of a network affects its associative memory performance.
In recent years, research on the mammalian cerebral cortex, which has the main
responsibility for the associative memory function in the brain, suggests that
the connectivity of this cortical network is far from the full connectivity
commonly assumed in traditional associative memory models. It is found to
be a sparse network with interesting connectivity characteristics such as the
“small world network” characteristics, represented by short Mean Path Length,
high Clustering Coefficient, and high Global and Local Efficiency. Most of the networks in this thesis are therefore sparsely connected.
There is, however, no conclusive evidence of how these different connectivity
characteristics affect the associative memory performance of a network. This
thesis addresses this question using networks with different types of
connectivity, which are inspired by biological evidence.
The findings of this programme are unexpected and important. Results show
that the performance of a non-spiking associative memory model can be
predicted from its linear correlation with the Clustering Coefficient of the network,
regardless of the detailed connectivity patterns. This is particularly important
because the Clustering Coefficient is a static measure of one aspect of
connectivity, whilst the associative memory performance reflects the result of a
complex dynamic process.
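As a reminder of what this static measure captures, here is a minimal sketch of the mean local Clustering Coefficient. It assumes an undirected graph stored as a dictionary of neighbour sets and uses the usual Watts-Strogatz definition. The thesis may use a different (for example directed) variant, and the function names are hypothetical.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves connected.

    adj maps each node to the set of its neighbours (undirected graph).
    Nodes with fewer than two neighbours are assigned 0.0 by convention.
    """
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return 2.0 * links / (k * (k - 1))


def mean_clustering(adj):
    """Average of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)
```

A triangle has a coefficient of 1.0 and a simple path has 0.0. The value depends only on the wiring, not on any network dynamics.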
On the other hand, this research reveals that improvements in the performance
of a network do not necessarily directly rely on an increase in the network’s
wiring cost. Therefore it is possible to construct networks with high
associative memory performance but relatively low wiring cost. Particularly,
Gaussian distributed connectivity in a network is found to achieve the best
performance with the lowest wiring cost among all examined connectivity models.
Our results from this programme also suggest that a modular network with an
appropriate configuration of Gaussian distributed connectivity, both internal to
each module and across modules, can perform nearly as well as the Gaussian
distributed non-modular network.
Finally, a comparison between non-spiking and spiking associative memory
models suggests that in terms of associative memory performance, the
implication of connectivity seems to transcend the details of the actual neural
models, that is, whether they are spiking or non-spiking neurons.