10,478 research outputs found
Direct Feedback Alignment with Sparse Connections for Local Learning
Recent advances in deep neural networks (DNNs) owe their success to training
algorithms that use backpropagation and gradient-descent. Backpropagation,
while highly effective on von Neumann architectures, becomes inefficient when
scaling to large networks. In what is commonly referred to as the weight
transport problem, each neuron's dependence on the weights and errors located
deeper in the network requires exhaustive data movement, which presents a key
obstacle to improving the performance and energy efficiency of machine-learning
hardware.
In this work, we propose a bio-plausible alternative to backpropagation drawing
from advances in feedback alignment algorithms in which the error computation
at a single synapse reduces to the product of three scalar values. Using a
sparse feedback matrix, we show that a neuron needs only a fraction of the
information previously used by the feedback alignment algorithms. Consequently,
memory and compute can be partitioned and distributed in whichever way produces
the most efficient forward pass, so long as a single error can be delivered to
each neuron. Our results show orders-of-magnitude improvement in data movement,
along with an improvement in multiply-and-accumulate operations, over
backpropagation. Like previous work, we observe that any variant of feedback
alignment suffers significant losses in classification accuracy on deep
convolutional neural networks. By transferring trained convolutional layers and
training the fully connected layers using direct feedback alignment, we
demonstrate that this approach can obtain results competitive with
backpropagation. Furthermore, we observe that using an extremely sparse
feedback matrix, rather than a dense one, results in a small accuracy drop
while yielding hardware advantages. All the code and results are available
under https://github.com/bcrafton/ssdfa.
Comment: 15 pages, 8 figures
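
To make the mechanism concrete, here is a minimal NumPy sketch of direct
feedback alignment with a sparse feedback matrix. It is an illustration only,
not the authors' released code (see the repository above); the layer sizes,
sparsity level, and learning rate are arbitrary assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def sparse_feedback(rows, cols, density=0.01):
        # Fixed random feedback matrix with most entries zeroed out.
        mask = rng.random((rows, cols)) < density
        return rng.standard_normal((rows, cols)) * mask

    # Tiny one-hidden-layer network trained with direct feedback alignment.
    n_in, n_hid, n_out = 784, 256, 10
    W1 = 0.01 * rng.standard_normal((n_hid, n_in))
    W2 = 0.01 * rng.standard_normal((n_out, n_hid))
    B1 = sparse_feedback(n_hid, n_out)   # replaces W2.T in the backward pass

    def dfa_step(x, y, lr=0.01):
        global W1, W2
        a1 = W1 @ x
        h1 = np.maximum(a1, 0.0)         # ReLU
        e = W2 @ h1 - y                  # output error, delivered directly
        d1 = (B1 @ e) * (a1 > 0)         # hidden delta: a few scalar products
        W2 -= lr * np.outer(e, h1)
        W1 -= lr * np.outer(d1, x)

Because B1 is fixed and sparse, the backward pass for a hidden neuron touches
only the few error entries its nonzero feedback weights select, which is what
enables the flexible partitioning of memory and compute described above.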
A Cognitive Model of an Epistemic Community: Mapping the Dynamics of Shallow Lake Ecosystems
We used fuzzy cognitive mapping (FCM) to develop a generic shallow lake
ecosystem model by augmenting the individual cognitive maps drawn by 8
scientists working in the area of shallow lake ecology. We calculated graph
theoretical indices of the individual cognitive maps and the collective
cognitive map produced by augmentation. The graph theoretical indices revealed
internal cycles showing non-linear dynamics in the shallow lake ecosystem. The
ecological processes were organized democratically without a top-down
hierarchical structure. The steady state condition of the generic model was a
characteristic turbid shallow lake ecosystem since there were no dynamic
environmental changes that could cause shifts between a turbid and a clearwater
state, and the generic model indicated that only a dynamic disturbance regime
could maintain the clearwater state. The model developed herein captured the
empirical behavior of shallow lakes, and contained the basic model of the
Alternative Stable States Theory. In addition, our model expanded on the basic
model by quantifying the relative effects of the connections.
In our expanded model we ran 4 simulations: harvesting submerged plants,
nutrient reduction, fish removal without nutrient reduction, and
biomanipulation. Only biomanipulation, which included fish removal and nutrient
reduction, had the potential to shift the turbid state into a clearwater state.
The structure and relationships in the generic model as well as the outcomes of
the management simulations were supported by actual field studies in shallow
lake ecosystems. Thus, fuzzy cognitive mapping methodology enabled us to
understand the complex structure of shallow lake ecosystems as a whole and
obtain a valid generic model based on tacit knowledge of experts in the field.
Comment: 24 pages, 5 figures
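
As a hedged illustration of how such a map is run (the experts' actual map is
not reproduced here), the sketch below iterates a hypothetical three-concept
lake map to a steady state using a common logistic FCM update rule; the
concepts, weights, and initial activations are invented for illustration.

    import numpy as np

    def squash(x, lam=1.0):
        # Logistic transfer keeps concept activations in (0, 1).
        return 1.0 / (1.0 + np.exp(-lam * x))

    def fcm_steady_state(W, s0, max_iter=1000, tol=1e-6):
        # W[i, j] in [-1, 1]: signed influence of concept i on concept j.
        s = s0.copy()
        for _ in range(max_iter):
            s_next = squash(s + W.T @ s)   # a common FCM update rule
            if np.max(np.abs(s_next - s)) < tol:
                break
            s = s_next
        return s_next

    # Invented 3-concept toy map: nutrients raise turbidity; turbidity and
    # submerged plants suppress each other.
    W = np.array([[0.0,  0.0,  0.8],     # nutrients
                  [0.0,  0.0, -0.7],     # submerged plants
                  [0.0, -0.6,  0.0]])    # turbidity
    steady = fcm_steady_state(W, s0=np.array([0.9, 0.5, 0.5]))

Management scenarios such as fish removal are then typically simulated by
clamping the corresponding concept's activation to a fixed value on every
iteration and observing how the steady state shifts.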
Quantum Generative Adversarial Networks for Learning and Loading Random Distributions
Quantum algorithms have the potential to outperform their classical
counterparts in a variety of tasks. The realization of the advantage often
requires the ability to load classical data efficiently into quantum states.
However, the best known methods require O(2^n) gates to load an exact
representation of a generic data structure into an n-qubit state. This
exponential scaling can easily dominate the complexity of a quantum
algorithm and, thereby, impair potential quantum advantage. Our work presents a
hybrid quantum-classical algorithm for efficient, approximate quantum state
loading. More precisely, we use quantum Generative Adversarial Networks (qGANs)
to facilitate efficient learning and loading of generic probability
distributions -- implicitly given by data samples -- into quantum states.
Through the interplay of a quantum channel, such as a variational quantum
circuit, and a classical neural network, the qGAN can learn a representation of
the probability distribution underlying the data samples and load it into a
quantum state. The loading requires O(poly(n)) gates and can, thus, enable the
use of potentially advantageous quantum algorithms, such as Quantum Amplitude
Estimation. We implement the qGAN distribution learning and loading method with
Qiskit and test it using a quantum simulation as well as actual quantum
processors provided by the IBM Q Experience. Furthermore, we employ quantum
simulation to demonstrate the use of the trained quantum channel in a quantum
finance application.
Comment: 14 pages, 13 figures
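
The paper's implementation uses Qiskit; as a rough, hedged sketch of the
generator side alone, the circuit below is a shallow variational ansatz whose
RY and CX gate count grows polynomially (here linearly) in the number of
qubits. The layer count and random initialization are assumptions, and the
classical discriminator and adversarial training loop are omitted.

    import numpy as np
    from qiskit import QuantumCircuit
    from qiskit.quantum_info import Statevector

    def generator_circuit(n_qubits, thetas):
        # Layers of RY rotations plus a CX entangling chain: O(poly(n)) gates.
        qc = QuantumCircuit(n_qubits)
        for layer in np.asarray(thetas).reshape(-1, n_qubits):
            for q, theta in enumerate(layer):
                qc.ry(theta, q)
            for q in range(n_qubits - 1):
                qc.cx(q, q + 1)
        return qc

    # The generator's measurement statistics define a distribution over the
    # 2^n basis states; a classical discriminator comparing its samples with
    # the data samples would drive gradient updates of thetas.
    n = 3
    thetas = np.random.default_rng(0).uniform(0, 2 * np.pi, size=2 * n)
    qc = generator_circuit(n, thetas)
    probs = Statevector.from_instruction(qc).probabilities()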
Information theory, complexity and neural networks
Some of the main results in the mathematical evaluation of neural networks as information processing systems are discussed. The basic operation of feedback and feed-forward neural networks is described. Their memory capacity and computing power are considered. The concept of learning by example as it applies to neural networks is examined.
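
As one concrete instance of the memory-capacity results such surveys discuss,
the following sketch implements a Hopfield-style feedback network with Hebbian
storage; the abstract names no specific model, so this choice and the sizes
used are assumptions. A classic estimate is that roughly 0.14 N random
patterns are retrievable in an N-unit network.

    import numpy as np

    rng = np.random.default_rng(0)

    def hebbian_weights(patterns):
        # Outer-product (Hebbian) storage; zero diagonal removes self-coupling.
        n = patterns.shape[1]
        W = patterns.T @ patterns / n
        np.fill_diagonal(W, 0.0)
        return W

    def recall(W, x, steps=20):
        # Run the feedback dynamics of +/-1 units toward a fixed point.
        for _ in range(steps):
            x = np.where(W @ x >= 0.0, 1.0, -1.0)
        return x

    n, p = 200, 20                        # p = 0.1 * n, within capacity
    patterns = rng.choice([-1.0, 1.0], size=(p, n))
    W = hebbian_weights(patterns)
    noisy = patterns[0] * rng.choice([1.0, -1.0], size=n, p=[0.9, 0.1])
    recovered = recall(W, noisy)          # should match patterns[0]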
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone texts or together as a comprehensive review of the
exciting field of low-rank tensor networks and tensor decompositions.
Comment: 232 pages
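
To make the tensor-train format concrete, here is a minimal NumPy sketch of
the standard TT-SVD procedure underlying the TT decomposition discussed above;
the truncation rank and the random test tensor are arbitrary assumptions.

    import numpy as np

    def tt_svd(tensor, max_rank):
        # Sequential truncated SVDs produce 3-way cores G_k of shape
        # (r_{k-1}, n_k, r_k) with boundary ranks r_0 = r_d = 1.
        shape = tensor.shape
        cores, r_prev = [], 1
        mat = tensor.reshape(1, -1)
        for n_k in shape[:-1]:
            mat = mat.reshape(r_prev * n_k, -1)
            U, S, Vt = np.linalg.svd(mat, full_matrices=False)
            r = min(max_rank, len(S))            # TT-rank truncation
            cores.append(U[:, :r].reshape(r_prev, n_k, r))
            mat = S[:r, None] * Vt[:r]           # carry the remainder forward
            r_prev = r
        cores.append(mat.reshape(r_prev, shape[-1], 1))
        return cores

    # Contract the cores back together to check the approximation error.
    X = np.random.default_rng(0).standard_normal((4, 5, 6, 7))
    cores = tt_svd(X, max_rank=3)
    approx = cores[0]
    for G in cores[1:]:
        approx = np.tensordot(approx, G, axes=1)  # join neighboring ranks
    approx = approx.reshape(X.shape)
    err = np.linalg.norm(approx - X) / np.linalg.norm(X)

Storing the cores requires O(d n r^2) numbers instead of n^d, which is the
sense in which the format alleviates the curse of dimensionality.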