Stochastic Synapses Enable Efficient Brain-Inspired Learning Machines
Recent studies have shown that synaptic unreliability is a robust and
sufficient mechanism for inducing the stochasticity observed in cortex. Here,
we introduce Synaptic Sampling Machines, a class of neural network models that
uses synaptic stochasticity as a means of Monte Carlo sampling and unsupervised
learning. Similar to the original formulation of Boltzmann machines, these
models can be viewed as a stochastic counterpart of Hopfield networks, except
that stochasticity is induced by a random mask over the connections. Synaptic
stochasticity plays the dual role of an efficient mechanism for sampling and a
regularizer during learning akin to DropConnect. A local synaptic plasticity
rule implementing an event-driven form of contrastive divergence enables the
learning of generative models in an on-line fashion. Synaptic sampling machines
perform equally well using discrete-time artificial units (as in Hopfield
networks) or continuous-time leaky integrate-and-fire neurons. The learned
representations are remarkably sparse and robust to reductions in bit precision
and synapse pruning: removal of more than 75% of the weakest connections
followed by cursory re-learning causes a negligible performance loss on
benchmark classification tasks. The spiking neuron-based synaptic sampling
machines outperform existing spike-based unsupervised learners, while
potentially offering substantial advantages in terms of power and complexity,
and are thus promising models for on-line learning in brain-inspired hardware.
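The random-mask mechanism described above is essentially a DropConnect-style multiplicative Bernoulli mask redrawn on every pass. The following is a minimal NumPy sketch of that idea, not the paper's implementation; the layer shapes, the 0.5 keep probability, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_synapse_pass(x, W, b, p_keep=0.5):
    """One forward pass with unreliable synapses: every connection is
    kept independently with probability p_keep, so repeated passes over
    the same input sample different effective networks (a
    DropConnect-like multiplicative mask)."""
    mask = rng.random(W.shape) < p_keep   # fresh Bernoulli mask per pass
    pre = x @ (W * mask) + b              # masked weight matrix
    return (pre > 0).astype(float)        # binary threshold units

# Monte Carlo estimate of the units' activation probabilities:
x = rng.random(10)
W = rng.normal(size=(10, 5))
b = np.zeros(5)
samples = np.stack([stochastic_synapse_pass(x, W, b) for _ in range(1000)])
print(samples.mean(axis=0))               # empirical firing rates
```

Averaging many such passes is what turns the unreliable synapses into a sampling mechanism rather than mere noise.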
Inherent Weight Normalization in Stochastic Neural Networks
Multiplicative stochasticity such as Dropout improves the robustness and
generalizability of deep neural networks. Here, we further demonstrate that
always-on multiplicative stochasticity combined with simple threshold neurons
is sufficient for building deep neural networks. We call such models Neural
Sampling Machines (NSM). We find that the probability of activation of the NSM
exhibits a self-normalizing property that mirrors Weight Normalization, a
previously studied mechanism that fulfills many of the features of Batch
Normalization in an online fashion. The normalization of activities during
training speeds up convergence by preventing internal covariate shift caused by
changes in the input distribution. The always-on stochasticity of the NSM
confers the following advantages: the network is identical in the inference and
learning phases, making the NSM suitable for online learning; it can exploit
stochasticity inherent to a physical substrate, such as analog non-volatile
memories for in-memory computing; and it is suitable for Monte Carlo sampling,
while requiring almost exclusively addition and comparison operations. We
demonstrate NSMs on standard classification benchmarks (MNIST and CIFAR) and
event-based classification benchmarks (N-MNIST and DVS Gestures). Our results
show that NSMs perform comparably to or better than conventional artificial
neural networks with the same architecture.
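The self-normalizing property can be illustrated directly: for a zero-threshold unit whose synapses are blanked out independently on every pass, the firing probability depends only on the direction of the weight vector, not its norm. Below is a toy Monte Carlo check of that invariance; all names and parameter values are ours, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(1)

def nsm_fire_prob(x, w, p=0.5, n_samples=5000):
    """Monte Carlo firing probability of a single NSM-style unit:
    a zero-threshold neuron whose synapses are independently blanked
    out with probability 1 - p on every forward pass."""
    masks = (rng.random((n_samples, w.size)) < p).astype(float)
    u = masks @ (w * x)          # one noisy pre-activation per sample
    return float((u > 0).mean())

x = rng.normal(size=20)
w = rng.normal(size=20)
# Rescaling the weights leaves the firing probability unchanged: the
# unit's smooth activation depends only on the direction of w, which
# is what Weight Normalization enforces explicitly.
print(nsm_fire_prob(x, w), nsm_fire_prob(x, 3.0 * w))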
Enhancement of synchronization in a hybrid neural circuit by spike timing dependent plasticity
Synchronization of neural activity is fundamental for many functions of the brain. We demonstrate that spike-timing-dependent plasticity (STDP) enhances synchronization (entrainment) in a hybrid circuit composed of a spike generator, a dynamic clamp emulating an excitatory plastic synapse, and a chemically isolated neuron from the Aplysia abdominal ganglion. Fixed-phase entrainment of the Aplysia neuron to the spike generator is possible for a much wider range of frequency ratios, and is more precise and more robust, with the plastic synapse than with a nonplastic synapse of comparable strength. Further analysis in a computational model of Hodgkin-Huxley-type neurons reveals the mechanism behind this significant enhancement in synchronization. The experimentally observed STDP curve appears to be designed to adjust synaptic strength to a value suitable for stable entrainment of the postsynaptic neuron. One functional role of STDP might therefore be to facilitate synchronization or entrainment of nonidentical neurons.
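The STDP window invoked here is commonly modeled as a pair of exponentials: potentiation when the presynaptic spike precedes the postsynaptic one, depression when it follows. A minimal sketch of that standard model is given below; the amplitudes and time constants are illustrative defaults, not the experimentally fitted Aplysia values.

```python
import math

def stdp_dw(dt, a_plus=0.005, a_minus=0.005, tau_plus=20.0, tau_minus=20.0):
    """Weight change for a pre/post spike pair separated by
    dt = t_post - t_pre (in ms): potentiation when the presynaptic
    spike leads, depression when it lags."""
    if dt >= 0:
        return a_plus * math.exp(-dt / tau_plus)
    return -a_minus * math.exp(dt / tau_minus)

# Shape of the plasticity window across pairing intervals:
for dt in (-40, -20, -5, 5, 20, 40):
    print(f"dt = {dt:+d} ms -> dw = {stdp_dw(dt):+.5f}")
```

Under entrainment, the pairing interval dt itself shifts as the synapse strengthens or weakens, which is how the window can steer the weight toward a stable operating point.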
Multiplicative versus additive noise in multi-state neural networks
The effects of a variable amount of random dilution of the synaptic couplings
in Q-Ising multi-state neural networks with Hebbian learning are examined. A
fraction of the couplings is explicitly allowed to be anti-Hebbian. Random
dilution represents the dying or pruning of synapses and, hence, a static
disruption of the learning process which can be considered as a form of
multiplicative noise in the learning rule. Both parallel and sequential
updating of the neurons can be treated. Symmetric dilution in the statics of
the network is studied using the mean-field theory approach of statistical
mechanics. General dilution, including asymmetric pruning of the couplings, is
examined using the generating functional (path integral) approach of disordered
systems. It is shown that random dilution acts as additive Gaussian noise in
the Hebbian learning rule, with zero mean and a variance depending on the
connectivity of the network and on the symmetry. Furthermore, a scaling factor
appears that essentially measures the average amount of anti-Hebbian couplings.
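The dilution-as-additive-noise correspondence can be checked empirically in the simplest binary (Q = 2) case with asymmetric, independent dilution. The toy NumPy sketch below uses sizes and names of our choosing and only illustrates the zero-mean, connectivity-dependent-variance statement, not the full mean-field or path-integral analysis.

```python
import numpy as np

rng = np.random.default_rng(2)
N, P, c = 500, 50, 0.6          # neurons, stored patterns, connectivity

# Hebbian couplings for P random +/-1 patterns (binary Q = 2 case).
xi = rng.choice([-1.0, 1.0], size=(P, N))
J = xi.T @ xi / N
np.fill_diagonal(J, 0.0)

# Asymmetric random dilution: keep each coupling independently with
# probability c, rescaling the survivors by 1/c.
mask = rng.random((N, N)) < c
J_dil = mask * J / c

# The residual J_dil - J looks like zero-mean additive noise whose
# variance matches the (1 - c)/c * <J^2> prediction.
off = ~np.eye(N, dtype=bool)
resid = (J_dil - J)[off]
print(resid.mean())                                     # ~ 0
print(resid.var(), (1 - c) / c * (J[off] ** 2).mean())  # ~ equal
```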
Towards Interpretable Deep Learning Models for Knowledge Tracing
As an important technique for modeling the knowledge states of learners, the
traditional knowledge tracing (KT) models have been widely used to support
intelligent tutoring systems and MOOC platforms. Driven by the fast
advancements of deep learning techniques, deep neural network has been recently
adopted to design new KT models for achieving better prediction performance.
However, the lack of interpretability of these models has severely impeded
their practical application, as their outputs and working mechanisms are
obscured by opaque decision processes and complex inner structures. We thus
propose to adopt the post-hoc method to tackle the interpretability issue for
deep learning based knowledge tracing (DLKT) models. Specifically, we focus on
applying the layer-wise relevance propagation (LRP) method to interpret
RNN-based DLKT model by backpropagating the relevance from the model's output
layer to its input layer. The experimental results show the feasibility of
using the LRP method to interpret the DLKT model's predictions, and partially
validate the computed relevance scores at both the question level and the
concept level. We believe this can be a solid step towards fully interpreting
DLKT models and promoting their practical application in the education domain.
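The core LRP step is redistributing a layer's output relevance to its inputs in proportion to their contributions; the paper applies this through an RNN-based DLKT model, but the single dense-layer epsilon rule is the basic building block. A NumPy sketch of that rule follows, with shapes and names chosen for illustration.

```python
import numpy as np

def lrp_epsilon(a, W, b, R_out, eps=1e-6):
    """Redistribute the relevance R_out of one dense layer's outputs
    (z = a @ W + b) back to its inputs with the LRP-epsilon rule:
    each input receives a share proportional to its contribution
    a[j] * W[j, k] to every output pre-activation z[k]."""
    z = a @ W + b                       # forward pre-activations
    s = R_out / (z + eps * np.sign(z))  # stabilized relevance ratios
    return a * (W @ s)                  # relevance assigned to each input

rng = np.random.default_rng(3)
a = rng.random(8)                       # input activations of the layer
W, b = rng.normal(size=(8, 4)), np.zeros(4)
R_out = rng.random(4)                   # relevance arriving at the outputs
R_in = lrp_epsilon(a, W, b, R_out)
print(R_in.sum(), R_out.sum())          # approximately conserved (b = 0)
```

Chaining this step backwards through every layer (and through time, for an RNN) is what propagates the model's output relevance down to individual input questions and concepts.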