Inherent Weight Normalization in Stochastic Neural Networks
Multiplicative stochasticity such as Dropout improves the robustness and
generalizability of deep neural networks. Here, we further demonstrate that
always-on multiplicative stochasticity combined with simple threshold neurons
is sufficient to build deep neural networks. We call such models Neural
Sampling Machines (NSM). We find that the probability of activation of the NSM
exhibits a self-normalizing property that mirrors Weight Normalization, a
previously studied mechanism that fulfills many of the features of Batch
Normalization in an online fashion. The normalization of activities during
training speeds up convergence by preventing internal covariate shift caused by
changes in the input distribution. The always-on stochasticity of the NSM
confers the following advantages: the network is identical in the inference and
learning phases, making the NSM suitable for online learning; it can exploit
stochasticity inherent to a physical substrate, such as analog non-volatile
memories for in-memory computing; and it is suitable for Monte Carlo sampling,
while requiring almost exclusively addition and comparison operations. We
demonstrate NSMs on standard classification benchmarks (MNIST and CIFAR) and
event-based classification benchmarks (N-MNIST and DVS Gestures). Our results
show that NSMs perform comparably to or better than conventional artificial
neural networks with the same architecture.
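The core NSM operation described above can be illustrated with a minimal Monte Carlo sketch. All names and values below are illustrative assumptions, not the paper's implementation: each synapse is masked by an always-on Bernoulli draw, and the unit fires when the noisy pre-activation crosses a threshold, so only additions and comparisons are required.

```python
import numpy as np

rng = np.random.default_rng(0)

def nsm_activation_prob(x, w, threshold=0.0, p=0.5, n_samples=2000):
    """Monte Carlo estimate of a stochastic threshold neuron's firing
    probability under always-on multiplicative Bernoulli noise (a sketch)."""
    # Each sample masks every synapse with an independent Bernoulli(p) draw,
    # then the unit fires iff the noisy pre-activation crosses the threshold.
    masks = rng.random((n_samples, w.size)) < p   # multiplicative noise
    pre = (masks * w) @ x                         # noisy pre-activations
    return float(np.mean(pre > threshold))        # activation probability

x = np.array([1.0, -0.5, 0.3])
w = np.array([0.8, 0.4, -0.2])
p_fire = nsm_activation_prob(x, w)
```

Averaged over many noise masks, the firing probability is smooth in the inputs, which is what gives the self-normalizing behavior room to act during training.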
Stochastic Neural Networks with the Weighted Hebb Rule
Neural networks with synaptic weights constructed according to the weighted
Hebb rule, a variant of the familiar Hebb rule, are studied in the presence of
noise (finite temperature), when the number of stored patterns is finite and in
the limit that the number of neurons tends to infinity. The fact that different patterns enter the synaptic rule with
different weights changes the configuration of the free energy surface. For a
general choice of weights, not all of the patterns are stored as global
minima of the free energy function. However, as for the case of the usual Hebb
rule, there exists a temperature range in which only the stored patterns are
minima of the free energy. In particular, in the presence of a single extra
pattern stored with an appropriate weight in the synaptic rule, the temperature
at which the spurious minima of the free energy are eliminated is significantly
lower than for a similar network without this extra pattern. The convergence
time of the network, together with the overlaps of the equilibria of the
network with the stored patterns, can thereby be improved considerably.
Comment: 14 pages, OKHEP 93-00
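The weight construction the abstract refers to can be sketched as follows, assuming the standard weighted-Hebb form J_ij = (1/N) Σ_μ w_μ ξ_i^μ ξ_j^μ with a pattern-specific weight w_μ for each stored pattern ξ^μ (a plausible reading of "weighted Hebb rule", not taken verbatim from the paper):

```python
import numpy as np

def weighted_hebb(patterns, weights):
    """Synaptic matrix J_ij = (1/N) * sum_mu w_mu * xi^mu_i * xi^mu_j with a
    zero diagonal -- each stored pattern enters with its own weight w_mu."""
    n = patterns.shape[1]
    J = (patterns.T * weights) @ patterns / n  # weighted outer-product sum
    np.fill_diagonal(J, 0.0)                   # no self-coupling
    return J

# Two orthogonal +/-1 patterns; the second stored with a larger weight.
xi = np.array([[1, -1, 1, -1], [1, 1, -1, -1]], dtype=float)
J = weighted_hebb(xi, np.array([1.0, 2.0]))
```

At zero temperature a stored pattern remains a fixed point of the sign dynamics as long as its weight is not drowned out by the others, consistent with the abstract's point that the weights reshape which patterns survive as minima.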
Learning in stochastic neural networks for constraint satisfaction problems
Researchers describe a newly developed artificial neural network algorithm for solving constraint satisfaction problems (CSPs) which includes a learning component that can significantly improve the performance of the network from run to run. The network, referred to as the Guarded Discrete Stochastic (GDS) network, is based on the discrete Hopfield network but differs from it primarily in that auxiliary networks (guards) are asymmetrically coupled to the main network to enforce certain types of constraints. Although the presence of asymmetric connections implies that the network may not converge, it was found that, for certain classes of problems, the network often quickly converges to satisfactory solutions when they exist. The network can run efficiently on serial machines and can find solutions to very large problems (e.g., N-queens for N as large as 1024). One advantage of the network architecture is that connection strengths need not be instantiated when the network is established: they are needed only when a participating neural element transitions from off to on. The researchers have exploited this feature to devise a learning algorithm, based on consistency techniques for discrete CSPs, that updates the network biases and connection strengths and thus improves the network performance.
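The GDS network itself is not reproduced here. As a hedged illustration of stochastic local search on the same benchmark CSP the abstract mentions (N-queens), the following is a minimal min-conflicts repair loop, a related but distinct technique, with all parameter choices being assumptions:

```python
import random

def conflicts(cols, row, col):
    """Number of queens attacking (row, col) on a one-queen-per-row board."""
    return sum(1 for r, c in enumerate(cols)
               if r != row and (c == col or abs(c - col) == abs(r - row)))

def min_conflicts_queens(n, max_steps=20000, seed=0):
    """Stochastic repair for N-queens: repeatedly pick a conflicted queen and
    move it to a minimum-conflict column, breaking ties at random."""
    rng = random.Random(seed)
    cols = [rng.randrange(n) for _ in range(n)]   # queen column per row
    for _ in range(max_steps):
        conflicted = [r for r in range(n) if conflicts(cols, r, cols[r]) > 0]
        if not conflicted:
            return cols                           # all constraints satisfied
        row = rng.choice(conflicted)
        costs = [conflicts(cols, row, c) for c in range(n)]
        best = min(costs)
        cols[row] = rng.choice([c for c in range(n) if costs[c] == best])
    return None                                   # gave up within the budget

solution = min_conflicts_queens(8)
```

Like the GDS network, this kind of stochastic repair is not guaranteed to converge, yet in practice it finds solutions quickly when they exist.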
Simple and Effective Stochastic Neural Networks
Stochastic neural networks (SNNs) are currently topical, with several paradigms being actively investigated, including dropout, Bayesian neural networks, the variational information bottleneck (VIB), and noise-regularized learning. These neural network variants affect several major considerations, including generalization, network compression, robustness against adversarial attack and label noise, and model calibration. However, many existing networks are complicated and expensive to train, and/or address only one or two of these practical considerations. In this paper we propose a simple and effective stochastic neural network (SE-SNN) architecture for discriminative learning that directly models activation uncertainty and encourages high activation variability. Compared to existing SNNs, our SE-SNN is simpler to implement and faster to train, and produces state-of-the-art results on network compression by pruning, adversarial defense, learning with label noise, and model calibration.
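One simple way to "directly model activation uncertainty", as the abstract describes, is to have a layer emit a mean and a variance per activation and sample with the reparameterization trick. The sketch below is an assumption-laden illustration of that general idea, not the SE-SNN architecture itself; all weight names are invented here:

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_activation(x, w_mu, w_logvar):
    """Illustrative stochastic layer: a mean and a log-variance are computed
    per activation, then a sample is drawn via the reparameterization trick."""
    mu = x @ w_mu                              # per-activation mean
    sigma = np.exp(0.5 * (x @ w_logvar))       # per-activation std. dev. (> 0)
    eps = rng.standard_normal(mu.shape)        # noise, decoupled from params
    return mu + sigma * eps, mu, sigma         # sampled activation + its stats

x = rng.standard_normal((4, 3))
a, mu, sigma = stochastic_activation(x,
                                     rng.standard_normal((3, 2)),
                                     rng.standard_normal((3, 2)) * 0.1)
```

Because the noise enters additively through `eps`, gradients flow through `mu` and `sigma`, which is what makes such layers cheap to train with ordinary backpropagation.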
Implicit Simulations using Messaging Protocols
A novel algorithm for performing parallel, distributed computer simulations
on the Internet using IP control messages is introduced. The algorithm employs
carefully constructed ICMP packets which enable the required computations to be
completed as part of the standard IP communication protocol. After providing a
detailed description of the algorithm, experimental applications in the areas
of stochastic neural networks and deterministic cellular automata are
discussed. As an example of the algorithm's potential power, a simulation of a
deterministic cellular automaton involving 10^5 Internet connected devices was
performed.
Comment: 14 pages, 3 figures
Stochastic neural networks
Artificial neural networks are brain-like models of parallel computation and cognitive phenomena. We sample some basic results about neural networks as they relate to stochastic and statistical processes. Given the explosive amount of material, only models bearing a stochastic component in their function or analysis are presented, such as Hopfield and feedforward nets, Boltzmann machines, and some recurrent networks. Basic learning algorithms, such as backpropagation and gradient descent, are sketched. A handful of applications (associative memories, pattern recognition, time-series forecasting) are described. Finally, some current trends in the field are discussed.
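The stochastic dynamics shared by the Hopfield-style and Boltzmann-machine models this survey covers can be sketched as asynchronous Glauber updates at inverse temperature beta. This is a generic textbook formulation, not code tied to the survey:

```python
import numpy as np

rng = np.random.default_rng(0)

def glauber_step(s, J, beta):
    """One asynchronous sweep over a +/-1 spin network: each unit is set to +1
    with the sigmoid probability fixed by its local field and temperature."""
    for i in rng.permutation(s.size):
        h = J[i] @ s                                   # local field on unit i
        p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * h))   # P(s_i = +1)
        s[i] = 1 if rng.random() < p_up else -1
    return s

# Tiny Hopfield-style network storing one pattern via the Hebb rule.
xi = np.array([1, -1, 1, -1, 1], dtype=float)
J = np.outer(xi, xi) / xi.size
np.fill_diagonal(J, 0.0)
s = glauber_step(rng.choice([-1.0, 1.0], size=5), J, beta=5.0)
```

At low temperature (large beta) the update is nearly deterministic sign dynamics; at high temperature the noise dominates, which is the regime distinction exploited in the free-energy analyses above.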