
    A Unified Coded Deep Neural Network Training Strategy Based on Generalized PolyDot Codes for Matrix Multiplication

    This paper has two contributions. First, we propose a novel coded matrix multiplication technique called Generalized PolyDot codes that advances on existing methods for coded matrix multiplication under storage and communication constraints. This technique uses "garbage alignment," i.e., aligning computations in coded computing that are not a part of the desired output. Generalized PolyDot codes bridge between Polynomial codes and MatDot codes, trading off between recovery threshold and communication costs. Second, we demonstrate that Generalized PolyDot can be used for training large Deep Neural Networks (DNNs) on unreliable nodes prone to soft-errors. This requires us to address three additional challenges: (i) prohibitively large overhead of coding the weight matrices in each layer of the DNN at each iteration; (ii) nonlinear operations during training, which are incompatible with linear coding; and (iii) not assuming presence of an error-free master node, requiring us to architect a fully decentralized implementation without any "single point of failure." We allow all primary DNN training steps, namely, matrix multiplication, nonlinear activation, Hadamard product, and update steps as well as the encoding/decoding to be error-prone. We consider the case of mini-batch size $B=1$, as well as $B>1$, leveraging coded matrix-vector products, and matrix-matrix products respectively. The problem of DNN training under soft-errors also motivates an interesting, probabilistic error model under which a real number $(P,Q)$ MDS code is shown to correct $P-Q-1$ errors with probability $1$ as compared to $\lfloor \frac{P-Q}{2} \rfloor$ for the more conventional, adversarial error model.
We also demonstrate that our proposed strategy can provide unbounded gains in error tolerance over a competing replication strategy and a preliminary MDS-code-based strategy for both these error models. Comment: Presented in part at the IEEE International Symposium on Information Theory 2018 (Submission Date: Jan 12 2018); Currently under review at the IEEE Transactions on Information Theor
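The two error-correction figures quoted in this abstract can be compared with a few lines of arithmetic. The following is only a sketch: the function name `correctable_errors` and the example parameters are hypothetical, and the code simply evaluates the two expressions stated above ($P-Q-1$ and $\lfloor \frac{P-Q}{2} \rfloor$) rather than implementing any coding scheme from the paper.

```python
def correctable_errors(P: int, Q: int) -> dict:
    """Evaluate the two error-tolerance expressions quoted in the abstract
    for a real-number (P, Q) MDS code.

    Hypothetical helper for illustration; it does not encode or decode anything.
    """
    if not 0 < Q < P:
        raise ValueError("expected 0 < Q < P")
    return {
        # Probabilistic error model: P - Q - 1 errors, with probability 1.
        "probabilistic": P - Q - 1,
        # Conventional adversarial error model: floor((P - Q) / 2) errors.
        "adversarial": (P - Q) // 2,
    }


if __name__ == "__main__":
    # Example parameters chosen arbitrarily for illustration.
    print(correctable_errors(10, 4))
```

Under these assumed parameters, the probabilistic model tolerates more errors than the adversarial one, and the gap widens as $P-Q$ grows.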

    Nearly extensive sequential memory lifetime achieved by coupled nonlinear neurons

    Many cognitive processes rely on the ability of the brain to hold sequences of events in short-term memory. Recent studies have revealed that such memory can be read out from the transient dynamics of a network of neurons. However, the memory performance of such a network in buffering past information has only been rigorously estimated in networks of linear neurons. When signal gain is kept low, so that neurons operate primarily in the linear part of their response nonlinearity, the memory lifetime is bounded by the square root of the network size. In this work, I demonstrate that it is possible to achieve a memory lifetime almost proportional to the network size, "an extensive memory lifetime", when the nonlinearity of neurons is appropriately utilized. The analysis of neural activity revealed that nonlinear dynamics prevented the accumulation of noise by partially removing noise in each time step. With this error-correcting mechanism, I demonstrate that a memory lifetime of order $N/\log N$ can be achieved. Comment: 21 pages, 5 figures, the manuscript has been accepted for publication in Neural Computatio
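To get a feel for the gap between the two scalings mentioned in this abstract, the sketch below tabulates $\sqrt{N}$ against $N/\log N$. It makes several assumptions not stated in the source: all constant prefactors are dropped, the logarithm is taken to be natural, and the function name `lifetime_scalings` is hypothetical. It illustrates growth rates only, not the paper's model.

```python
import math


def lifetime_scalings(n: int) -> tuple[float, float]:
    """Return (sqrt(n), n / ln(n)) for a network of n neurons.

    Assumptions: constant factors are omitted and the log is natural;
    these are order-of-growth expressions, not predicted lifetimes.
    """
    if n < 2:
        raise ValueError("n must be at least 2 so that ln(n) > 0")
    return math.sqrt(n), n / math.log(n)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        low_gain, nonlinear = lifetime_scalings(n)
        print(f"N={n:>9}: sqrt(N)={low_gain:10.1f}  N/log N={nonlinear:12.1f}")
```

The ratio between the two expressions, $\sqrt{N}/\log N$, keeps growing with $N$, which is why the second scaling is described as nearly extensive.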

    Learning to Discriminate Through Long-Term Changes of Dynamical Synaptic Transmission

    Short-term synaptic plasticity is modulated by long-term synaptic changes. There is, however, no general agreement on the computational role of this interaction. Here, we derive a learning rule for the release probability and the maximal synaptic conductance in a circuit model with combined recurrent and feedforward connections that allows learning to discriminate among natural inputs. Short-term synaptic plasticity thereby provides a nonlinear expansion of the input space of a linear classifier, whereas the random recurrent network serves to decorrelate the expanded input space. Computer simulations reveal that the twofold increase in the number of input dimensions through short-term synaptic plasticity improves the performance of a standard perceptron up to 100%. The distributions of release probabilities and maximal synaptic conductances at the capacity limit strongly depend on the balance between excitation and inhibition. The model also suggests a new computational interpretation of spikes evoked by stimuli outside the classical receptive field. These neuronal activities may reflect decorrelation of the expanded stimulus space by intracortical synaptic connections

    Towards a learning-theoretic analysis of spike-timing dependent plasticity

    This paper suggests a learning-theoretic perspective on how synaptic plasticity benefits global brain functioning. We introduce a model, the selectron, that (i) arises as the fast time constant limit of leaky integrate-and-fire neurons equipped with spike-timing dependent plasticity (STDP) and (ii) is amenable to theoretical analysis. We show that the selectron encodes reward estimates into spikes and that an error bound on spikes is controlled by a spiking margin and the sum of synaptic weights. Moreover, the efficacy of spikes (their usefulness to other reward maximizing selectrons) also depends on total synaptic strength. Finally, based on our analysis, we propose a regularized version of STDP, and show the regularization improves the robustness of neuronal learning when faced with multiple stimuli. Comment: To appear in Adv. Neural Inf. Proc. System

    Optimally adapted multi-state neural networks trained with noise

    The principle of adaptation in a noisy retrieval environment is extended here to a diluted attractor neural network of Q-state neurons trained with noisy data. The network is adapted to an appropriate noisy training overlap and training activity which are determined self-consistently by the optimized retrieval attractor overlap and activity. The optimized storage capacity and the corresponding retriever overlap are considerably enhanced by an adequate threshold in the states. Explicit results for improved optimal performance and new retriever phase diagrams are obtained for Q=3 and Q=4, with coexisting phases over a wide range of thresholds. Most of the interesting results are stable to replica-symmetry-breaking fluctuations. Comment: 22 pages, 5 figures, accepted for publication in PR

    Stability of the replica symmetric solution for the information conveyed by a neural network

    The information that a pattern of firing in the output layer of a feedforward network of threshold-linear neurons conveys about the network's inputs is considered. A replica-symmetric solution is found to be stable for all but small amounts of noise. The region of instability depends on the contribution of the threshold and the sparseness: for distributed pattern distributions, the unstable region extends to higher noise variances than for very sparse distributions, for which it is almost nonexistent. Comment: 19 pages, LaTeX, 5 figures. Also available at http://www.mrc-bbc.ox.ac.uk/~schultz/papers.html . Submitted to Phys. Rev. E Minor change