    Why and When Can Deep -- but Not Shallow -- Networks Avoid the Curse of Dimensionality: A Review

    The paper characterizes classes of functions for which deep learning can be exponentially better than shallow learning. Deep convolutional networks are a special case satisfying these conditions, though weight sharing is not the main reason for their exponential advantage.
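
    A minimal sketch of the kind of target the paper's conditions describe: a hierarchically compositional function built as a binary tree of two-argument constituents. The constituent function h below is an arbitrary illustrative choice, not one from the paper; the point is that a deep network can mirror the tree layer by layer, while a shallow one must approximate all input variables jointly.

        import numpy as np

        def h(a, b):
            # arbitrary smooth two-argument constituent (illustrative choice)
            return np.tanh(a + 2.0 * b)

        def compositional_f(x):
            """Evaluate a binary-tree composition over an input of length 2**k."""
            level = list(x)
            while len(level) > 1:
                level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
            return level[0]

        print(compositional_f(np.random.randn(8)))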

    The laminar integration of sensory inputs with feedback signals in human cortex

    The cortex constitutes the largest area of the human brain. Yet we have only a basic understanding of how it performs one vital function: the integration of sensory signals (carried by feedforward pathways) with internal representations (carried by feedback pathways). A multi-scale, multi-species approach is essential for understanding the site of integration, the computational mechanism, and the functional role of this processing. To improve our knowledge we must rely on brain imaging with improved spatial and temporal resolution, on paradigms that can measure internal processes in the human brain, and on the bridging of disciplines in order to characterize this processing at the cellular and circuit levels. We highlight apical amplification as one potential mechanism for integrating feedforward and feedback inputs within pyramidal neurons in the rodent brain, and we reflect on the challenges and progress in applying this model neuronal process to the study of human cognition. We conclude that cortical layer-specific measures in humans will be an essential contribution to better understanding the landscape of information in cortical feedback, helping to bridge the explanatory gap.
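
    As a rough intuition for apical amplification (a toy rendering under assumed functional forms, not the authors' model): feedforward input to the basal dendrites determines whether the cell responds at all, while feedback input to the apical tuft multiplicatively amplifies an existing response but cannot create one on its own.

        import numpy as np

        def apical_amplification(basal_drive, apical_input, gain=2.0):
            # feedforward (basal) input gates the response...
            baseline = np.maximum(basal_drive, 0.0)
            # ...feedback (apical) input scales it multiplicatively
            amplification = 1.0 + gain / (1.0 + np.exp(-apical_input))
            return baseline * amplification

        print(apical_amplification(0.8, 1.5))  # driven and amplified
        print(apical_amplification(0.0, 1.5))  # apical input alone: no response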

    Equilibrium Propagation: Bridging the Gap Between Energy-Based Models and Backpropagation

    We introduce Equilibrium Propagation, a learning framework for energy-based models. It involves only one kind of neural computation, performed in both the first phase (when the prediction is made) and the second phase of training (after the target or prediction error is revealed). Although this algorithm computes the gradient of an objective function just like Backpropagation, it does not need a special computation or circuit for the second phase, where errors are implicitly propagated. Equilibrium Propagation shares similarities with Contrastive Hebbian Learning and Contrastive Divergence while solving the theoretical issues of both algorithms: our algorithm computes the gradient of a well-defined objective function. Because the objective function is defined in terms of local perturbations, the second phase of Equilibrium Propagation corresponds to only nudging the prediction (fixed point or stationary distribution) towards a configuration that reduces prediction error. In the case of a recurrent multi-layer supervised network, the output units are slightly nudged towards their target in the second phase, and the perturbation introduced at the output layer propagates backward through the hidden layers. We show that the signal 'back-propagated' during this second phase corresponds to the propagation of error derivatives and encodes the gradient of the objective function when the synaptic update corresponds to a standard form of spike-timing-dependent plasticity. This work makes it more plausible that a mechanism similar to Backpropagation could be implemented by brains, since leaky-integrator neural computation performs both inference and error back-propagation in our model. The only local difference between the two phases is whether synaptic changes are allowed or not.
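
    A compact sketch of the two-phase procedure on a small symmetric network. This is a simplified rendering with assumed dynamics (tanh activation, the rho' factor in the energy gradient dropped, illustrative sizes and rates), not the paper's exact formulation:

        import numpy as np

        rho = np.tanh  # pointwise activation (the paper uses a hard sigmoid)

        def relax(s, W, x, beta=0.0, target=None, steps=200, lr=0.05):
            """Settle the state toward a fixed point of the (possibly nudged)
            energy; the first len(x) units stay clamped to the input."""
            for _ in range(steps):
                ds = -s + rho(s) @ W                   # simplified energy gradient
                if beta > 0.0:
                    ds[-1] += beta * (target - s[-1])  # nudge the output unit
                s = s + lr * ds
                s[:len(x)] = x
            return s

        def eqprop_update(W, x, target, beta=0.5, alpha=0.1):
            s0 = relax(np.zeros(W.shape[0]), W, x)                 # free phase
            sb = relax(s0.copy(), W, x, beta=beta, target=target)  # nudged phase
            # contrastive update built from local co-activation statistics only
            dW = (np.outer(rho(sb), rho(sb)) - np.outer(rho(s0), rho(s0))) / beta
            return W + alpha * dW

        n = 5  # 3 input units, 1 hidden, 1 output (illustrative sizes)
        W = 0.1 * np.random.randn(n, n)
        W = (W + W.T) / 2
        np.fill_diagonal(W, 0.0)
        W = eqprop_update(W, x=np.array([0.2, -0.4, 0.7]), target=1.0)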

    Power Optimizations in MTJ-based Neural Networks through Stochastic Computing

    Artificial Neural Networks (ANNs) have found widespread applications in tasks such as pattern recognition and image classification. However, hardware implementations of ANNs using conventional binary arithmetic units are computationally expensive, energy-intensive and have large area overheads. Stochastic Computing (SC) is an emerging paradigm which replaces these conventional units with simple logic circuits and is particularly suitable for fault-tolerant applications. Spintronic devices, such as Magnetic Tunnel Junctions (MTJs), are capable of replacing CMOS in memory and logic circuits. In this work, we propose an energy-efficient use of MTJs, which exhibit probabilistic switching behavior, as Stochastic Number Generators (SNGs), which forms the basis of our NN implementation in the SC domain. Further, the error-resilient target applications of NNs allow us to introduce Approximate Computing, a framework wherein the accuracy of computations is traded off for substantial reductions in power consumption. We propose approximating the synaptic weights in our MTJ-based NN implementation, in ways brought about by properties of our MTJ-SNG, to achieve energy efficiency. We design an algorithm that can perform such approximations within a given error tolerance in a single-layer NN in an optimal way, owing to the convexity of the problem formulation. We then use this algorithm to develop a heuristic approach for approximating multi-layer NNs. To give a perspective on the effectiveness of our approach: a 43% reduction in power consumption was obtained with less than 1% accuracy loss on a standard classification problem, with 26% being brought about by the proposed algorithm. (Comment: accepted at the 2017 IEEE/ACM International Conference on Low Power Electronics and Design.)
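
    In unipolar stochastic computing, a value p in [0, 1] is encoded as a bitstream with P(bit = 1) = p, and multiplication reduces to a bitwise AND of independent streams; an MTJ whose write pulses switch it with a tunable probability can serve as the stochastic number generator. A minimal sketch of this encoding (the Bernoulli model below is an idealization of the device physics, not a circuit-level model):

        import numpy as np

        rng = np.random.default_rng(0)

        def mtj_sng(p, length):
            # idealized MTJ-based SNG: each write pulse switches the junction
            # with probability p, so successive read-outs form a bitstream
            # encoding p in the unipolar format
            return rng.random(length) < p

        def sc_multiply(p, q, length=10_000):
            # unipolar SC multiplication: AND two independent bitstreams
            return np.mean(mtj_sng(p, length) & mtj_sng(q, length))

        print(sc_multiply(0.8, 0.5))  # ~0.4, up to stochastic-computing noise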

    Continuous-variable quantum neural networks

    We introduce a general method for building neural networks on quantum computers. The quantum neural network is a variational quantum circuit built in the continuous-variable (CV) architecture, which encodes quantum information in continuous degrees of freedom such as the amplitudes of the electromagnetic field. This circuit contains a layered structure of continuously parameterized gates which is universal for CV quantum computation. Affine transformations and nonlinear activation functions, two key elements in neural networks, are enacted in the quantum network using Gaussian and non-Gaussian gates, respectively. The non-Gaussian gates provide both the nonlinearity and the universality of the model. Due to the structure of the CV model, the CV quantum neural network can encode highly nonlinear transformations while remaining completely unitary. We show how a classical network can be embedded into the quantum formalism and propose quantum versions of various specialized models such as convolutional, recurrent, and residual networks. Finally, we present numerous modeling experiments built with the Strawberry Fields software library. These experiments, including a classifier for fraud detection, a network which generates Tetris images, and a hybrid classical-quantum autoencoder, demonstrate the capability and adaptability of CV quantum neural networks.
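
    A minimal sketch of one CV layer written against the Strawberry Fields Program/Engine API; the gate parameters are arbitrary placeholders rather than trained values. Gaussian gates (beamsplitters, squeezing, displacement) supply the affine part of the layer, and the non-Gaussian Kerr gate plays the role of the activation function:

        import strawberryfields as sf
        from strawberryfields.ops import BSgate, Sgate, Dgate, Kgate

        prog = sf.Program(2)
        with prog.context as q:
            # affine part (Gaussian gates): interferometer, squeezing,
            # interferometer, displacement -- together enacting W x + b
            BSgate(0.4, 0.0) | (q[0], q[1])
            Sgate(0.1) | q[0]
            Sgate(-0.1) | q[1]
            BSgate(0.6, 0.0) | (q[0], q[1])
            Dgate(0.2) | q[0]
            Dgate(0.2) | q[1]
            # nonlinearity (non-Gaussian gate): the Kerr interaction acts
            # as the activation function
            Kgate(0.05) | q[0]
            Kgate(0.05) | q[1]

        eng = sf.Engine("fock", backend_options={"cutoff_dim": 6})
        state = eng.run(prog).state
        print(state.mean_photon(0))  # (mean, variance) of photon number in mode 0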

    Sampling-based probabilistic inference emerges from learning in neural circuits with a cost on reliability

    Neural responses in the cortex change over time both systematically, due to ongoing plasticity and learning, and seemingly randomly, due to various sources of noise and variability. Most previous work considered each of these processes, learning and variability, in isolation; here we study neural networks exhibiting both and show that their interaction leads to the emergence of powerful computational properties. We trained neural networks on classical unsupervised learning tasks, in which the objective was to represent their inputs in an efficient, easily decodable form, with an additional cost for neural reliability which we derived from basic biophysical considerations. This cost on reliability introduced a tradeoff between energetically cheap but inaccurate representations and energetically costly but accurate ones. Despite the learning tasks being non-probabilistic, the networks solved this tradeoff by developing a probabilistic representation: neural variability represented samples from statistically appropriate posterior distributions that would result from performing probabilistic inference over their inputs. We provide an analytical understanding of this result by revealing a connection between the cost of reliability and the objective for a state-of-the-art Bayesian inference strategy: variational autoencoders. We show that the same cost leads to the emergence of increasingly accurate probabilistic representations as networks become more complex, from single-layer feed-forward, through multi-layer feed-forward, to recurrent architectures. Our results provide insights into why neural responses in sensory areas show signatures of sampling-based probabilistic representations, and may inform future deep learning algorithms and their implementation in stochastic low-precision computing systems.
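
    The flavor of the tradeoff can be written down in a few lines (an illustrative objective under assumed Gaussian forms, not the paper's derived cost): neural responses are noisy codes z = mu + sigma * noise, reconstruction rewards accuracy, and a reliability cost grows as sigma shrinks. Up to constants and scaling, the cost below is the KL term of a Gaussian variational autoencoder, which is the connection the paper makes precise:

        import numpy as np

        def objective(x, W_enc, W_dec, log_sigma, lam=0.1, n_samples=10, seed=0):
            rng = np.random.default_rng(seed)
            mu, sigma = x @ W_enc, np.exp(log_sigma)
            recon = 0.0
            for _ in range(n_samples):
                z = mu + sigma * rng.standard_normal(mu.shape)  # noisy responses
                recon += np.mean((x - z @ W_dec) ** 2)          # decoding error
            recon /= n_samples
            # reliability is costly: -2*log_sigma blows up as sigma -> 0; with
            # lam = 0.5 this matches the Gaussian VAE KL term up to a constant
            cost = lam * np.sum(np.mean(mu ** 2, axis=0) + sigma ** 2
                                - 2.0 * log_sigma)
            return recon + cost

        rng = np.random.default_rng(1)
        x = rng.standard_normal((100, 8))
        print(objective(x, rng.standard_normal((8, 4)) / 8 ** 0.5,
                        rng.standard_normal((4, 8)) / 4 ** 0.5,
                        log_sigma=np.zeros(4)))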