
    Weighted Contrastive Divergence

    Learning algorithms for energy-based Boltzmann architectures that rely on gradient descent are in general computationally prohibitive, typically due to the exponential number of terms involved in computing the partition function. One therefore has to resort to approximation schemes to evaluate the gradient. This is the case for Restricted Boltzmann Machines (RBMs) and their learning algorithm, Contrastive Divergence (CD). It is well known that CD has a number of shortcomings and that its approximation to the gradient has several drawbacks. Overcoming these defects has been the basis of much research, and new algorithms have been devised, such as persistent CD. In this manuscript we propose a new algorithm that we call Weighted CD (WCD), built from small modifications of the negative phase in standard CD. However small these modifications may be, the experimental work reported in this paper suggests that WCD provides a significant improvement over standard CD and persistent CD at a small additional computational cost.
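    The abstract does not spell out the modified negative phase, so the following numpy sketch should be read as an illustration of where such a modification would sit rather than as the paper's algorithm: it implements standard CD-1 for a binary RBM and exposes a hypothetical neg_weights hook that reweights the negative-phase samples; passing None recovers plain CD-1.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_weighted_grad(v0, W, b, c, neg_weights=None):
    """One CD-1 gradient estimate for a binary RBM.

    v0: (batch, n_vis) data batch; W: (n_vis, n_hid);
    b: (n_vis,) visible bias; c: (n_hid,) hidden bias.
    neg_weights: optional callable mapping the negative samples to
    per-sample weights -- a hypothetical stand-in for a weighted
    negative phase; None recovers standard CD-1.
    """
    batch = v0.shape[0]

    # Positive phase: hidden activations driven by the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (np.random.rand(*ph0.shape) < ph0).astype(v0.dtype)

    # One Gibbs step to obtain the negative samples.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (np.random.rand(*pv1.shape) < pv1).astype(v0.dtype)
    ph1 = sigmoid(v1 @ W + c)

    # Negative phase: uniform average in standard CD-1, reweighted here.
    w = np.full(batch, 1.0 / batch) if neg_weights is None else neg_weights(v1)
    w = w / w.sum()

    dW = v0.T @ ph0 / batch - (v1 * w[:, None]).T @ ph1
    db = v0.mean(axis=0) - w @ v1
    dc = ph0.mean(axis=0) - w @ ph1
    return dW, db, dc
```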

    In All Likelihood, Deep Belief Is Not Enough

    Statistical models of natural stimuli provide an important tool for researchers in the fields of machine learning and computational neuroscience. A canonical way to quantitatively assess and compare the performance of statistical models is given by the likelihood. One class of statistical models which has recently gained increasing popularity and has been applied to a variety of complex data is the deep belief network. Analyses of these models, however, have typically been limited to qualitative comparisons based on samples, due to the computationally intractable nature of the model likelihood. Motivated by these circumstances, the present article provides a consistent estimator for the likelihood that is both computationally tractable and simple to apply in practice. Using this estimator, a deep belief network which has been suggested for the modeling of natural image patches is quantitatively investigated and compared to other models of natural image patches. Contrary to earlier claims based on qualitative results, the results presented in this article provide evidence that the model under investigation is not a particularly good model for natural image patches.
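    The article's estimator itself is not reproduced in the abstract. As a generic illustration of why the likelihood is intractable and how sampling-based estimators attack it, here is a minimal numpy sketch for a single binary RBM, where the hidden units can be summed out analytically and only the partition function Z needs to be estimated (here by naive importance sampling with a uniform proposal, workable only for small visible layers; the paper's consistent estimator is more sophisticated):

```python
import numpy as np

def rbm_log_unnorm(v, W, b, c):
    """log p*(v) for a binary RBM with hidden units summed out:
    log p*(v) = b.v + sum_j softplus(c_j + (v W)_j)."""
    return v @ b + np.logaddexp(0.0, v @ W + c).sum(axis=-1)

def log_partition_estimate(W, b, c, n_samples=100_000):
    """Importance-sampling estimate of log Z with a uniform proposal
    over visible states, q(v) = 2**(-n_vis). Illustrative only."""
    n_vis = W.shape[0]
    v = (np.random.rand(n_samples, n_vis) < 0.5).astype(float)
    # log of p*(v)/q(v) for each proposal sample.
    log_w = rbm_log_unnorm(v, W, b, c) + n_vis * np.log(2.0)
    return np.logaddexp.reduce(log_w) - np.log(n_samples)

def avg_log_likelihood(v_data, W, b, c):
    """Average log-likelihood of the data under the RBM."""
    return np.mean(rbm_log_unnorm(v_data, W, b, c)) - log_partition_estimate(W, b, c)
```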

    A Theory of Cheap Control in Embodied Systems

    We present a framework for designing cheap control architectures for embodied agents. Our derivation is guided by the classical problem of universal approximation: we explore the possibility of exploiting the agent's embodiment for a new and more efficient universal approximation of behaviors generated by sensorimotor control. This embodied universal approximation is compared with the classical non-embodied universal approximation. To exemplify our approach, we present a detailed quantitative case study for policy models defined in terms of conditional restricted Boltzmann machines. In contrast to non-embodied universal approximation, which requires an exponential number of parameters, in the embodied setting we are able to generate all possible behaviors with a drastically smaller model, thus obtaining cheap universal approximation. We test and corroborate the theory experimentally with a six-legged walking machine. The experiments show that the sufficient controller complexity predicted by our theory is tight, which means that the theory has direct practical implications. Keywords: cheap design, embodiment, sensorimotor loop, universal approximation, conditional restricted Boltzmann machine. Comment: 27 pages, 10 figures
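    The abstract only names conditional restricted Boltzmann machines as the policy model. As a rough sketch of what such a policy looks like operationally (with illustrative names and shapes, not the paper's notation), the following samples a binary action vector given a sensor reading by a brief Gibbs chain:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def crbm_sample_action(x, U, V, W, b, c, k=10):
    """Sample a binary action y ~ p(y | x) from a conditional RBM policy.

    x: (n_in,) sensor reading; U: (n_in, n_act) and V: (n_in, n_hid)
    condition the action and hidden biases on the input; W: (n_act, n_hid)
    couples action and hidden units; b: (n_act,), c: (n_hid,).
    """
    y = (np.random.rand(U.shape[1]) < 0.5).astype(float)
    for _ in range(k):  # brief Gibbs chain, clamped on the sensor input x
        h = (np.random.rand(V.shape[1]) < sigmoid(c + x @ V + y @ W)).astype(float)
        y = (np.random.rand(U.shape[1]) < sigmoid(b + x @ U + h @ W.T)).astype(float)
    return y
```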

    Neural Network Operations and Suzuki-Trotter Evolution of Neural Network States

    It was recently proposed to leverage the representational power of artificial neural networks, in particular Restricted Boltzmann Machines, in order to model complex quantum states of many-body systems [Science, 355(6325), 2017]. States represented in this way, called Neural Network States (NNSs), were shown to display interesting properties like the ability to efficiently capture long-range quantum correlations. However, identifying an optimal neural network representation of a given state might be challenging, and so far this problem has been addressed with stochastic optimization techniques. In this work we explore a different direction. We study how the action of elementary quantum operations modifies NNSs. We parametrize a family of many-body quantum operations that can be directly applied to states represented by Unrestricted Boltzmann Machines, by just adding hidden nodes and updating the network parameters. We show that this parametrization contains a set of universal quantum gates, from which it follows that the state prepared by any quantum circuit can be expressed as a Neural Network State with a number of hidden nodes that grows linearly with the number of elementary operations in the circuit. This is a powerful representation theorem (recently also obtained with different methods), but it is not directly useful, since there is no general and efficient way to extract information from this unrestricted description of quantum states. To circumvent this problem, we propose a step-wise procedure based on the projection of Unrestricted quantum states to Restricted quantum states. Two approximate methods to perform this projection are then discussed. In this way, we show that it is in principle possible to approximately optimize or evolve Neural Network States without relying on stochastic methods such as Variational Monte Carlo, which are computationally expensive.
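    For concreteness, the Restricted Boltzmann Machine parametrization of a quantum state referenced here [Science, 355(6325), 2017] assigns each spin configuration an amplitude that can be evaluated in closed form; a minimal numpy version is below. The add_hidden_node helper is an illustrative guess at the mechanics of "adding hidden nodes and updating the network parameters", not the paper's construction.

```python
import numpy as np

def nns_log_amplitude(s, a, b, W):
    """log of the unnormalized RBM wavefunction amplitude
    psi(s) = exp(a.s) * prod_j 2 cosh(b_j + (s W)_j),
    for a spin configuration s in {-1, +1}^n; a, b, W may be complex."""
    return s @ a + np.sum(np.log(2.0 * np.cosh(b + s @ W)))

def add_hidden_node(b, W, b_new, w_new):
    """Grow the machine by one hidden unit: append one hidden bias and
    one weight column. Illustrative only, not the paper's construction."""
    return np.append(b, b_new), np.hstack([W, w_new[:, None]])
```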

    Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units

    We generalize recent theoretical work on the minimal number of layers of narrow deep belief networks that can approximate any probability distribution on the states of their visible units arbitrarily well. We relax the setting of binary units (Sutskever and Hinton, 2008; Le Roux and Bengio, 2008, 2010; Montúfar and Ay, 2011) to units with arbitrary finite state spaces, and the vanishing approximation error to an arbitrary approximation error tolerance. For example, we show that a $q$-ary deep belief network with $L \geq 2 + \frac{q^{\lceil m-\delta \rceil}-1}{q-1}$ layers of width $n \leq m + \log_q(m) + 1$ for some $m \in \mathbb{N}$ can approximate any probability distribution on $\{0,1,\ldots,q-1\}^n$ without exceeding a Kullback-Leibler divergence of $\delta$. Our analysis covers discrete restricted Boltzmann machines and naïve Bayes models as special cases. Comment: 19 pages, 5 figures, 1 table
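    Plugging numbers into the quoted bound makes its scaling concrete; the short script below evaluates the sufficient depth $L$ and the width bound for given $q$, $m$ and $\delta$ (the geometric sum $\frac{q^{\lceil m-\delta \rceil}-1}{q-1}$ is an exact integer, so integer division is safe):

```python
import numpy as np

def dbn_depth_width_bound(q, m, delta):
    """Sufficient depth L and width bound for a q-ary deep belief network
    to approximate any distribution on {0,...,q-1}^n within KL divergence
    delta, per the bound quoted above."""
    L = 2 + (q ** int(np.ceil(m - delta)) - 1) // (q - 1)
    width = m + np.log(m) / np.log(q) + 1   # n <= m + log_q(m) + 1
    return L, width

# Binary units (q = 2), m = 8, tolerance delta = 0.5:
print(dbn_depth_width_bound(2, 8, 0.5))    # -> L = 257, width bound ~ 12.0
```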

    Quantum machine learning: a classical perspective

    Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data-generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication, alongside the increasing size of datasets, is motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed up classical machine learning algorithms. Here we review the literature in quantum machine learning and discuss perspectives for a mixed readership of classical machine learning and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in machine learning are identified as promising directions for the field. Practical questions, like how to upload classical data into quantum form, will also be addressed. Comment: v3 33 pages; typos corrected and references added
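    As a small concrete instance of the data-uploading question raised at the end, amplitude encoding is the textbook scheme for packing a classical vector into the amplitudes of a quantum state. The sketch below shows only the classical-side arithmetic (normalization and zero-padding to a power-of-two dimension); preparing such a state efficiently on hardware is precisely the hard part the review discusses.

```python
import numpy as np

def amplitude_encode(x):
    """Encode a classical vector into the amplitudes of a quantum state,
    |x> = sum_i (x_i / ||x||) |i>, zero-padded to the next power of two.
    Classical preprocessing only; efficient state preparation on a
    quantum device is the nontrivial 'data uploading' problem."""
    x = np.asarray(x, dtype=float)
    dim = 1 << max(1, int(np.ceil(np.log2(len(x)))))
    state = np.zeros(dim)
    state[:len(x)] = x
    return state / np.linalg.norm(state)

print(amplitude_encode([3.0, 4.0]))   # -> [0.6 0.8]
```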