183,622 research outputs found
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
A central problem in machine learning involves modeling complex data-sets
using highly flexible families of probability distributions in which learning,
sampling, inference, and evaluation are still analytically or computationally
tractable. Here, we develop an approach that simultaneously achieves both
flexibility and tractability. The essential idea, inspired by non-equilibrium
statistical physics, is to systematically and slowly destroy structure in a
data distribution through an iterative forward diffusion process. We then learn
a reverse diffusion process that restores structure in data, yielding a highly
flexible and tractable generative model of the data. This approach allows us to
rapidly learn, sample from, and evaluate probabilities in deep generative
models with thousands of layers or time steps, as well as to compute
conditional and posterior probabilities under the learned model. We
additionally release an open source reference implementation of the algorithm
Attentive Tensor Product Learning
This paper proposes a new architecture - Attentive Tensor Product Learning
(ATPL) - to represent grammatical structures in deep learning models. ATPL is a
new architecture to bridge this gap by exploiting Tensor Product
Representations (TPR), a structured neural-symbolic model developed in
cognitive science, aiming to integrate deep learning with explicit language
structures and rules. The key ideas of ATPL are: 1) unsupervised learning of
role-unbinding vectors of words via TPR-based deep neural network; 2) employing
attention modules to compute TPR; and 3) integration of TPR with typical deep
learning architectures including Long Short-Term Memory (LSTM) and Feedforward
Neural Network (FFNN). The novelty of our approach lies in its ability to
extract the grammatical structure of a sentence by using role-unbinding
vectors, which are obtained in an unsupervised manner. This ATPL approach is
applied to 1) image captioning, 2) part of speech (POS) tagging, and 3)
constituency parsing of a sentence. Experimental results demonstrate the
effectiveness of the proposed approach
- …
