Attentive Tensor Product Learning
This paper proposes a new architecture, Attentive Tensor Product Learning
(ATPL), to represent grammatical structures in deep learning models. ATPL
exploits Tensor Product Representations (TPR), a structured neural-symbolic
model developed in cognitive science, to integrate deep learning with explicit
language structures and rules. The key ideas of ATPL are: 1) unsupervised learning of
role-unbinding vectors of words via a TPR-based deep neural network; 2) use of
attention modules to compute TPR; and 3) integration of TPR with typical deep
learning architectures including Long Short-Term Memory (LSTM) and Feedforward
Neural Network (FFNN). The novelty of our approach lies in its ability to
extract the grammatical structure of a sentence by using role-unbinding
vectors, which are obtained in an unsupervised manner. This ATPL approach is
applied to 1) image captioning, 2) part-of-speech (POS) tagging, and 3)
constituency parsing of a sentence. Experimental results demonstrate the
effectiveness of the proposed approach.
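The role-unbinding idea mentioned in this abstract can be illustrated with a minimal sketch of generic TPR binding and unbinding. This is not the ATPL network: the filler and role vectors below are hand-picked assumptions (ATPL learns its unbinding vectors without supervision), and the helper names `bind` and `unbind` are hypothetical. A sentence is stored as a sum of outer products of word (filler) vectors with role vectors, and multiplying by an unbinding vector recovers the word that fills that role.

```python
# Toy TPR: T = sum_i filler_i (outer) role_i; unbinding computes T @ u_i.
# All vectors are illustrative assumptions, not learned ATPL parameters.

FILLERS = {"dog": [1.0, 0.0, 2.0], "barks": [0.0, 3.0, 1.0]}
# Orthonormal roles, so each unbinding vector equals its role vector.
ROLES = {"subject": [1.0, 0.0], "verb": [0.0, 1.0]}


def bind(pairs):
    """Return the TPR matrix for a list of (word, role) pairs."""
    dim_f = len(next(iter(FILLERS.values())))
    dim_r = len(next(iter(ROLES.values())))
    tpr = [[0.0] * dim_r for _ in range(dim_f)]
    for word, role in pairs:
        f, r = FILLERS[word], ROLES[role]
        for i in range(dim_f):
            for j in range(dim_r):
                tpr[i][j] += f[i] * r[j]  # outer product, accumulated
    return tpr


def unbind(tpr, role):
    """Recover the filler bound to `role` via a matrix-vector product."""
    u = ROLES[role]  # valid only because the roles are orthonormal
    return [sum(x * y for x, y in zip(row, u)) for row in tpr]


sentence = bind([("dog", "subject"), ("barks", "verb")])
recovered_subject = unbind(sentence, "subject")
recovered_verb = unbind(sentence, "verb")
```

With orthonormal roles the recovery is exact; with merely linearly independent roles, the unbinding vectors would instead be the dual basis of the role vectors.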
Tensor Product Generation Networks for Deep NLP Modeling
We present a new approach to the design of deep networks for natural language
processing (NLP), based on the general technique of Tensor Product
Representations (TPRs) for encoding and processing symbol structures in
distributed neural networks. We propose a network architecture, the Tensor
Product Generation Network (TPGN), which is capable in principle of
carrying out TPR computation but uses unconstrained deep learning to
design its internal representations. Instantiated in a model for image-caption
generation, TPGN outperforms LSTM baselines when evaluated on the COCO dataset.
The TPR-capable structure enables interpretation of internal representations
and operations, which prove to contain considerable grammatical content. Our
caption-generation model can be interpreted as generating sequences of
grammatical categories and retrieving words by their categories from a plan
encoded as a distributed representation.
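The interpretation given in this abstract, retrieving words by their grammatical categories from a distributed plan, can be sketched with a toy example. This is an assumption-laden illustration rather than the TPGN model: the embeddings, the category vectors, and the helpers `build_plan` and `retrieve` are all hypothetical, and the real network learns its representations end to end.

```python
# Toy "plan": a matrix holding word embeddings bound to category vectors.
# Every value and name here is an illustrative assumption, not TPGN itself.

EMBEDDINGS = {
    "a": [1.0, 1.0, 0.0],
    "cat": [0.0, 1.0, 1.0],
    "sleeps": [1.0, 0.0, 1.0],
}
# One-hot (hence orthonormal) grammatical-category vectors.
CATEGORIES = {
    "DET": [1.0, 0.0, 0.0],
    "NOUN": [0.0, 1.0, 0.0],
    "VERB": [0.0, 0.0, 1.0],
}


def build_plan(assignments):
    """Sum the outer products embedding (outer) category into one matrix."""
    rows = len(next(iter(EMBEDDINGS.values())))
    cols = len(next(iter(CATEGORIES.values())))
    plan = [[0.0] * cols for _ in range(rows)]
    for word, category in assignments:
        f, r = EMBEDDINGS[word], CATEGORIES[category]
        for i in range(rows):
            for j in range(cols):
                plan[i][j] += f[i] * r[j]
    return plan


def retrieve(plan, category):
    """Unbind the plan with a category vector, then pick the closest word."""
    u = CATEGORIES[category]
    f = [sum(p * x for p, x in zip(row, u)) for row in plan]
    # Dot-product scoring is fair here because all embeddings share a norm.
    return max(
        EMBEDDINGS,
        key=lambda w: sum(a * b for a, b in zip(EMBEDDINGS[w], f)),
    )


plan = build_plan([("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")])
caption = [retrieve(plan, c) for c in ["DET", "NOUN", "VERB"]]
```

Generating a caption then amounts to emitting a sequence of categories and reading each word out of the plan, which mirrors the interpretation described above in a much simplified form.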