1,913 research outputs found
RNNs Implicitly Implement Tensor Product Representations
Recurrent neural networks (RNNs) can learn continuous vector representations
of symbolic structures such as sequences and sentences; these representations
often exhibit linear regularities (analogies). Such regularities motivate our
hypothesis that RNNs that show such regularities implicitly compile symbolic
structures into tensor product representations (TPRs; Smolensky, 1990), which
additively combine tensor products of vectors representing roles (e.g.,
sequence positions) and vectors representing fillers (e.g., particular words).
To test this hypothesis, we introduce Tensor Product Decomposition Networks
(TPDNs), which use TPRs to approximate existing vector representations. We
demonstrate using synthetic data that TPDNs can successfully approximate linear
and tree-based RNN autoencoder representations, suggesting that these
representations exhibit interpretable compositional structure; we explore the
settings that lead RNNs to induce such structure-sensitive representations. By
contrast, further TPDN experiments show that the representations of four models
trained to encode naturally-occurring sentences can be largely approximated
with a bag of words, with only marginal improvements from more sophisticated
structures. We conclude that TPDNs provide a powerful method for interpreting
vector representations, and that standard RNNs can induce compositional
sequence representations that are remarkably well approximated by TPRs; at the
same time, existing training tasks for sentence representation learning may not
be sufficient for inducing robust structural representations.Comment: Accepted to ICLR 201
Three-order normalized PMI and other lessons in tensor analysis of verbal selectional preferences
We investigate several questions in transitive verb structure representation by decomposing tensors populated with different subject-verb-object association measures, including a novel generalization of normalized pointwise mutual information to the higher-order (>2) case. Which association measure works the best in modeling verb structures? Should we include occurrences with unfilled arguments in our statistics? We also investigate qualitatively the latent dimensions, and the difference between each noun as a subject versus an object
MorphTE: Injecting Morphology in Tensorized Embeddings
In the era of deep learning, word embeddings are essential when dealing with
text tasks. However, storing and accessing these embeddings requires a large
amount of space. This is not conducive to the deployment of these models on
resource-limited devices. Combining the powerful compression capability of
tensor products, we propose a word embedding compression method with
morphological augmentation, Morphologically-enhanced Tensorized Embeddings
(MorphTE). A word consists of one or more morphemes, the smallest units that
bear meaning or have a grammatical function. MorphTE represents a word
embedding as an entangled form of its morpheme vectors via the tensor product,
which injects prior semantic and grammatical knowledge into the learning of
embeddings. Furthermore, the dimensionality of the morpheme vector and the
number of morphemes are much smaller than those of words, which greatly reduces
the parameters of the word embeddings. We conduct experiments on tasks such as
machine translation and question answering. Experimental results on four
translation datasets of different languages show that MorphTE can compress word
embedding parameters by about 20 times without performance loss and
significantly outperforms related embedding compression methods.Comment: 20 pages, 6 figures, 18 tables. Published at NeurIPS 202
- …