Tensor Product Generation Networks for Deep NLP Modeling
We present a new approach to the design of deep networks for natural language
processing (NLP), based on the general technique of Tensor Product
Representations (TPRs) for encoding and processing symbol structures in
distributed neural networks. A network architecture --- the Tensor Product
Generation Network (TPGN) --- is proposed which is capable in principle of
carrying out TPR computation, but which uses unconstrained deep learning to
design its internal representations. Instantiated in a model for image-caption
generation, TPGN outperforms LSTM baselines when evaluated on the COCO dataset.
The TPR-capable structure enables interpretation of internal representations
and operations, which prove to contain considerable grammatical content. Our
caption-generation model can be interpreted as generating sequences of
grammatical categories and retrieving words by their categories from a plan
encoded as a distributed representation.
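
The abstract does not spell out how a Tensor Product Representation encodes a symbol structure, so the following is only a minimal sketch of the general TPR idea, not of the TPGN architecture itself. It assumes each symbol (filler) is bound to a structural position (role) by an outer product, that the bindings are summed, and that the role vectors are orthonormal so a filler can be recovered by a matrix-vector product. The names `bind` and `unbind` and the toy vectors are hypothetical and chosen for illustration.

```python
def bind(fillers, roles):
    """Sum of outer products f_i (x) r_i, returned as a dim_f x dim_r matrix."""
    dim_f, dim_r = len(fillers[0]), len(roles[0])
    tpr = [[0.0] * dim_r for _ in range(dim_f)]
    for f, r in zip(fillers, roles):
        for i in range(dim_f):
            for j in range(dim_r):
                tpr[i][j] += f[i] * r[j]
    return tpr


def unbind(tpr, role):
    """Recover the filler bound to `role` (assumes orthonormal role vectors)."""
    return [sum(row[j] * role[j] for j in range(len(role))) for row in tpr]


# Hypothetical toy data: two filler vectors and two orthonormal role vectors.
fillers = [[1.0, 0.0, 2.0], [0.5, -1.0, 0.0]]
roles = [[1.0, 0.0], [0.0, 1.0]]

tpr = bind(fillers, roles)
recovered = unbind(tpr, roles[1])  # filler that was bound to the second role
```

With orthonormal roles the unbinding is exact; with merely linearly independent roles one would unbind with dual role vectors instead, which this sketch does not cover.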
Analytical and Numerical Study of Internal Representations in Multilayer Neural Networks with Binary Weights
We study the weight space structure of the parity machine with binary weights
by deriving the distribution of volumes associated to the internal
representations of the learning examples. The learning behaviour and the
symmetry breaking transition are analyzed and the results are found to be in
very good agreement with extended numerical simulations. Comment: revtex, 20 pages + 9 figures, to appear in Phys. Rev.
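
To make the term "internal representation" concrete, here is a minimal sketch of a parity machine with binary weights. It assumes a tree-like layout (each hidden unit sees its own disjoint block of inputs) and weights and inputs in {-1, +1}; the abstract does not state the architecture, so both are assumptions. The internal representation is the vector of hidden-unit signs and the output is their product. The function names and the toy data are hypothetical.

```python
def sign(x):
    """Sign with ties broken toward +1 (ties cannot occur for odd block sizes)."""
    return 1 if x >= 0 else -1


def internal_representation(weights, inputs):
    """Vector of hidden-unit outputs, one per (weight block, input block) pair."""
    return [
        sign(sum(w * x for w, x in zip(w_block, x_block)))
        for w_block, x_block in zip(weights, inputs)
    ]


def parity_output(weights, inputs):
    """Parity machine output: product of the hidden-unit signs."""
    out = 1
    for tau in internal_representation(weights, inputs):
        out *= tau
    return out


# Hypothetical example: 3 hidden units, 3 binary weights and inputs each.
weights = [[1, -1, 1], [-1, -1, 1], [1, 1, 1]]
inputs = [[1, 1, -1], [1, -1, -1], [-1, -1, 1]]

rep = internal_representation(weights, inputs)
out = parity_output(weights, inputs)
```

Several distinct internal representations can produce the same output, which is why the weight-space volume associated with each representation is a natural quantity to study.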
Learning and generalization theories of large committee-machines
The study of the distribution of volumes associated to the internal
representations of learning examples allows us to derive the critical learning
capacity () of large committee machines,
to verify the stability of the solution in the limit of a large number of
hidden units and to find a Bayesian generalization cross-over at . Comment: 14 pages, revte
The CHREST model of active perception and its role in problem solving
We discuss the relation of TEC to a computational model of expert perception, CHREST, based on the chunking theory. TEC’s status as a verbal theory leaves several questions unanswerable, such as the precise nature of the internal representations used, or the degree of learning required to obtain a particular level of competence; CHREST may help answer such questions.