9 research outputs found
Training neural networks to encode symbols enables combinatorial generalization
Combinatorial generalization - the ability to understand and produce novel
combinations of already familiar elements - is considered to be a core capacity
of the human mind and a major challenge to neural network models. A significant
body of research suggests that conventional neural networks can't solve this
problem unless they are endowed with mechanisms specifically engineered for the
purpose of representing symbols. In this paper we introduce a novel way of
representing symbolic structures in connectionist terms - the vectors approach
to representing symbols (VARS), which allows training standard neural
architectures to encode symbolic knowledge explicitly at their output layers.
In two simulations, we show that neural networks not only can learn to produce
VARS representations, but in doing so they achieve combinatorial generalization
in their symbolic and non-symbolic output. This adds to other recent work that
has shown improved combinatorial generalization under specific training
conditions, and raises the question of whether specific mechanisms or training
routines are needed to support symbolic processing
The role of recurrent networks in neural architectures of grounded cognition: learning of control
Recurrent networks have been used as neural models of language processing, with mixed results. Here, we discuss the role of recurrent networks in a neural architecture of grounded cognition. In particular, we discuss how the control of binding in this architecture can be learned. We trained a simple recurrent network (SRN) and a feedforward network (FFN) for this task. The results show that information from the architecture is needed as input for these networks to learn control of binding. Thus, both control systems are recurrent. We found that the recurrent system consisting of the architecture and an SRN or an FFN as a "core" can learn basic (but recursive) sentence structures. Problems with control of binding arise when the system with the SRN is tested on number of new sentence structures. In contrast, control of binding for these structures succeeds with the FFN. Yet, for some structures with (unlimited) embeddings, difficulties arise due to dynamical binding conflicts in the architecture itself. In closing, we discuss potential future developments of the architecture presented here
A neural blackboard architecture of sentence structure
We present a neural architecture for sentence representation. Sentences are represented in terms of word representations as constituents. A word representation consists of a neural assembly distributed over the brain. Sentence representation does not result from associations between neural word assemblies. Instead, word assemblies are embedded in a neural architecture, in which the structural (thematic) relations between words can be represented. Arbitrary thematic relations between arguments and verbs can be represented. Arguments can consist of nouns and phrases, as in sentences with relative clauses. A number of sentences can be stored simultaneously in this architecture. We simulate how probe questions about thematic relations can be answered. We discuss how differences in sentence complexity, such as the difference between subject-extracted versus object-extracted relative clauses and the difference between right-branching versus center-embedded structures, can be related to the underlying neural dynamics of the model. Finally, we illustrate how memory capacity for sentence representation can be related to the nature of reverberating neural activity, which is used to store information temporarily in this architecture
The Unification Space implemented as a localist neural net: predictions and error-tolerance in a constraint-based parser
We introduce a novel computer implementation of the Unification-Space parser (Vosse and Kempen in Cognition 75:105–143, 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen and Harbusch in Verb constructions in German and Dutch. Benjamins, Amsterdam, 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least qualitatively and rudimentarily, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described
Neural blackboard architectures of combinatorial structures in cognition
Human cognition is unique in the way in which it relies on combinatorial (or compositional) structures. Language provides ample evidence for the existence of combinatorial structures, but they can also be found in visual cognition. To understand the neural basis of human cognition, it is therefore essential to understand how combinatorial structures can be instantiated in neural terms. In his recent book on the foundations of language, Jackendoff described four fundamental problems for a neural instantiation of combinatorial structures: the massiveness of the binding problem, the problem of 2, the problem of variables and the transformation of combinatorial structures from working memory to long-term memory. This paper aims to show that these problems can be solved by means of neural ‘blackboard’ architectures. For this purpose, a neural blackboard architecture for sentence structure is presented. In this architecture, neural structures that encode for words are temporarily bound in a manner that preserves the structure of the sentence. It is shown that the architecture solves the four problems presented by Jackendoff. The ability of the architecture to instantiate sentence structures is illustrated with examples of sentence complexity observed in human language performance. Similarities exist between the architecture for sentence structure and blackboard architectures for combinatorial structures in visual cognition, derived from the structure of the visual cortex. These architectures are briefly discussed, together with an example of a combinatorial structure in which the blackboard architectures for language and vision are combined. In this way, the architecture for language is grounded in perception
Recommended from our members
On the induction of temporal structure by recurrent neural networks
Language acquisition is one of the core problems in artificial intelligence (AI) and it is generally accepted that any successful AI account of the mind will stand or fall depending on its ability to model human language. Simple Recurrent Networks (SRNs) are a class of so-called artificial neural networks that have a long history in language modelling via learning to predict the next word in a sentence. However, SRNs have also been shown to suffer from catastrophic forgetting, lack of syntactic systematicity and an inability to represent more than three levels of centre-embedding, due to the so-called 'vanishing gradients' problem. This problem is caused by the decay of past input information encoded within the error-gradients which vanish exponentially as additional input information is encountered and passed through the recurrent connections. That said, a number of architectural variations have been applied which may compensate for this issue, such as the Nonlinear Autoregressive Network with exogenous inputs (NARX) network and the multi-recurrent network (MRN). In addition to this, Echo State Networks (ESNs) are a relatively new class of recurrent neural network that do not suffer from the vanishing gradients problem and have been shown to exhibit state-of-the-art performance in tasks such as motor control, dynamic time series prediction, and more recently language processing. This research re-explores the class of SRNs and evaluates them against the state-of-the-art ESN to identify which model class is best able to induce the underlying finite-state automaton of the target grammar implicitly through the next word prediction task. In order to meet its aim, the research analyses the internal representations formed by each of the different models and explores the conditions under which they are able to carry information about long term sequential dependencies beyond what is found in the training data. The findings of the research are significant. It reveals that the traditional class of SRNs, trained with backpropagation through time, are superior to ESNs for the grammar prediction task. More specifically, the MRN, with its state-based memory of varying rigidity, is more able to learn the underlying grammar than any other model. An analysis of the MRN’s internal state reveals that this is due to its ability to maintain a constant variance within its state-based representation of the embedded aspects (or finite state machines) of the target grammar. The investigations show that in order to successfully induce complex context free grammars directly from sentence examples, then not only are a hidden layer and output layer recurrency required, but so is self-recurrency on the context layer to enable varying degrees of current and past state information, that are integrated over time