    A theory of relation learning and cross-domain generalization

    People readily generalize knowledge to novel domains and stimuli. We present a theory, instantiated in a computational model, based on the idea that cross-domain generalization in humans is a case of analogical inference over structured (i.e., symbolic) relational representations. The model is an extension of the Learning and Inference with Schemas and Analogy (LISA; Hummel & Holyoak, 1997, 2003) and Discovery of Relations by Analogy (DORA; Doumas et al., 2008) models of relational inference and learning. The resulting model learns both the content and format (i.e., structure) of relational representations from nonrelational inputs without supervision; when augmented with the capacity for reinforcement learning, it leverages these representations to learn about individual domains, and then generalizes to new domains on first exposure (i.e., zero-shot learning) via analogical inference. We demonstrate the capacity of the model to learn structured relational representations from a variety of simple visual stimuli, and to perform cross-domain generalization between video games (Breakout and Pong) and between several psychological tasks. We demonstrate that the model’s trajectory closely mirrors the trajectory of children as they learn about relations, accounting for phenomena from the literature on the development of children’s reasoning and analogy making. The model’s ability to generalize between domains demonstrates the flexibility afforded by representing domains in terms of their underlying relational structure, rather than simply in terms of the statistical relations between their inputs and outputs.
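
    The central claim, that a policy defined over relations rather than over raw inputs transfers across games, can be illustrated with a toy sketch. Everything below (the predicate names, the policy table, the coordinates) is hypothetical and far simpler than the model itself:

    ```python
    # Hypothetical sketch, not the authors' model: both games are recoded into
    # the same "ball relative to paddle" relations along the paddle's axis of
    # motion, so a policy learned over relations in Breakout applies to Pong
    # unchanged (zero-shot transfer).

    RELATIONAL_POLICY = {                    # assumed to be learned in Breakout
        ("before", "ball", "paddle"): "move_backward",
        ("after",  "ball", "paddle"): "move_forward",
        ("same",   "ball", "paddle"): "stay",
    }

    def relation(ball, paddle):
        """Recode raw coordinates along one axis into a symbolic relation."""
        if ball < paddle:
            return ("before", "ball", "paddle")
        if ball > paddle:
            return ("after", "ball", "paddle")
        return ("same", "ball", "paddle")

    # Breakout: the paddle moves horizontally, so the relevant axis is x.
    print(RELATIONAL_POLICY[relation(ball=40, paddle=72)])   # move_backward
    # Pong: the paddle moves vertically (axis y), but the relational
    # description, and therefore the learned policy, is identical.
    print(RELATIONAL_POLICY[relation(ball=15, paddle=3)])    # move_forward
    ```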

    A brainwide atlas of synapses across the mouse life span

    Synapses connect neurons together to form the circuits of the brain, and their molecular composition controls innate and learned behavior. We analyzed the molecular and morphological diversity of 5 billion excitatory synapses at single-synapse resolution across the mouse brain from birth to old age. A continuum of changes alters synapse composition in all brain regions across the life span. Expansion in synapse diversity produces differentiation of brain regions until early adulthood, and compositional changes cause dedifferentiation in old age. The spatiotemporal synaptome architecture of the brain potentially accounts for life-span transitions in intellectual ability, memory, and susceptibility to behavioral disorders.

    Artificial ontogenesis: a connectionist model of development

    This thesis suggests that ontogenetic adaptive processes are important for generating intelligent behaviour. It is thus proposed that such processes, as they occur in nature, need to be modelled and that such a model could be used for generating artificial intelligence, and specifically robotic intelligence. Hence, this thesis focuses on how mechanisms of intelligence are specified.

    A major problem in robotics is the need to predefine the behaviour to be followed by the robot. This makes design intractable for all but the simplest tasks and results in controllers that are specific to that particular task and are brittle when faced with unforeseen circumstances. These problems can be resolved by providing the robot with the ability to adapt the rules it follows and to autonomously create new rules for controlling behaviour. This solution thus depends on the predefinition of how rules to control behaviour are to be learnt rather than the predefinition of rules for behaviour themselves.

    Learning new rules for behaviour occurs during the developmental process in biology. Changes in the structure of the cerebral cortex underlie behavioural and cognitive development throughout infancy and beyond. The uniformity of the neocortex suggests that there is significant computational uniformity across the cortex resulting from uniform mechanisms of development, and holds out the possibility of a general model of development. Development is an interactive process between genetic predefinition and environmental influences. This interactive process is constructive: qualitatively new behaviours are learnt by using simple abilities as a basis for learning more complex ones.

    The progressive increase in competence, provided by development, may be essential to make tractable the process of acquiring higher-level abilities.

    While simple behaviours can be triggered by direct sensory cues, more complex behaviours require the use of more abstract representations. There is thus a need to find representations at the correct level of abstraction appropriate to controlling each ability. In addition, finding the correct level of abstraction makes tractable the task of associating sensory representations with motor actions. Hence, finding appropriate representations is important both for learning behaviours and for controlling behaviours. Representations can be found by recording regularities in the world or by discovering re-occurring patterns through repeated sensory-motor interactions. By recording regularities within the representations thus formed, more abstract representations can be found. Simple, non-abstract, representations thus provide the basis for learning more complex, abstract, representations.

    A modular neural network architecture is presented as a basis for a model of development. The pattern of activity of the neurons in an individual network constitutes a representation of the input to that network. This representation is formed through a novel, unsupervised, learning algorithm which adjusts the synaptic weights to improve the representation of the input data. Representations are formed by neurons learning to respond to correlated sets of inputs. Neurons thus become feature detectors or pattern recognisers. Because the nodes respond to patterns of inputs they encode more abstract features of the input than are explicitly encoded in the input data itself. In this way simple representations provide the basis for learning more complex representations. The algorithm allows both more abstract representations to be formed by associating correlated, coincident, features together, and invariant representations to be formed by associating correlated, sequential, features together.

    The algorithm robustly learns accurate and stable representations, in a format most appropriate to the structure of the input data received: it can represent both single and multiple input features in both the discrete and continuous domains, using either topologically or non-topologically organised nodes. The output of one neural network is used to provide inputs for other networks. The robustness of the algorithm enables each neural network to be implemented using an identical algorithm. This allows a modular 'assembly' of neural networks to be used for learning more complex abilities: the output activations of a network can be used as the input to other networks which can then find representations of more abstract information within the same input data; and, by defining the output activations of neurons in certain networks to have behavioural consequences, it is possible to learn sensory-motor associations, enabling sensory representations to be used to control behaviour.
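
    The kind of correlation-driven, unsupervised update described above can be sketched in a few lines. The thesis presents its own novel algorithm; here Oja's rule is used purely as a stand-in to show how a neuron whose weights follow input correlations becomes a feature detector:

    ```python
    # Toy sketch of correlation-driven, unsupervised learning (Oja's rule as a
    # stand-in; the thesis's actual algorithm differs): the neuron's weights
    # drift toward a correlated set of inputs, a recurring "feature".
    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.normal(size=4) * 0.1      # synaptic weights of a single neuron
    eta = 0.01                        # learning rate

    for _ in range(5000):
        s = rng.normal()
        # Input channels 0 and 1 carry the same signal: a correlated feature.
        x = np.array([s, s, rng.normal(), rng.normal()])
        y = w @ x                     # neuron activation
        w += eta * y * (x - y * w)    # Oja's rule: Hebbian growth plus decay

    print(np.round(w, 2))             # weight mass settles on the correlated pair
    ```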

    Deep Learning Models of Learning in the Brain

    This thesis considers deep learning theories of brain function, and in particular biologically plausible deep learning. The idea is to treat a standard deep network as a high-level model of a neural circuit (e.g., the visual stream), adding biological constraints to some clearly artificial features. Two big questions arise. First, how can deep networks be trained in a biologically realistic manner? The standard approach, supervised training via backpropagation, needs overly complicated machinery for the backward pass as well as precise labels (which are scarce in the real world). The first result in this thesis approaches the first problem, backpropagation, by avoiding it completely. A layer-wise objective is proposed, which results in local, Hebbian weight updates that use a global error signal. The second result approaches the need for precise labels. It is focused on a principled approach to self-supervised learning, framing the problem as dependence maximisation using kernel methods. Although this is a deep learning study, it is relevant to neuroscience: self-supervised learning appears to be a suitable learning paradigm for the brain, as it only requires binary (same source or not) teaching signals for pairs of inputs. Second, how realistic is the architecture itself? For instance, most well-performing networks have some form of weight sharing - having the same weights for different neurons at all times. Convolutional networks share filter weights among neurons, and transformers do so for matrix-matrix products. While the operation is biologically implausible, the third result of this thesis shows that it can be successfully approximated with a separate phase of weight-sharing-inducing Hebbian learning.
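
    The locality of such updates can be illustrated with a generic three-factor rule: each weight update combines its own presynaptic activity, a term from its own postsynaptic unit, and one broadcast scalar, with no error propagated backwards through the weights. The sketch below is our own toy regression, not the thesis's layer-wise objective:

    ```python
    # Generic three-factor sketch of "local, Hebbian weight updates that use a
    # global error signal" (our toy illustration, not the thesis's objective).
    import numpy as np

    rng = np.random.default_rng(1)
    W1 = rng.normal(size=(8, 4)) * 0.1     # input -> hidden weights
    w2 = rng.normal(size=8) * 0.1          # hidden -> output weights
    eta = 0.01

    for _ in range(5000):
        x = rng.normal(size=4)
        target = x.sum()                   # toy regression target
        h = np.tanh(W1 @ x)                # hidden activity (locally available)
        y = w2 @ h                         # scalar output
        g = target - y                     # the single, globally broadcast error
        w2 += eta * g * h                  # local: presynaptic h, gated by g
        # Each hidden unit treats the broadcast g as its own teaching signal:
        # presynaptic x, postsynaptic gain (1 - h^2), modulator g.
        W1 += eta * g * np.outer(1 - h ** 2, x)

    # Mean absolute error on fresh inputs; typically well below the untrained
    # baseline of about 1.6 for this target distribution.
    xs = rng.normal(size=(200, 4))
    print(float(np.mean(np.abs(xs.sum(axis=1) - np.tanh(xs @ W1.T) @ w2))))
    ```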

    Relation learning and reasoning on computational models of high level cognition

    Relational reasoning is central to many cognitive processes, ranging from “lower” processes like object recognition to “higher” processes such as analogy-making and sequential decision-making. The first chapter of this thesis gives an overview of relational reasoning and the computational demands it imposes on any system that performs it. These demands are characterized in terms of the binding problem in neural networks. There has been a longstanding debate in the literature regarding whether neural network models of cognition are, in principle, capable of relation-based processing. In the second chapter I investigate the relational reasoning capabilities of the Story Gestalt model (St. John, 1992), a classic connectionist model of text comprehension, and a Seq-to-Seq model, a deep neural network of text processing (Bahdanau, Cho, & Bengio, 2015). In both cases I found that the purportedly relational behavior of the models was explainable by the statistics of their training datasets. I propose that both models fail at relational processing because of the binding problem in neural networks. In the third chapter of this thesis, I present an updated version of the DORA architecture (Doumas, Hummel, & Sandhofer, 2008), a symbolic-connectionist model of relation learning and inference that uses temporal synchrony to solve the binding problem. I use this model to perform relational policy transfer between two Atari games. Finally, in the fourth chapter I present a model of relational reinforcement learning that is able to select relevant relations, from a potentially large pool of applicable relations, to characterize a problem and learn simple rules from the reward signal, helping to bridge the gap between reinforcement learning and relational reasoning.
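
    The temporal-synchrony solution to the binding problem can be made concrete in a few lines: if a role unit and a filler unit fire in the same time slot, the binding is carried by timing alone, with no dedicated conjunctive unit per role-filler pair. The toy encoding below is our own illustration, not DORA's actual dynamics:

    ```python
    # Toy illustration of binding by temporal synchrony: the proposition
    # larger(dog, cat) is carried by which units fire together in each phase.
    phases = {
        0: {"role:larger", "filler:dog"},    # phase 0: dog fills the larger role
        1: {"role:smaller", "filler:cat"},   # phase 1: cat fills the smaller role
    }

    def read_bindings(phase_sets):
        """Recover role-filler pairs from co-activation within each phase."""
        bindings = []
        for _, active in sorted(phase_sets.items()):
            roles = [u for u in active if u.startswith("role:")]
            fillers = [u for u in active if u.startswith("filler:")]
            bindings += [(r.split(":", 1)[1], f.split(":", 1)[1])
                         for r in roles for f in fillers]
        return bindings

    print(read_bindings(phases))   # [('larger', 'dog'), ('smaller', 'cat')]
    ```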

    Backwards is the way forward: feedback in the cortical hierarchy predicts the expected future

    Clark offers a powerful description of the brain as a prediction machine, one that makes progress on two distinct levels. First, on an abstract conceptual level, it provides a unifying framework for perception, action, and cognition (including subdivisions such as attention, expectation, and imagination). Second, on a concrete descriptive level, hierarchical prediction offers a way to test and constrain the conceptual elements and mechanisms of predictive coding models (estimation of predictions, prediction errors, and internal models).
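
    A one-variable sketch shows the descriptive level in question: a higher-level estimate generates a prediction of the input, the discrepancy is an explicit prediction error, and feedback revises the estimate until the error is explained away. This is the generic textbook scheme, assumed here for illustration, not a model from the target article:

    ```python
    # Minimal predictive-coding sketch (generic textbook form): inference
    # descends the prediction error until prediction and observation agree.
    g = 2.0                 # generative weight: prediction = g * mu
    x = 1.0                 # observed input
    mu = 0.0                # higher-level cause, to be inferred
    lr = 0.1

    for _ in range(100):
        eps = x - g * mu    # prediction error, passed up the hierarchy
        mu += lr * g * eps  # feedback revises the estimate to explain the input

    print(round(mu, 3), round(x - g * mu, 4))   # mu -> 0.5, error -> ~0
    ```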

    Biologically Plausible Cortical Hierarchical-Classifier Circuit Extensions in Spiking Neurons

    Hierarchical categorization interleaved with sequence recognition of incoming stimuli in the mammalian brain is theorized to be performed by circuits composed of the thalamus and the six-layer cortex. Using these circuits, the cortex is thought to learn a ‘brain grammar’ composed of recursive sequences of categories. A thalamo-cortical, hierarchical classification and sequence-learning “Core” circuit, implemented as a linear matrix simulation, was published by Rodriguez, Whitson & Granger in 2004. In the brain, these functions are implemented by cortical and thalamic circuits composed of recurrently connected, spiking neurons. The Neural Engineering Framework (NEF; Eliasmith & Anderson, 2003) allows for the construction of large-scale biologically plausible neural networks. NEF models of the basal ganglia and the thalamus exist, but to the best of our knowledge there is no integrated, spiking-neuron, cortical-thalamic Core network model. We construct a more biologically plausible version of the hierarchical-classification function of the Core circuit using leaky integrate-and-fire neurons, which performs progressive visual classification of static image sequences, relying on neural activity levels to trigger the progressive classification of the stimulus. We proceed by implementing a recurrent NEF model of the cortical-thalamic Core circuit and then test the resulting model on the hierarchical categorization of images.
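
    The neuron model named here is standard, so the spiking substrate of the reimplementation can be sketched directly: a leaky integrate-and-fire unit integrates input current, leaks toward rest, and fires with a reset whenever its membrane variable crosses threshold. The generic simulation below (normalized units, constant input) is our own, not the paper's NEF circuit:

    ```python
    # Generic leaky integrate-and-fire neuron in normalized units.
    dt, tau = 0.001, 0.02       # 1 ms steps, 20 ms membrane time constant
    v_th, v_reset = 1.0, 0.0    # spiking threshold and post-spike reset
    v, spikes = 0.0, []

    for step in range(1000):    # one second of simulated time
        current = 1.5           # constant suprathreshold input current
        v += dt / tau * (-v + current)   # leaky integration toward the input
        if v >= v_th:
            spikes.append(step * dt)     # record the spike time
            v = v_reset                  # reset after firing

    print(f"{len(spikes)} spikes, first at {spikes[0]:.3f} s")
    ```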

    What auto-encoders could learn from brains - Generation as feedback in deep unsupervised learning and inference

    This thesis explores fundamental improvements in unsupervised deep learning algorithms. Taking a theoretical perspective on the purpose of unsupervised learning, and choosing learnt approximate inference in a jointly learnt directed generative model as the approach, the main question is how existing implementations of this approach, in particular auto-encoders, could be improved by simultaneously rethinking the way they learn and the way they perform inference. In such network architectures, the availability of two opposing pathways, one for inference and one for generation, makes it possible to exploit the symmetry between them and to let either provide feedback signals to the other. These signals can be used to determine helpful updates for the connection weights from only locally available information, removing the need for the conventional back-propagation path and mitigating the issues associated with it. Moreover, feedback loops can be added to the usual feed-forward network to improve inference itself. The reciprocal connectivity between regions in the brain's neocortex provides inspiration for how the iterative revision and verification of proposed interpretations could result in a fair approximation to optimal Bayesian inference. While extracting and combining underlying ideas from research in deep learning and cortical functioning, this thesis walks through the concepts of generative models, approximate inference, local learning rules, target propagation, recirculation, lateral and biased competition, predictive coding, iterative and amortised inference, and other related topics, in an attempt to build up a complex of insights that could provide direction to future research in unsupervised deep learning methods.
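
    One of the listed concepts, recirculation, shows how the two opposing pathways can provide each other's teaching signals. In the simplified, tied-weight sketch below (after Hinton & McClelland's 1988 idea; the details are our assumptions, not the thesis's model), every update uses only the activities at a synapse's two ends, with no separate back-propagation path:

    ```python
    # Simplified recirculation sketch: the recognition pass infers h from x,
    # the generative pass reconstructs x_hat from h, and recirculating x_hat
    # gives h_rec. Both weight updates are local to the synapse.
    import numpy as np

    rng = np.random.default_rng(2)
    W = rng.normal(size=(3, 8)) * 0.1          # recognition weights; W.T generates
    protos = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                       [0, 0, 0, 0, 1, 1, 1, 1]], dtype=float)
    eta = 0.05

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    for _ in range(3000):
        x = protos[rng.integers(2)]            # one of two recurring patterns
        h = sigmoid(W @ x)                     # inference (recognition) pass
        x_hat = sigmoid(W.T @ h)               # generation (feedback) pass
        h_rec = sigmoid(W @ x_hat)             # recirculated hidden activity
        W += eta * np.outer(h, x - x_hat)      # generative update from local error
        W += eta * np.outer(h - h_rec, x_hat)  # recognition update from recirculation

    x = protos[0]
    print(np.round(sigmoid(W.T @ sigmoid(W @ x)), 2))  # reconstruction of pattern 0
    ```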

    A Unified Cognitive Model of Visual Filling-In Based on an Emergic Network Architecture

    The Emergic Cognitive Model (ECM) is a unified computational model of visual filling-in based on the Emergic Network architecture. The Emergic Network was designed to help realize systems undergoing continuous change. In this thesis, eight different filling-in phenomena are demonstrated under a regime of continuous eye movement (and under static eye conditions as well). ECM indirectly demonstrates the power of unification inherent in Emergic Networks when cognition is decomposed according to finer-grained functions supporting change. These can interact to raise additional emergent behaviours via cognitive re-use, hence the Emergic prefix throughout. Nevertheless, the model is robust and parameter-free. Differential re-use occurs in the nature of the model's interaction with a particular testing paradigm. ECM has a novel decomposition due to the requirements of handling motion and of supporting unified modelling via finer functional grains. The breadth of phenomenal behaviour covered is largely to lend credence to our novel decomposition. The Emergic Network architecture is a hybrid between classical connectionism and classical computationalism that facilitates the construction of unified cognitive models. It helps cut functionalism into finer grains distributed over space (by harnessing massive recurrence) and over time (by harnessing continuous change), yet simplifies by using standard computer code to focus on the interaction of information flows. Thus, while the structure of the network looks neurocentric, the dynamics are best understood in flowcentric terms. Surprisingly, dynamical systems analysis (as usually understood) is not involved. An Emergic Network is engineered much like straightforward software or hardware systems that deal with continuously varying inputs. Ultimately, this thesis addresses the problem of reduction and induction over complex systems, and the Emergic Network architecture is merely a tool to assist in this epistemic endeavour. ECM is strictly a sensory model, apart from perception, yet it is informed by phenomenology. It addresses the attribution problem of how much of a phenomenon is best explained at a sensory level of analysis rather than at a perceptual one. As the causal information flows are stable under eye movement, we hypothesize that they are the locus of consciousness, howsoever it is ultimately realized.