
    Neuroplasticity, neural reuse, and the language module

    What conception of mental architecture can survive the evidence of neuroplasticity and neural reuse in the human brain? In particular, what sorts of modules are compatible with this evidence? I aim to show how developmental and adult neuroplasticity, as well as evidence of pervasive neural reuse, force us to revise the standard conception of modularity and spell the end of a hardwired and dedicated language module. I argue from principles of both neural reuse and neural redundancy that language is facilitated by a composite of modules (or module-like entities), few if any of which are likely to be linguistically special, and that neuroplasticity provides evidence that (in key respects and to an appreciable extent) few if any of them ought to be considered developmentally robust, though their development does seem to be constrained by features intrinsic to particular regions of cortex (manifesting as domain-specific predispositions or acquisition biases). In the course of doing so, I articulate a schematically and neurobiologically precise framework for understanding modules and their supramodular interactions.

    Center for Aeronautics and Space Information Sciences

    This report summarizes the research done during 1991/92 under the Center for Aeronautics and Space Information Science (CASIS) program. The topics covered are computer architecture, networking, and neural nets.

    On the application of neural networks to symbol systems.

    While for many years two alternative approaches to building intelligent systems, symbolic AI and neural networks, have each demonstrated specific advantages and also revealed specific weaknesses, in recent years a number of researchers have sought methods of combining the two into a unified methodology which embodies the benefits of each while attenuating the disadvantages. This work sets out to identify the key ideas from each discipline and combine them into an architecture which would be practically scalable for very large network applications. The architecture is based on a relational database structure and forms the environment for an investigation into the necessary properties of a symbol encoding which will permit the single-presentation learning of patterns and associations, the development of categories and features leading to robust generalisation, and the seamless integration of a range of memory persistencies from short to long term. It is argued that if, as proposed by many proponents of symbolic AI, the symbol encoding must be causally related to its syntactic meaning, then it must also be mutable as the network learns and grows, adapting to the growing complexity of the relationships in which it is instantiated. Furthermore, it is argued that in order to create an efficient and coherent memory structure, the symbolic encoding itself must have an underlying structure which is not accessible symbolically; this structure would provide the framework permitting structurally sensitive processes to act upon symbols without explicit reference to their content. Such a structure must dictate how new symbols are created during normal operation. The network implementation proposed is based on K-from-N codes, which are shown to possess a number of desirable qualities and are well matched to the requirements of the symbol encoding.
    Several networks are developed and analysed to exploit these codes, based around a recurrent version of the non-holographic associative memory of Willshaw, et al. The simplest network is shown to have properties similar to those of a Hopfield network, but the storage capacity is shown to be greater, though at a cost of lower signal-to-noise ratio. Subsequent network additions break each K-from-N pattern into L subsets, each using D-from-N coding, creating cyclic patterns of period L. This step increases the capacity still further, again at a cost of lower signal-to-noise ratio. The use of the network in associating pairs of input patterns with any given output pattern, an architectural requirement, is verified. The use of complex synaptic junctions is investigated as a means to increase storage capacity, to address the stability-plasticity dilemma, and to implement the hierarchical aspects of the symbol encoding defined in the architecture. A wide range of options is developed which allow a number of key global parameters to be traded off. One scheme is analysed and simulated. A final section examines some of the elements that need to be added to our current understanding of neural network-based reasoning systems to make general-purpose intelligent systems possible. It is argued that the sections of this work represent pieces of the whole in this regard and that their integration will provide a sound basis for making such systems a reality.
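    The Willshaw-style associative memory with sparse K-from-N codes mentioned above can be sketched in a few lines. This is a minimal illustration of the general technique, not code from the thesis; all parameter values (N, K, number of stored pairs) are illustrative assumptions.

    ```python
    # Minimal sketch of a Willshaw non-holographic associative memory
    # storing sparse K-from-N binary codes (illustrative parameters).
    import numpy as np

    rng = np.random.default_rng(0)
    N, K, P = 256, 8, 20            # code length, active bits, stored pairs

    def k_from_n(n, k):
        """Random binary vector with exactly k of n bits set."""
        v = np.zeros(n, dtype=np.uint8)
        v[rng.choice(n, size=k, replace=False)] = 1
        return v

    keys = [k_from_n(N, K) for _ in range(P)]
    vals = [k_from_n(N, K) for _ in range(P)]

    # Clipped Hebbian storage: a weight is switched on (and stays on) if
    # any stored pair had that key bit and value bit active together.
    W = np.zeros((N, N), dtype=np.uint8)
    for x, y in zip(keys, vals):
        W |= np.outer(y, x)

    def recall(x, k=K):
        """Sum the active inputs per unit, then keep the k strongest."""
        s = W @ x
        out = np.zeros(N, dtype=np.uint8)
        out[np.argsort(s)[-k:]] = 1
        return out

    # At this low memory loading, every stored value is recalled exactly.
    errors = sum(int((recall(x) != y).any()) for x, y in zip(keys, vals))
    ```

    The signal-to-noise trade-off the abstract describes is visible here: each stored pair switches on at most K×K weights, so capacity grows until spurious "on" weights make non-value units reach the same dendritic sum as true value units.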

    Dynamical Systems in Spiking Neuromorphic Hardware

    Dynamical systems are universal computers. They can perceive stimuli, remember, learn from feedback, plan sequences of actions, and coordinate complex behavioural responses. The Neural Engineering Framework (NEF) provides a general recipe to formulate models of such systems as coupled sets of nonlinear differential equations and compile them onto recurrently connected spiking neural networks, akin to a programming language for spiking models of computation. The Nengo software ecosystem supports the NEF and compiles such models onto neuromorphic hardware. In this thesis, we analyze the theory driving the success of the NEF, and expose several core principles underpinning its correctness, scalability, completeness, robustness, and extensibility. We also derive novel theoretical extensions to the framework that enable it to far more effectively leverage a wide variety of dynamics in digital hardware, and to exploit the device-level physics in analog hardware. At the same time, we propose a novel set of spiking algorithms that recruit an optimal nonlinear encoding of time, which we call the Delay Network (DN). Backpropagation across stacked layers of DNs dramatically outperforms stacked Long Short-Term Memory (LSTM) networks, a state-of-the-art deep recurrent architecture, in accuracy and training time on a continuous-time memory task and a chaotic time-series prediction benchmark. The basic component of this network is shown to function on state-of-the-art spiking neuromorphic hardware including Braindrop and Loihi. This implementation approaches the energy-efficiency of the human brain in the former case, and the precision of conventional computation in the latter case.
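    The core NEF recipe for dynamics, compiling a desired linear system dx/dt = Ax + Bu onto a recurrent population whose synapses act as a first-order low-pass filter with time constant tau, amounts to the transforms A' = tau*A + I and B' = tau*B. The sketch below (illustrative values, plain numpy rather than Nengo, and with the neural encoding/decoding abstracted away) shows the mapping reproducing a perfect integrator:

    ```python
    # NEF dynamics mapping: implement dx/dt = A x + B u through a
    # first-order synapse with time constant tau by using the
    # transformed matrices A' = tau*A + I and B' = tau*B.
    import numpy as np

    tau, dt = 0.1, 0.001
    A = np.array([[0.0]])           # desired dynamics: a pure integrator
    B = np.array([[1.0]])
    Ap = tau * A + np.eye(1)        # recurrent transform through the synapse
    Bp = tau * B                    # input transform through the synapse

    x = np.zeros(1)
    u = np.ones(1)                  # constant unit input
    for _ in range(1000):           # simulate 1 second
        # discretized first-order synapse: x += dt/tau * (drive - x)
        x += (dt / tau) * (Ap @ x + Bp @ u - x)
    ```

    Substituting A' and B' into the synapse update cancels the filter's own decay, leaving exactly x += dt*(Ax + Bu); after one second of unit input the integrator state reaches 1.0.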

    On challenges in training recurrent neural networks

    In a multi-step prediction problem, the prediction at each time step can depend on the input at any of the previous time steps, arbitrarily far in the past. Modelling such long-term dependencies is one of the fundamental problems in machine learning. In theory, Recurrent Neural Networks (RNNs) can model any long-term dependency. In practice, they can only model short-term dependencies, due to the problem of vanishing and exploding gradients. This thesis explores the problem of vanishing gradients in recurrent neural networks and proposes novel solutions for it. Chapter 3 explores the idea of using external memory to store the hidden states of a Long Short-Term Memory (LSTM) network. By making the read and write operations of the external memory discrete, the proposed architecture reduces the rate at which gradients vanish in an LSTM. These discrete operations also enable the network to create dynamic skip connections across time. Chapter 4 attempts to characterize all the sources of vanishing gradients in a recurrent neural network and proposes a new recurrent architecture which has significantly better gradient flow than state-of-the-art recurrent architectures. The proposed Non-saturating Recurrent Units (NRUs) have no saturating activation functions and use additive cell updates instead of multiplicative cell updates. Chapter 5 discusses the challenges of using recurrent neural networks in the context of lifelong learning, where the network is expected to learn a series of tasks over its lifetime. The dependencies in lifelong learning are not just within a task, but also across tasks. This chapter discusses the two fundamental problems in lifelong learning: (i) catastrophic forgetting of old tasks, and (ii) network capacity saturation.
    Further, it proposes a solution to both these problems while training a recurrent neural network.
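    The vanishing-gradient phenomenon this abstract addresses is easy to demonstrate numerically: the Jacobian of the hidden state h_T with respect to h_0 is a product of T per-step Jacobians, each typically contractive for a saturating activation, so its norm shrinks roughly exponentially in T. A small illustrative sketch (not from the thesis; weight scale and dimensions are arbitrary assumptions):

    ```python
    # Why gradients vanish in a vanilla tanh RNN: the Jacobian of h_T
    # w.r.t. h_0 is a product of T per-step Jacobians diag(1 - h^2) @ W,
    # whose norm tends to shrink exponentially with sequence length T.
    import numpy as np

    rng = np.random.default_rng(1)
    d = 32
    W = rng.normal(scale=0.5 / np.sqrt(d), size=(d, d))  # recurrent weights

    def jacobian_norm(T):
        """Spectral norm of d h_T / d h_0 for h_t = tanh(W h_{t-1})."""
        h = rng.normal(size=d)
        J = np.eye(d)
        for _ in range(T):
            h = np.tanh(W @ h)
            J = (np.diag(1.0 - h**2) @ W) @ J   # chain rule, one step
        return np.linalg.norm(J, 2)

    short_norm, long_norm = jacobian_norm(5), jacobian_norm(50)
    ```

    The additive cell updates of the NRU (and of the LSTM cell path) sidestep this by replacing the repeated multiplication by a contractive Jacobian with a sum, so gradient magnitude need not decay with depth in time.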

    On microelectronic self-learning cognitive chip systems

    After a brief review of machine learning techniques and applications, this Ph.D. thesis examines several approaches, developed within our laboratory, to implementing machine learning architectures and algorithms in hardware. Building on this interdisciplinary background, we motivate novel approaches that we intend to pursue: innovative hardware implementations of dynamically self-reconfigurable logic for enhanced self-adaptive, self-(re)organizing, and eventually self-assembling machine learning systems, thereby developing this new area of research. After reviewing relevant background on robotic control methods and the most recent advanced cognitive controllers, the thesis suggests that, among the many well-known ways of designing operational technologies, the design methodologies for leading-edge devices such as cognitive chips, which may well lead to intelligent machines exhibiting conscious phenomena, should crucially be restricted to extremely well-defined constraints. Roboticists also need these constraints as specifications, to help decide upfront on otherwise unbounded hardware/software design details. In addition, and most importantly, we propose these specifications as methodological guidelines tightly related to ethics and to the now well-identified workings of the human body and its psyche.

    Artificial general intelligence: Proceedings of the Second Conference on Artificial General Intelligence, AGI 2009, Arlington, Virginia, USA, March 6-9, 2009

    Artificial General Intelligence (AGI) research focuses on the original and ultimate goal of AI: to create broad human-like and transhuman intelligence, by exploring all available paths, including theoretical and experimental computer science, cognitive science, neuroscience, and innovative interdisciplinary methodologies. Due to the difficulty of this task, for the last few decades the majority of AI researchers have focused on what has been called narrow AI, the production of AI systems displaying intelligence regarding specific, highly constrained tasks. In recent years, however, more and more researchers have recognized the necessity, and feasibility, of returning to the original goals of the field. Increasingly, there is a call for a transition back to confronting the more difficult issues of human-level intelligence and, more broadly, artificial general intelligence.