8 research outputs found

    Reservoir Memory Machines as Neural Computers

    Differentiable neural computers extend artificial neural networks with an explicit memory without interference, thus enabling the model to perform classic computation tasks such as graph traversal. However, such models are difficult to train, requiring long training times and large datasets. In this work, we achieve some of the computational capabilities of differentiable neural computers with a model that can be trained very efficiently, namely an echo state network with an explicit memory without interference. This extension enables echo state networks to recognize all regular languages, including those that contractive echo state networks provably cannot recognize. Further, we demonstrate experimentally that our model performs comparably to its fully-trained deep version on several typical benchmark tasks for differentiable neural computers. Comment: In print at the special issue 'New Frontiers in Extremely Efficient Reservoir Computing' of IEEE TNNL
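
The training-efficiency claim rests on the standard reservoir-computing recipe: the recurrent weights stay fixed and random, and only a linear readout is fit. A minimal sketch of that baseline echo state network follows (the paper's explicit external memory is omitted; the sizes and the toy delay-recall task are illustrative assumptions, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes; the paper's actual dimensions may differ.
n_in, n_res = 1, 100

# Fixed random reservoir, rescaled to spectral radius < 1 so the
# network satisfies the echo state (contractivity) property.
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
W_in = rng.normal(size=(n_res, n_in))

def run_reservoir(u):
    """Drive the reservoir with input sequence u (T x n_in); return all states."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W @ x + W_in @ u_t)
        states.append(x)
    return np.array(states)

# Only the linear readout is trained, by ridge regression -- the cheap
# step that makes reservoir computing so efficient to train.
u = rng.uniform(-1, 1, size=(500, n_in))
y = np.roll(u, 3, axis=0)          # toy task: recall the input 3 steps ago
X = run_reservoir(u)
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)

pred = X @ W_out
```

Because the reservoir is contractive, it can only remember inputs for a bounded window; the paper's explicit memory is what lifts this limitation to all regular languages.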

    Reservoir based spiking models for univariate Time Series Classification

    A variety of advanced machine learning and deep learning algorithms achieve state-of-the-art performance on various temporal processing tasks. However, these methods are heavily energy inefficient: they run mainly on power-hungry CPUs and GPUs. Computing with spiking networks, on the other hand, has been shown to be energy efficient on specialized neuromorphic hardware, e.g., Loihi, TrueNorth, SpiNNaker, etc. In this work, we present two architectures of spiking models, inspired by the theory of Reservoir Computing and Legendre Memory Units, for the Time Series Classification (TSC) task. Our first spiking architecture is closer to the general Reservoir Computing architecture, and we successfully deploy it on Loihi; the second spiking architecture differs from the first by the inclusion of non-linearity in the readout layer. Our second model (trained with the Surrogate Gradient Descent method) shows that non-linear decoding of the linearly extracted temporal features through spiking neurons not only achieves promising results, but also offers low computation overhead by significantly reducing the number of neurons compared to popular LSM-based models: more than a 40x reduction with respect to the recent spiking model we compare with. We experiment on five TSC datasets and achieve new SoTA spiking results (as much as a 28.607% accuracy improvement on one of the datasets), thereby showing the potential of our models to address TSC tasks in a green, energy-efficient manner. In addition, we perform energy profiling and comparison on Loihi and CPU to support our claims.

    Learned Legendre Predictive State Estimator for Control

    This thesis introduces a novel method for system model identification, specifically for state estimation. The method uses a 2- or 3-layer neural network developed and trained with the methods of the Neural Engineering Framework (NEF). Using the NEF allows for direct control of what the different layers represent, with white-box modelling of the layers. NEF networks also have the added benefit of being compilable onto neuromorphic hardware, which can run on an order of magnitude (or more) less power than conventional computing hardware. The first layer of the network is optional and uses a Legendre Delay Network (LDN). The LDN implements a linear operation that performs a mathematically optimal compression of a time series of data, which in this context is the input signal to the network. This allows temporal information to be encoded and passed into the network. The LDN frames the problem of memory as delaying a signal by some length ξ seconds. Using the linear transfer function for a continuous-time delay, F(s) = e^(−ξs), the LDN compression is considered optimal as it uses Padé approximants to represent the delay, which have been proven optimal for this purpose. The LDN has been shown to outperform other memory cells, such as long short-term memory (LSTM) and gated recurrent units (GRU), by several orders of magnitude, and is capable of representing over 1,000,000 timesteps of data. The LDN forms a polynomial representation of a sliding window of length ξ, allowing for a continuous representation of the time series. The second layer uses the Learned Legendre Predictor (LLP) to make predictions of how a subset of the input signal to this layer will evolve over a future window of time. In the case of model estimation, using the system states and control signal (at minimum), the LLP layer predicts how the system states will evolve over a continuous window into the future.
The LLP uses a similar time series compression as the LDN, but of the representation of the layer's prediction into the future. The weights for the LLP layer can be trained online or offline. The third layer of the network performs the transformation out of the Legendre domain into the units of the input signal to be predicted. Since the second layer outputs a polynomial representation of the state prediction, the state at any time in the prediction window can be extracted with a linear operation. Combined, the three-layer network is referred to as the Learned Legendre Predictive State Estimator (LLPSE). The 2-layer version, without LDN context encoding, is tested online on a single-link inverted pendulum and is able to predict the angle of the arm 30 timesteps into the future while learning the system dynamics online. The 3-layer LLPSE is trained offline to predict the future position of a simulated quadrotor over a continuous window of 1 second in length. The training, validation, and test data are generated in AirSim with Unreal Engine 4. The LLPSE is able to predict the future second of a simulated quadrotor's position with an average RMSE of 0.0067 on the network's normalized representation space of position (normalized from a 30x30x15 meter volume). Future work is discussed, with initial steps provided for using the LLPSE for model predictive control (MPC). A controller, the Learned Legendre Predictive Controller (LLPC), is designed and tested for state estimation across the control space. The design and future steps of the LLPC are discussed in the final chapter. A preliminary LLPC was designed and integrated into the test suite, and is available, along with all of the code for simulator interfacing, controllers, path planning, the LLP systems, and various utility functions, at https://github.com/p3jawors/masters thesis
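
The delay line at the heart of the LDN has a compact closed form that appears throughout the LMU literature. The sketch below builds the standard LDN state-space matrices and checks, with forward-Euler integration, that summing the Legendre coefficients recovers a delayed copy of the input (the window length, order, and 1 Hz test signal are illustrative choices, not the thesis's settings):

```python
import numpy as np

def ldn_matrices(d, theta):
    """Closed-form LDN matrices: x'(t) = A x(t) + B u(t) (theta folded in)
    compresses the last theta seconds of u into d Legendre coefficients."""
    A = np.zeros((d, d))
    for i in range(d):
        for j in range(d):
            A[i, j] = (2 * i + 1) / theta * (-1.0 if i < j else (-1.0) ** (i - j + 1))
    B = (2 * np.arange(d) + 1) * (-1.0) ** np.arange(d) / theta
    return A, B

theta, d, dt = 0.5, 6, 1e-3          # 0.5 s window, order-6 approximation
A, B = ldn_matrices(d, theta)
ts = np.arange(0, 3, dt)
u = np.sin(2 * np.pi * ts)           # 1 Hz test signal
x = np.zeros(d)
decoded = []
for u_t in u:
    x = x + dt * (A @ x + B * u_t)   # forward-Euler integration
    # The shifted Legendre polynomials all equal 1 at the far edge of the
    # window, so summing the state decodes u(t - theta).
    decoded.append(x.sum())
decoded = np.array(decoded)
target = np.sin(2 * np.pi * (ts - theta))
rmse = float(np.sqrt(np.mean((decoded[ts > 1.0] - target[ts > 1.0]) ** 2)))
```

Decoding at other points inside the window uses the same state with different shifted-Legendre weights, which is what gives the LDN its continuous sliding-window representation.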

    Harnessing Neural Dynamics as a Computational Resource

    Researchers study nervous systems at levels of scale spanning several orders of magnitude, both in terms of time and space. While some parts of the brain are well understood at specific levels of description, there are few overarching theories that systematically bridge low-level mechanism and high-level function. The Neural Engineering Framework (NEF) is an attempt at providing such a theory. The NEF enables researchers to systematically map dynamical systems—corresponding to some hypothesised brain function—onto biologically constrained spiking neural networks. In this thesis, we present several extensions to the NEF that broaden both the range of neural resources that can be harnessed for spatiotemporal computation and the range of available biological constraints. Specifically, we suggest a method for harnessing the dynamics inherent in passive dendritic trees for computation, allowing us to construct single-layer spiking neural networks that, for some functions, achieve substantially lower errors than larger multi-layer networks. Furthermore, we suggest “temporal tuning” as a unifying approach to harnessing temporal resources for computation through time. This allows modellers to directly constrain networks to temporal tuning observed in nature, in ways not previously well-supported by the NEF. We then explore specific examples of neurally plausible dynamics using these techniques. In particular, we propose a new “information erasure” technique for constructing LTI systems generating temporal bases. Such LTI systems can be used to establish an optimal basis for spatiotemporal computation. We demonstrate how this captures “time cells” that have been observed throughout the brain. As well, we demonstrate the viability of our extensions by constructing an adaptive filter model of the cerebellum that successfully reproduces key features of eyeblink conditioning observed in neurobiological experiments. 
Outside the cognitive sciences, our work can help exploit resources available on existing neuromorphic computers, and inform future neuromorphic hardware design. In machine learning, our spatiotemporal NEF populations map cleanly onto the Legendre Memory Unit (LMU), a promising artificial neural network architecture for stream-to-stream processing that outperforms competing approaches. We find that one of our LTI systems derived through “information erasure” may serve as a computationally less expensive alternative to the LTI system commonly used in the LMU.
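
The NEF's core recipe for mapping dynamical systems onto recurrent populations (its "Principle 3", which this thesis extends) fits in two lines: to realize dx/dt = Ax + Bu through first-order low-pass synapses with time constant tau, use the recurrent transform A' = tau*A + I and input transform B' = tau*B. A minimal numerical check, with the neurons themselves idealized away and an arbitrary oscillator as the target system:

```python
import numpy as np

# Target LTI system: a 1 Hz harmonic oscillator, dx/dt = A x.
A = np.array([[0.0, -2 * np.pi], [2 * np.pi, 0.0]])

# NEF Principle 3 transform for a first-order low-pass synapse with
# time constant tau (for inputs, B' = tau * B analogously).
tau, dt = 0.1, 1e-3
Ap = tau * A + np.eye(2)

x_direct = np.array([1.0, 0.0])
x_synaptic = np.array([1.0, 0.0])
for _ in range(2000):
    # Direct Euler integration of the target dynamics.
    x_direct = x_direct + dt * (A @ x_direct)
    # Recurrent signal passed through the low-pass synapse:
    # dx/dt = (A' x - x) / tau, which algebraically equals A x.
    x_synaptic = x_synaptic + dt * ((Ap @ x_synaptic - x_synaptic) / tau)
```

The two trajectories coincide, which is why the synaptic filter can be "compiled away": the thesis's temporal-tuning extensions generalize this recipe to richer synapse and neuron dynamics.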

    Dynamical Systems in Spiking Neuromorphic Hardware

    Dynamical systems are universal computers. They can perceive stimuli, remember, learn from feedback, plan sequences of actions, and coordinate complex behavioural responses. The Neural Engineering Framework (NEF) provides a general recipe for formulating models of such systems as coupled sets of nonlinear differential equations and compiling them onto recurrently connected spiking neural networks, akin to a programming language for spiking models of computation. The Nengo software ecosystem supports the NEF and compiles such models onto neuromorphic hardware. In this thesis, we analyze the theory driving the success of the NEF, and expose several core principles underpinning its correctness, scalability, completeness, robustness, and extensibility. We also derive novel theoretical extensions to the framework that enable it to far more effectively leverage a wide variety of dynamics in digital hardware, and to exploit device-level physics in analog hardware. At the same time, we propose a novel set of spiking algorithms that recruit an optimal nonlinear encoding of time, which we call the Delay Network (DN). Backpropagation across stacked layers of DNs dramatically outperforms stacked Long Short-Term Memory (LSTM) networks, a state-of-the-art deep recurrent architecture, in accuracy and training time on a continuous-time memory task and a chaotic time-series prediction benchmark. The basic component of this network is shown to function on state-of-the-art spiking neuromorphic hardware, including Braindrop and Loihi. This implementation approaches the energy efficiency of the human brain in the former case, and the precision of conventional computation in the latter.

    Identifying Social Signals from Human Body Movements for Intelligent Technologies

    Numerous Human-Computer Interaction (HCI) contexts require the identification of human internal states such as emotions, intentions, confusion, and task engagement. Recognition of these states allows artificial agents and interactive systems to provide appropriate responses to their human interaction partner. Whilst numerous solutions have been developed, many of these have been designed to classify internal states in a binary fashion, i.e. stating whether or not an internal state is present. A potential drawback of these approaches is that they provide a restricted, reductionist view of the internal states being experienced by a human user. As a result, an interactive agent which makes response decisions based on such a binary recognition system would be restricted in terms of the flexibility and appropriateness of its responses. Thus, in many settings, internal state recognition systems would benefit from being able to recognize multiple different 'intensities' of an internal state. However, for most classical machine learning approaches, this requires that a recognition system be trained on examples from every intensity (e.g. high, medium, and low intensity task engagement). Obtaining such a training dataset can be both time- and resource-intensive. This project set out to explore whether this data requirement could be reduced whilst still providing an artificial recognition system able to provide multiple classification labels. To this end, this project first identified a set of internal states that could be recognized from human behaviour information available in a pre-existing dataset. These explorations revealed that states relating to task engagement could be identified, by human observers, from human movement and posture information.
A second set of studies was then dedicated to developing and testing different approaches to classifying three intensities of task engagement (high, intermediate, and low) after training only on examples from the high and low task engagement datasets. The result of these studies was the development of an approach incorporating the recently developed Legendre Memory Units, which was shown to produce an output that could distinguish between all three task engagement intensities after being trained only on examples of high- and low-intensity task engagement. This project thus presents foundational work for internal state recognition systems which require less data whilst providing more classification labels.

    Learning and Decision Making in Social Contexts: Neural and Computational Models

    Social interaction is one of humanity's defining features. Through it, we develop ideas, express emotions, and form relationships. In this thesis, we explore the topic of social cognition by building biologically-plausible computational models of learning and decision making. Our goal is to develop mechanistic explanations for how the brain performs a variety of social tasks, to test those theories by simulating neural networks, and to validate our models by comparing to human and animal data. We begin by introducing social cognition from functional and anatomical perspectives, then present the Neural Engineering Framework, which we use throughout the thesis to specify functional brain models. Over the course of four chapters, we investigate many aspects of social cognition using these models. We begin by studying fear conditioning using an anatomically accurate model of the amygdala. We validate this model by comparing the response properties of our simulated neurons with real amygdala neurons, showing that simulated behavior is consistent with animal data, and exploring how simulated fear generalization relates to normal and anxious humans. Next, we show that biologically-detailed networks may realize cognitive operations that are essential for social cognition. We validate this approach by constructing a working memory network from multi-compartment cells and conductance-based synapses, then show that its mnemonic performance is comparable to animals performing a delayed match-to-sample task. In the next chapter, we study decision making and the tradeoffs between speed and accuracy: our network gathers information from the environment and tracks the value of choice alternatives, making a decision once certain criteria are met. We apply this model to a two-choice decision task, fit model parameters to recreate the behavior of individual humans, and reproduce the speed-accuracy tradeoff evident in the human population. 
Finally, we combine our networks for learning, working memory, and decision making into a cognitive agent that uses reinforcement learning to play a simple social game. We compare this model with two other cognitive architectures and with human data from an experiment we ran, and show that our three cognitive agents recreate important patterns in the human data, especially those related to social value orientation and cooperative behavior. Our concluding chapter summarizes our contributions to the field of social cognition and proposes directions for further research. The main contribution of this thesis is the demonstration that a diverse set of social cognitive abilities may be explained, simulated, and validated using a functionally-descriptive, biologically-plausible theoretical framework. Our models lay a foundation for studying increasingly sophisticated forms of social cognition in future work.
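
The speed-accuracy tradeoff described above is the signature behaviour of bounded evidence accumulation. A toy drift-diffusion sketch (not the thesis's spiking NEF model; drift, noise, and thresholds are arbitrary illustrative values) shows how raising the decision criterion trades response time for accuracy:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(threshold, drift=0.5, noise=1.0, dt=2e-3, n_trials=500):
    """Accumulate noisy evidence until |x| crosses the threshold; the sign
    of the crossing gives the choice, the elapsed time the response time."""
    n_correct, rts = 0, []
    for _ in range(n_trials):
        x, t = 0.0, 0.0
        while abs(x) < threshold:
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        n_correct += x > 0          # drift is positive, so "+" is correct
        rts.append(t)
    return n_correct / n_trials, float(np.mean(rts))

# A low threshold gives fast but error-prone decisions; a high
# threshold gives slow but accurate ones.
acc_low, rt_low = simulate(threshold=0.5)
acc_high, rt_high = simulate(threshold=1.5)
```

Fitting the threshold (and drift) per participant is the standard way such models recreate individual human speed-accuracy behaviour.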

    Estimating levels of engagement for social human-robot interaction using Legendre memory units

    In this study, we examine whether the data requirements associated with training a system to recognize multiple 'levels' of an internal state can be reduced by training systems on the 'extremes' in a way that allows them to estimate 'intermediate' classes as falling in between the trained extremes. Specifically, this study explores whether a novel recurrent neural network, the Legendre Delay Network, added as a pre-processing step to a Multi-Layer Perceptron, produces an output which can be used to separate an untrained intermediate class of task engagement from the trained extreme classes. The results showed that identifying untrained classes after training on the extremes is feasible, particularly when using the Legendre Delay Network. HRI '21: ACM/IEEE International Conference on Human-Robot Interaction (March 2021).
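
The premise can be illustrated with a deliberately simplified stand-in: a model trained to separate only the two extreme classes still produces a continuous output score that ranks an unseen intermediate class between them. The sketch below uses a hypothetical one-dimensional 'engagement feature' and plain logistic regression rather than the authors' LDN-plus-MLP pipeline:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 1-D "movement feature" whose mean shifts with engagement.
low  = rng.normal(-2.0, 1.0, size=(200, 1))
high = rng.normal(+2.0, 1.0, size=(200, 1))
mid  = rng.normal( 0.0, 1.0, size=(200, 1))   # never shown during training

X = np.vstack([low, high])
y = np.concatenate([np.zeros(200), np.ones(200)])

# Minimal logistic regression, trained by gradient descent on extremes only.
w, b = 0.0, 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X[:, 0] * w + b)))
    w -= 0.1 * np.mean((p - y) * X[:, 0])
    b -= 0.1 * np.mean(p - y)

def score(Z):
    """Mean predicted probability of 'high engagement' for a sample set."""
    return float(np.mean(1.0 / (1.0 + np.exp(-(Z[:, 0] * w + b)))))

s_low, s_mid, s_high = score(low), score(mid), score(high)
```

The untrained intermediate class scores between the two trained extremes, so a pair of cut-points on the score suffices to assign all three labels without intermediate training data.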