
    Particle-filtering approaches for nonlinear Bayesian decoding of neuronal spike trains

    The number of neurons that can be simultaneously recorded doubles every seven years. This ever-increasing number of recorded neurons opens up the possibility of addressing new questions and extracting higher-dimensional stimuli from the recordings. When neural spike trains are modeled as point processes, the task of extracting dynamical signals from spike trains is commonly set in the context of nonlinear filtering theory. Particle filter methods relying on importance weights are generic algorithms that solve the filtering task numerically, but they exhibit a serious drawback when the problem dimensionality is high: they are known to suffer from the 'curse of dimensionality' (COD), i.e. the number of particles required for a certain performance scales exponentially with the number of observable dimensions. Here, we first briefly review the theory of filtering with point-process observations in continuous time. Based on this theory, we investigate both analytically and numerically the reason for the COD of weighted particle filtering approaches: as in particle filtering with continuous-time observations, the COD with point-process observations is due to the decay of the effective number of particles, an effect that is stronger when the number of observable dimensions increases. Given the success of unweighted particle filtering approaches in overcoming the COD for continuous-time observations, we introduce an unweighted particle filter for point-process observations, the spike-based Neural Particle Filter (sNPF), and show that it exhibits a similarly favorable scaling as the number of dimensions grows. Further, we derive rules for the parameters of the sNPF from a maximum-likelihood learning approach. Finally, we employ a simple decoding task to illustrate the capabilities of the sNPF and to highlight one possible future application of our inference and learning algorithm.
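
    The decay of the effective number of particles mentioned in this abstract can be illustrated with a small sketch. It is not the paper's model: as a stated assumption, it swaps the point-process observations for a toy Gaussian observation with unit noise, draws particles from a standard normal prior, and applies a single importance-weight update. The function names and parameters are hypothetical.

```python
import math
import random


def effective_sample_size(log_weights):
    """Return 1 / sum(w_i^2) for the normalized importance weights w_i."""
    m = max(log_weights)  # subtract the maximum for numerical stability
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return total * total / sum(x * x for x in w)


def ess_after_one_update(dim, n_particles=1000, seed=0):
    """Weight prior particles by one toy Gaussian observation and return the ESS.

    Assumption: standard normal prior and unit observation noise in every
    dimension; this is an illustrative stand-in, not a point-process model.
    """
    rng = random.Random(seed)
    truth = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    obs = [t + rng.gauss(0.0, 1.0) for t in truth]
    log_w = []
    for _ in range(n_particles):
        particle = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Gaussian log-likelihood up to an additive constant
        log_w.append(-0.5 * sum((o - p) ** 2 for o, p in zip(obs, particle)))
    return effective_sample_size(log_w)


if __name__ == "__main__":
    for dim in (1, 5, 20, 50):
        print(dim, round(ess_after_one_update(dim), 1))
```

    In this toy setting the effective number of particles should shrink quickly as `dim` grows, which is the qualitative behavior the abstract attributes to weighted particle filters.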

    Deep Reinforcement Learning for Swarm Systems

    Recently, deep reinforcement learning (RL) methods have been applied successfully to multi-agent scenarios. Typically, these methods rely on a concatenation of agent states to represent the information content required for decentralized decision making. However, concatenation scales poorly to swarm systems with a large number of homogeneous agents, as it does not exploit the fundamental properties inherent to these systems: (i) the agents in the swarm are interchangeable and (ii) the exact number of agents in the swarm is irrelevant. Therefore, we propose a new state representation for deep multi-agent RL based on mean embeddings of distributions. We treat the agents as samples of a distribution and use the empirical mean embedding as input for a decentralized policy. We define different feature spaces of the mean embedding using histograms, radial basis functions and a neural network learned end-to-end. We evaluate the representation on two well-known problems from the swarm literature (rendezvous and pursuit evasion), in a globally and a locally observable setup. For the local setup we furthermore introduce simple communication protocols. Of all approaches, the mean embedding representation using neural network features enables the richest information exchange between neighboring agents, facilitating the development of more complex collective strategies.
    Comment: 31 pages, 12 figures, version 3 (published in JMLR Volume 20
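
    A minimal sketch of an empirical mean embedding with radial basis function features may help make the two properties concrete (interchangeable agents, irrelevant agent count). It assumes scalar agent states and a hand-picked grid of centers; `rbf_features`, `mean_embedding`, `centers` and `width` are hypothetical names, not taken from the paper.

```python
import math


def rbf_features(state, centers, width=1.0):
    """Map one scalar agent state to a vector of Gaussian RBF activations."""
    return [math.exp(-((state - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def mean_embedding(agent_states, centers, width=1.0):
    """Average the feature vectors of all agents (empirical mean embedding)."""
    feats = [rbf_features(s, centers, width) for s in agent_states]
    n = len(feats)
    return [sum(column) / n for column in zip(*feats)]


if __name__ == "__main__":
    centers = [0.0, 0.5, 1.0]  # assumed feature grid
    few = mean_embedding([0.1, 0.9], centers)
    many = mean_embedding([0.1, 0.4, 0.6, 0.9, 0.2], centers)
    # Both embeddings have len(centers) entries, whatever the swarm size.
    print(len(few), len(many))
```

    Because the features are averaged, reordering the agents leaves the embedding unchanged up to floating-point rounding, and its length depends only on the number of features, so a policy taking it as input does not need to know how many agents are observed.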

    Learning Task Specifications from Demonstrations

    Real-world applications often naturally decompose into several sub-tasks. In many settings (e.g., robotics) demonstrations provide a natural way to specify the sub-tasks. However, most methods for learning from demonstrations either do not provide guarantees that the artifacts learned for the sub-tasks can be safely recombined or limit the types of composition available. Motivated by this deficit, we consider the problem of inferring Boolean non-Markovian rewards (also known as logical trace properties or specifications) from demonstrations provided by an agent operating in an uncertain, stochastic environment. Crucially, specifications admit well-defined composition rules that are typically easy to interpret. In this paper, we formulate the specification inference task as a maximum a posteriori (MAP) probability inference problem, apply the principle of maximum entropy to derive an analytic demonstration likelihood model, and give an efficient approach to search for the most likely specification in a large candidate pool of specifications. In our experiments, we demonstrate how learning specifications can help avoid common problems that often arise due to ad hoc reward composition.
    Comment: NIPS 201

    Long-term Blood Pressure Prediction with Deep Recurrent Neural Networks

    Existing methods for arterial blood pressure (BP) estimation directly map the input physiological signals to output BP values without explicitly modeling the underlying temporal dependencies in BP dynamics. As a result, these models suffer from accuracy decay over long periods and thus require frequent calibration. In this work, we address this issue by formulating BP estimation as a sequence prediction problem in which both the input and the target are temporal sequences. We propose a novel deep recurrent neural network (RNN) consisting of multilayered Long Short-Term Memory (LSTM) networks, which incorporate (1) a bidirectional structure to access larger-scale context information of the input sequence, and (2) residual connections to allow gradients in the deep RNN to propagate more effectively. The proposed deep RNN model was tested on a static BP dataset, and it achieved a root mean square error (RMSE) of 3.90 and 2.66 mmHg for systolic BP (SBP) and diastolic BP (DBP) prediction respectively, surpassing the accuracy of traditional BP prediction models. On a multi-day BP dataset, the deep RNN achieved RMSEs of 3.84, 5.25, 5.80 and 5.81 mmHg for SBP prediction on the 1st day, 2nd day, 4th day and 6th month after the 1st day, and 1.80, 4.78, 5.0 and 5.21 mmHg for the corresponding DBP prediction, outperforming all previous models with notable improvement. The experimental results suggest that modeling the temporal dependencies in BP dynamics significantly improves long-term BP prediction accuracy.
    Comment: To appear in IEEE BHI 201

    Momentum Control with Hierarchical Inverse Dynamics on a Torque-Controlled Humanoid

    Hierarchical inverse dynamics based on cascades of quadratic programs have been proposed for the control of legged robots. They have important benefits but, to the best of our knowledge, have never been implemented on a torque-controlled humanoid, where model inaccuracies, sensor noise and real-time computation requirements can be problematic. Using a reformulation of existing algorithms, we propose a simplification of the problem that makes real-time control achievable. Momentum-based control is integrated in the task hierarchy, and an LQR design approach is used to compute the desired associated closed-loop behavior and improve performance. Extensive experiments on various balancing and tracking tasks show very robust performance in the face of unknown disturbances, even when the humanoid is standing on one foot. Our results demonstrate that hierarchical inverse dynamics together with momentum control can be efficiently used for feedback control under real robot conditions.
    Comment: 21 pages, 11 figures, 4 tables in Autonomous Robots (2015

    Contrastive Hebbian Learning with Random Feedback Weights

    Neural networks are commonly trained to make predictions through learning algorithms. Contrastive Hebbian learning, which is a powerful rule inspired by gradient backpropagation, is based on Hebb's rule and the contrastive divergence algorithm. It operates in two phases: a forward (or free) phase, where the data are fed to the network, and a backward (or clamped) phase, where the target signals are clamped to the output layer of the network and the feedback signals are transformed through the transposed synaptic weight matrices. This implies symmetries at the synaptic level, for which there is no evidence in the brain. In this work, we propose a new variant of the algorithm, called random contrastive Hebbian learning, which does not rely on any synaptic weight symmetries. Instead, it uses random matrices to transform the feedback signals during the clamped phase, and the neural dynamics are described by first-order non-linear differential equations. The algorithm is experimentally verified by solving a Boolean logic task, classification tasks (handwritten digits and letters), and an autoencoding task. This article also shows how the parameters affect learning, especially the random matrices. We use pseudospectra analysis to investigate further how random matrices impact the learning process. Finally, we discuss the biological plausibility of the proposed algorithm and how it can give rise to better computational models for learning.
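
    The two ingredients described in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: it uses the textbook contrastive Hebbian update (clamped-phase correlations minus free-phase correlations) for a single layer, omits the neural dynamics and any per-layer scaling, and the names `chl_update`, `random_feedback_matrix` and `feedback_signal` are hypothetical.

```python
import random


def chl_update(pre_free, post_free, pre_clamped, post_clamped, lr=0.1):
    """Return dW[i][j] = lr * (clamped correlation - free correlation)."""
    return [
        [lr * (post_clamped[i] * pre_clamped[j] - post_free[i] * pre_free[j])
         for j in range(len(pre_free))]
        for i in range(len(post_free))
    ]


def random_feedback_matrix(n_rows, n_cols, scale=0.1, seed=0):
    """Draw a fixed random matrix to stand in for the transposed weights."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, scale) for _ in range(n_cols)] for _ in range(n_rows)]


def feedback_signal(matrix, upper_activity):
    """Project upper-layer activity down through the given feedback matrix."""
    return [sum(m * a for m, a in zip(row, upper_activity)) for row in matrix]


if __name__ == "__main__":
    # One output unit, two input units; the target is clamped to 1.0.
    dw = chl_update([1.0, 0.0], [0.2], [1.0, 0.0], [1.0], lr=0.5)
    print(dw)
    g = random_feedback_matrix(2, 1)  # replaces the transpose of a 1x2 weight matrix
    print(feedback_signal(g, [1.0]))
```

    When the free and clamped phases agree, the update is zero, so learning stops once the network already produces the target on its own. The random matrix is drawn once and kept fixed, so no symmetry between forward and feedback weights is assumed.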