Particle-filtering approaches for nonlinear Bayesian decoding of neuronal spike trains
The number of neurons that can be simultaneously recorded doubles every seven
years. This ever increasing number of recorded neurons opens up the possibility
to address new questions and extract higher dimensional stimuli from the
recordings. Modeling neural spike trains as point processes, this task of
extracting dynamical signals from spike trains is commonly set in the context
of nonlinear filtering theory. Particle filter methods relying on importance
weights are generic algorithms that solve the filtering task numerically, but
exhibit a serious drawback when the problem dimensionality is high: they are
known to suffer from the 'curse of dimensionality' (COD), i.e. the number of
particles required for a certain performance scales exponentially with the
number of observable dimensions. Here, we first briefly review the theory on filtering
with point process observations in continuous time. Based on this theory, we
investigate both analytically and numerically the reason for the COD of
weighted particle filtering approaches: Similarly to particle filtering with
continuous-time observations, the COD with point-process observations is due to
the decay of the effective number of particles, an effect that grows stronger as the
number of observable dimensions increases. Given the success of unweighted
particle filtering approaches in overcoming the COD for continuous-time
observations, we introduce an unweighted particle filter for point-process
observations, the spike-based Neural Particle Filter (sNPF), and show that it
exhibits a similar favorable scaling as the number of dimensions grows.
Further, we derive learning rules for the parameters of the sNPF from a
maximum-likelihood approach. We finally employ a simple decoding task to
illustrate the capabilities of the sNPF and to highlight one possible future
application of our inference and learning algorithm.
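
The decay of the effective number of particles described in the abstract above can be illustrated with a minimal sketch. It is not the paper's sNPF or its analysis: it assumes a toy model in which each particle draws one standard-normal state component per neuron, each neuron emits exactly one spike in the observation window, and the expected spike count is exp(x). The names `effective_sample_size` and `ess_after_spike_update` are hypothetical.

```python
import math
import random


def effective_sample_size(log_weights):
    """ESS = (sum w)^2 / sum w^2, computed stably from log-weights."""
    m = max(log_weights)
    weights = [math.exp(lw - m) for lw in log_weights]
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def ess_after_spike_update(n_particles, n_neurons, seed=0):
    """ESS after one importance-weight update with Poisson spike counts.

    Toy assumptions (not from the paper): prior N(0, 1) per state
    component, expected count exp(x) per neuron, one observed spike each.
    """
    rng = random.Random(seed)
    log_weights = []
    for _ in range(n_particles):
        log_w = 0.0
        for _ in range(n_neurons):
            x = rng.gauss(0.0, 1.0)           # particle drawn from the prior
            expected_count = math.exp(x)      # firing rate times window length
            observed = 1                      # assumed spike count
            # Poisson log-likelihood up to a constant shared by all particles
            log_w += observed * math.log(expected_count) - expected_count
        log_weights.append(log_w)
    return effective_sample_size(log_weights)


if __name__ == "__main__":
    for d in (1, 5, 20):
        print(d, ess_after_spike_update(500, d))
```

Under these assumptions the effective sample size shrinks as the number of observed neurons grows, because the per-neuron likelihood terms multiply and the weight mass concentrates on a few particles.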
Deep Reinforcement Learning for Swarm Systems
Recently, deep reinforcement learning (RL) methods have been applied
successfully to multi-agent scenarios. Typically, these methods rely on a
concatenation of agent states to represent the information content required for
decentralized decision making. However, concatenation scales poorly to swarm
systems with a large number of homogeneous agents as it does not exploit the
fundamental properties inherent to these systems: (i) the agents in the swarm
are interchangeable and (ii) the exact number of agents in the swarm is
irrelevant. Therefore, we propose a new state representation for deep
multi-agent RL based on mean embeddings of distributions. We treat the agents
as samples of a distribution and use the empirical mean embedding as input for
a decentralized policy. We define different feature spaces of the mean
embedding using histograms, radial basis functions and a neural network learned
end-to-end. We evaluate the representation on two well-known problems from the
swarm literature (rendezvous and pursuit evasion), in a globally and a locally
observable setup. For the local setup we furthermore introduce simple
communication protocols. Of all approaches, the mean embedding representation
using neural network features enables the richest information exchange between
neighboring agents, facilitating the development of more complex collective
strategies.
Comment: 31 pages, 12 figures, version 3 (published in JMLR Volume 20
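
The empirical mean embedding described in the abstract above can be sketched as follows. This is an illustrative assumption, not the paper's implementation: agent states are scalars, the feature space is a fixed set of radial basis functions, and the names `rbf_features` and `mean_embedding` are hypothetical.

```python
import math


def rbf_features(x, centers, bandwidth=1.0):
    """Radial basis function features of a scalar agent state."""
    return [math.exp(-((x - c) ** 2) / (2.0 * bandwidth ** 2)) for c in centers]


def mean_embedding(agent_states, centers, bandwidth=1.0):
    """Average the feature vectors of all observed agents.

    The result has len(centers) entries regardless of how many agents are
    observed, and it does not depend on the order of the agents.
    """
    sums = [0.0] * len(centers)
    for x in agent_states:
        for k, phi in enumerate(rbf_features(x, centers, bandwidth)):
            sums[k] += phi
    return [s / len(agent_states) for s in sums]


if __name__ == "__main__":
    centers = [-1.0, 0.0, 1.0]
    print(mean_embedding([0.2, -0.7, 1.3], centers))
    print(mean_embedding([1.3, 0.2, -0.7], centers))  # same agents, reordered
```

Averaging rather than concatenating is what gives the two properties named in the abstract: interchangeable agents map to the same input, and the input size is fixed for any swarm size.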
Learning Task Specifications from Demonstrations
Real world applications often naturally decompose into several sub-tasks. In
many settings (e.g., robotics) demonstrations provide a natural way to specify
the sub-tasks. However, most methods for learning from demonstrations either do
not provide guarantees that the artifacts learned for the sub-tasks can be
safely recombined or limit the types of composition available. Motivated by
this deficit, we consider the problem of inferring Boolean non-Markovian
rewards (also known as logical trace properties or specifications) from
demonstrations provided by an agent operating in an uncertain, stochastic
environment. Crucially, specifications admit well-defined composition rules
that are typically easy to interpret. In this paper, we formulate the
specification inference task as a maximum a posteriori (MAP) probability
inference problem, apply the principle of maximum entropy to derive an analytic
demonstration likelihood model and give an efficient approach to search for the
most likely specification in a large candidate pool of specifications. In our
experiments, we demonstrate how learning specifications can help avoid common
problems that often arise due to ad-hoc reward composition.
Comment: NIPS 201
Long-term Blood Pressure Prediction with Deep Recurrent Neural Networks
Existing methods for arterial blood pressure (BP) estimation directly map the
input physiological signals to output BP values without explicitly modeling the
underlying temporal dependencies in BP dynamics. As a result, these models
suffer from accuracy decay over long time spans and thus require frequent
calibration. In this work, we address this issue by formulating BP estimation
as a sequence prediction problem in which both the input and target are
temporal sequences. We propose a novel deep recurrent neural network (RNN)
consisting of multilayered Long Short-Term Memory (LSTM) networks, which are
combined with (1) a bidirectional structure to access larger-scale context
information of the input sequence, and (2) residual connections to allow gradients
in the deep RNN to propagate more effectively. The proposed deep RNN model was
tested on a static BP dataset, and it achieved root mean square error (RMSE) of
3.90 and 2.66 mmHg for systolic BP (SBP) and diastolic BP (DBP) prediction
respectively, surpassing the accuracy of traditional BP prediction models. On a
multi-day BP dataset, the deep RNN achieved RMSE of 3.84, 5.25, 5.80 and 5.81
mmHg for SBP prediction on the 1st day, 2nd day, 4th day and 6th month after the
1st day, and 1.80, 4.78, 5.0 and 5.21 mmHg for the corresponding DBP prediction,
respectively, outperforming all previous models by a notable margin.
The experimental results suggest that modeling the temporal dependencies in BP
dynamics significantly improves the long-term BP prediction accuracy.
Comment: To appear in IEEE BHI 201
Momentum Control with Hierarchical Inverse Dynamics on a Torque-Controlled Humanoid
Hierarchical inverse dynamics based on cascades of quadratic programs have
been proposed for the control of legged robots. They have important benefits
but, to the best of our knowledge, have never been implemented on a torque
controlled humanoid where model inaccuracies, sensor noise and real-time
computation requirements can be problematic. Using a reformulation of existing
algorithms, we propose a simplification of the problem that allows us to achieve
real-time control. Momentum-based control is integrated in the task hierarchy
and an LQR design approach is used to compute the desired associated closed-loop
behavior and improve performance. Extensive experiments on various balancing
and tracking tasks show very robust performance in the face of unknown
disturbances, even when the humanoid is standing on one foot. Our results
demonstrate that hierarchical inverse dynamics together with momentum control
can be efficiently used for feedback control under real robot conditions.
Comment: 21 pages, 11 figures, 4 tables in Autonomous Robots (2015
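
As background for the LQR design step mentioned in the abstract above, here is a minimal textbook sketch of a discrete-time LQR gain for a scalar system. It is not the paper's momentum formulation: the system x[k+1] = a*x[k] + b*u[k], the cost weights `q` and `r`, and the function name `scalar_lqr_gain` are all assumptions chosen for illustration.

```python
def scalar_lqr_gain(a, b, q, r, iterations=200):
    """Iterate the scalar discrete-time Riccati equation to a fixed point.

    Cost: sum of q*x^2 + r*u^2. Returns (p, k) where the control law is
    u = -k * x and p is the converged cost-to-go coefficient.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return p, k


if __name__ == "__main__":
    a, b = 1.1, 1.0               # open-loop unstable since |a| > 1
    p, k = scalar_lqr_gain(a, b, q=1.0, r=1.0)
    print("gain:", k, "closed-loop pole:", a - b * k)
```

The gain trades off state error against control effort through `q` and `r`; in this scalar example the resulting closed-loop pole `a - b*k` lies inside the unit circle even though the open-loop system is unstable.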
Contrastive Hebbian Learning with Random Feedback Weights
Neural networks are commonly trained to make predictions through learning
algorithms. Contrastive Hebbian learning, which is a powerful rule inspired by
gradient backpropagation, is based on Hebb's rule and the contrastive
divergence algorithm. It operates in two phases, the forward (or free) phase,
where the data are fed to the network, and a backward (or clamped) phase, where
the target signals are clamped to the output layer of the network and the
feedback signals are transformed through the transpose synaptic weight
matrices. This implies symmetries at the synaptic level, for which there is no
evidence in the brain. In this work, we propose a new variant of the algorithm,
called random contrastive Hebbian learning, which does not rely on any synaptic
weight symmetries. Instead, it uses random matrices to transform the feedback
signals during the clamped phase, and the neural dynamics are described by
first order non-linear differential equations. The algorithm is experimentally
verified by solving a Boolean logic task, classification tasks (handwritten
digits and letters), and an autoencoding task. This article also shows how the
parameters affect learning, especially the random matrices. We use
pseudospectra analysis to investigate further how random matrices impact the
learning process. Finally, we discuss the biological plausibility of the
proposed algorithm, and how it can give rise to better computational models for
learning.