Online Learning of a Memory for Learning Rates
The promise of learning to learn for robotics rests on the hope that by
extracting some information about the learning process itself we can speed up
subsequent similar learning tasks. Here, we introduce a computationally
efficient online meta-learning algorithm that builds and optimizes a memory
model of the optimal learning rate landscape from previously observed gradient
behaviors. While performing task specific optimization, this memory of learning
rates predicts how to scale currently observed gradients. After applying the
gradient scaling our meta-learner updates its internal memory based on the
observed effect its prediction had. Our meta-learner can be combined with any
gradient-based optimizer, learns on the fly and can be transferred to new
optimization tasks. In our evaluations we show that our meta-learning algorithm
speeds up learning of MNIST classification and a variety of learning control
tasks, either in batch or online learning settings. Comment: accepted to ICRA 2018, code available:
https://github.com/fmeier/online-meta-learning ; video pitch available:
https://youtu.be/9PzQ25FPPO
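The predict-then-update loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' memory model: it swaps in a simple sign-agreement rule (in the spirit of delta-bar-delta) in which a per-parameter multiplier scales each gradient and is then adjusted according to the observed effect of the previous step. The function name `minimize` and the parameters `base_lr`, `up` and `down` are hypothetical.

```python
def minimize(grad_fn, x0, steps=100, base_lr=0.01, up=1.2, down=0.5):
    """Gradient descent with an online-adapted learning-rate scale per parameter.

    Assumption: a sign-agreement heuristic stands in for the learned memory
    of learning rates described in the abstract.
    """
    x = list(x0)
    scale = [1.0] * len(x)        # the "memory": one multiplier per parameter
    prev_grad = [0.0] * len(x)
    for _ in range(steps):
        grad = grad_fn(x)
        for i, g in enumerate(grad):
            agreement = g * prev_grad[i]
            if agreement > 0:     # previous step did not overshoot: be bolder
                scale[i] *= up
            elif agreement < 0:   # gradient flipped sign: previous step overshot
                scale[i] *= down
            # Apply the scaled gradient step for this parameter.
            x[i] -= base_lr * scale[i] * g
        prev_grad = grad
    return x, scale
```

On a toy quadratic whose gradient is `[p[0], 10.0 * p[1]]`, calling `minimize(lambda p: [p[0], 10.0 * p[1]], [5.0, -3.0], steps=200)` drives both coordinates toward zero and ends with a larger multiplier for the flatter coordinate.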
A New Data Source for Inverse Dynamics Learning
Modern robotics is gravitating toward increasingly collaborative human robot
interaction. Tools such as acceleration policies can naturally support the
realization of reactive, adaptive, and compliant robots. These tools require us
to model the system dynamics accurately -- a difficult task. The fundamental
problem remains that simulation and reality diverge--we do not know how to
accurately change a robot's state. Thus, recent research on improving inverse
dynamics models has been focused on making use of machine learning techniques.
Traditional learning techniques train on the actual realized accelerations,
instead of the policy's desired accelerations, which is an indirect data
source. Here we show how an additional training signal -- measured at the
desired accelerations -- can be derived from a feedback control signal. This
effectively creates a second data source for learning inverse dynamics models.
Furthermore, we show how both the traditional and this new data source can be
used to train task-specific models of the inverse dynamics, when used
independently or combined. We analyze the use of both data sources in
simulation and demonstrate their effectiveness on a real-world robotic platform.
We show that our system incrementally improves the learned inverse dynamics
model, and when using both data sources combined converges more consistently
and faster. Comment: IROS 201
A New Perspective and Extension of the Gaussian Filter
The Gaussian Filter (GF) is one of the most widely used filtering algorithms;
instances are the Extended Kalman Filter, the Unscented Kalman Filter and the
Divided Difference Filter. GFs represent the belief of the current state by a
Gaussian with the mean being an affine function of the measurement. We show
that this representation can be too restrictive to accurately capture the
dependences in systems with nonlinear observation models, and we investigate
how the GF can be generalized to alleviate this problem. To this end, we view
the GF from a variational-inference perspective. We analyse how restrictions on
the form of the belief can be relaxed while maintaining simplicity and
efficiency. This analysis provides a basis for generalizations of the GF. We
propose one such generalization which coincides with a GF using a virtual
measurement, obtained by applying a nonlinear function to the actual
measurement. Numerical experiments show that the proposed Feature Gaussian
Filter (FGF) can have a substantial performance advantage over the standard GF
for systems with nonlinear observation models. Comment: Will appear in Robotics: Science and Systems (R:SS) 201
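The idea of running a Gaussian filter on a virtual measurement can be sketched for a scalar state as follows. This is an illustrative sketch, not the paper's FGF: it estimates the required moments by plain Monte Carlo sampling, and the names `gaussian_update`, `observe` and `feature` are hypothetical. With the identity feature it reduces to a standard sample-based Gaussian filter update, where the posterior mean is affine in the measurement; passing a nonlinear `feature` makes the mean affine in the feature instead.

```python
import math
import random


def gaussian_update(prior_mean, prior_var, observe, y_obs,
                    feature=lambda y: y, n=20000, seed=0):
    """Gaussian measurement update on the virtual measurement feature(y).

    observe(x, rng) simulates one noisy measurement of state x (assumed API).
    Moments are estimated from n samples; the update itself is ordinary
    conditioning of a joint Gaussian over (x, feature(y)).
    """
    rng = random.Random(seed)
    xs = [rng.gauss(prior_mean, math.sqrt(prior_var)) for _ in range(n)]
    zs = [feature(observe(x, rng)) for x in xs]
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_z = sum((z - mean_z) ** 2 for z in zs) / n
    cov_xz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs)) / n
    gain = cov_xz / var_z
    # Posterior mean is affine in the (virtual) measurement feature(y_obs).
    post_mean = mean_x + gain * (feature(y_obs) - mean_z)
    post_var = var_x - gain * cov_xz
    return post_mean, post_var
```

For a linear observation `lambda x, rng: x + rng.gauss(0.0, 1.0)` with a standard normal prior, the result should approximate the Kalman update; a bounded feature such as `math.tanh` can be passed via `feature=` to experiment with a nonlinear virtual measurement.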
The Coordinate Particle Filter - A novel Particle Filter for High Dimensional Systems
Parametric filters, such as the Extended Kalman Filter and the Unscented
Kalman Filter, typically scale well with the dimensionality of the problem, but
they are known to fail if the posterior state distribution cannot be closely
approximated by a density of the assumed parametric form. For nonparametric
filters, such as the Particle Filter, the converse holds. Such methods are able
to approximate any posterior, but the computational requirements scale
exponentially with the number of dimensions of the state space. In this paper,
we present the Coordinate Particle Filter which alleviates this problem. We
propose to compute the particle weights recursively, dimension by dimension.
This allows us to explore one dimension at a time, and resample after each
dimension if necessary. Experimental results on simulated as well as real data
confirm that the proposed method has a substantial performance advantage over
the Particle Filter in high-dimensional systems where not all dimensions are
highly correlated. We demonstrate the benefits of the proposed method for the
problem of multi-object and robotic manipulator tracking.
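The dimension-by-dimension weighting and resampling described in this abstract can be sketched under strong simplifying assumptions. The sketch below is not the paper's algorithm: it assumes every coordinate follows an independent random walk and is observed directly with Gaussian noise, so that the weight update for coordinate `d` depends only on `y[d]`. The names `coordinate_update`, `process_std` and `obs_std` are hypothetical.

```python
import math
import random


def effective_sample_size(weights):
    """ESS of normalized weights; equals len(weights) when they are uniform."""
    return 1.0 / sum(w * w for w in weights)


def coordinate_update(particles, weights, y, rng, process_std=0.5, obs_std=0.5):
    """One filter step that explores and weights one coordinate at a time.

    Assumes independent random-walk dynamics and a direct, factorized
    Gaussian observation of each coordinate (illustration only).
    """
    n = len(particles)
    for d in range(len(y)):
        # Propagate only coordinate d of every particle.
        for p in particles:
            p[d] += rng.gauss(0.0, process_std)
        # Fold the likelihood of measurement y[d] into the running weights.
        weights = [w * math.exp(-0.5 * ((y[d] - p[d]) / obs_std) ** 2)
                   for w, p in zip(weights, particles)]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Resample after this dimension only if the weights have degenerated.
        if effective_sample_size(weights) < n / 2:
            particles = [list(p) for p in rng.choices(particles, weights=weights, k=n)]
            weights = [1.0 / n] * n
    return particles, weights
```

Resampling between coordinates lets particles that already fit the first few dimensions be duplicated before the remaining dimensions are explored, which is the intuition the abstract gives for the method's advantage in high-dimensional systems.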
Successful Arterial Embolisation of Giant Liver Haemangioma
A 28-year-old man presented with a symptomatic giant haemangioma. On June 26, 1983, at laparotomy,
no resection was attempted because the lesion involved the right lobe of the liver and a part of segments
II and III. The patient underwent a right hepatic arterial embolisation with gelatine sponge particles.
During follow-up, the patient remained asymptomatic. Five-year review by CT-scan showed a diminution
of the size of the haemangioma and hypertrophy of the left lobe. On October 21, 1988, the patient
was reoperated on for liver abscess and complete necrosis of the haemangioma. A right hepatectomy
was performed. In conclusion, the long-term effect of hepatic arterial embolisation, as demonstrated in
our case by regular CT-scans, is useful in cases of diffuse haemangioma as an alternative to hazardous
major liver resection. To our knowledge, the long-term effect of hepatic arterial embolisation on
symptoms and tumor size has never been reported for giant liver haemangioma.
Acidic Residues Control the Dimerization of the N-terminal Domain of Black Widow Spiders’ Major Ampullate Spidroin 1
Dragline silk is the most prominent amongst spider silks and comprises two types of major ampullate spidroins (MaSp) differing in their proline content. In the natural spinning process, the conversion of soluble MaSp into a tough fiber is, amongst other factors, triggered by dimerization and conformational switching of their helical amino-terminal domains (NRN). Both processes are induced by protonation of acidic residues upon acidification along the spinning duct. Here, the structure and monomer-dimer-equilibrium of the domain NRN1 of Latrodectus hesperus MaSp1 and variants thereof have been investigated, and the key residues for both could be identified. Changes in ionic composition and strength within the spinning duct enable electrostatic interactions between the acidic and basic pole of two monomers which prearrange into an antiparallel dimer. Upon naturally occurring acidification this dimer is stabilized by protonation of residue E114. A conformational change is independently triggered by protonation of clustered acidic residues (D39, E76, E81). Such a step-by-step mechanism allows a controlled spidroin assembly in a pH- and salt-sensitive manner, preventing premature aggregation of spider silk proteins in the gland and at the same time ensuring fast and efficient dimer formation and stabilization on demand in the spinning duct.
Probabilistic Recurrent State-Space Models
State-space models (SSMs) are a highly expressive model class for learning
patterns in time series data and for system identification. Deterministic
versions of SSMs (e.g. LSTMs) proved extremely successful in modeling complex
time series data. Fully probabilistic SSMs, however, are often found hard to
train, even for smaller problems. To overcome this limitation, we propose a
novel model formulation and a scalable training algorithm based on doubly
stochastic variational inference and Gaussian processes. In contrast to
existing work, the proposed variational approximation allows one to fully
capture the latent state temporal correlations. These correlations are the key
to robust training. The effectiveness of the proposed PR-SSM is evaluated on a
set of real-world benchmark datasets in comparison to state-of-the-art
probabilistic model learning methods. Scalability and robustness are
demonstrated on a high-dimensional problem.
Effect of Combined Methamphetamine and Oxycodone Use on the Synaptic Proteome in an In Vitro Model of Polysubstance Use
Polysubstance use (PSU) generally involves the simultaneous use of an opioid along with a stimulant. In recent years, this problem has escalated into a nationwide epidemic. Understanding the mechanisms and effects underlying the interaction between these drugs is essential for the development of treatments for those suffering from addiction. Currently, the effect of PSU on synapses, the critical points of contact between neurons, remains poorly understood. Using an in vitro model of primary neurons, we examined the combined effects of the psychostimulant methamphetamine (METH) and the prescription opioid oxycodone (oxy) on the synaptic proteome using quantitative mass-spectrometry-based proteomics. A further ClueGO analysis and Ingenuity Pathway Analysis (IPA) indicated the dysregulation of several molecular functions, biological processes, and pathways associated with neural plasticity and structural development. We identified one key synaptic protein, Striatin-1, which plays a vital role in many of these processes and functions, to be downregulated following METH+oxy treatment. This downregulation of Striatin-1 was further validated by Western blot. Overall, the present study indicates several damaging effects of the combined use of METH and oxy on neural function and warrants further detailed investigation into mechanisms contributing to synaptic dysfunction.
Learning modular policies for robotics
A promising idea for scaling robot learning to more complex tasks is to use elemental behaviors as building blocks to compose more complex behavior. Ideally, such building blocks are used in combination with a learning algorithm that is able to learn to select, adapt, sequence and co-activate the building blocks. While there has been a lot of work on approaches that support one of these requirements, no learning algorithm exists that unifies all these properties in one framework. In this paper we present our work on a unified approach for learning such a modular control architecture. We introduce new policy search algorithms that are based on information-theoretic principles and are able to learn to select, adapt and sequence the building blocks. Furthermore, we developed a new representation for the individual building block that supports co-activation and principled ways for adapting the movement. Finally, we summarize our experiments for learning modular control architectures in simulation and with real robots.