2,567 research outputs found
Adversarial Generation of Informative Trajectories for Dynamics System Identification
Dynamic System Identification approaches usually heavily rely on the
evolutionary and gradient-based optimisation techniques to produce optimal
excitation trajectories for determining the physical parameters of robot
platforms. Current optimisation techniques tend to generate single
trajectories. This is expensive, and intractable for longer trajectories, thus
limiting their efficacy for system identification. We propose to tackle this
issue by using multiple shorter cyclic trajectories, which can be generated in
parallel, and subsequently combined together to achieve the same effect as a
longer trajectory. Crucially, we show how to scale this approach even further
by increasing the generation speed and quality of the dataset through the use
of generative adversarial network (GAN) based architectures to produce a large
databases of valid and diverse excitation trajectories. To the best of our
knowledge, this is the first robotics work to explore system identification
with multiple cyclic trajectories and to develop GAN-based techniques for
scaleably producing excitation trajectories that are diverse in both control
parameter and inertial parameter spaces. We show that our approach dramatically
accelerates trajectory optimisation, while simultaneously providing more
accurate system identification than the conventional approach.Comment: Accepted for publication in IEEE iROS 202
Tracking by Prediction: A Deep Generative Model for Mutli-Person localisation and Tracking
Current multi-person localisation and tracking systems have an over reliance
on the use of appearance models for target re-identification and almost no
approaches employ a complete deep learning solution for both objectives. We
present a novel, complete deep learning framework for multi-person localisation
and tracking. In this context we first introduce a light weight sequential
Generative Adversarial Network architecture for person localisation, which
overcomes issues related to occlusions and noisy detections, typically found in
a multi person environment. In the proposed tracking framework we build upon
recent advances in pedestrian trajectory prediction approaches and propose a
novel data association scheme based on predicted trajectories. This removes the
need for computationally expensive person re-identification systems based on
appearance features and generates human like trajectories with minimal
fragmentation. The proposed method is evaluated on multiple public benchmarks
including both static and dynamic cameras and is capable of generating
outstanding performance, especially among other recently proposed deep neural
network based approaches.Comment: To appear in IEEE Winter Conference on Applications of Computer
Vision (WACV), 201
SOM-VAE: Interpretable Discrete Representation Learning on Time Series
High-dimensional time series are common in many domains. Since human
cognition is not optimized to work well in high-dimensional spaces, these areas
could benefit from interpretable low-dimensional representations. However, most
representation learning algorithms for time series data are difficult to
interpret. This is due to non-intuitive mappings from data features to salient
properties of the representation and non-smoothness over time. To address this
problem, we propose a new representation learning framework building on ideas
from interpretable discrete dimensionality reduction and deep generative
modeling. This framework allows us to learn discrete representations of time
series, which give rise to smooth and interpretable embeddings with superior
clustering performance. We introduce a new way to overcome the
non-differentiability in discrete representation learning and present a
gradient-based version of the traditional self-organizing map algorithm that is
more performant than the original. Furthermore, to allow for a probabilistic
interpretation of our method, we integrate a Markov model in the representation
space. This model uncovers the temporal transition structure, improves
clustering performance even further and provides additional explanatory
insights as well as a natural representation of uncertainty. We evaluate our
model in terms of clustering performance and interpretability on static
(Fashion-)MNIST data, a time series of linearly interpolated (Fashion-)MNIST
images, a chaotic Lorenz attractor system with two macro states, as well as on
a challenging real world medical time series application on the eICU data set.
Our learned representations compare favorably with competitor methods and
facilitate downstream tasks on the real world data.Comment: Accepted for publication at the Seventh International Conference on
Learning Representations (ICLR 2019
On learning and generalization in unstructured taskspaces
L'apprentissage robotique est incroyablement prometteur pour l'intelligence artificielle incarnée, avec un apprentissage par renforcement apparemment parfait pour les robots du futur: apprendre de l'expérience, s'adapter à la volée et généraliser à des scénarios invisibles.
Cependant, notre réalité actuelle nécessite de grandes quantités de données pour former la plus simple des politiques d'apprentissage par renforcement robotique, ce qui a suscité un regain d'intérêt de la formation entièrement dans des simulateurs de physique efficaces. Le but étant l'intelligence incorporée, les politiques formées à la simulation sont transférées sur du matériel réel pour évaluation; cependant, comme aucune simulation n'est un modèle parfait du monde réel, les politiques transférées se heurtent à l'écart de transfert sim2real: les erreurs se sont produites lors du déplacement des politiques des simulateurs vers le monde réel en raison d'effets non modélisés dans des modèles physiques inexacts et approximatifs.
La randomisation de domaine - l'idée de randomiser tous les paramètres physiques dans un simulateur, forçant une politique à être robuste aux changements de distribution - s'est avérée utile pour transférer des politiques d'apprentissage par renforcement sur de vrais robots. En pratique, cependant, la méthode implique un processus difficile, d'essais et d'erreurs, montrant une grande variance à la fois en termes de convergence et de performances. Nous introduisons Active Domain Randomization, un algorithme qui implique l'apprentissage du curriculum dans des espaces de tâches non structurés (espaces de tâches où une notion de difficulté - tâches intuitivement faciles ou difficiles - n'est pas facilement disponible). La randomisation de domaine active montre de bonnes performances sur le pourrait utiliser zero shot sur de vrais robots. La thèse introduit également d'autres variantes de l'algorithme, dont une qui permet d'incorporer un a priori de sécurité et une qui s'applique au domaine de l'apprentissage par méta-renforcement. Nous analysons également l'apprentissage du curriculum dans une perspective d'optimisation et tentons de justifier les avantages de l'algorithme en étudiant les interférences de gradient.Robotic learning holds incredible promise for embodied artificial intelligence, with reinforcement learning seemingly a strong candidate to be the \textit{software} of robots of the future: learning from experience, adapting on the fly, and generalizing to unseen scenarios.
However, our current reality requires vast amounts of data to train the simplest of robotic reinforcement learning policies, leading to a surge of interest of training entirely in efficient physics simulators. As the goal is embodied intelligence, policies trained in simulation are transferred onto real hardware for evaluation; yet, as no simulation is a perfect model of the real world, transferred policies run into the sim2real transfer gap: the errors accrued when shifting policies from simulators to the real world due to unmodeled effects in inaccurate, approximate physics models.
Domain randomization - the idea of randomizing all physical parameters in a simulator, forcing a policy to be robust to distributional shifts - has proven useful in transferring reinforcement learning policies onto real robots. In practice, however, the method involves a difficult, trial-and-error process, showing high variance in both convergence and performance. We introduce Active Domain Randomization, an algorithm that involves curriculum learning in unstructured task spaces (task spaces where a notion of difficulty - intuitively easy or hard tasks - is not readily available). Active Domain Randomization shows strong performance on zero-shot transfer on real robots. The thesis also introduces other variants of the algorithm, including one that allows for the incorporation of a safety prior and one that is applicable to the field of Meta-Reinforcement Learning. We also analyze curriculum learning from an optimization perspective and attempt to justify the benefit of the algorithm by studying gradient interference
Generative neural data synthesis for autonomous systems
A significant number of Machine Learning methods for automation currently rely on
data-hungry training techniques. The lack of accessible training data often represents
an insurmountable obstacle, especially in the fields of robotics and automation, where
acquiring new data can be far from trivial. Additional data acquisition is not only often
expensive and time-consuming, but occasionally is not even an option. Furthermore,
the real world applications sometimes have commercial sensitivity issues associated
with the distribution of the raw data.
This doctoral thesis explores bypassing the aforementioned difficulties by synthesising new realistic and diverse datasets using the Generative Adversarial Network (GAN).
The success of this approach is demonstrated empirically through solving a variety of
case-specific data-hungry problems, via application of novel GAN-based techniques
and architectures.
Specifically, it starts with exploring the use of GANs for the realistic simulation of
the extremely high-dimensional underwater acoustic imagery for the purpose of training
both teleoperators and autonomous target recognition systems. We have developed a
method capable of generating realistic sonar data of any chosen dimension by image-translation GANs with Markov principle.
Following this, we apply GAN-based models to robot behavioural repertoire generation, that enables a robot manipulator to successfully overcome unforeseen impedances,
such as unknown sets of obstacles and random broken joints scenarios.
Finally, we consider dynamical system identification for articulated robot arms. We
show how using diversity-driven GAN models to generate exploratory trajectories can
allow dynamic parameters to be identified more efficiently and accurately than with
conventional optimisation approaches.
Together, these results show that GANs have the potential to benefit a variety of
robotics learning problems where training data is currently a bottleneck
- …