Local Motion Planner for Autonomous Navigation in Vineyards with a RGB-D Camera-Based Algorithm and Deep Learning Synergy
With the advent of agriculture 3.0 and 4.0, researchers are increasingly
focusing on the development of innovative smart farming and precision
agriculture technologies by introducing automation and robotics into the
agricultural processes. Autonomous agricultural field machines have been
gaining significant attention from farmers and industries to reduce costs,
human workload, and required resources. Nevertheless, achieving sufficient
autonomous navigation capabilities requires the simultaneous cooperation of
different processes; localization, mapping, and path planning are just some of
the steps that aim to provide the machine with the right set of skills to
operate in semi-structured and unstructured environments. In this context, this
study presents a low-cost local motion planner for autonomous navigation in
vineyards based only on an RGB-D camera, low-range hardware, and a dual-layer
control algorithm. The first algorithm exploits the disparity map and its depth
representation to generate a proportional control for the robotic platform.
Concurrently, a second back-up algorithm, based on representation learning and
resilient to illumination variations, can take control of the machine in case
of a momentary failure of the first block. Moreover, due to the double
nature of the system, after initial training of the deep learning model with an
initial dataset, the strict synergy between the two algorithms opens the
possibility of exploiting new automatically labeled data, coming from the
field, to extend the existing model knowledge. The machine learning algorithm
has been trained and tested, using transfer learning, with images acquired
during different field surveys in northern Italy and then optimized
for on-device inference with model pruning and quantization. Finally, the
overall system has been validated with a customized robot platform in the
relevant environment.
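As a rough illustration of the first layer described above, the sketch below turns a depth image into a proportional steering command. It is not the authors' implementation: the function name `proportional_steering`, the `far_fraction` threshold, the `gain` value, and the sign convention are all assumptions made for this example. The idea is that the deepest region of the image marks the open end of the vineyard row, and the horizontal offset of that region from the image centre is used as the control error.

```python
from typing import List


def proportional_steering(depth: List[List[float]],
                          gain: float = 1.0,
                          far_fraction: float = 0.9) -> float:
    """Return an angular-velocity command that steers toward the deepest region.

    depth: 2D depth image as rows of range values (hypothetical input format).
    gain: proportional gain (assumed value, to be tuned on a real platform).
    far_fraction: pixels at least this fraction of the maximum depth count as "far".
    """
    width = len(depth[0])
    if width < 2:
        return 0.0  # a single column carries no horizontal information

    max_depth = max(max(row) for row in depth)
    threshold = far_fraction * max_depth

    # Column indices of all pixels considered far away (assumed end of the row).
    far_cols = [c for row in depth for c, d in enumerate(row) if d >= threshold]
    centroid = sum(far_cols) / len(far_cols)

    # Normalised horizontal error in [-1, 1]; positive means the target is to the right.
    center = (width - 1) / 2.0
    error = (centroid - center) / center

    # Assumed sign convention: positive command = turn left.
    return -gain * error
```

For example, `proportional_steering([[1.0, 1.0, 5.0]], gain=2.0)` yields a right-turn command, because the far region lies in the rightmost column.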
Spatiotemporal Attention Enhances Lidar-Based Robot Navigation in Dynamic Environments
Foresighted robot navigation in dynamic indoor environments with
cost-efficient hardware necessitates the use of a lightweight yet dependable
controller. Inferring the scene dynamics from sensor readings without
explicit object tracking is therefore a pivotal aspect of foresighted navigation among
pedestrians. In this paper, we introduce a spatiotemporal attention pipeline
for enhanced navigation based on 2D lidar sensor readings. This pipeline is
complemented by a novel lidar-state representation that emphasizes dynamic
obstacles over static ones. Subsequently, the attention mechanism enables
selective scene perception across both space and time, resulting in improved
overall navigation performance within dynamic scenarios. We thoroughly
evaluated the approach in different scenarios and simulators, finding good
generalization to unseen environments. The results demonstrate outstanding
performance compared to state-of-the-art methods, thereby enabling the seamless
deployment of the learned controller on a real robot.
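The abstract does not specify how the lidar-state representation emphasizes dynamic obstacles. One simple way to get a similar effect, shown here purely as an assumption, is to difference two consecutive 2D scans beam by beam and pass the magnitudes through a softmax, so that beams whose range changed receive more weight. The name `dynamic_emphasis` and the `temperature` parameter are hypothetical.

```python
import math
from typing import List


def dynamic_emphasis(prev_scan: List[float],
                     curr_scan: List[float],
                     temperature: float = 1.0) -> List[float]:
    """Softmax weights over lidar beams based on range change between two scans.

    Beams whose range changed the most (likely moving obstacles) get the largest
    weight; static beams share the remainder. This is an illustrative stand-in,
    not the representation used in the paper.
    """
    if not curr_scan or len(prev_scan) != len(curr_scan):
        raise ValueError("scans must be non-empty and of equal length")

    diffs = [abs(c - p) for p, c in zip(prev_scan, curr_scan)]

    # Subtract the maximum before exponentiating for numerical stability.
    m = max(diffs)
    exps = [math.exp((d - m) / temperature) for d in diffs]
    total = sum(exps)
    return [e / total for e in exps]
```

With identical scans every beam receives the same weight, which matches the intuition that a fully static scene offers nothing to emphasize.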
On learning and generalization in unstructured taskspaces
Robotic learning holds incredible promise for embodied artificial intelligence, with reinforcement learning seemingly a strong candidate to be the software of robots of the future: learning from experience, adapting on the fly, and generalizing to unseen scenarios.
However, our current reality requires vast amounts of data to train the simplest of robotic reinforcement learning policies, leading to a surge of interest in training entirely in efficient physics simulators. As the goal is embodied intelligence, policies trained in simulation are transferred onto real hardware for evaluation; yet, as no simulation is a perfect model of the real world, transferred policies run into the sim2real transfer gap: the errors accrued when shifting policies from simulators to the real world due to unmodeled effects in inaccurate, approximate physics models.
Domain randomization - the idea of randomizing all physical parameters in a simulator, forcing a policy to be robust to distributional shifts - has proven useful in transferring reinforcement learning policies onto real robots. In practice, however, the method involves a difficult, trial-and-error process, showing high variance in both convergence and performance. We introduce Active Domain Randomization, an algorithm that involves curriculum learning in unstructured task spaces (task spaces where a notion of difficulty - intuitively easy or hard tasks - is not readily available). Active Domain Randomization shows strong performance in zero-shot transfer to real robots. The thesis also introduces other variants of the algorithm, including one that allows for the incorporation of a safety prior and one that is applicable to the field of Meta-Reinforcement Learning. We also analyze curriculum learning from an optimization perspective and attempt to justify the benefit of the algorithm by studying gradient interference.
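To make the baseline concrete, here is a minimal sketch of plain uniform domain randomization, the method that Active Domain Randomization builds on. It is not the Active Domain Randomization algorithm itself, which learns where to sample. The parameter names (`friction`, `mass`) and their ranges are hypothetical placeholders for whatever a given simulator exposes.

```python
import random
from typing import Dict, Tuple


def sample_domain(ranges: Dict[str, Tuple[float, float]],
                  rng: random.Random) -> Dict[str, float]:
    """Draw one simulator configuration by sampling each parameter uniformly.

    ranges maps a parameter name to its (low, high) bounds. A fresh sample would
    typically be drawn at the start of every training episode.
    """
    return {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}


# Hypothetical randomization ranges for two physical parameters.
example_ranges = {"friction": (0.5, 1.5), "mass": (1.0, 2.0)}
example_env_params = sample_domain(example_ranges, random.Random(0))
```

Because every region of the range is sampled with equal probability, this baseline spends as much training time on uninformative environments as on informative ones, which is the inefficiency a learned sampling curriculum aims to reduce.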
DOP: Deep Optimistic Planning with Approximate Value Function Evaluation
Research on reinforcement learning has demonstrated promising results in
manifold applications and domains. Still, efficiently learning effective robot
behaviors is very difficult, due to unstructured scenarios, high uncertainties,
and large state dimensionality (e.g. multi-agent systems or hyper-redundant
robots). To alleviate this problem, we present DOP, a deep model-based
reinforcement learning algorithm, which exploits action values to both (1)
guide the exploration of the state space and (2) plan effective policies.
Specifically, we exploit deep neural networks to learn Q-functions that are
used to attack the curse of dimensionality during a Monte-Carlo tree search.
Our algorithm, in fact, constructs upper confidence bounds on the learned value
function to select actions optimistically. We implement and evaluate DOP on
different scenarios: (1) a cooperative navigation problem, (2) a fetching task
for a 7-DOF KUKA robot, and (3) a human-robot handover with a humanoid robot
(both in simulation and real). The obtained results show the effectiveness of
DOP in the chosen applications, where action values drive the exploration and
reduce the computational demand of the planning process while achieving good
performance.
Comment: to appear as an extended abstract paper in the Proc. of the 17th
International Conference on Autonomous Agents and Multiagent Systems (AAMAS
2018), Stockholm, Sweden, July 10-15, 2018, IFAAMAS. arXiv admin note: text
overlap with arXiv:1803.0029
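The optimistic action selection mentioned in the DOP abstract can be illustrated with a UCB1-style rule applied to learned action values. The abstract does not give the exact form of the confidence bound, so the bonus term below, the exploration constant `c`, and the function name `select_action` are assumptions. In a full system the `q_values` would come from a trained neural network rather than a plain list.

```python
import math
from typing import List


def select_action(q_values: List[float],
                  visit_counts: List[int],
                  c: float = 1.0) -> int:
    """Pick the action maximising Q(s, a) plus a UCB1-style exploration bonus.

    q_values: estimated action values for the current state (assumed to be given).
    visit_counts: how often each action was tried from this tree node.
    c: exploration constant (assumed value).
    """
    total_visits = sum(visit_counts)
    best_action = 0
    best_score = -math.inf

    for action, (q, n) in enumerate(zip(q_values, visit_counts)):
        if n == 0:
            return action  # untried actions are explored first

        # Optimistic estimate: value plus an uncertainty bonus that shrinks with n.
        score = q + c * math.sqrt(math.log(total_visits) / n)
        if score > best_score:
            best_action, best_score = action, score

    return best_action
```

A rarely visited action can win even with a lower value estimate: `select_action([1.0, 0.9], [100, 1])` selects action 1, since its bonus outweighs the small gap in Q.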
Towards Target-Driven Visual Navigation in Indoor Scenes via Generative Imitation Learning
We present a target-driven navigation system to improve mapless visual
navigation in indoor scenes. Our method takes a multi-view observation of a
robot and a target as inputs at each time step to provide a sequence of actions
that move the robot to the target without relying on odometry or GPS at
runtime. The system is learned by optimizing a combinational objective
encompassing three key designs. First, we propose that an agent conceives the
next observation before making an action decision. This is achieved by learning
a variational generative module from expert demonstrations. We then propose
predicting static collision in advance, as an auxiliary task to improve safety
during navigation. Moreover, to alleviate the training data imbalance problem
of termination action prediction, we also introduce a target checking module
rather than augmenting the navigation policy with a termination action. The
three proposed designs all contribute to the improved training data efficiency,
static collision avoidance, and navigation generalization performance,
resulting in a novel target-driven mapless navigation system. Through
experiments on a TurtleBot, we provide evidence that our model can be
integrated into a robotic system and navigate in the real world. Videos and
models can be found in the supplementary material.
Comment: 11 pages, accepted by IEEE Robotics and Automation Letter