1,998 research outputs found
On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach
We present a map-less path planning algorithm based on Deep Reinforcement
Learning (DRL) for mobile robots navigating in unknown environment that only
relies on 40-dimensional raw laser data and odometry information. The planner
is trained using a reward function shaped based on the online knowledge of the
map of the training environment, obtained using grid-based Rao-Blackwellized
particle filter, in an attempt to enhance the obstacle awareness of the agent.
The agent is trained in a complex simulated environment and evaluated in two
unseen ones. We show that the policy trained using the introduced reward
function not only outperforms standard reward functions in terms of convergence
speed, by a reduction of 36.9\% of the iteration steps, and reduction of the
collision samples, but it also drastically improves the behaviour of the agent
in unseen environments, respectively by 23\% in a simpler workspace and by 45\%
in a more clustered one. Furthermore, the policy trained in the simulation
environment can be directly and successfully transferred to the real robot. A
video of our experiments can be found at: https://youtu.be/UEV7W6e6Zq
Position-agnostic autonomous navigation in vineyards with Deep Reinforcement Learning
Precision agriculture is rapidly attracting research to efficiently introduce automation and robotics solutions to support agricultural activities. Robotic navigation in vineyards and orchards offers competitive advantages in autonomously monitoring and easily accessing crops for harvesting, spraying and performing time-consuming necessary tasks. Nowadays, autonomous navigation algorithms exploit expensive sensors which also require heavy computational cost for data processing. Nonetheless, vineyard rows represent a challenging outdoor scenario where GPS and Visual Odometry techniques often struggle to provide reliable positioning information. In this work, we combine Edge AI with Deep Reinforcement Learning to propose a cutting-edge lightweight solution to tackle the problem of autonomous vineyard navigation with-out exploiting precise localization data and overcoming task-tailored algorithms with a flexible learning-based approach. We train an end-to-end sensorimotor agent which directly maps noisy depth images and position-agnostic robot state information to velocity commands and guides the robot to the end of a row, continuously adjusting its heading for a collision-free central trajectory. Our extensive experimentation in realistic simulated vineyards demonstrates the effectiveness of our solution and the generalization capabilities of our agent
Enhancing Exploration and Safety in Deep Reinforcement Learning
A Deep Reinforcement Learning (DRL) agent tries to learn a policy maximizing a long-term objective by trials and errors in large state spaces. However, this learning paradigm requires a non-trivial amount of interactions in the environment to achieve good performance. Moreover, critical applications, such as robotics, typically involve safety criteria to consider while designing novel DRL solutions. Hence, devising safe learning approaches with efficient exploration is crucial to avoid getting stuck in local optima, failing to learn properly, or causing damages to the surrounding environment. This thesis focuses on developing Deep Reinforcement Learning algorithms to foster efficient exploration and safer behaviors in simulation and real domains of interest, ranging from robotics to multi-agent systems. To this end, we rely both on standard benchmarks, such as SafetyGym, and robotic tasks widely adopted in the literature (e.g., manipulation, navigation). This variety of problems is crucial to assess the statistical significance of our empirical studies and the generalization skills of our approaches. We initially benchmark the sample efficiency versus performance trade-off between value-based and policy-gradient algorithms. This part highlights the benefits of using non-standard simulation environments (i.e., Unity), which also facilitates the development of further optimization for DRL. We also discuss the limitations of standard evaluation metrics (e.g., return) in characterizing the actual behaviors of a policy, proposing the use of Formal Verification (FV) as a practical methodology to evaluate behaviors over desired specifications. The second part introduces Evolutionary Algorithms (EAs) as a gradient-free complimentary optimization strategy. In detail, we combine population-based and gradient-based DRL to diversify exploration and improve performance both in single and multi-agent applications. For the latter, we discuss how prior Multi-Agent (Deep) Reinforcement Learning (MARL) approaches hinder exploration, proposing an architecture that favors cooperation without affecting exploration
- …