We present a map-less path planning algorithm based on Deep Reinforcement
Learning (DRL) for mobile robots navigating in unknown environments, relying
only on 40-dimensional raw laser data and odometry information. The planner
is trained with a reward function shaped by online knowledge of the map of
the training environment, obtained using a grid-based Rao-Blackwellized
particle filter, in an attempt to enhance the obstacle awareness of the agent.
The agent is trained in a complex simulated environment and evaluated in two
unseen ones. We show that the policy trained with the introduced reward
function not only outperforms standard reward functions in terms of
convergence speed, reducing the number of iteration steps by 36.9\% and the
number of collision samples, but also drastically improves the behaviour of
the agent in unseen environments, by 23\% in a simpler workspace and by 45\%
in a more cluttered one. Furthermore, the policy trained in the simulation
environment can be directly and successfully transferred to the real robot. A
video of our experiments can be found at: https://youtu.be/UEV7W6e6Zq