View-Action Representation Learning for Active First-Person Vision
In visual navigation, a moving agent equipped with a camera is traditionally controlled by an input action, and the estimation of features from a sensory state (i.e. the camera view) is treated as a pre-processing step for high-level vision tasks. In this paper, we present a representation learning approach that, instead, considers both state and action as inputs. We condition the encoded feature from the state transition network on the action that changes the view of the camera, thus describing the scene more effectively. Specifically, we introduce an action representation module that generates decoded higher-dimensional representations from an input action to increase the representational power. We then fuse the output from the action representation module with the intermediate response of the state transition network that predicts the future state. To enhance the discrimination capability among predictions from different input actions, we further introduce triplet ranking loss and N-tuplet loss functions, which in turn can be integrated with the regression loss. We demonstrate the proposed representation learning approach in reinforcement and imitation learning-based mapless navigation tasks, where the camera agent learns to navigate only through the view of the camera and the performed action, without external information.
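The loss combination described in the abstract can be sketched in a minimal form. The function names, the margin, and the weighting factor `alpha` below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def triplet_ranking_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet term: pull the predicted future state toward the
    true next view (positive) and away from a prediction made under a
    different action (negative)."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

def combined_loss(pred, target, other_pred, alpha=0.5, margin=1.0):
    """Regression loss on the predicted next state plus the triplet term
    that separates predictions conditioned on different input actions."""
    regression = np.mean((pred - target) ** 2)
    triplet = triplet_ranking_loss(pred, target, other_pred, margin)
    return regression + alpha * triplet
```

The N-tuplet variant mentioned in the abstract would generalize the single negative to a set of negatives, one per alternative action.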
Mapless Navigation among Dynamics with Social-safety-awareness: a reinforcement learning approach from 2D laser scans
We propose a method to tackle the problem of mapless collision-avoidance
navigation in the presence of humans using 2D laser scans. Our proposed method
uses ego-safety to measure collision risk from the robot's perspective and
social-safety to measure the impact of the robot's actions on surrounding
pedestrians. Specifically, the social-safety part predicts the intrusion impact
of the robot's action on the interaction area of surrounding humans. We
train the policy using reinforcement learning on a simple simulator and
directly evaluate the learned policy in Gazebo and real robot tests.
Experiments show the learned policy can be smoothly transferred without any
fine-tuning. We observe that our method demonstrates time-efficient path
planning behavior with a high success rate in mapless navigation tasks.
Furthermore, we test our method on a navigation-among-dynamic-crowds task
with both low- and high-volume traffic. Our learned policy demonstrates
cooperative behavior that actively drives our robot into traffic flows while
showing respect to nearby pedestrians. Evaluation videos are at
https://sites.google.com/view/ssw-batman
Comment: Accepted in ICRA 202
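A minimal sketch of how the two safety notions might enter a scalar reward for policy training; all thresholds, weights, and the function name here are hypothetical, not taken from the paper:

```python
def navigation_reward(min_scan_dist, intrusion_depth, goal_progress,
                      ego_radius=0.4, social_radius=1.0,
                      w_ego=1.0, w_social=0.5):
    """Reward mixing goal progress with two safety penalties:
    - ego-safety: penalize 2D-scan returns closer than ego_radius,
    - social-safety: penalize intrusion into pedestrians' interaction areas.
    All parameters are illustrative assumptions."""
    ego_penalty = max(0.0, (ego_radius - min_scan_dist) / ego_radius)
    social_penalty = max(0.0, min(intrusion_depth / social_radius, 1.0))
    return goal_progress - w_ego * ego_penalty - w_social * social_penalty
```

With no obstacle within the ego radius and no intrusion, the reward reduces to pure goal progress; either safety term subtracts from it proportionally.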
Reinforcement Learning for Self-exploration in Narrow Spaces
In narrow spaces, motion planning based on the traditional hierarchical
autonomous system can cause collisions due to mapping, localization, and
control noise, and it fails outright when no map is available. To tackle these
problems, we leverage deep reinforcement learning, which has proven effective
for autonomous decision-making, to self-explore in narrow spaces without a
map while avoiding collisions. Specifically, based on our Ackermann-steering
rectangular-shaped ZebraT robot and its Gazebo simulator, we propose the
rectangular safety region to represent states and detect collisions for
rectangular-shaped robots, and a carefully crafted reward function for
reinforcement learning that does not require the destination information. Then
we benchmark five reinforcement learning algorithms including DDPG, DQN, SAC,
PPO, and PPO-discrete, in a simulated narrow track. After training, the
best-performing DDPG and DQN models can be transferred to three brand-new
simulated tracks, and furthermore to three real-world tracks.
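The rectangular safety region idea can be illustrated with a simple point-in-rectangle test over laser returns; the robot dimensions and scan layout below are assumptions for illustration, not the ZebraT robot's actual geometry:

```python
import math

def in_rectangular_safety_region(scan, half_length=0.5, half_width=0.3,
                                 angle_min=-math.pi, angle_increment=None):
    """Check whether any 2D laser return falls inside a rectangle centered
    on a rectangular-shaped robot. Each range reading is converted to a
    point in the robot frame and tested against the rectangle's bounds.
    Dimensions and scan parameters are hypothetical."""
    if angle_increment is None:
        angle_increment = 2 * math.pi / len(scan)
    for i, r in enumerate(scan):
        theta = angle_min + i * angle_increment
        x, y = r * math.cos(theta), r * math.sin(theta)
        if abs(x) <= half_length and abs(y) <= half_width:
            return True  # a return inside the safety rectangle: collision risk
    return False
```

A circular safety radius would over-approximate a long, narrow robot; the rectangle test matches the chassis footprint more tightly, which matters in narrow tracks.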
Deep Reinforcement Learning-Based Mapless Crowd Navigation with Perceived Risk of the Moving Crowd for Mobile Robots
Current state-of-the-art crowd navigation approaches are mainly deep
reinforcement learning (DRL)-based. However, DRL-based methods suffer from the
issues of generalization and scalability. To overcome these challenges, we
propose a method that includes a Collision Probability (CP) in the observation
space to give the robot a sense of the level of danger of the moving crowd to
help the robot navigate safely through crowds with unseen behaviors. We studied
the effect of varying the number of moving obstacles the robot attends to during
navigation. During training, we generated local waypoints to increase the
reward density and improve the learning efficiency of the system. Our approach
was developed using deep reinforcement learning (DRL) and trained using the
Gazebo simulator in a non-cooperative crowd environment with obstacles moving
at randomized speeds and directions. We then evaluated our model on four
different crowd-behavior scenarios. The results show that our method achieved a
100% success rate in all test settings. We compared our approach with a current
state-of-the-art DRL-based approach, and our approach performed
significantly better, especially in terms of social safety. Importantly, our
method can navigate in different crowd behaviors and requires no fine-tuning
after being trained once. We further demonstrated the crowd navigation
capability of our model in real-world tests.
Comment: 6 pages, 7 figures
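One plausible way to compute a Collision Probability (CP) feature for the observation space is a closest-approach estimate from the relative state of each obstacle; the formulation, radii, and horizon below are guesses for illustration, not the paper's definition:

```python
import math

def collision_probability(rel_pos, rel_vel, robot_radius=0.3,
                          obstacle_radius=0.3, horizon=3.0):
    """Hypothetical CP feature: map the predicted closest-approach distance
    between robot and a moving obstacle (within a short horizon) to [0, 1].
    rel_pos/rel_vel are the obstacle's position and velocity relative to
    the robot."""
    px, py = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq < 1e-9:
        dist = math.hypot(px, py)
    else:
        # time of closest approach, clipped to [0, horizon]
        t = max(0.0, min(-(px * vx + py * vy) / speed_sq, horizon))
        dist = math.hypot(px + vx * t, py + vy * t)
    safe = robot_radius + obstacle_radius
    return max(0.0, min(1.0, (2 * safe - dist) / safe))
```

Appending one such scalar per tracked obstacle to the observation gives the policy an explicit danger signal without requiring it to infer collision risk from raw positions alone.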
Enhancing Exploration and Safety in Deep Reinforcement Learning
A Deep Reinforcement Learning (DRL) agent tries to learn a policy maximizing a long-term objective by trial and error in large state spaces. However, this learning paradigm requires a non-trivial number of interactions with the environment to achieve good performance. Moreover, critical applications, such as robotics, typically involve safety criteria to consider while designing novel DRL solutions. Hence, devising safe learning approaches with efficient exploration is crucial to avoid getting stuck in local optima, failing to learn properly, or causing damage to the surrounding environment. This thesis focuses on developing Deep Reinforcement Learning algorithms to foster efficient exploration and safer behaviors in simulation and real domains of interest, ranging from robotics to multi-agent systems. To this end, we rely both on standard benchmarks, such as SafetyGym, and robotic tasks widely adopted in the literature (e.g., manipulation, navigation). This variety of problems is crucial for assessing the statistical significance of our empirical studies and the generalization skills of our approaches. We initially benchmark the sample efficiency versus performance trade-off between value-based and policy-gradient algorithms. This part highlights the benefits of using non-standard simulation environments (i.e., Unity), which also facilitates the development of further optimizations for DRL. We also discuss the limitations of standard evaluation metrics (e.g., return) in characterizing the actual behaviors of a policy, proposing the use of Formal Verification (FV) as a practical methodology to evaluate behaviors over desired specifications. The second part introduces Evolutionary Algorithms (EAs) as a gradient-free complementary optimization strategy. In detail, we combine population-based and gradient-based DRL to diversify exploration and improve performance in both single- and multi-agent applications.
For the latter, we discuss how prior Multi-Agent (Deep) Reinforcement Learning (MARL) approaches hinder exploration, and propose an architecture that favors cooperation without affecting exploration.
Goal-Guided Transformer-Enabled Reinforcement Learning for Efficient Autonomous Navigation
Despite some successful applications of goal-driven navigation, existing deep
reinforcement learning (DRL)-based approaches notoriously suffer from poor
data efficiency. One of the reasons is that the goal information is
decoupled from the perception module and directly introduced as a condition of
decision-making, so the goal-irrelevant features of the scene
representation play an adversarial role during the learning process. In light
of this, we present a novel Goal-guided Transformer-enabled reinforcement
learning (GTRL) approach by considering the physical goal states as an input of
the scene encoder for guiding the scene representation to couple with the goal
information and realizing efficient autonomous navigation. More specifically,
we propose a novel variant of the Vision Transformer as the backbone of the
perception system, namely Goal-guided Transformer (GoT), and pre-train it with
expert priors to boost the data efficiency. Subsequently, a reinforcement
learning algorithm is instantiated for the decision-making system, taking the
goal-oriented scene representation from the GoT as the input and generating
decision commands. As a result, our approach motivates the scene representation
to concentrate mainly on goal-relevant features, which substantially enhances
the data efficiency of the DRL learning process, leading to superior navigation
performance. Both simulation and real-world experimental results manifest the
superiority of our approach in terms of data efficiency, performance,
robustness, and sim-to-real generalization, compared with other
state-of-the-art (SOTA) baselines. The demonstration video
(https://www.youtube.com/watch?v=aqJCHcsj4w0) and the source code
(https://github.com/OscarHuangWind/DRL-Transformer-SimtoReal-Navigation) are
also provided.
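As a toy illustration of conditioning a scene encoder on the physical goal state, a goal token can attend over visual patch tokens so that the pooled representation is weighted toward goal-relevant features. The shapes and weight matrices here are illustrative stand-ins, not the actual GoT architecture:

```python
import numpy as np

def goal_guided_encode(patch_tokens, goal_state, w_g, w_attn):
    """Toy goal-guided pooling: project the physical goal state (e.g.
    relative distance and heading) to a token, score each visual patch
    token against it, and return the softmax-weighted sum of patches.
    patch_tokens: (n, d) array; goal_state: (g,); w_g: (g, d); w_attn: (d, d)."""
    goal_token = goal_state @ w_g                 # (d,) goal embedding
    scores = patch_tokens @ w_attn @ goal_token   # (n,) patch-goal affinity
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                      # softmax over patches
    return weights @ patch_tokens                 # goal-weighted pooled feature
```

The point of the construction is that patches uncorrelated with the goal embedding receive low attention weight, so the downstream policy sees a representation already biased toward goal-relevant content, which is the intuition the abstract attributes to GTRL.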