Reinforcement Learning and Planning for Preference Balancing Tasks
Robots are often highly non-linear dynamical systems with many degrees of freedom, making solving motion problems computationally challenging. One solution has been reinforcement learning (RL), which learns through experimentation to automatically perform the near-optimal motions that complete a task. However, high-dimensional problems and task formulation often prove challenging for RL. We address these problems with PrEference Appraisal Reinforcement Learning (PEARL), which solves Preference Balancing Tasks (PBTs). PBTs define a problem as a set of preferences that the system must balance to achieve a goal. The method is appropriate for acceleration-controlled systems with continuous state-space and either discrete or continuous action spaces with unknown system dynamics. We show that PEARL learns a sub-optimal policy on a subset of states and actions, and transfers the policy to the expanded domain to produce a more refined plan on a class of robotic problems. We establish convergence to task goal conditions, and even when preconditions are not verifiable, show that this is a valuable method to use before other more expensive approaches. Evaluation is done on several robotic problems, such as Aerial Cargo Delivery, Multi-Agent Pursuit, Rendezvous, and Inverted Flying Pendulum both in simulation and experimentally. Additionally, PEARL is leveraged outside of robotics as an array sorting agent. The results demonstrate high accuracy and fast learning times on a large set of practical applications.
Social-aware drone navigation using social force model
Robot navigation is one of the hardest challenges in robotics, because
real environments contain highly dynamic objects moving in all directions.
The ideal goal is to navigate safely through the environment,
avoiding obstacles and reaching the final goal. With
recent advances in technology, robots can be seen almost everywhere,
which leads us to think about their role in the future
and where we will find them. It is no exaggeration to say that
flying and land-based robots are going to live together with people,
interacting in our houses, streets and shopping centers. Moreover, we will
notice their presence gradually inserted into our societies, each time
performing more human tasks that were unthinkable in past years.
Therefore, if we think about robots moving or flying around us, we must
consider safety, the distance the robot should keep to make humans feel
comfortable, and the different reactions people may have. The main goal
of this work is to accompany people using a flying robot. The term
social navigation describes the approach to follow in a social environment:
robots must be able to navigate among humans, giving a sense
of security to those walking close to them. In this work, we present
a model called the Social Force Model, which describes the social interaction
between persons and objects as forces, inspired by the fluid dynamics
defined by Newton’s equations. We also introduce the extended version,
which complements the initial method with the human-robot interaction
force.
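As a rough illustration of how a social-force style model combines forces, the sketch below sums an attractive force toward a goal with exponential repulsive forces from nearby people or objects. This is not the formulation used in the thesis: the function names, the exponential repulsion profile and all parameter values (`desired_speed`, `relaxation`, `strength`, `falloff`, `radius`) are assumptions chosen for illustration.

```python
import math

def goal_force(pos, vel, goal, desired_speed=1.0, relaxation=0.5):
    """Attractive force steering the current velocity toward the goal.

    All parameter values are illustrative assumptions.
    """
    delta = [g - p for g, p in zip(goal, pos)]
    dist = math.sqrt(sum(d * d for d in delta))
    if dist == 0.0:
        desired = [0.0] * len(pos)
    else:
        desired = [desired_speed * d / dist for d in delta]
    # Relax the current velocity toward the desired velocity.
    return [(dv - v) / relaxation for dv, v in zip(desired, vel)]

def repulsive_force(pos, other, strength=2.0, falloff=0.5, radius=0.3):
    """Repulsion pushing `pos` away from `other`, decaying with distance."""
    diff = [p - o for p, o in zip(pos, other)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        return [0.0] * len(pos)  # direction undefined when positions coincide
    magnitude = strength * math.exp((radius - dist) / falloff)
    return [magnitude * d / dist for d in diff]

def total_force(pos, vel, goal, others):
    """Sum the goal attraction and the repulsion from every other agent."""
    force = goal_force(pos, vel, goal)
    for other in others:
        rep = repulsive_force(pos, other)
        force = [f + r for f, r in zip(force, rep)]
    return force
```

Positions and velocities are plain lists, so the same functions work for 2D ground robots and 3D aerial robots. An extended model could add a separate human-robot interaction term to the sum in `total_force`.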
In the robotics field, tools that support development and
implementation are crucial. Fast advances in technology allow
the international community to access cheaper and more compact
hardware and software than a decade ago. It is becoming increasingly
common to have access to more powerful technology that helps us
run complex algorithms; as a result, we can run bigger systems
in reduced space, making robots more intelligent, more compact and more
robust against failures. Our case was no exception: in the next chapters
we present the procedure we followed to implement the approaches,
supported by different simulation tools and software. Because of the nature
of the problem we were facing, we used the Robot Operating System (ROS)
along with Gazebo, which give us a good picture of how the code
will work in real-life experiments.
In this work, both real and simulated experiments are presented, in
which we show the interaction governed by the 3D Aerial Social Force
Model between humans, objects and, in this case, the AR.Drone, a flying
drone owned by the Instituto de Robótica e Informática Industrial. We
focus on making the drone's navigation more socially acceptable to the humans
around it. The main purpose of the drone is to accompany a person,
whom we call the "main" person in this work: the drone tries to
navigate side-by-side with that person, with a behavior dictated by forces exerted
by the environment, while remaining as socially
acceptable as possible to the other humans nearby. We also present
a comparison between the 3D Aerial Social Force Model and the
Artificial Potential Fields method, a well-known and widely used method
in robot navigation. We present both methods and describe the
forces each one involves.
Along with these two models, there is another important topic to
introduce. As stated above, the robot must be able to accompany a pedestrian
along the way, and forecasting capacity is therefore an important feature,
since the robot does not know the final destination of the human it accompanies.
It is essential to give it the ability to predict human movements.
In this work, we used the differences between past position values
to determine how much the position changes over time. This gives us an accurate
idea of how the human will behave or which direction he or she will
take next.
Furthermore, we present a description of the human motion prediction
model based on linear regression. The motivation for building
a regression model was the simplicity of its implementation, its robustness
and the very accurate results of the approach. The previous positions of the
main human are used to forecast the new position of the human
over the next seconds. The main purpose is to let the
drone know the direction the human is taking, so that it can move forward beside
the human, as if accompanying him or her. The optimization
of the linear regression model, to find the right weights, was
carried out by gradient descent, also implementing the RMSprop variant in
order to reach convergence faster. The strategy followed
to build the prediction model is explained in detail later in this work.
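The idea of fitting a linear model to past positions with gradient descent and RMSprop can be sketched as follows. This is a minimal illustration, not the thesis implementation: it assumes one independent line fit per coordinate axis, and the function names, learning rate, decay factor and step count are hypothetical choices.

```python
import math

def fit_line_rmsprop(times, values, lr=0.005, rho=0.9, eps=1e-8, steps=5000):
    """Fit value = w * t + b by minimizing mean squared error with RMSprop.

    Hyperparameters are illustrative assumptions, not tuned values.
    """
    w, b = 0.0, 0.0
    sw, sb = 0.0, 0.0  # running averages of the squared gradients
    n = len(times)
    for _ in range(steps):
        residuals = [w * t + b - x for t, x in zip(times, values)]
        grad_w = 2.0 * sum(r * t for r, t in zip(residuals, times)) / n
        grad_b = 2.0 * sum(residuals) / n
        # RMSprop: scale each step by the root of the running squared gradient.
        sw = rho * sw + (1.0 - rho) * grad_w ** 2
        sb = rho * sb + (1.0 - rho) * grad_b ** 2
        w -= lr * grad_w / (math.sqrt(sw) + eps)
        b -= lr * grad_b / (math.sqrt(sb) + eps)
    return w, b

def predict_position(times, positions, t_future):
    """Extrapolate each coordinate of the tracked person to time t_future."""
    prediction = []
    for axis in range(len(positions[0])):
        values = [p[axis] for p in positions]
        w, b = fit_line_rmsprop(times, values)
        prediction.append(w * t_future + b)
    return prediction

# Usage: five past samples of a person walking along x at constant height.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
positions = [(0.5 * t + 1.0, -0.2 * t, 1.5) for t in times]
predicted = predict_position(times, positions, 5.0)
```

With a constant learning rate, RMSprop settles into a small neighborhood of the least-squares solution rather than reaching it exactly, so the prediction is approximate; a decaying learning rate would tighten it.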
The presence of social robots has grown in recent years. Many
researchers have contributed, and many techniques are being used to give
robots the capacity to interact safely and effectively with people.
It is a hot topic that has matured considerably, but much research remains to
be done.
SoRTS: Learned Tree Search for Long Horizon Social Robot Navigation
The fast-growing demand for fully autonomous robots in shared spaces calls
for the development of trustworthy agents that can safely and seamlessly
navigate in crowded environments. Recent models for motion prediction show
promise in characterizing social interactions in such environments. Still,
adapting them for navigation is challenging as they often suffer from
generalization failures. Prompted by this, we propose Social Robot Tree Search
(SoRTS), an algorithm for safe robot navigation in social domains. SoRTS aims
to augment existing socially aware motion prediction models for long-horizon
navigation using Monte Carlo Tree Search.
We use social navigation in general aviation as a case study to evaluate our
approach and further the research in full-scale aerial autonomy. In doing so,
we introduce XPlaneROS, a high-fidelity aerial simulator that enables
human-robot interaction. We use XPlaneROS to conduct a first-of-its-kind user
study where 26 FAA-certified pilots interact with a human pilot, our algorithm,
and its ablation. Our results, supported by statistical evidence, show that
SoRTS exhibits a comparable performance to competent human pilots,
significantly outperforming its ablation. Finally, we complement these results
with a broad set of self-play experiments to showcase our algorithm's
performance in scenarios with increasing complexity.
Comment: arXiv admin note: substantial text overlap with arXiv:2304.0142