    Reinforcement Learning and Planning for Preference Balancing Tasks

    Robots are often highly non-linear dynamical systems with many degrees of freedom, making solving motion problems computationally challenging. One solution has been reinforcement learning (RL), which learns through experimentation to automatically perform the near-optimal motions that complete a task. However, high-dimensional problems and task formulation often prove challenging for RL. We address these problems with PrEference Appraisal Reinforcement Learning (PEARL), which solves Preference Balancing Tasks (PBTs). PBTs define a problem as a set of preferences that the system must balance to achieve a goal. The method is appropriate for acceleration-controlled systems with continuous state-space and either discrete or continuous action spaces with unknown system dynamics. We show that PEARL learns a sub-optimal policy on a subset of states and actions, and transfers the policy to the expanded domain to produce a more refined plan on a class of robotic problems. We establish convergence to task goal conditions, and even when preconditions are not verifiable, show that this is a valuable method to use before other more expensive approaches. Evaluation is done on several robotic problems, such as Aerial Cargo Delivery, Multi-Agent Pursuit, Rendezvous, and Inverted Flying Pendulum, both in simulation and experimentally. Additionally, PEARL is leveraged outside of robotics as an array sorting agent. The results demonstrate high accuracy and fast learning times on a large set of practical applications.

    Social-aware drone navigation using social force model

    Robot navigation is one of the hardest challenges in robotics, because real environments contain highly dynamic objects moving in all directions. The main goal is to navigate safely within the environment, avoiding obstacles and reaching the final goal. With recent advances in technology, robots can be seen almost everywhere, which leads us to think about their future role and where we will find them. It is no exaggeration to say that flying and land-based robots are going to live together with people, interacting in our houses, streets and shopping centers. Moreover, we will notice their presence as they are gradually integrated into human societies, performing more and more human tasks that were unthinkable in past years. Therefore, if we think about robots moving or flying around us, we must consider safety, the distance the robot should keep to make humans feel comfortable, and the different reactions people may have. The main goal of this work is to accompany people using a flying robot. The term social navigation describes the path to follow in a social environment. Robots must be able to navigate among humans, giving a sense of security to those walking close to them. In this work, we present the Social Force Model, which states that social interaction between persons and objects is inspired by the fluid dynamics defined by Newton’s equations, and we introduce an extended version that complements the initial method with a human-robot interaction force. In robotics, tools that support development and implementation are crucial. Fast advances in technology give the international community access to cheaper and more compact hardware and software than a decade ago. 
It is becoming increasingly common to have access to more powerful technology that helps us run complex algorithms, and because of that we can run bigger systems in reduced space, making robots more intelligent, more compact and more robust against failures. Our case was no exception: in the next chapters we present the procedure we followed to implement the approaches, supported by different simulation tools and software. Because of the nature of the problem, we used the Robot Operating System (ROS) along with Gazebo, which gave us a good outlook on how the code would work in real-life experiments. In this work, both real and simulated experiments are presented, showing the interaction governed by the 3D Aerial Social Force Model between humans, objects and, in this case, the AR.Drone, a flying drone owned by the Instituto de Robótica e Informática Industrial. We focus on making the drone's navigation more socially acceptable to the humans around it. The main purpose of the drone is to accompany a person, whom we call the "main" person in this work, by navigating side-by-side with them; its behavior is dictated by forces exerted by the environment, and it also tries to remain as socially acceptable as possible to the other humans nearby. We also present a comparison between the 3D Aerial Social Force Model and the Artificial Potential Fields method, a well-known and widely used method in robot navigation. We present both methods and describe the forces each one involves. Along with these two models, there is another important topic to introduce. As noted, the robot must be able to accompany a pedestrian on their way; since the robot does not know the final destination of the human it accompanies, forecasting capacity is an important feature. It is essential to give the robot the ability to predict human movements. 
In this work, we used the differences between past position values to measure how the position changes over time. This gives us an accurate idea of how the human will behave or which direction they will take next. Furthermore, we present a human motion prediction model based on linear regression. The motivation for building a regression model was the simplicity of its implementation, its robustness and its very accurate results. The previous positions of the main human are used to forecast the human's position over the next few seconds. The purpose is to let the drone know the direction the human is taking, so that it can move forward beside the human as if accompanying them. The weights of the linear regression model were optimized by gradient descent, also implementing the RMSprop variant to reach convergence faster. The strategy followed to build the prediction model is explained in detail later in this work. The presence of social robots has grown in recent years. Many researchers have contributed, and many techniques are being used to give robots the capacity to interact safely and effectively with people. It is a hot topic that has matured considerably, but there is still much research to be done.
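
The abstract above describes forecasting a person's next position with linear regression whose weights are found by gradient descent with RMSprop. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a 2D position, uses elapsed time as the only regression feature, and fits each coordinate independently; the function names `fit_line_rmsprop` and `predict_position` and all hyperparameter values are hypothetical choices made for illustration.

```python
import math


def fit_line_rmsprop(ts, xs, lr=0.002, decay=0.9, eps=1e-8, steps=5000):
    """Fit x ~ w * t + b by minimising mean squared error with RMSprop.

    All hyperparameter defaults are illustrative assumptions.
    """
    w, b = 0.0, 0.0
    sw, sb = 0.0, 0.0  # running averages of the squared gradients
    n = len(ts)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        gw = sum(2.0 * (w * t + b - x) * t for t, x in zip(ts, xs)) / n
        gb = sum(2.0 * (w * t + b - x) for t, x in zip(ts, xs)) / n
        # RMSprop: scale each step by the root of the running squared gradient.
        sw = decay * sw + (1.0 - decay) * gw * gw
        sb = decay * sb + (1.0 - decay) * gb * gb
        w -= lr * gw / (math.sqrt(sw) + eps)
        b -= lr * gb / (math.sqrt(sb) + eps)
    return w, b


def predict_position(history, horizon):
    """Extrapolate the (x, y) position `horizon` seconds after the last sample.

    `history` is a list of (time, x, y) tuples ordered by time.
    """
    t0 = history[0][0]
    ts = [t - t0 for t, _, _ in history]  # shift so the first sample is t = 0
    wx, bx = fit_line_rmsprop(ts, [x for _, x, _ in history])
    wy, by = fit_line_rmsprop(ts, [y for _, _, y in history])
    t_future = ts[-1] + horizon
    return wx * t_future + bx, wy * t_future + by
```

Calling `predict_position(history, horizon=1.0)` with a few recent `(time, x, y)` samples returns an estimated position one second ahead. Because RMSprop with a constant learning rate keeps taking steps of roughly `lr` near the optimum, the fitted weights are approximate; a smaller or decaying learning rate would tighten the fit at the cost of more iterations.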

    SoRTS: Learned Tree Search for Long Horizon Social Robot Navigation

    The fast-growing demand for fully autonomous robots in shared spaces calls for the development of trustworthy agents that can safely and seamlessly navigate in crowded environments. Recent models for motion prediction show promise in characterizing social interactions in such environments. Still, adapting them for navigation is challenging as they often suffer from generalization failures. Prompted by this, we propose Social Robot Tree Search (SoRTS), an algorithm for safe robot navigation in social domains. SoRTS aims to augment existing socially aware motion prediction models for long-horizon navigation using Monte Carlo Tree Search. We use social navigation in general aviation as a case study to evaluate our approach and further the research in full-scale aerial autonomy. In doing so, we introduce XPlaneROS, a high-fidelity aerial simulator that enables human-robot interaction. We use XPlaneROS to conduct a first-of-its-kind user study where 26 FAA-certified pilots interact with a human pilot, our algorithm, and its ablation. Our results, supported by statistical evidence, show that SoRTS exhibits a comparable performance to competent human pilots, significantly outperforming its ablation. Finally, we complement these results with a broad set of self-play experiments to showcase our algorithm's performance in scenarios with increasing complexity. Comment: arXiv admin note: substantial text overlap with arXiv:2304.0142