Multiagent Flight Control in Dynamic Environments with Cooperative Coevolutionary Algorithms
Dynamic flight environments, in which objectives and environmental features change over time, pose a difficult problem for planning optimal flight paths. Path planning methods are typically computationally expensive and are often difficult to run in real time when system objectives change. This computational problem is compounded when multiple agents are present in the system, as the state and action space grows exponentially. In this work, we use cooperative coevolutionary algorithms to develop policies that control agent motion in a dynamic multiagent unmanned aerial system environment, in which goals and perceptions change, while ensuring safety constraints are not violated. Rather than replanning paths when the environment changes, we develop a policy that maps the new environmental features to a trajectory for the agent, ensuring safe and reliable operation while providing 92% of the theoretically optimal performance.
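The cooperative coevolutionary loop described above can be sketched on a toy problem. Here each of two populations evolves one agent's parameter, and an individual's fitness is the team score it earns alongside the other population's current representative. All names, hyperparameters, and the toy objective are illustrative, not taken from the paper.

```python
import random

def team_fitness(x, y):
    # Toy team objective (illustrative): best when the two agents' parameters sum to 10
    return -abs(x + y - 10.0)

def ccea(generations=300, pop_size=20, seed=0):
    rng = random.Random(seed)
    pops = [[rng.uniform(-5.0, 5.0) for _ in range(pop_size)] for _ in range(2)]
    reps = [pops[0][0], pops[1][0]]  # current representative from each population
    for _ in range(generations):
        for i in (0, 1):
            other = reps[1 - i]
            # An individual's fitness is the team score with the other population's representative
            scored = sorted(pops[i], key=lambda v: team_fitness(v, other), reverse=True)
            elite = scored[: pop_size // 2]
            # Refill the population with mutated copies of the elites
            pops[i] = elite + [v + rng.gauss(0.0, 0.3) for v in elite]
            reps[i] = scored[0]
    return reps

x, y = ccea()
print(round(x + y, 2))  # close to 10
```

Because each population is evaluated only in the context of teammates, the co-adapted pair converges on the joint objective without any individual ever seeing the full search space.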
Super Ball Bot - Structures for Planetary Landing and Exploration
Small, lightweight and low-cost missions will become increasingly important to NASA's exploration goals for our solar system. Ideally, teams of dozens or even hundreds of small, collapsible robots, weighing only a few kilograms apiece, would be conveniently packed during launch and would reliably separate and unpack at their destination. Such teams will allow rapid, reliable in-situ exploration of hazardous destinations such as Titan, where imprecise terrain knowledge and unstable precipitation cycles make single-robot exploration problematic. Unfortunately, landing many lightweight robots is difficult with conventional technology. Current robot designs are delicate, requiring combinations of devices such as parachutes, retrorockets and impact balloons to minimize impact forces and to place a robot in a proper orientation. Instead, we propose to develop a radically different robot based on a "tensegrity" structure built purely upon tensile and compressive elements. Such robots can be lightweight, absorb strong impacts, are redundant against single-point failures, can recover from different landing orientations, and are easy to collapse and expand. We believe tensegrity robot technology can play a critical role in future planetary exploration.
Adaptive Multiagent Traffic Management for Autonomous Robotic Systems
There is growing commercial interest in the use of unmanned aerial vehicles (UAVs) in urban environments, specifically for package delivery applications. However, the size, complexity and sheer number of expected UAVs make conventional air traffic management, which relies on human air traffic controllers, infeasible. To enable UAVs to safely and efficiently operate in congested environments, it is essential to develop autonomous UAV management strategies.
We introduce a dynamic hierarchical traffic control model that reacts to traffic conditions instantaneously to reduce congestion in the airspace. An obstacle-filled airspace lends itself to modeling as a graph structure similar to a road network. We introduce controller agents, which set costs across the airspace. These agents control traffic much as adaptive metering lights do in highway traffic. UAVs then plan their paths based on the costs (e.g. conflicts or delays) they see for traversing particular parts of the airspace. This provides a decentralized method for reducing traffic in an airspace.
Our hierarchical structure allows us to separate the traffic reduction problem from the individual robot navigation problem. Each robot does not explicitly coordinate with others in the airspace. Instead, robots execute their own individual internal cost-based planner to travel between locations. We then use neuro-evolution to provide incentives to these cost-based planners to reduce traffic in the environment.
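The individual cost-based planner can be sketched as a shortest-path search over the airspace graph, where each edge cost is whatever value the controller agents have set. The graph, costs, and function names below are illustrative, not the dissertation's implementation.

```python
import heapq

def cheapest_path(graph, edge_costs, start, goal):
    """Dijkstra's search where edge costs are the congestion-dependent
    values set by controller agents (illustrative sketch)."""
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt in graph.get(node, ()):
            if nxt not in visited:
                heapq.heappush(frontier, (cost + edge_costs[(node, nxt)], nxt, path + [nxt]))
    return float("inf"), []

# Tiny airspace: two routes from A to D. A controller agent has raised the
# cost of the congested edge B->D, so planners divert via C.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
edge_costs = {("A", "B"): 1.0, ("A", "C"): 1.0, ("B", "D"): 5.0, ("C", "D"): 1.0}
cost, path = cheapest_path(graph, edge_costs, "A", "D")
print(path, cost)  # ['A', 'C', 'D'] 2.0
```

The point of the decentralization is visible here: the controller agents never command a UAV directly, they only reshape the costs that each UAV's own planner consumes.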
Traffic quality can be expressed in several different ways. We first evaluate our traffic reduction policies in terms of `conflicts', which characterize situations where an aircraft comes too close to another in physical space. We then examine traffic in terms of the amount of `delay' that all agents incur, which assumes that there is a structure ensuring that only a safe number of UAVs occupy the same area. Finally, we look at the total travel time that a UAV can expect to take from the moment it enters the airspace until the time it reaches its destination.
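As a rough illustration of the `conflict' metric, a pairwise separation check over UAV positions might look like the following; the actual conflict definition used in the simulation may differ.

```python
def count_conflicts(positions, safe_dist):
    """Count pairs of UAVs closer than a safety separation
    (a simple 2-D Euclidean proxy for the conflict metric)."""
    conflicts = 0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            if (dx * dx + dy * dy) ** 0.5 < safe_dist:
                conflicts += 1
    return conflicts

# Illustrative snapshot: two UAVs 0.5 units apart, one far away.
uavs = [(0.0, 0.0), (0.5, 0.0), (10.0, 10.0)]
print(count_conflicts(uavs, safe_dist=1.0))  # 1
```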
To facilitate an exploration of the UTM problem without waiting for a full simulation of UAVs running with A*, we develop an abstraction of the UTM domain that preserves the core UTM problem. We then investigate performance under differing levels of traffic, as well as two different agent structures. Our results show similar performance for both agent definitions, with delay reduction of up to 68% in high-traffic cases.
With a fast version of the UTM problem, we explore the effect of redefining the control structure so that links, or edges of the UTM graph, set costs individually. This shifts the control paradigm toward controlling directional travel rather than areas in the space, as was the case with the sector agents used in previous approaches. Due to our graph structure, there are far more control elements in the link agent approach than in the sector agent approach. We identify a tradeoff: link agents give finer control, but the coordination problem for sector agents is easier because there are fewer of them. This indicates that we can improve performance with a more distributed link-based setup if we address the challenges of multiagent coordination. However, the UAV traffic management domain presents a uniquely difficult coordination problem: each agent's action can affect the perceived value of every other agent's actions. This introduces a great deal of noise into the system, as another agent's action can strongly affect the reward an agent receives.
We reduce this multiagent noise by reducing the number of agents that are allowed to learn. Some agents have a greater ability to influence traffic, based on the topology and traffic profile of the graph; we call this metric impactfulness. We use it to improve learning by removing less impactful agents from the learning process, yielding a more stationary system in which the impactful agents can learn.
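A minimal sketch of this idea, assuming traffic volume through an agent's control element as a stand-in for the impactfulness metric (the dissertation's actual metric may differ):

```python
def select_learning_agents(traffic_through_agent, k):
    """Rank agents by a simple impact proxy (traffic volume through the
    element each agent controls) and let only the top k learn; the
    remaining agents keep fixed policies, making the system more
    stationary for the learners."""
    ranked = sorted(traffic_through_agent, key=traffic_through_agent.get, reverse=True)
    return set(ranked[:k])

# Hypothetical link agents and their observed traffic volumes.
traffic = {"link_1": 120, "link_2": 5, "link_3": 80, "link_4": 2}
learners = select_learning_agents(traffic, k=2)
print(sorted(learners))  # ['link_1', 'link_3']
```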
The contributions of this work are to:
- Introduce a cost-based traffic management approach that is platform-agnostic and fast to implement.
- Develop a multiagent approach to setting costs in this traffic management system that is adaptive to traffic conditions and learns long-term effects of management decisions.
- Create an abstraction of UAV traffic that captures key physical attributes, creating a fast and flexible simulation method.
- Quantify agent contributions to system performance by experimenting with single-agent learning, single-agent exclusion, and a sliding number of learning agents in the system.

Keywords: Planning, UAV, Multiagent
Human–machine network through bio‑inspired decentralized swarm intelligence and heterogeneous teaming in SAR operations
Disaster management has always been a struggle due to unpredictably changing conditions and chaotic occurrences that require real-time adaptation. Highly optimized missions and robust systems mitigate the effects of uncertainty and markedly improve success rates. This paper presents a hybrid human–machine system that combines the fast responsiveness of UAVs with two robust, decentralized, and scalable bio-inspired techniques. The Cloud-Sharing Network (CSN) and the Pseudo-Central Network (PCN), based on bacterial and honeybee behaviors respectively, are presented and applied to Search and Rescue (SAR) operations. A post-earthquake scenario is proposed, in which a heterogeneous fleet of UAVs cooperates with human rescue teams to detect and locate victims distributed across the map. Monte Carlo simulations are carried out to test both approaches against state-of-the-art metrics. The two hybrid, bio-inspired schemes deal with critical scouting stages, poor communications environments and high uncertainty levels in disaster relief operations. Role heterogeneity, path optimization and a hive data-sharing structure give PCN efficient performance as far as task allocation and communications are concerned. The Cloud-Sharing Network gains strength when the number of allocated agents per victim and per square meter is high, allowing fast data transmission. Potential applications of these algorithms include not only SAR, but also surveillance, geophysical mapping, security and planetary exploration.
Theoretical and implementation improvements for difference evaluation functions
Multiagent learning with cooperative coevolutionary algorithms is a critical area of research, and is relevant to many real-world applications including air traffic control, distributed sensor network control, and game-theoretic applications such as border patrol. A key difficulty in multiagent learning is the credit assignment problem, where the impact of each individual agent on the overall system performance must be ascertained. Difference evaluation functions aim to solve this credit assignment problem by approximating the effect that each agent has on the system evaluation function. Difference evaluations have been shown to produce superior learned policies in many multiagent settings.
Although difference evaluations have produced excellent empirical results, three key research questions must still be addressed regarding their usefulness in real-world systems: their performance, their theoretical advantages, and the methodology for their implementation must all be examined to demonstrate that difference evaluations are practical for real-world multiagent learning. These questions are addressed in this dissertation. The first contribution of this dissertation is to demonstrate that difference evaluations may be extended and combined with other coordination mechanisms, resulting in superior learned performance. The second contribution is to derive conditions under which difference evaluations are guaranteed to outperform traditional coordination mechanisms. The third and final contribution is to demonstrate that difference evaluations may be approximated using only local knowledge, allowing for their implementation in any generic multiagent learning setting. By addressing the performance, theoretical foundation, and implementation concerns of difference evaluations, this dissertation provides a detailed analysis demonstrating the usefulness of difference evaluation functions in multiagent learning systems.
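The difference evaluation described above is commonly written D_i(z) = G(z) - G(z_{-i}), where G is the system evaluation function and z_{-i} is the joint state with agent i's contribution replaced by a counterfactual. A minimal sketch, using removal as the counterfactual and a toy coverage objective (both illustrative choices):

```python
def difference_evaluation(global_eval, joint_action, i):
    """D_i(z) = G(z) - G(z_{-i}): the system evaluation minus its value
    with agent i removed. Removal is one common counterfactual; the
    dissertation also considers approximations from local knowledge."""
    z_minus_i = joint_action[:i] + joint_action[i + 1:]
    return global_eval(joint_action) - global_eval(z_minus_i)

# Toy system evaluation G: number of distinct points of interest observed.
def coverage(actions):
    return len(set(actions))

team = ["poi_a", "poi_b", "poi_a"]
print(difference_evaluation(coverage, team, 0))  # 0: agent 2 also observes poi_a
print(difference_evaluation(coverage, team, 1))  # 1: only agent 1 observes poi_b
```

The appeal is visible even in this toy: agent 0 receives zero credit because its observation is redundant, so the signal each agent gets is aligned with, but far less noisy than, the global evaluation.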
A Neuro-evolutionary Approach to Control Surface Segmentation for Micro Aerial Vehicles
This paper addresses control surface segmentation in micro aerial vehicles (MAVs) by leveraging neuro-evolutionary techniques that allow the control of a greater number of control surfaces. Applying classical control methods to MAVs is difficult because of fast, highly non-linear dynamics and the resulting complexity of the control laws. These methods are mostly based on models that are difficult to obtain for dynamic and stochastic environments. Moreover, these problems are exacerbated as the number of control surfaces increases and the model's accuracy in determining the impact of each control surface decreases. Instead, we focus on neuro-evolutionary techniques, which have been successfully applied in many domains with limited models and highly non-linear dynamics. Wind tunnel simulations with AVL show that MAV performance is improved in terms of both reduced deflection angles and reduced drag (up to 5%) over a simplified model in two sets of experiments with different objective functions. We also show robustness to actuator failure, with desired roll moment values still attained through the neuro-controller when actuators in the system fail.

Keywords: Evolutionary algorithms, Micro Aerial Vehicles, Neural Network
Computational intelligence approaches to robotics, automation, and control [Volume guest editors]
No abstract available
Tackling Credit Assignment Using Memory and Multilevel Optimization for Multiagent Reinforcement Learning
There is growing commercial interest in the use of multiagent systems in real world applications. Some examples include inventory management in warehouses, smart homes, planetary exploration, search and rescue, air-traffic management and autonomous transportation systems. However, multiagent coordination is an extremely challenging problem. First, information relevant for coordination is often distributed across the team members, and fragmented amongst each agent's observation histories (past states). Second, the coordination objective is often sparse and noisy from the perspective of an agent. Designing general mechanisms for generating agent-specific reward functions that incentivize an agent to collaborate towards the shared global objective is extremely difficult. From a learning perspective, both difficulties can be linked to the difficulty of credit assignment - the process of accurately associating rewards with actions.
The primary contribution of this dissertation is to tackle credit assignment in multiagent systems in order to enable better multiagent coordination. First we leverage memory as a tool in enabling better credit assignment by facilitating associations between rewards and actions separated across time. We achieve this by introducing Modular Memory Units (MMU), a memory-augmented neural architecture that can reliably retain and propagate information over an extended period of time. We then use MMU to augment individual agents' policies in solving dynamic tasks that require adaptive behavior from a distributed multiagent team. We also introduce Distributed MMU (DMMU) which uses memory as a shared knowledge base across a team of distributed agents to enable distributed one-shot decision making.
Switching our attention from the agent to the learning algorithm, we then introduce Evolutionary Reinforcement Learning (ERL), a multilevel optimization framework that blends the strengths of policy gradients and evolutionary algorithms to improve learning. We further extend the ERL framework to introduce Collaborative ERL (CERL), which employs a collection (portfolio) of policy gradient learners, each optimizing over a different resolution of the same underlying task. This leads to a diverse set of policies able to reach diverse regions of the solution space. Results on a range of continuous control benchmarks demonstrate that ERL and CERL significantly outperform their composite learners while remaining more sample-efficient overall.
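A minimal ERL-style loop can be sketched as follows: an evolutionary population optimizes the episodic fitness directly, while a separate learner improves via a gradient signal (approximated here by finite differences on a toy objective, standing in for policy gradients on dense rewards) and is periodically injected into the population. All hyperparameters and the toy fitness are illustrative, not the dissertation's.

```python
import random

def erl(fitness, dim=3, pop_size=10, generations=100, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    learner = [0.0] * dim
    for gen in range(generations):
        # "Gradient" learner step: finite-difference ascent on the fitness.
        eps, lr = 0.05, 0.1
        grad = []
        for j in range(dim):
            up, dn = learner[:], learner[:]
            up[j] += eps
            dn[j] -= eps
            grad.append((fitness(up) - fitness(dn)) / (2 * eps))
        learner = [w + lr * g for w, g in zip(learner, grad)]
        # Evolutionary step: keep the elite half, refill with mutated copies.
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]
        pop = elite + [[w + rng.gauss(0.0, 0.1) for w in p] for p in elite]
        # Periodic injection of the learner into the population.
        if gen % 10 == 0:
            pop[-1] = learner[:]
    return max(pop, key=fitness)

# Toy fitness: peak at the all-ones parameter vector.
best = erl(lambda p: -sum((w - 1.0) ** 2 for w in p))
print([round(w, 1) for w in best])
```

The injection step is what makes the optimization multilevel: gradient progress feeds the population, while selection keeps only injected policies that actually improve the episodic objective.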
Finally, we introduce Multiagent ERL (MERL), a hybrid algorithm that leverages the multilevel optimization framework of ERL to enable improved multiagent coordination without requiring explicit alignment between local and global reward functions. MERL uses fast, policy-gradient based learning for each agent by utilizing their dense local rewards. Concurrently, evolution is used to recruit agents into a team by directly optimizing the sparser global objective. Experiments in multiagent coordination benchmarks demonstrate that MERL's integrated approach significantly outperforms state-of-the-art multiagent policy-gradient algorithms.
Modeling Team Performance For Coordination Configurations Of Large Multi-Agent Teams Using Stochastic Neural Networks
Coordinating large numbers of agents to perform complex tasks in complex domains is a rapidly progressing area of research. Because of the high complexity of the problem, approximate and heuristic algorithms are typically used for key coordination tasks. Such algorithms usually require tuning algorithm parameters to yield the best performance under particular circumstances, and manually tuning parameters can be difficult. In domains where characteristics of the environment vary dramatically from scenario to scenario, it is desirable to have automated techniques for appropriately configuring the coordination. This research presents an approach to online reconfiguration of heuristic coordination algorithms. The approach uses an abstract simulation to produce a large performance data set, which trains a stochastic neural network that concisely models the complex, probabilistic relationship between configurations, environments and performance metrics. The final stochastic neural network, referred to as the team performance model, is then used as the core of a tool that allows rapid online or offline configuration of coordination algorithms for particular scenarios and user preferences. The overall system allows rapid adaptation of coordination, leading to better performance in new scenarios. Results show that the team performance model captured key features of a very large configuration space and largely captured the uncertainty in performance well. The tool was often capable of reconfiguring the algorithms to meet user requests for increases or decreases in performance parameters. This work represents the first practical approach to quickly reconfiguring a complex set of algorithms for a specific scenario.