266 research outputs found

    Multi-Agent Pursuit-Evasion Game Based on Organizational Architecture

    Multi-agent coordination mechanisms are frequently used in pursuit-evasion games to enable pursuers to form coalitions and to unify their individual skills to deal with the complex tasks encountered. In this paper, we propose a coalition formation algorithm based on organizational principles and applied to the pursuit-evasion problem. In order to allow the alliances of the pursuers in different pursuit groups, we have used the concepts forming an organizational modeling framework known as YAMAM (Yet Another Multi Agent Model). Specifically, we have used the concepts Agent, Role, Task, and Skill proposed in this model to develop a coalition formation algorithm that allows optimal task sharing. To control the pursuers' path planning in the environment as well as their internal development during the pursuit, we have used a reinforcement learning method (Q-learning). Computer simulations reflect the impact of the proposed techniques

    Deep Reinforcement Learning for Swarm Systems

    Recently, deep reinforcement learning (RL) methods have been applied successfully to multi-agent scenarios. Typically, these methods rely on a concatenation of agent states to represent the information content required for decentralized decision making. However, concatenation scales poorly to swarm systems with a large number of homogeneous agents as it does not exploit the fundamental properties inherent to these systems: (i) the agents in the swarm are interchangeable and (ii) the exact number of agents in the swarm is irrelevant. Therefore, we propose a new state representation for deep multi-agent RL based on mean embeddings of distributions. We treat the agents as samples of a distribution and use the empirical mean embedding as input for a decentralized policy. We define different feature spaces of the mean embedding using histograms, radial basis functions and a neural network learned end-to-end. We evaluate the representation on two well-known problems from the swarm literature (rendezvous and pursuit evasion), in a globally and locally observable setup. For the local setup we furthermore introduce simple communication protocols. Of all approaches, the mean embedding representation using neural network features enables the richest information exchange between neighboring agents, facilitating the development of more complex collective strategies. Comment: 31 pages, 12 figures, version 3 (published in JMLR Volume 20
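
The mean-embedding idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Gaussian RBF centers, the bandwidth, and the 2-D neighbor observations are all assumptions chosen for illustration. The sketch only shows why averaging per-agent features gives an input of fixed length that does not depend on the order of the observed agents.

```python
import math
from typing import List, Sequence


def rbf_features(x: Sequence[float],
                 centers: Sequence[Sequence[float]],
                 bandwidth: float = 1.0) -> List[float]:
    """Evaluate one Gaussian radial basis function per center at point x."""
    return [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                 / (2.0 * bandwidth ** 2))
        for c in centers
    ]


def mean_embedding(neighbors: Sequence[Sequence[float]],
                   centers: Sequence[Sequence[float]],
                   bandwidth: float = 1.0) -> List[float]:
    """Average the RBF features over all observed neighbors.

    The output length equals len(centers), whatever the number of neighbors.
    """
    if not neighbors:
        # Assumption: an agent that sees nobody gets an all-zero embedding.
        return [0.0] * len(centers)
    feats = [rbf_features(n, centers, bandwidth) for n in neighbors]
    return [sum(col) / len(feats) for col in zip(*feats)]


# Hypothetical RBF centers and neighbor observations (relative 2-D positions).
centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
obs_a = [(0.1, 0.2), (0.9, 0.1), (0.0, 1.1)]
obs_b = list(reversed(obs_a))  # same neighbors, different order

emb_a = mean_embedding(obs_a, centers)
emb_b = mean_embedding(obs_b, centers)
emb_doubled = mean_embedding(obs_a + obs_a, centers)  # each neighbor seen twice
```

Here `emb_a` and `emb_b` agree up to floating-point rounding because the mean ignores ordering, and `emb_doubled` matches them because duplicating every sample leaves the empirical mean unchanged. A policy network would then take the embedding, possibly alongside the agent's own state, as its input.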

    On the role and opportunities in teamwork design for advanced multi-robot search systems

    Intelligent robotic systems are becoming ever more present in our lives across a multitude of domains such as industry, transportation, agriculture, security, healthcare and even education. Such systems enable humans to focus on the interesting and sophisticated tasks while robots accomplish tasks that are either too tedious, routine or potentially dangerous for humans to do. Recent advances in perception technologies and accompanying hardware, mainly attributed to rapid advancements in the deep-learning ecosystem, enable the deployment of robotic systems equipped with onboard sensors as well as the computational power to perform autonomous reasoning and decision making online. While there has been significant progress in expanding the capabilities of single and multi-robot systems during the last decades across a multitude of domains and applications, there are still many promising areas for research that can advance the state of cooperative searching systems that employ multiple robots. In this article, several prospective avenues of research in teamwork cooperation with considerable potential for advancement of multi-robot search systems will be visited and discussed. In previous works we have shown that multi-agent search tasks can greatly benefit from intelligent cooperation between team members and can achieve performance close to the theoretical optimum. The techniques applied can be used in a variety of domains including planning against adversarial opponents, control of forest fires and coordinating search-and-rescue missions. The state-of-the-art on methods of multi-robot search across several selected domains of application is explained, highlighting the pros and cons of each method, providing an up-to-date view on the current state of the domains and their future challenges

    Security Games for Node Localization through Verifiable Multilateration

    Most applications of wireless sensor networks (WSNs) rely on data about the positions of sensor nodes, which are not necessarily known beforehand. Several localization approaches have been proposed, but most of them fail to consider that WSNs could be deployed in adversarial settings, where hostile nodes under the control of an attacker coexist with faithful ones. Verifiable multilateration (VM) was proposed to cope with this problem by leveraging a set of trusted landmark nodes that act as verifiers. Although VM is able to recognize reliable localization measures, it allows for regions of undecided positions that can amount to 40 percent of the monitored area. We studied the properties of VM as a noncooperative two-player game where the first player employs a number of verifiers to do VM computations and the second player controls a malicious node. The verifiers aim at securely localizing malicious nodes, while malicious nodes strive to masquerade as unknown and to claim false positions. Using game theory, we analyze the potential of VM with the aim of improving the defender's strategy. We found that the best placement for verifiers is an equilateral triangle with edge equal to the power range R, and maximum deception in the undecided region is approximately 0.27R. Moreover, we characterized the strategies of the players in terms of the probability of choosing an unknown node to examine further

    Progression Cognition Reinforcement Learning with Prioritized Experience for Multi-Vehicle Pursuit

    Multi-vehicle pursuit (MVP) such as autonomous police vehicles pursuing suspects is important but very challenging due to its mission and safety-critical nature. While multi-agent reinforcement learning (MARL) algorithms have been proposed for MVP in structured grid-pattern roads, the existing algorithms use random training samples in centralized learning, which leads to homogeneous agents showing low collaboration performance. For the more challenging problem of pursuing multiple evaders, these algorithms typically select a fixed target evader for pursuers without considering dynamic traffic situation, which significantly reduces pursuing success rate. To address the above problems, this paper proposes a Progression Cognition Reinforcement Learning with Prioritized Experience for MVP (PEPCRL-MVP) in urban multi-intersection dynamic traffic scenes. PEPCRL-MVP uses a prioritization network to assess the transitions in the global experience replay buffer according to each MARL agent’s parameters. With the personalized and prioritized experience set selected via the prioritization network, diversity is introduced to the MARL learning process, which can improve collaboration and task-related performance. Furthermore, PEPCRL-MVP employs an attention module to extract critical features from dynamic urban traffic environments. These features are used to develop a progression cognition method to adaptively group pursuing vehicles. Each group efficiently targets one evading vehicle. Extensive experiments conducted with a simulator over unstructured roads of an urban area show that PEPCRL-MVP is superior to other state-of-the-art methods. Specifically, PEPCRL-MVP improves pursuing efficiency by 3.95 % over Twin Delayed Deep Deterministic policy gradient-Decentralized Multi-Agent Pursuit and its success rate is 34.78 % higher than that of Multi-Agent Deep Deterministic Policy Gradient. Codes are open-sourced

    Deep Reinforcement Learning for Swarm Systems

    Recently, deep reinforcement learning (RL) methods have been applied successfully to multi-agent scenarios. Typically, the observation vector for decentralized decision making is represented by a concatenation of the (local) information an agent gathers about other agents. However, concatenation scales poorly to swarm systems with a large number of homogeneous agents as it does not exploit the fundamental properties inherent to these systems: (i) the agents in the swarm are interchangeable and (ii) the exact number of agents in the swarm is irrelevant. Therefore, we propose a new state representation for deep multi-agent RL based on mean embeddings of distributions, where we treat the agents as samples and use the empirical mean embedding as input for a decentralized policy. We define different feature spaces of the mean embedding using histograms, radial basis functions and neural networks trained end-to-end. We evaluate the representation on two well-known problems from the swarm literature in a globally and locally observable setup. For the local setup we furthermore introduce simple communication protocols. Of all approaches, the mean embedding representation using neural network features enables the richest information exchange between neighboring agents, facilitating the development of complex collective strategies

    Towards Trust and Transparency in Deep Learning Systems through Behavior Introspection & Online Competency Prediction

    Deep neural networks are naturally “black boxes”, offering little insight into how or why they make decisions. These limitations diminish the adoption likelihood of such systems for important tasks and as trusted teammates. We employ introspective techniques to abstract machine activation patterns into human-interpretable strategies and identify relationships between environmental conditions (why), strategies (how), and performance (result) on both a deep reinforcement learning two-dimensional pursuit game application and image-based deep supervised learning obstacle recognition application. Pursuit-evasion games have been studied for decades under perfect information and analytically-derived policies for static environments. We incorporate uncertainty in a target’s position via simulated measurements and demonstrate a novel continuous deep reinforcement learning approach against speed-advantaged targets. The resulting approach was tested under many scenarios and performance exceeded that of a baseline course-aligned strategy. We manually observed separation of learned pursuit behaviors into strategy groups and manually hypothesized environmental conditions that affected performance. These manual observations motivated automation and abstraction of conditions, performance and strategy relationships. Next, we found that deep network activation patterns could be abstracted into human-interpretable strategies for two separate deep learning approaches. We characterized machine commitment by the introduction of a novel measure and revealed significant correlations between machine commitment, strategies, environmental conditions, and task performance. As such, we motivated online exploitation of machine behavior estimation for competency-aware intelligent systems. And finally, we realized online prediction capabilities for conditions, strategies, and performance. 
Our competency-aware machine learning approach is easily portable to new applications due to its Bayesian nonparametric foundation, wherein all inputs are compactly transformed into the same compact data representation. In particular, image data is transformed into a probability distribution over features extracted from the data. The resulting transformation forms a common representation for comparing two images, possibly from different types of sensors. By uncovering relationships between environmental conditions (why), machine strategies (how), & performance (result) and by giving rise to online estimation of machine competency, we increase transparency and trust in machine learning systems, contributing to the overarching explainable artificial intelligence initiative.

    Autonomous Highway Systems Safety and Security

    Automated vehicles are getting closer each day to large-scale deployment. It is expected that self-driving cars will be able to alleviate traffic congestion by safely operating at distances closer than human drivers are capable of and will overall improve traffic throughput. In these conditions, passenger safety and security are of utmost importance. When multiple autonomous cars follow each other on a highway, they will form what is known as a cyber-physical system. In a general setting, there are tools to assess the level of influence a possible attacker can have on such a system, which then describes the level of safety and security. An attacker might attempt to counter the benefits of automation by causing collisions and/or decreasing highway throughput. These strings (platoons) of automated vehicles will rely on control algorithms to maintain required distances from other cars and objects around them. The vehicle dynamics themselves and the controllers used will form the cyber-physical system and its response to an attacker can be assessed in the context of multiple interacting vehicles. While the vehicle dynamics play a pivotal role in the security of this system, the choice of controller can also be leveraged to enhance the safety of such a system. Given knowledge of some attacker capabilities, adversarial-aware controllers can be designed to react to the presence of an attacker, adding an extra level of security. This work will attempt to address these issues in vehicular platooning. Firstly, a general analysis concerning the capabilities of possible attacks in terms of control system theory will be presented. Secondly, mitigation strategies for some of these attacks will be discussed. Finally, the results of an experimental validation of these mitigation strategies and their implications will be shown

    Fixed-wing UAV tracking of evasive targets in 3-dimensional space

    In this thesis, we explore the development of autonomous tracking and interception strategies for single and multiple fixed-wing Unmanned Aerial Vehicles (UAVs) pursuing single or multiple evasive targets in 3-dimensional (3D) space. We considered a scenario where we intend to protect high-value facilities from adversarial groups employing ground-based vehicles and quadrotor swarms and focused on solving the target tracking problem. Accordingly, we refined a min-max optimal control algorithm for fixed-wing UAVs tracking ground-based targets by introducing constraints on bank angles and turn rates to enhance actuator reliability when pursuing agile and evasive targets. An intelligent and persistent evasive control strategy for the target was also devised to ensure robust performance testing and optimisation. These strategies were extended to 3D space, incorporating three altitude control algorithms to facilitate flexible UAV altitude control, leveraging various parameters such as desired UAV altitude and image size on the tracking camera lens. A novel evasive quadrotor algorithm was introduced, systematically testing UAV tracking efficacy against various evasive scenarios while implementing anti-collision measures to ensure UAV safety and adaptive optimisation to improve the achieved performance. Using decentralised control strategies, cooperative tracking by multiple UAVs of single evasive quadrotor-type and dynamic target clusters was developed along with a new altitude control strategy and task assignment logic for efficient target interception. Lastly, a countermeasure strategy for tracking and neutralising non-cooperative adversarial targets within restricted airspace was implemented, using both Nonlinear Model Predictive Control (NMPC) and optimal controllers.
The major contributions of this thesis include optimal control strategies, evasive target control, 3D target tracking, altitude control, cooperative multi-UAV tracking, adaptive optimisation, high-precision projectile algorithms, and countermeasures. We envision practical applications of the findings from this research in surveillance, security, search and rescue, agriculture, environmental monitoring, drone defence, and autonomous delivery systems. Future efforts to extend this research could explore adaptive evasion, enhanced collaborative UAV swarms, machine learning integration, sensor technologies, and real-world testing