52 research outputs found

    Improving Wildlife Monitoring using a Multi-criteria Cooperative Target Observation Approach

    Wildlife monitoring is very important for maintaining the sustainability of the environment. In this paper we pose wildlife monitoring as a Cooperative Target Observation (CTO) problem and propose a Multi-Criteria Decision Analysis (MCDA) based algorithm, named MCDA-CTO, to maximize the observation of different animal species by Unmanned Aerial Vehicles (UAVs) and to effectively handle, during decision making, multiple target types and the multiple criteria that arise from targets and environmental factors. UAVs observe targets with uncertainty, which makes it challenging to develop a high-quality monitoring strategy. We therefore develop monitoring techniques that explicitly take actions to improve belief about the true type of the targets being observed. In wildlife monitoring, it is often reasonable to assume that the observers may themselves be observed by unknown adversaries (poachers). Randomizing the observers' actions can therefore help to make the target observation strategy less predictable. We then provide experimental validation showing that the techniques we develop provide a higher (true positive/true negative) ratio along with better randomization than state-of-the-art approaches.

    Successor features based multi-agent RL for event-based decentralized MDPs

    Decentralized MDPs (Dec-MDPs) provide a rigorous framework for collaborative multi-agent sequential decision-making under uncertainty. However, their computational complexity limits their practical impact. To address this, we focus on a class of Dec-MDPs consisting of independent collaborating agents that are tied together through a global reward function that depends upon their entire histories of states and actions to accomplish joint tasks. To overcome the scalability barrier, our main contributions are: (a) We propose a new actor-critic based Reinforcement Learning (RL) approach for event-based Dec-MDPs using successor features (SF), a value function representation that decouples the dynamics of the environment from the rewards; (b) We then present Dec-ESR (Decentralized Event based Successor Representation), which generalizes learning for event-based Dec-MDPs using SF within an end-to-end deep RL framework; (c) We also show that Dec-ESR allows useful transfer of information across related but different tasks, and hence bootstraps learning for faster convergence on new tasks; (d) For validation purposes, we test our approach on a large multi-agent coverage problem that models schedule coordination of agents in a real urban subway network, where it achieves better-quality solutions than the previous best approaches.
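
    The idea that successor features decouple dynamics from rewards can be illustrated with a small tabular sketch. This is not Dec-ESR or the paper's actor-critic method; it is a single-agent toy under a fixed policy, and the transition matrix `P`, features `PHI`, discount `GAMMA`, and both weight vectors are hypothetical values chosen for illustration. It assumes rewards are linear in the features, r(s) = phi(s) . w, so that V(s) = psi(s) . w.

```python
# Tabular sketch of successor features (SF) under a fixed policy.
# All numbers (P, PHI, GAMMA, weights) are hypothetical toy values.

GAMMA = 0.9

# P[s][t]: probability of moving from state s to state t under the fixed policy.
P = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]

# PHI[s]: feature vector of state s (one-hot here, so the SF reduce to the
# classic successor representation).
PHI = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def successor_features(P, PHI, gamma, sweeps=500):
    """Iterate psi(s) = phi(s) + gamma * sum_t P[s][t] * psi(t)."""
    n, d = len(P), len(PHI[0])
    psi = [[0.0] * d for _ in range(n)]
    for _ in range(sweeps):
        psi = [
            [PHI[s][k] + gamma * sum(P[s][t] * psi[t][k] for t in range(n))
             for k in range(d)]
            for s in range(n)
        ]
    return psi


def values(psi, w):
    """V(s) = psi(s) . w, where the reward is r(s) = phi(s) . w."""
    return [sum(p * wk for p, wk in zip(row, w)) for row in psi]


def values_direct(P, PHI, w, gamma, sweeps=500):
    """Reference: policy evaluation directly on rewards, for comparison."""
    n = len(P)
    r = [sum(f * wk for f, wk in zip(PHI[s], w)) for s in range(n)]
    v = [0.0] * n
    for _ in range(sweeps):
        v = [r[s] + gamma * sum(P[s][t] * v[t] for t in range(n))
             for s in range(n)]
    return v


PSI = successor_features(P, PHI, GAMMA)  # computed once from the dynamics
W_TASK_A = [0.0, 0.0, 1.0]               # task A: reward only in state 2
W_TASK_B = [1.0, 0.0, 0.0]               # task B: reward only in state 0
```

    Here `PSI` is computed once from the dynamics; evaluating a new task only needs a new weight vector, e.g. `values(PSI, W_TASK_B)`, which is the kind of reuse across related tasks that the abstract describes as transfer.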

    Imitative Follower Deception in Stackelberg Games

    Information uncertainty is one of the major challenges facing applications of game theory. In the context of Stackelberg games, various approaches have been proposed to deal with the leader's incomplete knowledge about the follower's payoffs, typically by gathering information from the leader's interaction with the follower. Unfortunately, these approaches rely crucially on the assumption that the follower will not strategically exploit this information asymmetry, i.e., the follower behaves truthfully during the interaction according to their actual payoffs. As we show in this paper, the follower may have strong incentives to deceitfully imitate the behavior of a different follower type and, in doing this, benefit significantly from inducing the leader into choosing a highly suboptimal strategy. This raises a fundamental question: how to design a leader strategy in the presence of a deceitful follower? To answer this question, we put forward a basic model of Stackelberg games with (imitative) follower deception and show that the leader is indeed able to reduce the loss due to follower deception with carefully designed policies. We then provide a systematic study of the problem of computing the optimal leader policy and draw a relatively complete picture of the complexity landscape; essentially matching positive and negative complexity results are provided for natural variants of the model. Our intractability results are in sharp contrast to the situation with no deception, where the leader's optimal strategy can be computed in polynomial time, and thus illustrate the intrinsic difficulty of handling follower deception. Through simulations we also examine the benefit of considering follower deception in randomly generated games.

    Computation of Stackelberg Equilibria of Finite Sequential Games

    The Stackelberg equilibrium solution concept describes optimal strategies to commit to: Player 1 (termed the leader) publicly commits to a strategy and Player 2 (termed the follower) plays a best response to this strategy (ties are broken in favor of the leader). We study Stackelberg equilibria in finite sequential games (or extensive-form games) and provide new exact algorithms, approximate algorithms, and hardness results for several classes of these sequential games.
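
    The commit-then-best-respond definition above can be sketched on a small normal-form game. This is a deliberately simplified illustration, not one of the paper's algorithms: it assumes a one-shot bimatrix game rather than an extensive-form game, and it restricts the leader to pure commitments (the general concept also allows mixed commitments). The function name `pure_stackelberg` and the payoff matrices `LEADER` and `FOLLOWER` are hypothetical.

```python
def pure_stackelberg(leader_payoff, follower_payoff):
    """Best pure commitment for the leader in a bimatrix game.

    leader_payoff[i][j] and follower_payoff[i][j] are the payoffs when the
    leader plays row i and the follower plays column j.
    Returns (leader_row, follower_column, leader_value).
    """
    best = None
    for i, (lrow, frow) in enumerate(zip(leader_payoff, follower_payoff)):
        top = max(frow)
        # Follower best-responds to row i; ties are broken in the leader's favor.
        j = max((j for j, f in enumerate(frow) if f == top),
                key=lambda j: lrow[j])
        if best is None or lrow[j] > best[2]:
            best = (i, j, lrow[j])
    return best


# Hypothetical 2x2 game: rows are leader actions, columns are follower actions.
LEADER = [
    [2, 4],
    [1, 3],
]
FOLLOWER = [
    [1, 0],
    [0, 1],
]

print(pure_stackelberg(LEADER, FOLLOWER))
```

    In this toy game row 0 strictly dominates row 1 for the leader, yet committing to row 1 is better: the follower then responds with column 1 and the leader earns 3, instead of the 2 earned after committing to row 0.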

    Effect of human biases on human-agent teams

    Multiagent Teamwork: Hybrid Approaches

    Conference paper published in CSI Communications

    CB+NN Ensemble to Improve Tracking Accuracy in Air Surveillance

    Finding or tracking the location of an object accurately is a crucial problem in defense applications, robotics and computer vision. Radars fall into the spectrum of high-end defense sensors or systems upon which the security and surveillance of the entire world depends. There has been a lot of focus on the topic of Multi Sensor Tracking in recent years, with radars as the sensors. The Indian Air Force uses a Multi Sensor Tracking (MST) system, developed and supported by BEL (Bharat Electronics Limited), a defense agency we are working with, to detect flights pan-India. In this paper, we describe our Machine Learning approach, which is built on top of the existing system the Air Force uses. For the purposes of this work, we trained our models on about 13 million anonymized real Multi Sensor Tracking data points provided by radars performing tracking activity across the Indian air space. The approach increased tracking accuracy by 5 percentage points, from 91 to 96 percent. The model and the corresponding code were transitioned to BEL and have been tested in their simulation environment, with a plan to take them forward for ground testing. Our approach comprises three steps: (a) We train a Neural Network model and a CatBoost model and ensemble them using a Logistic Regression model to predict one type of error, namely Splitting error, which can help to improve the accuracy of tracking. (b) We again train a Neural Network model and a CatBoost model and ensemble them using a different Logistic Regression model to predict the second type of error, namely Merging error, which can further improve the accuracy of tracking. (c) We use cosine similarity to find the nearest neighbour and correct the data points predicted to have Splitting/Merging errors by predicting the original global track of these data points.
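
    Step (c), the cosine-similarity nearest-neighbour lookup, can be sketched as follows. The abstract does not say which features describe a data point or how the reference tracks are stored, so the feature vectors, the track identifiers, and the helper names `cosine_similarity` and `nearest_track` are all hypothetical; this shows only the general technique, not the system transitioned to BEL.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_track(point, reference):
    """Return the track id whose feature vector is most similar to `point`.

    `reference` is a list of (track_id, feature_vector) pairs.
    """
    best_id, _ = max(reference, key=lambda item: cosine_similarity(point, item[1]))
    return best_id


# Hypothetical reference tracks and a data point flagged as erroneous.
REFERENCE = [
    ("track-1", [1.0, 0.0, 0.2]),
    ("track-2", [0.0, 1.0, 0.1]),
]
FLAGGED_POINT = [0.9, 0.1, 0.2]

print(nearest_track(FLAGGED_POINT, REFERENCE))
```

    Cosine similarity compares direction only and ignores vector magnitude, so in practice the choice and scaling of features would matter; that choice is not described in the abstract.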