6,401 research outputs found

    Learning Multi-Pursuit Evasion for Safe Targeted Navigation of Drones

    Safe navigation of drones in the presence of adversarial physical attacks from multiple pursuers is a challenging task. This paper proposes a novel approach, asynchronous multi-stage deep reinforcement learning (AMS-DRL), to train adversarial neural networks that can learn from the actions of multiple evolved pursuers and adapt quickly to their behavior, enabling the drone to avoid attacks and reach its target. Specifically, AMS-DRL evolves adversarial agents in a pursuit-evasion game where the pursuers and the evader are trained asynchronously, in a bipartite-graph fashion, over multiple stages. Our approach guarantees convergence by ensuring a Nash equilibrium among agents, as shown by game-theoretic analysis. We evaluate our method in extensive simulations and show that it outperforms baselines with higher navigation success rates. We also analyze how parameters such as the relative maximum speed affect navigation performance. Furthermore, we have conducted physical experiments and validated the effectiveness of the trained policies in real-time flights. A success rate heatmap is introduced to elucidate how spatial geometry influences navigation outcomes. Project website: https://github.com/NTU-ICG/AMS-DRL-for-Pursuit-Evasion. Comment: Accepted by IEEE Transactions on Artificial Intelligence.

    Planning in the face of immovable subjects: a dialogue about resistance to development forces

    Urban development can often seem an irresistible force. The imperatives of development are deeply inscribed in the DNA of liberal capitalist societies. As well as realising profit-making opportunities for the private sector, urban change is a mechanism for (re)generating neighbourhoods, for providing public goods such as waste management, energy generation or public housing. The state may seek to mediate, ameliorate or shape development forces, thereby alleviating tensions and inequalities between divergent publics, and establishing claims to a greater public interest in certain forms of change. As it does so, state support may make development seem even more irresistible, especially if space for political challenge closes down. Yet, the seemingly irresistible force often summons seemingly immovable subjects of resistance: namely citizens and campaign groups who stand against planned changes and declare: ‘we shall not be moved’. Sometimes resistance dissolves with meaningful public input and project improvements; sometimes it remains steadfast in its opposition. The ‘immovable subjects’ who resist are mobilised by concerns to which we may be more or less sympathetic: perceived threats to valued place attachments and identities; outrage at environmental injustices; the desire to defend private property rights; racism and anti-immigrant sentiment. Whether singly or collectively, these claims and their nuanced interpretations can motivate intractable and sometimes violent opposition. The starting point for this Interface is a view that contemporary planning theory and practice continue to struggle with the complex and ambiguous political and ethical challenges posed by the forms of opposition that coalesce around state-mediated urban development. How can, and how should, the ‘essential injustices’ (Davy, 1997) that planning and development generate be managed and distributed? 
Can meaningful engagement with opposition address tensions and contribute to better outcomes? The implications for representative democracy and collaborative governance are no less profound: from the local to the global, resistance and opposition are central but also often disruptive to the democratic exercise of power.

    The Epistemic Bases of Changes of Opinion and Choices: The Joint Effects of the Need for Cognitive Closure, Ascribed Epistemic Authority and Quality of Advice

    This research investigates the epistemic underpinnings of changes of opinion and choices. Based on the Lay Epistemic Theory (Kruglanski et al., 2009) and consistent with relevant theories of persuasion (e.g., Chaiken, Liberman, & Eagly, 1989; Kruglanski & Thompson, 1999; Petty & Cacioppo, 1986), we hypothesized that individuals with a high (vs. low) need for cognitive closure would be more influenced by the high (vs. low) level of the epistemic authority of an advisor, and would be less influenced by the quality of the provided advice. These hypotheses were supported in two experimental studies (total N = 352) within two different domains of decision-making (a legal case in Study 1 and consumer behavior in Study 2). The theoretical and practical implications of the results are discussed.

    Exploiting opponent behavior in multi-agent systems

    Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 201

    Learning Policies from Self-Play with Policy Gradients and MCTS Value Estimates

    In recent years, state-of-the-art game-playing agents have often involved policies that are trained in self-play processes where Monte Carlo tree search (MCTS) algorithms and trained policies iteratively improve each other. The strongest results have been obtained when policies are trained to mimic the search behaviour of MCTS by minimising a cross-entropy loss. Because MCTS, by design, includes an element of exploration, policies trained in this manner are also likely to exhibit a similar extent of exploration. In this paper, we are interested in learning policies for a project with future goals including the extraction of interpretable strategies, rather than state-of-the-art game-playing performance. For these goals, we argue that such an extent of exploration is undesirable, and we propose a novel objective function for training policies that are not exploratory. We derive a policy gradient expression for maximising this objective function, which can be estimated using MCTS value estimates, rather than MCTS visit counts. We empirically evaluate various properties of resulting policies, in a variety of board games. Comment: Accepted at the IEEE Conference on Games (CoG) 201
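
The baseline that this abstract contrasts itself with, training a policy to mimic MCTS by minimising a cross-entropy loss, can be illustrated with a minimal sketch. It assumes the common setup in which the target distribution is the normalised root visit counts. The function names and the example numbers are hypothetical and are not taken from the paper, and the paper's own proposed objective is not reproduced here.

```python
import math


def softmax(logits):
    """Turn raw policy logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def visit_count_targets(visit_counts):
    """Normalise MCTS root visit counts into a target distribution (assumed setup)."""
    total = sum(visit_counts)
    return [n / total for n in visit_counts]


def cross_entropy(targets, probs):
    """Cross-entropy between the MCTS target distribution and the policy output."""
    return -sum(t * math.log(p) for t, p in zip(targets, probs) if t > 0)


# Hypothetical example: three legal moves, where MCTS spent a tenth of its
# visits exploring the third move.
visit_counts = [60, 30, 10]
logits = [0.0, 0.0, 0.0]  # an untrained policy with uniform output

targets = visit_count_targets(visit_counts)
loss = cross_entropy(targets, softmax(logits))
```

Because the target keeps probability mass on moves that the search visited only to explore, a policy that minimises this loss inherits that exploration, which is the property the abstract argues is undesirable for its goals.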