On the Inducibility of Stackelberg Equilibrium for Security Games
Strong Stackelberg equilibrium (SSE) is the standard solution concept of
Stackelberg security games. As opposed to the weak Stackelberg equilibrium
(WSE), the SSE assumes that the follower breaks ties in favor of the leader.
This assumption is widely acknowledged and justified by the assertion that the
defender can often induce the attacker to choose a preferred action by making
an infinitesimal adjustment to her strategy. Unfortunately, in security games
with resource assignment constraints, the assertion might not be valid: it is
possible that the defender cannot induce the desired outcome. As a result, many
results claimed in the literature may be overly optimistic. To remedy this, we
first formally define the utility guarantee of a defender strategy and provide
examples showing that the utility of the SSE can be higher than its utility
guarantee. Second, inspired by the analysis of the leader's payoff by Von
Stengel and Zamir (2004), we propose a new solution concept, the inducible
Stackelberg equilibrium (ISE), which attains the highest utility guarantee and
always exists. Third, we characterize the conditions under which the ISE
coincides with the SSE, and show that in the general case the SSE can be
arbitrarily worse with respect to the utility guarantee. Moreover, introducing
the ISE does not invalidate existing algorithmic results, as the problem of
computing an ISE polynomially reduces to that of computing an SSE. We also
provide an algorithmic implementation for computing the ISE, with which our
experiments unveil the empirical advantage of the ISE over the SSE.
Comment: The Thirty-Third AAAI Conference on Artificial Intelligence
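The inducibility issue can be made concrete with a toy normal-form game. The sketch below (payoffs are illustrative, not from the paper) computes the leader's value under optimistic (SSE-style) and pessimistic (guarantee-style) tie-breaking; because the follower is exactly indifferent everywhere, no infinitesimal perturbation can break the tie in the leader's favor, so the optimistic value overstates what the leader can guarantee:

```python
import numpy as np

# Toy 2x2 Stackelberg game (hypothetical payoffs, not from the paper).
# Rows: leader (defender) pure strategies; columns: follower (attacker) actions.
U_L = np.array([[2.0, 0.0],
                [0.0, 1.0]])  # leader payoffs
U_F = np.array([[1.0, 1.0],
                [1.0, 1.0]])  # follower is always indifferent -> ties everywhere

def leader_values(p):
    """Leader's payoff under optimistic (SSE-style) vs pessimistic
    (guarantee-style) tie-breaking, for mixed strategy (p, 1-p)."""
    x = np.array([p, 1.0 - p])
    f_payoffs = x @ U_F                            # follower's expected payoff per action
    best = np.isclose(f_payoffs, f_payoffs.max())  # follower's best-response set
    l_payoffs = (x @ U_L)[best]                    # leader payoffs over that set
    return l_payoffs.max(), l_payoffs.min()

opt, pes = leader_values(1.0)
# Here the optimistic value (2.0) cannot be induced by any perturbation,
# since the follower stays indifferent; the guarantee is only 0.0.
```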
Social Network Based Substance Abuse Prevention via Network Modification (A Preliminary Study)
Substance use and abuse is a significant public health problem in the United
States. Group-based intervention programs offer a promising means of preventing
and reducing substance abuse. While such programs are effective,
inappropriately composed intervention groups can result in an increase in
deviant behaviors among participants, a process known as deviancy training.
This paper investigates the problem of optimizing the social influence related
to deviant behavior via
careful construction of the intervention groups. We propose a Mixed Integer
Optimization formulation that decides on the intervention groups, captures the
impact of the groups on the structure of the social network, and models the
impact of these changes on behavior propagation. In addition, we propose a
scalable hybrid meta-heuristic algorithm that combines Mixed Integer
Programming and Large Neighborhood Search to find near-optimal network
partitions. Our algorithm is packaged in the form of GUIDE, an AI-based
decision aid that recommends intervention groups. Being the first quantitative
decision aid of this kind, GUIDE is able to assist practitioners, in particular
social workers, in three key areas: (a) GUIDE proposes near-optimal solutions
that are shown, via extensive simulations, to significantly improve over the
traditional qualitative practices for forming intervention groups; (b) GUIDE is
able to identify circumstances when an intervention will lead to deviancy
training, thus saving time, money, and effort; (c) GUIDE can evaluate current
strategies of group formation and discard strategies that will lead to deviancy
training. In developing GUIDE, we are primarily interested in substance use
interventions among homeless youth as a high risk and vulnerable population.
GUIDE is developed in collaboration with Urban Peak, a homeless-youth-serving
organization in Denver, CO, and is under preparation for deployment.
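The destroy-and-repair loop of Large Neighborhood Search over group assignments can be sketched as follows. This is a minimal illustration with a stand-in objective, not the paper's MIP formulation: it assumes (purely for illustration) that deviancy-training risk grows with the number of high-risk pairs placed in the same group.

```python
import random

def deviancy_score(groups, risk):
    """Stand-in objective: count same-group pairs of high-risk members."""
    score = 0.0
    for g in groups:
        high = [m for m in g if risk[m] > 0.5]
        score += len(high) * (len(high) - 1) / 2
    return score

def lns_partition(members, risk, n_groups, iters=200, destroy=3, seed=0):
    rng = random.Random(seed)
    size = len(members) // n_groups
    groups = [list(members[i * size:(i + 1) * size]) for i in range(n_groups)]
    best = deviancy_score(groups, risk)
    for _ in range(iters):
        # Destroy: remove a few random members from their groups.
        removed = rng.sample(members, destroy)
        trial = [[m for m in g if m not in removed] for g in groups]
        # Repair: greedily reinsert each member where the score rises least.
        for m in removed:
            costs = []
            for i, g in enumerate(trial):
                g.append(m)
                costs.append((deviancy_score(trial, risk), i))
                g.pop()
            trial[min(costs)[1]].append(m)
        s = deviancy_score(trial, risk)
        if s <= best:
            groups, best = trial, s
    return groups, best

members = list(range(8))
risk = {m: (0.9 if m < 4 else 0.1) for m in members}
groups, score = lns_partition(members, risk, n_groups=4)
```

In the actual system the repair step is where the MIP component enters: instead of greedy reinsertion, the freed members are reassigned optimally by solving a small integer program.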
Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making
In multi-objective decision planning and learning, much attention is paid to
producing optimal solution sets that contain an optimal policy for every
possible user preference profile. We argue that the step that follows, i.e.,
determining which policy to execute by maximising the user's intrinsic utility
function over this (possibly infinite) set, is under-studied. This paper aims
to fill this gap. We build on previous work on Gaussian processes and pairwise
comparisons for preference modelling, extend it to the multi-objective decision
support scenario, and propose new ordered preference elicitation strategies
based on ranking and clustering. Our main contribution is an in-depth
evaluation of these strategies using computer and human-based experiments. We
show that our proposed elicitation strategies outperform the currently used
pairwise methods, and find that users prefer ranking most. Our experiments
further show that utilising monotonicity information in GPs, by using a linear
prior mean at the start and virtual comparisons to the nadir and ideal points,
increases performance. We demonstrate our decision support framework in a
real-world study on traffic regulation, conducted with the city of Amsterdam.
Comment: AAMAS 2018, Source code at
https://github.com/lmzintgraf/gp_pref_elici
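One reason ranking-based elicitation can beat one-pair-at-a-time querying is that a single ranked answer decomposes into many pairwise observations, which is the data format a pairwise preference model (e.g. a GP with a pairwise likelihood) consumes. A minimal sketch, with illustrative item names:

```python
from itertools import combinations

def ranking_to_pairs(ranked_items):
    """ranked_items: best-to-worst ordering. Returns (winner, loser) pairs."""
    return [(a, b) for a, b in combinations(ranked_items, 2)]

pairs = ranking_to_pairs(["policy_A", "policy_C", "policy_B"])
# A ranking of k items yields k*(k-1)/2 comparisons, so one ranking query
# supplies far more training signal than one pairwise query.
```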
A Study of AI Population Dynamics with Million-agent Reinforcement Learning
We conduct an empirical study of the ordered collective dynamics exhibited by a
population of intelligent agents driven by million-agent reinforcement
learning. Our intention is to put intelligent agents into a simulated natural
context and verify whether the principles developed in the real world can also
be used to understand an artificially created intelligent population. To
achieve this, we simulate a large-scale predator-prey world, where the laws of
the world are designed using only findings, or their logical equivalents, that
have been discovered in nature. We endow the agents with intelligence based on
deep reinforcement learning (DRL). In order to scale the population size up to
millions of agents, we propose a large-scale DRL training platform with a
redesigned experience buffer. Our results show that the population dynamics of
AI agents, driven only by each agent's individual self-interest, reveal an
ordered pattern similar to the Lotka-Volterra model studied in population
biology. We further discover emergent behaviors of collective adaptation by
studying how the agents' grouping behaviors change with the environmental
resources. Both findings can be explained by the theory of self-organization in
nature.
Comment: Full version of the paper presented at AAMAS 2018 (International
Conference on Autonomous Agents and Multiagent Systems)
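The ordered pattern the agent population is compared against comes from the classical Lotka-Volterra equations, dx/dt = αx − βxy for prey and dy/dt = δxy − γy for predators. A minimal Euler-integration sketch (parameters illustrative) reproduces the characteristic oscillations:

```python
def lotka_volterra(x0, y0, alpha=1.0, beta=0.1, delta=0.075, gamma=1.5,
                   dt=0.001, steps=20000):
    """Forward-Euler simulation of the classical predator-prey ODEs:
      dx/dt = alpha*x - beta*x*y   (prey)
      dy/dt = delta*x*y - gamma*y  (predators)
    """
    x, y = x0, y0
    traj = [(x, y)]
    for _ in range(steps):
        x, y = (x + dt * (alpha * x - beta * x * y),
                y + dt * (delta * x * y - gamma * y))
        traj.append((x, y))
    return traj

traj = lotka_volterra(10.0, 5.0)
# Both populations oscillate around the equilibrium rather than converging,
# the same qualitative pattern reported for the self-interested AI agents.
```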
Trust Strategies for the Semantic Web
Everyone agrees on the importance of enabling trust on the Semantic Web to ensure more efficient agent interaction. Current research on trust seems to focus on developing computational models, semantic representations, inference techniques, etc. However, little attention has been given to the plausible trust strategies or tactics that an agent can follow when interacting with other agents on the Semantic Web. In this paper we identify the five most common trust strategies and discuss their envisaged costs and benefits. The aim is to provide guidelines that help system developers appreciate the risks and gains involved with each trust strategy.
Evolution of a supply chain management game for the trading agent competition
TAC SCM is a supply chain management game for the Trading Agent Competition (TAC). The purpose of TAC is to spur high-quality research into realistic trading agent problems. We discuss TAC and TAC SCM: game and competition design, scientific impact, and lessons learnt.