Online Convex Optimization for Sequential Decision Processes and Extensive-Form Games
Regret minimization is a powerful tool for solving large-scale extensive-form
games. State-of-the-art methods rely on minimizing regret locally at each
decision point. In this work we derive a new framework for regret minimization
on sequential decision problems and extensive-form games with general compact
convex sets at each decision point and general convex losses, as opposed to
prior work, which has addressed simplex decision points and linear losses. We
call our framework laminar regret decomposition. It generalizes the CFR
algorithm to this more general setting. Furthermore, our framework enables a
new proof of CFR even in the known setting, which is derived from a perspective
of decomposing polytope regret, thereby leading to an arguably simpler
interpretation of the algorithm. Our generalization to convex compact sets and
convex losses allows us to develop new algorithms for several problems:
regularized sequential decision making, regularized Nash equilibria in
extensive-form games, and computing approximate extensive-form perfect
equilibria. Our generalization also leads to the first regret-minimization
algorithm for computing reduced-normal-form quantal response equilibria based
on minimizing local regrets. Experiments show that our framework leads to
algorithms that scale at a rate comparable to the fastest variants of
counterfactual regret minimization for computing Nash equilibrium, and
therefore our approach leads to the first algorithm for computing quantal
response equilibria in extremely large games. Finally, we show that our
framework enables a new kind of scalable opponent exploitation approach.
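For orientation, the quantity that a regret minimizer controls in the general setting described above can be written as follows. This is the standard online convex optimization definition, not a formula quoted from the paper; the symbols $\mathcal{X}$ (a compact convex decision set), $\ell^t$ (the convex loss revealed at time $t$), $x^t$ (the decision played at time $t$), and $T$ (the number of rounds) are notation assumed here.

```latex
R^T \;=\; \sum_{t=1}^{T} \ell^t(x^t) \;-\; \min_{\hat{x} \in \mathcal{X}} \sum_{t=1}^{T} \ell^t(\hat{x})
```

A regret minimizer is an algorithm guaranteeing that $R^T$ grows sublinearly in $T$. The simplex-and-linear-loss setting of prior work is the special case where $\mathcal{X}$ is a probability simplex and each $\ell^t$ is linear.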
Smoothing Method for Approximate Extensive-Form Perfect Equilibrium
Nash equilibrium is a popular solution concept for solving
imperfect-information games in practice. However, it has a major drawback: it
does not preclude suboptimal play in branches of the game tree that are not
reached in equilibrium. Equilibrium refinements can mend this issue, but have
experienced little practical adoption. This is largely due to a lack of
scalable algorithms.
Sparse iterative methods, in particular first-order methods, are known to be
among the most effective algorithms for computing Nash equilibria in
large-scale two-player zero-sum extensive-form games. In this paper, we
provide, to our knowledge, the first extension of these methods to equilibrium
refinements. We develop a smoothing approach for behavioral perturbations of
the convex polytope that encompasses the strategy spaces of players in an
extensive-form game. This enables one to compute an approximate variant of
extensive-form perfect equilibria. Experiments show that our smoothing approach
leads to solutions with dramatically stronger strategies at information sets
that are reached with low probability in approximate Nash equilibria, while
retaining the overall convergence rate associated with fast algorithms for Nash
equilibrium. This has benefits both in approximate equilibrium finding (such
approximation is necessary in practice in large games) where some probabilities
are low while possibly heading toward zero in the limit, and exact equilibrium
computation where the low probabilities are actually zero.
Comment: Published at IJCAI 1
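As a rough illustration of what a behavioral perturbation does at a single information set, the sketch below maps an ordinary strategy onto a perturbed simplex in which every action keeps probability at least `epsilon`. The uniform lower bound, the function name `perturb_strategy`, and the affine mapping are assumptions made for illustration; the paper's smoothing construction works on the whole strategy polytope and is not reproduced here.

```python
def perturb_strategy(probs, epsilon):
    """Map a point of the probability simplex to the epsilon-perturbed simplex.

    Every action ends up with probability >= epsilon, and the result still
    sums to one. Hypothetical helper: uniform lower bound on all actions.
    """
    n = len(probs)
    if epsilon < 0.0 or n * epsilon > 1.0:
        raise ValueError("epsilon must satisfy 0 <= n * epsilon <= 1")
    # Reserve epsilon mass per action, then spread the remaining mass
    # according to the original strategy.
    scale = 1.0 - n * epsilon
    return [epsilon + scale * p for p in probs]


# A pure strategy never reaches the other two actions; its perturbed
# counterpart plays each of them with a small positive probability.
pure = [1.0, 0.0, 0.0]
trembling = perturb_strategy(pure, 0.01)
print(trembling)
```

Because unreached branches now receive positive probability, a player optimizing against such a perturbed opponent has a reason to play well there, which is the intuition behind the refinement.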
Robust Stackelberg Equilibria in Extensive-Form Games and Extension to Limited Lookahead
Stackelberg equilibria have become increasingly important as a solution
concept in computational game theory, largely inspired by practical problems
such as security settings. In practice, however, there is typically uncertainty
regarding the model of the opponent. This paper is, to our knowledge, the
first to investigate Stackelberg equilibria under uncertainty in extensive-form
games, one of the broadest classes of games. We introduce robust Stackelberg
equilibria, where the uncertainty is about the opponent's payoffs, as well as
ones where the opponent has limited lookahead and the uncertainty is about the
opponent's node evaluation function. We develop a new mixed-integer program for
the deterministic limited-lookahead setting. We then extend the program to the
robust setting for Stackelberg equilibrium under unlimited and under limited
lookahead by the opponent. We show that for the specific case of interval
uncertainty about the opponent's payoffs (or about the opponent's node
evaluations in the case of limited lookahead), robust Stackelberg equilibria
can be computed with a mixed-integer program that is of the same asymptotic
size as that for the deterministic setting.
Comment: Published at AAAI1
Quasi-Perfect Stackelberg Equilibrium
Equilibrium refinements are important in extensive-form (i.e., tree-form)
games, where they amend weaknesses of the Nash equilibrium concept by requiring
sequential rationality and other beneficial properties. One of the most
attractive refinement concepts is quasi-perfect equilibrium. While
quasi-perfection has been studied in extensive-form games, it is poorly
understood in Stackelberg settings---that is, settings where a leader can
commit to a strategy---which are important for modeling, for example, security
games. In this paper, we introduce the axiomatic definition of quasi-perfect
Stackelberg equilibrium. We develop a broad class of game perturbation schemes
that lead to them in the limit. Our class of perturbation schemes strictly
generalizes prior perturbation schemes introduced for the computation of
(non-Stackelberg) quasi-perfect equilibria. Based on our perturbation schemes,
we develop a branch-and-bound algorithm for computing a quasi-perfect
Stackelberg equilibrium. It leverages a perturbed variant of the linear program
for computing a Stackelberg extensive-form correlated equilibrium. Experiments
show that our algorithm can be used to find an approximate quasi-perfect
Stackelberg equilibrium in games with thousands of nodes.
Feline infectious peritonitis: role of the feline coronavirus 3c gene in intestinal tropism and pathogenicity based upon isolates from resident and adopted shelter cats.
Feline infectious peritonitis virus (FIPV) was presumed to arise from mutations in the 3c of a ubiquitous and largely nonpathogenic feline enteric coronavirus (FECV). However, a recent study found that one-third of FIPV isolates have an intact 3c and suggested that it is not solely involved in FIP but is essential for intestinal replication. In order to confirm these assumptions, 27 fecal and 32 FIP coronavirus isolates were obtained from resident or adopted cats from a large metropolitan shelter during 2008-2009 and their 3a-c, E, and M genes sequenced. Forty percent of coronavirus isolates from FIP tissues had an intact 3c gene, while 60% had mutations that truncated the gene product. The 3c genes of fecal isolates from healthy cats were always intact. Coronavirus from FIP diseased tissues consistently induced FIP when given either oronasally or intraperitoneally (i.p.), regardless of the functional status of their 3c genes, thus confirming them to be FIPVs. In contrast, fecal isolates from healthy cats were infectious following oronasal infection and shed at high levels in feces without causing disease, as expected for FECVs. Only one in three cats shed FECV in the feces following i.p. infection, indicating that FECVs can replicate systemically, but with difficulty. FIPVs having a mutated 3c were not shed in the feces following either oronasal or i.p. inoculation, while FIPVs with intact 3c genes were shed in the feces following oronasal but not i.p. inoculation. Therefore, an intact 3c appears to be essential for intestinal replication. Although FIPVs with an intact 3c were shed in the feces following oronasal inoculation, fecal virus from these cats was not infectious for other cats. Attempts to identify potential FIP mutations in the 3a, 3b, E, and M were negative. However, the 3c gene of FIPVs, even though appearing intact, contained many more non-synonymous amino acid changes in the 3' one-third of the 3c protein than FECVs. 
An attempt to trace FIPV isolates back to enteric strains existing in the shelter was only partially successful due to the large region over which shelter cats and kittens originated, housing conditions prior to acquisition, and rapid movement through the shelter. No evidence could be found to support a recent theory that FIPVs and FECVs are genetically distinct.
Faster Game Solving via Predictive Blackwell Approachability: Connecting Regret Matching and Mirror Descent
Blackwell approachability is a framework for reasoning about repeated games
with vector-valued payoffs. We introduce predictive Blackwell approachability,
where an estimate of the next payoff vector is given, and the decision maker
tries to achieve better performance based on the accuracy of that estimator. In
order to derive algorithms that achieve predictive Blackwell approachability,
we start by showing a powerful connection between four well-known algorithms.
Follow-the-regularized-leader (FTRL) and online mirror descent (OMD) are the
most prevalent regret minimizers in online convex optimization. In spite of
this prevalence, the regret matching (RM) and regret matching+ (RM+) algorithms
have been preferred in the practice of solving large-scale games (as the local
regret minimizers within the counterfactual regret minimization framework). We
show that RM and RM+ are the algorithms that result from running FTRL and OMD,
respectively, to select the halfspace to force at all times in the underlying
Blackwell approachability game. By applying the predictive variants of FTRL or
OMD to this connection, we obtain predictive Blackwell approachability
algorithms, as well as predictive variants of RM and RM+. In experiments across
18 common zero-sum extensive-form benchmark games, we show that predictive RM+
coupled with counterfactual regret minimization converges vastly faster than
the fastest prior algorithms (CFR+, DCFR, LCFR) across all games but two of the
poker games and Liar's Dice, sometimes by two or more orders of magnitude.
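To make the two non-predictive baselines named in the abstract concrete, here is a minimal sketch of regret matching (RM) and regret matching+ (RM+) in self-play on a small zero-sum matrix game. It does not implement the predictive variants or counterfactual regret minimization. The function names, the 2x2 payoff matrix, the simultaneous updates, and the uniform averaging of iterates are all assumptions chosen for brevity.

```python
def regret_matching_strategy(cum_regret):
    """Play in proportion to positive cumulative regret (uniform if none)."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total > 0.0:
        return [p / total for p in positive]
    return [1.0 / n] * n


def self_play(payoff, iterations, plus=False):
    """Run RM (or RM+ when plus=True) for both players of a zero-sum game.

    payoff[i][j] is the row player's utility; the column player gets its
    negative. Returns the uniformly averaged strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    regret_row, regret_col = [0.0] * n_rows, [0.0] * n_cols
    sum_row, sum_col = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = regret_matching_strategy(regret_row)
        y = regret_matching_strategy(regret_col)
        sum_row = [s + p for s, p in zip(sum_row, x)]
        sum_col = [s + p for s, p in zip(sum_col, y)]
        # Utility of each pure action against the opponent's current strategy.
        u_row = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                 for i in range(n_rows)]
        u_col = [-sum(x[i] * payoff[i][j] for i in range(n_rows))
                 for j in range(n_cols)]
        ev_row = sum(p * u for p, u in zip(x, u_row))
        ev_col = sum(p * u for p, u in zip(y, u_col))
        # Instantaneous regret: action utility minus utility actually obtained.
        regret_row = [r + u - ev_row for r, u in zip(regret_row, u_row)]
        regret_col = [r + u - ev_col for r, u in zip(regret_col, u_col)]
        if plus:
            # RM+ clips the accumulated regrets at zero after every update.
            regret_row = [max(r, 0.0) for r in regret_row]
            regret_col = [max(r, 0.0) for r in regret_col]
    return ([s / iterations for s in sum_row],
            [s / iterations for s in sum_col])


def exploitability(payoff, x, y):
    """Saddle-point gap of (x, y): zero exactly at a Nash equilibrium."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols))
                   for i in range(n_rows))
    best_col = min(sum(x[i] * payoff[i][j] for i in range(n_rows))
                   for j in range(n_cols))
    return best_row - best_col


GAME = [[2.0, -1.0], [-1.0, 1.0]]  # hypothetical example game
for use_plus in (False, True):
    x_avg, y_avg = self_play(GAME, 10000, plus=use_plus)
    print("RM+" if use_plus else "RM", exploitability(GAME, x_avg, y_avg))
```

The only difference between the two algorithms in this sketch is the clipping step: RM keeps negative cumulative regrets, while RM+ resets them to zero. Practical RM+ implementations often add alternating updates and weighted averaging, which are left out here.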