
4 research outputs found

Irreversible Games with Incomplete Information: The Asymptotic Value

Author: Rida Laraki

Irreversible games are stochastic games in which a state, once left, is never revisited. This class contains absorbing games. This paper proves the existence and a characterization of the asymptotic value for any finite irreversible game with incomplete information on both sides. This generalizes Mertens and Zamir 1971 for repeated games with incomplete information on both sides, and Rosenberg 2000 for absorbing games with incomplete information on one side.
Keywords: stochastic games; repeated games; incomplete information; asymptotic value; comparison principle; variational inequalities

Research Papers in Economics

Convergence des EDSRs et homogéneisation des inégalités variationnelles semilinéaires dans un convexe

Author: Es-Saky El Hassan, Ouknine Youssef
Publication venue: Éditions scientifiques et médicales Elsevier SAS.
Publication date: 30/06/2002

We study the limit of the solution of a semilinear variational inequality (SVI for short) involving a second-order differential operator of parabolic type with periodic coefficients and a highly oscillating term. Our basic tool is the approach given by Pardoux [16]. In particular, we use the weak convergence of an associated reflected backward stochastic differential equation (BSDE for short).

Elsevier - Publisher Connector

Irreversible Games with Incomplete Information: The Asymptotic Value

Author: Laraki Rida
Publication venue: HAL CCSD
Publication date: 06/04/2010

Irreversible games are stochastic games in which, once the play leaves a state, it never revisits that state. This class includes absorbing games. This paper proves the existence and a characterization of the asymptotic value for any finite irreversible game with incomplete information on both sides. This result extends Mertens and Zamir 1971 for repeated games with incomplete information on both sides, and Rosenberg 2000 for absorbing games with incomplete information on one side.

HAL-Polytechnique

Multi-player games in the era of machine learning

Author: Gidel Gauthier
Publication date: 01/07/2020

Among all the historical board games played by humans, the game of Go was considered one of the most difficult to master by a computer program [Van Den Herik et al., 2002], until it was not [Silver et al., 2016]. This odds-breaking breakthrough [Müller, 2002, Van Den Herik et al., 2002] came from a sophisticated combination of Monte Carlo tree search and machine learning techniques to evaluate positions, shedding light on the high potential of machine learning to solve games. Adversarial training, a special case of multiobjective optimization, is an increasingly useful tool in machine learning. For example, two-player zero-sum games are important for generative modeling (GANs) [Goodfellow et al., 2014] and for mastering games like Go or Poker via self-play [Silver et al., 2017, Brown and Sandholm, 2017].
A classic result in game theory states that convex-concave games always have an equilibrium [Neumann, 1928]. Surprisingly, machine learning practitioners successfully train a single pair of neural networks whose objective is a nonconvex-nonconcave minimax problem, even though the existence of a Nash equilibrium is not guaranteed in general for such a payoff function. This work is an attempt to put learning in games on a firm theoretical foundation. The first contribution explores minimax theorems for a particular class of nonconvex-nonconcave games that encompasses generative adversarial networks. The proposed result is an approximate minimax theorem for two-player zero-sum games played with neural networks, including WGAN, StarCraft II, and the Blotto game. Our findings rely on the fact that, despite being nonconcave-nonconvex with respect to the neural networks' parameters, the payoffs of these games are concave-convex with respect to the actual functions (or distributions) parametrized by these neural networks.
The second and third contributions study the optimization of minimax problems and, more generally, variational inequalities in the context of machine learning. While the standard gradient descent-ascent method fails to converge to the Nash equilibrium of simple convex-concave games, there exist ways to use gradients to obtain methods that converge. We investigate several techniques such as extrapolation, averaging, and negative momentum. We explore these techniques experimentally by proposing a state-of-the-art (at the time of publication) optimizer for GANs called ExtraAdam. We also prove new convergence results for extrapolation from the past, originally proposed by Popov [1980], as well as for the gradient method with negative momentum.
The fourth contribution provides an empirical study of the practical landscape of GANs. In the second and third contributions, we diagnose that the gradient method breaks when the game's vector field is highly rotational. However, such a situation may describe a worst case that does not occur in practice. We provide new visualization tools in order to exhibit rotations in practical GAN landscapes. In this contribution, we show empirically that the training of GANs exhibits significant rotations around Local Stable Stationary Points (LSSP), and we provide empirical evidence that GAN training converges to a stable stationary point that is a saddle point for the generator loss, not a minimum, while still achieving excellent performance.
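The claim that gradient descent-ascent can fail on a simple convex-concave game while extrapolation converges can be illustrated on a toy problem. The sketch below is not taken from the thesis: it assumes the bilinear game f(x, y) = x * y (x minimizes, y maximizes), whose only equilibrium is (0, 0), and compares plain simultaneous gradient descent-ascent with a basic extragradient step. The function name `simulate`, the step size, and the starting point are illustrative choices, and this is plain extragradient, not the ExtraAdam optimizer mentioned in the abstract.

```python
import math


def simulate(method, steps=200, lr=0.1, x=1.0, y=1.0):
    """Run a first-order method on f(x, y) = x * y and return the final
    distance to the equilibrium (0, 0). Here df/dx = y and df/dy = x."""
    for _ in range(steps):
        if method == "gda":
            # Simultaneous gradient descent on x and ascent on y.
            x, y = x - lr * y, y + lr * x
        elif method == "extragradient":
            # Extrapolation: take a look-ahead step from the current point...
            x_half, y_half = x - lr * y, y + lr * x
            # ...then update the current point with the look-ahead gradients.
            x, y = x - lr * y_half, y + lr * x_half
        else:
            raise ValueError(f"unknown method: {method}")
    return math.hypot(x, y)


if __name__ == "__main__":
    start = math.hypot(1.0, 1.0)
    print("start distance:        ", start)
    print("gradient descent-ascent:", simulate("gda"))
    print("extragradient:          ", simulate("extragradient"))
```

On this game the vector field is a pure rotation, so each descent-ascent step multiplies the squared distance to the equilibrium by 1 + lr^2 and the iterates spiral outward, whereas each extragradient step multiplies it by 1 - lr^2 + lr^4, which is below 1 for any step size between 0 and 1.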

Dépôt Institutionnel Numérique