Jeux différentiels stochastiques non-Markoviens et dynamiques de Langevin à champ-moyen

Hu, Kaitong

Authors: Kaitong Hu
Publication date: 18 June 2020
Publisher: HAL CCSD

Abstract

This thesis consists of two independent parts, the first of which covers two distinct problems.

In the first part, we begin with the Principal-Agent problem in degenerate systems, which arise naturally in partially observed random environments where the Agent and the Principal can each observe only part of the system. Our approach is based on the stochastic maximum principle and aims to extend to the degenerate case the existing results obtained with the dynamic programming principle in non-degenerate systems. We first solve the Principal's problem over an enlarged set of contracts given by the first-order condition of the Agent's problem, which takes the form of a path-dependent forward-backward stochastic differential equation (FBSDE). We then use the sufficient condition of the Agent's problem to verify that the optimal contract obtained in this way is indeed implementable. A parallel study, in Chapter IV, is devoted to the well-posedness (existence and uniqueness of solutions) of path-dependent FBSDEs. We generalize the decoupling field method to the case where the coefficients of the equations may depend on the whole path of the forward process, and we establish a stability property for this type of FBSDE.

Finally, we study the Principal-Agent (moral hazard) problem with multiple Principals. The Agent can work for only one Principal at a time and therefore faces an optimal switching problem. Using a randomization method, we show that the value function of the Agent's problem and the Agent's optimal effort are given by an Itô process. This representation allows us to solve the Principal's problem in the mean-field case, where infinitely many Principals are in equilibrium in a mean-field game. We justify the mean-field formulation by a backward propagation of chaos argument.

The second part of the thesis consists of Chapters V and VI. The motivation for this work is to give a rigorous theoretical foundation for the convergence of gradient-descent-type algorithms, which are frequently used in non-convex optimization problems such as calibrating a deep neural network. For neural networks with one hidden layer, the key insight is to convexify the problem by lifting it to the space of measures. We show that the corresponding energy function has a unique minimiser, which can be characterized by a first-order condition expressed with derivatives on the space of measures in the sense of Lions. We then present a probabilistic analysis of the long-time behavior of the mean-field Langevin dynamics, which has a gradient flow structure in the 2-Wasserstein metric. Using a generalization of LaSalle's invariance principle, we show that the flow of marginal laws induced by the mean-field Langevin dynamics converges to a stationary distribution, which is exactly the minimiser of the energy function.

Deep neural networks are modeled as continuous-time optimal control problems. We first derive the first-order condition using the Pontryagin maximum principle, which then lets us introduce the associated mean-field Langevin system, whose invariant measure is again the minimiser of the optimal control problem. Finally, using reflection coupling, we show that the marginal distribution of the mean-field Langevin system converges exponentially fast to the unique invariant measure.

Available Versions

HAL-Polytechnique

oai:HAL:tel-02918519v1

Last time updated on 16/09/2020

Thèses en Ligne

oai:HAL:tel-02918519v1

Last time updated on 11/01/2021