    Numerical solution methods for differential game problems

    Thesis (S.M.), Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2009. Includes bibliographical references (p. 95-98). By Philip A. Johnson.

    Differential game theory provides a potential means for the parametric analysis of combat engagement scenarios. To determine its viability for this type of analysis, three frameworks for solving differential game problems are evaluated. Each method solves zero-sum, pursuit-evasion games in which two players have opposing goals: a saddle-point equilibrium is sought in which one player minimizes the value of the game while the other maximizes it.

    The boundary value method is an indirect method that uses the analytical necessary conditions of optimality and is solved in a conventional optimal control framework. It yields a high-accuracy solution but has a limited convergence space, requiring a good initial guess for both the state and the less intuitive costate. The decomposition method, in which optimal trajectories for each player are calculated iteratively, is a direct method that bypasses the need for costate information; because a linearized cost gradient is used to update the evader's strategy, the initial conditions can heavily influence the convergence of the problem.

    The new method of neural networks uses a neural network to govern each player's control policy. An optimization tool adjusts the weights and biases of each network to form the control policy that yields the best final value of the game, with an automatic differentiation engine providing the sensitivity of the final cost to each weight. The final weights define the control policy's response to a range of initial conditions, depending on the breadth of the state space used to train each network. The networks are initialized with normally distributed weights, so no information about the state, costate, or switching structure of the controller is required. In its current form this method often converges to a sub-optimal solution, and creative techniques are required to handle boundary conditions and free end-time problems.
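    The decomposition method admits a compact illustration. The following Python sketch alternates best-response optimizations for the two players in a toy planar pursuit-evasion game; the kinematics, speeds, horizon, and cost are illustrative assumptions, not the thesis's actual formulation, and, as the abstract warns, this kind of iteration can converge poorly depending on the initial conditions.

```python
# A minimal sketch of the decomposition idea: alternate best-response
# optimizations for each player in a simple planar pursuit-evasion game.
# Dynamics, speeds, horizon, and starting positions are all assumptions
# made for illustration only.
import numpy as np
from scipy.optimize import minimize

N, dt = 20, 0.1      # horizon: N steps of length dt (assumed)
VP, VE = 2.0, 1.0    # pursuer is faster than the evader (assumed)

def final_distance(thetas_p, thetas_e):
    """Integrate heading-controlled kinematics; cost = final separation."""
    p = np.array([0.0, 0.0])   # pursuer start (assumed)
    e = np.array([5.0, 0.0])   # evader start (assumed)
    for tp, te in zip(thetas_p, thetas_e):
        p = p + VP * dt * np.array([np.cos(tp), np.sin(tp)])
        e = e + VE * dt * np.array([np.cos(te), np.sin(te)])
    return np.linalg.norm(p - e)

thetas_p = np.zeros(N)   # pursuer heading at each step
thetas_e = np.zeros(N)   # evader heading at each step
for _ in range(10):      # alternate best responses toward a saddle point
    # Pursuer minimizes the final separation against the current evader.
    thetas_p = minimize(lambda x: final_distance(x, thetas_e), thetas_p).x
    # Evader maximizes it (minimizes the negative) against the new pursuer.
    thetas_e = minimize(lambda x: -final_distance(thetas_p, x), thetas_e).x

print("approx. game value (final distance):", final_distance(thetas_p, thetas_e))
```

    Replacing the evader's full best response with a single linearized gradient step would recover the update scheme the abstract describes, along with its sensitivity to the initial guess.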

    You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle

    Deep learning achieves state-of-the-art results in many computer vision and natural language processing tasks. However, recent work has shown that deep networks can be vulnerable to adversarial perturbations, raising serious concerns about their robustness. Adversarial training, typically formulated as a robust optimization problem, is an effective way of improving the robustness of deep networks. A major drawback of existing adversarial training algorithms is the computational overhead of generating adversarial examples, which is typically far greater than that of the network training itself and makes the overall cost of adversarial training prohibitive. In this paper, we show that adversarial training can be cast as a discrete-time differential game. By analyzing the Pontryagin's Maximal Principle (PMP) of the problem, we observe that the adversary update is coupled only with the parameters of the first layer of the network. This inspires us to restrict most of the forward and backward propagation to the first layer of the network during adversary updates, which effectively reduces the number of full forward and backward propagations to one per group of adversary updates. We therefore call this algorithm YOPO (You Only Propagate Once). Numerical experiments demonstrate that YOPO achieves comparable defense accuracy with approximately 1/5 to 1/4 of the GPU time of the projected gradient descent (PGD) algorithm. Our code is available at https://github.com/a1600012888/YOPO-You-Only-Propagate-Once. Comment: Accepted as a conference paper at NeurIPS 2019.
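    The first-layer coupling that motivates YOPO is easy to express in code. Below is a hedged PyTorch-style sketch of the resulting inner loop, not the authors' released implementation (see the linked repository for that): the network is split into a first layer f0 and the remainder g, the gradient p at the first layer's output is obtained with one full backward pass, and the adversary then takes several cheap steps that propagate only through f0. All names and hyperparameter values here are illustrative.

```python
# A minimal sketch of YOPO's amortized adversary update, not the authors'
# code. f0 is the first layer, g the remaining layers (both assumed to be
# torch.nn.Module instances); eps, alpha, m, n are illustrative values.
import torch

def yopo_inner_loop(f0, g, loss_fn, x, y, eps=8/255, alpha=2/255, m=5, n=3):
    """One YOPO-m-n group: m full passes, each shared by n cheap adversary steps."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(m):
        # One full forward/backward pass gives p = d(loss)/d(f0 output).
        z = f0(x + delta)
        p = torch.autograd.grad(loss_fn(g(z), y), z)[0].detach()
        for _ in range(n):
            # Cheap adversary steps: with p frozen, the loss gradient w.r.t.
            # delta only requires propagating through the first layer f0.
            grad_delta = torch.autograd.grad((p * f0(x + delta)).sum(), delta)[0]
            delta = (delta + alpha * grad_delta.sign()).clamp(-eps, eps)
            delta = delta.detach().requires_grad_(True)
    return (x + delta).detach()
```

    A PGD-style loop would instead pay a full forward and backward pass for every adversary step; here each group of n steps shares a single full pass, which is the source of the claimed speedup.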