
    High order structure preserving explicit methods for solving linear-quadratic optimal control problems

    [EN] We consider the numerical integration of linear-quadratic optimal control problems. This problem requires the solution of a boundary value problem: a non-autonomous matrix Riccati differential equation (RDE) with final conditions, coupled with the state vector equation with initial conditions. The RDE has a positive definite matrix solution, and to preserve this qualitative property numerically we propose first to integrate this equation backward in time with a sufficiently accurate scheme. The problem then turns into an initial value problem, and we analyse splitting and Magnus integrators for the forward time integration which preserve the positive definite matrix solutions of the RDE. By duplicating the time as two new coordinates and using appropriate splitting methods, high order methods preserving the desired property can be obtained. The schemes make sequential computations and do not require the storage of intermediate results, so the storage requirements are minimal. The proposed methods are also adapted for solving linear-quadratic N-player differential games. The performance of the splitting methods can be considerably improved if the system is a perturbation of an exactly solvable problem and the system is properly split. Some numerical examples illustrate the performance of the proposed methods. The author wishes to thank the University of California San Diego for its hospitality where part of this work was done. He also acknowledges the support of the Ministerio de Ciencia e Innovacion (Spain) under the coordinated project MTM2010-18246-C03. The author also acknowledges the suggestions by the referees to improve the presentation of this work. Blanes Zamora, S. (2015). High order structure preserving explicit methods for solving linear-quadratic optimal control problems. Numerical Algorithms. 69:271-290. https://doi.org/10.1007/s11075-014-9894-0

    Real and Complex Monotone Communication Games

    Noncooperative game-theoretic tools have been increasingly used to study many important resource allocation problems in communications, networking, smart grids, and portfolio optimization. In this paper, we consider a general class of convex Nash Equilibrium Problems (NEPs), where each player aims to solve an arbitrary smooth convex optimization problem. Unlike most current work, we do not assume any specific structure for the players' problems, and we allow the optimization variables of the players to be matrices in the complex domain. Our main contribution is the design of a novel class of distributed (asynchronous) best-response algorithms suitable for solving the proposed NEPs, even in the presence of multiple solutions. The new methods, whose convergence analysis is based on Variational Inequality (VI) techniques, can select, among all the equilibria of a game, those that optimize a given performance criterion, at the cost of limited signaling among the players. This is a major departure from existing best-response algorithms, whose convergence conditions imply the uniqueness of the NE. Some of our results hinge on the use of VI problems directly in the complex domain; the study of this new kind of VI also represents a noteworthy innovative contribution. We then apply the developed methods to solve some new generalizations of SISO and MIMO games in cognitive radios and femtocell systems, showing a considerable performance improvement over classical pure noncooperative schemes. Comment: to appear in IEEE Transactions on Information Theory

    A Parameterisation of Algorithms for Distributed Constraint Optimisation via Potential Games

    This paper introduces a parameterisation of learning algorithms for distributed constraint optimisation problems (DCOPs). This parameterisation encompasses many algorithms developed in both the computer science and game theory literatures. It is built on our insight that, when formulated as noncooperative games, DCOPs form a subset of the class of potential games. This result allows us to prove convergence properties of algorithms developed in the computer science literature using game theoretic methods. Furthermore, our parameterisation can assist system designers by making clear the pros and cons of, and the synergies between, the various DCOP algorithm components.

    Reinforcement Learning: A Survey

    This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here resembles work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning. Comment: See http://www.jair.org/ for any accompanying files