    Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity

    Model-based reinforcement learning (RL), which finds an optimal policy using an empirical model, has long been recognized as one of the cornerstones of RL. It is especially suitable for multi-agent RL (MARL), as it naturally decouples the learning and the planning phases, and avoids the non-stationarity problem that arises when all agents improve their policies simultaneously using samples. Though model-based MARL algorithms are intuitive, easy to implement, and widely used, their sample complexity has not been fully investigated. In this paper, our goal is to address the fundamental question about this sample complexity. We study arguably the most basic MARL setting: two-player discounted zero-sum Markov games, given only access to a generative model. We show that model-based MARL achieves a sample complexity of $\tilde O(|S||A||B|(1-\gamma)^{-3}\epsilon^{-2})$ for finding the Nash equilibrium (NE) value up to some $\epsilon$ error, and the $\epsilon$-NE policies with a smooth planning oracle, where $\gamma$ is the discount factor, and $S, A, B$ denote the state space and the action spaces of the two agents. We further show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, that is, if the algorithm queries state transition samples without reward knowledge, by establishing a matching lower bound. This is in contrast to the usual reward-aware setting, which has a $\tilde\Omega(|S|(|A|+|B|)(1-\gamma)^{-3}\epsilon^{-2})$ lower bound, where this model-based approach is near-optimal with only a gap in the $|A|,|B|$ dependence. Our results not only demonstrate the sample efficiency of this basic model-based approach in MARL, but also elaborate on the fundamental tradeoff between its power (easily handling the more challenging reward-agnostic case) and its limitation (less adaptive and suboptimal in $|A|,|B|$), a tradeoff that arises particularly in the multi-agent context.
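
    To give a feel for the two rates quoted in this abstract, the short Python sketch below evaluates their leading terms for a concrete problem size. It is only an illustration: constants and logarithmic factors are dropped (the abstract states the bounds only up to those), and the function names and the example sizes are assumptions made here, not taken from the paper.

```python
def _check(num_states, num_a, num_b, gamma, eps):
    """Reject inputs outside the regime in which the bounds are stated."""
    if min(num_states, num_a, num_b) < 1:
        raise ValueError("state and action space sizes must be positive")
    if not 0.0 <= gamma < 1.0:
        raise ValueError("discount factor gamma must lie in [0, 1)")
    if eps <= 0.0:
        raise ValueError("accuracy eps must be positive")


def model_based_rate(num_states, num_a, num_b, gamma, eps):
    """Leading term |S||A||B| (1-gamma)^-3 eps^-2 (constants and logs dropped)."""
    _check(num_states, num_a, num_b, gamma, eps)
    return num_states * num_a * num_b / ((1.0 - gamma) ** 3 * eps ** 2)


def reward_aware_lower_rate(num_states, num_a, num_b, gamma, eps):
    """Leading term |S|(|A|+|B|) (1-gamma)^-3 eps^-2 (constants and logs dropped)."""
    _check(num_states, num_a, num_b, gamma, eps)
    return num_states * (num_a + num_b) / ((1.0 - gamma) ** 3 * eps ** 2)


def action_gap_factor(num_a, num_b):
    """Ratio of the two rates, |A||B| / (|A|+|B|); gamma and eps cancel."""
    return num_a * num_b / (num_a + num_b)


if __name__ == "__main__":
    # Hypothetical problem size, chosen only for illustration.
    S, A, B, gamma, eps = 10, 5, 5, 0.9, 0.1
    print("model-based rate:       ", model_based_rate(S, A, B, gamma, eps))
    print("reward-aware lower rate:", reward_aware_lower_rate(S, A, B, gamma, eps))
    print("gap factor in |A|,|B|:  ", action_gap_factor(A, B))
```

    The ratio of the two expressions depends only on the action-space sizes, which is the gap in the $|A|,|B|$ dependence that the abstract refers to.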

    Effective action for Einstein-Maxwell theory at order RF**4

    We use a recently derived integral representation of the one-loop effective action in Einstein-Maxwell theory for an explicit calculation of the part of the effective action containing the information on the low-energy limit of the five-point amplitudes involving one graviton, four photons and either a scalar or spinor loop. All available identities are used to get the result into a relatively compact form. Comment: 13 pages, no figures

    Multi-Layer Cyber-Physical Security and Resilience for Smart Grid

    The smart grid is a large-scale complex system that integrates communication technologies with the physical-layer operation of energy systems. Security and resilience mechanisms built in by design are important for guaranteeing the operation of the system. This chapter provides a layered perspective on smart grid security and discusses game and decision theory as a tool to model the interactions among system components and the interaction between attackers and the system. We discuss game-theoretic applications and challenges in the design of cross-layer robust and resilient controllers, secure network routing protocols at the data communication and networking layers, and the challenges of information security at the management layer of the grid. The chapter also discusses future directions for using game-theoretic tools to address multi-layer security issues in the smart grid. Comment: 16 pages

    Autonomous Robust Skill Generation Using Reinforcement Learning with Plant Variation

    This paper discusses an autonomous space robot for truss structure assembly using reinforcement learning. It is difficult for a space robot to complete contact tasks in a real environment, for example a peg-in-hole task, because of the error between the real environment and the controller model. To solve this problem, we propose an autonomous space robot that can obtain proficient and robust skills by overcoming this error to complete a task. The proposed approach develops skills by reinforcement learning that considers plant variation, that is, modeling error. Numerical simulations and experiments show that the proposed method is useful in real environments.

    A Gauge-Gravity Relation in the One-loop Effective Action

    We identify an unusual new gauge-gravity relation: the one-loop effective action for a massive spinor in 2n-dimensional AdS space is expressed in terms of precisely the same function [a certain multiple gamma function] as the one-loop effective action for a massive charged scalar in 4n dimensions in a maximally symmetric background electromagnetic field [one for which the eigenvalues of F_{\mu\nu} are maximally degenerate, corresponding in 4 dimensions to a self-dual field, equivalently to a field of definite helicity], subject to the identification F^2 \leftrightarrow \Lambda, where \Lambda is the gravitational curvature. Since these effective actions generate the low-energy limit of all one-loop multi-leg graviton or gauge amplitudes, this implies a nontrivial gauge-gravity relation at the non-perturbative level and at the amplitude level. Comment: 6 pages

    On the Deformation of a Hyperelastic Tube Due to Steady Viscous Flow Within

    In this chapter, we analyze the steady-state microscale fluid-structure interaction (FSI) between a generalized Newtonian fluid and a hyperelastic tube. Physiological flows, especially in hemodynamics, serve as primary examples of such FSI phenomena. The small scale of the physical system renders the flow field, under the power-law rheological model, amenable to a closed-form solution using the lubrication approximation. On the other hand, negligible shear stresses on the walls of a long vessel allow the structure to be treated as a pressure vessel. The constitutive equation for the microtube is prescribed via the strain energy functional for an incompressible, isotropic Mooney-Rivlin material. We employ both the thin- and thick-walled formulations of pressure vessel theory, and derive the static relation between the pressure load and the deformation of the structure. We harness the latter to determine the flow rate-pressure drop relationship for non-Newtonian flow in thin- and thick-walled soft hyperelastic microtubes. Through illustrative examples, we discuss how a hyperelastic tube supports the same pressure load as a linearly elastic tube with smaller deformation, thus requiring a higher pressure drop across itself to maintain a fixed flow rate. Comment: 19 pages, 3 figures, Springer book class; v2: minor revisions, final form of invited contribution to the Springer volume entitled "Dynamical Processes in Generalized Continua and Structures" (in honour of Academician D.I. Indeitsev), eds. H. Altenbach, A. Belyaev, V. A. Eremeyev, A. Krivtsov and A. V. Porubov

    Inhomogeneous Condensates in the Thermodynamics of the Chiral NJL_2 model

    We analyze the thermodynamical properties, at finite density and nonzero temperature, of the (1+1)-dimensional chiral Gross-Neveu model (the NJL_2 model), using the exact inhomogeneous (crystalline) condensate solutions to the gap equation. The continuous chiral symmetry of the model plays a crucial role, and the thermodynamics leads to a broken phase with a periodic spiral condensate, the "chiral spiral", as a thermodynamically preferred limit of the more general "twisted kink crystal" solution of the gap equation. This situation should be contrasted with the Gross-Neveu model, which has a discrete chiral symmetry, and for which the phase diagram has a crystalline phase with a periodic kink crystal. We use a combination of analytic, numerical and Ginzburg-Landau techniques to study various parts of the phase diagram. Comment: 28 pages, 13 figures

    Four-dimensional generalized difference matrix and some double sequence spaces

    In this study, I introduce the new double sequence spaces B(M_u), B(C_p), B(C_{bp}), B(C_r) and B(L_q) as the domains of the four-dimensional generalized difference matrix B(r,s,t,u) in the spaces M_u, C_p, C_{bp}, C_r and L_q, respectively. I show that the double sequence spaces B(M_u), B(C_{bp}) and B(C_r) are Banach spaces under certain conditions. I give some inclusion relations together with some topological properties. Moreover, I determine the α-dual of the spaces B(M_u) and B(C_{bp}), the β(ϑ)-duals of the spaces B(M_u), B(C_p), B(C_{bp}), B(C_r) and B(L_q), where ϑ∈{p,bp,r}, and the γ-dual of the spaces B(M_u), B(C_{bp}) and B(L_q). Finally, I characterize the classes of four-dimensional matrix mappings defined on the spaces B(M_u), B(C_p), B(C_{bp}), B(C_r) and B(L_q) of double sequences.