6,070 research outputs found

Reinforcement Learning: A Survey

Author: Kaelbling L. P.
Littman M. L.
Moore A. W.
Publication date: 01/01/1996

This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning. Both the historical basis of the field and a broad selection of current work are summarized. Reinforcement learning is the problem faced by an agent that learns behavior through trial-and-error interactions with a dynamic environment. The work described here has a resemblance to work in psychology, but differs considerably in the details and in the use of the word "reinforcement." The paper discusses central issues of reinforcement learning, including trading off exploration and exploitation, establishing the foundations of the field via Markov decision theory, learning from delayed reinforcement, constructing empirical models to accelerate learning, making use of generalization and hierarchy, and coping with hidden state. It concludes with a survey of some implemented systems and an assessment of the practical utility of current methods for reinforcement learning.

Comment: See http://www.jair.org/ for any accompanying file

arXiv.org e-Print Archive

CiteSeerX

Inclusive Cognitive Hierarchy

Author: Koriyama Yukio
Ozkes Ali
Publication venue: WU Vienna University of Economics and Business
Publication date: 04/03/2020

Cognitive hierarchy theory, a collection of structural models of non-equilibrium thinking in which players' best responses rely on heterogeneous beliefs about others' strategies, including naive behavior, has proved powerful in explaining observations from a wide range of games. We introduce an inclusive cognitive hierarchy model, in which players do not rule out the possibility of facing opponents at their own thinking level. Our theoretical results show that inclusiveness is crucial for the asymptotic properties of deviations from equilibrium behavior in expansive games. We show that the limiting behaviors fall into three distinct types: naive, Savage rational with inconsistent beliefs, and sophisticated. We test the model in a laboratory experiment on collective decision-making. The data suggest that inclusiveness is indispensable to the explanatory power of models of hierarchical thinking.

Series: Department of Strategy and Innovation Working Paper Serie

Multi-Layer Cyber-Physical Security and Resilience for Smart Grid

Author: CH Hauser
DH Lorenz
Drew Fudenberg
E Santacana
F Kuipers
FF Wu
G Xue
GN Ericsson
J Casey
K Tomsovic
Mohammad Hossein Manshaei
MS Amin
P McDaniel
P Mieghem Van
Q Zhu
R Hou
S Greengard
S Rass
T Başar
Publication date: 29/09/2018

The smart grid is a large-scale complex system that integrates communication technologies with the physical-layer operation of energy systems. Security and resilience mechanisms built in by design are important for guaranteeing the operation of the system. This chapter provides a layered perspective on smart grid security and discusses game and decision theory as tools for modeling the interactions among system components and the interaction between attackers and the system. We discuss game-theoretic applications and challenges in the design of cross-layer robust and resilient controllers and of secure network routing protocols at the data communication and networking layers, as well as the challenges of information security at the management layer of the grid. The chapter also discusses future directions for using game-theoretic tools to address multi-layer security issues in the smart grid.

Comment: 16 page

arXiv.org e-Print Archive

A Bayesian Approach to the Estimation of Environmental Kuznets Curves for CO2 Emissions

Author: Antonio Musolesi
Massimiliano Mazzanti
Roberto Zoboli

This paper investigates Environmental Kuznets Curves (EKCs) for CO2 emissions in a panel of 109 countries during the period 1959-2001. The length of the series makes the application of a heterogeneous estimator suitable from an econometric point of view. The results, based on the hierarchical Bayes estimator, show that different EKC dynamics are associated with the different subsamples of countries considered. On average, more industrialized countries show evidence of an EKC in quadratic specifications, which are nevertheless probably evolving into an N shape that emerges from cubic specifications. Less developed countries consistently show that CO2 emissions still rise with income, though some signals of an EKC path arise.

Keywords: Environmental Kuznets Curve, CO2 Emissions, Bayesian Approach, Heterogeneous Panels