
11 research outputs found

Tsitsiklis, “Linear Stochastic Approximation Driven by Slowly Varying

Author: John N. Tsitsiklis
Vijay R. Konda

www.elsevier.com/locate/sysconl

CiteSeerX

Actor-critic algorithms

Author: John N. Tsitsiklis
Vijay R. Konda
Publication venue: MIT Press

Abstract. In this article, we propose and analyze a class of actor-critic algorithms. These are two-time-scale algorithms in which the critic uses temporal difference learning with a linearly parameterized approximation architecture, and the actor is updated in an approximate gradient direction, based on information provided by the critic. We show that the features for the critic should ideally span a subspace prescribed by the choice of parameterization of the actor. We study actor-critic algorithms for Markov decision processes with Polish state and action spaces. We state and prove two results regarding their convergence.

CiteSeerX

On De Finetti Coherence and Kolmogorov Probability

Author: Goldman Sachs
Sanjoy K. Mitter
Vijay R. Konda
Vivek S. Borkar
Publication date: 01/01/2004

This article addresses the problem of existence of a countably additive probability measure in the sense of Kolmogorov that is consistent with a probability assignment to a family of sets which is coherent in the sense of De Finetti. Key words: probability assignment, coherence condition, subjective probability, countably additive probability. This work was done while visiting the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology. This research was supported by Grant No. III 5(12)/96-ET from the Department of Science and Technology, Government of India, and the U.S. Army Research Office under the MURI Grant: Data Fusion in Large Arrays of Microsensors DAAD19-00-1-0466.

CiteSeerX

Language support for multi agent reinforcement learning

Author: Andre David
Barat Souvik
Boschert Stefan
Buşoniu Lucian
Foerster Jakob
Konda Vijay R
Michel Fabien
Schwab Klaus
Simpkins Christopher
Tuegel Eric J
Zheng Lianmin
Publication venue: Association for Computing Machinery (ACM)
Publication date: 27/02/2020

Software engineering must increasingly address the issues of complexity and uncertainty that arise when systems are to be deployed into a dynamic software ecosystem. There is also interest in using digital twins of systems in order to design, adapt and control them when faced with such issues. The use of multi-agent systems in combination with reinforcement learning is an approach that will allow software to adapt intelligently in response to changes in the environment. This paper proposes a language extension that encapsulates learning-based agents and system building operations and shows how it is implemented in ESL. The paper includes examples of the key features and describes the application of agent-based learning, implemented in ESL, to a real-world supply chain.

Crossref

Middlesex University Research Repository