Complex-Valued Reinforcement Learning: a Context-Based Approach for POMDPs

Author: Takeshi Shibuya
Tomoki Hamagami
Publication venue: 'IntechOpen'
Publication date: 14/01/2011

IntechOpen

Proposal and Evaluation of the Improved Penalty Avoiding Rational Policy Making Algorithm

Author: Hiroaki Kobayashi
Kazuteru Miyazaki
Takuji Namatame
Publication venue: 'IntechOpen'
Publication date: 01/01/2009

IntechOpen

A Novel Credit Assignment to a Rule with Probabilistic State Transition

Author: Wataru Uemura
Publication venue: 'IntechOpen'
Publication date: 01/02/2010

IntechOpen

An Analysis of Multi-Agent Reinforcement Learning for Decentralized Inventory Control Systems

Author: del Rio-Chanona Ehecatl Antonio
Kotecha Niki
Mousa Marwan
Mowbray Max
van de Berg Damien
Publication date: 21/07/2023

Most solutions to the inventory management problem assume a centralization of information that is incompatible with organisational constraints in real supply chain networks. The inventory management problem is a well-known planning problem in operations research, concerned with finding the optimal re-order policy for nodes in a supply chain. While many centralized solutions to the problem exist, they are not applicable to real-world supply chains made up of independent entities. The problem can, however, be naturally decomposed into sub-problems, each associated with an independent entity, turning it into a multi-agent system. Therefore, a decentralized data-driven solution to inventory management problems using multi-agent reinforcement learning is proposed, where each entity is controlled by an agent. Three multi-agent variations of the proximal policy optimization algorithm are investigated through simulations of different supply chain networks and levels of uncertainty. The centralized training, decentralized execution framework is deployed, which relies on offline centralization during simulation-based policy identification but enables decentralization when the policies are deployed online to the real system. Results show that using multi-agent proximal policy optimization with a centralized critic leads to performance very close to that of a centralized data-driven solution and outperforms a distributed model-based solution in most cases, while respecting the information constraints of the system.

arXiv.org e-Print Archive

Deep Learning: Our Miraculous Year 1990-1991

Author: Schmidhuber Juergen
Publication date: 12/05/2020

In 2020, we will celebrate that many of the basic ideas behind the deep learning revolution were published three decades ago within fewer than 12 months in our "Annus Mirabilis" or "Miraculous Year" 1990-1991 at TU Munich. Back then, few people were interested, but a quarter century later, neural networks based on these ideas were on over 3 billion devices such as smartphones, and used many billions of times per day, consuming a significant fraction of the world's compute. Comment: 37 pages, 188 references, based on work of 4 Oct 201

arXiv.org e-Print Archive