Generic Reward Model of Partially Observable Markov Decision Processes (POMDP) for Pattern Detection

Abstract

This work is motivated by research on deep reinforcement learning and stochastic modeling for the optimization of bottleneck phenomena, using big data technology and IoT-based sensors. In this paper we propose a generic representation of the bottleneck phenomenon, which narrows (limits) the possible actions in the observed field. One example is the impact of dangerous epidemics on human activity in economic, social, and many other areas, which disrupts the related scheduling process; here the activity threshold must stay within an interval of actions in order to avoid entering a bottleneck. On the other hand, we consider a powerful reinforcement learning model that handles difficult situations approaching real-world complexity. At each level, the data from the previous level allow a better new action that may yield higher rewards in the next transitions, and a precise representation of the reward at the studied situation level supports better-informed decisions in future examinations.

Similar works