13 research outputs found

    Using a reinforcement learning approach in a discrete event manufacturing system

    Supplement to the printed edition; it is available online only.

    Enhancement of Industrial Energy Efficiency and Sustainability

    Industrial energy efficiency has been recognized as a major contributor, within the broader set of industrial resources, to improved sustainability and the circular economy. Nevertheless, the uptake of energy efficiency measures and practices remains quite low, owing to several barriers. Research has discussed these barriers extensively, together with their drivers. More recently, many researchers have highlighted several benefits beyond mere energy savings that stem from the adoption of such measures, for the various stakeholders involved in the value chain of energy efficiency solutions. However, a deep understanding of the relationships between the use of energy and other resources in industry, together with the most important factors for the uptake of such measures, also in light of their implications for industrial operations, is still lacking. Such an understanding could further stimulate the adoption of solutions for improved industrial energy efficiency and sustainability.

    Advances in Condition Monitoring, Optimization and Control for Complex Industrial Processes

    The book collects 25 papers from the Special Issue “Advances in Condition Monitoring, Optimization and Control for Complex Industrial Processes”, highlighting recent research trends in complex industrial processes. It aims to stimulate the research field and to benefit readers from both academic institutes and industrial sectors.

    Hierarchical Average Reward Reinforcement Learning

    Hierarchical reinforcement learning (HRL) is a general framework for scaling reinforcement learning (RL) to problems with large state and action spaces by using the task (or action) structure to restrict the space of policies. Prior work in HRL, including HAMs, options, MAXQ, and PHAMs, has been limited to the discrete-time discounted reward semi-Markov decision process (SMDP) model. The average reward optimality criterion has been recognized to be more appropriate for a wide class of continuing tasks than the discounted framework. Although average reward RL has been studied for decades, prior work has been largely limited to flat policy representations. In this paper, we develop a framework for HRL based on the average reward optimality criterion. We investigate two formulations of HRL based on the average reward SMDP model, both in discrete time and continuous time. These formulations correspond to two notions of optimality that have been previously explored in HRL: hierarchical optimality and recursive optimality. We present algorithms that learn to find hierarchically and recursively optimal average reward policies under discrete-time and continuous-time average reward SMDP models.
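    The average reward criterion discussed in this abstract scores a policy by its expected reward per step rather than by a discounted sum. A minimal flat (non-hierarchical) illustration of that criterion is tabular R-learning, sketched below in Python; the MDP interface, learning rates, and exploration scheme are illustrative assumptions, not taken from the paper.

        import numpy as np

        def r_learning(P, R, n_steps=50_000, alpha=0.1, beta=0.01, eps=0.1, seed=0):
            """Tabular average-reward RL (R-learning style) on a small MDP.

            P[s, a] is a probability vector over next states; R[s, a] is the
            expected immediate reward. Both are illustrative inputs.
            """
            rng = np.random.default_rng(seed)
            n_states, n_actions = R.shape
            Q = np.zeros((n_states, n_actions))  # relative (bias) action values
            rho = 0.0                            # running estimate of the gain (reward per step)
            s = 0
            for _ in range(n_steps):
                # epsilon-greedy action selection
                a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
                greedy = Q[s, a] == np.max(Q[s])
                s_next = int(rng.choice(n_states, p=P[s, a]))
                r = R[s, a]
                # Average-reward TD error: subtract the gain rho instead of discounting.
                Q[s, a] += alpha * (r - rho + np.max(Q[s_next]) - Q[s, a])
                if greedy:
                    # Update the gain estimate only when the greedy action was taken.
                    rho += beta * (r + np.max(Q[s_next]) - np.max(Q[s]) - rho)
                s = s_next
            return Q, rho

    In the hierarchical formulations studied in the paper, the same gain-based update idea is applied per subtask of an SMDP rather than to a single flat policy.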

    Hierarchical Average Reward Reinforcement Learning

    Hierarchical reinforcement learning (HRL) is the study of mechanisms for exploiting the structure of tasks in order to learn more quickly. By decomposing tasks into subtasks, fully or partially specified subtask solutions can be reused in solving tasks at higher levels of abstraction. The theory of semi-Markov decision processes provides a theoretical basis for HRL. Several variant representational schemes based on SMDP models have been studied in previous work, all of which are based on the discrete-time discounted SMDP model. In this approach, policies are learned that maximize the long-term discounted sum of rewards. In this paper we investigate two formulations of HRL based on the average-reward SMDP model, both for discrete time and continuous time. In the average-reward model, policies are sought that maximize the expected reward per step. The two formulations correspond to two different notions of optimality that have been explored in previous work on HRL: hierarchical optimality, which corresponds to the set of optimal policies in the space defined by a task hierarchy, and a weaker local model called recursive optimality. What distinguishes the two models in the average reward framework is the optimization objective: hierarchical optimality maximizes the gain of the overall task, whereas recursive optimality maximizes the gain of each subtask separately.
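    To make the notion of a task hierarchy concrete, the sketch below builds a small MAXQ-style decomposition in Python for a taxi-like domain; the subtask names and tree structure are illustrative assumptions, not the decomposition used in the paper. Under recursive optimality each composite node would be solved for its own gain in isolation, whereas under hierarchical optimality every node is judged by its contribution to the gain of the root task.

        from dataclasses import dataclass, field

        @dataclass
        class Subtask:
            """A node in a MAXQ-style task hierarchy (illustrative)."""
            name: str
            children: list = field(default_factory=list)  # primitive actions or subtasks

            def is_primitive(self):
                return not self.children

        # Primitive actions (leaves of the hierarchy)
        north, south, east, west = (Subtask(n) for n in ("north", "south", "east", "west"))
        pickup, putdown = Subtask("pickup"), Subtask("putdown")

        # Composite subtasks: the same navigation subtask is reused in two places.
        navigate = Subtask("navigate", [north, south, east, west])
        get_passenger = Subtask("get_passenger", [navigate, pickup])
        put_passenger = Subtask("put_passenger", [navigate, putdown])
        root = Subtask("root", [get_passenger, put_passenger])

        def print_tree(task, depth=0):
            """Show the decomposition; primitive leaves are marked."""
            tag = " (primitive)" if task.is_primitive() else ""
            print("  " * depth + task.name + tag)
            for child in task.children:
                print_tree(child, depth + 1)

        print_tree(root)

    Sharing the navigate subtask between get_passenger and put_passenger is what enables the reuse of subtask solutions mentioned in the abstract.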