70 research outputs found

    Multistage decisions and risk in Markov decision processes: towards effective approximate dynamic programming architectures

    The scientific domain of this thesis is optimization under uncertainty for discrete event stochastic systems. In particular, this thesis focuses on the practical implementation of the Dynamic Programming (DP) methodology for discrete event stochastic systems. Unfortunately, DP in its crude form suffers from three severe computational obstacles that make its implementation for such systems an impossible task. This thesis addresses these obstacles by developing and executing practical Approximate Dynamic Programming (ADP) techniques. Specifically, for the purposes of this thesis we developed the following ADP techniques. The first one is inspired by the Reinforcement Learning (RL) literature and is termed Real Time Approximate Dynamic Programming (RTADP). The RTADP algorithm is meant for active learning while operating the stochastic system. The basic idea is that the agent, while constantly interacting with the uncertain environment, accumulates experience, which enables it to react closer to optimally in similar future situations. The second one is an off-line ADP procedure. These ADP techniques are demonstrated on a variety of discrete event stochastic systems such as: i) a three stage queuing manufacturing network with recycle, ii) a supply chain of the light aromatics of a typical refinery, iii) several stochastic shortest path instances with a single starting and terminal state and iv) a general project portfolio management problem. Moreover, this work addresses, in a systematic way, the issue of multistage risk within the DP framework by exploring the usage of intra-period and inter-period risk sensitive utility functions. In this thesis we propose a special structure for an intra-period utility and compare the derived policies in several multistage instances. Ph.D. Committee Chair: Jay H. Lee; Committee Member: Martha Grover; Committee Member: Matthew J. Realff; Committee Member: Shabbir Ahmed; Committee Member: Stylianos Kavadia

    Reinforcement Learning and Tree Search Methods for the Unit Commitment Problem

    The unit commitment (UC) problem, which determines operating schedules of generation units to meet demand, is a fundamental task in power systems operation. Existing UC methods using mixed-integer programming are not well-suited to highly stochastic systems. Approaches which more rigorously account for uncertainty could yield large reductions in operating costs by reducing spinning reserve requirements; operating power stations at higher efficiencies; and integrating greater volumes of variable renewables. A promising approach to solving the UC problem is reinforcement learning (RL), a methodology for optimal decision-making which has been used to conquer long-standing grand challenges in artificial intelligence. This thesis explores the application of RL to the UC problem and addresses challenges including robustness under uncertainty; generalisability across multiple problem instances; and scaling to larger power systems than previously studied. To tackle these issues, we develop guided tree search, a novel methodology combining model-free RL and model-based planning. The UC problem is formalised as a Markov decision process and we develop an open-source environment based on real data from Great Britain's power system to train RL agents. In problems of up to 100 generators, guided tree search is shown to be competitive with deterministic UC methods, reducing operating costs by up to 1.4%. An advantage of RL is that the framework can be easily extended to incorporate considerations important to power systems operators such as robustness to generator failure, wind curtailment or carbon prices. When generator outages are considered, guided tree search saves over 2% in operating costs as compared with methods using conventional N-x reserve criteria

    Formal Methods for Autonomous Systems

    Formal methods refer to rigorous, mathematical approaches to system development and have played a key role in establishing the correctness of safety-critical systems. The main building blocks of formal methods are models and specifications, which are analogous to behaviors and requirements in system design and give us the means to verify and synthesize system behaviors with formal guarantees. This monograph provides a survey of the current state of the art on applications of formal methods in the autonomous systems domain. We consider correct-by-construction synthesis under various formulations, including closed systems, reactive, and probabilistic settings. Beyond synthesizing systems in known environments, we address the concept of uncertainty and bound the behavior of systems that employ learning using formal methods. Further, we examine the synthesis of systems with monitoring, a mitigation technique for ensuring that once a system deviates from expected behavior, it knows a way of returning to normalcy. We also show how to overcome some limitations of formal methods themselves with learning. We conclude with future directions for formal methods in reinforcement learning, uncertainty, privacy, explainability of formal methods, and regulation and certification

    Towards Optimal Real-Time Automotive Emission Control

    The legal bounds on both toxic and carbon dioxide emissions from automotive vehicles are continuously being lowered, forcing manufacturers to rely on increasingly advanced methods to reduce emissions and improve fuel efficiency. Though great strides have been made to date, there is still a large potential for continued improvement. Today, many subsystems in vehicles are optimized for static operation, where subsystems in the vehicle perform well at constant operating points. Extending optimal operation to the dynamic case through the use of optimal control is one method for further improvements. This thesis focuses on two subtopics that are crucial for implementing optimal control: dynamic modeling of vehicle subsystems, and methods for generating and evaluating computationally efficient optimal controllers. Though today's vehicles are outfitted with increasingly powerful computers, their computational performance is low compared to a conventional PC. Any controller must therefore be very computationally efficient in order to feasibly be implemented. Furthermore, a sufficiently accurate dynamic model of the subsystem is needed in order to determine the optimal control value. Though many dynamic models of the vehicle's subsystems exist, most do not fulfill the specific requirements set by optimal controllers. This thesis comprises five papers that, together, probe some methods of implementing dynamic optimal control in real-time. Two papers develop optimal control methods, one introduces and studies a cold-start model of the three-way catalyst, one paper extends the three-way catalyst model and studies optimal cold-start control, and one considers fuel-optimally controlling the speed of the engine in a series-hybrid. By combining the method and model papers, we open up the potential to reduce toxic emissions by better managing cold-starts in hybrid vehicles, as well as to reduce carbon dioxide emissions by operating the engine in a more efficient manner during transients