
    Federated Reinforcement Learning for Electric Vehicles Charging Control on Distribution Networks

    With the growing popularity of electric vehicles (EVs), maintaining power grid stability has become a significant challenge. To address this issue, EV charging control strategies have been developed to manage the switching between vehicle-to-grid (V2G) and grid-to-vehicle (G2V) modes. In this context, multi-agent deep reinforcement learning (MADRL) has proven effective for EV charging control. However, existing MADRL-based approaches fail to consider the natural power flow of EV charging/discharging in the distribution network and ignore driver privacy. To deal with these problems, this paper proposes a novel approach that combines multi-EV charging/discharging with a radial distribution network (RDN) operating under optimal power flow (OPF) to distribute power flow in real time. A mathematical model is developed to describe the RDN load. The EV charging control problem is formulated as a Markov Decision Process (MDP) to find an optimal charging control strategy that balances V2G profits, RDN load, and driver anxiety. To effectively learn the optimal EV charging control strategy, a federated deep reinforcement learning algorithm named FedSAC is further proposed. Comprehensive simulation results demonstrate the effectiveness and superiority of the proposed algorithm in terms of the diversity of the charging control strategy, the power fluctuations on the RDN, the convergence efficiency, and the generalization ability.
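The federated setup described above can be illustrated with a minimal sketch: each EV agent locally optimizes a reward trading off V2G profit, RDN load deviation, and driver anxiety, and only model parameters (never driving data) are averaged across agents. The function names, weights, and shapes below are illustrative assumptions, not FedSAC's actual implementation.

```python
import numpy as np

def charging_reward(v2g_profit, rdn_load_dev, anxiety, w=(1.0, 0.5, 0.5)):
    """Hypothetical scalar reward balancing the abstract's three objectives:
    V2G profit (maximize), RDN load deviation (minimize), driver anxiety
    (minimize). The weights w are illustrative."""
    w1, w2, w3 = w
    return w1 * v2g_profit - w2 * rdn_load_dev - w3 * anxiety

def fed_avg(local_params):
    """One federated-averaging round: average each layer's parameters across
    agents, so only parameter vectors (not private data) leave the vehicle."""
    return [np.mean(layer, axis=0) for layer in zip(*local_params)]

# Two toy agents, each holding two "layers" of parameters
agents = [
    [np.array([1.0, 2.0]), np.array([3.0])],
    [np.array([3.0, 4.0]), np.array([5.0])],
]
global_params = fed_avg(agents)
```

The averaged parameters would then be broadcast back to every agent before the next round of local training.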

    Wind Turbine Fault-Tolerant Control via Incremental Model-Based Reinforcement Learning

    A reinforcement learning (RL) based fault-tolerant control strategy is developed in this paper for wind turbine torque and pitch control under actuator and sensor faults subject to unknown system models. An incremental model-based heuristic dynamic programming (IHDP) approach, along with a critic-actor structure, is designed to enable fault-tolerance capability and achieve optimal control. In particular, an incremental model is embedded in the critic-actor structure to quickly learn potential system changes, such as faults, in real time. Unlike current IHDP methods that require intensive evaluation of both the state and input matrices, only the input matrix of the incremental model is dynamically evaluated and updated, by an online recursive least-squares estimation procedure, in the proposed method. This design significantly enhances online model-evaluation efficiency and control performance, especially under faulty conditions. In addition, a value function and a target critic network are incorporated into the main critic-actor structure to improve the method's learning effectiveness. Case studies for wind turbines under various working conditions are conducted on the fatigue, aerodynamics, structures, and turbulence (FAST) simulator to demonstrate the proposed method's solid fault-tolerance capability and adaptability. Note to Practitioners — This work achieves high-performance wind turbine control under unknown actuator and sensor faults. This task remains an open problem due to the complexity of turbine dynamics and potential uncertainties in practical situations. A novel data-driven, model-free control strategy based on reinforcement learning is proposed to handle these issues. The designed method can quickly capture potential changes in the system and adjust its control policy in real time, providing strong adaptability and fault tolerance. It offers data-driven innovations for complex operational tasks of wind turbines and demonstrates the feasibility of applying reinforcement learning to fault-tolerant control problems. The proposed method has a generic structure and has the potential to be implemented in other renewable energy systems.
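The abstract's key efficiency idea, updating only the input matrix of the incremental model by online recursive least squares, can be sketched as follows. The variable names and the toy scalar system are illustrative, not the paper's notation.

```python
import numpy as np

def rls_update(theta, P, phi, y, lam=0.99):
    """One recursive least-squares step with forgetting factor `lam`.
    In the abstract's setting, theta would play the role of a row of the
    incremental model's input matrix and phi the control increment; here
    the names are illustrative assumptions."""
    phi = phi.reshape(-1, 1)
    gain = P @ phi / (lam + (phi.T @ P @ phi).item())   # RLS gain vector
    err = y - (phi.T @ theta.reshape(-1, 1)).item()     # prediction error
    theta = theta + gain.flatten() * err                # parameter update
    P = (P - gain @ phi.T @ P) / lam                    # covariance update
    return theta, P

# Identify a toy scalar incremental model y = g * u with true g = 2
rng = np.random.default_rng(0)
theta, P = np.zeros(1), np.eye(1) * 100.0
for _ in range(200):
    u = rng.normal(size=1)
    theta, P = rls_update(theta, P, u, 2.0 * u[0])
```

The forgetting factor below 1 lets the estimate track sudden parameter changes, which is how an online estimator can react to faults.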

    Amortized Network Intervention to Steer the Excitatory Point Processes

    We tackle the challenge of large-scale network intervention for guiding excitatory point processes, such as infectious disease spread or traffic congestion control. Our model-based reinforcement learning utilizes neural ODEs to capture how the networked excitatory point processes will evolve subject to time-varying changes in network topology. Our approach incorporates Gradient-Descent-based Model Predictive Control (GD-MPC), offering policy flexibility to accommodate prior knowledge and constraints. To address the intricacies of planning and overcome the high dimensionality inherent to such decision-making problems, we design an Amortized Network Interventions (ANI) framework, allowing for the pooling of optimal policies from history and other contexts while ensuring a permutation equivalent property. This property enables efficient knowledge transfer and sharing across diverse contexts. Our approach has broad applications, from curbing infectious disease spread to reducing carbon emissions through traffic light optimization, and thus has the potential to address critical societal and environmental challenges.
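The gradient-descent MPC component can be illustrated on a toy system: roll the dynamics forward over a horizon, backpropagate the cost through the rollout, and descend on the control sequence. The scalar dynamics and quadratic cost below stand in for the paper's neural-ODE model and are assumptions of this sketch.

```python
import numpy as np

def gd_mpc(x0, horizon=5, steps=1000, lr=0.01):
    """Gradient-descent MPC on a toy scalar system x_{t+1} = x_t + u_t:
    choose controls minimizing sum(x_t^2) + 0.1 * sum(u_t^2) by analytic
    gradient descent through the rollout (an adjoint-style backward pass)."""
    u = np.zeros(horizon)
    for _ in range(steps):
        xs = [x0]
        for t in range(horizon):                # forward rollout
            xs.append(xs[-1] + u[t])
        grad, adj = np.zeros(horizon), 0.0
        for t in reversed(range(horizon)):      # backward pass
            adj += 2.0 * xs[t + 1]              # cost gradient w.r.t. state
            grad[t] = adj + 0.2 * u[t]          # plus control penalty term
        u -= lr * grad
    x = x0
    for t in range(horizon):                    # final rollout with tuned u
        x += u[t]
    return u, x

u_opt, x_final = gd_mpc(1.0)
```

In the paper's setting the forward rollout would be a neural-ODE integration and the backward pass automatic differentiation; the structure of the loop is the same.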

    Environment Representations of Railway Infrastructure for Reinforcement Learning-Based Traffic Control

    The real-time railway rescheduling problem is a crucial challenge for human operators, since many factors must be considered during decision making, from the positions and velocities of the vehicles to the different regulations of individual railway companies. As a result, human operators cannot be expected to provide optimal decisions in every situation. Given the recent successes of multi-agent deep reinforcement learning in challenging control problems, it is a natural fit for this domain. Consequently, this paper proposes a multi-agent deep reinforcement learning-based approach with different state-representation choices to solve the real-time railway rescheduling problem. In a comparison of methods, the proposed learning-based approaches outperform their competitors, including the Monte Carlo tree search algorithm, which is utilized as a model-based planner, as well as other learning-based methods that utilize different abstractions. The results show that the proposed representation has greater generalization potential and provides superior performance.
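A concrete way to picture the state-representation choice the abstract emphasizes: each junction agent observes a fixed-size feature vector built from nearby train dynamics plus signal aspects. The field names, padding scheme, and sizes below are assumptions for illustration, not the paper's actual representation.

```python
def junction_state(trains, signals, max_trains=4):
    """Illustrative fixed-size state vector for one junction agent:
    position/velocity of up to `max_trains` approaching trains (zero-padded
    so the vector length is constant), followed by current signal aspects."""
    feats = []
    for t in trains[:max_trains]:
        feats += [t["pos"], t["vel"]]
    # Zero-pad missing train slots so every agent sees the same shape
    feats += [0.0, 0.0] * (max_trains - min(len(trains), max_trains))
    feats += [float(s) for s in signals]
    return feats

# One approaching train, two signal aspects (green=1, red=0)
state = junction_state([{"pos": 0.5, "vel": 1.0}], signals=[1, 0])
```

Fixed-size vectors like this trade expressiveness for simplicity; graph-based representations are the usual alternative when generalization across layouts matters.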

    A Sarsa(λ)

    Traffic problems often arise when vehicle demand exceeds road capacity. Intelligent traffic control aims to maximize traffic flow and minimize average waiting time. Each junction seeks greater throughput for itself, so junctions form coordination policies, along with constraints on adjacent junctions, to maximize their own interests. A good traffic-signal timing policy helps solve the problem; however, because so many factors affect the traffic control model, the optimal solution is difficult to find. Conventional traffic-light controllers cannot learn from past experience, so they fail to adapt to dynamic changes in traffic flow. Given the dynamic characteristics of the actual traffic environment, a reinforcement learning-based traffic control approach can be applied to obtain an optimal scheduling policy. The proposed Sarsa(λ)-based real-time traffic control optimization model maintains the signal-timing policy more effectively. The Sarsa(λ)-based model derives a traffic cost per vehicle, combining delay time, the number of waiting vehicles, and the integrated saturation, and learns from experience to determine optimal actions. The experimental results show an encouraging improvement in traffic control, indicating that the proposed model can facilitate real-time dynamic traffic control.
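The learning rule behind the abstract is standard tabular Sarsa(λ) with eligibility traces. Below is a minimal, self-contained version on a toy episodic environment; the environment and hyperparameters are stand-ins, not the paper's traffic cost function.

```python
import random

def sarsa_lambda(env_step, n_states, n_actions, episodes=200,
                 alpha=0.1, gamma=0.95, lam=0.9, eps=0.1, seed=0):
    """Tabular Sarsa(lambda) with replacing eligibility traces and an
    epsilon-greedy policy. env_step(s, a) -> (next_state, reward, done)."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]

    def policy(s):
        if rng.random() < eps:
            return rng.randrange(n_actions)
        return max(range(n_actions), key=lambda a: Q[s][a])

    for _ in range(episodes):
        E = [[0.0] * n_actions for _ in range(n_states)]  # traces
        s, a = 0, policy(0)
        done = False
        while not done:
            s2, r, done = env_step(s, a)
            a2 = policy(s2)
            delta = r + (0.0 if done else gamma * Q[s2][a2]) - Q[s][a]
            E[s][a] = 1.0                                 # replacing trace
            for i in range(n_states):
                for j in range(n_actions):
                    Q[i][j] += alpha * delta * E[i][j]
                    E[i][j] *= gamma * lam                # trace decay
            s, a = s2, a2
    return Q

# Toy 3-state chain: both actions advance; action 1 pays reward 1
def env_step(s, a):
    s2 = s + 1
    return s2, (1.0 if a == 1 else 0.0), s2 >= 2

Q = sarsa_lambda(env_step, n_states=3, n_actions=2)
```

The eligibility traces are what distinguish Sarsa(λ) from one-step Sarsa: a single TD error updates every recently visited state-action pair, which speeds credit assignment along a signal-timing trajectory.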

    Multi-stage stochastic optimization and reinforcement learning for forestry epidemic and COVID-19 control planning

    This dissertation focuses on developing new modeling and solution approaches, based on multi-stage stochastic programming and reinforcement learning, for tackling biological invasions in forests and human populations. The Emerald Ash Borer (EAB) is the nemesis of ash trees. This research introduces a multi-stage stochastic mixed-integer programming model to assist forest agencies in managing emerald ash borer insects throughout the U.S. and maximize the public benefits of preserving healthy ash trees. This work is then extended to present the first risk-averse multi-stage stochastic mixed-integer program in the invasive species management literature to account for extreme events. Significant computational achievements are obtained using a scenario dominance decomposition and cutting plane algorithm. The results of this work provide crucial insights and decision strategies for optimal resource allocation among surveillance, treatment, and removal of ash trees, leading to a better and healthier environment for future generations. This dissertation also addresses the computational difficulty of solving one of the most difficult classes of combinatorial optimization problems, the Multi-Dimensional Knapsack Problem (MKP). A novel 2-Dimensional (2D) deep reinforcement learning (DRL) framework is developed to represent and solve combinatorial optimization problems, focusing on the MKP. The DRL framework trains different agents to make sequential decisions and find the optimal solution while satisfying the resource constraints of the problem. To our knowledge, this is the first DRL model of its kind in which a 2D environment is formulated and an element of the DRL solution matrix represents an item of the MKP. Our DRL framework solves medium-sized and large-sized instances at least 45 and 10 times faster in CPU solution time, respectively, with a maximum solution gap of 0.28% compared to the solution performance of CPLEX.
    Applying this methodology, yet another recent epidemic problem is tackled: COVID-19. This research investigates a reinforcement learning approach, tailored with an agent-based simulation model, to simulate disease growth and optimize decision-making during an epidemic. The framework is validated using COVID-19 data from the Centers for Disease Control and Prevention (CDC). The research results provide important insights into the government response to COVID-19 and vaccination strategies.
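The sequential-decision framing of the MKP described above can be sketched without any learning: the state is the vector of remaining multi-dimensional capacities and the action is take/skip for the next item. Below, a simple value-density greedy rule stands in for the trained DRL agent; it illustrates the formulation, not the dissertation's 2D framework.

```python
def mkp_sequential(values, weights, capacities):
    """Sequential take/skip decisions for the multi-dimensional knapsack:
    items are visited in decreasing value-per-total-weight order (a greedy
    stand-in policy) and taken only if every capacity dimension permits."""
    remaining = list(capacities)
    total, taken = 0.0, []
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / (sum(weights[i]) + 1e-9),
                   reverse=True)
    for i in order:
        if all(weights[i][d] <= remaining[d] for d in range(len(remaining))):
            for d in range(len(remaining)):
                remaining[d] -= weights[i][d]   # update the "state"
            total += values[i]
            taken.append(i)
    return total, sorted(taken)

# 3 items, 2 capacity dimensions
total, taken = mkp_sequential(values=[10, 6, 4],
                              weights=[[5, 4], [4, 3], [2, 2]],
                              capacities=[8, 6])
```

A learned policy replaces the fixed greedy ordering with state-dependent decisions, which is where the reported speedups over CPLEX come from.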