    Mixed-Integer Optimal Control via Reinforcement Learning: A Case Study on Hybrid Vehicle Energy Management

    Many optimal control problems require the simultaneous output of continuous and discrete control variables. Such problems are usually formulated as mixed-integer optimal control (MIOC) problems, which are challenging to solve due to the complexity of the solution space. Numerical methods such as branch-and-bound are computationally expensive and unsuitable for real-time control. This paper proposes a novel continuous-discrete reinforcement learning (CDRL) algorithm, twin delayed deep deterministic actor-Q (TD3AQ), for MIOC problems. TD3AQ combines the advantages of both actor-critic and Q-learning methods, and can handle the continuous and discrete action spaces simultaneously. The proposed algorithm is evaluated on a hybrid electric vehicle (HEV) energy management problem, where real-time control of the continuous variable engine torque and discrete variable gear ratio is essential to maximize fuel economy while satisfying driving constraints. Simulation results on different drive cycles show that TD3AQ can achieve near-optimal solutions compared to dynamic programming (DP) and outperforms the state-of-the-art discrete RL algorithm Rainbow, which is adopted for MIOC by discretizing continuous actions into a finite set of discrete values. Comment: 12 pages, 12 figures

    Effects of operating damage of labyrinth seal on seal leakage and wheelspace hot gas ingress

    The labyrinth seal is widely used in turbomachinery to minimize or control leakage between areas of different pressure. The present investigation numerically explored the effect of damage and wear of the labyrinth seal on the turbomachinery flow and temperature fields. Specifically, this work investigated: (1) the effect of rub-groove downstream wall angle on seal leakage, (2) the effect of tooth bending damage on seal leakage, (3) the effect of tooth "mushrooming" damage on seal leakage, and (4) the effect of rub-groove axial position and wall angle on gas turbine ingress heating. To facilitate grid generation, an unstructured grid generator named OpenCFD was also developed. The grid generator is written in C++ and generates hybrid grids consisting primarily of Cartesian cells. This investigation of labyrinth seal damage and wear was conducted using the Reynolds-averaged Navier-Stokes (RANS) equations to simulate the flows. The high-Reynolds-number k-ε model and the standard wall function were used to model the turbulence. STAR-CD was used to solve the equations, and the grids were generated using the new code OpenCFD. It was found that damage and wear of the labyrinth seal have a significant effect on the leakage and temperature field, as well as on the flow pattern. Leakage increases significantly faster than the operating clearance increases from wear. Further, the specific seal configuration resulting from the damage and wear was found to be important. For example, for pure-bending cases, the bending curvature and the percentage of tooth length that is bent were found to be important, while for mushrooming damage cases the mushroom radius and tooth bending are important. When an abradable labyrinth seal was applied to a very large gas turbine wheelspace cavity, it was found that the rub-groove axial position and, to a smaller degree, the rub-groove wall angle alter the magnitude and distribution of the fluid temperature.

    A Unified Contraction Analysis of a Class of Distributed Algorithms for Composite Optimization

    We study distributed composite optimization over networks: agents minimize the sum of a smooth (strongly) convex function, the agents' sum-utility, plus a non-smooth (extended-valued) convex one. We propose a general algorithmic framework for such a class of problems and provide a unified convergence analysis leveraging the theory of operator splitting. Our results unify several approaches proposed in the literature of distributed optimization for special instances of our formulation. Distinguishing features of our scheme are: (i) when the agents' functions are strongly convex, the algorithm converges at a linear rate, whose dependencies on the agents' functions and the network topology are decoupled, matching the typical rates of centralized optimization; (ii) the step-size does not depend on the network parameters but only on the optimization ones; and (iii) the algorithm can adjust the ratio between the number of communications and computations to achieve the same rate of the centralized proximal gradient scheme (in terms of computations). This is the first time that a distributed algorithm applicable to composite optimization enjoys such properties. Comment: To appear in the Proc. of the 2019 IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP 19)