101 research outputs found

    Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty

    Robust reinforcement learning (RL) aims to find a policy that optimizes worst-case performance in the face of uncertainty. In this paper, we focus on action robust RL with probabilistic policy execution uncertainty, in which, instead of always carrying out the action specified by the policy, the agent takes the action specified by the policy with probability $1-\rho$ and an alternative adversarial action with probability $\rho$. We establish the existence of an optimal policy for action robust MDPs with probabilistic policy execution uncertainty and provide the action robust Bellman optimality equation for its solution. Furthermore, we develop the Action Robust Reinforcement Learning with Certificates (ARRLC) algorithm, which achieves minimax optimal regret and sample complexity. We also conduct numerical experiments to validate our approach's robustness, demonstrating that ARRLC outperforms non-robust RL algorithms and converges faster than the robust TD algorithm in the presence of action perturbations.
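The execution model described in this abstract can be illustrated with a minimal sketch. It is not the ARRLC algorithm or the authors' code; the function name `execute_action` and its parameters are hypothetical, chosen only to show how the executed action is drawn: the policy's action with probability 1 - rho, the adversary's action with probability rho.

```python
import random


def execute_action(policy_action, adversary_action, rho, rng=random):
    """Return the action actually executed under probabilistic
    policy execution uncertainty (illustrative helper, not from the paper)."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    # rng.random() is uniform on [0, 1), so this branch is taken
    # with probability rho: the adversary overrides the policy.
    if rng.random() < rho:
        return adversary_action
    # Otherwise (probability 1 - rho) the policy's action is carried out.
    return policy_action
```

With `rho = 0` the policy is always followed, and with `rho = 1` the adversary always acts; intermediate values interpolate between the two.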

    Direct numerical simulation of compressible turbulence accelerated by graphics processing unit. Part 1: An open-source high accuracy accelerated computational fluid dynamic software

    This paper introduces open-source computational fluid dynamics software named open computational fluid dynamic code for scientific computation with graphics processing unit (GPU) system (OpenCFD-SCU), developed by the authors for direct numerical simulation (DNS) of compressible wall-bounded turbulence. The software is based on the finite difference method and is accelerated by GPUs, which provide a speedup of more than 200 times compared with central processing unit (CPU) software based on the same algorithm and the same number of message passing interface (MPI) processes; the running speed of OpenCFD-SCU with just 512 GPUs exceeds that of CPU software with 130,000 CPUs. GPU-Stream technology is used to overlap computation and communication, achieving 98.7% parallel weak scalability with 24,576 GPUs. The software includes a variety of high-precision finite difference schemes and supports a hybrid finite difference scheme, enabling it to provide both robustness and high precision when simulating complex supersonic and hypersonic flows. When used with the wide range of supercomputers currently available, the software should be able to improve the performance of large-scale simulations by up to two orders of magnitude in computational scale. OpenCFD-SCU is then applied to a validation and verification case of a Mach 2.9 compression ramp with mesh numbers up to 31.2 billion. More challenging cases using hybrid finite difference schemes are shown in Part 2 (Dang, Li et al. 2022). The code is available and supported at http://developer.hpccube.com/codes/danggl/opencfd-scu.git. Comment: 23 pages, 25 figures