
    On the Robustness of Safe Reinforcement Learning under Observational Perturbations

    Safe reinforcement learning (RL) trains a policy to maximize the task reward while satisfying safety constraints. While prior works focus on performance optimality, we find that the optimal solutions of many safe RL problems are neither robust nor safe against carefully designed observational perturbations. We formally analyze the unique properties of designing effective state adversarial attackers in the safe RL setting. We show that baseline adversarial attack techniques for standard RL tasks are not always effective for safe RL and propose two new approaches: one maximizes the cost and the other maximizes the reward. One interesting and counter-intuitive finding is that the maximum reward attack is strong, as it can both induce unsafe behaviors and make the attack stealthy by maintaining the reward. We further propose a more effective adversarial training framework for safe RL and evaluate it via comprehensive experiments. This paper provides pioneering work on the safety and robustness of RL under observational attacks for future safe RL studies.
    Comment: 30 pages, 4 figures, 8 tables

    Constrained Decision Transformer for Offline Safe Reinforcement Learning

    Safe reinforcement learning (RL) trains a constraint-satisfying policy by interacting with the environment. We aim to tackle a more challenging problem: learning a safe policy from an offline dataset. We study the offline safe RL problem from a novel multi-objective optimization perspective and propose the ϵ-reducible concept to characterize problem difficulties. The inherent trade-offs between safety and task performance inspire us to propose the constrained decision transformer (CDT) approach, which can dynamically adjust the trade-offs during deployment. Extensive experiments show the advantages of the proposed method in learning an adaptive, safe, robust, and high-reward policy. CDT outperforms its variants and strong offline safe RL baselines by a large margin with the same hyperparameters across all tasks, while keeping the zero-shot adaptation capability to different constraint thresholds, making our approach more suitable for real-world RL under constraints.
    Comment: 15 pages, 7 figures

    Ultra-Short-Term Wind Speed Forecasting Using the Hybrid Model of Subseries Reconstruction and Broad Learning System

    The traditional decomposition–combination wind speed forecasting model has high complexity and a long calculation time. To address this, this work proposes an ultra-short-term hybrid wind speed forecasting model based on a broad learning system (BLS) that combines improved variational mode decomposition (EPSO-VMD, EVMD) and subseries reconstruction (SR). The values of K and α in the EVMD are determined by minimum mean envelope entropy (MMEE) and enhanced particle swarm optimization (EPSO), and EVMD is used to decompose the original wind speed data. SR is applied to recombine the subseries obtained by EVMD to improve the forecasting efficiency. The sample entropy (SE) is used to quantify the complexity of each subseries, and the subseries are then adaptively divided into high-entropy and low-entropy subseries. Adjacent high-entropy subseries with similar entropy values are merged to obtain a new group of reconstructed high-entropy subseries, while the low-entropy subseries are merged into a single new subseries. Then, the forecasting results of the reconstructed high- and low-entropy subseries are calculated via the BLS and ARIMA models. Numerical simulation results show that the proposed method is more effective than traditional methods.
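
    The entropy-based regrouping summarized in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names (sample_entropy, reconstruct), the threshold rule (mean entropy of all subseries), the similarity tolerance tol, and merging by element-wise summation are all hypothetical choices, since the abstract does not specify them.

```python
import math


def sample_entropy(x, m=2, r_factor=0.2):
    """Sample entropy of a sequence, using the common defaults m=2, r=0.2*std."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    r = r_factor * std

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is within r.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if max(abs(x[i + k] - x[j + k]) for k in range(length)) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # entropy is undefined when no templates match
    return -math.log(a / b)


def add_series(a, b):
    """Element-wise sum of two equally long subseries."""
    return [p + q for p, q in zip(a, b)]


def reconstruct(subseries, threshold=None, tol=0.1):
    """Split subseries into high/low entropy and merge them.

    Assumptions (not from the abstract): the split threshold defaults to the
    mean entropy, "similar" means an entropy difference of at most tol, and
    merging is done by summing subseries element-wise.
    """
    entropies = [sample_entropy(s) for s in subseries]
    if threshold is None:
        threshold = sum(entropies) / len(entropies)

    high_groups = []       # reconstructed high-entropy subseries
    low = None             # single merged low-entropy subseries
    prev_high_index = None
    for idx, (s, e) in enumerate(zip(subseries, entropies)):
        if e > threshold:
            adjacent = prev_high_index == idx - 1
            if adjacent and abs(e - entropies[prev_high_index]) <= tol:
                high_groups[-1] = add_series(high_groups[-1], s)
            else:
                high_groups.append(list(s))
            prev_high_index = idx
        else:
            low = list(s) if low is None else add_series(low, s)
    return high_groups, low
```

    Under these assumptions, each reconstructed high-entropy series and the merged low-entropy series would then be passed to a separate forecaster (BLS or ARIMA in the abstract), and the individual forecasts summed. Because merging is by summation, the reconstructed series always add up to the same signal as the original subseries.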

    Effect of 3-Mercaptopropyltriethoxysilane Modified Illite on the Reinforcement of SBR

    To achieve the sustainable development of the rubber industry, substituting carbon black, the most widely used but non-renewable filler produced from petroleum, has been considered one of the most effective approaches. Naturally occurring illite, which has a higher aspect ratio, can be easily obtained in large amounts at lower cost and with lower energy consumption. Therefore, expanding its application in advanced materials is of great significance. To explore its potential use as an additive for reinforcing rubber, styrene butadiene rubber (SBR) composites with illites of different sizes, with and without 3-mercaptopropyltriethoxysilane (KH580) modification, were studied. It was found that the modification of illite by KH580 increases the K-illite/SBR interaction, and thus improves the dispersion of K-illite in the SBR matrix. The better dispersion of smaller K-illite with stronger interfacial interaction improves the mechanical properties of SBR remarkably, with an increase of about nine times in tensile strength and more than ten times in modulus. These results demonstrate, besides the evident effect of particle size, the great importance of filler–rubber interaction for the performance of SBR composites. This may be of great significance for the potential wide use of the abundant naturally occurring illite as a substitute filler for the rubber industry.

    Germinal disc region: an appropriate source for obtaining maternal DNA from eggs

    Eggs may serve as an alternative source for DNA extraction. The quality of DNA extracted from eggshell, whole egg liquid (WEL) and germinal disc region (GDR) was compared based on spectrophotometric, electrophoretic, PCR and reduced-representation library sequencing (RRLS) results. Although none of these DNAs was visible on the gel or could be measured spectrophotometrically, the GDR DNA was superior to the eggshell and WEL DNA in PCR efficiency. After whole genome amplification (WGA) was introduced, the yield of GDR DNA increased significantly. The resulting DNA was far superior to the eggshell and WEL DNA in the ratio of captured genome and the number of called SNPs. GDR DNA extraction followed by WGA provides a method to obtain sufficient DNA from a single egg.