
    Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback

    Significant advances have recently been made in understanding the optimization landscape of policy gradient methods for optimal control of linear time-invariant (LTI) systems. Compared with state-feedback control, output-feedback control is more prevalent, since the underlying state of the system may not be fully observed in many practical settings. This paper analyzes the optimization landscape of policy gradient methods applied to static output feedback (SOF) control in discrete-time LTI systems subject to quadratic cost. We begin by establishing crucial properties of the SOF cost: coercivity, L-smoothness, and an M-Lipschitz continuous Hessian. Despite the absence of convexity, we leverage these properties to derive novel results on convergence (with a nearly dimension-free rate) to stationary points for three policy gradient methods: the vanilla policy gradient method, the natural policy gradient method, and the Gauss-Newton method. Moreover, we prove that the vanilla policy gradient method converges linearly to local minima when initialized near such minima. The paper concludes with numerical examples that validate our theoretical findings. These results not only characterize the performance of gradient descent for optimizing the SOF problem but also provide insights into the effectiveness of general policy gradient methods in reinforcement learning.

    Global Convergence of Two-Timescale Actor-Critic for Solving Linear Quadratic Regulator

    Actor-critic (AC) reinforcement learning algorithms have been the powerhouse behind many challenging applications. Nevertheless, their convergence is fragile in general. To study this instability, existing works mostly consider the uncommon double-loop variant or basic models with finite state and action spaces. We investigate the more practical single-sample two-timescale AC for solving the canonical linear quadratic regulator (LQR) problem, where the actor and the critic each update only once with a single sample in each iteration on an unbounded continuous state and action space. Existing analyses cannot establish convergence for such a challenging case. We develop a new analysis framework that establishes global convergence to an ε-optimal solution with a sample complexity of at most order ε^(-2.5). To our knowledge, this is the first finite-time convergence analysis for the single-sample two-timescale AC for solving LQR with global optimality. The sample complexity improves on those of other variants by orders, which sheds light on the practical wisdom of single-sample algorithms. We further validate our theoretical findings via comprehensive simulation comparisons.

    Soil microbial community responses to short-term nitrogen addition in China's Horqin Sandy Land.

    Anthropogenic nitrogen (N) addition has increased soil nutrient availability, thereby affecting ecosystem processes and functions in N-limited ecosystems. Long-term N addition decreases plant biodiversity, but the effects of short-term N addition on the soil microbial community are poorly understood. The present study examined the impacts of short-term N addition (NH4NO3) on the soil microbial community in a sandy grassland and a semi-fixed sandy land in the Horqin Sandy Land. We measured the responses of soil microbial biomass C and N, soil β-1,4-glucosidase (BG) and β-1,4-N-acetylglucosaminidase (NAG) activity, and soil microflora characteristics to an N addition gradient of 0 (control), 5 (N5), 10 (N10), and 15 (N15) g N m^-2 yr^-1. The soil microbial biomass indices, NAG activity, and soil microflora characteristics did not differ significantly among the N levels, and there was no difference at the two sites. The competition for N between plants and soil microbes was not eliminated by short-term N addition because of the low soil nutrient and moisture contents, and the relationships among the original soil microbes did not change. However, N addition increased BG activity in the N5 and N10 treatments in the sandy grassland, and in the N5, N10, and N15 treatments in the semi-fixed sandy land. This may be due to increased accumulation and fixation of plant litter into soils in response to N addition, leading to increased microbial demand for a C source and increased soil BG activity. Future research should explore the relationships between the soil microbial community and N addition at the two sites.

    Interpretable Perturbator for Variable Selection in near-Infrared Spectral Analysis

    A perturbator was developed for variable selection in near-infrared (NIR) spectral analysis, based on the perturbation strategy used in deep learning to develop interpretation methods. A deep learning predictor was first constructed to predict the targets from the spectra in the training set. Then, taking the output of the predictor as a reference, the perturbator was trained to derive the perturbation-positive (P+) and perturbation-negative (P–) features from the spectra. The weight (σ) of the perturbator layer can therefore serve as a criterion for evaluating the importance of the variables in the spectra. After ranking the spectral variables by this criterion, the number of variables used in the quantitative model can be determined through cross-validation. Three NIR data sets were used to evaluate the proposed method. The root mean squared error was found to be comparable with or superior to that obtained by commonly used methods. Moreover, the selected spectral variables are interpretable in that they identify the key spectral features related to the prediction target. The proposed method thus provides not only an effective tool for optimizing quantitative models, but also an efficient way of explaining the spectra of multicomponent samples.

    Improved short-circuit current density of a-Si:H thin film solar cells with n-type silicon carbide layer

    In this work, the performance of p-i-n hydrogenated amorphous silicon thin film solar cells with an n-type silicon carbide (n-SiCx:H) layer was investigated. By varying the CH4/SiH4 gas flow ratio, the refractive index and electrical conductivity of the n-SiCx:H thin films were varied over the ranges 3.4 to 3.8 and 1.48 × 10^-5 to 1.24 S/cm, respectively. Compared with solar cells with the n-Si:H/Ag configuration, the short-circuit current density (J_sc) of solar cells with the n-SiCx:H/Ag configuration improved by up to 8.4%, which was comparable with that of solar cells with the n-Si:H/ZnO/Ag configuration. The improved J_sc was related to an enhanced spectral response at long wavelengths of 500-800 nm. It was supposed that the decreased refractive index of the n-SiCx:H layer resulted in increased back reflectance, which contributed to the improved J_sc. Our experiments demonstrated that n-SiCx:H thin films are an attractive choice because they function both as the n-layer and as the interlayer in the back reflector, and their deposition method is compatible with the preparation process of the solar cells.

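    Structural and Optoelectronic Properties of Silicon-Oxygen Alloy Thin Films and Their Application in Silicon-Based Thin Film Solar Cells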

    Alloying can adjust the bandgap and refractive index of silicon films, which makes alloyed films important materials for improving solar cell performance. Mixed-phase silicon suboxide film, with its high conductivity, wide bandgap, and low refractive index, has been widely used as the intrinsic and doped layers in p-i-n silicon thin film solar cells. In this paper, we review the microstructure and the optical and electrical characteristics of silicon suboxide films, as well as their role as the window layer, absorber layer, intermediate reflector, and back reflector in silicon thin film solar cells.

    A study of superstrate amorphous silicon thin film solar cells and modules on flexible BZO glass

    Flexible thin film silicon solar modules on heat-resistant transparent flexible substrates are promising for achieving high efficiency through a combination of high-quality silicon thin films and fully monolithic series integration. In this work, the performance of superstrate hydrogenated amorphous silicon (a-Si:H) thin film solar cells and modules on flexible glass using a boron-doped zinc oxide (BZO) front electrode has been investigated. Compared with those on conventional glass, BZO thin films on flexible glass exhibited a mixed structure of large-sized pyramids and small-sized grains, preferential crystalline orientations of (100) and (110), and relatively lower surface roughness and scattering ability. Accordingly, compared with those on conventional BZO glass, a-Si:H thin film solar cells on flexible BZO glass exhibited relative increases in open-circuit voltage, fill factor, and efficiency of 2.0, 5.5, and 3.4%, respectively. Finally, approximately 50 cm^2 flexible a-Si:H thin film solar modules were prepared by fully monolithic series integration using laser scribing, and relatively higher efficiency was achieved by improving thin film uniformity.

    Preparation and Experimental Investigations of Low-Shrinkage Commercial Concrete for Tunnel Annular Secondary Lining Engineering

    Secondary lining concrete is frequently used in underground tunnels. Because of the internal restraint of the annular concrete segment, micro-cracks may be caused by temperature stress and volume deformation, thus affecting safe traffic operation in the tunnel. The purpose of this study is to provide an experimental basis for concrete with low hydration heat and low shrinkage for tunnel engineering with different construction requirements. Different amounts of expansion agent (EA), shrinkage-reducing agent (SRA), and superabsorbent polymer (SAP) were considered in commercial concrete. It was found that EA elevated the degree of hydration and the hydration exothermic rate, while SRA and SAP showed the opposite trend. SRA had the best shrinkage reduction performance, with a 79% reduction in shrinkage, but its strength decreased significantly compared with the EA and SAP groups. The effect of combining different shrinkage-reducing components in commercial concrete is instructive for the hydration rate and shrinkage compensation in secondary lining engineering.