Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback
Recent work has made significant progress in characterizing the
optimization landscape of policy gradient methods for optimal control
of linear time-invariant (LTI) systems. Compared with state-feedback control,
output-feedback control is more prevalent since the underlying state of the
system may not be fully observed in many practical settings. This paper
analyzes the optimization landscape inherent to policy gradient methods when
applied to static output feedback (SOF) control in discrete-time LTI systems
subject to quadratic cost. We begin by establishing crucial properties of the
SOF cost, encompassing coercivity, L-smoothness, and M-Lipschitz continuous
Hessian. Despite the absence of convexity, we leverage these properties to
derive novel findings regarding convergence (and nearly dimension-free rate) to
stationary points for three policy gradient methods, including the vanilla
policy gradient method, the natural policy gradient method, and the
Gauss-Newton method. Moreover, we prove that the vanilla policy
gradient method exhibits linear convergence towards local minima when
initialized near such minima. The paper concludes by presenting numerical
examples that validate our theoretical findings. These results not only
characterize the performance of gradient descent for optimizing the SOF problem
but also provide insights into the effectiveness of general policy gradient
methods in reinforcement learning.
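To make the setting concrete, the following is a minimal sketch of vanilla gradient descent on the SOF cost for a small discrete-time LTI system. It is not the paper's implementation: the abstract gives no algorithmic details, so the gradient expression below is the standard LQR policy gradient carried through the output map by the chain rule, and the system matrices, the identity initial-state covariance X0, the step size, and the backtracking rule are all illustrative assumptions. The control law is assumed to be u_t = -K y_t with y_t = C x_t.

```python
import numpy as np


def dlyap(A, Q):
    """Solve X = A X A^T + Q by vectorization (adequate for small systems)."""
    n = A.shape[0]
    vec_x = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vec_x.reshape(n, n)


def sof_cost_and_grad(K, A, B, C, Q, R, X0):
    """Return (J(K), grad J(K)) for u_t = -K y_t, y_t = C x_t.

    J(K) = trace(P X0), where P solves the closed-loop Lyapunov equation.
    Returns (inf, None) when K does not stabilize the system.
    """
    Acl = A - B @ K @ C
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0:
        return np.inf, None
    P = dlyap(Acl.T, Q + C.T @ K.T @ R @ K @ C)   # cost-to-go matrix
    Sigma = dlyap(Acl, X0)                        # accumulated state covariance
    grad = 2.0 * ((R + B.T @ P @ B) @ K @ C - B.T @ P @ A) @ Sigma @ C.T
    return float(np.trace(P @ X0)), grad


def vanilla_policy_gradient(K, A, B, C, Q, R, X0, step=1e-2, iters=200):
    """Gradient descent on J(K); the step is halved until the cost decreases."""
    cost, grad = sof_cost_and_grad(K, A, B, C, Q, R, X0)
    if grad is None:
        raise ValueError("initial gain must be stabilizing")
    for _ in range(iters):
        eta = step
        while True:
            K_new = K - eta * grad
            new_cost, new_grad = sof_cost_and_grad(K_new, A, B, C, Q, R, X0)
            if new_cost < cost:
                break
            eta *= 0.5
            if eta < 1e-12:          # no further decrease: treat as stationary
                return K, cost
        K, cost, grad = K_new, new_cost, new_grad
    return K, cost


# Hypothetical example system (open-loop stable, so K = 0 is stabilizing).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])          # only the first state is measured
Q = np.eye(2)
R = np.array([[1.0]])
X0 = np.eye(2)                      # assumed covariance of the initial state

K0 = np.zeros((1, 1))
J0, g0 = sof_cost_and_grad(K0, A, B, C, Q, R, X0)
K_final, J_final = vanilla_policy_gradient(K0, A, B, C, Q, R, X0)
```

The backtracking step keeps every iterate inside the set of stabilizing gains, which is where coercivity of the cost matters: the cost grows without bound near the boundary of that set, so a step that decreases the cost cannot leave it.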
Global Convergence of Two-Timescale Actor-Critic for Solving Linear Quadratic Regulator
Actor-critic (AC) reinforcement learning algorithms have been the powerhouse behind many challenging applications. Nevertheless, their convergence is fragile in general. To study this instability, existing works mostly consider the uncommon double-loop variant or basic models with finite state and action spaces. We investigate the more practical single-sample two-timescale AC for solving the canonical linear quadratic regulator (LQR) problem, where the actor and the critic each update only once with a single sample in each iteration on an unbounded continuous state and action space. Existing analyses cannot establish convergence for such a challenging case. We develop a new analysis framework that establishes global convergence to an ε-optimal solution with a sample complexity of at most O(ε^{-2.5}). To our knowledge, this is the first finite-time convergence analysis for the single-sample two-timescale AC for solving LQR with global optimality. The sample complexity improves on those of other variants by orders, which sheds light on the practical wisdom of single-sample algorithms. We further validate our theoretical findings via comprehensive simulation comparisons.
Soil microbial community responses to short-term nitrogen addition in China's Horqin Sandy Land.
Anthropogenic nitrogen (N) addition has increased soil nutrient availability, thereby affecting ecosystem processes and functions in N-limited ecosystems. Long-term N addition decreases plant biodiversity, but the effects of short-term N addition on the soil microbial community are poorly understood. The present study examined the impacts of short-term N addition (NH4NO3) on the soil microbial community in a sandy grassland and a semi-fixed sandy land in the Horqin Sandy Land. We measured the responses of soil microbial biomass C and N, soil β-1,4-glucosidase (BG) and β-1,4-N-acetylglucosaminidase (NAG) activities, and soil microflora characteristics to an N addition gradient of 0 (control), 5 (N5), 10 (N10), and 15 (N15) g N m⁻² yr⁻¹. The soil microbial biomass indices, NAG activity, and soil microflora characteristics did not differ significantly among the N levels, and there was no difference between the two sites. The competition for N between plants and soil microbes was not eliminated by short-term N addition because of the low soil nutrient and moisture contents, and the relationships among the original soil microbes did not change. However, N addition increased BG activity in the N5 and N10 treatments in the sandy grassland, and in the N5, N10, and N15 treatments in the semi-fixed sandy land. This may be due to increased accumulation and fixation of plant litter into soils in response to N addition, leading to increased microbial demand for a C source and increased soil BG activity. Future research should explore the relationships between the soil microbial community and N addition at the two sites.
Interpretable Perturbator for Variable Selection in near-Infrared Spectral Analysis
A perturbator was developed for variable selection in near-infrared
(NIR) spectral analysis based on the perturbation strategy used in deep
learning to develop interpretation methods. A deep learning predictor
was first constructed to predict the targets from the spectra in the
training set. Then, taking the output of the predictor as a reference,
the perturbator was trained to derive the perturbation-positive (P+) and perturbation-negative (P–) features
from the spectra. Therefore, the weight (σ) of the perturbator
layer can be a criterion to evaluate the importance of the variables
in the spectra. Ranking the spectral variables by the criterion, the
number of the variables used in the quantitative model can be obtained
through cross-validation. Three NIR data sets were used to evaluate
the proposed method. The root mean squared error was found to be comparable
with or superior to that obtained by the commonly used methods. Moreover,
the selected spectral variables are interpretable in identifying the
key spectral features related to the prediction target. Therefore,
the proposed method provides not only an effective tool for optimizing
quantitative models, but also an efficient way of explaining spectra
of multicomponent samples.
Improved short-circuit current density of a-Si:H thin film solar cells with n-type silicon carbide layer
In this work, we investigated the performance of p-i-n hydrogenated amorphous silicon (a-Si:H) thin film solar cells that adopt an n-type silicon carbide (n-SiCx:H) layer. By varying the CH4/SiH4 gas flow ratio, the refractive index and electrical conductivity of the n-SiCx:H thin films were varied over the ranges 3.4 to 3.8 and 1.48 × 10⁻⁵ to 1.24 S/cm, respectively. Compared with solar cells with the n-Si:H/Ag configuration, the short-circuit current density (J_sc) of solar cells with the n-SiCx:H/Ag configuration improved by up to 8.4%, which was comparable with that of solar cells with the n-Si:H/ZnO/Ag configuration. The improved J_sc was related to an enhanced spectral response at long wavelengths of 500-800 nm. We suppose that the decreased refractive index of the n-SiCx:H layer increased the back reflectance, which contributed to the improved J_sc. Our experiments demonstrated that n-SiCx:H thin films are an attractive choice because they function both as the n-layer and as an interlayer in the back reflector, and their deposition method is compatible with the solar cell preparation process.
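Structural and optoelectronic properties of silicon-oxygen alloy thin films and their application in silicon-based thin film solar cells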
Alloying can adjust the bandgap and refractive index of silicon films, making alloyed films important materials for improving solar cell performance. Mixed-phase silicon suboxide film, with its high conductivity, wide bandgap, and low refractive index, has been widely used as the intrinsic and doped layers in p-i-n silicon thin film solar cells. In this paper, we review the microstructure and the optical and electrical characteristics of silicon suboxide films, as well as their roles as the window layer, absorber layer, intermediate reflector, and back reflector in silicon thin film solar cells.
Variations in diurnal and seasonal net ecosystem carbon dioxide exchange in a semiarid sandy grassland ecosystem in China's Horqin Sandy Land
A study of superstrate amorphous silicon thin film solar cells and modules on flexible BZO glass
Flexible thin film silicon solar modules on heat-resistant transparent flexible substrates are promising for achieving high efficiency by combining high-quality silicon thin films with fully monolithic series integration. In this work, the performance of superstrate hydrogenated amorphous silicon (a-Si:H) thin film solar cells and modules on flexible glass with a boron-doped zinc oxide (BZO) front electrode was investigated. Compared with those on conventional glass, BZO thin films on flexible glass exhibited a mixed structure of large pyramids and small grains, preferential crystalline orientations of (100) and (110), and relatively lower surface roughness and scattering ability. Accordingly, compared with those on conventional BZO glass, a-Si:H thin film solar cells on flexible BZO glass exhibited relative increases in open-circuit voltage, fill factor, and efficiency of 2.0, 5.5, and 3.4%, respectively. Finally, flexible a-Si:H thin film solar modules of approximately 50 cm² were prepared by fully monolithic series integration using laser scribing, and relatively higher efficiency was achieved by improving thin film uniformity.
Preparation and Experimental Investigations of Low-Shrinkage Commercial Concrete for Tunnel Annular Secondary Lining Engineering
Secondary lining concrete is frequently used in underground tunnels. Because the annular concrete segment is internally restrained, temperature stress and volume deformation may cause micro-cracks, which affect the safety of traffic through the tunnel. The purpose of this study is to provide an experimental basis for concrete with low hydration heat and low shrinkage for tunnel engineering with different construction requirements. Different amounts of expansion agent (EA), shrinkage-reducing agent (SRA), and superabsorbent polymer (SAP) were considered in commercial concrete. EA increased the degree of hydration and the hydration exothermic rate, while SRA and SAP showed the opposite trend. SRA gave the best shrinkage reduction performance, with a 79% reduction in shrinkage, but the strength decreased significantly compared with the EA and SAP groups. The effect of combining different shrinkage-reducing components in commercial concrete is instructive for controlling the hydration rate and compensating shrinkage in secondary lining engineering.