Search CORE

109 research outputs found

Zap Q-Learning for Optimal Stopping Time Problems

Author: Bušić Ana
Chen Shuhang
Devraj Adithya M.
Meyn Sean P.
Publication date: 01/01/2019

The objective in this paper is to obtain fast-converging reinforcement learning algorithms to approximate solutions to the problem of discounted cost optimal stopping in an irreducible, uniformly ergodic Markov chain, evolving on a compact subset of $\mathbb{R}^n$. We build on the dynamic programming approach taken by Tsitsiklis and Van Roy, wherein they propose a Q-learning algorithm to estimate the optimal state-action value function, which then defines an optimal stopping rule. We provide insights as to why the convergence rate of this algorithm can be slow, and propose a fast-converging alternative, the "Zap-Q-learning" algorithm, designed to achieve the optimal rate of convergence. For the first time, we prove the convergence of the Zap-Q-learning algorithm in the linear function approximation setting. We use ODE analysis for the proof, and the optimal asymptotic variance property of the algorithm is reflected via fast convergence in a finance example.
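The baseline that the abstract builds on, Q-learning for discounted-cost optimal stopping, can be sketched in tabular form. This is only an illustration under stated assumptions, not the paper's Zap algorithm: the three-state chain `P`, the costs `c` and `c_stop`, the discount `beta`, and the step-size exponent `rho` are all hypothetical, and the cost convention Q(x) = c(x) + beta * E[min(c_stop(X'), Q(X'))] is one common formulation rather than something stated in the abstract. The learned values are compared against a fixed point computed by value iteration.

```python
import random

def stopping_value_iteration(P, c, c_stop, beta, iters=500):
    """Fixed point of Q(x) = c(x) + beta * sum_y P[x][y] * min(c_stop[y], Q[y])."""
    n = len(c)
    q = [0.0] * n
    for _ in range(iters):
        q = [c[x] + beta * sum(P[x][y] * min(c_stop[y], q[y]) for y in range(n))
             for x in range(n)]
    return q

def stopping_q_learning(P, c, c_stop, beta, n_steps, rho=0.7, seed=0):
    """Tabular Q-learning along one trajectory of the uncontrolled chain."""
    rng = random.Random(seed)
    n = len(c)
    q = [0.0] * n
    visits = [0] * n
    x = 0
    for _ in range(n_steps):
        y = rng.choices(range(n), weights=P[x])[0]   # next state X_{k+1}
        visits[x] += 1
        alpha = visits[x] ** (-rho)                  # assumed step-size rule
        target = c[x] + beta * min(c_stop[y], q[y])  # temporal-difference target
        q[x] += alpha * (target - q[x])
        x = y
    return q

def stopping_rule(q, c_stop):
    """Stop in state x when stopping costs no more than continuing."""
    return [c_stop[x] <= q[x] for x in range(len(q))]

# Hypothetical model: birth-death chain on {0, 1, 2}.
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
c = [0.1, 0.5, 1.0]        # per-step cost of continuing
c_stop = [3.0, 3.0, 1.0]   # one-time cost of stopping
beta = 0.8

q_star = stopping_value_iteration(P, c, c_stop, beta)
q_hat = stopping_q_learning(P, c, c_stop, beta, n_steps=200_000)
print(q_star, q_hat, stopping_rule(q_hat, c_stop))
```

The step size here decays polynomially in the visit count; the choice of gain is exactly the kind of design decision the abstract identifies as driving slow convergence, so other schedules may behave very differently.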

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

The Preventive and Therapeutic Effect of Caloric Restriction Therapy on Type 2 Diabetes Mellitus

Author: Chen Guofang
Chunrui Li
Liu Chao
Xu Shuhang
Publication venue: IntechOpen
Publication date: 01/04/2015

IntechOpen

Crossref

Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation

Author: Bušić Ana
Chen Shuhang
Devraj Adithya M.
Meyn Sean
Publication date: 01/01/2020

This paper concerns error bounds for recursive equations subject to Markovian disturbances. Motivating examples abound within the fields of Markov chain Monte Carlo (MCMC) and Reinforcement Learning (RL), and many of these algorithms can be interpreted as special cases of stochastic approximation (SA). It is argued that it is not possible in general to obtain a Hoeffding bound on the error sequence, even when the underlying Markov chain is reversible and geometrically ergodic, such as the M/M/1 queue. This is motivation for the focus on mean square error bounds for parameter estimates. It is shown that mean square error achieves the optimal rate of $O(1/n)$, subject to conditions on the step-size sequence. Moreover, the exact constants in the rate are obtained, which is of great value in algorithm design.
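The $O(1/n)$ rate can be illustrated numerically in the simplest possible case. The sketch below is not taken from the paper: it assumes a hypothetical symmetric two-state Markov chain with flip probability `p_flip`, and runs the scalar linear recursion $\theta_{n+1} = \theta_n + (\Phi_{n+1} - \theta_n)/(n+1)$, which is just the running average of the chain. Averaging the squared error over independent runs gives an empirical mean square error that should shrink roughly in proportion to $1/n$, with constant close to the asymptotic variance of the chain.

```python
import random

def sa_mean_estimates(checkpoints, p_flip, rng):
    """Run the 1/n-step-size recursion and record theta at the given checkpoints.

    `checkpoints` must be sorted in increasing order.
    """
    phi = rng.randint(0, 1)   # symmetric chain, so a uniform start is stationary
    theta = 0.0
    wanted = set(checkpoints)
    out = []
    for n in range(1, max(checkpoints) + 1):
        if rng.random() < p_flip:
            phi = 1 - phi               # Markovian disturbance
        theta += (phi - theta) / n      # step size alpha_n = 1/n
        if n in wanted:
            out.append(theta)
    return out

def mean_square_errors(checkpoints, n_runs=200, p_flip=0.1, seed=0):
    """Empirical E[(theta_n - theta*)^2] at each checkpoint, with theta* = 0.5."""
    rng = random.Random(seed)
    sums = [0.0] * len(checkpoints)
    for _ in range(n_runs):
        for i, theta in enumerate(sa_mean_estimates(checkpoints, p_flip, rng)):
            sums[i] += (theta - 0.5) ** 2
    return [s / n_runs for s in sums]

def asymptotic_variance(p_flip):
    """CLT variance for f(x) = x on the symmetric two-state chain."""
    lam = 1.0 - 2.0 * p_flip            # second eigenvalue of the transition matrix
    return 0.25 * (1.0 + lam) / (1.0 - lam)

mse = mean_square_errors([1000, 4000])
print("n * MSE at n=1000:", 1000 * mse[0])
print("n * MSE at n=4000:", 4000 * mse[1])
print("asymptotic variance:", asymptotic_variance(0.1))
```

Quadrupling `n` should cut the empirical error by a factor of roughly four, up to Monte Carlo noise from the finite number of runs.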

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning

Author: Borkar Vivek
Chen Shuhang
Devraj Adithya
Kontoyiannis Ioannis
Meyn Sean
Publication date: 28/12/2021

The paper concerns convergence and asymptotic statistics for stochastic approximation driven by Markovian noise:

$$\theta_{n+1}= \theta_n + \alpha_{n + 1} f(\theta_n, \Phi_{n+1}) \,,\quad n\ge 0,$$

in which each $\theta_n\in\Re^d$, $\{ \Phi_n \}$ is a Markov chain on a general state space $\text{X}$ with stationary distribution $\pi$, and $f:\Re^d\times \text{X} \to\Re^d$. In addition to standard Lipschitz bounds on $f$, and conditions on the vanishing step-size sequence $\{\alpha_n\}$, it is assumed that the associated ODE is globally asymptotically stable with stationary point denoted $\theta^*$, where $\bar f(\theta)=E[f(\theta,\Phi)]$ with $\Phi\sim\pi$. Moreover, the ODE@$\infty$ defined with respect to the vector field,

$$\bar f_\infty(\theta):= \lim_{r\to\infty} r^{-1} \bar f(r\theta) \,,\qquad \theta\in\Re^d,$$

is asymptotically stable. The main contributions are summarized as follows: (i) The sequence $\theta$ is convergent if $\Phi$ is geometrically ergodic, and subject to compatible bounds on $f$. The remaining results are established under a stronger assumption on the Markov chain: A slightly weaker version of the Donsker-Varadhan Lyapunov drift condition known as (DV3). (ii) A Lyapunov function is constructed for the joint process $\{\theta_n,\Phi_n\}$ that implies convergence of $\{ \theta_n\}$ in $L_4$. (iii) A functional CLT is established, as well as the usual one-dimensional CLT for the normalized error $z_n:= (\theta_n-\theta^*)/\sqrt{\alpha_n}$. Moment bounds combined with the CLT imply convergence of the normalized covariance,

$$\lim_{n \to \infty} E [ z_n z_n^T ] = \Sigma_\theta,$$

where $\Sigma_\theta$ is the asymptotic covariance appearing in the CLT. (iv) An example is provided where the Markov chain $\Phi$ is geometrically ergodic but it does not satisfy (DV3). While the algorithm is convergent, the second moment is unbounded.
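To make the recursion and its associated ODE concrete, here is a minimal scalar sketch. Everything specific in it is an assumption chosen for illustration and does not come from the paper: the function $f(\theta,\phi) = \phi - (1+\phi)\theta$, the symmetric two-state chain for $\Phi$, and the step size $\alpha_n = n^{-\rho}$ with $\rho = 0.75$. For this choice $\bar f(\theta) = 1/2 - (3/2)\theta$, so $\theta^* = 1/3$, and $\bar f_\infty(\theta) = -(3/2)\theta$.

```python
import random

THETA_STAR = 1.0 / 3.0   # root of f_bar for the hypothetical example below

def f(theta, phi):
    """Hypothetical vector field with multiplicative and additive Markovian noise."""
    return phi - (1.0 + phi) * theta

def f_bar(theta, mean_phi=0.5):
    """Mean vector field E[f(theta, Phi)] under the stationary distribution."""
    return mean_phi - (1.0 + mean_phi) * theta

def run_sa(n_steps, theta0=5.0, rho=0.75, p_flip=0.1, seed=0):
    """theta_{n+1} = theta_n + alpha_{n+1} f(theta_n, Phi_{n+1}); returns (theta_n, alpha_n)."""
    rng = random.Random(seed)
    phi = rng.randint(0, 1)
    theta = theta0
    alpha = 1.0
    for n in range(1, n_steps + 1):
        if rng.random() < p_flip:
            phi = 1 - phi            # two-state Markov chain Phi
        alpha = n ** (-rho)          # vanishing step size (assumed form)
        theta += alpha * f(theta, phi)
    return theta, alpha

def solve_ode(theta0=5.0, t_end=20.0, dt=0.01):
    """Euler integration of the mean-flow ODE d/dt theta = f_bar(theta)."""
    theta = theta0
    for _ in range(int(t_end / dt)):
        theta += dt * f_bar(theta)
    return theta

theta_n, alpha_n = run_sa(20_000)
z_n = (theta_n - THETA_STAR) / alpha_n ** 0.5   # normalized error from the abstract
print("SA estimate:", theta_n, "ODE limit:", solve_ode(), "z_n:", z_n)
```

Repeating `run_sa` over many seeds and averaging `z_n ** 2` would give an empirical estimate of the normalized covariance in this scalar case.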

arXiv.org e-Print Archive

Multifunctional gold nanostar conjugates for tumor imaging and combined photothermal and chemo-therapy

Author: Achilefu Samuel
Chen Haiyan
Cui Sisi
Dai Shuhang
Gu Yueqing
Ma Yuxiang
Zhang Xin
Publication venue: Digital Commons@Becker
Publication date: 01/01/2013

Uniform gold nanostars (Au NS) were conjugated with cyclic RGD (cRGD) and near infrared (NIR) fluorescence probe (MPA) or anti-cancer drug (DOX) to obtain multi-functional nanoconstructs, Au-cRGD-MPA and Au-cRGD-DOX respectively. The NIR contrast agent Au-cRGD-MPA was shown to have low cytotoxicity. Using tumor cells and tumor-bearing mice, these imaging nanoparticles demonstrated favorable tumor-targeting capability mediated by RGD peptide binding to its over-expressed receptor on the tumor cells. The multi-therapeutic analogue, Au-cRGD-DOX, integrates tumor targeting, chemotherapy and photo-thermotherapy into a single system. The synergistic effect of photo-thermal therapy and chemotherapy was demonstrated in different tumor cell lines and in vivo using S180 tumor-bearing mouse models. The viability of MDA-MB-231 cells was only 40% after incubation with Au-cRGD-DOX and irradiation with NIR light. With both tail vein and intratumoral injections, mice treated with Au-cRGD-DOX exhibited the slowest tumor growth. These results indicate that the multifunctional nanoconstruct is a promising combined therapeutic agent for tumor-targeting treatment, with the potential to enhance the anti-cancer treatment outcomes.

Crossref

Digital Commons@Becker

PubMed Central

Initial ablation ratio predicts the recurrence of low-risk papillary thyroid microcarcinomas treated with microwave ablation: a 5-year, single-institution cohort study

Author: Chao Liu
Guofang Chen
Lin Jiang
Shuhang Xu
Xue Han
Yujiang Li
Yujie Ren
Publication venue: Bioscientifica
Publication date: 01/08/2023

Objective: To assess the long-term efficacy and safety of microwave ablation (MWA) in treating low-risk papillary thyroid microcarcinomas (PTMC) and to identify predictive factors for the postoperative local tumor progression of PTMC. Methods: A total of 154 low-risk PTMC patients treated with MWA who were followed up for at least 3 months were retrospectively recruited. Ultrasonography was performed after MWA to assess the local tumor progression. Adverse events associated with MWA were recorded. The ablated volume (Va) and initial ablation ratio (IAR) were measured to assess their influences on the recurrence risk of PTMC. Results: The mean tumor volume of PTMC before MWA was 0.071 (0.039, 0.121) cm³, with a maximum diameter of 0.60 ± 0.18 cm. All PTMC patients were followed up for 6 (3, 18) months. Va increased immediately after MWA, then gradually decreased over time, until it was significantly smaller at 12 months than before MWA (P 2.0 mU/L) of PTMC patients were not correlated with local tumor progression. Conclusion: MWA is an effective therapeutic strategy for low-risk PTMC with high safety. The maximum tumor diameter and IAR are predictive factors for the local tumor progression of PTMC after MWA.

Directory of Open Access Journals

Overexpression of the Glutathione Peroxidase 5 (RcGPX5) Gene From Rhodiola crenulata Increases Drought Tolerance in Salvia miltiorrhiza

Author: Chengbin Chen
Deshui Yu
Lipeng Zhang
Mei Wu
Shuhang Jia
Tao Wei
Wenqin Song
Yanjiao Teng
Publication venue: Frontiers Media SA
Publication date: 01/01/2019

Excessive cellular accumulation of reactive oxygen species (ROS) due to environmental stresses can critically disrupt plant development and negatively affect productivity. Plant glutathione peroxidases (GPXs) play an important role in ROS scavenging by catalyzing the reduction of H2O2 and other organic hydroperoxides to protect plant cells from oxidative stress damage. RcGPX5, a member of the GPX gene family, was isolated from a traditional medicinal plant Rhodiola crenulata and constitutively expressed in Salvia miltiorrhiza under control of the CaMV 35S promoter. Transgenic plants showed increased tolerance to oxidative stress caused by application of H2O2 and drought, and had reduced production of malondialdehyde (MDA) compared with the wild type. Under drought stress, seedlings of the transgenic lines wilted later than the wild type and recovered growth 1 day after re-watering. In addition, the reduced glutathione (GSH) and total glutathione (T-GSH) contents were higher in the transgenic lines, with increased enzyme activities including glutathione reductase (GR), ascorbate peroxidase (APX), and GPX. These changes prevent H2O2 and O2- accumulation in cells of the transgenic lines compared with the wild type. Overexpression of RcGPX5 alters the relative expression levels of multiple endogenous genes in S. miltiorrhiza, including transcription factor genes and genes in the ROS and ABA pathways. In particular, RcGPX5 expression increases the mass of S. miltiorrhiza roots while reducing the concentration of the active ingredients. These results show that heterologous expression of RcGPX5 in S. miltiorrhiza can affect the regulation of multiple biochemical pathways to confer tolerance to drought stress, and RcGPX5 might act as a competitor with secondary metabolites in the S. miltiorrhiza response to environmental stimuli.

Directory of Open Access Journals

Frontiers - Publisher Connector

FigShare