Sieve estimation of constant and time-varying coefficients in nonlinear ordinary differential equation models by considering both numerical error and measurement error
This article considers estimation of constant and time-varying coefficients
in nonlinear ordinary differential equation (ODE) models where analytic
closed-form solutions are not available. The numerical solution-based nonlinear
least squares (NLS) estimator is investigated in this study. A numerical
algorithm such as the Runge--Kutta method is used to approximate the ODE
solution. The asymptotic properties are established for the proposed estimators
considering both numerical error and measurement error. The B-spline is used to
approximate the time-varying coefficients, and the corresponding asymptotic
theories in this case are investigated under the framework of the sieve
approach. Our results show that if the maximum step size of the $p$-order
numerical algorithm goes to zero at a rate faster than $n^{-1/(p\wedge 4)}$, the
numerical error is negligible compared to the measurement error. This result
provides theoretical guidance for selecting the step size in numerical
evaluations of ODEs. Moreover, we have shown that the numerical solution-based
NLS estimator and the sieve NLS estimator are strongly consistent. The sieve
estimator of constant parameters is asymptotically normal with the same
asymptotic covariance as that of the case where the true ODE solution is
exactly known, while the estimator of the time-varying parameter has the
optimal convergence rate under some regularity conditions. The theoretical
results are also developed for the case when the step size of the ODE numerical
solver does not go to zero fast enough or the numerical error is comparable to
the measurement error. We illustrate our approach with both simulation studies
and clinical data on HIV viral dynamics.
Comment: Published at http://dx.doi.org/10.1214/09-AOS784 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
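The numerical solution-based NLS idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the logistic test model, the noise level, the golden-section search, and all function names are assumptions chosen for illustration. Only a single constant coefficient is estimated, with no B-spline sieve for time-varying coefficients.

```python
import random

def rhs(x, theta):
    """Hypothetical test model (logistic growth): dx/dt = theta * x * (1 - x)."""
    return theta * x * (1.0 - x)

def rk4_path(theta, x0, dt, n_obs, substeps):
    """Classical 4th-order Runge-Kutta; returns x at t = dt, 2*dt, ..., n_obs*dt."""
    h = dt / substeps  # solver step size
    x, path = x0, []
    for _ in range(n_obs):
        for _ in range(substeps):
            k1 = rhs(x, theta)
            k2 = rhs(x + 0.5 * h * k1, theta)
            k3 = rhs(x + 0.5 * h * k2, theta)
            k4 = rhs(x + h * k3, theta)
            x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        path.append(x)
    return path

def sse(theta, ys, x0, dt, substeps):
    """Sum of squared residuals between data and the numerical ODE solution."""
    fitted = rk4_path(theta, x0, dt, len(ys), substeps)
    return sum((y - f) ** 2 for y, f in zip(ys, fitted))

def golden_section(fn, lo, hi, tol=1e-6):
    """Minimise a unimodal scalar function on [lo, hi]."""
    inv_phi = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = fn(c), fn(d)
    while b - a > tol:
        if fc < fd:  # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = fn(c)
        else:        # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = fn(d)
    return 0.5 * (a + b)

# Assumed simulation settings (not taken from the paper).
THETA_TRUE, X0, DT, N_OBS, NOISE_SD = 1.5, 0.1, 0.1, 50, 0.01
rng = random.Random(0)
truth = rk4_path(THETA_TRUE, X0, DT, N_OBS, substeps=20)  # fine-step reference
ys = [x + rng.gauss(0.0, NOISE_SD) for x in truth]        # noisy measurements

# NLS estimate using a coarser solver step (h = DT).
theta_hat = golden_section(lambda th: sse(th, ys, X0, DT, 1), 0.1, 5.0)
print(theta_hat)
```

Refitting with different `substeps` values gives a rough feel for when the solver error becomes small relative to the measurement noise, which is the trade-off the abstract's step-size condition addresses.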
Zap Q-Learning for Optimal Stopping Time Problems
The objective in this paper is to obtain fast converging reinforcement
learning algorithms to approximate solutions to the problem of discounted cost
optimal stopping in an irreducible, uniformly ergodic Markov chain, evolving on
a compact subset of $\mathbb{R}^n$. We build on the dynamic programming
approach taken by Tsitsiklis and Van Roy, wherein they propose a Q-learning
algorithm to estimate the optimal state-action value function, which then
defines an optimal stopping rule. We provide insights as to why the convergence
rate of this algorithm can be slow, and propose a fast-converging alternative,
the "Zap-Q-learning" algorithm, designed to achieve the optimal rate of
convergence. For the first time, we prove the convergence of the Zap-Q-learning
algorithm in the linear function approximation setting. We use
ODE analysis for the proof, and the optimal asymptotic variance property of the
algorithm is reflected in its fast convergence in a finance example.
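As a rough illustration of the dynamic programming view behind this abstract, the sketch below computes the value function of a discounted-cost optimal stopping problem by fixed-point iteration on a small finite chain. This is plain value iteration, not Q-learning or Zap-Q-learning. The fixed-point form Q(x) = c(x) + beta * E[min(c_stop(X'), Q(X'))], the three-state chain, and all cost values are assumptions made for illustration; the paper works on a compact subset of Euclidean space with function approximation.

```python
def stopping_q(P, c, c_stop, beta, tol=1e-12, max_iter=10_000):
    """Iterate Q(x) = c(x) + beta * sum_y P[x][y] * min(c_stop[y], Q(y)).

    Q(x) is read as the cost of continuing from x and acting optimally afterwards.
    The map is a contraction with modulus beta, so the iteration converges.
    """
    n = len(c)
    q = [0.0] * n
    for _ in range(max_iter):
        new_q = [
            c[x] + beta * sum(P[x][y] * min(c_stop[y], q[y]) for y in range(n))
            for x in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_q, q)) < tol:
            return new_q
        q = new_q
    return q

# Hypothetical three-state chain and costs (illustrative values only).
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
c = [1.0, 1.0, 1.0]        # per-step cost of continuing
c_stop = [0.5, 3.0, 8.0]   # one-time cost of stopping in each state
BETA = 0.9                 # discount factor

q_star = stopping_q(P, c, c_stop, BETA)
# Stopping rule induced by Q: stop as soon as stopping is no more costly than continuing.
stop_now = [c_stop[x] <= q_star[x] for x in range(len(c))]
print(q_star, stop_now)
```

A Q-learning scheme of the kind discussed in the abstract would estimate the same function from sampled transitions rather than from a known transition matrix `P`.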
- …