    Laws of Little in an open queueing network

    This research in queueing theory concerns theorems on functional strong laws of large numbers (FSLLN) under heavy-traffic conditions in an open queueing network (OQN). The FSLLN is also known as a fluid limit or fluid approximation. In this paper, FSLLNs are proved for important probabilistic characteristics of the OQN under investigation, namely the virtual waiting time of a customer and the queue length of customers. As applications of the proved theorems, Little's laws in an OQN are presented.
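
    Little's law, $L = \lambda W$, ties the time-average number of customers in a system to the arrival rate and the mean sojourn time. As a minimal sketch of what the identity asserts (not code from the paper, whose setting is a multi-station open network), the following Python snippet simulates a single M/M/1 station and compares the two sides numerically; the parameters lam, mu and the function name mm1_littles_law are illustrative assumptions.

    import random

    def mm1_littles_law(lam=0.8, mu=1.0, n_customers=200_000, seed=1):
        # Simulate a FIFO M/M/1 queue and compare L with lam * W (Little's law).
        random.seed(seed)
        t_arrival, t_depart_prev = 0.0, 0.0
        arrivals, departures = [], []
        total_sojourn = 0.0
        for _ in range(n_customers):
            t_arrival += random.expovariate(lam)      # Poisson arrivals
            service = random.expovariate(mu)          # exponential service times
            start = max(t_arrival, t_depart_prev)     # wait for the server to free up
            t_depart_prev = start + service
            arrivals.append(t_arrival)
            departures.append(t_depart_prev)
            total_sojourn += t_depart_prev - t_arrival
        # Time-average number in system: L = (1/T) * integral of N(t) dt.
        events = sorted([(t, +1) for t in arrivals] + [(t, -1) for t in departures])
        area, n_in_system, t_prev = 0.0, 0, 0.0
        for t, delta in events:
            area += n_in_system * (t - t_prev)
            n_in_system += delta
            t_prev = t
        T = events[-1][0]
        L = area / T                                  # time-average number in system
        W = total_sojourn / n_customers               # mean sojourn time
        print(f"L = {L:.3f}   lam * W = {lam * W:.3f}")   # the two should nearly agree

    mm1_littles_law()

    For lam=0.8 and mu=1.0 both quantities come out close to the known M/M/1 value rho/(1 - rho) = 4; the single-station example only illustrates the identity itself, not the network results of the paper.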

    The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning

    The paper concerns convergence and asymptotic statistics for stochastic approximation driven by Markovian noise, $\theta_{n+1} = \theta_n + \alpha_{n+1} f(\theta_n, \Phi_{n+1})$, $n \ge 0$, in which each $\theta_n \in \Re^d$, $\{\Phi_n\}$ is a Markov chain on a general state space $\text{X}$ with stationary distribution $\pi$, and $f : \Re^d \times \text{X} \to \Re^d$. In addition to standard Lipschitz bounds on $f$, and conditions on the vanishing step-size sequence $\{\alpha_n\}$, it is assumed that the associated ODE is globally asymptotically stable with stationary point denoted $\theta^*$, where $\bar f(\theta) = \mathrm{E}[f(\theta, \Phi)]$ with $\Phi \sim \pi$. Moreover, the ODE@$\infty$ defined with respect to the vector field $\bar f_\infty(\theta) := \lim_{r \to \infty} r^{-1} \bar f(r\theta)$, $\theta \in \Re^d$, is asymptotically stable. The main contributions are summarized as follows: (i) The sequence $\theta$ is convergent if $\Phi$ is geometrically ergodic, and subject to compatible bounds on $f$. The remaining results are established under a stronger assumption on the Markov chain: a slightly weaker version of the Donsker-Varadhan Lyapunov drift condition known as (DV3). (ii) A Lyapunov function is constructed for the joint process $\{\theta_n, \Phi_n\}$ that implies convergence of $\{\theta_n\}$ in $L_4$. (iii) A functional CLT is established, as well as the usual one-dimensional CLT for the normalized error $z_n := (\theta_n - \theta^*)/\sqrt{\alpha_n}$. Moment bounds combined with the CLT imply convergence of the normalized covariance, $\lim_{n \to \infty} \mathrm{E}[z_n z_n^T] = \Sigma_\theta$, where $\Sigma_\theta$ is the asymptotic covariance appearing in the CLT. (iv) An example is provided where the Markov chain $\Phi$ is geometrically ergodic but does not satisfy (DV3). While the algorithm is convergent, the second moment is unbounded.
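
    As a concrete, scalar illustration of the recursion analyzed in the paper (the paper itself treats general state spaces under conditions such as (DV3)), the sketch below runs $\theta_{n+1} = \theta_n + \alpha_{n+1} f(\theta_n, \Phi_{n+1})$ with a two-state Markov chain $\Phi$ and $f(\theta, \phi) = \phi - \theta$, so that $\bar f(\theta) = \mathrm{E}_\pi[\Phi] - \theta$ and the associated ODE is globally stable at $\theta^* = \mathrm{E}_\pi[\Phi]$. The chain, the choice of $f$, and the step sizes are illustrative assumptions, not the paper's setting.

    import random

    # Two-state Markov chain on X = {0.0, 2.0}; stationary distribution pi = (0.75, 0.25).
    STATES = [0.0, 2.0]
    P = {0.0: [0.9, 0.1],   # from 0.0: stay w.p. 0.9, move to 2.0 w.p. 0.1
         2.0: [0.3, 0.7]}   # from 2.0: move to 0.0 w.p. 0.3, stay w.p. 0.7

    def step_chain(phi):
        return random.choices(STATES, weights=P[phi])[0]

    def f(theta, phi):
        # f(theta, phi) = phi - theta, so fbar(theta) = E_pi[Phi] - theta and theta* = E_pi[Phi].
        return phi - theta

    def run(n_steps=200_000, seed=0):
        random.seed(seed)
        theta, phi = 0.0, STATES[0]
        for n in range(1, n_steps + 1):
            phi = step_chain(phi)
            alpha = 1.0 / n                 # vanishing step-size sequence alpha_n
            theta += alpha * f(theta, phi)  # theta_{n+1} = theta_n + alpha_{n+1} f(theta_n, Phi_{n+1})
        return theta

    print(run())   # approaches theta* = 0.75 * 0.0 + 0.25 * 2.0 = 0.5

    With this choice of $f$ the recursion is a Markov-modulated running average, so convergence to $\theta^* = 0.5$ can be checked directly; the paper's contributions concern the finer asymptotics (functional CLT, covariance $\Sigma_\theta$) of exactly this kind of iteration.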

    Twentieth conference on stochastic processes and their applications


    Applications of robust optimization to queueing and inventory systems

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 105-111). This thesis investigates the application of robust optimization in the performance analysis of queueing and inventory systems. In the first part of the thesis, we propose a new approach for performance analysis of queueing systems based on robust optimization. We first derive explicit upper bounds on performance for tandem single-class, multiclass single-server, and single-class multi-server queueing systems by solving appropriate robust optimization problems. We then show that these bounds, derived by solving deterministic optimization problems, translate to upper bounds on the expected steady-state performance for a variety of widely used performance measures such as waiting times and queue lengths. Additionally, these explicit bounds agree qualitatively with known results. In the second part of the thesis, we propose methods to compute (s,S) policies in supply chain networks using robust and stochastic optimization and compare their performance. Our algorithms handle general uncertainty sets, arbitrary network topologies, and flexible cost functions including the presence of fixed costs. The algorithms exhibit empirically practical running times. We contrast the performance of robust and stochastic (s,S) policies in a numerical study, and we find that the robust policy is comparable to the average performance of the stochastic policy, but has a considerably lower standard deviation across a variety of networks and realized demand distributions. Additionally, we identify regimes where the robust policy exhibits particular strengths even in average performance and tail behavior as compared with the stochastic policy. By Alexander Anatolyevich Rikun. Ph.D.
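
    To make the (s,S) policy concrete: whenever the inventory position drops below the reorder point s, an order is placed to bring it back up to the order-up-to level S, paying a fixed ordering cost plus per-period holding and backlog costs. The sketch below is a minimal single-location simulation with a one-period lead time and backlogged demand; it is not the thesis's robust or stochastic optimization procedure, and the cost parameters, demand distribution, and function name simulate_sS are illustrative assumptions.

    import random

    def simulate_sS(s=20, S=60, horizon=10_000, fixed_cost=50.0,
                    holding=1.0, backlog=5.0, seed=0):
        # Average per-period cost of an (s, S) policy with a one-period lead time.
        random.seed(seed)
        on_hand = S
        pipeline = 0                      # units ordered last period, arriving now
        total_cost = 0.0
        for _ in range(horizon):
            on_hand += pipeline           # outstanding order arrives
            pipeline = 0
            demand = random.randint(0, 15)
            on_hand -= demand             # unmet demand is backlogged (on_hand < 0)
            if on_hand < s:               # (s, S) rule: order up to S
                total_cost += fixed_cost
                pipeline = S - on_hand
            total_cost += holding * max(on_hand, 0) + backlog * max(-on_hand, 0)
        return total_cost / horizon

    print(f"average cost per period: {simulate_sS():.2f}")

    This single-location cost evaluation is only a toy counterpart of the networked setting with general uncertainty sets that the thesis addresses.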

    The application of non-linear dynamics to teletraffic modelling.

    PhD. Abstract not available. Engineering and Physical Sciences Research Council (EPSRC) and NORTE.

    Diffusion Asymptotics for Sequential Experiments

    We propose a new diffusion-asymptotic analysis for sequentially randomized experiments, including those that arise in solving multi-armed bandit problems. In an experiment with $n$ time steps, we let the mean reward gaps between actions scale to the order $1/\sqrt{n}$ so as to preserve the difficulty of the learning task as $n$ grows. In this regime, we show that the behavior of a class of sequentially randomized Markov experiments converges to a diffusion limit, given as the solution of a stochastic differential equation. The diffusion limit thus enables us to derive refined, instance-specific characterizations of the stochastic dynamics of adaptive experiments. As an application of this framework, we use the diffusion limit to obtain several new insights on the regret and belief evolution of Thompson sampling. We show that a version of Thompson sampling with an asymptotically uninformative prior variance achieves nearly optimal instance-specific regret scaling when the reward gaps are relatively large. We also demonstrate that, in this regime, the posterior beliefs underlying Thompson sampling are highly unstable over time.
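
    The diffusion scaling can be mimicked numerically: fix a horizon $n$, set the gap between arm means to $c/\sqrt{n}$, and run Thompson sampling with Gaussian rewards and a Gaussian prior on each arm mean. The sketch below is a minimal two-armed illustration of that setup, not the paper's analysis; the constant c, the prior variance, the noise level, and the function name thompson_regret are illustrative assumptions.

    import math, random

    def thompson_regret(n=50_000, c=1.0, prior_var=10.0, noise_sd=1.0, seed=0):
        # Two-armed Gaussian Thompson sampling with mean-reward gap c / sqrt(n).
        random.seed(seed)
        gap = c / math.sqrt(n)
        means = [gap, 0.0]                 # arm 0 is better by `gap`
        counts = [0, 0]                    # pulls per arm
        sums = [0.0, 0.0]                  # summed rewards per arm
        regret = 0.0
        for _ in range(n):
            samples = []
            for arm in range(2):
                # Gaussian posterior for a N(0, prior_var) prior and known noise variance.
                post_var = 1.0 / (1.0 / prior_var + counts[arm] / noise_sd**2)
                post_mean = post_var * (sums[arm] / noise_sd**2)
                samples.append(random.gauss(post_mean, math.sqrt(post_var)))
            a = 0 if samples[0] >= samples[1] else 1   # play the arm with the larger sample
            reward = random.gauss(means[a], noise_sd)
            counts[a] += 1
            sums[a] += reward
            regret += gap if a == 1 else 0.0           # instantaneous regret of the worse arm
        return regret

    print(f"cumulative regret over n steps: {thompson_regret():.2f}")

    Running the sketch for several horizons $n$ gives a rough, empirical look at how regret behaves under the $1/\sqrt{n}$ gap scaling that the paper analyzes exactly through its diffusion limit.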