29 research outputs found

    A central limit theorem for temporally non-homogenous Markov chains with applications to dynamic programming

    We prove a central limit theorem for a class of additive processes that arise naturally in the theory of finite horizon Markov decision problems. The main theorem generalizes a classic result of Dobrushin (1956) for temporally non-homogeneous Markov chains, and the principal innovation is that here the summands are permitted to depend on both the current state and a bounded number of future states of the chain. We show through several examples that this added flexibility gives one a direct path to asymptotic normality of the optimal total reward of finite horizon Markov decision problems. The same examples also explain why such results are not easily obtained by alternative Markovian techniques such as enlargement of the state space. Comment: 27 pages, 1 figure

    Essays in Problems of Optimal Sequential Decisions

    In this dissertation, we study several Markovian problems of optimal sequential decisions by focusing on research questions that are driven by probabilistic and operations-management considerations. Our probabilistic interest is in understanding the distribution of the total reward that one obtains when implementing a policy that maximizes its expected value. In this respect, we study the sequential selection of unimodal and alternating subsequences from a random sample, and we prove accurate bounds for the expected values and exact asymptotics. In the unimodal problem, we also note that the variance of the optimal total reward can be bounded in terms of its expected value. This fact then motivates a much broader analysis that characterizes a class of Markov decision problems that share this important property. In the alternating subsequence problem, we also outline how one could prove a central limit theorem for the number of alternating selections in a finite random sample, as the size of the sample grows to infinity. Our operations-management interest is in studying the interaction of on-the-job learning and learning-by-doing in a workforce-related problem. Specifically, we study the sequential hiring and retention of heterogeneous workers who learn over time. We model the hiring and retention problem as a Bayesian infinite-armed bandit, and we characterize the optimal policy in detail. Through an extensive set of numerical examples, we gain insights into the managerial nature of the problem, and we demonstrate that the value of active monitoring and screening of employees can be substantial.

    Quickest Online Selection of an Increasing Subsequence of Specified Size

    Given a sequence of independent random variables with a common continuous distribution, we consider the online decision problem where one seeks to minimize the expected value of the time that is needed to complete the selection of a monotone increasing subsequence of a prespecified length $n$. This problem is dual to some online decision problems that have been considered earlier, and this dual problem has some notable advantages. In particular, the recursions and equations of optimality lead with relative ease to asymptotic formulas for the mean and variance of the minimal selection time. Comment: 17 pages

    Optimal Online Selection of a Monotone Subsequence: a Central Limit Theorem

    Consider a sequence of $n$ independent random variables with a common continuous distribution $F$, and consider the task of choosing an increasing subsequence where the observations are revealed sequentially and where an observation must be accepted or rejected when it is first revealed. There is a unique selection policy $\pi_n^*$ that is optimal in the sense that it maximizes the expected value of $L_n(\pi_n^*)$, the number of selected observations. We investigate the distribution of $L_n(\pi_n^*)$; in particular, we obtain a central limit theorem for $L_n(\pi_n^*)$ and a detailed understanding of its mean and variance for large $n$. Our results and methods are complementary to the work of Bruss and Delbaen (2004) where an analogous central limit theorem is found for monotone increasing selections from a finite sequence with cardinality $N$ where $N$ is a Poisson random variable that is independent of the sequence. Comment: 26 pages
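
    To make the online selection setting concrete, here is a minimal simulation sketch. It is not the optimal policy studied in the paper: it assumes the observations are uniform on [0, 1] and uses a hypothetical fixed-window rule that accepts an observation when it exceeds the last accepted value by at most sqrt(2/n). The function names select_increasing_online and simulate are illustrative only.

```python
import random


def select_increasing_online(xs, window):
    """Select an increasing subsequence online with a fixed-window rule.

    Each observation is accepted or rejected when it is first seen.
    Assumption (not from the paper): accept x if last < x <= last + window.
    """
    selected = []
    last = 0.0  # values are assumed to lie in [0, 1]
    for x in xs:
        if last < x <= last + window:
            selected.append(x)
            last = x
    return selected


def simulate(n, trials, seed=0):
    """Estimate the mean and sample variance of the number of selections."""
    rng = random.Random(seed)
    window = (2.0 / n) ** 0.5  # heuristic window width, an assumption
    counts = []
    for _ in range(trials):
        xs = [rng.random() for _ in range(n)]
        counts.append(len(select_increasing_online(xs, window)))
    mean = sum(counts) / trials
    var = sum((c - mean) ** 2 for c in counts) / (trials - 1)
    return mean, var
```

    Calling simulate(n, trials) for increasing n gives an empirical view of how the mean and variance of the number of selections grow under this heuristic; the optimal policy would require solving the dynamic programming recursion instead.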

    Optimal Online Selection of an Alternating Subsequence: A Central Limit Theorem

    We analyze the optimal policy for the sequential selection of an alternating subsequence from a sequence of n independent observations from a continuous distribution F, and we prove a central limit theorem for the number of selections made by that policy. The proof exploits the backward recursion of dynamic programming and assembles a detailed understanding of the associated value functions and selection rules.

    Beardwood-Halton-Hammersley Theorem for Stationary Ergodic Sequences: A Counterexample

    We construct a stationary ergodic process $X_1, X_2, \ldots$ such that each $X_t$ has the uniform distribution on the unit square and the length $L_n$ of the shortest path through the points $X_1, X_2, \ldots, X_n$ is not asymptotic to a constant times the square root of $n$. In other words, we show that the Beardwood, Halton, and Hammersley theorem does not extend from the case of independent uniformly distributed random variables to the case of stationary ergodic sequences with uniform marginal distributions.

    Markov Decision Problems Where Means Bound Variances

    We identify a rich class of finite-horizon Markov decision problems (MDPs) for which the variance of the optimal total reward can be bounded by a simple linear function of its expected value. The class is characterized by three natural properties: reward nonnegativity and boundedness, existence of a do-nothing action, and optimal action monotonicity. These properties are commonly present and typically easy to check. Implications of the class properties and of the variance bound are illustrated by examples of MDPs from operations research, operations management, financial engineering, and combinatorial optimization.

    Optimal Sequential Selection of a Unimodal Subsequence of a Random Sequence

    We consider the problem of sequentially selecting a unimodal subsequence from a sequence of independent identically distributed random variables, and we find that a person doing optimal sequential selection performs within a factor of the square root of two of a prophet who knows all of the random observations in advance of any selections. Our analysis in fact applies to selections of subsequences that have d+1 monotone blocks, and, by including the case d=0, our analysis also covers monotone subsequences.
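
    For the monotone case d=0, the prophet's benchmark is the length of the longest increasing subsequence of the full sample. The following is a minimal sketch of that benchmark using the standard patience-sorting technique; the function name longest_increasing_length is illustrative and does not come from the paper, and the unimodal case (d=1) is not covered here.

```python
import bisect


def longest_increasing_length(xs):
    """Length of the longest strictly increasing subsequence of xs.

    Patience sorting: tails[k] holds the smallest possible last value
    of a strictly increasing subsequence of length k + 1 seen so far.
    """
    tails = []
    for x in xs:
        i = bisect.bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)  # x extends the longest subsequence so far
        else:
            tails[i] = x  # x gives a smaller tail for length i + 1
    return len(tails)
```

    Comparing this offline value with the count achieved by an online selection rule on the same sample gives an empirical view of the prophet-versus-online gap described in the abstract.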