A central limit theorem for temporally non-homogeneous Markov chains with applications to dynamic programming

Author: Arlotto Alessandro
Steele J. Michael
Publication venue: 'Institute for Operations Research and the Management Sciences (INFORMS)'
Publication date: 06/12/2015

We prove a central limit theorem for a class of additive processes that arise naturally in the theory of finite horizon Markov decision problems. The main theorem generalizes a classic result of Dobrushin (1956) for temporally non-homogeneous Markov chains, and the principal innovation is that here the summands are permitted to depend on both the current state and a bounded number of future states of the chain. We show through several examples that this added flexibility gives one a direct path to asymptotic normality of the optimal total reward of finite horizon Markov decision problems. The same examples also explain why such results are not easily obtained by alternative Markovian techniques such as enlargement of the state space. Comment: 27 pages, 1 figure

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Essays in Problems of Optimal Sequential Decisions

Author: Arlotto Alessandro
Publication venue: ScholarlyCommons
Publication date: 01/01/2012

In this dissertation, we study several Markovian problems of optimal sequential decisions by focusing on research questions that are driven by probabilistic and operations-management considerations. Our probabilistic interest is in understanding the distribution of the total reward that one obtains when implementing a policy that maximizes its expected value. In this respect, we study the sequential selection of unimodal and alternating subsequences from a random sample, and we prove accurate bounds for the expected values and exact asymptotics. In the unimodal problem, we also note that the variance of the optimal total reward can be bounded in terms of its expected value. This fact then motivates a much broader analysis that characterizes a class of Markov decision problems that share this important property. In the alternating subsequence problem, we also outline how one could prove a Central Limit Theorem for the number of alternating selections in a finite random sample, as the size of the sample grows to infinity. Our operations-management interest is in studying the interaction of on-the-job learning and learning-by-doing in a workforce-related problem. Specifically, we study the sequential hiring and retention of heterogeneous workers who learn over time. We model the hiring and retention problem as a Bayesian infinite-armed bandit, and we characterize the optimal policy in detail. Through an extensive set of numerical examples, we gain insights into the managerial nature of the problem, and we demonstrate that the value of active monitoring and screening of employees can be substantial.

ScholarlyCommons@Penn

Quickest Online Selection of an Increasing Subsequence of Specified Size

Author: Arlotto Alessandro
Mossel Elchanan
Steele J. Michael
Publication venue: 'Wiley'
Publication date: 09/08/2015

Given a sequence of independent random variables with a common continuous distribution, we consider the online decision problem where one seeks to minimize the expected value of the time that is needed to complete the selection of a monotone increasing subsequence of a prespecified length $n$. This problem is dual to some online decision problems that have been considered earlier, and this dual problem has some notable advantages. In particular, the recursions and equations of optimality lead with relative ease to asymptotic formulas for mean and variance of the minimal selection time. Comment: 17 pages

arXiv.org e-Print Archive

ScholarlyCommons@Penn

Optimal Online Selection of a Monotone Subsequence: a Central Limit Theorem

Author: Arlotto Alessandro
Nguyen Vinh V.
Steele J. Michael
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

Consider a sequence of $n$ independent random variables with a common continuous distribution $F$, and consider the task of choosing an increasing subsequence where the observations are revealed sequentially and where an observation must be accepted or rejected when it is first revealed. There is a unique selection policy $\pi_n^*$ that is optimal in the sense that it maximizes the expected value of $L_n(\pi_n^*)$, the number of selected observations. We investigate the distribution of $L_n(\pi_n^*)$; in particular, we obtain a central limit theorem for $L_n(\pi_n^*)$ and a detailed understanding of its mean and variance for large $n$. Our results and methods are complementary to the work of Bruss and Delbaen (2004) where an analogous central limit theorem is found for monotone increasing selections from a finite sequence with cardinality $N$ where $N$ is a Poisson random variable that is independent of the sequence. Comment: 26 pages
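
As a rough illustration of the online selection task described in this abstract, the Python sketch below simulates a simple threshold rule on uniform observations. The rule (accept an observation if it exceeds the last selected value by no more than sqrt(2(1 - last)/remaining)) is an assumed heuristic chosen for illustration; it is not the optimal policy \pi_n^* studied in the paper. The function names select_increasing_online and simulate are hypothetical, and the observations are assumed to be uniform on [0, 1).

```python
import math
import random


def select_increasing_online(values):
    """Select an increasing subsequence online with a heuristic threshold rule.

    Assumption: values are draws from the uniform distribution on [0, 1).
    Each value is accepted or rejected when first seen, with no recall.
    """
    n = len(values)
    selected = []
    last = 0.0  # last selected value (0.0 before any selection)
    for i, x in enumerate(values):
        remaining = n - i  # observations left, including the current one
        # Assumed acceptance window; illustrative, not the optimal policy.
        window = math.sqrt(2.0 * (1.0 - last) / remaining)
        if last < x <= last + window:
            selected.append(x)
            last = x
    return selected


def simulate(n, trials, seed=0):
    """Return the sample mean and sample variance of the number of selections."""
    rng = random.Random(seed)
    counts = []
    for _ in range(trials):
        sample = [rng.random() for _ in range(n)]
        counts.append(len(select_increasing_online(sample)))
    mean = sum(counts) / trials
    var = sum((c - mean) ** 2 for c in counts) / (trials - 1)
    return mean, var


if __name__ == "__main__":
    mean, var = simulate(n=1000, trials=200)
    print(f"mean selections: {mean:.2f}, variance: {var:.2f}")
```

Running simulate for increasing n gives an empirical view of how the mean and variance of the number of selections grow under this heuristic; the paper's results concern the optimal policy, so the numbers are indicative only.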

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Optimal Online Selection of an Alternating Subsequence: A Central Limit Theorem

Author: Arlotto Alessandro
Steele J. Michael
Publication venue: ScholarlyCommons
Publication date: 01/01/2013

We analyze the optimal policy for the sequential selection of an alternating subsequence from a sequence of $n$ independent observations from a continuous distribution $F$, and we prove a central limit theorem for the number of selections made by that policy. The proof exploits the backward recursion of dynamic programming and assembles a detailed understanding of the associated value functions and selection rules.

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Beardwood-Halton-Hammersley Theorem for Stationary Ergodic Sequences: A Counterexample

Author: Arlotto Alessandro
Steele J. Michael
Publication venue: ScholarlyCommons
Publication date: 24/08/2015

We construct a stationary ergodic process $X_1, X_2, \ldots$ such that each $X_t$ has the uniform distribution on the unit square and the length $L_n$ of the shortest path through the points $X_1, X_2, \ldots, X_n$ is not asymptotic to a constant times the square root of $n$. In other words, we show that the Beardwood, Halton, and Hammersley theorem does not extend from the case of independent uniformly distributed random variables to the case of stationary ergodic sequences with uniform marginal distributions.

arXiv.org e-Print Archive

CiteSeerX

ScholarlyCommons@Penn

Markov Decision Problems Where Means Bound Variances

Author: Arlotto Alessandro
Gans Noah
Steele John M
Publication venue: ScholarlyCommons
Publication date: 08/04/2013

We identify a rich class of finite-horizon Markov decision problems (MDPs) for which the variance of the optimal total reward can be bounded by a simple linear function of its expected value. The class is characterized by three natural properties: reward nonnegativity and boundedness, existence of a do-nothing action, and optimal action monotonicity. These properties are commonly present and typically easy to check. Implications of the class properties and of the variance bound are illustrated by examples of MDPs from operations research, operations management, financial engineering, and combinatorial optimization.

CiteSeerX

ScholarlyCommons@Penn

Optimal Sequential Selection of a Unimodal Subsequence of a Random Sequence

Author: Arlotto Alessandro
Steele J. Michael
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2011

We consider the problem of selecting sequentially a unimodal subsequence from a sequence of independent identically distributed random variables, and we find that a person doing optimal sequential selection does within a factor of the square root of two as well as a prophet who knows all of the random observations in advance of any selections. Our analysis applies in fact to selections of subsequences that have $d+1$ monotone blocks, and, by including the case $d=0$, our analysis also covers monotone subsequences.

arXiv.org e-Print Archive

CiteSeerX

Crossref

ScholarlyCommons@Penn