On the Markov Chain Monte Carlo (MCMC) method
Markov Chain Monte Carlo (MCMC) is a popular method used to generate samples from arbitrary distributions, which may be specified indirectly. In this article, we give an introduction to this method along with some examples.
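The idea can be sketched with the simplest MCMC scheme, random-walk Metropolis-Hastings, where the target is "specified indirectly" through an unnormalized log-density (the model below is an illustrative assumption, not from the article):

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings: sample from a distribution
    known only up to a normalizing constant via its log-density."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)          # symmetric proposal
        log_alpha = log_target(proposal) - log_target(x)
        if math.log(rng.random()) < log_alpha:       # accept with prob min(1, ratio)
            x = proposal
        samples.append(x)                            # on rejection, repeat current x
    return samples

# Illustrative target: standard normal, given only through -x^2/2 (no constant).
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
```

The chain's stationary distribution is the target, so empirical averages over `samples` approximate expectations under it.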
On the Second Fundamental Theorem of Asset Pricing
Let M^1, ..., M^d be sigma-martingales on a filtered probability space
(Ω, F, P). We show
that every bounded martingale (with respect to the underlying filtration)
admits an integral representation w.r.t. M^1, ..., M^d if and only if there
is no equivalent probability measure (other than P) under which
M^1, ..., M^d are sigma-martingales.
From this we deduce the second fundamental theorem of asset pricing: that
completeness of a market is equivalent to uniqueness of the Equivalent
Sigma-Martingale Measure (ESMM).
Convergence of Batch Asynchronous Stochastic Approximation With Applications to Reinforcement Learning
The stochastic approximation algorithm is a widely used probabilistic method
for finding a zero of a vector-valued function, when only noisy measurements of
the function are available. In the literature to date, one can make a
distinction between "synchronous" updating, whereby every component of the
current guess is updated at each time, and "asynchronous" updating, whereby
only one component is updated. In principle, it is also possible to update, at
each time instant, some but not all components of the current guess, which
might be termed "batch asynchronous stochastic approximation" (BASA). One can
also make a distinction between using a "local" clock versus a "global" clock.
In this paper, we propose a unified formulation of batch asynchronous
stochastic approximation (BASA) algorithms, and develop a general methodology
for proving that such algorithms converge, irrespective of whether global or
local clocks are used. These convergence proofs make use of weaker hypotheses
than existing results. For example, existing convergence proofs when a local
clock is used require that the measurement noise is an i.i.d. sequence; here,
it is only assumed that the measurement errors form a martingale difference
sequence.
Also, all results to date assume that the stochastic step sizes satisfy a
probabilistic analog of the Robbins-Monro conditions. We replace this by a
purely deterministic condition on the irreducibility of the underlying Markov
processes.
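The BASA update can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the batch is drawn uniformly at random, the target map f(θ) = b − θ and the step-size rule 1/(local count) are assumptions chosen so the zero is known:

```python
import random

def basa_find_zero(f_noisy, dim, n_iters, batch_size, seed=0):
    """Sketch of batch asynchronous stochastic approximation (BASA):
    at each step only a random subset of the components of the current
    guess theta is updated, each with its own local-clock step size."""
    rng = random.Random(seed)
    theta = [0.0] * dim
    counts = [0] * dim                                # local clocks per component
    for _ in range(n_iters):
        batch = rng.sample(range(dim), batch_size)    # components updated this step
        g = f_noisy(theta, rng)                       # noisy measurement of f(theta)
        for i in batch:
            counts[i] += 1
            alpha = 1.0 / counts[i]                   # local-clock step size
            theta[i] += alpha * g[i]                  # move toward a zero of f
    return theta

# Hypothetical target: f(theta) = b - theta, whose unique zero is theta = b.
b = [1.0, -2.0, 0.5, 3.0]
noisy = lambda th, rng: [b[i] - th[i] + rng.gauss(0, 0.1) for i in range(len(th))]
theta = basa_find_zero(noisy, dim=4, n_iters=5000, batch_size=2, seed=1)
```

With batch_size=1 this reduces to asynchronous updating, and with batch_size=dim to synchronous updating, so the batch size interpolates between the two regimes described above.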
As specific applications to Reinforcement Learning, we introduce "batch"
versions of the temporal difference algorithm for value iteration, and
the Q-learning algorithm for finding the optimal action-value function, and
also permit the use of local clocks instead of a global clock. In all cases, we
establish the convergence of these algorithms, under milder conditions than in
the existing literature.
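A batch variant of Q-learning in the above spirit can be sketched as below. The toy MDP, the uniform choice of batch, and the 1/(local count) step size are illustrative assumptions, not the paper's construction:

```python
import random

def batch_q_learning(n_states, n_actions, step_fn, gamma, n_iters, batch_size, seed=0):
    """Sketch of 'batch' Q-learning: at each iteration only a random
    batch of (state, action) entries is updated, each with a step size
    set by its own local clock."""
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    clock = [[0] * n_actions for _ in range(n_states)]
    all_pairs = [(s, a) for s in range(n_states) for a in range(n_actions)]
    for _ in range(n_iters):
        for s, a in rng.sample(all_pairs, batch_size):   # batch of entries
            r, s2 = step_fn(s, a, rng)                   # sample reward, next state
            clock[s][a] += 1
            alpha = 1.0 / clock[s][a]                    # local-clock step size
            target = r + gamma * max(Q[s2])              # standard Q-learning target
            Q[s][a] += alpha * (target - Q[s][a])
    return Q

# Hypothetical toy MDP: action a moves to state a; action 1 earns reward 1.
def step(s, a, rng):
    return (1.0 if a == 1 else 0.0), a

Q = batch_q_learning(n_states=2, n_actions=2, step_fn=step, gamma=0.5,
                     n_iters=20000, batch_size=2, seed=1)
# For this MDP the optimal values are Q*(s, 1) = 2 and Q*(s, 0) = 1 in both states.
```

Here batch_size=1 recovers ordinary asynchronous Q-learning, while batch_size equal to the number of (state, action) pairs recovers synchronous value iteration with sampled targets.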
Robustness of the nonlinear filter: the correlated case
We consider the question of robustness of the optimal nonlinear filter when the signal process X and the observation noise are possibly correlated. The signal X and observations Y are given by an SDE where the coefficients can depend on the entire past. Using results on pathwise solutions of stochastic differential equations we express X as a functional of two independent Brownian motions under the reference probability measure P0. This allows us to write the filter p as a ratio of two expectations. This is the main step in proving robustness. In this framework we show that when (Xn,Yn) converge to (X,Y) in law, then the corresponding filters also converge in law. Moreover, when the signal and observation processes converge in probability, so do the filters. We also prove that the paths of the filter are continuous in this framework.
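The "filter as a ratio of two expectations" representation can be illustrated by Monte Carlo in a simple discrete-time analogue. The linear-Gaussian model below and all its parameters are assumptions for illustration only; the point is that the filter mean is computed as a ratio of expectations under a reference measure in which signal and observations are independent:

```python
import math
import random

def ratio_filter(ys, n_particles=4000, a=0.9, sx=0.5, sy=0.5, seed=0):
    """Monte Carlo sketch of the filter as a ratio of two expectations:
    E[f(X_t) | Y] = E0[f(X_t) L_t] / E0[L_t], approximated by simulating
    independent signal paths under the reference measure and weighting
    each by its accumulated observation likelihood L_t.
    Illustrative model: X_{k+1} = a X_k + sx N(0,1),  Y_k = X_k + sy N(0,1)."""
    rng = random.Random(seed)
    xs = [0.0] * n_particles
    logw = [0.0] * n_particles
    means = []
    for y in ys:
        for i in range(n_particles):
            xs[i] = a * xs[i] + sx * rng.gauss(0, 1)    # propagate signal path
            logw[i] += -0.5 * ((y - xs[i]) / sy) ** 2   # accumulate likelihood L_t
        m = max(logw)                                   # stabilize the exponentials
        w = [math.exp(l - m) for l in logw]
        # filter mean = ratio of two expectations under the reference measure
        means.append(sum(wi * xi for wi, xi in zip(w, xs)) / sum(w))
    return means

means = ratio_filter([1.0] * 15, seed=2)
```

Both numerator and denominator are plain averages over independent paths, which is what makes continuity and convergence statements about the filter reduce to statements about these two expectations.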
Measure free martingales
We give a necessary and sufficient condition on a sequence of functions on a set Ω under which there is a measure on Ω which renders the given sequence of functions a martingale. Further, such a measure is unique if we impose a natural maximum entropy condition on the conditional probabilities.
Limiting distributions of functionals of Markov chains
Let {Xn, n ≥ 0} and {Yn, n ≥ 0} be two stochastic processes such that Yn depends on Xn in a stationary manner, i.e. P(Yn ∊ A|Xn) does not depend on n. Sufficient conditions are derived for Yn to have a limiting distribution. If Xn is a Markov chain with stationary transition probabilities and Yn = f(Xn,..., Xn+k) then Yn depends on Xn in a stationary way. Two situations are considered: (i) {Xn, n ≥ 0} has a limiting distribution (ii) {Xn, n ≥ 0} does not have a limiting distribution and exits every finite set with probability 1. Several examples are considered including that of a non-homogeneous Poisson process with periodic rate function where we obtain the limiting distribution of the interevent times.
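Situation (i) can be illustrated by simulation. The two-state chain and the functional Yn = f(Xn, Xn+1) = 1{Xn = Xn+1} below are assumptions chosen so the limiting distribution is computable in closed form:

```python
import random

def limit_fraction(n_steps, seed=0):
    """X_n is a two-state Markov chain with a limiting distribution, and
    Y_n = f(X_n, X_{n+1}) = 1{X_n = X_{n+1}} depends on X_n in a
    stationary way; the long-run frequency of Y_n = 1 approximates the
    limiting distribution of Y_n (illustrative model)."""
    stay = {0: 0.7, 1: 0.6}                 # P(chain stays in state s)
    rng = random.Random(seed)
    x, hits = 0, 0
    for _ in range(n_steps):
        x_next = x if rng.random() < stay[x] else 1 - x
        hits += (x == x_next)               # Y_n = 1 iff the chain stays put
        x = x_next
    return hits / n_steps

# Stationary law pi = (4/7, 3/7), so P(Y = 1) -> 4/7 * 0.7 + 3/7 * 0.6 = 23/35.
freq = limit_fraction(200000)
```

Here the limit of P(Yn = 1) is obtained by averaging the stay probabilities against the chain's stationary distribution, exactly the mechanism behind case (i).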