1,857 research outputs found
A Linear Programming Approach to Error Bounds for Random Walks in the Quarter-plane
We consider the approximation of the performance of random walks in the
quarter-plane. The approximation is in terms of a random walk with a
product-form stationary distribution, which is obtained by perturbing the
transition probabilities along the boundaries of the state space. A Markov
reward approach is used to bound the approximation error. The main contribution
of the work is the formulation of a linear program that provides the
approximation error.
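The abstract does not reproduce the linear program itself; as a rough, hypothetical sketch of the general shape such a formulation can take (all data below is made up for illustration), one minimizes a linear upper bound on the approximation error subject to linear constraints, e.g. with scipy.optimize.linprog:

```python
import numpy as np
from scipy.optimize import linprog

# Toy, made-up data: x are nonnegative weights on error terms and the
# objective is a linear upper bound on the approximation error.
c = np.array([1.0, 2.0, 0.5])
A_ub = np.array([[-1.0,  0.0,  0.0],     # encodes x[0] >= 0.1
                 [ 0.0, -1.0, -1.0]])    # encodes x[1] + x[2] >= 0.3
b_ub = np.array([-0.1, -0.3])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * 3)
print("minimal error bound:", res.fun)   # 0.25 for this toy instance
```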
Backward Error Analysis of Factorization Algorithms for Symmetric and Symmetric Triadic Matrices
We consider the $LDL^T$ factorization of a symmetric matrix $A$, where $L$ is
unit lower triangular and $D$ is block diagonal with diagonal blocks of
order $1$ or $2$. This is a generalization of the Cholesky factorization,
and pivoting is incorporated for stability. However, the reliability of
the Bunch-Kaufman pivoting strategy and Bunch's pivoting method for
symmetric tridiagonal matrices could be questioned, because they may
result in unbounded $L$. In this paper, we give a condition under which
$LDL^T$ factorization will run to completion in inexact arithmetic with
inertia preserved. In addition, we present a new proof of the
componentwise backward stability of the factorization using the inner
product formulation, giving a slight improvement of the bounds in Higham's
proofs, which relied on the outer product formulation and normwise
analysis.
We also analyze the stability of rank estimation of symmetric indefinite
matrices by $LDL^T$ factorization incorporated with the Bunch-Parlett
pivoting strategy, generalizing results of Higham for the symmetric
semidefinite case.
We call a matrix triadic if it has no more than two non-zero off-diagonal
elements in any column. A symmetric tridiagonal matrix is a special case.
In this paper, we display the improvement in stability bounds when the
matrix is triadic.
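As a quick numerical companion (a minimal sketch, not the paper's analysis), the snippet below factors a random symmetric tridiagonal matrix, a special case of a triadic matrix, with SciPy's ldl routine, which uses Bunch-Kaufman-style pivoting, and measures normwise and componentwise backward errors; the test matrix and error measures are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.linalg import ldl

rng = np.random.default_rng(0)
n = 6
main = rng.standard_normal(n)
off = rng.standard_normal(n - 1)
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)  # symmetric tridiagonal

L, D, perm = ldl(A)              # A == L @ D @ L.T (permutation folded into L)
residual = A - L @ D @ L.T

print("normwise backward error:",
      np.linalg.norm(residual) / np.linalg.norm(A))
comp = np.abs(L) @ np.abs(D) @ np.abs(L).T   # |L||D||L^T|, componentwise scale
print("componentwise backward error:",
      np.max(np.abs(residual) / np.maximum(comp, np.finfo(float).tiny)))
```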
Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee
Local Policy Search is a popular reinforcement learning approach for handling
large state spaces. Formally, it searches locally in a parameterized policy
space in order to maximize the associated value function averaged over some
predefined distribution. It is probably commonly believed that the best one
can hope for in general from such an approach is a local optimum of this
criterion. In this article, we show the following surprising result:
\emph{any} (approximate) \emph{local optimum} enjoys a \emph{global performance
guarantee}. We compare this guarantee with the one satisfied by Direct
Policy Iteration, an approximate dynamic programming algorithm that does some
form of Policy Search: while the approximation error of Local Policy Search may
generally be bigger (because local search requires considering a space of
stochastic policies), we argue that the concentrability coefficient that appears
in the performance bound is much nicer. Finally, we discuss several practical
and theoretical consequences of our analysis.
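To make "searching locally in a parameterized policy space" concrete, here is a minimal, self-contained sketch (illustration only; the two-state MDP, softmax parameterization, and finite-difference ascent are assumptions, not the article's algorithm) that locally maximizes the averaged value J(theta) = sum_s rho(s) V^{pi_theta}(s).

```python
import numpy as np

P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.7, 0.3], [0.1, 0.9]]])   # P[s, a, s']: transition kernel
R = np.array([[1.0, 0.0], [0.0, 2.0]])     # R[s, a]: immediate reward
rho = np.array([0.5, 0.5])                 # predefined state distribution
gamma = 0.9

def J(theta):
    """Value of the softmax policy pi_theta, averaged over rho."""
    pi = np.exp(theta) / np.exp(theta).sum(axis=1, keepdims=True)
    P_pi = np.einsum('sa,sat->st', pi, P)  # state transitions under pi
    r_pi = (pi * R).sum(axis=1)            # expected reward per state
    V = np.linalg.solve(np.eye(2) - gamma * P_pi, r_pi)  # exact evaluation
    return rho @ V

theta = np.zeros((2, 2))
for _ in range(200):                       # crude finite-difference ascent
    grad = np.zeros_like(theta)
    for idx in np.ndindex(*theta.shape):
        e = np.zeros_like(theta)
        e[idx] = 1e-5
        grad[idx] = (J(theta + e) - J(theta - e)) / 2e-5
    theta += 0.5 * grad

print("averaged value at the local optimum:", J(theta))
```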
- …