8,360 research outputs found
Reduced Complexity Filtering with Stochastic Dominance Bounds: A Convex Optimization Approach
This paper uses stochastic dominance principles to construct upper and lower
sample path bounds for Hidden Markov Model (HMM) filters. Given a HMM, by using
convex optimization methods for nuclear norm minimization with copositive
constraints, we construct low rank stochastic marices so that the optimal
filters using these matrices provably lower and upper bound (with respect to a
partially ordered set) the true filtered distribution at each time instant.
Since these matrices are low rank (say R), the computational cost of evaluating
the filtering bounds is O(XR) instead of O(X2). A Monte-Carlo importance
sampling filter is presented that exploits these upper and lower bounds to
estimate the optimal posterior. Finally, using the Dobrushin coefficient,
explicit bounds are given on the variational norm between the true posterior
and the upper and lower bounds
Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint
The classic objective in a reinforcement learning (RL) problem is to find a
policy that minimizes, in expectation, a long-run objective such as the
infinite-horizon discounted or long-run average cost. In many practical
applications, optimizing the expected value alone is not sufficient, and it may
be necessary to include a risk measure in the optimization process, either as
the objective or as a constraint. Various risk measures have been proposed in
the literature, e.g., mean-variance tradeoff, exponential utility, the
percentile performance, value at risk, conditional value at risk, prospect
theory and its later enhancement, cumulative prospect theory. In this article,
we focus on the combination of risk criteria and reinforcement learning in a
constrained optimization framework, i.e., a setting where the goal to find a
policy that optimizes the usual objective of infinite-horizon
discounted/average cost, while ensuring that an explicit risk constraint is
satisfied. We introduce the risk-constrained RL framework, cover popular risk
measures based on variance, conditional value-at-risk and cumulative prospect
theory, and present a template for a risk-sensitive RL algorithm. We survey
some of our recent work on this topic, covering problems encompassing
discounted cost, average cost, and stochastic shortest path settings, together
with the aforementioned risk measures in a constrained framework. This
non-exhaustive survey is aimed at giving a flavor of the challenges involved in
solving a risk-sensitive RL problem, and outlining some potential future
research directions
Data-driven satisficing measure and ranking
We propose an computational framework for real-time risk assessment and
prioritizing for random outcomes without prior information on probability
distributions. The basic model is built based on satisficing measure (SM) which
yields a single index for risk comparison. Since SM is a dual representation
for a family of risk measures, we consider problems constrained by general
convex risk measures and specifically by Conditional value-at-risk. Starting
from offline optimization, we apply sample average approximation technique and
argue the convergence rate and validation of optimal solutions. In online
stochastic optimization case, we develop primal-dual stochastic approximation
algorithms respectively for general risk constrained problems, and derive their
regret bounds. For both offline and online cases, we illustrate the
relationship between risk ranking accuracy with sample size (or iterations).Comment: 26 Pages, 6 Figure
Probabilistic analysis of Online Bin Coloring algorithms via Stochastic Comparison
This paper proposes a new method for probabilistic analysis of online algorithms that is based on the notion of stochastic dominance. We develop the method for the Online Bin Coloring problem introduced by Krumke et al. Using methods for the stochastic comparison of Markov chains we establish the strong result that the performance of the online algorithm GreedyFit is stochastically dominated by the performance of the algorithm OneBin for any number of items processed. This result gives a more realistic picture than competitive analysis and explains the behavior observed in simulations.mathematical applications;
- …