Tractable Pathfinding for the Stochastic On-Time Arrival Problem
We present a new and more efficient technique for computing the route that
maximizes the probability of on-time arrival in stochastic networks, also known
as the path-based stochastic on-time arrival (SOTA) problem. Our primary
contribution is a pathfinding algorithm that uses the solution to the
policy-based SOTA problem---which is of pseudo-polynomial-time complexity in
the time budget of the journey---as a search heuristic for the optimal path. In
particular, we show that this heuristic can be exceptionally efficient in
practice, effectively making it possible to solve the path-based SOTA problem
as quickly as the policy-based SOTA problem. Our secondary contribution is the
extension of policy-based preprocessing to path-based preprocessing for the
SOTA problem. In the process, we also introduce Arc-Potentials, a more
efficient generalization of Stochastic Arc-Flags that can be used for both
policy- and path-based SOTA. After developing the pathfinding and preprocessing
algorithms, we evaluate their performance on two different real-world networks.
To the best of our knowledge, these techniques provide the most efficient
computation strategy for the path-based SOTA problem for general probability
distributions, both with and without preprocessing.
Comment: Submission accepted by the International Symposium on Experimental
Algorithms 2016 and published by Springer in the Lecture Notes in Computer
Science series on June 1, 2016. Includes typographical corrections and
modifications to pre-processing made after the initial submission to SODA'15
(July 7, 2014)
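As a rough illustration of the policy-based SOTA problem named in the abstract above, the following is a minimal sketch of one common discrete-time formulation. It assumes integer arc travel times of at least one time unit, independent arcs, and a toy network; the function name `sota_policy` and the data layout are hypothetical and are not taken from the paper. The outer loop over the time budget is what makes the cost pseudo-polynomial in that budget.

```python
from typing import Dict, List, Tuple

# Assumed layout: arc (tail, head) -> probability mass function over
# integer travel times (every travel time must be >= 1).
Arcs = Dict[Tuple[str, str], Dict[int, float]]


def sota_policy(arcs: Arcs, dest: str, budget: int) -> Dict[str, List[float]]:
    """Return u, where u[i][t] is the best probability of reaching
    `dest` from node i within t time units under an adaptive policy."""
    nodes = {n for arc in arcs for n in arc} | {dest}
    u = {n: [0.0] * (budget + 1) for n in nodes}
    u[dest] = [1.0] * (budget + 1)  # already at the destination

    for t in range(1, budget + 1):
        for (i, j), pmf in arcs.items():
            if i == dest:
                continue
            # Travel times are >= 1, so u[j][t - k] is already final here.
            value = sum(p * u[j][t - k] for k, p in pmf.items() if k <= t)
            u[i][t] = max(u[i][t], value)
    return u


# Toy network (illustrative numbers only).
toy_arcs: Arcs = {
    ("A", "B"): {1: 0.5, 3: 0.5},
    ("B", "D"): {1: 1.0},
    ("A", "D"): {2: 0.6, 5: 0.4},
}
table = sota_policy(toy_arcs, dest="D", budget=4)
print(table["A"])
```

In this toy network the best first arc out of `A` depends on the remaining budget: the direct arc is better with 2 units left, while the route through `B` is better with 4. That budget dependence is why the optimal policy and the optimal fixed path can differ.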
A Survey on Delay-Aware Resource Control for Wireless Systems --- Large Deviation Theory, Stochastic Lyapunov Drift and Distributed Stochastic Learning
This tutorial paper gives a comprehensive survey of several major
systematic approaches to delay-aware control problems, namely the
equivalent rate constraint approach, the Lyapunov stability drift approach and
the approximate Markov Decision Process (MDP) approach using stochastic
learning. These approaches essentially embrace most of the existing literature
regarding delay-aware resource control in wireless systems. They have their
relative pros and cons in terms of performance, complexity and implementation
issues. For each of the approaches, the problem setup, the general solution and
the design methodology are discussed. Applications of these approaches to
delay-aware resource allocation are illustrated with examples in single-hop
wireless networks. Furthermore, recent results regarding delay-aware multi-hop
routing designs in general multi-hop networks are elaborated. Finally, the
delay performance of the various approaches is compared through simulations
using uplink OFDMA systems as an example.
Comment: 58 pages, 8 figures; IEEE Transactions on Information Theory, 201
Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint
The classic objective in a reinforcement learning (RL) problem is to find a
policy that minimizes, in expectation, a long-run objective such as the
infinite-horizon discounted or long-run average cost. In many practical
applications, optimizing the expected value alone is not sufficient, and it may
be necessary to include a risk measure in the optimization process, either as
the objective or as a constraint. Various risk measures have been proposed in
the literature, e.g., mean-variance tradeoff, exponential utility, the
percentile performance, value at risk, conditional value at risk, prospect
theory and its later enhancement, cumulative prospect theory. In this article,
we focus on the combination of risk criteria and reinforcement learning in a
constrained optimization framework, i.e., a setting where the goal is to find a
policy that optimizes the usual objective of infinite-horizon
discounted/average cost, while ensuring that an explicit risk constraint is
satisfied. We introduce the risk-constrained RL framework, cover popular risk
measures based on variance, conditional value-at-risk and cumulative prospect
theory, and present a template for a risk-sensitive RL algorithm. We survey
some of our recent work on this topic, covering problems encompassing
discounted cost, average cost, and stochastic shortest path settings, together
with the aforementioned risk measures in a constrained framework. This
non-exhaustive survey is aimed at giving a flavor of the challenges involved in
solving a risk-sensitive RL problem, and outlining some potential future
research directions.
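To make one of the risk measures listed above concrete, here is a minimal sketch of an empirical conditional value-at-risk (CVaR) estimate and a constraint check built on it. It assumes costs are given as a finite sample and uses the simple tail-average convention (mean of the worst `ceil((1 - alpha) * n)` samples); the function names and the threshold parameter are hypothetical, and this is not the algorithm template from the survey.

```python
import math
from typing import Sequence


def empirical_cvar(costs: Sequence[float], alpha: float) -> float:
    """Tail-average CVaR estimate: mean of the worst
    ceil((1 - alpha) * n) sampled costs (larger cost = worse)."""
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(costs, reverse=True)  # worst outcomes first
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:k]) / k


def satisfies_risk_constraint(costs: Sequence[float],
                              alpha: float,
                              threshold: float) -> bool:
    """Check the sampled constraint CVaR_alpha(cost) <= threshold."""
    return empirical_cvar(costs, alpha) <= threshold


sample = [1.0, 2.0, 3.0, 10.0]  # illustrative costs only
print(empirical_cvar(sample, alpha=0.5))
print(satisfies_risk_constraint(sample, alpha=0.5, threshold=5.0))
```

With `alpha = 0` the estimate reduces to the sample mean, which is the usual risk-neutral objective; raising `alpha` weights the estimate toward the worst outcomes.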