Adaptive Online Prediction by Following the Perturbed Leader
When applying aggregating strategies to Prediction with Expert Advice, the
learning rate must be adaptively tuned. The natural choice of
sqrt(complexity/current loss) renders the analysis of Weighted Majority
derivatives quite complicated. In particular, for arbitrary weights there have
been no results proven so far. The analysis of the alternative "Follow the
Perturbed Leader" (FPL) algorithm from Kalai & Vempala (2003) (based on
Hannan's algorithm) is easier. We derive loss bounds for adaptive learning rate
and both finite expert classes with uniform weights and countable expert
classes with arbitrary weights. For the former setup, our loss bounds match the
best known results so far, while for the latter our results are new. Comment: 25 page
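As a rough illustration of the idea in the abstract above, the following minimal sketch runs a Follow the Perturbed Leader loop for a finite expert class with uniform weights. It is not the paper's exact algorithm: the exponential perturbation, the specific rate `sqrt(ln(n) / (1 + best loss so far))`, and the names `fpl_choose` and `run_fpl` are assumptions made for this example, and the countable, arbitrary-weight case is not covered.

```python
import math
import random


def fpl_choose(cum_losses, eta, rng):
    """Return the index of the expert with the smallest perturbed loss.

    Each cumulative loss is reduced by an independent Exp(1) draw
    scaled by 1/eta, so a small eta means stronger randomization.
    """
    perturbed = [loss - rng.expovariate(1.0) / eta for loss in cum_losses]
    return min(range(len(cum_losses)), key=perturbed.__getitem__)


def run_fpl(loss_rounds, seed=0):
    """Play FPL over a sequence of per-expert loss vectors in [0, 1].

    Returns (total loss of the learner, cumulative loss of the best expert).
    """
    rng = random.Random(seed)
    n = len(loss_rounds[0])
    cum = [0.0] * n
    total = 0.0
    for losses in loss_rounds:
        # Assumed adaptive rate of the form sqrt(complexity / current loss).
        # Complexity is taken as ln(n), floored at 1 so that n = 1 works;
        # the "+ 1" avoids division by zero in the first rounds.
        eta = math.sqrt(max(math.log(n), 1.0) / (1.0 + min(cum)))
        chosen = fpl_choose(cum, eta, rng)
        total += losses[chosen]
        # Full-information setting: every expert's loss is observed.
        for j, loss in enumerate(losses):
            cum[j] += loss
    return total, min(cum)
```

For instance, `run_fpl([[0.0, 1.0]] * 50)` plays 50 rounds against two experts, one of which never incurs loss; the learner's total loss can then be compared with the best expert's cumulative loss of 0.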
Erlang loss bounds for OT-ICU systems
In hospitals, patients can be rejected at both the operating theater (OT) and the intensive care unit (ICU) due to limited ICU capacity. The corresponding ICU rejection probability is an important service factor for hospitals. Rejection of an ICU request may lead to health deterioration for patients and, for hospitals, to costly actions and a loss of precious capacity when an operation is canceled.
There is no simple expression available for this ICU rejection probability that takes the interaction with the OT into account. With c the ICU capacity (number of ICU beds), this paper proves and numerically illustrates a lower bound and an upper bound, each given by a simple Erlang loss expression.
The result is based on a product form modification for a special OT–ICU tandem formulation and proved by a technically complicated Markov reward comparison approach. The upper bound result is of particular practical interest for dimensioning an ICU to secure a prespecified service quality. The numerical results include a case study.
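For readers unfamiliar with the term, an Erlang loss expression is the blocking probability of a loss system with c servers and offered load a (arrival rate times mean service time). The sketch below evaluates the standard Erlang B formula with its usual numerically stable recursion. It does not reproduce the paper's bounds: which systems give the lower and upper bound is not stated here, and the names `erlang_b` and `offered_load` are chosen for this example.

```python
def erlang_b(servers: int, offered_load: float) -> float:
    """Blocking probability of an Erlang loss system.

    Uses the recursion B(0) = 1, B(k) = a*B(k-1) / (k + a*B(k-1)),
    which avoids the large factorials of the closed-form expression.
    """
    if servers < 0 or offered_load < 0:
        raise ValueError("servers and offered_load must be non-negative")
    blocking = 1.0  # With zero servers every request is rejected.
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking
```

With one bed and an offered load of 1, `erlang_b(1, 1.0)` evaluates to 0.5; adding a second bed, `erlang_b(2, 1.0)`, lowers the rejection probability to 0.2.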
Bounded Optimal Exploration in MDP
Within the framework of probably approximately correct Markov decision
processes (PAC-MDP), much theoretical work has focused on methods to attain
near optimality after a relatively long period of learning and exploration.
However, practical concerns require the attainment of satisfactory behavior
within a short period of time. In this paper, we relax the PAC-MDP conditions
to reconcile theoretically driven exploration methods and practical needs. We
propose simple algorithms for discrete and continuous state spaces, and
illustrate the benefits of our proposed relaxation via theoretical analyses and
numerical examples. Our algorithms also maintain anytime error bounds and
average loss bounds. Our approach accommodates both Bayesian and non-Bayesian
methods. Comment: In Proceedings of the 30th AAAI Conference on Artificial Intelligence
(AAAI), 201