Reinforcement Learning for the Unit Commitment Problem
In this work we solve the day-ahead unit commitment (UC) problem by
formulating it as a Markov decision process (MDP) and finding a low-cost policy
for generation scheduling. We present two reinforcement learning algorithms
and devise a third. We compare our results to previous work that uses
simulated annealing (SA), and show a 27% improvement in operation costs, with
a running time of 2.5 minutes (compared to 2.5 hours for the existing
state of the art).
Comment: Accepted and presented at IEEE PES PowerTech, Eindhoven 2015, paper ID 46273
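As a concrete illustration of the MDP formulation this abstract describes, the sketch below encodes a toy day-ahead unit commitment instance: the state is the current on/off status of each generator, an action is the next commitment vector, and the per-period cost is startup charges plus a cheapest-first dispatch. All numbers (demands, capacities, prices) are hypothetical, and the exhaustive search stands in for the paper's learned policies, which it does not reproduce.

```python
from itertools import product

# Hypothetical toy instance (illustrative numbers, not from the paper):
DEMAND = [80, 120, 150, 100]             # MW demand per period
UNITS = [(100, 20.0, 500.0),             # (capacity MW, fuel $/MWh, startup $)
         (80, 30.0, 300.0)]

def hour_cost(demand, prev_on, on):
    """Cost of one period: startup charges plus cheapest-first dispatch."""
    cost = sum(u[2] for u, was, now in zip(UNITS, prev_on, on) if now and not was)
    remaining = demand
    for cap, price, _ in sorted((UNITS[i] for i, o in enumerate(on) if o),
                                key=lambda u: u[1]):
        gen = min(cap, remaining)        # run cheaper units at full output first
        cost += gen * price
        remaining -= gen
    return cost if remaining <= 0 else float("inf")   # unmet demand: infeasible

def schedule_cost(schedule, initial=(False, False)):
    """Total cost of a sequence of commitment vectors over the horizon."""
    total, prev = 0.0, initial
    for demand, on in zip(DEMAND, schedule):
        total += hour_cost(demand, prev, on)
        prev = on
    return total

# Exhaustive search over all on/off schedules (tractable only at toy size);
# an RL agent would instead learn a policy mapping states to commitments.
commitments = list(product((False, True), repeat=len(UNITS)))
best = min(product(commitments, repeat=len(DEMAND)), key=schedule_cost)
print(schedule_cost(best))               # minimum operating cost of the toy
```

Scaling this enumeration is exponential in units and periods, which is why the paper turns to reinforcement learning for a low-cost policy.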
Mean-Variance Optimization in Markov Decision Processes
We consider finite horizon Markov decision processes under performance
measures that involve both the mean and the variance of the cumulative reward.
We show that either randomized or history-based policies can improve
performance. We prove that the complexity of computing a policy that maximizes
the mean reward under a variance constraint is NP-hard for some cases, and
strongly NP-hard for others. We finally offer pseudopolynomial exact and
approximation algorithms.
Comment: A full version of an ICML 2011 paper
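To make the mean-variance objective concrete, the sketch below computes the exact mean and variance of the cumulative reward in a tiny finite-horizon MDP by propagating the joint distribution of (state, return-so-far). The MDP and its "safe" versus "risky" actions are hypothetical, chosen only to show why a variance constraint can change which policy is preferred; this is not the paper's algorithm.

```python
# Toy finite-horizon MDP (hypothetical numbers, not from the paper).
# P[s][a] is a list of (probability, next_state, reward) transitions.
P = {
    0: {0: [(1.0, 0, 1.0)],                      # "safe": reward 1, stay in 0
        1: [(0.5, 1, 4.0), (0.5, 0, 0.0)]},     # "risky": reward 4 w.p. 1/2
    1: {0: [(1.0, 1, 1.0)],
        1: [(1.0, 1, 1.0)]},
}

def mean_var(policy, horizon, s0=0):
    """Exact mean and variance of the cumulative reward under a
    (time, state) -> action policy, by distribution propagation."""
    dist = {(s0, 0.0): 1.0}                      # (state, return so far) -> prob
    for t in range(horizon):
        nxt = {}
        for (s, ret), p in dist.items():
            for q, s2, r in P[s][policy(t, s)]:
                key = (s2, ret + r)
                nxt[key] = nxt.get(key, 0.0) + p * q
        dist = nxt
    mean = sum(p * ret for (_, ret), p in dist.items())
    var = sum(p * (ret - mean) ** 2 for (_, ret), p in dist.items())
    return mean, var

# The risky action has the higher mean but nonzero variance, so which policy
# is optimal depends on the variance constraint:
print(mean_var(lambda t, s: 0, horizon=1))       # safe policy
print(mean_var(lambda t, s: 1, horizon=1))       # risky policy
```

Here the safe policy attains mean 1 with variance 0 while the risky one attains mean 2 with variance 4, so a variance cap below 4 flips the optimal choice.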
A Geometric Proof of Calibration
We provide yet another proof of the existence of calibrated forecasters; it
has two merits. First, it is valid for an arbitrary finite number of outcomes.
Second, it is short and simple: it follows from a direct application of
Blackwell's approachability theorem to a carefully chosen vector-valued payoff
function and convex target set. Our proof captures the essence of existing
proofs based on approachability (e.g., the proof by Foster, 1999 in the case of
binary outcomes) and highlights the intrinsic connection between
approachability and calibration.
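To make the calibration criterion concrete in the binary-outcome case the abstract mentions, the sketch below computes a simple L1 calibration score: for each distinct forecast value p, the gap between p and the empirical outcome frequency on rounds where p was announced, weighted by how often p was used. This illustrates the property whose existence the paper proves; it is not the paper's approachability-based construction.

```python
from collections import defaultdict

def calibration_error(forecasts, outcomes):
    """L1 calibration score: weighted gap between each forecast value p and
    the empirical frequency of the outcome on rounds where p was announced."""
    buckets = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        buckets[p].append(y)
    n = len(forecasts)
    return sum(len(ys) / n * abs(p - sum(ys) / len(ys))
               for p, ys in buckets.items())

# Forecasting 0.5 forever is calibrated against alternating binary outcomes,
# while always forecasting 1.0 is not:
print(calibration_error([0.5] * 4, [0, 1, 0, 1]))   # 0.0
print(calibration_error([1.0] * 4, [0, 1, 0, 1]))   # 0.5
```

A calibrated forecaster drives this score to zero asymptotically against every outcome sequence, which is exactly the guarantee the approachability argument delivers.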