5 research outputs found

Min Max Generalization for Two-stage Deterministic Batch Mode Reinforcement Learning: Relaxation Schemes

Author: Boigelot Bernard
Ernst Damien
Fonteneau Raphael
Louveaux Quentin
Publication date: 01/01/2012

We study the minmax optimization problem introduced in [22] for computing policies for batch mode reinforcement learning in a deterministic setting. First, we show that this problem is NP-hard. In the two-stage case, we provide two relaxation schemes. The first works by dropping some constraints to obtain a problem that is solvable in polynomial time. The second, based on a Lagrangian relaxation where all constraints are dualized, leads to a conic quadratic programming problem. We also prove theoretically and illustrate empirically that both relaxation schemes provide better results than those given in [22].

arXiv.org e-Print Archive

Open Repository and Bibliography - Liège

A survey of time consistency of dynamic risk measures and dynamic performance measures in discrete time: LM-measure perspective

Author: Bielecki Tomasz R.
Cialenco Igor
Pitera Marcin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

In this work we give a comprehensive overview of the time consistency property of dynamic risk and performance measures, focusing on the discrete-time setup. The two key operational concepts used throughout are the notion of the LM-measure and the notion of the update rule, which, we believe, are the key tools for studying time consistency in a unified framework.

arXiv.org e-Print Archive

Springer - Publisher Connector

Jagiellonian University Repository

A survey of time consistency of dynamic risk measures and dynamic performance measures in discrete time: LM-measure perspective

Author: A Cherny
A Hamel
A Jobert
A Ruszczyński
A Shapiro
B Acciaio
B Roorda
D Filipovic
DA Iancu
DM Kreps
E Rosazza Gianin
EN Barron
F Coquet
F Delbaen
F Riedel
G Scandolo
G Szegö
H Föllmer
H Geman
HU Gerber
I Gilboa
I Penner
J Bion-Nadal
K Boda
K Detlefsen
L Jiang
LG Epstein
M Frittelli
M Kaina
M Kupper
M Nutz
M Stadje
MJ Goovaerts
P Artzner
P Barrieu
P Carpentier
P Cheridito
R Sircar
RJ Elliott
S Biagini
S Drapeau
S Klöppel
S Peng
S Tutsch
S Weber
SN Cohen
T Wang
T Zariphopoulou
TC Koopmans
TR Bielecki
V Fasen
Z Feinstein
Publication venue: 'Springer Science and Business Media LLC'

Crossref