Strategic Choices in Optimization

Chou, Cheng-Wei; Chou, Ping-Chiang; Christophe, Jean-Joseph; Couetoux, Adrien; De Freminville, Pierre; Galichet, Nicolas; Lee, Chang-Shing; Liu, Jialin; Sebag, Michèle; St-Pierre, David L.; Teytaud, Olivier; Wang, Mei-Hui; Wu, Li-Wen; Yen, Shi-Jim

Authors: Cheng-Wei Chou
Ping-Chiang Chou
Jean-Joseph Christophe
Adrien Couetoux
Pierre De Freminville
Nicolas Galichet
Chang-Shing Lee
Jialin Liu
Michèle Sebag
David L. St-Pierre
Olivier Teytaud
Mei-Hui Wang
Li-Wen Wu
Shi-Jim Yen
Publication date: 1 January 2013
Publisher: 'Institute of Statistical Science'

Abstract

Many decision problems have two levels: one for strategic decisions and another for tactical management. This paper focuses on the strategic level, more specifically on the sequential exploration of the possible options and the final selection (recommendation) of the best option. Several sequential exploration and recommendation criteria are considered and empirically compared on real-world problems (board games, card games and energy management problems) in the uniform (1-player) and adversarial (2-player) settings. For the sequential exploration part, the classical upper confidence bound algorithm, the exponential exploration-exploitation algorithm, the successive reject algorithm (designed specifically for simple regret), and Bernstein races are considered. For the recommendation part, the selection is based on the empirically best arm, the most played arm, lower confidence bounds, or rules based on the reward distribution or variants thereof designed for risk control. This paper presents a systematic study comparing the couplings of the sequential exploration and recommendation variants on the considered problems in terms of their simple regret. A secondary contribution is that, to the best of our knowledge, this is the first win ever of a computer-kill-all Go player against professional human players.
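
To make the split between exploration and recommendation concrete, the following is a minimal sketch and not the authors' implementation. It assumes Bernoulli rewards, a UCB1-style index with a hypothetical exploration constant `c`, and illustrative function names (`ucb_explore`, `recommend_empirical_best`, `recommend_most_played`, `recommend_lcb`, `simple_regret`) that do not come from the paper. It covers only one exploration rule and three of the recommendation rules named in the abstract.

```python
import math
import random


def ucb_explore(means, budget, rng, c=2.0):
    """Explore Bernoulli arms for `budget` pulls using a UCB1-style index.

    `means` are the true success probabilities, known only to the simulator.
    `c` is an assumed exploration constant, not a value taken from the paper.
    """
    k = len(means)
    if budget < k:
        raise ValueError("budget must allow at least one pull per arm")
    counts = [0] * k
    sums = [0.0] * k
    for t in range(budget):
        if t < k:
            arm = t  # initialization: pull every arm once
        else:
            # pick the arm with the highest upper confidence bound
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(c * math.log(t + 1) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts, sums


def recommend_empirical_best(counts, sums):
    """Recommend the arm with the highest empirical mean reward."""
    return max(range(len(counts)), key=lambda i: sums[i] / counts[i])


def recommend_most_played(counts, sums):
    """Recommend the arm that was pulled most often."""
    return max(range(len(counts)), key=lambda i: counts[i])


def recommend_lcb(counts, sums, c=2.0):
    """Recommend the arm with the highest lower confidence bound."""
    total = sum(counts)
    return max(
        range(len(counts)),
        key=lambda i: sums[i] / counts[i]
        - math.sqrt(c * math.log(total) / counts[i]),
    )


def simple_regret(means, arm):
    """Gap between the best true mean and the recommended arm's true mean."""
    return max(means) - means[arm]


if __name__ == "__main__":
    true_means = [0.2, 0.5, 0.8]  # illustrative values only
    counts, sums = ucb_explore(true_means, budget=500, rng=random.Random(0))
    for rule in (recommend_empirical_best, recommend_most_played, recommend_lcb):
        arm = rule(counts, sums)
        print(rule.__name__, arm, simple_regret(true_means, arm))
```

Simple regret is measured only on the final recommendation, so the same exploration statistics (`counts`, `sums`) can be passed to every recommendation rule and the resulting regrets compared directly.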