A Neural Networks Committee for the Contextual Bandit Problem

Author: D.E. Rumelhart
E. Kaufmann
G. Tesauro
K. Hornik
L. Bottou
L. Kocsis
P. Auer
R. Feraud
S.M. Kakade
T.L. Lai
W. Thompson
Publication date: 01/01/2014

This paper presents a new contextual bandit algorithm, NeuralBandit, which requires no stationarity assumption on contexts or rewards. Several neural networks are trained to model the value of rewards given the context. Two variants, based on a multi-expert approach, are proposed to choose the parameters of the multi-layer perceptrons online. The proposed algorithms are successfully tested on a large dataset with and without stationarity of rewards. Comment: 21st International Conference on Neural Information Processin

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

Concurrent bandits and cognitive radio networks

Author: A. Anandkumar
D. Niyato
E. Even-Dar
J. Mitola
K. Liu
L. Lai
N. Nie
P. Auer
X. Fang
X. Li
Publication date: 01/01/2014

We consider the problem of multiple users targeting the arms of a single multi-armed stochastic bandit. The motivation for this problem comes from cognitive radio networks, where selfish users need to coexist without any side communication between them, implicit cooperation or common control. Even the number of users may be unknown and can vary as users join or leave the network. We propose an algorithm that combines an $\epsilon$-greedy learning rule with a collision avoidance mechanism. We analyze its regret with respect to the system-wide optimum and show that sub-linear regret can be obtained in this setting. Experiments show dramatic improvement compared to other algorithms for this setting.

arXiv.org e-Print Archive

Crossref

Bootstrapping Monte Carlo Tree Search with an Imperfect Heuristic

Author: G. Chaslot
L. Kocsis
M. Kearns
P. Auer
R. Bellman
R. Coulom
S. Gelly
Publication date: 01/01/2012

We consider the problem of using a heuristic policy to improve the value approximation by the Upper Confidence Bound applied in Trees (UCT) algorithm in non-adversarial settings such as planning with large state-space Markov Decision Processes. Current improvements to UCT focus on changing either the action selection formula at the internal nodes or the rollout policy at the leaf nodes of the search tree. In this work, we propose to add an auxiliary arm to each of the internal nodes, and always use the heuristic policy to roll out simulations at the auxiliary arms. The method aims for fast convergence to optimal values at states where the heuristic policy is optimal, while retaining an approximation similar to that of the original UCT in other states. We show that bootstrapping with the proposed method in the new algorithm, UCT-Aux, performs better than the original UCT algorithm and its variants in two benchmark experiment settings. We also examine conditions under which UCT-Aux works well. Comment: 16 pages, accepted for presentation at ECML'1

arXiv.org e-Print Archive

Crossref

Institutional Knowledge at Singapore Management University

ScholarBank@NUS

Pilot, Rollout and Monte Carlo Tree Search Methods for Job Shop Scheduling

Author: C. Duin
D. Bertsekas
E. Taillard
L. Kocsis
L. Xu
M.J. Streeter
P. Auer
P. Rolet
S. Panwalkar
S. Voß
T. Lai
Á. Fialho
Publication date: 01/01/2012

Greedy heuristics may be refined by looking ahead for each possible choice, in an approach called the rollout or Pilot method. These methods may be seen as meta-heuristics that can enhance (any) heuristic solution by repeatedly modifying a master solution: similarly to what is done in game tree search, better choices are identified using lookahead, based on solutions obtained by repeatedly using a greedy heuristic. This paper first illustrates how the Pilot method improves upon some simple, well-known dispatch heuristics for the job-shop scheduling problem. The Pilot method is then shown to be a special case of the more recent Monte Carlo Tree Search (MCTS) methods: unlike the Pilot method, MCTS methods use random completion of partial solutions to identify promising branches of the tree. The Pilot method and a simple version of MCTS, using the $\varepsilon$-greedy exploration paradigms, are then compared within the same framework, consisting of 300 scheduling problems of varying sizes with a fixed budget of rollouts. Results demonstrate that MCTS reaches results as good as or better than those of the Pilot methods in this context. Comment: Learning and Intelligent OptimizatioN (LION'6) 7219 (2012

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

HAL-Rennes 1

Polarization observables in high-energy deuteron photodisintegration within the Quark-Gluon Strings Model

Author: Adamian
Auer
Barannik
Bochna
Brodsky
E. De Sanctis
F. Ronchetti
Grishina
Kaidalov
L. A. Kondratyuk
M. Mirazita
Nagornyi
P. Rossi
Schulte
V. Yu Grishina
W. Cassing
Wijesooriya
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/09/2002

Deuteron two-body photodisintegration is analysed within the framework of the Quark-Gluon Strings Model. The model describes fairly well the recent experimental data from TJNAF in the few GeV region. Angular distributions at different $\gamma$-energies are presented and the effect of a forward-backward asymmetry is discussed. New results from the QGSM for polarization observables from 1.5 to 6 GeV are presented and compared with the available data. Comment: 3 pages, LaTeX, 4 postscript figures; contribution to QNP2002, Juelich, June 10-14, 200

arXiv.org e-Print Archive

Crossref

EDP Sciences OAI-PMH repository (1.2.0)

CERN Document Server

Hybridizing Constraint Programming and Monte-Carlo Tree Search: Application to the Job Shop problem

Author: J-P Watson
JC Beck
L Kocsis
M Luby
P Auer
R Mathon
S Gelly
TP Runarsson
Publication venue: Springer Verlag
Publication date: 07/01/2013

Constraint Programming (CP) solvers classically explore the solution space using tree search-based heuristics. Monte-Carlo Tree-Search (MCTS), a tree-search based method aimed at sequential decision making under uncertainty, simultaneously estimates the reward associated with the sub-trees, and gradually biases the exploration toward the most promising regions. This paper examines the tight combination of MCTS and CP on the job shop problem (JSP). The contribution is twofold. Firstly, a reward function compliant with the CP setting is proposed. Secondly, a biased MCTS node-selection rule based on this reward is proposed that is suitable in a multiple-restarts context. Its integration within the Gecode constraint solver is shown to compete with JSP-specific CP approaches on difficult JSP instances.

HAL-CentraleSupelec

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

HAL-Polytechnique

HAL-Rennes 1

Efeito do tipo de minijardim e da época de plantio sobre o desenvolvimento de estacas de Pinus radiata.

Author: AUER C. G.
BARROS L. T. S.
HIGA A. R.
RABELO P. R.
SCHULTZ B.
Publication date: 01/01/2014

Repository Open Access to Scientific Information from Embrapa

RCAAP - Repositório Científico de Acesso Aberto de Portugal

Thermodynamics and magnetic field profiles in low-kappa type-II superconductors

Author: A. Kung
E. H. Brandt
F. Mohamed
I. Luk’yanchuk
J. Auer
J. M. Delrieu
K. Machida
L. Kramer
M. Ichioka
N. Schopohl
P. Miranović
U. Klein
Publication venue: 'American Physical Society (APS)'
Publication date: 29/09/2002

Two-dimensional low-kappa type-II superconductors are studied numerically within the Eilenberger equations of superconductivity. Depending on the Ginzburg-Landau parameter $\kappa=\lambda/\xi$, the vortex-vortex interaction can be attractive or purely repulsive. The sign of the interaction is manifested as a first (second) order phase transition from the Meissner to the mixed state. The temperature and field dependence of the magnetic field distribution in low-kappa type-II superconductors with attractive intervortex interaction is calculated. Theoretical results are compared to the experiment. Comment: 4 pages, 3 figure

arXiv.org e-Print Archive

Crossref

Puccinia psidii X Eucalyptus benthamii: patogenicidade e efeito do aumento da concentração de CO2 do ar em sala climatizada.

Author: AUER C. G.
BETTIOL W.
GHINI R.
SANTOS A. F. dos
VIEIRA A. L. P. A.
Publication date: 17/01/2013

Repository Open Access to Scientific Information from Embrapa