110 research outputs found
Scalable Methods for Adaptively Seeding a Social Network
In recent years, social networking platforms have developed into
extraordinary channels for spreading and consuming information. Along with the
rise of such infrastructure, there is continuous progress on techniques for
spreading information effectively through influential users. In many
applications, one is restricted to select influencers from a set of users who
engaged with the topic being promoted, and due to the structure of social
networks, these users often rank low in terms of their influence potential. An
alternative approach one can consider is an adaptive method which selects users
in a manner which targets their influential neighbors. The advantage of such an
approach is that it leverages the friendship paradox in social networks: while
users are often not influential, they often know someone who is.
Despite the various complexities in such optimization problems, we show that
scalable adaptive seeding is achievable. In particular, we develop algorithms
for linear influence models with provable approximation guarantees that can be
gracefully parallelized. To show the effectiveness of our methods, we collected
data from various verticals that social network users follow. For each vertical, we
collected data on the users who responded to a certain post as well as their
neighbors, and applied our methods on this data. Our experiments show that
adaptive seeding is scalable, and importantly, that it obtains dramatic
improvements over standard approaches to information dissemination.
Comment: Full version of the paper appearing in WWW 201
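The two-stage idea behind adaptive seeding can be prototyped in a few lines. Below is a minimal, hypothetical sketch (not the paper's algorithm): influence is assumed to be a precomputed per-user score, half the budget invites engaged users whose neighborhoods look promising, and the remainder seeds the most influential neighbors that surface — exploiting the friendship paradox noted above.

```python
def adaptive_seed(engaged_users, neighbors, influence, budget):
    """Two-stage adaptive seeding sketch (all names are illustrative).

    Stage 1: spend part of the budget inviting engaged users, chosen so
    that their neighborhoods contain high-influence nodes.
    Stage 2: spend the rest on the influential neighbors who appear.
    """
    # Score each engaged user by the best influence reachable via a neighbor.
    scored = sorted(
        engaged_users,
        key=lambda u: max((influence[v] for v in neighbors[u]), default=0),
        reverse=True)
    stage1 = scored[:budget // 2]          # users we invite first

    # Among all neighbors of the invited users, seed the most influential.
    candidates = {v for u in stage1 for v in neighbors[u]}
    stage2 = sorted(candidates, key=lambda v: influence[v],
                    reverse=True)[:budget - len(stage1)]
    return stage1, stage2
```

In a real deployment the influence scores would come from the linear influence model being optimized; here they are a stand-in.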
Answer Set Programming for Non-Stationary Markov Decision Processes
Non-stationary domains, where unforeseen changes happen, present a challenge
for agents to find an optimal policy for a sequential decision making problem.
This work investigates a solution to this problem that combines Markov Decision
Processes (MDP) and Reinforcement Learning (RL) with Answer Set Programming
(ASP) in a method we call ASP(RL). In this method, Answer Set Programming is
used to find the possible trajectories of an MDP, from where Reinforcement
Learning is applied to learn the optimal policy of the problem. Results show
that ASP(RL) is capable of efficiently finding the optimal solution of an MDP
representing non-stationary domains.
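The division of labor described above can be caricatured in a few lines: a declarative validity predicate stands in for the trajectories an ASP solver would compute, and tabular Q-learning runs only over the pruned state-action space. All names here are illustrative, not the paper's API.

```python
import random

def asp_rl_sketch(states, actions, step, valid, episodes=300,
                  alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Loose sketch of the ASP(RL) idea: `valid(s, a)` prunes the space
    (playing the role of ASP), then Q-learning learns over what remains."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in states for a in actions if valid(s, a)}

    def allowed(s):
        return [a for a in actions if (s, a) in Q]

    for _ in range(episodes):
        s = states[0]                      # assume a fixed start state
        for _ in range(50):                # episode length cap
            acts = allowed(s)
            if not acts:
                break
            if rng.random() < eps:
                a = rng.choice(acts)       # explore
            else:
                a = max(acts, key=lambda x: Q[(s, x)])  # exploit
            s2, r, done = step(s, a)       # environment transition
            nxt = max((Q[(s2, b)] for b in allowed(s2)), default=0.0)
            Q[(s, a)] += alpha * (r + gamma * (0.0 if done else nxt) - Q[(s, a)])
            if done:
                break
            s = s2
    return Q
```

Re-running the ASP stage after an unforeseen domain change would simply yield a new `valid` predicate and a fresh pruned table.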
Concurrent bandits and cognitive radio networks
We consider the problem of multiple users targeting the arms of a single
multi-armed stochastic bandit. The motivation for this problem comes from
cognitive radio networks, where selfish users need to coexist without any side
communication between them, implicit cooperation or common control. Even the
number of users may be unknown and can vary as users join or leave the network.
We propose an algorithm that combines an ε-greedy learning rule with a
collision avoidance mechanism. We analyze its regret with respect to the
system-wide optimum and show that sub-linear regret can be obtained in this
setting. Experiments show dramatic improvement compared to other algorithms for
this setting.
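One way to picture the combination is the toy simulation below: each user runs its own ε-greedy rule with no side communication, and a user involved in a collision gets zero reward and re-targets a uniformly random arm in the next round. This is an illustrative avoidance rule; the paper's actual mechanism differs in detail.

```python
import random

def concurrent_eps_greedy(n_users, arm_means, rounds, eps=0.1, seed=0):
    """Multiple users on one Bernoulli bandit with naive collision avoidance."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [[0] * n_arms for _ in range(n_users)]
    values = [[0.0] * n_arms for _ in range(n_users)]
    forced = [None] * n_users        # arm forced by collision avoidance
    total_reward = 0.0

    for _ in range(rounds):
        picks = []
        for u in range(n_users):
            if forced[u] is not None:
                a, forced[u] = forced[u], None
            elif rng.random() < eps:
                a = rng.randrange(n_arms)                           # explore
            else:
                a = max(range(n_arms), key=lambda i: values[u][i])  # exploit
            picks.append(a)
        for u, a in enumerate(picks):
            if picks.count(a) > 1:                 # collision: no reward
                forced[u] = rng.randrange(n_arms)  # re-target next round
                reward = 0.0
            else:
                reward = float(rng.random() < arm_means[a])  # Bernoulli draw
            counts[u][a] += 1
            values[u][a] += (reward - values[u][a]) / counts[u][a]
            total_reward += reward
    return total_reward
```

Because collided users scatter at random, they tend to settle on distinct arms over time without any explicit coordination.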
Rollout Sampling Approximate Policy Iteration
Several researchers have recently investigated the connection between
reinforcement learning and classification. We are motivated by proposals of
approximate policy iteration schemes without value functions which focus on
policy representation using classifiers and address policy learning as a
supervised learning problem. This paper proposes variants of an improved policy
iteration scheme which addresses the core sampling problem in evaluating a
policy through simulation as a multi-armed bandit machine. The resulting
algorithm offers performance comparable to that of the previous algorithm,
achieved, however, with significantly less computational effort. An order-of-magnitude
improvement is demonstrated experimentally in two standard reinforcement
learning domains: inverted pendulum and mountain-car.
Comment: 18 pages, 2 figures, to appear in Machine Learning 72(3). Presented
at EWRL08, to be presented at ECML 200
Bayesian reinforcement learning with exploration
We consider a general reinforcement learning problem and
show that carefully combining the Bayesian optimal policy and an exploring
policy leads to minimax sample-complexity bounds in a very general
class of (history-based) environments. We also prove lower bounds
and show that the new algorithm displays adaptive behaviour when the
environment is easier than worst-case.
Sequential decision making with vector outcomes
We study a multi-round optimization setting in which in each round a player may select one of several actions, and each action produces an outcome vector, not observable to the player until the round ends. The final payoff for the player is computed by applying some known function f to the sum of all outcome vectors (e.g., the minimum of all coordinates of the sum). We show that standard notions of performance measure (such as comparison to the best single action) used in related expert and bandit settings (in which the payoff in each round is scalar) are not useful in our vector setting. Instead, we propose a different performance measure, and design algorithms that have vanishing regret with respect to our new measure.
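A two-action toy instance makes it concrete why comparison to the best single action is uninformative when f is the coordinate-wise minimum: either action played alone leaves one coordinate at zero, while mixing grows both.

```python
# Two actions with deterministic outcome vectors; payoff = min coordinate of the sum.
outcomes = {"A": (1, 0), "B": (0, 1)}

def payoff(plays):
    total = [0, 0]
    for a in plays:
        for i, x in enumerate(outcomes[a]):
            total[i] += x
    return min(total)

T = 10
print(payoff(["A"] * T))             # best single action: min coordinate stays 0
print(payoff(["A", "B"] * (T // 2))) # alternating: min coordinate grows to T/2
```

Any benchmark defined against the best single action is therefore trivially met here, which motivates the different performance measure the abstract proposes.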
Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies
Pandemic influenza has the epidemic potential to kill millions of people.
While various preventive measures exist (e.g., vaccination and school
closures), deciding on strategies that lead to their most effective and
efficient use remains challenging. To this end, individual-based
epidemiological models are essential to assist decision makers in determining
the best strategy to curb epidemic spread. However, individual-based models are
computationally intensive and it is therefore pivotal to identify the optimal
strategy using a minimal amount of model evaluations. Additionally, as
epidemiological modeling experiments need to be planned, a computational budget
needs to be specified a priori. Consequently, we present a new sampling
technique to optimize the evaluation of preventive strategies using fixed
budget best-arm identification algorithms. We use epidemiological modeling
theory to derive knowledge about the reward distribution which we exploit using
Bayesian best-arm identification algorithms (i.e., Top-two Thompson sampling
and BayesGap). We evaluate these algorithms in a realistic experimental setting
and demonstrate that it is possible to identify the optimal strategy using only
a limited number of model evaluations, i.e., 2-to-3 times faster compared to
the uniform sampling method, the predominant technique used for epidemiological
decision making in the literature. Finally, we contribute and evaluate a
statistic for Top-two Thompson sampling to inform the decision makers about the
confidence of an arm recommendation.
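For concreteness, here is a generic sketch of Top-two Thompson sampling for Bernoulli rewards with uniform Beta(1, 1) priors. The `pull` interface (one expensive model evaluation returning success/failure) and the bounded-resampling fallback are illustrative choices, not the paper's exact procedure.

```python
import random

def top_two_thompson(pull, n_arms, budget, beta=0.5, seed=0):
    """Fixed-budget best-arm identification via Top-two Thompson sampling.

    `pull(a)` is an assumed simulator interface returning 0 or 1
    (e.g., whether strategy a curbed the simulated epidemic).
    """
    rng = random.Random(seed)
    succ = [1] * n_arms                  # Beta alpha parameters
    fail = [1] * n_arms                  # Beta beta parameters
    for _ in range(budget):
        draw = [rng.betavariate(succ[i], fail[i]) for i in range(n_arms)]
        leader = max(range(n_arms), key=lambda i: draw[i])
        a = leader
        if rng.random() >= beta:         # with prob. 1 - beta, play a challenger
            for _ in range(100):         # bounded resampling
                redraw = [rng.betavariate(succ[i], fail[i]) for i in range(n_arms)]
                a = max(range(n_arms), key=lambda i: redraw[i])
                if a != leader:
                    break
            if a == leader:              # fallback: runner-up of the first draw
                a = max((i for i in range(n_arms) if i != leader),
                        key=lambda i: draw[i])
        r = pull(a)
        succ[a] += r
        fail[a] += 1 - r
    # recommend the arm with the highest posterior mean
    return max(range(n_arms), key=lambda i: succ[i] / (succ[i] + fail[i]))
```

After the budget is exhausted, the Beta posteriors themselves are what a confidence statistic for the recommendation would be computed from.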