Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits

Author: Seldin Yevgeny
Zimmert Julian
Publication date: 23/03/2020

We derive an algorithm that achieves the optimal (within constants) pseudo-regret in both adversarial and stochastic multi-armed bandits without prior knowledge of the regime and time horizon. The algorithm is based on online mirror descent (OMD) with Tsallis entropy regularization with power $\alpha=1/2$ and reduced-variance loss estimators. More generally, we define an adversarial regime with a self-bounding constraint, which includes stochastic regime, stochastically constrained adversarial regime (Wei and Luo), and stochastic regime with adversarial corruptions (Lykouris et al.) as special cases, and show that the algorithm achieves logarithmic regret guarantee in this regime and all of its special cases simultaneously with the adversarial regret guarantee. The algorithm also achieves adversarial and stochastic optimality in the utility-based dueling bandit setting. We provide empirical evaluation of the algorithm demonstrating that it significantly outperforms UCB1 and EXP3 in stochastic environments. We also provide examples of adversarial environments, where UCB1 and Thompson Sampling exhibit almost linear regret, whereas our algorithm suffers only logarithmic regret. To the best of our knowledge, this is the first example demonstrating vulnerability of Thompson Sampling in adversarial environments. Last, but not least, we present a general stochastic analysis and a general adversarial analysis of OMD algorithms with Tsallis entropy regularization for $\alpha\in[0,1]$ and explain the reason why $\alpha=1/2$ works best.
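As a rough illustration of the kind of update this abstract describes, the sketch below plays a bandit with weights obtained from a 1/2-Tsallis-entropy regularizer. It is a minimal sketch under stated assumptions, not the authors' implementation: it uses plain importance-weighted loss estimates instead of the reduced-variance estimators mentioned above, finds the normalization constant by bisection, and assumes a learning rate of 2/sqrt(t). The names `tsallis_weights`, `tsallis_inf`, and `loss_fn` are hypothetical.

```python
import math
import random


def tsallis_weights(cum_losses, eta, iters=60):
    """Weights w_i = 4 / (eta * (L_i - x))^2, with x chosen so they sum to 1.

    This is the minimizer of <w, L> - (4 / eta) * sum_i sqrt(w_i) over the
    probability simplex. The constant x is located by bisection (assumption).
    """
    k = len(cum_losses)
    m = min(cum_losses)
    lo = m - 2.0 * math.sqrt(k) / eta   # here the weights sum to at most 1
    hi = m - 2.0 / eta                  # here the weights sum to at least 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = sum(4.0 / (eta * (l - mid)) ** 2 for l in cum_losses)
        if total > 1.0:
            hi = mid
        else:
            lo = mid
    x = 0.5 * (lo + hi)
    w = [4.0 / (eta * (l - x)) ** 2 for l in cum_losses]
    s = sum(w)
    return [wi / s for wi in w]         # remove residual bisection error


def tsallis_inf(loss_fn, k, horizon, seed=0):
    """Run the sketch for `horizon` rounds and return per-arm pull counts.

    `loss_fn(t, arm)` is a hypothetical callback returning a loss in [0, 1].
    """
    rng = random.Random(seed)
    cum = [0.0] * k                     # importance-weighted cumulative losses
    pulls = [0] * k
    for t in range(1, horizon + 1):
        eta = 2.0 / math.sqrt(t)        # assumed learning-rate schedule
        w = tsallis_weights(cum, eta)
        arm = rng.choices(range(k), weights=w)[0]
        loss = loss_fn(t, arm)
        cum[arm] += loss / w[arm]       # plain importance-weighted estimate
        pulls[arm] += 1
    return pulls
```

For example, `tsallis_inf(lambda t, a: 0.0 if a == 0 else 1.0, k=2, horizon=2000)` should concentrate most pulls on arm 0, since that arm never incurs a loss.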

arXiv.org e-Print Archive

Copenhagen University Research Information System

Fighting Bandits with a New Kind of Smoothness

Author: Abernethy Jacob
Lee Chansoo
Tewari Ambuj
Publication date: 13/12/2015

We define a novel family of algorithms for the adversarial multi-armed bandit problem, and provide a simple analysis technique based on convex smoothing. We prove two main results. First, we show that regularization via the Tsallis entropy, which includes EXP3 as a special case, achieves the $\Theta(\sqrt{TN})$ minimax regret. Second, we show that a wide class of perturbation methods achieve a near-optimal regret as low as $O(\sqrt{TN \log N})$ if the perturbation distribution has a bounded hazard rate. For example, the Gumbel, Weibull, Frechet, Pareto, and Gamma distributions all satisfy this key property. Comment: In Proceedings of NIPS, 201
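To make the perturbation idea concrete for the Gumbel case, the sketch below picks the arm that maximizes a negated, scaled cumulative loss plus independent standard Gumbel noise. By the well-known Gumbel-max trick, this choice follows the same distribution as exponential weights over the cumulative losses. This is an illustrative sketch only, not the paper's algorithm or analysis; the names `gumbel_ftpl_choice` and `softmax_neg` and the scale parameter `eta` are hypothetical.

```python
import math
import random


def softmax_neg(cum_losses, eta):
    """Exponential-weights probabilities proportional to exp(-eta * L_i)."""
    m = min(cum_losses)                 # shift for numerical stability
    e = [math.exp(-eta * (l - m)) for l in cum_losses]
    s = sum(e)
    return [v / s for v in e]


def gumbel_ftpl_choice(cum_losses, eta, rng):
    """Return argmax_i (-eta * L_i + G_i) with G_i i.i.d. standard Gumbel."""
    best, best_val = 0, -math.inf
    for i, l in enumerate(cum_losses):
        # If E ~ Exp(1), then -log(E) is standard Gumbel; guard against E == 0.
        e = max(rng.expovariate(1.0), 1e-300)
        val = -eta * l - math.log(e)
        if val > best_val:
            best, best_val = i, val
    return best
```

Drawing many samples with `gumbel_ftpl_choice` and comparing the empirical frequencies against `softmax_neg` is a quick way to check the equivalence numerically.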

arXiv.org e-Print Archive

CiteSeerX

Statistical Mechanics and Information-Theoretic Perspectives on Complexity in the Earth System

Author: Balasis Georgios
Daglis Ioannis A.
Donner Reik V.
Eftaxias Konstantinos
Kurths Juergen
Papadimitriou Constantinos
Potirakis Stelios M.
Runge Jakob
Publication venue: 'MDPI AG'
Publication date: 01/01/2013

Peer reviewed. Publisher PD

Multidisciplinary Digital Publishing Institute

Aberdeen University Research

Directory of Open Access Journals

Repositorium für Naturwissenschaften und Technik

MPG.PuRe

Classification of partial discharge signals by combining adaptive local iterative filtering and entropy features

Author: Alan Nesbitt
Bishop
Brian Stewart
Gordon Morison
Imene Mitiche
Michael Hughes-Narborough
Philip Boreham
Rathie
Vapnik
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2017

Electro-Magnetic Interference (EMI) is a measurement technique for Partial Discharge (PD) signals, which arise in operating electrical machines, generators and other auxiliary equipment due to insulation degradation. Assessment of PD can help to reduce machine downtime and circumvent high replacement and maintenance costs. EMI signals can be complex to analyze due to their nonstationary nature. In this paper, a software condition-monitoring model is presented and a novel feature extraction technique, suitable for nonstationary EMI signals, is developed. This method maps signals from multiple discharge sources, including PD, from the time domain to a feature space, which aids interpretation of subsequent fault information. Results show excellent performance in classifying the different discharge sources.

Multidisciplinary Digital Publishing Institute

Crossref

University of Strathclyde Institutional Repository

Directory of Open Access Journals

ResearchOnline@GCU