Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits
We derive an algorithm that achieves the optimal (within constants)
pseudo-regret in both adversarial and stochastic multi-armed bandits without
prior knowledge of the regime and time horizon. The algorithm is based on
online mirror descent (OMD) with Tsallis entropy regularization with power
$\alpha = 1/2$ and reduced-variance loss estimators. More generally, we define an
adversarial regime with a self-bounding constraint, which includes stochastic
regime, stochastically constrained adversarial regime (Wei and Luo), and
stochastic regime with adversarial corruptions (Lykouris et al.) as special
cases, and show that the algorithm achieves logarithmic regret guarantee in
this regime and all of its special cases simultaneously with the adversarial
regret guarantee. The algorithm also achieves adversarial and stochastic
optimality in the utility-based dueling bandit setting. We provide empirical
evaluation of the algorithm demonstrating that it significantly outperforms
UCB1 and EXP3 in stochastic environments. We also provide examples of
adversarial environments, where UCB1 and Thompson Sampling exhibit almost
linear regret, whereas our algorithm suffers only logarithmic regret. To the
best of our knowledge, this is the first example demonstrating vulnerability of
Thompson Sampling in adversarial environments. Last, but not least, we present
a general stochastic analysis and a general adversarial analysis of OMD
algorithms with Tsallis entropy regularization for $\alpha \in [0,1]$ and explain
the reason why $\alpha = 1/2$ works best.
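As a rough illustration of the OMD update with Tsallis entropy at $\alpha = 1/2$, the sketch below computes the sampling weights $w_i = 4 / (\eta (\hat{L}_i - x))^2$, where the normalizer $x$ is found by bisection so that the weights sum to one. Several choices here are assumptions made for the example rather than details taken from the abstract: the learning rate $\eta_t = 1/\sqrt{t}$, the plain importance-weighted loss estimator (used in place of the reduced-variance estimator the abstract mentions), the Bernoulli loss environment, and the function names `tsallis_weights` and `run_tsallis_inf`.

```python
import math
import random


def tsallis_weights(cum_losses, eta, iters=60):
    """Weights w_i = 4 / (eta * (L_i - x))^2 with x chosen so they sum to 1."""
    k = len(cum_losses)
    smallest = min(cum_losses)
    lo = smallest - 2.0 * math.sqrt(k) / eta  # every weight <= 1/k, so sum <= 1
    hi = smallest - 2.0 / eta                 # best arm has weight 1, so sum >= 1
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = sum(4.0 / (eta * (l - mid)) ** 2 for l in cum_losses)
        if total > 1.0:
            hi = mid  # weights too large: move the normalizer down
        else:
            lo = mid
    x = 0.5 * (lo + hi)
    w = [4.0 / (eta * (l - x)) ** 2 for l in cum_losses]
    s = sum(w)
    return [wi / s for wi in w]  # remove residual bisection error


def run_tsallis_inf(loss_means, horizon, seed=0):
    """Play Bernoulli-loss arms; returns how often each arm was pulled."""
    rng = random.Random(seed)
    k = len(loss_means)
    cum_est = [0.0] * k  # cumulative importance-weighted loss estimates
    pulls = [0] * k
    for t in range(1, horizon + 1):
        eta = 1.0 / math.sqrt(t)  # assumed learning-rate schedule
        w = tsallis_weights(cum_est, eta)
        arm = rng.choices(range(k), weights=w)[0]
        loss = 1.0 if rng.random() < loss_means[arm] else 0.0
        cum_est[arm] += loss / w[arm]  # unbiased estimate for the played arm
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    print(run_tsallis_inf([0.2, 0.8], horizon=2000))
```

No horizon or regime parameter enters the weight computation, which is consistent with the abstract's claim that the algorithm needs no prior knowledge of either.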
Fighting Bandits with a New Kind of Smoothness
We define a novel family of algorithms for the adversarial multi-armed bandit
problem, and provide a simple analysis technique based on convex smoothing. We
prove two main results. First, we show that regularization via the
\emph{Tsallis entropy}, which includes EXP3 as a special case, achieves the
$\Theta(\sqrt{TN})$ minimax regret. Second, we show that a wide class of
perturbation methods achieve a near-optimal regret as low as $O(\sqrt{TN \log N})$ if the perturbation distribution has a bounded hazard rate. For example,
the Gumbel, Weibull, Frechet, Pareto, and Gamma distributions all satisfy this
key property. Comment: In Proceedings of NIPS, 201
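One way to see the link between perturbation methods and EXP3 is the Gumbel-max identity: adding independent standard Gumbel noise to $-\eta \hat{L}_i$ and taking the argmax selects arm $i$ with probability proportional to $\exp(-\eta \hat{L}_i)$, which is the exponential-weights distribution. The sketch below checks this numerically. The loss values, the step size `eta`, and the helper names are hypothetical choices for illustration, not taken from the paper.

```python
import math
import random


def gumbel(rng):
    """Standard Gumbel sample, drawn as -log(E) with E ~ Exponential(1)."""
    e = max(rng.expovariate(1.0), 1e-300)  # guard against log(0)
    return -math.log(e)


def perturbed_leader(cum_losses, eta, rng):
    """Pick the arm maximizing -eta * loss + Gumbel noise."""
    scores = [-eta * l + gumbel(rng) for l in cum_losses]
    return max(range(len(scores)), key=scores.__getitem__)


def exp_weights(cum_losses, eta):
    """Exponential-weights (softmax) distribution over arms."""
    m = min(cum_losses)
    z = [math.exp(-eta * (l - m)) for l in cum_losses]  # shift for stability
    s = sum(z)
    return [zi / s for zi in z]


def empirical_frequencies(cum_losses, eta, samples, seed=0):
    rng = random.Random(seed)
    counts = [0] * len(cum_losses)
    for _ in range(samples):
        counts[perturbed_leader(cum_losses, eta, rng)] += 1
    return [c / samples for c in counts]


if __name__ == "__main__":
    losses = [1.0, 2.0, 4.0]  # hypothetical cumulative loss estimates
    print(empirical_frequencies(losses, eta=0.5, samples=20000))
    print(exp_weights(losses, eta=0.5))
```

The two printed vectors should agree up to sampling noise.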
Statistical Mechanics and Information-Theoretic Perspectives on Complexity in the Earth System
Peer reviewed. Publisher PD
Classification of partial discharge signals by combining adaptive local iterative filtering and entropy features
Electro-Magnetic Interference (EMI) is a measurement technique for Partial Discharge (PD) signals, which arise in operating electrical machines, generators and other auxiliary equipment due to insulation degradation. Assessment of PD can help to reduce machine downtime and avoid high replacement and maintenance costs. EMI signals can be complex to analyze due to their nonstationary nature. In this paper, a software condition-monitoring model is presented and a novel feature extraction technique, suitable for nonstationary EMI signals, is developed. This method maps signals from multiple discharge sources, including PD, from the time domain to a feature space, which aids interpretation of subsequent fault information. Results show excellent performance in classifying the different discharge sources.