554 research outputs found

Adaptive Monte Carlo Multiple Testing via Multi-Armed Bandits

Author: Zhang Martin J.
Zou James
Tse David
Publication date: 01/01/2019

Monte Carlo (MC) permutation test is considered the gold standard for statistical hypothesis testing, especially when standard parametric assumptions are not clear or likely to fail. However, in modern data science settings where a large number of hypothesis tests need to be performed simultaneously, it is rarely used due to its prohibitive computational cost. In genome-wide association studies, for example, the number of hypothesis tests $m$ is around $10^6$ while the number of MC samples $n$ for each test could be greater than $10^8$, totaling more than $nm = 10^{14}$ samples. In this paper, we propose Adaptive MC multiple Testing (AMT) to estimate MC p-values and control false discovery rate in multiple testing. The algorithm outputs the same result as the standard full MC approach with high probability while requiring only $\tilde{O}(\sqrt{n}m)$ samples. This sample complexity is shown to be optimal. On a Parkinson GWAS dataset, the algorithm reduces the running time from 2 months for full MC to an hour. The AMT algorithm is derived based on the theory of multi-armed bandits.

arXiv.org e-Print Archive

Publikationer från KTH

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Sequential Design for Ranking Response Surfaces

Author: Hu Ruimeng
Ludkovski Mike
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 12/07/2016

We propose and analyze sequential design methods for the problem of ranking several response surfaces. Namely, given $L \ge 2$ response surfaces over a continuous input space $\mathcal{X}$, the aim is to efficiently find the index of the minimal response across the entire $\mathcal{X}$. The response surfaces are not known and have to be noisily sampled one-at-a-time. This setting is motivated by stochastic control applications and requires joint experimental design both in space and response-index dimensions. To generate sequential design heuristics we investigate stepwise uncertainty reduction approaches, as well as sampling based on posterior classification complexity. We also make connections between our continuous-input formulation and the discrete framework of pure regret in multi-armed bandits. To model the response surfaces we utilize kriging surrogates. Several numerical examples using both synthetic data and an epidemics control problem are provided to illustrate our approach and the efficacy of respective adaptive designs.

Comment: 26 pages, 7 figures (updated several sections and figures)

arXiv.org e-Print Archive

eScholarship - University of California

Adaptive Data Depth via Multi-Armed Bandits

Author: Baharav Tavor Z.
Lai Tze Leung
Publication date: 09/11/2022

Data depth, introduced by Tukey (1975), is an important tool in data science, robust statistics, and computational geometry. One chief barrier to its broader practical utility is that many common measures of depth are computationally intensive, requiring on the order of $n^d$ operations to exactly compute the depth of a single point within a data set of $n$ points in $d$-dimensional space. Often however, we are not directly interested in the absolute depths of the points, but rather in their relative ordering. For example, we may want to find the most central point in a data set (a generalized median), or to identify and remove all outliers (points on the fringe of the data set with low depth). With this observation, we develop a novel and instance-adaptive algorithm for adaptive data depth computation by reducing the problem of exactly computing $n$ depths to an $n$-armed stochastic multi-armed bandit problem which we can efficiently solve. We focus our exposition on simplicial depth, developed by Liu (1990), which has emerged as a promising notion of depth due to its interpretability and asymptotic properties. We provide general instance-dependent theoretical guarantees for our proposed algorithms, which readily extend to many other common measures of data depth including majority depth, Oja depth, and likelihood depth. When specialized to the case where the gaps in the data follow a power law distribution with parameter $\alpha<2$, we show that we can reduce the complexity of identifying the deepest point in the data set (the simplicial median) from $O(n^d)$ to $\tilde{O}(n^{d-(d-1)\alpha/2})$, where $\tilde{O}$ suppresses logarithmic factors. We corroborate our theoretical results with numerical experiments on synthetic data, showing the practical utility of our proposed methods.

Comment: Keywords: multi-armed bandits, data depth, adaptivity, large-scale computation, simplicial depth
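To make the bandit reduction more concrete, here is a minimal sketch, assuming the two-dimensional case, of how the simplicial depth of one point could be estimated by sampling random triangles instead of enumerating all of them. Each sampled triangle acts like one noisy "arm pull" for that point. The helper names `in_triangle` and `sampled_simplicial_depth` are hypothetical and this is not the authors' algorithm, only the sampling primitive such a method could build on.

```python
import random


def _cross(o, a, b):
    """z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(x, a, b, c):
    """True if x lies in the closed triangle abc (either orientation)."""
    d1, d2, d3 = _cross(a, b, x), _cross(b, c, x), _cross(c, a, x)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    # x is inside (or on the boundary) unless the signs are mixed.
    return not (has_neg and has_pos)


def sampled_simplicial_depth(x, points, n_samples, rng):
    """Estimate the fraction of triangles with vertices in `points`
    that contain x, using n_samples uniformly random triangles."""
    hits = 0
    for _ in range(n_samples):
        a, b, c = rng.sample(points, 3)  # three distinct data points
        hits += in_triangle(x, a, b, c)
    return hits / n_samples


rng = random.Random(0)
cloud = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
central = sampled_simplicial_depth((0.0, 0.0), cloud, 2000, rng)
fringe = sampled_simplicial_depth((6.0, 6.0), cloud, 2000, rng)
```

Under these assumptions a point near the center of the cloud receives a noticeably higher estimate than a point on the fringe, so only the points that remain close contenders for the deepest position would need more samples.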

arXiv.org e-Print Archive

Exploration vs. Exploitation in the Information Filtering Problem

Author: Frazier Peter I.
Zhao Xiaoting
Publication date: 08/02/2015

We consider information filtering, in which we face a stream of items too voluminous to process by hand (e.g., scientific articles, blog posts, emails), and must rely on a computer system to automatically filter out irrelevant items. Such systems face the exploration vs. exploitation tradeoff, in which it may be beneficial to present an item despite a low probability of relevance, just to learn about future items with similar content. We present a Bayesian sequential decision-making model of this problem, show how it may be solved to optimality using a decomposition to a collection of two-armed bandit problems, and show structural results for the optimal policy. We show that the resulting method is especially useful when facing the cold start problem, i.e., when filtering items for new users without a long history of past interactions. We then present an application of this information filtering method to a historical dataset from the arXiv.org repository of scientific articles.

Comment: 36 pages, 5 figures

arXiv.org e-Print Archive

CiteSeerX

A Survey of Monte Carlo Tree Search Methods

Author: Browne Cameron B
Colton Simon
Cowling Peter I
Lucas Simon M
Perez Diego
Powley Edward
Rohlfshagen Philipp
Samothrakis Spyridon
Tavener Stephen
Whitehouse Daniel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
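A common selection rule in MCTS is UCT, which applies the UCB1 multi-armed bandit formula at each tree node to choose which child to descend into. The sketch below shows only that selection step, under assumed inputs: `visits` and `total_reward` are hypothetical per-child statistics, and the exploration constant `c` defaults to the square root of 2. It is an illustration, not code from the survey.

```python
import math


def ucb1_select(visits, total_reward, c=math.sqrt(2)):
    """Return the index of the child to explore next under UCB1.

    visits[i] is how often child i was tried, total_reward[i] is the sum
    of rewards observed for it. Untried children are selected first.
    """
    for i, n in enumerate(visits):
        if n == 0:
            return i  # every child must be tried once before scoring
    total = sum(visits)

    def score(i):
        mean = total_reward[i] / visits[i]                    # exploitation
        bonus = c * math.sqrt(math.log(total) / visits[i])    # exploration
        return mean + bonus

    return max(range(len(visits)), key=score)


# A rarely visited child can win on its exploration bonus alone.
choice = ucb1_select([100, 1], [60.0, 0.5])
```

The bonus term shrinks as a child accumulates visits, so the rule gradually shifts from exploring uncertain moves to exploiting the ones with the best average reward.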

University of Essex Research Repository

CiteSeerX

Maastricht University Research Portal