60 research outputs found

    Adaptive Stratified Sampling for Monte-Carlo integration of Differentiable functions

    We consider the problem of adaptive stratified sampling for Monte Carlo integration of a differentiable function given a finite number of evaluations of the function. We construct a sampling scheme that samples more often in regions where the function oscillates more, while allocating the samples so that they are well spread over the domain (a notion similar to low discrepancy). We prove that the estimate returned by the algorithm is almost as accurate as the estimate that an optimal oracle strategy (one that knows the variations of the function everywhere) would return, and we provide a finite-sample analysis. Comment: 23 pages, 3 figures, to appear in the NIPS 2012 conference proceedings.

    Bandit Theory meets Compressed Sensing for high dimensional Stochastic Linear Bandit

    We consider a linear stochastic bandit problem where the dimension $K$ of the unknown parameter $\theta$ is larger than the sampling budget $n$. In such cases, it is in general impossible to derive sub-linear regret bounds since usual linear bandit algorithms have a regret in $O(K\sqrt{n})$. In this paper we assume that $\theta$ is $S$-sparse, i.e. has at most $S$ non-zero components, and that the space of arms is the unit ball for the $\|\cdot\|_2$ norm. We combine ideas from Compressed Sensing and Bandit Theory and derive algorithms with regret bounds in $O(S\sqrt{n})$.

    Finite-Time Analysis of Stratified Sampling for Monte Carlo

    We consider the problem of stratified sampling for Monte-Carlo integration. We model this problem in a multi-armed bandit setting, where the arms represent the strata, and the goal is to estimate a weighted average of the mean values of the arms. We propose a strategy that samples the arms according to an upper bound on their standard deviations and compare its estimation quality to an ideal allocation that would know the standard deviations of the strata. We provide two regret analyses: a distribution-dependent bound $\widetilde{O}(n^{-3/2})$ that depends on a measure of the disparity of the strata, and a distribution-free bound $\widetilde{O}(n^{-4/3})$ that does not.
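
    The allocation idea in this abstract can be illustrated with a minimal sketch: each stratum is scored by its weight times an upper confidence bound on its empirical standard deviation, divided by its current sample count, and the next sample goes to the highest-scoring stratum. The function name `stratified_ucb_estimate`, the confidence width, and the toy integrand are assumptions made for illustration; the constants and exact index used in the paper may differ.

```python
import math
import random


def stratified_ucb_estimate(sample_fns, weights, budget, seed=0):
    """Estimate sum_k w_k * mean_k, sampling high-variance strata more often."""
    rng = random.Random(seed)
    k = len(sample_fns)
    # Two initial samples per stratum so empirical variances are defined.
    samples = [[fn(rng) for _ in range(2)] for fn in sample_fns]
    for _ in range(budget - 2 * k):
        scores = []
        for w, xs in zip(weights, samples):
            t = len(xs)
            mean = sum(xs) / t
            std = math.sqrt(sum((x - mean) ** 2 for x in xs) / (t - 1))
            # Hypothetical confidence width, chosen for illustration only.
            bonus = math.sqrt(2.0 * math.log(budget) / t)
            # Weighted upper bound on the std, discounted by the sample count.
            scores.append(w * (std + bonus) / t)
        arm = max(range(k), key=scores.__getitem__)
        samples[arm].append(sample_fns[arm](rng))
    estimate = sum(w * sum(xs) / len(xs) for w, xs in zip(weights, samples))
    counts = [len(xs) for xs in samples]
    return estimate, counts


if __name__ == "__main__":
    # Toy integrand on [0, 1): f(x) = x on [0, 0.5), f(x) = 10x on [0.5, 1).
    strata = [
        lambda rng: rng.uniform(0.0, 0.5),
        lambda rng: 10.0 * rng.uniform(0.5, 1.0),
    ]
    est, counts = stratified_ucb_estimate(strata, [0.5, 0.5], budget=2000)
    print("estimate:", est, "samples per stratum:", counts)
```

    In this toy setting the second stratum has the larger standard deviation, so it should receive most of the budget, which is the behaviour an oracle allocation proportional to the weighted standard deviations would also show.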

    Bandit Algorithms boost Brain Computer Interfaces for motor-task selection of a brain-controlled button

    Brain-computer interfaces (BCI) allow users to "communicate" with a computer without using their muscles. BCI based on sensorimotor rhythms use imaginary motor tasks, such as moving the right or left hand, to send control signals. The performance of a BCI can vary greatly across users but also depends on the tasks used, making appropriate task selection an important issue. This study presents a new procedure to automatically select, as fast as possible, a discriminant motor task for a brain-controlled button. For this purpose we develop an adaptive algorithm, UCB-classif, based on stochastic bandit theory. This shortens the training stage, thereby allowing the exploration of a greater variety of tasks. By not wasting time on inefficient tasks and focusing on the most promising ones, the algorithm achieves faster task selection and a more efficient use of the BCI training session. Compared to the standard practice in task selection, UCB-classif leads to an improved classification rate for a fixed time budget and, for a fixed classification rate, to a 50% reduction of the time spent in training.
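
    To make the bandit view of task selection concrete, here is a minimal sketch of the generic UCB1 rule, not the UCB-classif algorithm itself, whose details are not given in this abstract. Each arm stands for a candidate motor task and each reward for a hypothetical per-trial classification success; the function name `ucb1_select` and the Bernoulli reward model are assumptions made for illustration.

```python
import math
import random


def ucb1_select(reward_fns, rounds, seed=0):
    """Run generic UCB1 and return the empirically best arm and pull counts."""
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, rounds + 1):
        if t <= k:
            # Play every arm once so all empirical means are defined.
            arm = t - 1
        else:
            # Pick the arm with the largest mean plus exploration bonus.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        sums[arm] += reward_fns[arm](rng)
        counts[arm] += 1
    best = max(range(k), key=lambda a: sums[a] / counts[a])
    return best, counts


if __name__ == "__main__":
    # Hypothetical tasks with different chances of a correct classification.
    tasks = [
        lambda rng: 1.0 if rng.random() < 0.55 else 0.0,
        lambda rng: 1.0 if rng.random() < 0.60 else 0.0,
        lambda rng: 1.0 if rng.random() < 0.90 else 0.0,
    ]
    best, counts = ucb1_select(tasks, rounds=3000)
    print("selected task:", best, "trials per task:", counts)
```

    The sketch shows the mechanism the abstract relies on: trials concentrate on the most promising task while poorly performing tasks are tried only as often as needed to rule them out.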

    Automatic motor task selection via a bandit algorithm for a brain-controlled button

    This study presents a new procedure to automatically select a discriminant motor task for an asynchronous brain-controlled button. This type of control pertains to Brain Computer Interfaces (BCI). When using sensorimotor rhythms in a BCI, several motor tasks, such as moving the right or left hand, the feet or the tongue, can be considered as candidates for the control. This report presents a method to select the most promising task as fast as possible. For this purpose we develop an adaptive algorithm, UCB-classif, based on stochastic bandit theory, and build an EEG experiment to test our method. By not wasting time on inefficient tasks, our algorithm can focus on the most promising ones, resulting in faster task selection and a more efficient use of the BCI training session. This leads to better classification rates for a fixed time budget, compared to a standard task selection.

    Minimax Number of Strata for Online Stratified Sampling given Noisy Samples

    We consider the problem of online stratified sampling for Monte Carlo integration of a function given a finite budget of $n$ noisy evaluations of the function. More precisely, we focus on the problem of choosing the number of strata $K$ as a function of the budget $n$. We provide asymptotic and finite-time results on how an oracle that has access to the function would choose the partition optimally. In addition, we prove a lower bound on the learning rate for the problem of stratified Monte-Carlo. As a result, by improving the bound on its performance, we are able to state that algorithm MC-UCB, defined in \citep{MC-UCB}, is minimax optimal both in terms of the number of samples $n$ and the number of strata $K$, up to a $\sqrt{\log(nK)}$ factor. This enables us to deduce a minimax optimal bound on the difference between the performance of the estimate output by MC-UCB and the performance of the estimate output by the best oracle static strategy, on the class of Hölder continuous functions, up to a $\sqrt{\log(n)}$ factor.

    Toward optimal stratification for stratified monte-carlo integration

    We consider the problem of adaptive stratified sampling for Monte Carlo integration of a noisy function, given a finite budget $n$ of noisy evaluations of the function. In this paper we tackle the problem of adapting to the function, at the same time, both the number of samples in each stratum and the partition itself. More precisely, it is interesting to refine the partition of the domain in areas where the noise of the function, or the variations of the function, are very heterogeneous. On the other hand, an overly refined stratification is not optimal. Indeed, the more refined the stratification, the more difficult it is to adjust the allocation of the samples to the stratification, i.e. to sample more points where the noise or variations of the function are larger. We provide an algorithm that selects online, among a large class of partitions, the partition that provides the optimal trade-off, and allocates the samples almost optimally on this partition.

    Optimizing P300-speller sequences by RIP-ping groups apart

    So far, P300-speller design has put very little emphasis on the design of optimized flash patterns, a surprising fact given the importance of the sequence of flashes for the selection outcome. Previous work in this domain has consisted in studying consecutive flashes, to prevent the same letter or its neighbors from flashing consecutively. To this effect, the flashing letters form more random groups than the original row-column sequences for the P300 paradigm, but the groups remain fixed across repetitions. This has several important consequences, among which is a lack of discrepancy between the scores of the different letters. The new approach proposed in this paper accumulates evidence for individual elements, and optimizes the sequences by relaxing the constraint that letters should belong to fixed groups across repetitions. The method is inspired by the theory of Restricted Isometry Property matrices in Compressed Sensing, and it can be applied to any display grid size and any target flash frequency. This leads to P300 sequences which are shown here to perform significantly better than the state of the art, in simulations and online tests.
