554 research outputs found

    Thompson Sampling: An Asymptotically Optimal Finite Time Analysis

    The question of the optimality of Thompson Sampling for solving the stochastic multi-armed bandit problem had been open since 1933. In this paper we answer it positively for the case of Bernoulli rewards by providing the first finite-time analysis that matches the asymptotic rate given in the Lai and Robbins lower bound for the cumulative regret. The proof is accompanied by a numerical comparison with other optimal policies, experiments that have been lacking in the literature until now for the Bernoulli case. Comment: 15 pages, 2 figures, submitted to ALT (Algorithmic Learning Theory)
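
    For readers unfamiliar with the policy analysed in this abstract, a minimal sketch of Thompson Sampling for Bernoulli rewards follows. It assumes independent uniform Beta(1, 1) priors on each arm; the function name, its arguments, and the simulated arm means are illustrative assumptions, not taken from the paper.

```python
import random


def thompson_sampling_bernoulli(true_means, horizon, seed=0):
    """Simulate Thompson Sampling on Bernoulli arms (illustrative sketch).

    true_means: hypothetical success probabilities, unknown to the learner.
    Returns the number of pulls per arm and the cumulative pseudo-regret.
    """
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k  # observed rewards equal to 1, per arm
    failures = [0] * k   # observed rewards equal to 0, per arm
    pulls = [0] * k
    best_mean = max(true_means)
    regret = 0.0

    for _ in range(horizon):
        # Draw one sample from each arm's Beta posterior (uniform prior assumed).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1) for a in range(k)
        ]
        # Play the arm whose posterior sample is largest.
        arm = max(range(k), key=samples.__getitem__)
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
        regret += best_mean - true_means[arm]

    return pulls, regret
```

    Calling `thompson_sampling_bernoulli([0.1, 0.9], 2000)` returns the pull counts and the cumulative pseudo-regret, which is the quantity the Lai and Robbins lower bound constrains asymptotically.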

    Fast learning rates in statistical inference through aggregation

    We develop minimax optimal risk bounds for the general learning task consisting in predicting as well as the best function in a reference set $\mathcal{G}$ up to the smallest possible additive term, called the convergence rate. When the reference set is finite and when $n$ denotes the size of the training data, we provide minimax convergence rates of the form $C\left(\frac{\log|\mathcal{G}|}{n}\right)^v$ with tight evaluation of the positive constant $C$ and with exact $0<v\le 1$, the latter value depending on the convexity of the loss function and on the level of noise in the output distribution. The risk upper bounds are based on a sequential randomized algorithm, which at each step concentrates on functions having both low risk and low variance with respect to the previous step prediction function. Our analysis puts forward the links between the probabilistic and worst-case viewpoints, and allows one to obtain risk bounds unachievable with the standard statistical learning approach. One of the key ideas of this work is to use probabilistic inequalities with respect to appropriate (Gibbs) distributions on the prediction function space instead of using them with respect to the distribution generating the data. The risk lower bounds are based on refinements of the Assouad lemma taking particularly into account the properties of the loss function. Our key example to illustrate the upper and lower bounds is to consider the $L_q$-regression setting for which an exhaustive analysis of the convergence rates is given while $q$ ranges in $[1;+\infty[$. Comment: Published at http://dx.doi.org/10.1214/08-AOS623 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Functional Sequential Treatment Allocation

    Consider a setting in which a policy maker assigns subjects to treatments, observing each outcome before the next subject arrives. Initially, it is unknown which treatment is best, but the sequential nature of the problem permits learning about the effectiveness of the treatments. While the multi-armed-bandit literature has shed much light on the situation when the policy maker compares the effectiveness of the treatments through their mean, much less is known about other targets. This is restrictive, because a cautious decision maker may prefer to target a robust location measure such as a quantile or a trimmed mean. Furthermore, socio-economic decision making often requires targeting purpose-specific characteristics of the outcome distribution, such as its inherent degree of inequality, welfare or poverty. In the present paper we introduce and study sequential learning algorithms when the distributional characteristic of interest is a general functional of the outcome distribution. Minimax expected regret optimality results are obtained within the subclass of explore-then-commit policies, and for the unrestricted class of all policies.
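
    The explore-then-commit idea mentioned in this abstract can be sketched as follows. This is a generic illustration under stated assumptions, not the paper's procedure: each treatment is explored for a fixed number of rounds, a user-supplied functional (here the median, one example of a robust location measure) is evaluated on each empirical sample, and the remaining subjects receive the treatment with the highest value. All names and parameters are hypothetical.

```python
import random
import statistics


def explore_then_commit(arms, horizon, explore_per_arm, functional, seed=0):
    """Generic explore-then-commit with a functional target (illustrative).

    arms: list of callables taking a random.Random and returning an outcome.
    functional: maps a list of outcomes to a number, e.g. statistics.median.
    Returns the committed arm index and the full assignment sequence.
    """
    if explore_per_arm * len(arms) > horizon:
        raise ValueError("exploration phase exceeds the horizon")

    rng = random.Random(seed)
    samples = [[] for _ in arms]
    assignments = []

    # Exploration: assign subjects to treatments in round-robin order.
    for _ in range(explore_per_arm):
        for a, draw in enumerate(arms):
            samples[a].append(draw(rng))
            assignments.append(a)

    # Commitment: pick the arm whose empirical functional value is largest.
    scores = [functional(s) for s in samples]
    best = max(range(len(arms)), key=scores.__getitem__)
    assignments.extend([best] * (horizon - len(assignments)))
    return best, assignments
```

    Swapping `statistics.median` for another function of the sample (a trimmed mean or an inequality index, for instance) changes the target without changing the allocation logic.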

    An efficient algorithm for learning with semi-bandit feedback

    We consider the problem of online combinatorial optimization under semi-bandit feedback. The goal of the learner is to sequentially select its actions from a combinatorial decision set so as to minimize its cumulative loss. We propose a learning algorithm for this problem based on combining the Follow-the-Perturbed-Leader (FPL) prediction method with a novel loss estimation procedure called Geometric Resampling (GR). Contrary to previous solutions, the resulting algorithm can be efficiently implemented for any decision set where efficient offline combinatorial optimization is possible at all. Assuming that the elements of the decision set can be described with d-dimensional binary vectors with at most m non-zero entries, we show that the expected regret of our algorithm after T rounds is O(m sqrt(dT log d)). As a side result, we also improve the best known regret bounds for FPL in the full information setting to O(m^(3/2) sqrt(T log d)), gaining a factor of sqrt(d/m) over previous bounds for this algorithm. Comment: submitted to ALT 201
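
    A rough sketch of how FPL and Geometric Resampling might fit together is given below. It rests on several assumptions that are not stated in the abstract: the decision set is taken to be all subsets of exactly m items out of d (so the offline optimizer just picks the m smallest perturbed losses), perturbations are i.i.d. standard exponential, and the learning rate `eta` and the resampling cap `cap` are free parameters. All function names are hypothetical.

```python
import random


def fpl_action(cum_loss_est, eta, m, rng):
    """Follow-the-Perturbed-Leader step over m-subsets (assumed decision set)."""
    d = len(cum_loss_est)
    # Scaled estimated cumulative loss minus a fresh exponential perturbation.
    perturbed = [eta * cum_loss_est[i] - rng.expovariate(1.0) for i in range(d)]
    # Offline optimization for m-subsets: keep the m smallest perturbed losses.
    return set(sorted(range(d), key=perturbed.__getitem__)[:m])


def geometric_resampling(cum_loss_est, eta, m, action, cap, rng):
    """For each chosen item, count fresh draws until it is chosen again.

    The count (capped at `cap`) stands in for the inverse of the unknown
    probability of selecting that item.
    """
    counts = {}
    remaining = set(action)
    for k in range(1, cap + 1):
        fresh = fpl_action(cum_loss_est, eta, m, rng)
        for i in list(remaining):
            if i in fresh:
                counts[i] = k
                remaining.discard(i)
        if not remaining:
            break
    for i in remaining:  # never re-drawn within the cap
        counts[i] = cap
    return counts


def run_fpl_gr(losses, m, eta, cap, seed=0):
    """Run the sketch on a sequence of loss vectors with entries in [0, 1]."""
    rng = random.Random(seed)
    cum_loss_est = [0.0] * len(losses[0])
    total_loss = 0.0
    for loss in losses:
        action = fpl_action(cum_loss_est, eta, m, rng)
        # Semi-bandit feedback: only the losses of chosen items are observed.
        total_loss += sum(loss[i] for i in action)
        counts = geometric_resampling(cum_loss_est, eta, m, action, cap, rng)
        for i in action:
            cum_loss_est[i] += counts[i] * loss[i]
    return total_loss, cum_loss_est
```

    The point of the resampling step is that the learner never needs the selection probabilities in closed form; it only needs to be able to call the same offline optimizer repeatedly.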

    Spectral Sparsification and Regret Minimization Beyond Matrix Multiplicative Updates

    In this paper, we provide a novel construction of the linear-sized spectral sparsifiers of Batson, Spielman and Srivastava [BSS14]. While previous constructions required $\Omega(n^4)$ running time [BSS14, Zou12], our sparsification routine can be implemented in almost-quadratic running time $O(n^{2+\varepsilon})$. The fundamental conceptual novelty of our work is the leveraging of a strong connection between sparsification and a regret minimization problem over density matrices. This connection was known to provide an interpretation of the randomized sparsifiers of Spielman and Srivastava [SS11] via the application of matrix multiplicative weight updates (MWU) [CHS11, Vis14]. In this paper, we explain how matrix MWU naturally arises as an instance of the Follow-the-Regularized-Leader framework and generalize this approach to yield a larger class of updates. This new class allows us to accelerate the construction of linear-sized spectral sparsifiers, and give novel insights on the motivation behind Batson, Spielman and Srivastava [BSS14].

    Gain properties of dye-doped polymer thin films

    Hybrid pumping appears as a promising compromise in order to reach the much coveted goal of an electrically pumped organic laser. In such a configuration the organic material is optically pumped by an electrically pumped inorganic device on chip. This engineering solution therefore requires an optimization of the organic gain medium under optical pumping. Here, we report a detailed study of the gain features of dye-doped polymer thin films. In particular we introduce the gain efficiency $K$, in order to facilitate comparison between different materials and experimental conditions. The gain efficiency was measured with various setups (pump-probe amplification, variable stripe length method, laser thresholds) in order to study several factors which modify the actual gain of a layer, namely the confinement factor, the pump polarization, the molecular anisotropy, and the re-absorption. For instance, for a 600 nm thick 5 wt% DCM doped PMMA layer, the different experimental approaches give a consistent value $K \simeq 80$ cm.MW$^{-1}$. On the contrary, the usual model predicting the gain from the characteristics of the material leads to an overestimation by two orders of magnitude, which raises a serious problem in the design of actual devices. In this context, we demonstrate the feasibility of inferring the gain efficiency from the laser threshold of well-calibrated devices. Besides, temporal measurements at the picosecond scale were carried out to support the analysis. Comment: 15 pages, 17 figures

    PAC-Bayesian Bounds for Randomized Empirical Risk Minimizers

    The aim of this paper is to generalize the PAC-Bayesian theorems proved by Catoni in the classification setting to more general problems of statistical inference. We show how to control the deviations of the risk of randomized estimators. Particular attention is paid to randomized estimators drawn in a small neighborhood of classical estimators, whose study leads to control of the risk of the latter. These results allow one to bound the risk of very general estimation procedures, as well as to perform model selection.

    Syndrome de détresse respiratoire aiguë secondaire à une infection à Toxocara cati

    Human toxocarosis is a helminthozoonosis due to the migration of Toxocara species larvae throughout the human body. Lung manifestations vary and range from asymptomatic infection to severe disease. Dry cough and chest discomfort are the most common respiratory symptoms. Clinical manifestations include a transient form of Loeffler's syndrome or an eosinophilic pneumonia. We report a case of bilateral pneumonia in an 80-year-old Caucasian man who very rapidly developed an acute respiratory distress syndrome, with a PaO2/FiO2 ratio of 55, requiring mechanical ventilation and adrenergic support. There was an increased eosinophilia in both blood and bronchoalveolar lavage fluid. Positive Toxocara serology and the clinical picture confirmed the diagnosis of the "visceral larva migrans" syndrome. Intravenous corticosteroid therapy produced a rapid rise in PaO2/FiO2 before the administration of specific treatment. A few cases of acute pneumonia requiring mechanical ventilation due to Toxocara have been published but this is, to our knowledge, the first reported case of ARDS with multi-organ failure.

    An insight into polarization states of solid-state organic lasers

    The polarization states of lasers are crucial issues both for practical applications and fundamental research. In general, they depend in a combined manner on the properties of the gain material and on the structure of the electromagnetic modes. In this paper, we address this issue in the case of solid-state organic lasers, a technology which makes it possible to vary gain and mode properties independently. Different kinds of resonators are investigated: in-plane micro-resonators with Fabry-Perot, square, pentagon, stadium, disk, and kite shapes, and external vertical resonators. The degree of polarization P is measured in each case. It is shown that although TE modes generally prevail (P>0), kite-shaped micro-lasers generate negative values for P, i.e. a flip of the dominant polarization, which becomes mostly TM. Finally, we investigate two degrees of freedom that are available to tailor the polarization of organic lasers, in addition to the pump polarization and the resonator geometry: using resonant energy transfer (RET) or pumping the laser dye to a higher excited state. We then demonstrate that significantly lower P factors can be obtained. Comment: 12 pages, 12 figures