578 research outputs found

    Thompson Sampling: An Asymptotically Optimal Finite Time Analysis

    The question of the optimality of Thompson Sampling for solving the stochastic multi-armed bandit problem had been open since 1933. In this paper we answer it positively for the case of Bernoulli rewards by providing the first finite-time analysis that matches the asymptotic rate given in the Lai and Robbins lower bound for the cumulative regret. The proof is accompanied by a numerical comparison with other optimal policies, experiments that have been lacking in the literature until now for the Bernoulli case. Comment: 15 pages, 2 figures, submitted to ALT (Algorithmic Learning Theory)
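
A minimal sketch of Thompson Sampling for Bernoulli rewards may help make the abstract above concrete. It is an illustration under stated assumptions, not the paper's experimental code: the uniform Beta(1, 1) prior, the function name `thompson_sampling_bernoulli`, and the arm means used when calling it are all choices made here.

```python
import random


def thompson_sampling_bernoulli(means, horizon, seed=0):
    """Run Thompson Sampling on a simulated Bernoulli bandit.

    `means` holds the true success probabilities (unknown to the learner,
    used only to simulate rewards). Returns per-arm pull counts and the
    pseudo-regret accumulated over `horizon` rounds.
    """
    rng = random.Random(seed)
    k = len(means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(horizon):
        # Assumed prior: Beta(1, 1) on every arm, so the posterior after
        # s successes and f failures is Beta(s + 1, f + 1).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)
        ]
        # Play the arm whose posterior sample is largest.
        arm = max(range(k), key=samples.__getitem__)
        reward = 1 if rng.random() < means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    best = max(means)
    regret = sum((best - means[i]) * pulls[i] for i in range(k))
    return pulls, regret
```

Calling `thompson_sampling_bernoulli([0.2, 0.8], 2000)` returns the number of pulls per arm and the pseudo-regret; with a gap this large, the better arm should receive the large majority of pulls.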

    Spectral Sparsification and Regret Minimization Beyond Matrix Multiplicative Updates

    In this paper, we provide a novel construction of the linear-sized spectral sparsifiers of Batson, Spielman and Srivastava [BSS14]. While previous constructions required $\Omega(n^4)$ running time [BSS14, Zou12], our sparsification routine can be implemented in almost-quadratic running time $O(n^{2+\varepsilon})$. The fundamental conceptual novelty of our work is the leveraging of a strong connection between sparsification and a regret minimization problem over density matrices. This connection was known to provide an interpretation of the randomized sparsifiers of Spielman and Srivastava [SS11] via the application of matrix multiplicative weight updates (MWU) [CHS11, Vis14]. In this paper, we explain how matrix MWU naturally arises as an instance of the Follow-the-Regularized-Leader framework and generalize this approach to yield a larger class of updates. This new class allows us to accelerate the construction of linear-sized spectral sparsifiers, and gives novel insights into the motivation behind Batson, Spielman and Srivastava [BSS14].

    Functional Sequential Treatment Allocation

    Consider a setting in which a policy maker assigns subjects to treatments, observing each outcome before the next subject arrives. Initially, it is unknown which treatment is best, but the sequential nature of the problem permits learning about the effectiveness of the treatments. While the multi-armed-bandit literature has shed much light on the situation when the policy maker compares the effectiveness of the treatments through their mean, much less is known about other targets. This is restrictive, because a cautious decision maker may prefer to target a robust location measure such as a quantile or a trimmed mean. Furthermore, socio-economic decision making often requires targeting purpose-specific characteristics of the outcome distribution, such as its inherent degree of inequality, welfare or poverty. In the present paper we introduce and study sequential learning algorithms when the distributional characteristic of interest is a general functional of the outcome distribution. Minimax expected regret optimality results are obtained within the subclass of explore-then-commit policies, and for the unrestricted class of all policies.
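
The explore-then-commit idea mentioned in the abstract above can be sketched as follows. This is only an illustration under assumptions made here, not the authors' procedure: the round-robin exploration schedule, the function name `explore_then_commit`, and the `draw_outcome` callback are hypothetical, and the functional is passed in as any callable on a list of outcomes (for instance `statistics.median`).

```python
import random
import statistics


def explore_then_commit(draw_outcome, n_treatments, n_subjects, n_explore, functional):
    """Assign subjects sequentially with an explore-then-commit policy.

    `draw_outcome(arm)` simulates the outcome of giving treatment `arm`
    to the next subject. Each treatment is tried `n_explore` times in
    round-robin order; the treatment whose empirical outcomes score
    highest under `functional` then receives all remaining subjects.
    """
    if n_explore * n_treatments > n_subjects:
        raise ValueError("exploration phase is longer than the number of subjects")
    outcomes = [[] for _ in range(n_treatments)]
    assignments = []
    # Exploration phase: round-robin over treatments.
    for t in range(n_explore * n_treatments):
        arm = t % n_treatments
        outcomes[arm].append(draw_outcome(arm))
        assignments.append(arm)
    # Commit to the treatment with the largest estimated functional.
    scores = [functional(obs) for obs in outcomes]
    best = max(range(n_treatments), key=scores.__getitem__)
    for _ in range(n_subjects - len(assignments)):
        outcomes[best].append(draw_outcome(best))
        assignments.append(best)
    return best, assignments
```

Passing `statistics.median` targets the median instead of the mean; any other functional of the empirical distribution (a trimmed mean, an inequality index) can be substituted without changing the policy itself.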

    An efficient algorithm for learning with semi-bandit feedback

    We consider the problem of online combinatorial optimization under semi-bandit feedback. The goal of the learner is to sequentially select its actions from a combinatorial decision set so as to minimize its cumulative loss. We propose a learning algorithm for this problem based on combining the Follow-the-Perturbed-Leader (FPL) prediction method with a novel loss estimation procedure called Geometric Resampling (GR). Contrary to previous solutions, the resulting algorithm can be efficiently implemented for any decision set where efficient offline combinatorial optimization is possible at all. Assuming that the elements of the decision set can be described with d-dimensional binary vectors with at most m non-zero entries, we show that the expected regret of our algorithm after T rounds is O(m sqrt(dT log d)). As a side result, we also improve the best known regret bounds for FPL in the full information setting to O(m^(3/2) sqrt(T log d)), gaining a factor of sqrt(d/m) over previous bounds for this algorithm. Comment: submitted to ALT 201
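
The combination of FPL with Geometric Resampling described above can be sketched roughly as follows. The abstract does not spell out the procedure, so everything concrete here is an assumption: the decision set is taken to be all size-m subsets of d items, perturbations are exponential, the resampling is capped at `cap` draws, and the names `fpl_action`, `geometric_resampling` and `run_fpl_gr` are hypothetical. The idea illustrated is that the number of fresh FPL draws needed until a component reappears estimates the inverse of its selection probability.

```python
import random


def fpl_action(est_loss, eta, m, rng):
    """Pick the m items with the smallest perturbed estimated loss."""
    # Assumed perturbation: independent Exp(1) noise subtracted per item.
    scores = [eta * loss - rng.expovariate(1.0) for loss in est_loss]
    return set(sorted(range(len(est_loss)), key=scores.__getitem__)[:m])


def geometric_resampling(est_loss, eta, m, played, cap, rng):
    """For each played item, count fresh FPL draws until it is chosen again.

    The count (truncated at `cap`) serves as an estimate of
    1 / P(item is selected), without computing that probability.
    """
    waiting = {i: cap for i in played}
    pending = set(played)
    for k in range(1, cap + 1):
        copy = fpl_action(est_loss, eta, m, rng)
        for i in pending & copy:
            waiting[i] = k
        pending -= copy
        if not pending:
            break
    return waiting


def run_fpl_gr(loss_means, m, horizon, eta=0.05, cap=50, seed=0):
    """Simulate FPL with Geometric Resampling on Bernoulli item losses."""
    rng = random.Random(seed)
    est_loss = [0.0] * len(loss_means)
    total = 0.0
    for _ in range(horizon):
        played = fpl_action(est_loss, eta, m, rng)
        # Semi-bandit feedback: only losses of the played items are seen.
        losses = {i: 1.0 if rng.random() < loss_means[i] else 0.0 for i in played}
        counts = geometric_resampling(est_loss, eta, m, played, cap, rng)
        for i in played:
            est_loss[i] += counts[i] * losses[i]
        total += sum(losses.values())
    return total, est_loss
```

Only calls to the offline optimizer (here, a sort) are needed, which reflects the abstract's point that the method is efficient whenever offline combinatorial optimization is.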

    Do Deep Neural Networks Contribute to Multivariate Time Series Anomaly Detection?

    Anomaly detection in time series is a complex task that has been widely studied. In recent years, the ability of unsupervised anomaly detection algorithms has received much attention. This trend has led researchers to compare only learning-based methods in their articles, abandoning some more conventional approaches. As a result, the community in this field has been encouraged to propose increasingly complex learning-based models mainly based on deep neural networks. To our knowledge, there are no comparative studies between conventional, machine learning-based, and deep neural network methods for the detection of anomalies in multivariate time series. In this work, we study the anomaly detection performance of sixteen conventional, machine learning-based, and deep neural network approaches on five real-world open datasets. By analyzing and comparing the performance of each of the sixteen methods, we show that no family of methods outperforms the others. Therefore, we encourage the community to reincorporate the three categories of methods in benchmarks for anomaly detection in multivariate time series.

    Déterminants de la demande de soins en milieu péri-urbain dans un contexte de subvention à Pikine, Sénégal

    Since the 2000s, Senegal has adopted national policies aimed at progressively abolishing direct payment at the point of service in order to make health care more accessible. Implementing these subsidy and free-care policies in a dense, heterogeneous, even disparate area presents a particular situation. To understand these interactions and study household behavior regarding the demand for care, 5520 individuals were surveyed four times over the period 2010-2011 in the suburbs of Dakar (Pikine), and a multinomial probit is estimated to study the population's demand for care when faced with an episode of illness. The results show that the negative effect of price is on average fairly small, but that it varies with income level and the severity of the illness. The perceived quality of care has a positive effect on the use of private health services, for which the negative effect of price is offset by quality. The effect of age is not linear, and children, who are more affected by illness, benefit from few exemptions, or at most partial exemptions, unlike the elderly, who benefit from total exemption (Plan SESAME).

    Syndrome de détresse respiratoire aiguë secondaire à une infection à Toxocara cati

    Human toxocarosis is a helminthozoonosis due to the migration of Toxocara species larvae throughout the human body. Lung manifestations vary and range from asymptomatic infection to severe disease. Dry cough and chest discomfort are the most common respiratory symptoms. Clinical manifestations include a transient form of Loeffler's syndrome or an eosinophilic pneumonia. We report a case of bilateral pneumonia in an 80-year-old Caucasian man who very rapidly developed an acute respiratory distress syndrome, with a PaO2/FiO2 ratio of 55, requiring mechanical ventilation and adrenergic support. There was an increased eosinophilia in both blood and bronchoalveolar lavage fluid. Positive Toxocara serology and the clinical picture confirmed the diagnosis of the "visceral larva migrans" syndrome. Intravenous corticosteroid therapy produced a rapid rise in PaO2/FiO2 before the administration of specific treatment. A few cases of acute pneumonia requiring mechanical ventilation due to Toxocara have been published, but this is, to our knowledge, the first reported case of ARDS with multi-organ failure.

    PAC-Bayesian Bounds for Randomized Empirical Risk Minimizers

    The aim of this paper is to generalize the PAC-Bayesian theorems proved by Catoni in the classification setting to more general problems of statistical inference. We show how to control the deviations of the risk of randomized estimators. Particular attention is paid to randomized estimators drawn in a small neighborhood of classical estimators, whose study leads to control of the risk of the latter. These results allow one to bound the risk of very general estimation procedures, as well as to perform model selection.

    Gain properties of dye-doped polymer thin films

    Hybrid pumping appears as a promising compromise in order to reach the much coveted goal of an electrically pumped organic laser. In such a configuration the organic material is optically pumped by an electrically pumped inorganic device on chip. This engineering solution therefore requires an optimization of the organic gain medium under optical pumping. Here, we report a detailed study of the gain features of dye-doped polymer thin films. In particular we introduce the gain efficiency $K$, in order to facilitate comparison between different materials and experimental conditions. The gain efficiency was measured with various setups (pump-probe amplification, variable stripe length method, laser thresholds) in order to study several factors which modify the actual gain of a layer, namely the confinement factor, the pump polarization, the molecular anisotropy, and the re-absorption. For instance, for a 600 nm thick 5 wt% DCM doped PMMA layer, the different experimental approaches give a consistent value $K \simeq 80$ cm.MW$^{-1}$. On the contrary, the usual model predicting the gain from the characteristics of the material leads to an overestimation by two orders of magnitude, which raises a serious problem in the design of actual devices. In this context, we demonstrate the feasibility of inferring the gain efficiency from the laser threshold of well-calibrated devices. Besides, temporal measurements at the picosecond scale were carried out to support the analysis. Comment: 15 pages, 17 figures