151 research outputs found

    Sequential Monte Carlo smoothing for general state space hidden Markov models

    Computing smoothing distributions, that is, the distributions of one or more states conditional on past, present, and future observations, is a recurring problem when operating on general hidden Markov models. The aim of this paper is to provide a foundation for particle-based approximation of such distributions and to analyze, in a common unifying framework, different schemes producing such approximations. In this setting, general convergence results, including exponential deviation inequalities and central limit theorems, are established. In particular, time-uniform bounds on the marginal smoothing error are obtained under appropriate mixing conditions on the transition kernel of the latent chain. In addition, we propose an algorithm approximating the joint smoothing distribution at a cost that grows only linearly with the number of particles. Comment: Published at http://dx.doi.org/10.1214/10-AAP735 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: text overlap with arXiv:1012.4183 by other authors.

    Poursuite d'étude après un IUT STID : l'exemple du Cursus de Master en Ingénierie Statistique et Informatique Décisionnelle de Toulouse

    After obtaining their DUT, many IUT STID students wish to continue their studies toward a Master's degree, which allows them to work as expert engineers or project managers. A distinctive feature of data science is its bi-disciplinary nature: it requires mastery of concepts and tools from both computer science (databases and business intelligence) and mathematics (statistics and stochastic modeling). For about a dozen years, the SID program at Paul Sabatier University has offered a coherent continuation track that respects this dual competence. Formerly an Institut Universitaire Professionnalisé, it became a Cursus de Master en Ingénierie this year. While keeping a strong industrial component, it thus opens up to training through research and continues to prepare students for engineering careers in all their dimensions, relying heavily on real-world professional situations. We briefly present the coordination between the IUT and the master's program in the teaching of statistics, along with its past and future evolution. We then discuss the assets that students gain from their time at the IUT, as well as the specific difficulties they encounter.

    Optimal Best Arm Identification with Fixed Confidence

    We give a complete characterization of the complexity of best-arm identification in one-parameter bandit problems. We prove a new, tight lower bound on the sample complexity. We propose the 'Track-and-Stop' strategy, which we prove to be asymptotically optimal. It consists of a new sampling rule (which tracks the optimal proportions of arm draws highlighted by the lower bound) and a stopping rule named after Chernoff, for which we give a new analysis.

    On the Complexity of A/B Testing

    A/B testing refers to the task of determining the best option among two alternatives that yield random outcomes. We provide distribution-dependent lower bounds for the performance of A/B testing that improve over the results currently available in both the fixed-confidence (or delta-PAC) and fixed-budget settings. When the distributions of the outcomes are Gaussian, we prove that the complexities of the fixed-confidence and fixed-budget settings are equivalent, and that uniform sampling of both alternatives is optimal only in the case of equal variances. In the common-variance case, we also provide a stopping rule that terminates faster than existing fixed-confidence algorithms. In the case of Bernoulli distributions, we show that the complexity of the fixed-budget setting is smaller than that of the fixed-confidence setting and that uniform sampling of both alternatives, though not optimal, is advisable in practice when combined with an appropriate stopping criterion.
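The fixed-confidence setting described in this abstract can be illustrated with a minimal sketch: two Gaussian alternatives with a known common standard deviation are sampled uniformly, and sampling stops once a generalized likelihood ratio statistic exceeds a threshold. The function names, the `max_rounds` cap, and in particular the threshold `log((log(2n) + 1) / delta)` are illustrative assumptions; this is not the stopping rule analyzed in the paper, and the threshold carries no guarantee here.

```python
import math


def glr_statistic(mean_a, mean_b, n_per_arm, sigma):
    """Generalized likelihood ratio statistic for two Gaussian arms with a
    known common standard deviation, after n_per_arm draws from each arm."""
    return n_per_arm * (mean_a - mean_b) ** 2 / (4.0 * sigma ** 2)


def ab_test(sample_a, sample_b, sigma, delta, max_rounds=100_000):
    """Sample both alternatives uniformly until the evidence is strong enough.

    sample_a and sample_b are zero-argument callables returning one outcome.
    Returns (winner, total_draws); winner is None if max_rounds is reached.
    """
    sum_a = sum_b = 0.0
    for n in range(1, max_rounds + 1):
        # Uniform sampling: one draw from each alternative per round.
        sum_a += sample_a()
        sum_b += sample_b()
        mean_a, mean_b = sum_a / n, sum_b / n
        # Heuristic threshold (an assumption for illustration only).
        threshold = math.log((math.log(2 * n) + 1.0) / delta)
        if glr_statistic(mean_a, mean_b, n, sigma) > threshold:
            return ("A" if mean_a > mean_b else "B"), 2 * n
    return None, 2 * max_rounds
```

For example, passing `lambda: random.gauss(0.5, 1.0)` and `lambda: random.gauss(0.0, 1.0)` as the two samplers with `sigma=1.0` simulates a test between alternatives whose means differ by 0.5. A smaller `delta` raises the threshold and therefore delays stopping.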

    Coding on countably infinite alphabets

    This paper describes universal lossless coding strategies for compressing sources on countably infinite alphabets. Classes of memoryless sources defined by an envelope condition on the marginal distribution provide benchmarks for coding techniques originating from the theory of universal coding over finite alphabets. We prove general upper bounds on minimax regret and lower bounds on minimax redundancy for such source classes. The general upper bounds emphasize the role of the Normalized Maximum Likelihood codes with respect to minimax regret in the infinite alphabet context. Lower bounds are derived by tailoring sharp bounds on the redundancy of Krichevsky-Trofimov coders for sources over finite alphabets. Up to logarithmic (resp. constant) factors, the bounds are matching for source classes defined by algebraically declining (resp. exponentially vanishing) envelopes. Effective and (almost) adaptive coding techniques are described for the collection of source classes defined by algebraically vanishing envelopes. Those results extend our knowledge concerning universal coding to contexts where the key tools from parametric inferenc