
    Two knowledge-based methods for High-Performance Sense Distribution Learning

    Knowing the correct distribution of senses within a corpus can potentially boost the performance of Word Sense Disambiguation (WSD) systems by many points. We present two fully automatic and language-independent methods for computing the distribution of senses given a raw corpus of sentences. Intrinsic and extrinsic evaluations show that our methods outperform the current state of the art in sense distribution learning and the strongest baselines for the most frequent sense in multiple languages and on domain-specific test sets. Our sense distributions are available at http://trainomatic.org

    Measuring complexity with zippers

    Physics concepts have often been borrowed and independently developed by other fields of science. A significant example is entropy in Information Theory. The aim of this paper is to provide a short, pedagogical introduction to the use of data compression techniques for estimating entropy and other relevant quantities in Information Theory and Algorithmic Information Theory. We consider in particular the LZ77 algorithm as a case study and discuss how a zipper can be used for information extraction. Comment: 10 pages, 3 figures
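
As an illustration of the idea in this abstract, the sketch below uses a general-purpose compressor to get a rough estimate of entropy per symbol. It is not the paper's procedure: it relies on Python's `zlib` (DEFLATE, which combines LZ77 with Huffman coding), and the function name `compressed_bits_per_symbol` and the three test sequences are assumptions made for this example. The compressed size only approximates the entropy rate, and on finite data it usually overestimates it.

```python
import random
import zlib


def compressed_bits_per_symbol(data: bytes, level: int = 9) -> float:
    """Compressed size in bits divided by the number of input bytes.

    Hypothetical helper: a crude proxy for the entropy rate of the source.
    """
    if not data:
        raise ValueError("data must be non-empty")
    return 8 * len(zlib.compress(data, level)) / len(data)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 200_000

# A perfectly periodic sequence: entropy rate close to 0 bits per symbol.
periodic = b"ab" * (n // 2)

# Independent symbols with P(a) = 0.9, P(b) = 0.1:
# true entropy is about 0.47 bits per symbol.
biased = bytes(rng.choices(b"ab", weights=[0.9, 0.1], k=n))

# Uniformly random bytes: 8 bits per symbol, essentially incompressible.
uniform = bytes(rng.getrandbits(8) for _ in range(n))

rate_periodic = compressed_bits_per_symbol(periodic)
rate_biased = compressed_bits_per_symbol(biased)
rate_uniform = compressed_bits_per_symbol(uniform)

for name, rate in [("periodic", rate_periodic),
                   ("biased", rate_biased),
                   ("uniform", rate_uniform)]:
    print(f"{name:>8}: {rate:.3f} bits/symbol")
```

The ordering of the three rates (periodic below biased below uniform) mirrors the ordering of the true entropy rates, which is the property a zipper-based estimate depends on.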

    Robust Estimators under the Imprecise Dirichlet Model

    Walley's Imprecise Dirichlet Model (IDM) for categorical data overcomes several fundamental problems from which other approaches to uncertainty suffer. Yet, to be useful in practice, one needs efficient ways of computing the imprecise (that is, robust) sets or intervals. The main objective of this work is to derive exact, conservative, and approximate robust and credible interval estimates under the IDM for a large class of statistical estimators, including the entropy and mutual information. Comment: 16 LaTeX pages
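
For orientation, the simplest robust interval under the IDM is the one for a single category probability: with observed counts n_i, total N, and hyperparameter s, the posterior expected probability of category i ranges over [n_i / (N + s), (n_i + s) / (N + s)]. The sketch below computes only these basic intervals. It does not reproduce the paper's estimators for entropy or mutual information, and the function name and the default s = 1 are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def idm_probability_intervals(
    counts: Sequence[int], s: float = 1.0
) -> List[Tuple[float, float]]:
    """Lower and upper posterior expected probabilities under the IDM.

    Hypothetical helper: for each category i with count n_i and total N,
    returns (n_i / (N + s), (n_i + s) / (N + s)).
    """
    if s <= 0:
        raise ValueError("s must be positive")
    if any(n < 0 for n in counts):
        raise ValueError("counts must be non-negative")
    total = sum(counts)
    return [(n / (total + s), (n + s) / (total + s)) for n in counts]


# Example: three categories observed 3, 1 and 0 times.
intervals = idm_probability_intervals([3, 1, 0], s=1.0)
for i, (low, high) in enumerate(intervals):
    print(f"category {i}: [{low:.2f}, {high:.2f}]")
```

Each interval has width s / (N + s), so it narrows as more data arrive, and an unobserved category still receives a nonzero upper probability.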

    Quantum query complexity of entropy estimation

    Estimation of Shannon and Rényi entropies of unknown discrete distributions is a fundamental problem in statistical property testing and an active research topic in both theoretical computer science and information theory. Tight bounds on the number of samples to estimate these entropies have been established in the classical setting, while little is known about their quantum counterparts. In this paper, we give the first quantum algorithms for estimating α-Rényi entropies (Shannon entropy being 1-Rényi entropy). In particular, we demonstrate a quadratic quantum speedup for Shannon entropy estimation and a generic quantum speedup for α-Rényi entropy estimation for all α ≥ 0, including a tight bound for the collision-entropy (2-Rényi entropy). We also provide quantum upper bounds for extreme cases such as the Hartley entropy (i.e., the logarithm of the support size of a distribution, corresponding to α = 0) and the min-entropy case (i.e., α = +∞), as well as the Kullback-Leibler divergence between two distributions. Moreover, we complement our results with quantum lower bounds on α-Rényi entropy estimation for all α ≥ 0. Comment: 43 pages, 1 figure
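
To make the family of entropies named in this abstract concrete, the sketch below evaluates the α-Rényi entropy of a fully known distribution, including the special cases α = 0 (Hartley), α = 1 (Shannon), α = 2 (collision) and α = +∞ (min-entropy). It is a plain classical computation from the standard definitions, not one of the paper's quantum estimation algorithms, and the function name `renyi_entropy` is an assumption made for this example.

```python
import math
from typing import Sequence


def renyi_entropy(p: Sequence[float], alpha: float, base: float = 2.0) -> float:
    """alpha-Renyi entropy of a known discrete distribution p.

    Hypothetical helper using the standard definitions:
      alpha = 0    -> log of the support size (Hartley entropy)
      alpha = 1    -> Shannon entropy (the limit alpha -> 1)
      alpha = inf  -> -log of the largest probability (min-entropy)
      otherwise    -> log(sum_i p_i^alpha) / (1 - alpha)
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    probs = [x for x in p if x > 0]  # zero-probability outcomes contribute nothing
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    if alpha == 0:
        return math.log(len(probs), base)
    if alpha == 1:
        return -sum(x * math.log(x, base) for x in probs)
    if math.isinf(alpha):
        return -math.log(max(probs), base)
    return math.log(sum(x ** alpha for x in probs), base) / (1 - alpha)


dist = [0.5, 0.25, 0.25]
h0 = renyi_entropy(dist, 0)
h1 = renyi_entropy(dist, 1)
h2 = renyi_entropy(dist, 2)
h_inf = renyi_entropy(dist, math.inf)

for label, value in [("Hartley (0)", h0), ("Shannon (1)", h1),
                     ("collision (2)", h2), ("min (+inf)", h_inf)]:
    print(f"{label:>14}: {value:.4f} bits")
```

For a fixed distribution the α-Rényi entropy is non-increasing in α, so the four printed values appear in descending order.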

    Distributional Property Testing in a Quantum World

    A fundamental problem in statistics and learning theory is to test properties of distributions. We show that quantum computers can solve such problems with significant speed-ups. We also introduce a novel access model for quantum distributions, enabling the coherent preparation of quantum samples, and propose a general framework that can naturally handle both classical and quantum distributions in a unified manner. Our framework generalizes and improves previous quantum algorithms for testing closeness between unknown distributions, testing independence between two distributions, and estimating the Shannon / von Neumann entropy of distributions. For classical distributions, our algorithms significantly improve the precision dependence of some earlier results. We also show that, in our framework, procedures for classical distributions can be directly lifted to the more general case of quantum distributions, and thus obtain the first speed-ups for testing properties of density operators that can be accessed coherently rather than only via sampling.