4,657 research outputs found

    Quantum learning: optimal classification of qubit states

    Pattern recognition is a central topic in Learning Theory, with numerous applications such as voice and text recognition, image analysis, and computer diagnosis. The statistical set-up in classification is the following: we are given an i.i.d. training set $(X_{1},Y_{1}),\dots,(X_{n},Y_{n})$, where $X_{i}$ represents a feature and $Y_{i}\in \{0,1\}$ is a label attached to that feature. The underlying joint distribution of $(X,Y)$ is unknown, but we can learn about it from the training set, and we aim at devising low-error classifiers $f:X\to Y$ used to predict the label of new incoming features. Here we solve a quantum analogue of this problem, namely the classification of two arbitrary unknown qubit states. Given a number of 'training' copies from each of the states, we would like to 'learn' about them by performing a measurement on the training set. The outcome is then used to design measurements for the classification of future systems with unknown labels. We find the asymptotically optimal classification strategy and show that, typically, it performs strictly better than a plug-in strategy based on state estimation. The figure of merit is the excess risk, which is the difference between the probability of error and the probability of error of the optimal measurement when the states are known, that is, the Helstrom measurement. We show that the excess risk has rate $n^{-1}$ and compute the exact constant of the rate. Comment: 24 pages, 4 figures
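
The benchmark named in this abstract, the error probability of the Helstrom measurement for two known states, can be computed in a few lines. The following is a minimal sketch, not the paper's learning strategy: it assumes the standard formula P_err = (1 - ||p0 ρ0 - p1 ρ1||_1) / 2, and the helper names `helstrom_error` and `qubit_state`, as well as the example Bloch vectors, are illustrative choices that do not come from the source.

```python
import numpy as np

def helstrom_error(rho0, rho1, p0=0.5):
    """Minimum error probability for discriminating two KNOWN states.

    Assumes the standard Helstrom formula with prior p0 on rho0.
    """
    gamma = p0 * rho0 - (1.0 - p0) * rho1
    # The trace norm of a Hermitian matrix is the sum of |eigenvalues|.
    trace_norm = np.abs(np.linalg.eigvalsh(gamma)).sum()
    return 0.5 * (1.0 - trace_norm)

def qubit_state(bloch):
    """Density matrix (I + r . sigma) / 2 for a Bloch vector r (hypothetical helper)."""
    x, y, z = bloch
    return 0.5 * np.array([[1 + z, x - 1j * y],
                           [x + 1j * y, 1 - z]])

# Illustrative states: |0><0| and |+><+|, equal priors.
rho0 = qubit_state((0.0, 0.0, 1.0))
rho1 = qubit_state((1.0, 0.0, 0.0))
print(helstrom_error(rho0, rho1))
```

The excess risk discussed in the abstract is the gap between the error of a classifier learned from training copies and this known-state value.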

    Asymptotic Bayes-optimality under sparsity of some multiple testing procedures

    Within a Bayesian decision-theoretic framework, we investigate some asymptotic optimality properties of a large class of multiple testing rules. A parametric setup is considered, in which observations come from a normal scale mixture model and the total loss is assumed to be the sum of losses for individual tests. Our model can be used for testing point null hypotheses, as well as to distinguish large signals from a multitude of very small effects. A rule is defined to be asymptotically Bayes optimal under sparsity (ABOS) if, within our chosen asymptotic framework, the ratio of its Bayes risk to that of the Bayes oracle (a rule which minimizes the Bayes risk) converges to one. Our main interest is in the asymptotic scheme where the proportion p of "true" alternatives converges to zero. We fully characterize the class of fixed threshold multiple testing rules which are ABOS, and hence derive conditions for the asymptotic optimality of rules controlling the Bayesian False Discovery Rate (BFDR). We finally provide conditions under which the popular Benjamini-Hochberg (BH) and Bonferroni procedures are ABOS and show that, for a wide class of sparsity levels, the threshold of the former can be approximated by a nonrandom threshold. Comment: Published at http://dx.doi.org/10.1214/10-AOS869 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
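
To make the two procedures named in this abstract concrete, here is a minimal sketch of the textbook Benjamini-Hochberg step-up rule and the Bonferroni rule applied to a vector of p-values. It does not reproduce the paper's normal scale mixture model or its ABOS analysis; the function names and the example p-values are assumptions made for illustration.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Textbook BH step-up rule: returns a boolean rejection mask."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    sorted_p = p[order]
    # The i-th smallest p-value is compared with alpha * i / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    below = np.nonzero(sorted_p <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size > 0:
        k = below[-1]                     # largest index passing its threshold
        reject[order[:k + 1]] = True      # reject that one and all smaller p-values
    return reject

def bonferroni(p_values, alpha=0.05):
    """Bonferroni rule: a fixed threshold alpha / m for every test."""
    p = np.asarray(p_values, dtype=float)
    return p <= alpha / p.size

# Made-up p-values, for illustration only.
p_example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_example).sum(), bonferroni(p_example).sum())
```

Bonferroni uses one fixed threshold, whereas the BH threshold depends on the data through the ordered p-values; the abstract's final claim concerns when that random threshold is close to a nonrandom one.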

    On the Bayes-optimality of F-measure maximizers

    The F-measure, originally introduced in information retrieval, is nowadays routinely used as a performance metric for problems such as binary classification, multi-label classification, and structured output prediction. Optimizing this measure is a statistically and computationally challenging problem, since no closed-form solution exists. Adopting a decision-theoretic perspective, this article provides a formal and experimental analysis of different approaches for maximizing the F-measure. We start with a Bayes-risk analysis of related loss functions, such as Hamming loss and subset zero-one loss, showing that optimizing such losses as a surrogate of the F-measure leads to a high worst-case regret. Subsequently, we perform a similar type of analysis for F-measure maximizing algorithms, showing that such algorithms are approximate, while relying on additional assumptions regarding the statistical distribution of the binary response variables. Furthermore, we present a new algorithm which is not only computationally efficient but also Bayes-optimal, regardless of the underlying distribution. To this end, the algorithm requires only a quadratic (with respect to the number of binary responses) number of parameters of the joint distribution. We illustrate the practical performance of all analyzed methods by means of experiments with multi-label classification problems.
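
The gap between Hamming-loss minimization and F-measure maximization can be seen on a toy example. The sketch below is a brute-force search over all 2^m predictions, which is exponential in m and is therefore not the efficient algorithm the abstract describes. The function names, the toy joint distribution, and the convention F = 1 when both label vectors are all zeros are assumptions made here for illustration.

```python
from itertools import product

def f_measure(y_true, y_pred, beta=1.0):
    """F_beta between two binary vectors."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    denom = beta**2 * sum(y_true) + sum(y_pred)
    # Assumed convention: F = 1 when both vectors contain no positives.
    if denom == 0:
        return 1.0
    return (1 + beta**2) * tp / denom

def brute_force_f_maximizer(joint, beta=1.0):
    """Prediction maximizing expected F under an explicit joint distribution.

    `joint` maps binary label tuples to probabilities. Exhaustive search,
    feasible only for very small m.
    """
    m = len(next(iter(joint)))
    best, best_val = None, -1.0
    for h in product((0, 1), repeat=m):
        val = sum(prob * f_measure(y, h, beta) for y, prob in joint.items())
        if val > best_val:
            best, best_val = h, val
    return best, best_val

# Toy joint distribution over two labels (illustrative only).
joint = {(1, 0): 0.4, (0, 1): 0.4, (0, 0): 0.2}
best, value = brute_force_f_maximizer(joint)
print(best, value)
```

In this toy distribution each label has marginal probability 0.4, so thresholding the marginals at 0.5 (the Hamming-loss minimizer) predicts (0, 0), while the expected-F maximizer found by the search is (1, 1).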