
    Low Complexity Sequential Probability Estimation and Universal Compression for Binary Sequences with Constrained Distributions

    Two low-complexity methods are proposed for sequential probability assignment for binary independent and identically distributed (i.i.d.) individual sequences whose empirical distributions have governing parameters known to lie within a limited interval. The methods apply to problems where fast, accurate estimation of the maximizing sequence probability is essential to minimizing some loss, including finance, learning, channel estimation and decoding, prediction, and universal compression. The application of the new methods to universal compression is studied, and their universal coding redundancies are analyzed. One method is shown to achieve the minimax redundancy within the inner region of the limited parameter interval; the other achieves better performance on the region boundaries and is numerically more robust to outliers. Simulation results support the analysis of both methods. While the non-asymptotic gains over standard methods that maximize the probability over the complete parameter simplex may be significant, the asymptotic gains are second order. These gains nevertheless translate to meaningful multiplicative-factor gains in other applications, such as financial ones. Moreover, the proposed methods generate estimators that remain within the given interval throughout the estimation process, a property essential to applications such as sequential binary channel crossover estimation. The results for the binary case lay the foundation for studying larger alphabets.
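A minimal sketch of constrained sequential probability assignment for a binary i.i.d. source: a Krichevsky–Trofimov-style add-1/2 estimator whose conditional estimate is clipped into a known interval. The clipping rule and function name are illustrative assumptions, not either of the paper's two proposed methods.

```python
def constrained_kt_probability(bits, lo=0.2, hi=0.8):
    """Sequential probability assigned to `bits`, with each
    conditional estimate constrained to the interval [lo, hi]."""
    ones = 0
    prob = 1.0
    for t, b in enumerate(bits):
        p_one = (ones + 0.5) / (t + 1.0)   # KT add-1/2 estimate
        p_one = min(max(p_one, lo), hi)    # enforce the parameter interval
        prob *= p_one if b == 1 else (1.0 - p_one)
        ones += b
    return prob
```

Because every conditional estimate is clipped before use, the estimator stays inside the interval at every step, matching the constraint discussed in the abstract.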

    Sequential anomaly detection in the presence of noise and limited feedback

    This paper describes a methodology for detecting anomalies from sequentially observed and potentially noisy data. The proposed approach consists of two main elements: (1) {\em filtering}, or assigning a belief or likelihood to each successive measurement based upon our ability to predict it from previous noisy observations, and (2) {\em hedging}, or flagging potential anomalies by comparing the current belief against a time-varying and data-adaptive threshold. The threshold is adjusted based on the available feedback from an end user. Our algorithms, which combine universal prediction with recent work on online convex programming, do not require computing posterior distributions given all current observations and involve simple primal-dual parameter updates. At the heart of the proposed approach lie exponential-family models which can be used in a wide variety of contexts and applications, and which yield methods that achieve sublinear per-round regret against both static and slowly varying product distributions with marginals drawn from the same exponential family. Moreover, the regret against static distributions coincides with the minimax value of the corresponding online strongly convex game. We also prove bounds on the number of mistakes made during the hedging step relative to the best offline choice of the threshold with access to all estimated beliefs and feedback signals. We validate the theory on synthetic data drawn from a time-varying distribution over binary vectors of high dimensionality, as well as on the Enron email dataset.

    Comment: 19 pages, 12 PDF figures; final version to be published in IEEE Transactions on Information Theory
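The hedging step can be sketched as follows: flag an anomaly when the current belief falls below the threshold, then adjust the threshold only on mistakes signaled by user feedback. The perceptron-style update rule and all names here are illustrative assumptions, not the paper's exact primal-dual scheme.

```python
def hedge_step(belief, threshold, label, eta=0.1):
    """One hedging round. `label` is the user feedback:
    1 = true anomaly, 0 = normal. Returns (flagged, new_threshold)."""
    flagged = belief < threshold
    if label == 1 and not flagged:
        threshold += eta   # missed anomaly: make flagging easier
    elif label == 0 and flagged:
        threshold -= eta   # false alarm: make flagging harder
    return flagged, threshold
```

Updating only on mistakes mirrors the abstract's mistake-bound analysis against the best offline threshold choice.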

    Entanglement purification of unknown quantum states

    A concern has been expressed that ``the Jaynes principle can produce fake entanglement'' [R. Horodecki et al., Phys. Rev. A {\bf 59}, 1799 (1999)]. In this paper we discuss the general problem of distilling maximally entangled states from $N$ copies of a bipartite quantum system about which only partial information is known, for instance in the form of a given expectation value. We point out that there is indeed a problem with applying the Jaynes principle of maximum entropy to more than one copy of a system, but the nature of this problem is classical and was discussed extensively by Jaynes. Under the additional assumption that the state $\rho^{(N)}$ of the $N$ copies of the quantum system is exchangeable, one can write down a simple general expression for $\rho^{(N)}$. We show how to modify two standard entanglement purification protocols, one-way hashing and recurrence, so that they can be applied to exchangeable states. We thus give an explicit algorithm for distilling entanglement from an unknown or partially known quantum state.

    Comment: 20 pages RevTeX 3.0 + 1 figure (encapsulated PostScript); submitted to Physical Review
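For context, the simple general expression available for an exchangeable state is the quantum de Finetti representation; the abstract does not spell it out, but a standard form (stated here as background, not necessarily the paper's exact notation) is:

```latex
\rho^{(N)} = \int d\rho \, P(\rho)\, \rho^{\otimes N},
```

where $P(\rho)$ is a probability density over single-copy density operators $\rho$, so the $N$ copies behave as an unknown i.i.d. preparation averaged over $P$.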

    Optimality of Universal Bayesian Sequence Prediction for General Loss and Alphabet

    Various optimality properties of universal sequence predictors based on Bayes mixtures in general, and Solomonoff's prediction scheme in particular, are studied. The probability of observing $x_t$ at time $t$, given past observations $x_1 \ldots x_{t-1}$, can be computed with the chain rule if the true generating distribution $\mu$ of the sequences $x_1 x_2 x_3 \ldots$ is known. If $\mu$ is unknown, but known to belong to a countable or continuous class $\mathcal{M}$, one can base one's prediction on the Bayes mixture $\xi$, defined as a $w_\nu$-weighted sum or integral of distributions $\nu \in \mathcal{M}$. The cumulative expected loss of the Bayes-optimal universal prediction scheme based on $\xi$ is shown to be close to the loss of the Bayes-optimal, but infeasible, prediction scheme based on $\mu$. We show that the bounds are tight and that no other predictor can lead to significantly smaller bounds. Furthermore, for various performance measures, we show Pareto-optimality of $\xi$ and give an Occam's razor argument that the choice $w_\nu \sim 2^{-K(\nu)}$ for the weights is optimal, where $K(\nu)$ is the length of the shortest program describing $\nu$. The results are applied to games of chance, defined as a sequence of bets, observations, and rewards. The prediction schemes (and bounds) are compared to the popular predictors based on expert advice. Extensions to infinite alphabets, partial, delayed and probabilistic prediction, classification, and more active systems are briefly discussed.

    Comment: 34 pages
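A minimal sketch of a Bayes-mixture predictor $\xi$ over a small countable class, in the spirit of the abstract's $w_\nu$-weighted sum: here the class is a toy set of Bernoulli parameters with uniform weights, an illustrative assumption rather than a Solomonoff-style prior.

```python
def bayes_mixture_predict(bits, models=(0.25, 0.5, 0.75),
                          weights=(1 / 3, 1 / 3, 1 / 3)):
    """Mixture probability that the next bit is 1, given prefix `bits`:
    each Bernoulli model is reweighted by its likelihood of the prefix."""
    scored = []
    for theta, w in zip(models, weights):
        like = w
        for b in bits:
            like *= theta if b == 1 else (1.0 - theta)
        scored.append((theta, like))
    z = sum(like for _, like in scored)          # normalizer
    return sum(theta * like for theta, like in scored) / z
```

As the prefix grows, the posterior weight concentrates on the model closest to the true parameter, which is the mechanism behind the loss bounds the abstract describes.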