468 research outputs found

    Online Learning of Quantum States

    Full text link
    Suppose we have many copies of an unknown nn-qubit state ρ\rho. We measure some copies of ρ\rho using a known two-outcome measurement E1E_{1}, then other copies using a measurement E2E_{2}, and so on. At each stage tt, we generate a current hypothesis σt\sigma_{t} about the state ρ\rho, using the outcomes of the previous measurements. We show that it is possible to do this in a way that guarantees that Tr(Eiσt)Tr(Eiρ)|\operatorname{Tr}(E_{i} \sigma_{t}) - \operatorname{Tr}(E_{i}\rho) |, the error in our prediction for the next measurement, is at least ε\varepsilon at most O ⁣(n/ε2)\operatorname{O}\!\left(n / \varepsilon^2 \right) times. Even in the "non-realizable" setting---where there could be arbitrary noise in the measurement outcomes---we show how to output hypothesis states that do significantly worse than the best possible states at most O ⁣(Tn)\operatorname{O}\!\left(\sqrt {Tn}\right) times on the first TT measurements. These results generalize a 2007 theorem by Aaronson on the PAC-learnability of quantum states, to the online and regret-minimization settings. We give three different ways to prove our results---using convex optimization, quantum postselection, and sequential fat-shattering dimension---which have different advantages in terms of parameters and portability.Comment: 18 page

    The Shannon Cipher System with a Guessing Wiretapper: General Sources

    Full text link
    The Shannon cipher system is studied in the context of general sources using a notion of computational secrecy introduced by Merhav & Arikan. Bounds are derived on limiting exponents of guessing moments for general sources. The bounds are shown to be tight for iid, Markov, and unifilar sources, thus recovering some known results. A close relationship between error exponents and correct decoding exponents for fixed rate source compression on the one hand and exponents for guessing moments on the other hand is established.Comment: 24 pages, Submitted to IEEE Transactions on Information Theor

    On empirical cumulant generating functions of code lengths for individual sequences

    Full text link
    We consider the problem of lossless compression of individual sequences using finite-state (FS) machines, from the perspective of the best achievable empirical cumulant generating function (CGF) of the code length, i.e., the normalized logarithm of the empirical average of the exponentiated code length. Since the probabilistic CGF is minimized in terms of the R\'enyi entropy of the source, one of the motivations of this study is to derive an individual-sequence analogue of the R\'enyi entropy, in the same way that the FS compressibility is the individual-sequence counterpart of the Shannon entropy. We consider the CGF of the code-length both from the perspective of fixed-to-variable (F-V) length coding and the perspective of variable-to-variable (V-V) length coding, where the latter turns out to yield a better result, that coincides with the FS compressibility. We also extend our results to compression with side information, available at both the encoder and decoder. In this case, the V-V version no longer coincides with the FS compressibility, but results in a different complexity measure.Comment: 15 pages; submitted for publicatio

    A Multi-Plane Block-Coordinate Frank-Wolfe Algorithm for Training Structural SVMs with a Costly max-Oracle

    Full text link
    Structural support vector machines (SSVMs) are amongst the best performing models for structured computer vision tasks, such as semantic image segmentation or human pose estimation. Training SSVMs, however, is computationally costly, because it requires repeated calls to a structured prediction subroutine (called \emph{max-oracle}), which has to solve an optimization problem itself, e.g. a graph cut. In this work, we introduce a new algorithm for SSVM training that is more efficient than earlier techniques when the max-oracle is computationally expensive, as it is frequently the case in computer vision tasks. The main idea is to (i) combine the recent stochastic Block-Coordinate Frank-Wolfe algorithm with efficient hyperplane caching, and (ii) use an automatic selection rule for deciding whether to call the exact max-oracle or to rely on an approximate one based on the cached hyperplanes. We show experimentally that this strategy leads to faster convergence to the optimum with respect to the number of requires oracle calls, and that this translates into faster convergence with respect to the total runtime when the max-oracle is slow compared to the other steps of the algorithm. A publicly available C++ implementation is provided at http://pub.ist.ac.at/~vnk/papers/SVM.html

    Construction of a Large Class of Deterministic Sensing Matrices that Satisfy a Statistical Isometry Property

    Full text link
    Compressed Sensing aims to capture attributes of kk-sparse signals using very few measurements. In the standard Compressed Sensing paradigm, the \m\times \n measurement matrix \A is required to act as a near isometry on the set of all kk-sparse signals (Restricted Isometry Property or RIP). Although it is known that certain probabilistic processes generate \m \times \n matrices that satisfy RIP with high probability, there is no practical algorithm for verifying whether a given sensing matrix \A has this property, crucial for the feasibility of the standard recovery algorithms. In contrast this paper provides simple criteria that guarantee that a deterministic sensing matrix satisfying these criteria acts as a near isometry on an overwhelming majority of kk-sparse signals; in particular, most such signals have a unique representation in the measurement domain. Probability still plays a critical role, but it enters the signal model rather than the construction of the sensing matrix. We require the columns of the sensing matrix to form a group under pointwise multiplication. The construction allows recovery methods for which the expected performance is sub-linear in \n, and only quadratic in \m; the focus on expected performance is more typical of mainstream signal processing than the worst-case analysis that prevails in standard Compressed Sensing. Our framework encompasses many families of deterministic sensing matrices, including those formed from discrete chirps, Delsarte-Goethals codes, and extended BCH codes.Comment: 16 Pages, 2 figures, to appear in IEEE Journal of Selected Topics in Signal Processing, the special issue on Compressed Sensin

    Universal Codes from Switching Strategies

    Get PDF
    We discuss algorithms for combining sequential prediction strategies, a task which can be viewed as a natural generalisation of the concept of universal coding. We describe a graphical language based on Hidden Markov Models for defining prediction strategies, and we provide both existing and new models as examples. The models include efficient, parameterless models for switching between the input strategies over time, including a model for the case where switches tend to occur in clusters, and finally a new model for the scenario where the prediction strategies have a known relationship, and where jumps are typically between strongly related ones. This last model is relevant for coding time series data where parameter drift is expected. As theoretical ontributions we introduce an interpolation construction that is useful in the development and analysis of new algorithms, and we establish a new sophisticated lemma for analysing the individual sequence regret of parameterised models

    Sequential Predictions based on Algorithmic Complexity

    Get PDF
    This paper studies sequence prediction based on the monotone Kolmogorov complexity Km=-log m, i.e. based on universal deterministic/one-part MDL. m is extremely close to Solomonoff's universal prior M, the latter being an excellent predictor in deterministic as well as probabilistic environments, where performance is measured in terms of convergence of posteriors or losses. Despite this closeness to M, it is difficult to assess the prediction quality of m, since little is known about the closeness of their posteriors, which are the important quantities for prediction. We show that for deterministic computable environments, the "posterior" and losses of m converge, but rapid convergence could only be shown on-sequence; the off-sequence convergence can be slow. In probabilistic environments, neither the posterior nor the losses converge, in general.Comment: 26 pages, LaTe

    Offline to Online Conversion

    Full text link
    We consider the problem of converting offline estimators into an online predictor or estimator with small extra regret. Formally this is the problem of merging a collection of probability measures over strings of length 1,2,3,... into a single probability measure over infinite sequences. We describe various approaches and their pros and cons on various examples. As a side-result we give an elementary non-heuristic purely combinatoric derivation of Turing's famous estimator. Our main technical contribution is to determine the computational complexity of online estimators with good guarantees in general.Comment: 20 LaTeX page
    corecore