
    Coresets-Methods and History: A Theoretician's Design Pattern for Approximation and Streaming Algorithms

    We present a technical survey of state-of-the-art approaches to data reduction and the coreset framework, including geometric decompositions, gradient methods, random sampling, sketching, and random projections. We further outline their importance for the design of streaming algorithms and give a brief overview of lower-bounding techniques.
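
    As a concrete illustration of the sampling-based techniques surveyed above, the sketch below builds an importance-sampled coreset for k-means in the style of so-called lightweight coresets; it is not a construction taken from this survey, and the function name and parameter choices are our own.

        import numpy as np

        def lightweight_coreset(X, m, seed=None):
            """Sample a weighted coreset of m points from X (an n x d array).

            Sampling probabilities blend uniform mass with each point's squared
            distance to the dataset mean; the inverse-probability weights make
            weighted sums over the coreset approximate full-data sums.
            """
            rng = np.random.default_rng(seed)
            n = len(X)
            sq_dist = np.sum((X - X.mean(axis=0)) ** 2, axis=1)
            p = 0.5 / n + 0.5 * sq_dist / sq_dist.sum()
            idx = rng.choice(n, size=m, p=p)
            return X[idx], 1.0 / (m * p[idx])

        # Toy usage: the weighted k-means cost on (C, w) tracks the full-data cost.
        X = np.random.default_rng(0).normal(size=(10_000, 2))
        C, w = lightweight_coreset(X, 200)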

    Approximate Learning of Limit-Average Automata

    Limit-average automata are weighted automata on infinite words that use the average to aggregate the weights seen in infinite runs. We study approximate learning problems for limit-average automata in two settings: passive and active. In the passive learning case, we show that limit-average automata are not PAC-learnable, as samples must be of exponential size to provide (with good probability) enough detail to learn an automaton. We also show that the problem of finding an automaton that fits a given sample is NP-complete. In the active learning case, we show that limit-average automata can be learned almost-exactly, i.e., we can learn in polynomial time an automaton that is consistent with the target automaton on almost all words. On the other hand, we show that the problem of learning an automaton that approximates the target automaton (with perhaps fewer states) is NP-complete. The above results are shown for the uniform distribution on words; we briefly discuss learning over different distributions.
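
    For ultimately periodic ("lasso") words of the form u·v^ω, the limit-average value of a run can be computed exactly, which is the kind of evaluation that consistency checks against a sample rely on. The sketch below is our own illustration, with a deterministic automaton encoded as a transition table of our own design; nothing here is code from the paper.

        def limit_average(trans, q0, u, v):
            """Limit-average value of the lasso word u·v^ω on a deterministic
            weighted automaton, trans[(state, letter)] = (next_state, weight)."""
            def run(q, word):
                total = 0.0
                for a in word:
                    q, w = trans[(q, a)]
                    total += w
                return q, total

            q, _ = run(q0, u)         # prefix weights vanish in the limit average
            seen, cum, i = {}, 0.0, 0
            while q not in seen:      # read v-blocks until the run starts cycling
                seen[q] = (i, cum)
                q, w = run(q, v)
                cum, i = cum + w, i + 1
            j, cum_j = seen[q]        # blocks j..i-1 form the cycle
            return (cum - cum_j) / ((i - j) * len(v))

        # Two-state example over {a, b}: weight 0 on 'a', weight 1 on 'b'.
        trans = {("p", "a"): ("p", 0.0), ("p", "b"): ("q", 1.0),
                 ("q", "a"): ("p", 0.0), ("q", "b"): ("q", 1.0)}
        print(limit_average(trans, "p", "ab", "ba"))  # 0.5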

    Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning

    Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures, and they are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply to both the supervised and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard; we therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk. Learning from data is central to contemporary computational linguistics. It is common in such learning to estimate a model in a parametric family using the maximum likelihood principle. This principle applies in the supervised case (i.e., using annotated data).
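
    In the supervised setting, empirical risk minimization under the log-loss reduces to relative-frequency (maximum-likelihood) estimation of rule probabilities. The sketch below illustrates this reduction on a toy treebank; the data encoding and function names are our own, not the paper's.

        import math
        from collections import Counter, defaultdict

        def estimate_grammar(derivations):
            """ERM with log-loss for a probabilistic grammar: count each rule
            and normalize per left-hand side. A derivation is a list of
            (lhs, rhs) rule applications."""
            counts = Counter(r for d in derivations for r in d)
            totals = defaultdict(int)
            for (lhs, _), c in counts.items():
                totals[lhs] += c
            return {rule: c / totals[rule[0]] for rule, c in counts.items()}

        def empirical_risk(probs, derivations):
            """Average log-loss (negative log-likelihood per derivation)."""
            return -sum(math.log(probs[r]) for d in derivations for r in d) / len(derivations)

        trees = [[("S", ("NP", "VP")), ("NP", ("she",)), ("VP", ("runs",))],
                 [("S", ("NP", "VP")), ("NP", ("he",)), ("VP", ("runs",))]]
        probs = estimate_grammar(trees)
        print(probs[("NP", ("she",))], empirical_risk(probs, trees))  # 0.5, log 2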

    The learnability of unknown quantum measurements

    In this work, we provide an elegant framework for analyzing the learning of matrices in the Schatten classes by taking advantage of a recently developed methodology: matrix concentration inequalities. We establish the fat-shattering dimension, Rademacher/Gaussian complexity, and entropy number for learning bounded operators and trace-class operators. By casting the tasks of learning quantum states and two-outcome quantum measurements as learning matrices in the Schatten-1 and Schatten-∞ classes, our proposed approach directly solves the sample complexity problems of learning quantum states and quantum measurements. Our main result is that, for learning an unknown quantum measurement, the upper bound, given by the fat-shattering dimension, is linearly proportional to the dimension of the underlying Hilbert space. Learning an unknown quantum state becomes a dual problem to ours, and as a byproduct we can recover Aaronson's famous result [Proc. R. Soc. A 463, 3089–3144 (2007)] solely using a classical machine learning technique. In addition, other well-known complexity measures, such as covering numbers and Rademacher/Gaussian complexities, are derived explicitly within the same framework. We are able to connect these measures of sample complexity with various areas of quantum information science, e.g., quantum state/measurement tomography, quantum state discrimination, and quantum random access codes, which may be of independent interest. Lastly, with the assistance of a general Bloch-sphere representation, we show that learning quantum measurements/states can be mathematically formulated as a neural network; consequently, classical ML algorithms can be applied to efficiently accomplish the two quantum learning tasks.
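
    The neural-network formulation is easiest to see in the single-qubit, two-outcome case: an effect E = aI + b·σ assigns a state with Bloch vector r the outcome probability Tr(Eρ) = a + b·r, so the unknown measurement can be learned by plain linear regression on (Bloch vector, outcome frequency) pairs. The sketch below is our own illustration of this special case, with hypothetical target parameters.

        import numpy as np

        rng = np.random.default_rng(1)
        a_true, b_true = 0.5, np.array([0.3, 0.1, 0.2])   # hypothetical effect, 0 <= E <= I

        R = rng.normal(size=(500, 3))                     # random training states
        R /= np.maximum(np.linalg.norm(R, axis=1, keepdims=True), 1.0)  # valid Bloch vectors
        y = rng.binomial(50, a_true + R @ b_true) / 50    # 50 measurement shots per state

        A = np.hstack([np.ones((len(R), 1)), R])          # design matrix [1, r]
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)      # least-squares fit of (a, b)
        print(coef[0], coef[1:])                          # close to a_true, b_true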

    Sampling-based Approximations with Quantitative Performance for the Probabilistic Reach-Avoid Problem over General Markov Processes

    This article deals with stochastic processes endowed with the Markov (memoryless) property and evolving over general (uncountable) state spaces. The models further depend on a non-deterministic quantity in the form of a control input, which can be selected to affect the probabilistic dynamics. We address the computation of maximal reach-avoid specifications, together with the synthesis of the corresponding optimal controllers. The reach-avoid specification deals with assessing the likelihood that any finite-horizon trajectory of the model enters a given goal set while avoiding a given set of undesired states. This article provides a new approximate computational scheme for the reach-avoid specification based on the Fitted Value Iteration algorithm, which hinges on random sample extractions, and gives a priori computable formal probabilistic bounds on the error made by the approximation algorithm; as such, the output of the numerical scheme is quantitatively assessed and thus meaningful for safety-critical applications. Furthermore, we provide tighter, sample-based probabilistic error bounds. The overall computational scheme is compared with alternative approximation algorithms from the literature, and its performance is practically assessed over a benchmark case study.
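
    The sketch below is our own minimal illustration of sample-based fitted value iteration for the reach-avoid recursion V_k(x) = max_u E[1_G(x') + 1_{A\G}(x') V_{k+1}(x')], on a hypothetical 1-D controlled system with our own feature map and discretization; the article's scheme and its formal error bounds are considerably more general.

        import numpy as np

        # Hypothetical 1-D system x' = x + u + w, w ~ N(0, 0.1^2);
        # goal set G = [0.9, 1.1], safe set A = [-2, 2], three actions.
        rng = np.random.default_rng(0)
        in_goal = lambda x: np.abs(x - 1.0) <= 0.1
        in_safe = lambda x: np.abs(x) <= 2.0
        actions, horizon, n_mc = [-0.5, 0.0, 0.5], 5, 64

        feats = lambda x: np.vander(x, 6)     # degree-5 polynomial features
        xs = np.linspace(-2.0, 2.0, 200)      # sampled states covering the safe set
        theta = np.zeros(6)                   # value beyond the horizon is 0

        for _ in range(horizon):              # backward recursion with sampled expectations
            targets = np.empty(len(xs))
            for i, x in enumerate(xs):
                best = 0.0
                for u in actions:
                    xn = x + u + 0.1 * rng.normal(size=n_mc)
                    v = np.clip(feats(xn) @ theta, 0.0, 1.0)
                    val = np.where(in_goal(xn), 1.0, np.where(in_safe(xn), v, 0.0))
                    best = max(best, val.mean())
                targets[i] = best
            theta, *_ = np.linalg.lstsq(feats(xs), targets, rcond=None)  # refit V_k

        print(np.clip(feats(np.array([0.5])) @ theta, 0.0, 1.0)[0])  # reach-avoid prob. from x = 0.5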