638 research outputs found

    On the Value of Partial Information for Learning from Examples

    Get PDF
    The PAC model of learning and its extension to real-valued function classes provides a well-accepted theoretical framework for representing the problem of learning a target function $g(x)$ using a random sample $\{(x_i, g(x_i))\}_{i=1}^m$. Based on the uniform strong law of large numbers, the PAC model establishes the sample complexity, i.e., the sample size $m$ which is sufficient for accurately estimating the target function to within high confidence. Often, in addition to a random sample, some form of prior knowledge is available about the target. It is intuitive that increasing the amount of information should have the same effect on the error as increasing the sample size. But quantitatively, how does the rate of error with respect to increasing information compare to the rate of error with increasing sample size? To answer this we consider a new approach based on a combination of the information-based complexity of Traub et al. and Vapnik–Chervonenkis (VC) theory. In contrast to VC theory, where function classes of finite pseudo-dimension are used only for statistical estimation, we let such classes play a dual role of functional estimation as well as approximation. This is captured in a newly introduced quantity, $\rho_d(F)$, which represents a nonlinear width of a function class $F$. We then extend the notion of the $n$-th minimal radius of information and define a quantity $I_{n,d}(F)$ which measures the minimal approximation error of the worst-case target $g \in F$ by the family of function classes having pseudo-dimension $d$, given partial information on $g$ consisting of values taken by $n$ linear operators. The error rates are calculated, which leads to a quantitative notion of the value of partial information for the paradigm of learning from examples.
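
    The two quantities introduced above can be sketched schematically. The first is a nonlinear width measured against classes of bounded pseudo-dimension; the second extends the classical $n$-th minimal radius of information from information-based complexity. The display below is a hedged reconstruction from the abstract's description (the norm and the exact nesting of the optimizations are fixed in the paper itself), not a verbatim quotation:

    % Nonlinear width: the best worst-case approximation of F by some class H_d
    % of pseudo-dimension at most d.
    \[
      \rho_d(F) \;=\; \inf_{H_d:\ \operatorname{pdim}(H_d)\le d}\ \sup_{g\in F}\ \inf_{h\in H_d}\ \|g-h\|.
    \]
    % Classical n-th minimal radius of information: the best worst-case error
    % obtainable from n linear measurements Ng = (L_1 g, ..., L_n g) alone.
    \[
      r_n(F) \;=\; \inf_{N=(L_1,\dots,L_n)}\ \sup_{g\in F}\ \operatorname{rad}\{\, f\in F : Nf = Ng \,\}.
    \]
    % I_{n,d}(F) combines the two: the Chebyshev radius above is replaced by the
    % error of approximating the information-consistent targets by a class of
    % pseudo-dimension d, so that n measures the prior partial information and
    % d the capacity of the sample-based estimator.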

    Theoretical Interpretations and Applications of Radial Basis Function Networks

    Get PDF
    Medical applications have usually treated Radial Basis Function Networks simply as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several ways: as Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, or Instance-Based Learners. A survey of these interpretations and of their corresponding learning algorithms is provided, as well as a brief survey of dynamic learning algorithms. The interpretations of RBFNs can suggest applications that are particularly interesting in medical domains.
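
    Whichever interpretation is adopted, the shared computational core is a weighted sum of radial kernels centred on prototype points. The following minimal sketch assumes Gaussian basis functions, centres picked by random subsampling of the training inputs, and output weights fitted by ridge-regularized least squares; it illustrates that core only and is not one of the surveyed learning algorithms.

    import numpy as np

    def fit_rbfn(X, y, n_centers=20, width=1.0, ridge=1e-6, rng=None):
        """Fit a minimal RBF network: Gaussian units plus a linear output layer.

        Centres are a random subsample of the training inputs (an assumption of this
        sketch); output weights come from ridge-regularized least squares.
        """
        rng = np.random.default_rng(rng)
        centers = X[rng.choice(len(X), size=min(n_centers, len(X)), replace=False)]
        # Design matrix of Gaussian activations: Phi[i, j] = exp(-||x_i - c_j||^2 / (2 w^2))
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        Phi = np.exp(-d2 / (2.0 * width ** 2))
        # Ridge-regularized normal equations for the output weights
        w = np.linalg.solve(Phi.T @ Phi + ridge * np.eye(Phi.shape[1]), Phi.T @ y)
        return centers, w

    def predict_rbfn(X, centers, w, width=1.0):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-d2 / (2.0 * width ** 2)) @ w

    # Toy usage: regress a noisy 1-D sine curve
    X = np.linspace(-3, 3, 200)[:, None]
    y = np.sin(X[:, 0]) + 0.1 * np.random.default_rng(0).normal(size=200)
    centers, w = fit_rbfn(X, y, n_centers=15, width=0.5, rng=0)
    print(np.abs(predict_rbfn(X, centers, w, width=0.5) - np.sin(X[:, 0])).mean())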

    High-Dimensional Function Approximation: Breaking the Curse with Monte Carlo Methods

    Get PDF
    In this dissertation we study the tractability of the information-based complexity $n(\varepsilon,d)$ for $d$-variate function approximation problems. In the deterministic setting, for many unweighted problems the curse of dimensionality holds, that is, for some fixed error tolerance $\varepsilon>0$ the complexity $n(\varepsilon,d)$ grows exponentially in $d$. For integration problems one can usually break the curse with the standard Monte Carlo method. For function approximation problems, however, similar effects of randomization have been unknown so far. The thesis contains results on three more or less stand-alone topics. For an extended five-page abstract, see the section "Introduction and Results". Chapter 2 is concerned with lower bounds for the Monte Carlo error for general linear problems via Bernstein numbers. This technique is applied to the $L_\infty$-approximation of certain classes of $C^\infty$-functions, where it turns out that randomization does not affect the tractability classification of the problem. Chapter 3 studies the $L_\infty$-approximation of functions from Hilbert spaces with methods that may use arbitrary linear functionals as information. For certain classes of periodic functions from unweighted periodic tensor product spaces, in particular Korobov spaces, we observe the curse of dimensionality in the deterministic setting, while with randomized methods we achieve polynomial tractability. Chapter 4 deals with the $L_1$-approximation of monotone functions via function values. It is known that this problem suffers from the curse in the deterministic setting. An improved lower bound shows that the problem is still intractable in the randomized setting. However, Monte Carlo breaks the curse; in detail, for any fixed error tolerance $\varepsilon>0$ the complexity $n(\varepsilon,d)$ grows exponentially in $\sqrt{d}$ only.
    Comment: This is the author's submitted PhD thesis, still in the referee process.
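
    For reference, the tractability notions behind these statements can be written generically in the usual information-based-complexity terminology; this is a standard sketch, not a quotation from the thesis:

    % Curse of dimensionality: the complexity is bounded below by a growing exponential in d.
    \[
      \exists\,\varepsilon>0,\ c>0,\ \gamma>1:\qquad n(\varepsilon,d)\ \ge\ c\,\gamma^{\,d}\ \ \text{for infinitely many } d.
    \]
    % Polynomial tractability: the complexity is polynomially bounded in 1/eps and d.
    \[
      \exists\,C,\,p,\,q\ge 0:\qquad n(\varepsilon,d)\ \le\ C\,\varepsilon^{-p}\,d^{\,q}\ \ \text{for all } \varepsilon\in(0,1),\ d\in\mathbb{N}.
    \]
    % The Chapter 4 result sits in between: the curse is broken, but the randomized
    % complexity still grows like exp(c_eps * sqrt(d)), faster than any polynomial in d.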

    The learnability of unknown quantum measurements

    Full text link
    © Rinton Press. In this work, we provide an elegant framework to analyze learning matrices in the Schatten class by taking advantage of a recently developed methodology: matrix concentration inequalities. We establish the fat-shattering dimension, Rademacher/Gaussian complexity, and the entropy number of learning bounded operators and trace-class operators. By characterising the tasks of learning quantum states and two-outcome quantum measurements as learning matrices in the Schatten 1- and ∞-classes, our proposed approach directly solves the sample complexity problems of learning quantum states and quantum measurements. Our main result is that, for learning an unknown quantum measurement, the upper bound, given by the fat-shattering dimension, is linearly proportional to the dimension of the underlying Hilbert space. Learning an unknown quantum state becomes a dual problem to ours, and as a byproduct, we can recover Aaronson’s famous result [Proc. R. Soc. A 463, 3089–3144 (2007)] solely using a classical machine learning technique. In addition, other famous complexity measures such as covering numbers and Rademacher/Gaussian complexities are derived explicitly under the same framework. We are able to connect measures of sample complexity with various areas in quantum information science, e.g., quantum state/measurement tomography, quantum state discrimination, and quantum random access codes, which may be of independent interest. Lastly, with the assistance of the general Bloch-sphere representation, we show that learning quantum measurements/states can be mathematically formulated as a neural network. Consequently, classical ML algorithms can be applied to efficiently accomplish these two quantum learning tasks.
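
    The closing claim, that learning a two-outcome measurement reduces to fitting a model on classical data, can be illustrated with a toy sketch. Everything below is an assumption of the illustration rather than the paper's construction: random mixed states on a 4-dimensional Hilbert space, a fixed effect operator E with 0 ≤ E ≤ I, noiseless outcome probabilities tr(Eρ), and an ordinary least-squares fit over real Hilbert–Schmidt features.

    import numpy as np

    rng = np.random.default_rng(0)
    dim = 4          # Hilbert-space dimension (assumption: two qubits)
    n_train = 500

    def random_density_matrix(d, rng):
        A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
        rho = A @ A.conj().T
        return rho / np.trace(rho).real

    def random_effect(d, rng):
        """A random two-outcome measurement effect E with 0 <= E <= I."""
        H = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
        H = (H + H.conj().T) / 2
        vals, vecs = np.linalg.eigh(H)
        vals = (vals - vals.min()) / (vals.max() - vals.min())  # rescale spectrum into [0, 1]
        return (vecs * vals) @ vecs.conj().T

    def features(rho):
        """Real feature vector so that tr(E rho) is linear in it (Hilbert-Schmidt pairing)."""
        return np.concatenate([rho.real.ravel(), rho.imag.ravel()])

    E_true = random_effect(dim, rng)
    states = [random_density_matrix(dim, rng) for _ in range(n_train)]
    X = np.stack([features(r) for r in states])
    y = np.array([np.trace(E_true @ r).real for r in states])

    # Least-squares fit of the linear map rho -> tr(E rho); purely classical ML.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Evaluate on fresh states: predicted vs. true outcome probabilities.
    test = [random_density_matrix(dim, rng) for _ in range(100)]
    pred = np.array([features(r) @ w for r in test])
    true = np.array([np.trace(E_true @ r).real for r in test])
    print("max |pred - true| on held-out states:", np.abs(pred - true).max())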

    Boolean functions: noise stability, non-interactive correlation distillation, and mutual information

    Full text link
    Let $T_\epsilon$ be the noise operator acting on Boolean functions $f:\{0,1\}^n\to\{0,1\}$, where $\epsilon\in[0,1/2]$ is the noise parameter. Given $\alpha>1$ and fixed mean $\mathbb{E}f$, which Boolean function $f$ has the largest $\alpha$-th moment $\mathbb{E}(T_\epsilon f)^\alpha$? This question has close connections with noise stability of Boolean functions, the problem of non-interactive correlation distillation, and Courtade-Kumar's conjecture on the most informative Boolean function. In this paper, we characterize maximizers in some extremal settings, such as low noise ($\epsilon=\epsilon(n)$ is close to 0), high noise ($\epsilon=\epsilon(n)$ is close to 1/2), as well as when $\alpha=\alpha(n)$ is large. Analogous results are also established in more general contexts, such as Boolean functions defined on the discrete torus $(\mathbb{Z}/p\mathbb{Z})^n$ and the problem of noise stability in a tree model.
    Comment: Corrections of some inaccuracies.
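
    The quantity $\mathbb{E}(T_\epsilon f)^\alpha$ is easy to evaluate by brute force for small $n$, which gives quick intuition about which functions maximize it. The snippet below assumes the standard definition of the noise operator, $T_\epsilon f(x)=\sum_y \epsilon^{d(x,y)}(1-\epsilon)^{n-d(x,y)} f(y)$ with $d$ the Hamming distance, and compares a dictator with majority at equal mean; it is an illustrative check, not material from the paper.

    import itertools
    import numpy as np

    def noise_moment(f_table, eps, alpha):
        """Compute E_x[(T_eps f)(x)^alpha] by brute force for f: {0,1}^n -> {0,1}.

        f_table: length-2^n array with f evaluated on all inputs in lexicographic order.
        """
        n = int(np.log2(len(f_table)))
        points = np.array(list(itertools.product([0, 1], repeat=n)))
        # Hamming distances between all pairs of inputs
        dist = (points[:, None, :] != points[None, :, :]).sum(axis=-1)
        kernel = eps ** dist * (1 - eps) ** (n - dist)      # P(y | x), rows sum to 1
        Tf = kernel @ np.asarray(f_table, dtype=float)      # smoothed function T_eps f
        return (Tf ** alpha).mean()                         # expectation over uniform x

    # Two functions with the same mean 1/2 on n = 3 bits: a dictator f(x) = x_1 vs. majority.
    n = 3
    points = list(itertools.product([0, 1], repeat=n))
    dictator = [x[0] for x in points]
    majority = [int(sum(x) > n / 2) for x in points]
    for name, f in [("dictator", dictator), ("majority", majority)]:
        print(name, noise_moment(f, eps=0.1, alpha=2.0))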