On the Value of Partial Information for Learning from Examples
The PAC model of learning and its extension to real-valued function classes provides a well-accepted theoretical framework for representing the problem of learning a target function g(x) using a random sample {(x_i, g(x_i))}_{i=1..m}. Based on the uniform strong law of large numbers, the PAC model establishes the sample complexity, i.e., the sample size m which is sufficient for accurately estimating the target function to within high confidence. Often, in addition to a random sample, some form of prior knowledge is available about the target. It is intuitive that increasing the amount of information should have the same effect on the error as increasing the sample size. But quantitatively, how does the rate of error with respect to increasing information compare to the rate of error with increasing sample size? To answer this we consider a new approach based on a combination of the information-based complexity of Traub et al. and Vapnik–Chervonenkis (VC) theory. In contrast to VC theory, where function classes of finite pseudo-dimension are used only for statistical estimation, we let such classes play a dual role of functional estimation as well as approximation. This is captured in a newly introduced quantity, ρ_d(F), which represents a nonlinear width of a function class F. We then extend the notion of the nth minimal radius of information and define a quantity I_{n,d}(F) which measures the minimal approximation error of the worst-case target g ∈ F by the family of function classes having pseudo-dimension d, given partial information on g consisting of values taken by n linear operators. The error rates are calculated, which leads to a quantitative notion of the value of partial information for the paradigm of learning from examples.
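As background for the sample-complexity side of this trade-off, the classic PAC bound for a finite hypothesis class (standard textbook material, not the paper's ρ_d(F)/I_{n,d}(F) machinery) can be computed directly:

```python
import math

# Classic PAC sample-complexity bound for a finite hypothesis class H
# in the realizable setting: m >= (1/eps) * (ln|H| + ln(1/delta))
# samples suffice so that, with probability >= 1 - delta, every
# hypothesis consistent with the sample has error at most eps.
def pac_sample_size(h_size, eps, delta):
    return math.ceil((math.log(h_size) + math.log(1.0 / delta)) / eps)

m = pac_sample_size(h_size=1000, eps=0.05, delta=0.01)
```

Note how the sample size grows only logarithmically in |H| and 1/delta but linearly in 1/eps, which is the baseline against which the value of extra information can be compared.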
Theoretical Interpretations and Applications of Radial Basis Function Networks
Medical applications usually use Radial Basis Function Networks simply as Artificial Neural Networks. However, RBFNs are Knowledge-Based Networks that can be interpreted in several ways: as Artificial Neural Networks, Regularization Networks, Support Vector Machines, Wavelet Networks, Fuzzy Controllers, Kernel Estimators, or Instance-Based Learners. A survey of these interpretations and of their corresponding learning algorithms is provided, as well as a brief survey of dynamic learning algorithms. These interpretations of RBFNs can suggest applications that are particularly interesting in medical domains.
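A concrete illustration of the "linear output weights on nonlinear basis functions" structure shared by several of these interpretations: a minimal Gaussian RBF network. This is an illustrative toy, with centers, width, and target function chosen arbitrarily, not taken from the survey.

```python
import numpy as np

# Minimal Gaussian RBF network: a hidden layer of radial units with
# fixed centers and width, plus linear output weights fit by least
# squares -- one common RBFN training scheme.
def rbf_design(X, centers, width):
    # pairwise squared distances -> Gaussian activations
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * X[:, 0])                        # toy target function
centers = np.linspace(-1.0, 1.0, 10)[:, None]    # fixed, evenly spaced centers
Phi = rbf_design(X, centers, width=0.3)
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)      # linear output weights
mse = float(np.mean((Phi @ w - y) ** 2))
```

Because only the output weights are trained, fitting reduces to a linear least-squares problem, which is what makes the kernel-estimator and regularization-network readings of RBFNs natural.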
High-Dimensional Function Approximation: Breaking the Curse with Monte Carlo Methods
In this dissertation we study the tractability of the information-based complexity of d-variate function approximation problems. In the deterministic setting, the curse of dimensionality holds for many unweighted problems; that is, for some fixed error tolerance the complexity grows exponentially in the dimension d. For integration problems one can usually break the curse with the standard Monte Carlo method. For function approximation problems, however, similar effects of randomization have been unknown so far.
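The integration claim above can be checked in a few lines: standard Monte Carlo averages i.i.d. function evaluations, and its root-mean-square error decays like n^(-1/2) with a constant independent of the dimension. This is an illustrative sketch with a toy integrand, not taken from the thesis.

```python
import random

# Standard Monte Carlo for integration over [0,1]^d: average n i.i.d.
# evaluations at uniform random points. The RMS error decays like
# n**(-1/2) regardless of d, which is how Monte Carlo "breaks the
# curse" for integration.
def mc_integrate(f, d, n, rng):
    return sum(f([rng.random() for _ in range(d)]) for _ in range(n)) / n

f = lambda x: sum(x) / len(x)   # toy integrand; true integral is 1/2 in any d
est = mc_integrate(f, d=50, n=2000, rng=random.Random(0))
```

Even in 50 dimensions the estimate lands close to the true value 1/2 with only 2000 samples, whereas a deterministic grid in 50 dimensions would be hopeless.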
The thesis contains results on three more or less stand-alone topics. For an
extended five page abstract, see the section "Introduction and Results".
Chapter 2 is concerned with lower bounds on the Monte Carlo error for general linear problems via Bernstein numbers. This technique is applied to the approximation of certain function classes, where it turns out that randomization does not affect the tractability classification of the problem.
Chapter 3 studies the approximation of functions from Hilbert spaces with methods that may use arbitrary linear functionals as information. For certain classes of periodic functions from unweighted periodic tensor product spaces, in particular Korobov spaces, we observe the curse of dimensionality in the deterministic setting, while with randomized methods we achieve polynomial tractability.
Chapter 4 deals with the approximation of monotone functions via function values. It is known that this problem suffers from the curse in the deterministic setting. An improved lower bound shows that the problem is still intractable in the randomized setting. However, Monte Carlo breaks the curse: for any fixed error tolerance, the complexity no longer grows exponentially in the dimension d.

Comment: This is the author's submitted PhD thesis, still in the referee process.
The learnability of unknown quantum measurements
© Rinton Press. In this work, we provide an elegant framework to analyze learning matrices in the Schatten class by taking advantage of a recently developed methodology: matrix concentration inequalities. We establish the fat-shattering dimension, Rademacher/Gaussian complexity, and the entropy number of learning bounded operators and trace class operators. By characterising the tasks of learning quantum states and two-outcome quantum measurements as learning matrices in the Schatten 1- and ∞-classes, our proposed approach directly solves the sample complexity problems of learning quantum states and quantum measurements. Our main result in the paper is that, for learning an unknown quantum measurement, the upper bound, given by the fat-shattering dimension, is linearly proportional to the dimension of the underlying Hilbert space. Learning an unknown quantum state becomes a dual problem to ours, and as a byproduct, we can recover Aaronson's famous result [Proc. R. Soc. A 463, 3089–3114 (2007)] solely using a classical machine learning technique. In addition, other famous complexity measures like covering numbers and Rademacher/Gaussian complexities are derived explicitly under the same framework. We are able to connect measures of sample complexity with various areas in quantum information science, e.g., quantum state/measurement tomography, quantum state discrimination, and quantum random access codes, which may be of independent interest. Lastly, with the assistance of the general Bloch-sphere representation, we show that learning quantum measurements/states can be mathematically formulated as a neural network. Consequently, classical ML algorithms can be applied to efficiently accomplish the two quantum learning tasks.
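The reduction of quantum learning to classical regression can be seen in the simplest case. In the single-qubit Bloch representation (a hypothetical illustration, not the paper's general construction), a two-outcome POVM element E = alpha*I + b·sigma applied to a state rho = (I + r·sigma)/2 yields outcome probability Tr(E rho) = alpha + b·r, which is linear in the Bloch vector r, so learning E from (state, probability) pairs is ordinary linear regression:

```python
import numpy as np

# Single-qubit sketch: outcome probabilities are linear in the Bloch
# vector, so an unknown measurement (alpha, b) can be recovered by
# least squares from noiseless (state, probability) examples.
rng = np.random.default_rng(1)
alpha_true = 0.5
b_true = np.array([0.2, -0.1, 0.15])         # |b| <= min(alpha, 1-alpha), so 0 <= E <= I

r = rng.uniform(-1.0, 1.0, size=(100, 3))    # sample Bloch vectors,
r /= np.maximum(1.0, np.linalg.norm(r, axis=1, keepdims=True))  # clipped to the unit ball
p = alpha_true + r @ b_true                  # outcome probabilities Tr(E rho)

A = np.hstack([np.ones((len(r), 1)), r])     # design matrix [1, r]
coef, *_ = np.linalg.lstsq(A, p, rcond=None) # recover (alpha, b)
```

This linear structure is exactly what lets classical sample-complexity tools (fat-shattering dimension, Rademacher complexity) be brought to bear on the quantum learning problem.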
Boolean functions: noise stability, non-interactive correlation distillation, and mutual information
Consider the noise operator acting on Boolean functions, parametrized by a noise parameter between 0 and 1/2. Given the noise parameter and a fixed mean, which Boolean function has the largest moment of a given order under the noise operator? This question has close connections with noise stability of Boolean functions, the problem of non-interactive correlation distillation, and the Courtade-Kumar conjecture on the most informative Boolean function. In this paper, we characterize maximizers in some extremal settings, such as low noise (noise parameter close to 0), high noise (noise parameter close to 1/2), as well as when the moment order is large. Analogous results are also established in more general contexts, such as Boolean functions defined on the discrete torus and the problem of noise stability in a tree model.

Comment: Corrections of some inaccuracies.
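The noise-stability notion behind these questions can be made concrete by brute force. This is an illustrative snippet for the 3-bit majority function; the names eps, rho, and maj3 are standard shorthand, not notation from the paper.

```python
import itertools

# Noise stability of 3-bit majority: with each coordinate flipped
# independently with probability eps, the correlation parameter is
# rho = 1 - 2*eps, and Fourier analysis gives
# Stab_rho(Maj3) = (3*rho + rho**3) / 4.
def maj3(x):
    return 1 if sum(x) > 0 else -1

def noise_stability(f, n, eps):
    # E[f(x) f(y)], x uniform on {-1,1}^n, y an eps-noisy copy of x
    total = 0.0
    for x in itertools.product([-1, 1], repeat=n):
        for y in itertools.product([-1, 1], repeat=n):
            flips = sum(xi != yi for xi, yi in zip(x, y))
            total += f(x) * f(y) * eps ** flips * (1 - eps) ** (n - flips)
    return total / 2 ** n

eps = 0.1
rho = 1 - 2 * eps
stab = noise_stability(maj3, 3, eps)
```

The brute-force sum matches the closed-form Fourier expression, illustrating the low-noise (eps near 0) and high-noise (eps near 1/2) regimes the paper studies: stability tends to 1 as eps goes to 0 and to 0 as eps goes to 1/2.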