4 research outputs found

    On the Value of Partial Information for Learning from Examples

    Abstract: The PAC model of learning and its extension to real-valued function classes provides a well-accepted theoretical framework for representing the problem of learning a target function g(x) using a random sample {(x_i, g(x_i))}_{i=1}^m. Based on the uniform strong law of large numbers, the PAC model establishes the sample complexity, i.e., the sample size m which is sufficient for accurately estimating the target function with high confidence. Often, in addition to a random sample, some form of prior knowledge is available about the target. It is intuitive that increasing the amount of information should have the same effect on the error as increasing the sample size. But quantitatively, how does the rate at which the error decreases with increasing information compare to the rate at which it decreases with increasing sample size? To answer this we consider a new approach based on a combination of the information-based complexity of Traub et al. and Vapnik–Chervonenkis (VC) theory. In contrast to VC theory, where function classes of finite pseudo-dimension are used only for statistical estimation, we let such classes play a dual role of functional estimation as well as approximation. This is captured in a newly introduced quantity, ρ_d(F), which represents a nonlinear width of a function class F. We then extend the notion of the nth minimal radius of information and define a quantity I_{n,d}(F) which measures the minimal approximation error of the worst-case target g ∈ F by the family of function classes having pseudo-dimension d, given partial information on g consisting of the values taken by n linear operators. The error rates are calculated, which leads to a quantitative notion of the value of partial information for the paradigm of learning from examples.
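    For orientation on the sample-size side of this comparison, the classical PAC bound says a sample of size roughly (1/ε)(d log(1/ε) + log(1/δ)) suffices for a class of VC/pseudo-dimension d, accuracy ε, and confidence 1 − δ. The short Python sketch below only illustrates that standard rate; the constants and the function name `pac_sample_size` are illustrative assumptions and have nothing to do with the paper's new quantities ρ_d(F) or I_{n,d}(F).

```python
import math

def pac_sample_size(d, eps, delta):
    """Classical PAC-style sufficient sample size for a class of
    VC/pseudo-dimension d, accuracy eps, and confidence 1 - delta.
    Uses the standard O((d*log(1/eps) + log(1/delta)) / eps) form;
    the constants here are illustrative, not taken from the paper."""
    return math.ceil((4.0 / eps) * (d * math.log(2.0 / eps) + math.log(2.0 / delta)))

# How the sufficient sample size grows as the accuracy requirement tightens,
# for a fixed pseudo-dimension d = 10 and 95% confidence.
for eps in (0.1, 0.05, 0.01):
    print(eps, pac_sample_size(d=10, eps=eps, delta=0.05))
```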

    The degree of approximation of sets in euclidean space using sets with bounded Vapnik-Chervonenkis dimension

    Abstract: The degree of approximation of infinite-dimensional function classes using finite n-dimensional manifolds has been the subject of a classical field of study in the area of mathematical approximation theory. In Ratsaby and Maiorov (1997), a new quantity ρ_n(F, L_q), which measures the degree of approximation of a function class F by the best manifold H_n of pseudo-dimension less than or equal to n in the L_q-metric, has been introduced. For sets F ⊂ R^m it is defined as ρ_n(F, l_q^m) = inf_{H_n} dist(F, H_n), where dist(F, H_n) = sup_{x ∈ F} inf_{y ∈ H_n} ∥x − y∥_{l_q^m} and H_n ⊂ R^m is any set of VC-dimension less than or equal to n, where n < m. It measures the degree of approximation of the set F by the optimal set H_n ⊂ R^m of VC-dimension less than or equal to n in the l_q^m-metric. In this paper we compute ρ_n(F, l_q^m) for F being the unit ball B_p^m = {x ∈ R^m : ∥x∥_{l_p^m} ≤ 1} for any 1 ≤ p, q ≤ ∞, and for F being any subset of the Boolean m-cube of size larger than 2^{γm}, for any 1/2 < γ < 1.
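    The quantity dist(F, H_n) in the definition above is just a sup–inf of l_q distances, which is straightforward to evaluate for finite point sets. The sketch below (an assumed helper `deviation` with toy points in R^2) computes only that inner deviation for a given candidate set H; the infimum over all sets of VC-dimension at most n that defines ρ_n(F, l_q^m) is what the paper actually analyzes and is not attempted here.

```python
import numpy as np

def deviation(F, H, q=2):
    """dist(F, H) = sup_{x in F} inf_{y in H} ||x - y||_{l_q^m},
    evaluated for finite point sets F, H in R^m (one point per row).
    This is only the inner quantity; rho_n(F, l_q^m) would further take
    the infimum over all H of VC-dimension <= n."""
    F, H = np.asarray(F, float), np.asarray(H, float)
    diffs = F[:, None, :] - H[None, :, :]          # shape (|F|, |H|, m)
    dists = np.linalg.norm(diffs, ord=q, axis=2)   # l_q distance per pair
    return dists.min(axis=1).max()                 # sup over F of inf over H

# Toy example in R^2: F is a few points on the unit l_2 sphere,
# H is a two-point candidate approximant.
F = [[1, 0], [0, 1], [-1, 0], [0, -1]]
H = [[0.5, 0.5], [-0.5, -0.5]]
print(deviation(F, H, q=2))
```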

    Learning with side information: PAC learning bounds

    Abstract: This paper considers a modification of a PAC learning theory problem in which each instance of the training data is supplemented with side information. In this case, a transformation of the training instance, given by a side-information map, is also classified. However, the learning algorithm needs only to classify a new instance, not the instance and its value under the side-information map. Side information can improve general learning rates, but not always. This paper shows that side information leads to an improvement of standard PAC learning theory rate bounds, under restrictions on the probable overlap between concepts and their images under the side-information map.
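    As a concrete, purely illustrative reading of "each instance of the training data is supplemented with side information", the sketch below doubles a training sample with labeled images under a hypothetical side-information map T (here a reflection of the unit interval) and runs a toy ERM over threshold concepts on the enlarged sample. The concept class, the map T, and the learner are assumptions made for illustration only; the paper's overlap conditions under which such side information provably improves the PAC rate bounds are not modeled here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a threshold concept on [0, 1] and a reflection as the
# side-information map T. Both are illustrative assumptions; the paper keeps
# the concept class and the map abstract.
def concept(x, theta=0.6):
    return (np.asarray(x) >= theta).astype(int)

def side_info_map(x):
    return 1.0 - np.asarray(x)   # assumed T, chosen only for this toy example

# Each training instance x_i contributes two labeled points:
# (x_i, c(x_i)) and (T(x_i), c(T(x_i))) -- the "supplemented" sample.
x = rng.uniform(0, 1, size=20)
tx = side_info_map(x)
sample = list(zip(x, concept(x))) + list(zip(tx, concept(tx)))

# An ERM-style learner over threshold hypotheses fits the enlarged sample,
# but at test time it only has to classify plain instances x.
thresholds = np.linspace(0, 1, 101)
errors = [np.mean([y != int(xi >= t) for xi, y in sample]) for t in thresholds]
print("learned threshold:", thresholds[int(np.argmin(errors))])
```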

    Approximation in shift-invariant spaces with deep ReLU neural networks

    We study the expressive power of deep ReLU neural networks for approximating functions in dilated shift-invariant spaces, which are widely used in signal processing, image processing, communications, and so on. Approximation error bounds are estimated with respect to the width and depth of the neural networks. The network construction is based on the bit extraction and data-fitting capacity of deep neural networks. As applications of our main results, the approximation rates of classical function spaces such as Sobolev spaces and Besov spaces are obtained. We also give lower bounds of the L^p (1 ≤ p ≤ ∞) approximation error for Sobolev spaces, which show that our construction of neural networks is asymptotically optimal up to a logarithmic factor.
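    For intuition about why ReLU networks pair naturally with shift-invariant spaces: when the generator is the linear B-spline (hat function), every element f(x) = Σ_k c_k B(x − k) is not merely approximated but exactly realized by a shallow ReLU network, since B itself is a combination of three ReLUs. The sketch below checks this identity numerically on a toy coefficient sequence; it is only a special case chosen for illustration and does not reproduce the paper's construction, which handles general dilated generators with deep networks via bit extraction.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    """Linear B-spline generator B supported on [0, 2]:
    B(x) = relu(x) - 2*relu(x - 1) + relu(x - 2)."""
    return relu(x) - 2 * relu(x - 1) + relu(x - 2)

# An element of the shift-invariant space generated by B:
# f(x) = sum_k c_k * B(x - k), with arbitrarily chosen coefficients.
coeffs = {0: 1.0, 1: -0.5, 2: 2.0, 3: 0.25}

def f_spline(x):
    return sum(c * hat(x - k) for k, c in coeffs.items())

def f_relu_net(x):
    """The same f written as a one-hidden-layer ReLU network:
    each shift k contributes three hidden units relu(x - b) with
    outer weights (c, -2c, c) and biases (k, k+1, k+2)."""
    units = []
    for k, c in coeffs.items():
        units += [(c, k), (-2 * c, k + 1), (c, k + 2)]
    return sum(a * relu(x - b) for a, b in units)

xs = np.linspace(-1, 6, 2001)
print("max |spline - ReLU net| =", np.abs(f_spline(xs) - f_relu_net(xs)).max())
```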