
    The degree of approximation of sets in euclidean space using sets with bounded Vapnik-Chervonenkis dimension

    Get PDF
    The degree of approximation of infinite-dimensional function classes using finite n-dimensional manifolds has been the subject of a classical field of study in the area of mathematical approximation theory. In Ratsaby and Maiorov (1997), a new quantity $\rho_n(F, L_q)$ which measures the degree of approximation of a function class $F$ by the best manifold $H_n$ of pseudo-dimension less than or equal to $n$ in the $L_q$-metric has been introduced. For sets $F \subset \mathbb{R}^m$ it is defined as $\rho_n(F, l_q^m) = \inf_{H_n} \mathrm{dist}(F, H_n)$, where $\mathrm{dist}(F, H_n) = \sup_{x \in F} \inf_{y \in H_n} \|x - y\|_{l_q^m}$ and $H_n \subset \mathbb{R}^m$ is any set of VC-dimension less than or equal to $n$, where $n < m$. It measures the degree of approximation of the set $F$ by the optimal set $H_n \subset \mathbb{R}^m$ of VC-dimension less than or equal to $n$ in the $l_q^m$-metric. In this paper we compute $\rho_n(F, l_q^m)$ for $F$ being the unit ball $B_p^m = \{x \in \mathbb{R}^m : \|x\|_{l_p^m} \leq 1\}$ for any $1 \leq p, q \leq \infty$, and for $F$ being any subset of the boolean $m$-cube of size larger than $2^{m\gamma}$, for any $1/2 < \gamma < 1$.
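    As a concrete illustration of the quantity defined above, the following sketch computes $\mathrm{dist}(F, H) = \sup_{x \in F} \inf_{y \in H} \|x - y\|_{l_q^m}$ for finite point sets; computing $\rho_n$ itself would additionally require minimizing over all candidate sets $H_n$ of VC-dimension at most $n$, which the sketch does not attempt. The function name dist_F_H and the toy sets are illustrative assumptions, not taken from the paper.

        import numpy as np

        def dist_F_H(F, H, q=2):
            """sup_{x in F} inf_{y in H} ||x - y||_{l_q} for finite point sets in R^m."""
            F, H = np.asarray(F, dtype=float), np.asarray(H, dtype=float)
            diffs = F[:, None, :] - H[None, :, :]         # shape (|F|, |H|, m)
            dists = np.linalg.norm(diffs, ord=q, axis=2)  # l_q norm of each difference
            return dists.min(axis=1).max()                # inf over H, then sup over F

        # Toy example: approximate the four extreme points of the l_1 unit ball in R^2
        # by a two-point set H.
        F = [(1, 0), (-1, 0), (0, 1), (0, -1)]
        H = [(0.5, 0.0), (-0.5, 0.0)]
        print(dist_F_H(F, H, q=2))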

    Fast DD-classification of functional data

    Full text link
    A fast nonparametric procedure for classifying functional data is introduced. It consists of a two-step transformation of the original data plus a classifier operating on a low-dimensional hypercube. The functional data are first mapped into a finite-dimensional location-slope space and then transformed by a multivariate depth function into the $DD$-plot, which is a subset of the unit hypercube. This transformation yields a new notion of depth for functional data. Three alternative depth functions are employed for this, as well as two rules for the final classification on $[0,1]^q$. The resulting classifier has to be cross-validated over a small range of parameters only, which is restricted by a Vapnik-Cervonenkis bound. The entire methodology does not involve smoothing techniques, is completely nonparametric, and allows one to achieve Bayes optimality under standard distributional settings. It is robust, efficiently computable, and has been implemented in an R environment. Applicability of the new approach is demonstrated by simulations as well as a benchmark study.
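    A minimal sketch of the DD-plot idea described above, under simplifying assumptions: multivariate (rather than functional) data, Mahalanobis depth as the single depth function, and the naive "assign to the deeper class" rule instead of the paper's classification rules. The names mahalanobis_depth and dd_classify are hypothetical.

        import numpy as np

        def mahalanobis_depth(x, X):
            """Depth of point x w.r.t. sample X: 1 / (1 + squared Mahalanobis distance to the mean)."""
            mu = X.mean(axis=0)
            cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
            d2 = (x - mu) @ cov_inv @ (x - mu)
            return 1.0 / (1.0 + d2)

        def dd_classify(x, X0, X1):
            """Map x to its DD-plot coordinates (depth in class 0, depth in class 1) and
            assign it to the class in which it is deeper (the simplest DD-plot rule)."""
            d0, d1 = mahalanobis_depth(x, X0), mahalanobis_depth(x, X1)
            return (d0, d1), int(d1 > d0)

        rng = np.random.default_rng(0)
        X0 = rng.normal(0.0, 1.0, size=(200, 2))   # class 0 in the location-slope plane
        X1 = rng.normal(1.5, 1.0, size=(200, 2))   # class 1, shifted
        coords, label = dd_classify(np.array([1.0, 1.0]), X0, X1)
        print(coords, label)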

    Sign rank versus VC dimension

    Full text link
    This work studies the maximum possible sign rank of $N \times N$ sign matrices with a given VC dimension $d$. For $d = 1$, this maximum is three. For $d = 2$, this maximum is $\tilde{\Theta}(N^{1/2})$. For $d > 2$, similar but slightly less accurate statements hold. The lower bounds improve over previous ones by Ben-David et al., and the upper bounds are novel. The lower bounds are obtained by probabilistic constructions, using a theorem of Warren in real algebraic geometry. The upper bounds are obtained using a result of Welzl about spanning trees with low stabbing number, and using the moment curve. The upper bound technique is also used to: (i) provide estimates on the number of classes of a given VC dimension, and the number of maximum classes of a given VC dimension -- answering a question of Frankl from '89, and (ii) design an efficient algorithm that provides an $O(N/\log(N))$ multiplicative approximation for the sign rank. We also observe a general connection between sign rank and spectral gaps which is based on Forster's argument. Consider the $N \times N$ adjacency matrix of a $\Delta$-regular graph with a second eigenvalue of absolute value $\lambda$ and $\Delta \leq N/2$. We show that the sign rank of the signed version of this matrix is at least $\Delta/\lambda$. We use this connection to prove the existence of a maximum class $C \subseteq \{\pm 1\}^N$ with VC dimension $2$ and sign rank $\tilde{\Theta}(N^{1/2})$. This answers a question of Ben-David et al. regarding the sign rank of large VC classes. We also describe limitations of this approach, in the spirit of the Alon-Boppana theorem. We further describe connections to communication complexity, geometry, learning theory, and combinatorics. Comment: 33 pages. This is a revised version of the paper "Sign rank versus VC dimension". Additional results in this version: (i) Estimates on the number of maximum VC classes (answering a question of Frankl from '89). (ii) Estimates on the sign rank of large VC classes (answering a question of Ben-David et al. from '03). (iii) A discussion on the computational complexity of computing the sign-rank.
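    For readers unfamiliar with the central definition, the following brute-force sketch computes the VC dimension of a small sign matrix (rows as concepts, columns as points). It only illustrates the notion the abstract refers to; it is unrelated to the paper's efficient approximation algorithm for sign rank, and the function name vc_dimension is an assumption.

        import itertools
        import numpy as np

        def vc_dimension(M):
            """Brute-force VC dimension of a +/-1 sign matrix M: a column set S is
            shattered if the rows restricted to S realize all 2^|S| sign patterns.
            Exponential in the number of columns, so only for small matrices."""
            n_cols = M.shape[1]
            best = 0
            for k in range(1, n_cols + 1):
                if any(len({tuple(row[list(S)]) for row in M}) == 2 ** k
                       for S in itertools.combinations(range(n_cols), k)):
                    best = k
                else:
                    break  # if no k-set is shattered, no larger set can be shattered
            return best

        # Example: the 4x2 matrix of all sign patterns on 2 columns has VC dimension 2.
        M = np.array([[+1, +1], [+1, -1], [-1, +1], [-1, -1]])
        print(vc_dimension(M))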

    On the Value of Partial Information for Learning from Examples

    Get PDF
    The PAC model of learning and its extension to real valued function classes provides a well-accepted theoretical framework for representing the problem of learning a target function $g(x)$ using a random sample $\{(x_i, g(x_i))\}_{i=1}^m$. Based on the uniform strong law of large numbers, the PAC model establishes the sample complexity, i.e., the sample size $m$ which is sufficient for accurately estimating the target function to within high confidence. Often, in addition to a random sample, some form of prior knowledge is available about the target. It is intuitive that increasing the amount of information should have the same effect on the error as increasing the sample size. But quantitatively how does the rate of error with respect to increasing information compare to the rate of error with increasing sample size? To answer this we consider a new approach based on a combination of information-based complexity of Traub et al. and Vapnik–Chervonenkis (VC) theory. In contrast to VC-theory, where function classes of finite pseudo-dimension are used only for statistical-based estimation, we let such classes play a dual role of functional estimation as well as approximation. This is captured in a newly introduced quantity, $\rho_d(F)$, which represents a nonlinear width of a function class $F$. We then extend the notion of the $n$-th minimal radius of information and define a quantity $I_{n,d}(F)$ which measures the minimal approximation error of the worst-case target $g \in F$ by the family of function classes having pseudo-dimension $d$, given partial information on $g$ consisting of values taken by $n$ linear operators. The error rates are calculated, which leads to a quantitative notion of the value of partial information for the paradigm of learning from examples.
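    One plausible formalization of the two quantities described verbally above, written to be consistent with the definition of $\rho_n$ in the first abstract; the paper's exact definitions may differ in the choice of norm and of admissible information operators, so this should be read as a sketch rather than the authors' formulation.

        \[
          \rho_d(F, L_q) \;=\; \inf_{H_d}\; \sup_{g \in F}\; \inf_{h \in H_d} \|g - h\|_{L_q},
        \]
        where the outer infimum runs over classes $H_d$ of pseudo-dimension at most $d$, and
        \[
          I_{n,d}(F) \;=\; \inf_{L_1,\dots,L_n}\; \inf_{H_d}\; \sup_{g \in F}\;
          \sup_{\tilde g :\, L_i(\tilde g) = L_i(g),\; i \le n}\; \inf_{h \in H_d} \|\tilde g - h\|_{L_q},
        \]
        with $L_1,\dots,L_n$ linear information operators, so the worst case is taken over all
        targets consistent with the observed partial information.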

    PAC-learning geometrical figures

    Get PDF

    On interference among moving sensors and related problems

    Full text link
    We show that for any set of $n$ points moving along "simple" trajectories (i.e., each coordinate is described with a polynomial of bounded degree) in $\Re^d$ and any parameter $2 \le k \le n$, one can select a fixed non-empty subset of the points of size $O(k \log k)$, such that the Voronoi diagram of this subset is "balanced" at any given time (i.e., it contains $O(n/k)$ points per cell). We also show that the bound $O(k \log k)$ is near optimal even for the one dimensional case in which points move linearly in time. As applications, we show that one can assign communication radii to the sensors of a network of $n$ moving sensors so that at any given time their interference is $O(\sqrt{n \log n})$. We also show some results in kinetic approximate range counting and kinetic discrepancy. In order to obtain these results, we extend well-known results from $\varepsilon$-net theory to kinetic environments.
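    A small experiment in the spirit of the result above, under assumptions: one-dimensional points moving linearly, a random sample of size roughly $k \log k$ standing in for the paper's construction, and a handful of sampled times instead of a guarantee for all times. The constant 4 and the function name max_cell_load are arbitrary illustrative choices.

        import numpy as np

        def max_cell_load(P, S, t):
            """At time t, count points of P (1-D, linear motion: x(t) = a + b*t) in each
            Voronoi cell of the sample S, and return the largest count."""
            pos = P[:, 0] + P[:, 1] * t            # positions of all points at time t
            centers = S[:, 0] + S[:, 1] * t        # positions of the sampled points
            owner = np.abs(pos[:, None] - centers[None, :]).argmin(axis=1)
            return np.bincount(owner, minlength=len(S)).max()

        rng = np.random.default_rng(1)
        n, k = 2000, 20
        P = rng.uniform(-1, 1, size=(n, 2))        # rows: (initial position, velocity)
        m = int(4 * k * np.log(k))                 # sample size ~ k log k (constant is a guess)
        S = P[rng.choice(n, size=m, replace=False)]
        loads = [max_cell_load(P, S, t) for t in np.linspace(0.0, 1.0, 11)]
        print(max(loads), "vs. target O(n/k) =", n // k)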

    A valid theory on probabilistic causation

    Get PDF
    In this paper several definitions of probabilistic causation are considered, and their main drawbacks discussed. Current notions of probabilistic causality have symmetry limitations (e.g. correlation and statistical dependence are symmetric notions). To avoid the symmetry problem, non-reciprocal causality is often defined in terms of dynamic asymmetry. But these notions are likely to consider spurious regularities. In this paper we present a definition of causality that does not have symmetry inconsistencies. It is a natural extension of propositional causality in formal logics, and it can be easily analyzed with statistical inference. The modeling problems are also discussed using empirical processes.
    Keywords: Causality, Empirical Processes and Classification Theory; 62M30, 62M15, 62G20

    Joint universal lossy coding and identification of stationary mixing sources with general alphabets

    Full text link
    We consider the problem of joint universal variable-rate lossy coding and identification for parametric classes of stationary $\beta$-mixing sources with general (Polish) alphabets. Compression performance is measured in terms of Lagrangians, while identification performance is measured by the variational distance between the true source and the estimated source. Provided that the sources are mixing at a sufficiently fast rate and satisfy certain smoothness and Vapnik-Chervonenkis learnability conditions, it is shown that, for bounded metric distortions, there exist universal schemes for joint lossy compression and identification whose Lagrangian redundancies converge to zero as $\sqrt{V_n \log n / n}$ as the block length $n$ tends to infinity, where $V_n$ is the Vapnik-Chervonenkis dimension of a certain class of decision regions defined by the $n$-dimensional marginal distributions of the sources; furthermore, for each $n$, the decoder can identify the $n$-dimensional marginal of the active source up to a ball of radius $O(\sqrt{V_n \log n / n})$ in variational distance, eventually with probability one. The results are supplemented by several examples of parametric sources satisfying the regularity conditions. Comment: 16 pages, 1 figure; accepted to IEEE Transactions on Information Theory
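    To make the stated rate concrete, the following sketch evaluates $\sqrt{V_n \log n / n}$ for a few block lengths with a hypothetical constant VC dimension $V_n = 10$; the values of $V_n$ are an assumption for illustration only.

        import math

        def redundancy_rate(V_n, n):
            """The convergence rate sqrt(V_n * log n / n) from the abstract above."""
            return math.sqrt(V_n * math.log(n) / n)

        # Hypothetical illustration: constant VC dimension V_n = 10, growing block length n.
        for n in (10**3, 10**4, 10**5, 10**6):
            print(n, round(redundancy_rate(10, n), 4))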