The degree of approximation of sets in euclidean space using sets with bounded Vapnik-Chervonenkis dimension
Abstract: The degree of approximation of infinite-dimensional function classes using finite n-dimensional manifolds has been the subject of a classical field of study in the area of mathematical approximation theory. In Ratsaby and Maiorov (1997), a new quantity $\rho_n(F, L_q)$, which measures the degree of approximation of a function class $F$ by the best manifold $H_n$ of pseudo-dimension less than or equal to $n$ in the $L_q$-metric, was introduced. For sets $F \subset \mathbb{R}^m$ it is defined as $\rho_n(F, l_q^m) = \inf_{H_n} \mathrm{dist}(F, H_n)$, where $\mathrm{dist}(F, H_n) = \sup_{x \in F} \inf_{y \in H_n} \|x - y\|_{l_q^m}$ and $H_n \subset \mathbb{R}^m$ is any set of VC-dimension less than or equal to $n$, where $n < m$. It measures the degree of approximation of the set $F$ by the optimal set $H_n \subset \mathbb{R}^m$ of VC-dimension at most $n$ in the $l_q^m$-metric. In this paper we compute $\rho_n(F, l_q^m)$ for $F$ being the unit ball $B_p^m = \{x \in \mathbb{R}^m : \|x\|_{l_p^m} \le 1\}$ for any $1 \le p, q \le \infty$, and for $F$ being any subset of the boolean $m$-cube of size larger than $2^{m\gamma}$, for any $1/2 < \gamma < 1$.
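For readability, the defining quantity transcribes to the following display (a direct restatement of the definition in the abstract, nothing added):

```latex
\[
\rho_n\left(F, l_q^m\right)
  = \inf_{\substack{H_n \subset \mathbb{R}^m \\ \mathrm{VCdim}(H_n) \le n}}
    \; \sup_{x \in F} \, \inf_{y \in H_n} \|x - y\|_{l_q^m},
  \qquad n < m .
\]
```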
Fast DD-classification of functional data
A fast nonparametric procedure for classifying functional data is introduced.
It consists of a two-step transformation of the original data plus a classifier
operating on a low-dimensional hypercube. The functional data are first mapped
into a finite-dimensional location-slope space and then transformed by a
multivariate depth function into the $DD$-plot, which is a subset of the unit
hypercube. This transformation yields a new notion of depth for functional
data. Three alternative depth functions are employed for this, as well as two
rules for the final classification on the $DD$-plot. The resulting classifier has
to be cross-validated over a small range of parameters only, which is
restricted by a Vapnik-Cervonenkis bound. The entire methodology does not
involve smoothing techniques, is completely nonparametric, and achieves
Bayes optimality under standard distributional settings. It is robust,
efficiently computable, and has been implemented in an R environment.
Applicability of the new approach is demonstrated by simulations as well as a
benchmark study.
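The two-step pipeline lends itself to a compact sketch. Below is a minimal Python illustration of the steps described above (a location-slope mapping, then a depth transform into the DD-plot, then a classifier on the hypercube), using a simple Mahalanobis-type depth and a k-nearest-neighbour rule. These particular choices, and all function names, are assumptions of this sketch, not the paper's implementation, which is in R and offers several depth notions and classification rules.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def location_slope(curves, grid):
    """Step 1: map each discretized curve to a 2-D (location, slope) point."""
    loc = curves.mean(axis=1)                 # average level of each curve
    slope = np.polyfit(grid, curves.T, 1)[0]  # least-squares slope per curve
    return np.column_stack([loc, slope])

def mahalanobis_depth(points, sample):
    """A simple depth: 1 / (1 + squared Mahalanobis distance to the sample)."""
    mu = sample.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(sample.T))
    diff = points - mu
    d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)
    return 1.0 / (1.0 + d2)

def dd_plot(points, class_samples):
    """Step 2: depth w.r.t. each class gives a point in the unit hypercube."""
    return np.column_stack(
        [mahalanobis_depth(points, s) for s in class_samples])

# Tiny synthetic demo: two classes of noisy curves with opposite trends.
rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 100)
X0 = rng.normal(0.0, 0.1, (50, 100)) + grid   # upward-trending curves
X1 = rng.normal(0.0, 0.1, (50, 100)) - grid   # downward-trending curves
F0, F1 = location_slope(X0, grid), location_slope(X1, grid)
D = dd_plot(np.vstack([F0, F1]), [F0, F1])
y = np.r_[np.zeros(len(F0)), np.ones(len(F1))]
clf = KNeighborsClassifier(n_neighbors=5).fit(D, y)  # rule on the DD-plot
```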
Sign rank versus VC dimension
This work studies the maximum possible sign rank of $N \times N$ sign
matrices with a given VC dimension $d$. For $d = 1$, this maximum is three. For
$d = 2$, this maximum is $\tilde{\Theta}(N^{1/2})$. For $d > 2$, similar but
slightly less accurate statements hold. The lower bounds improve over previous
ones by Ben-David et al., and the upper bounds are novel.
The lower bounds are obtained by probabilistic constructions, using a theorem
of Warren in real algebraic topology. The upper bounds are obtained using a
result of Welzl about spanning trees with low stabbing number, and using the
moment curve.
The upper bound technique is also used to: (i) provide estimates on the
number of classes of a given VC dimension, and the number of maximum classes of
a given VC dimension -- answering a question of Frankl from '89, and (ii)
design an efficient algorithm that provides an $O(N/\log N)$ multiplicative
approximation for the sign rank.
We also observe a general connection between sign rank and spectral gaps
which is based on Forster's argument. Consider the $N \times N$ adjacency
matrix of a $\Delta$-regular graph with a second eigenvalue of absolute value
$\lambda$ and $\Delta \le N/2$. We show that the sign rank of the signed
version of this matrix is at least $\Delta/\lambda$. We use this connection to
prove the existence of a maximum class with VC
dimension $2$ and sign rank $\tilde{\Theta}(N^{1/2})$. This answers a question
of Ben-David et al. regarding the sign rank of large VC classes. We also
describe limitations of this approach, in the spirit of the Alon-Boppana
theorem.
We further describe connections to communication complexity, geometry,
learning theory, and combinatorics.
Comment: 33 pages. This is a revised version of the paper "Sign rank versus VC
dimension". Additional results in this version: (i) Estimates on the number
of maximum VC classes (answering a question of Frankl from '89). (ii)
Estimates on the sign rank of large VC classes (answering a question of
Ben-David et al. from '03). (iii) A discussion on the computational
complexity of computing the sign-rank.
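The spectral statement compresses to a single display. One assumption of mine: the "signed version" of the adjacency matrix $A$ is taken to be $2A - J$ (entries $+1$ on edges, $-1$ elsewhere), which is the usual convention:

```latex
\[
G \ \Delta\text{-regular on } N \text{ vertices}, \quad
\lambda = \max_{i \ge 2} \lvert \lambda_i(A) \rvert, \quad \Delta \le N/2
\;\Longrightarrow\;
\operatorname{sign\text{-}rank}(2A - J) \;\ge\; \frac{\Delta}{\lambda}.
\]
```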
On the Value of Partial Information for Learning from Examples
Abstract: The PAC model of learning and its extension to real valued function classes provides a well-accepted theoretical framework for representing the problem of learning a target function $g(x)$ using a random sample $\{(x_i, g(x_i))\}_{i=1}^m$. Based on the uniform strong law of large numbers, the PAC model establishes the sample complexity, i.e., the sample size $m$ which is sufficient for accurately estimating the target function to within high confidence. Often, in addition to a random sample, some form of prior knowledge is available about the target. It is intuitive that increasing the amount of information should have the same effect on the error as increasing the sample size. But quantitatively, how does the rate of error with respect to increasing information compare to the rate of error with increasing sample size? To answer this we consider a new approach based on a combination of the information-based complexity of Traub et al. and Vapnik-Chervonenkis (VC) theory. In contrast to VC theory, where function classes of finite pseudo-dimension are used only for statistical-based estimation, we let such classes play a dual role of functional estimation as well as approximation. This is captured in a newly introduced quantity, $\rho_d(F)$, which represents a nonlinear width of a function class $F$. We then extend the notion of the $n$th minimal radius of information and define a quantity $I_{n,d}(F)$ which measures the minimal approximation error of the worst-case target $g \in F$ by the family of function classes having pseudo-dimension $d$, given partial information on $g$ consisting of values taken by $n$ linear operators. The error rates are calculated, which leads to a quantitative notion of the value of partial information for the paradigm of learning from examples.
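By analogy with $\rho_n(F, L_q)$ from the first abstract above, the nonlinear width here can plausibly be read as the following display. This is my reconstruction from the two abstracts, not a quotation from the paper:

```latex
\[
\rho_d(F) \;=\; \inf_{\substack{H \,:\, \operatorname{pdim}(H) \le d}}
  \; \sup_{g \in F} \, \inf_{h \in H} \| g - h \| ,
\]
```

with $I_{n,d}(F)$ additionally optimizing over which $n$ linear-operator values of $g$ are observed.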
On interference among moving sensors and related problems
We show that for any set of $n$ points moving along "simple" trajectories
(i.e., each coordinate is described by a polynomial of bounded degree) in
$\mathbb{R}^d$ and any parameter $k$, one can select a fixed non-empty
subset of the points of size $O(k \log k)$, such that the Voronoi diagram of
this subset is "balanced" at any given time (i.e., it contains $O(n/k)$ points
per cell). We also show that the bound $O(k \log k)$ is near optimal even for
the one-dimensional case in which points move linearly in time. As
applications, we show that one can assign communication radii to the sensors of
a network of $n$ moving sensors so that at any given time their interference is
$O(\sqrt{n \log n})$. We also show some results in kinetic approximate range
counting and kinetic discrepancy. In order to obtain these results, we extend
well-known results from $\varepsilon$-net theory to kinetic environments.
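The balance criterion is easy to test empirically. Below is a small Python sketch for the linear-motion case: it picks a subset and measures the largest Voronoi cell load of that subset over sampled times. The uniform random subset is only an illustration of the criterion; the paper's selection procedure is more careful, and all names here are mine.

```python
import numpy as np
from scipy.spatial import cKDTree

def positions(p0, v, t):
    """Linear trajectories: p(t) = p0 + t * v (the degree-1 'simple' case)."""
    return p0 + t * v

def max_cell_load(p0, v, subset_idx, times):
    """Largest number of points in any Voronoi cell of the subset, over time."""
    worst = 0
    for t in times:
        pts = positions(p0, v, t)
        tree = cKDTree(pts[subset_idx])
        _, owner = tree.query(pts)   # nearest subset point = owning Voronoi cell
        worst = max(worst, int(np.bincount(owner).max()))
    return worst

rng = np.random.default_rng(0)
n, k = 2000, 20
p0 = rng.standard_normal((n, 2))     # initial positions in the plane
v = rng.standard_normal((n, 2))      # constant velocities
# Subset of size ~ k log k; a balanced selection keeps the load near n/k = 100.
subset = rng.choice(n, size=int(k * np.log(k)), replace=False)
print(max_cell_load(p0, v, subset, np.linspace(0.0, 1.0, 50)))
```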
A valid theory on probabilistic causation
In this paper several definitions of probabilistic causation are considered, and their main drawbacks discussed. Current notions of probabilistic causality have symmetry limitations (e.g. correlation and statistical dependence are symmetric notions). To avoid the symmetry problem, non-reciprocal causality is often defined in terms of dynamic asymmetry. But these notions are prone to picking up spurious regularities. In this paper we present a definition of causality that does not have symmetry inconsistencies. It is a natural extension of propositional causality in formal logics, and it can be easily analyzed with statistical inference. The modeling problems are also discussed using empirical processes.
Keywords: Causality, Empirical Processes and Classification Theory. MSC: 62M30, 62M15, 62G20
Joint universal lossy coding and identification of stationary mixing sources with general alphabets
We consider the problem of joint universal variable-rate lossy coding and
identification for parametric classes of stationary $\beta$-mixing sources with
general (Polish) alphabets. Compression performance is measured in terms of
Lagrangians, while identification performance is measured by the variational
distance between the true source and the estimated source. Provided that the
sources are mixing at a sufficiently fast rate and satisfy certain smoothness
and Vapnik-Chervonenkis learnability conditions, it is shown that, for bounded
metric distortions, there exist universal schemes for joint lossy compression
and identification whose Lagrangian redundancies converge to zero as
$\sqrt{V_n \log n / n}$ as the block length $n$ tends to infinity, where $V_n$ is the
Vapnik-Chervonenkis dimension of a certain class of decision regions defined by
the $n$-dimensional marginal distributions of the sources; furthermore, for
each $n$, the decoder can identify the $n$-dimensional marginal of the active
source up to a ball of radius $O(\sqrt{V_n \log n / n})$ in variational distance,
eventually with probability one. The results are supplemented by several
examples of parametric sources satisfying the regularity conditions.
Comment: 16 pages, 1 figure; accepted to IEEE Transactions on Information
Theory
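The rate above has the familiar VC uniform-deviation shape. As a reminder (a standard, non-sharp bound for i.i.d. sampling, stated for orientation rather than quoted from the paper): for a class $\mathcal{A}$ of decision regions with $\mathrm{VCdim}(\mathcal{A}) = V$ and empirical measure $P_n$ built from $n$ samples,

```latex
\[
\mathbb{E} \, \sup_{A \in \mathcal{A}} \bigl| P_n(A) - P(A) \bigr|
  \;=\; O\!\left( \sqrt{\frac{V \log n}{n}} \right),
\]
```

which matches the redundancy and identification radius appearing in the abstract; the paper's mixing and smoothness conditions are what allow a rate of this shape to survive dependence between samples.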
- …