Guaranteed risk for multi-class discrimination models
Peer-reviewed conference proceedings. We study the generalization performance of multi-class discrimination systems. We establish two bounds on this performance, in terms of two capacity measures of the family of functions computed: the growth function and covering numbers. These bounds are evaluated on a classifier-combination model that estimates the posterior class probabilities. This makes it possible to compare the suitability of the two capacity measures.
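The covering-number capacity measure lends itself to a quick empirical sketch. The greedy routine below (all names and the greedy strategy are illustrative, not from the paper) estimates an ε-covering number of a finite set of predictors, each represented by its vector of outputs on a sample, under the sup distance on that sample:

```python
import numpy as np

def empirical_covering_number(outputs, eps):
    """Greedy estimate of the eps-covering number of a set of functions,
    each given as a row of outputs on a fixed sample, under the sup
    (l-infinity) distance on that sample.  The greedy centers form an
    eps-packing that also eps-covers, so the count sits between the true
    covering and packing numbers."""
    remaining = list(range(len(outputs)))
    centers = []
    while remaining:
        c = remaining[0]              # pick any still-uncovered function
        centers.append(c)
        # discard every function within eps of the chosen center
        remaining = [i for i in remaining
                     if np.max(np.abs(outputs[i] - outputs[c])) > eps]
    return len(centers)

rng = np.random.default_rng(0)
# 50 random "posterior estimators" evaluated on a sample of 20 points
outputs = rng.random((50, 20))
print(empirical_covering_number(outputs, 0.5))
```

Plotting this count against ε for a real model family gives a rough empirical feel for the capacity term that enters bounds of this kind.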
Learning from compressed observations
The problem of statistical learning is to construct a predictor of a random variable Y as a function of a related random variable X on the basis of an i.i.d. training sample from the joint distribution of (X, Y). Allowable predictors are drawn from some specified class, and the goal is to approach asymptotically the performance (expected loss) of the best predictor in the class. We consider the setting in which one has perfect observation of the X-part of the sample, while the Y-part has to be communicated at some finite bit rate. The encoding of the Y-values is allowed to depend on the X-values. Under suitable regularity conditions on the admissible predictors, the underlying family of probability distributions and the loss function, we give an information-theoretic characterization of achievable predictor performance in terms of conditional distortion-rate functions. The ideas are illustrated on the example of nonparametric regression in Gaussian noise.
Comment: 6 pages; submitted to the 2007 IEEE Information Theory Workshop (ITW 2007).
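A minimal numerical sketch of the setting: labels reach the learner through a fixed R-bit scalar quantizer (simpler than the X-dependent encodings the paper allows), and the fitted predictor's loss improves as the rate grows. The quantizer range, the model, and all names are illustrative choices, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def quantize(y, rate_bits, lo=-3.0, hi=3.0):
    """Scalar uniform quantizer with 2**rate_bits levels on [lo, hi];
    the decoder reproduces each cell's midpoint."""
    levels = 2 ** rate_bits
    y = np.clip(y, lo, hi)
    idx = np.floor((y - lo) / (hi - lo) * levels).astype(int)
    idx = np.minimum(idx, levels - 1)
    return lo + (idx + 0.5) * (hi - lo) / levels

# Toy instance: the x-part is observed perfectly, the y-part is quantized.
n = 2000
x = rng.uniform(-1, 1, n)
y = 2.0 * x + rng.normal(0.0, 0.1, n)       # regression in Gaussian noise

mse = {}
for rate in (1, 3, 6):
    yq = quantize(y, rate)
    slope = np.sum(x * yq) / np.sum(x * x)  # least squares through origin
    mse[rate] = np.mean((y - slope * x) ** 2)
    print(f"{rate} bits/label -> MSE {mse[rate]:.4f}")
```

At high rates the loss approaches the noise floor of the unquantized problem, which is the qualitative behavior the distortion-rate characterization quantifies.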
Agnostic Learning of Disjunctions on Symmetric Distributions
We consider the problem of approximating and learning disjunctions (or equivalently, conjunctions) on symmetric distributions over {0,1}^n. Symmetric distributions are distributions whose PDF is invariant under any permutation of the variables. We give a simple proof that for every symmetric distribution D, there exists a set S of n^{O(log(1/ε))} functions such that for every disjunction c, there is a function p, expressible as a linear combination of functions in S, that ε-approximates c in ℓ1 distance on D, that is, E_{x~D}[|p(x) − c(x)|] ≤ ε. This directly gives an agnostic learning algorithm for disjunctions on symmetric distributions that runs in time n^{O(log(1/ε))}. The best previously known bound follows from approximation of the more general class of halfspaces (Wimmer, 2010). We also show that there exists a symmetric distribution D such that the minimum degree of a polynomial that 1/3-approximates the disjunction of all n variables in ℓ1 distance on D is Ω(√n). Therefore the learning result above cannot be achieved via ℓ1-regression with a polynomial basis, the approach used in most other agnostic learning algorithms.
Our technique also gives a simple proof that for every product distribution D and every disjunction c, there exists a polynomial p of degree O(log(1/ε)) such that p ε-approximates c in ℓ1 distance on D. This was first proved by Blais et al. (2008) via a more involved argument.
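The product-distribution case is easy to probe numerically: a disjunction of all n variables is a symmetric function, so under Bernoulli(p)^n both a degree-d fit and its exact ℓ1 error reduce to the distribution of the number of ones. The sketch below uses weighted least squares as a convenient stand-in for ℓ1 regression; n, p, and the degrees are illustrative choices, not from the paper.

```python
import numpy as np
from math import comb

n, p = 20, 0.1
k = np.arange(n + 1)
# P(exactly k coordinates are one) under Bernoulli(p)^n
w = np.array([comb(n, i) * p**i * (1 - p)**(n - i) for i in k])
c = (k >= 1).astype(float)          # the disjunction x1 OR ... OR xn
t = k / n                           # rescale for numerical stability

errs = {}
for d in (0, 2, 4, 8):
    V = np.vander(t, d + 1, increasing=True)
    # weighted least squares fit of a degree-d polynomial in t
    coef, *_ = np.linalg.lstsq(V * np.sqrt(w)[:, None],
                               c * np.sqrt(w), rcond=None)
    errs[d] = np.sum(w * np.abs(V @ coef - c))   # exact l1 distance on D
    print(f"degree {d}: l1 error {errs[d]:.4f}")
```

The error drops quickly as the degree grows, consistent with a degree that scales like log(1/ε) rather than polynomially in 1/ε.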
Complexity of hyperconcepts
In machine learning, maximizing the sample margin can reduce the generalization error. Samples on which the target function has a large margin (γ) convey more information, since they yield more accurate hypotheses. Let X be a finite domain and let 𝒮 denote the set of all samples S ⊆ X of fixed cardinality m. Let H be a class of hypotheses h on X. A hyperconcept h′ is defined as the indicator function of the set A ⊆ 𝒮 of all samples on which the corresponding hypothesis h has a margin of at least γ. An estimate of the complexity of the class H′ of hyperconcepts h′ is obtained, with explicit dependence on γ, the pseudo-dimension of H, and m.
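The definition is concrete for a linear hypothesis class. The sketch below (assuming a linear hypothesis h(x) = ⟨w, x⟩; the data and all names are illustrative, not from the paper) builds the hyperconcept h′ for one hypothesis: an indicator over samples that fires exactly when every point of the sample clears the margin γ.

```python
import numpy as np

rng = np.random.default_rng(2)

def margin(w, X, y):
    """Smallest normalized margin of the linear hypothesis x -> <w, x>
    over the labeled sample (X, y)."""
    return np.min(y * (X @ w)) / np.linalg.norm(w)

def hyperconcept(w, gamma):
    """Indicator over samples: 1.0 iff the hypothesis w attains margin
    at least gamma on every point of the given sample."""
    return lambda X, y: float(margin(w, X, y) >= gamma)

# A sample S of cardinality m, labeled by a ground-truth direction.
m, dim = 30, 5
w_true = np.ones(dim)
X = rng.normal(size=(m, dim))
y = np.sign(X @ w_true)

h_prime = hyperconcept(w_true, gamma=0.01)
print(h_prime(X, y))
```

Ranging over all hypotheses h in H yields the class H′ whose complexity the abstract bounds in terms of γ, the pseudo-dimension of H, and m.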