130 research outputs found

    Risque garanti pour les modèles de discrimination multi-classes (Guaranteed risk for multi-class discrimination models)

    Conference paper with proceedings and peer review. We study the generalization performance of multi-class discrimination systems. We establish two bounds on this performance, in terms of two capacity measures of the family of functions computed: the growth function and covering numbers. These bounds are evaluated on a classifier-combination model that estimates the posterior class probabilities. This makes it possible to compare how well the two capacity measures are suited to the task.
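
    A representative form of such a growth-function bound, for orientation only (this is the classical binary-classification statement, not the paper's multi-class result; L is the risk, L̂_m the empirical risk on m i.i.d. examples, and Π_F the growth function of the class F):

        \Pr\Big( \sup_{f \in \mathcal{F}} \big| L(f) - \hat{L}_m(f) \big| > \epsilon \Big) \;\le\; 4\,\Pi_{\mathcal{F}}(2m)\, e^{-m\epsilon^2 / 8}

    The paper's two bounds are of this flavour, measuring capacity once through the growth function and once through covering numbers of the multi-class family of functions.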

    Learning from compressed observations

    The problem of statistical learning is to construct a predictor of a random variable Y as a function of a related random variable X on the basis of an i.i.d. training sample from the joint distribution of (X, Y). Allowable predictors are drawn from some specified class, and the goal is to approach asymptotically the performance (expected loss) of the best predictor in the class. We consider the setting in which one has perfect observation of the X-part of the sample, while the Y-part has to be communicated at some finite bit rate. The encoding of the Y-values is allowed to depend on the X-values. Under suitable regularity conditions on the admissible predictors, the underlying family of probability distributions and the loss function, we give an information-theoretic characterization of achievable predictor performance in terms of conditional distortion-rate functions. The ideas are illustrated on the example of nonparametric regression in Gaussian noise. Comment: 6 pages; submitted to the 2007 IEEE Information Theory Workshop (ITW 2007).
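
    For orientation, a sketch of the two quantities such a characterization relates (notation assumed here, not taken from the paper): the expected loss of a predictor f, and the conditional distortion-rate function of Y given X at rate R,

        L(f) = \mathbb{E}\big[\ell(Y, f(X))\big], \qquad
        D_{Y|X}(R) = \inf \big\{ \mathbb{E}[d(Y, \hat{Y})] \;:\; I(Y; \hat{Y} \mid X) \le R \big\},

    the infimum being over conditional distributions of \hat{Y} given (X, Y). Achievable predictor performance under a rate-R encoding of the Y-part is then expressed in terms of D_{Y|X}(R).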

    Agnostic Learning of Disjunctions on Symmetric Distributions

    We consider the problem of approximating and learning disjunctions (or equivalently, conjunctions) on symmetric distributions over {0,1}^n. Symmetric distributions are distributions whose PDF is invariant under any permutation of the variables. We give a simple proof that for every symmetric distribution D, there exists a set S of n^{O(log(1/ε))} functions such that for every disjunction c, there is a function p, expressible as a linear combination of functions in S, such that p ε-approximates c in ℓ_1 distance on D, that is, E_{x~D}[|c(x) − p(x)|] ≤ ε. This directly gives an agnostic learning algorithm for disjunctions on symmetric distributions that runs in time n^{O(log(1/ε))}. The best previously known bound is n^{O(1/ε^4)} and follows from approximation of the more general class of halfspaces (Wimmer, 2010). We also show that there exists a symmetric distribution D such that the minimum degree of a polynomial that 1/3-approximates the disjunction of all n variables in ℓ_1 distance on D is Ω(√n). Therefore the learning result above cannot be achieved via ℓ_1-regression with a polynomial basis, the approach used in most other agnostic learning algorithms. Our technique also gives a simple proof that for any product distribution D and every disjunction c, there exists a polynomial p of degree O(log(1/ε)) such that p ε-approximates c in ℓ_1 distance on D. This was first proved by Blais et al. (2008) via a more involved argument.
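
    The Ω(√n) lower bound is aimed precisely at the ℓ_1-regression-over-a-polynomial-basis approach that most other agnostic learners use. As a point of reference, here is a minimal sketch of that baseline (not the paper's algorithm); the sample size, the toy disjunction, and all function names are illustrative assumptions:

        # Baseline sketch: agnostic learning via l1-regression over a low-degree
        # monomial basis (illustrative only; not the algorithm from the paper).
        import itertools
        import numpy as np
        from scipy.optimize import linprog

        def monomial_features(X, degree):
            """All monomials prod_{i in S} x_i with |S| <= degree, including the constant."""
            m, n = X.shape
            cols = [np.ones(m)]
            for d in range(1, degree + 1):
                for S in itertools.combinations(range(n), d):
                    cols.append(np.prod(X[:, S], axis=1))
            return np.column_stack(cols)

        def l1_regression(Phi, y):
            """Solve min_w sum_i |y_i - Phi_i . w| as a linear program."""
            m, p = Phi.shape
            # variables: [w (p, free), t (m, >= 0)]; objective: sum(t)
            c = np.concatenate([np.zeros(p), np.ones(m)])
            # encode t_i >= y_i - Phi_i w  and  t_i >= -(y_i - Phi_i w)
            A_ub = np.block([[-Phi, -np.eye(m)],
                             [ Phi, -np.eye(m)]])
            b_ub = np.concatenate([-y, y])
            bounds = [(None, None)] * p + [(0, None)] * m
            res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
            return res.x[:p]

        # toy example: disjunction of the first 3 of n = 8 variables, uniform product distribution
        rng = np.random.default_rng(0)
        n, m, degree = 8, 500, 3
        X = rng.integers(0, 2, size=(m, n)).astype(float)
        y = (X[:, :3].sum(axis=1) > 0).astype(float)

        Phi = monomial_features(X, degree)
        w = l1_regression(Phi, y)
        print("empirical l1 error:", np.mean(np.abs(Phi @ w - y)))

    The fit minimizes empirical ℓ_1 error over the chosen monomial basis; the paper's degree lower bound says that on some symmetric distributions no low-degree basis of this kind can 1/3-approximate the disjunction of all n variables.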

    Complexity of hyperconcepts

    In machine learning, maximizing the sample margin can reduce the generalization error of learning. Samples on which the target function has a large margin (γ) convey more information, since they yield more accurate hypotheses. Let X be a finite domain and 𝒮 denote the set of all samples S ⊆ X of fixed cardinality m. Let H be a class of hypotheses h on X. A hyperconcept h′ is defined as the indicator function of the set A ⊆ 𝒮 of all samples on which the corresponding hypothesis h has a margin of at least γ. An estimate on the complexity of the class H′ of hyperconcepts h′ is obtained, with explicit dependence on γ, the pseudo-dimension of H, and m.
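
    One way to formalize the hyperconcept construction (illustrative only; marg_h stands for whatever margin notion the paper uses, which the abstract does not specify):

        h'_{\gamma}(S) \;=\; \mathbb{1}\big\{\, \mathrm{marg}_h(x) \ge \gamma \ \text{for all } x \in S \,\big\}, \qquad S \subseteq X,\ |S| = m.

    The class H′ of such indicators over 𝒮 is the object whose complexity is bounded, with explicit dependence on γ, the pseudo-dimension of H, and m.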