
13 research outputs found

Unlabeled Sample Compression Schemes and Corner Peelings for Ample and Maximum Classes

Author: Chepoi Victor
Moran Shay
Warmuth Manfred K.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 46th International Colloquium on Automata, Languages, and Programming (ICALP 2019)
Publication date: 01/01/2019

We examine connections between combinatorial notions that arise in machine learning and topological notions in cubical/simplicial geometry. These connections make it possible to export results from geometry to machine learning. Our first main result is based on a geometric construction by H. Tracy Hall (2004) of a partial shelling of the cross-polytope which cannot be extended. We use it to derive a maximum class of VC dimension 3 that has no corners. This refutes several previous works in machine learning from the past 11 years. In particular, it implies that the previous constructions of optimal unlabeled compression schemes for maximum classes are erroneous. On the positive side, we present a new construction of an optimal unlabeled compression scheme for maximum classes. We leave open whether our unlabeled compression scheme extends to ample (a.k.a. lopsided or extremal) classes, which represent a natural and far-reaching generalization of maximum classes. Towards resolving this question, we provide a geometric characterization in terms of unique sink orientations of the 1-skeletons of associated cubical complexes.

HAL AMU

Dagstuhl Research Online Publication Server

Bounding Embeddings of VC Classes into Maximum Classes

Author: Benjamin I. P. Rubinstein
J. Hyam Rubinstein
Peter L. Bartlett
Publication date: 28/01/2014

One of the earliest conjectures in computational learning theory, the Sample Compression conjecture, asserts that concept classes (equivalently set systems) admit compression schemes of size linear in their VC dimension. To date this statement is known to be true for maximum classes, those that possess maximum cardinality for their VC dimension. The most promising approach to positively resolving the conjecture is by embedding general VC classes into maximum classes without super-linear increase to their VC dimensions, as such embeddings would extend the known compression schemes to all VC classes. We show that maximum classes can be characterised by a local-connectivity property of the graph obtained by viewing the class as a cubical complex. This geometric characterisation of maximum VC classes is applied to prove a negative embedding result which demonstrates VC-d classes that cannot be embedded in any maximum class of VC dimension lower than 2d. On the other hand, we show that every VC-d class C embeds in a VC-(d+D) maximum class where D is the deficiency of C, i.e., the difference between the cardinalities of a maximum VC-d class and of C. For VC-2 classes in binary n-cubes for 4 <= n <= 6, we give best possible results on embedding into maximum classes. For some special classes of Boolean functions, relationships with maximum classes are investigated. Finally we give a general recursive procedure for embedding VC-d classes into VC-(d+k) maximum classes for smallest k.

Comment: 22 pages, 2 figures
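The notions of VC dimension, maximum class, and deficiency used in this abstract can be illustrated on tiny examples. The following is a minimal brute-force sketch, not code from the paper: it assumes a non-empty concept class given as 0/1 tuples of length n, and the function names (`shatters`, `vc_dimension`, `sauer_bound`, `deficiency`, `is_maximum`) are hypothetical. It takes "maximum" to mean that the class size meets the Sauer-Shelah bound, the sum of binomial coefficients C(n, i) for i = 0..d.

```python
from itertools import combinations
from math import comb


def shatters(concepts, idx):
    """True if the class realises all 2^|idx| patterns on the coordinates idx."""
    patterns = {tuple(c[i] for i in idx) for c in concepts}
    return len(patterns) == 2 ** len(idx)


def vc_dimension(concepts, n):
    """Size of the largest shattered coordinate set (brute force, small n only)."""
    d = 0
    for k in range(1, n + 1):
        if any(shatters(concepts, idx) for idx in combinations(range(n), k)):
            d = k
        else:
            # Subsets of shattered sets are shattered, so no larger set can work.
            break
    return d


def sauer_bound(n, d):
    """Cardinality of a maximum class of VC dimension d on n coordinates."""
    return sum(comb(n, i) for i in range(d + 1))


def deficiency(concepts, n):
    """Gap D between the Sauer-Shelah bound and the actual class size."""
    return sauer_bound(n, vc_dimension(concepts, n)) - len(concepts)


def is_maximum(concepts, n):
    return deficiency(concepts, n) == 0


# Illustrative classes on n = 3 coordinates (assumed examples, not from the paper).
at_most_one = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)}
two_points = {(0, 0, 0), (1, 1, 0)}
```

Here `at_most_one` has VC dimension 1 and meets the bound of 4 concepts, so it is maximum, while `two_points` has VC dimension 1 and deficiency 2. The search is exponential in n, so it is only meant for toy classes.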

arXiv.org e-Print Archive

CiteSeerX

Crossref

Queensland University of Technology ePrints Archive

University of Melbourne Institutional Repository

Supervised Learning Through the Lens of Compression

Author: Moran S.
Ofir D.
Yehudayoff A.
Publication date: 01/01/2016

MPG.PuRe

Sign rank versus VC dimension

Author: Alon Noga
Moran Shay
Yehudayoff Amir
Publication date: 01/01/2016

This work studies the maximum possible sign rank of $N \times N$ sign matrices with a given VC dimension $d$. For $d=1$, this maximum is three. For $d=2$, this maximum is $\tilde{\Theta}(N^{1/2})$. For $d>2$, similar but slightly less accurate statements hold. The lower bounds improve over previous ones by Ben-David et al., and the upper bounds are novel. The lower bounds are obtained by probabilistic constructions, using a theorem of Warren in real algebraic topology. The upper bounds are obtained using a result of Welzl about spanning trees with low stabbing number, and using the moment curve. The upper bound technique is also used to: (i) provide estimates on the number of classes of a given VC dimension, and the number of maximum classes of a given VC dimension, answering a question of Frankl from '89, and (ii) design an efficient algorithm that provides an $O(N/\log(N))$ multiplicative approximation for the sign rank. We also observe a general connection between sign rank and spectral gaps which is based on Forster's argument. Consider the $N \times N$ adjacency matrix of a $\Delta$ regular graph with a second eigenvalue of absolute value $\lambda$ and $\Delta \leq N/2$. We show that the sign rank of the signed version of this matrix is at least $\Delta/\lambda$. We use this connection to prove the existence of a maximum class $C\subseteq\{\pm 1\}^N$ with VC dimension $2$ and sign rank $\tilde{\Theta}(N^{1/2})$. This answers a question of Ben-David et al. regarding the sign rank of large VC classes. We also describe limitations of this approach, in the spirit of the Alon-Boppana theorem. We further describe connections to communication complexity, geometry, learning theory, and combinatorics.

Comment: 33 pages. This is a revised version of the paper "Sign rank versus VC dimension". Additional results in this version: (i) Estimates on the number of maximum VC classes (answering a question of Frankl from '89). (ii) Estimates on the sign rank of large VC classes (answering a question of Ben-David et al. from '03). (iii) A discussion on the computational complexity of computing the sign-rank

arXiv.org e-Print Archive

MPG.PuRe

Unlabeled sample compression schemes and corner peelings for ample and maximum classes

Author: Chalopin Jérémie
Chepoi Victor
Moran Shay
Warmuth Manfred K.
Publication date: 05/12/2018

We examine connections between combinatorial notions that arise in machine learning and topological notions in cubical/simplicial geometry. These connections make it possible to export results from geometry to machine learning. Our first main result is based on a geometric construction by Tracy Hall (2004) of a partial shelling of the cross-polytope which cannot be extended. We use it to derive a maximum class of VC dimension 3 that has no corners. This refutes several previous works in machine learning from the past 11 years. In particular, it implies that all previous constructions of optimal unlabeled sample compression schemes for maximum classes are erroneous. On the positive side, we present a new construction of an unlabeled sample compression scheme for maximum classes. We leave open whether our unlabeled sample compression scheme extends to ample (a.k.a. lopsided or extremal) classes, which represent a natural and far-reaching generalization of maximum classes. Towards resolving this question, we provide a geometric characterization in terms of unique sink orientations of the 1-skeletons of associated cubical complexes.

arXiv.org e-Print Archive

HAL AMU

Unlabeled Compression Schemes Exceeding the VC-dimension

Author: Pálvölgyi Dömötör
Tardos Gábor
Publication venue: Elsevier BV
Publication date: 01/01/2018

Repository of the Academy's Library