Unlabeled sample compression schemes and corner peelings for ample and maximum classes
We examine connections between combinatorial notions that arise in machine
learning and topological notions in cubical/simplicial geometry. These
connections make it possible to export results from geometry to machine learning.
Our first main result is based on a geometric construction by Tracy Hall
(2004) of a partial shelling of the cross-polytope which cannot be extended.
We use it to derive a maximum class of VC dimension 3 that has no corners. This
refutes several previous works in machine learning from the past 11 years. In
particular, it implies that all previous constructions of optimal unlabeled
sample compression schemes for maximum classes are erroneous.
On the positive side we present a new construction of an unlabeled sample
compression scheme for maximum classes. We leave as open whether our unlabeled
sample compression scheme extends to ample (a.k.a. lopsided or extremal)
classes, which represent a natural and far-reaching generalization of maximum
classes. Towards resolving this question, we provide a geometric
characterization in terms of unique sink orientations of the 1-skeletons of
associated cubical complexes.
Unlabeled Sample Compression Schemes and Corner Peelings for Ample and Maximum Classes
We examine connections between combinatorial notions that arise in machine learning and topological notions in cubical/simplicial geometry. These connections make it possible to export results from geometry to machine learning. Our first main result is based on a geometric construction by H. Tracy Hall (2004) of a partial shelling of the cross-polytope which cannot be extended. We use it to derive a maximum class of VC dimension 3 that has no corners. This refutes several previous works in machine learning from the past 11 years. In particular, it implies that the previous constructions of optimal unlabeled compression schemes for maximum classes are erroneous.
On the positive side we present a new construction of an optimal unlabeled compression scheme for maximum classes. We leave as open whether our unlabeled compression scheme extends to ample (a.k.a. lopsided or extremal) classes, which represent a natural and far-reaching generalization of maximum classes. Towards resolving this question, we provide a geometric characterization in terms of unique sink orientations of the 1-skeletons of associated cubical complexes.
Sign rank versus VC dimension
This work studies the maximum possible sign rank of $N \times N$ sign
matrices with a given VC dimension $d$. For $d=1$, this maximum is three. For
$d=2$, this maximum is $\tilde{\Theta}(N^{1/2})$. For $d>2$, similar but
slightly less accurate statements hold. The lower bounds improve over previous
ones by Ben-David et al., and the upper bounds are novel.
The lower bounds are obtained by probabilistic constructions, using a theorem
of Warren in real algebraic topology. The upper bounds are obtained using a
result of Welzl about spanning trees with low stabbing number, and using the
moment curve.
The upper bound technique is also used to: (i) provide estimates on the
number of classes of a given VC dimension, and the number of maximum classes of
a given VC dimension -- answering a question of Frankl from '89, and (ii)
design an efficient algorithm that provides an $O(N/\log(N))$ multiplicative
approximation for the sign rank.
We also observe a general connection between sign rank and spectral gaps
which is based on Forster's argument. Consider the $N \times N$ adjacency
matrix of a $\Delta$-regular graph with a second eigenvalue of absolute value
$\lambda$ and $\Delta \leq N/2$. We show that the sign rank of the signed
version of this matrix is at least $\Delta/\lambda$. We use this connection to
prove the existence of a maximum class $C \subseteq \{\pm 1\}^N$ with VC
dimension $2$ and sign rank $\tilde{\Theta}(N^{1/2})$. This answers a question
of Ben-David et al. regarding the sign rank of large VC classes. We also
describe limitations of this approach, in the spirit of the Alon-Boppana
theorem.
We further describe connections to communication complexity, geometry,
learning theory, and combinatorics.
Comment: 33 pages. This is a revised version of the paper "Sign rank versus VC
dimension". Additional results in this version: (i) Estimates on the number
of maximum VC classes (answering a question of Frankl from '89). (ii)
Estimates on the sign rank of large VC classes (answering a question of
Ben-David et al. from '03). (iii) A discussion on the computational
complexity of computing the sign-ran
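
As an illustration of the VC dimension of a sign matrix mentioned in this abstract, here is a minimal brute-force sketch. It assumes the matrix is a non-empty list of rows with entries +1/-1, treats rows as concepts and columns as points, and is only practical for very small matrices. The function name `vc_dimension` is hypothetical and not taken from the paper; the sketch does not compute sign rank.

```python
from itertools import combinations


def vc_dimension(matrix):
    """Brute-force VC dimension of a non-empty sign matrix (rows of +1/-1).

    A set of k columns is shattered when the rows, restricted to those
    columns, realize all 2**k sign patterns.
    """
    n_cols = len(matrix[0])
    best = 0
    for k in range(1, n_cols + 1):
        shattered = any(
            len({tuple(row[c] for c in cols) for row in matrix}) == 2 ** k
            for cols in combinations(range(n_cols), k)
        )
        if not shattered:
            # Subsets of shattered sets are shattered, so no larger set works.
            break
        best = k
    return best
```

The early exit relies on monotonicity: if no set of k columns is shattered, no set of k+1 columns can be.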
Sources of Superlinearity in Davenport-Schinzel Sequences
A generalized Davenport-Schinzel sequence is one over a finite alphabet that
contains no subsequences isomorphic to a fixed forbidden subsequence. One of
the fundamental problems in this area is bounding (asymptotically) the maximum
length of such sequences. Following Klazar, let Ex(\sigma,n) be the maximum
length of a sequence over an alphabet of size n avoiding subsequences
isomorphic to \sigma. It has been proved that for every \sigma, Ex(\sigma,n) is
either linear or very close to linear; in particular it is O(n
2^{\alpha(n)^{O(1)}}), where \alpha is the inverse-Ackermann function and O(1)
depends on \sigma. However, very little is known about the properties of \sigma
that induce superlinearity of Ex(\sigma,n).
In this paper we exhibit an infinite family of independent superlinear
forbidden subsequences. To be specific, we show that there are 17 prototypical
superlinear forbidden subsequences, some of which can be made arbitrarily long
through a simple padding operation. Perhaps the most novel part of our
constructions is a new succinct code for representing superlinear forbidden
subsequences.
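
The notion of containing a subsequence isomorphic to \sigma can be sketched as follows: a sequence contains \sigma if some one-to-one renaming of \sigma's letters yields an ordinary subsequence. This is a brute-force illustration for tiny inputs only; the function names are hypothetical, and it checks avoidance alone, not any further conditions a full definition of Ex(\sigma,n) may impose.

```python
from itertools import permutations


def is_subsequence(pattern, seq):
    """Return True if pattern occurs in seq as a (not necessarily contiguous) subsequence."""
    it = iter(seq)
    # Each membership test advances the iterator, enforcing left-to-right order.
    return all(ch in it for ch in pattern)


def contains_isomorphic(sigma, seq):
    """Return True if seq has a subsequence isomorphic to sigma.

    Tries every injective renaming of sigma's letters into seq's alphabet.
    """
    letters = sorted(set(sigma))
    alphabet = sorted(set(seq))
    for image in permutations(alphabet, len(letters)):
        rename = dict(zip(letters, image))
        if is_subsequence([rename[ch] for ch in sigma], seq):
            return True
    return False
```

For example, `contains_isomorphic("abab", "xyzxzy")` finds the renaming a to x, b to z, while `"xyyx"` avoids `"abab"`.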
Quantum network communication -- the butterfly and beyond
We study the k-pair communication problem for quantum information in networks
of quantum channels. We consider the asymptotic rates of high fidelity quantum
communication between specific sender-receiver pairs. Four scenarios of
classical communication assistance (none, forward, backward, and two-way) are
considered. (i) We obtain outer and inner bounds of the achievable rate regions
in the most general directed networks. (ii) For two particular networks
(including the butterfly network) routing is proved optimal, and the free
assisting classical communication can at best be used to modify the directions
of quantum channels in the network. Consequently, the achievable rate regions
are given by counting edge avoiding paths, and precise achievable rate regions
in all four assisting scenarios can be obtained. (iii) Optimality of routing
can also be proved in classes of networks. The first class consists of directed
unassisted networks in which (1) the receivers are information sinks, (2) the
maximum distance from senders to receivers is small, and (3) a certain type of
4-cycle is absent, but without further constraints (such as on the number of
communicating and intermediate parties). The second class consists of arbitrary
backward-assisted networks with 2 sender-receiver pairs. (iv) Beyond the k-pair
communication problem, observations are made on quantum multicasting and a
static version of network communication related to the entanglement of
assistance.
Comment: 15 pages, 17 figures. Final versio