3 research outputs found
Optimal PAC Bounds Without Uniform Convergence
In statistical learning theory, determining the sample complexity of
realizable binary classification for VC classes was a long-standing open
problem. The results of Simon and Hanneke established sharp upper bounds in
this setting. However, the reliance of their argument on the uniform
convergence principle limits its applicability to more general learning
settings such as multiclass classification. In this paper, we address this
issue by providing optimal high probability risk bounds through a framework
that surpasses the limitations of uniform convergence arguments.
Our framework converts the leave-one-out error of permutation invariant
predictors into high probability risk bounds. As an application, by adapting
the one-inclusion graph algorithm of Haussler, Littlestone, and Warmuth, we
propose an algorithm that achieves an optimal PAC bound for binary
classification. Specifically, our result shows that certain aggregations of
one-inclusion graph algorithms are optimal, addressing a variant of a classic
question posed by Warmuth.
We further instantiate our framework in three settings where uniform
convergence is provably suboptimal. For multiclass classification, we prove an
optimal risk bound that scales with the one-inclusion hypergraph density of the
class, addressing the suboptimality of the analysis of Daniely and
Shalev-Shwartz. For partial hypothesis classification, we determine the optimal
sample complexity bound, resolving a question posed by Alon, Hanneke, Holzman,
and Moran. For realizable bounded regression with absolute loss, we derive an
optimal risk bound that relies on a modified version of the scale-sensitive
dimension, refining the results of Bartlett and Long. Our rates surpass
standard uniform convergence-based results due to the smaller complexity
measure in our risk bound.
Comment: 27 pages
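The central object in the framework above is the leave-one-out error of a permutation-invariant learning rule. As a minimal illustrative sketch (the estimator only, not the paper's aggregation scheme or its high-probability conversion), it can be written in a few lines of Python:

```python
def leave_one_out_error(learn, sample):
    """Empirical leave-one-out error of a learning rule.

    `learn` maps a list of (x, y) pairs to a predictor h with h(x) -> label.
    Permutation invariance of `learn` is what the paper's framework
    exploits; the estimator below is generic.
    """
    mistakes = 0
    for i, (x, y) in enumerate(sample):
        held_out = sample[:i] + sample[i + 1:]  # drop the i-th example
        h = learn(held_out)                     # retrain without it
        mistakes += int(h(x) != y)              # test on the held-out point
    return mistakes / len(sample)
```

For instance, a constant predictor that always outputs 1 has leave-one-out error equal to the fraction of 0-labels in the sample.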
On the amortized complexity of approximate counting
Naively storing a counter up to value $n$ would require $\Theta(\log n)$ bits
of memory. Nelson and Yu [NY22], following work of [Morris78], showed that if
the query answers need only be $(1+\varepsilon)$-approximate with probability at
least $1-\delta$, then $O(\log\log n + \log(1/\varepsilon) + \log\log(1/\delta))$
bits suffice, and in fact this bound is tight. Morris'
original motivation for studying this problem though, as well as modern
applications, require not only maintaining one counter, but rather $k$ counters
for $k$ large. This motivates the following question: for $k$ large, can $k$
counters be simultaneously maintained using asymptotically less memory than $k$
times the cost of an individual counter? That is to say, does this problem
benefit from an improved {\it amortized} space complexity bound?
We answer this question in the negative. Specifically, we prove a lower bound
for nearly the full range of parameters showing that, in terms of memory usage,
there is no asymptotic benefit possible via amortization when storing multiple
counters. Our main proof utilizes a certain notion of "information cost"
recently introduced by Braverman, Garg and Woodruff in FOCS 2020 to prove lower
bounds for streaming algorithms.
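The classic single-counter scheme of [Morris78] that underlies this line of work can be sketched as follows. This is the textbook construction for illustration only, not the tight counter of [NY22] nor the multi-counter setting studied in the paper:

```python
import random

class MorrisCounter:
    """Textbook Morris-style approximate counter (illustrative sketch).

    Stores only the exponent x, so the memory footprint is roughly
    log log n bits for counts up to n; the count estimate is 2^x - 1,
    which is an unbiased estimator of the true count.
    """
    def __init__(self):
        self.x = 0

    def increment(self):
        # Bump the exponent with probability 2^(-x).
        if random.random() < 2.0 ** (-self.x):
            self.x += 1

    def estimate(self):
        return 2 ** self.x - 1
```

After $n$ increments the exponent is typically close to $\log_2 n$, which is why only $O(\log\log n)$ bits are needed to store it.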
Majority-of-Three: The Simplest Optimal Learner?
Developing an optimal PAC learning algorithm in the realizable setting, where
empirical risk minimization (ERM) is suboptimal, was a major open problem in
learning theory for decades. The problem was finally resolved by Hanneke a few
years ago. Unfortunately, Hanneke's algorithm is quite complex as it returns
the majority vote of many ERM classifiers that are trained on carefully
selected subsets of the data. It is thus a natural goal to determine the
simplest algorithm that is optimal. In this work we study the arguably simplest
algorithm that could be optimal: returning the majority vote of three ERM
classifiers. We show that this algorithm achieves the optimal in-expectation
bound on its error, which is provably unattainable by a single ERM classifier.
Furthermore, we prove a near-optimal high-probability bound on this algorithm's
error. We conjecture that a better analysis will prove that this algorithm is
in fact optimal in the high-probability regime.
Comment: 22 pages
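As a toy illustration of the majority-of-three idea, here is a sketch for the simple class of 1-D threshold classifiers. Both the three-way random split and the threshold ERM are assumptions made for this example; they are not the paper's construction or analysis:

```python
import random

def erm(sample):
    # Toy ERM over thresholds h_t(x) = 1[x >= t]: with realizable data,
    # the smallest positive example is a consistent threshold.
    positives = [x for x, y in sample if y == 1]
    return min(positives) if positives else float("inf")

def majority_of_three(sample):
    # Train three ERM classifiers on three disjoint random splits and
    # return their majority vote (one simple splitting scheme; the
    # paper's analysis may use a different one).
    sample = list(sample)
    random.shuffle(sample)
    thresholds = [erm(sample[i::3]) for i in range(3)]

    def h(x):
        votes = sum(1 for t in thresholds if x >= t)
        return 1 if votes >= 2 else 0

    return h
```

On realizable data generated by a true threshold, the voted classifier agrees with the target on points away from the threshold, while each individual ERM can err on a larger region near it.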