On PAC Learning Halfspaces in Non-interactive Local Privacy Model with Public Unlabeled Data
In this paper, we study the problem of PAC learning halfspaces in the
non-interactive local differential privacy model (NLDP). To breach the barrier
of exponential sample complexity, previous results studied a relaxed setting
where the server has access to some additional public but unlabeled data. We
continue in this direction. Specifically, we consider the problem under the
standard setting instead of the large margin setting studied before. Under
different mild assumptions on the underlying data distribution, we propose two
approaches that are based on the Massart noise model and self-supervised
learning and show that it is possible to achieve sample complexities that are
only linear in the dimension and polynomial in other terms for both private and
public data, which significantly improve the previous results. Our methods
could also be used for other private PAC learning problems.
Comment: To appear in The 14th Asian Conference on Machine Learning (ACML 2022).
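As a point of reference for the non-interactive local model, the classical randomized-response mechanism shows how each user can perturb a single binary label locally, with no interaction, before it ever reaches the server. The sketch below is only an illustration of an epsilon-LDP label release under standard assumptions; it is not the paper's protocol, and all names are chosen for exposition.

```python
import numpy as np

def randomized_response(label, epsilon, rng):
    """Release a binary label in {-1, +1} under epsilon-local DP.

    The label is kept with probability e^eps / (1 + e^eps) and flipped
    otherwise, which satisfies epsilon-LDP for a single binary value.
    Generic illustration only, not the mechanism studied in the paper.
    """
    keep_prob = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    return label if rng.random() < keep_prob else -label

rng = np.random.default_rng(0)
true_labels = rng.choice([-1, 1], size=10)
private_labels = [randomized_response(y, epsilon=1.0, rng=rng) for y in true_labels]
```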
Efficient Learning of Linear Separators under Bounded Noise
We study the learnability of linear separators in $\mathbb{R}^d$ in the presence of
bounded (a.k.a. Massart) noise. This is a realistic generalization of the random
classification noise model, where the adversary can flip each example $x$ with
probability $\eta(x) \le \eta$. We provide the first polynomial time algorithm
that can learn linear separators to arbitrarily small excess error in this
noise model under the uniform distribution over the unit ball in $\mathbb{R}^d$, for
some constant value of $\eta$. While widely studied in the statistical learning
theory community in the context of getting faster convergence rates,
computationally efficient algorithms in this model had remained elusive. Our
work provides the first evidence that one can indeed design algorithms
achieving arbitrarily small excess error in polynomial time under this
realistic noise model and thus opens up a new and exciting line of research.
We additionally provide lower bounds showing that popular algorithms such as
hinge loss minimization and averaging cannot lead to arbitrarily small excess
error under Massart noise, even under the uniform distribution. Our work
instead makes use of a margin-based technique developed in the context of
active learning. As a result, our algorithm is also an active learning
algorithm with label complexity that is only logarithmic in the desired excess
error.
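To make the bounded (Massart) noise model concrete: each label of the target halfspace is flipped independently with an instance-dependent probability $\eta(x)$ that never exceeds a fixed bound $\eta < 1/2$. The sketch below simulates this setting under the uniform distribution on the unit sphere (a stand-in for the unit ball); the particular choice of $\eta(x)$ and all names are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def sample_massart_halfspace(n, d, w_star, eta_max, rng):
    """Draw n points uniformly from the unit sphere in R^d and label them by
    the halfspace sign(<w_star, x>), flipping each label independently with an
    instance-dependent probability eta(x) <= eta_max < 1/2 (Massart noise).
    The specific eta(x) below is an arbitrary illustrative choice."""
    X = rng.normal(size=(n, d))
    X /= np.linalg.norm(X, axis=1, keepdims=True)      # uniform on the unit sphere
    clean = np.sign(X @ w_star)
    eta_x = eta_max * np.abs(np.cos(X[:, 0]))           # any function bounded by eta_max works
    flips = rng.random(n) < eta_x
    y = np.where(flips, -clean, clean)
    return X, y

rng = np.random.default_rng(0)
d = 5
w_star = np.zeros(d); w_star[0] = 1.0
X, y = sample_massart_halfspace(1000, d, w_star, eta_max=0.3, rng=rng)
```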
Robust Learning under Strong Noise via SQs
This work provides several new insights on the robustness of Kearns' statistical query framework against challenging label-noise models. First, we build on a recent result by \cite{DBLP:journals/corr/abs-2006-04787} that showed noise tolerance of distribution-independently evolvable concept classes under Massart noise. Specifically, we extend their characterization to more general noise models, including the Tsybakov model, which considerably generalizes the Massart condition by allowing the flipping probability to be arbitrarily close to $1/2$ for a subset of the domain. As a corollary, we employ an evolutionary algorithm by \cite{DBLP:conf/colt/KanadeVV10} to obtain the first polynomial time algorithm with arbitrarily small excess error for learning linear threshold functions over any spherically symmetric distribution in the presence of spherically symmetric Tsybakov noise. Moreover, we posit access to a stronger oracle, in which for every labeled example we additionally obtain its flipping probability. In this model, we show that every SQ learnable class admits an efficient learning algorithm with $\mathrm{OPT} + \epsilon$ misclassification error for a broad class of noise models. This setting substantially generalizes the widely-studied problem of classification under RCN with known noise rate, and corresponds to a non-convex optimization problem even when the noise function -- i.e., the flipping probabilities of all points -- is known in advance.
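As a reminder of how the statistical query framework operates, an SQ learner never inspects individual labeled examples; it submits bounded query functions $\phi(x, y)$ and receives $\mathbb{E}[\phi]$ up to an additive tolerance $\tau$. The snippet below simulates such an oracle from a finite sample purely for illustration; the class name, tolerance handling, and example query are assumptions made for this sketch, not part of the paper.

```python
import numpy as np

class EmpiricalSQOracle:
    """Answers statistical queries E[phi(x, y)] within tolerance tau,
    simulated here by an empirical average plus a bounded perturbation.
    Purely illustrative; SQ lower bounds reason about any oracle whose
    answers are tau-accurate."""
    def __init__(self, X, y, tau, rng):
        self.X, self.y, self.tau, self.rng = X, y, tau, rng

    def query(self, phi):
        # phi must map (x, y) into [-1, 1]
        values = np.array([phi(x, label) for x, label in zip(self.X, self.y)])
        return float(values.mean() + self.rng.uniform(-self.tau, self.tau))

# Example query: correlation of the first coordinate with the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=500))
oracle = EmpiricalSQOracle(X, y, tau=0.01, rng=rng)
corr = oracle.query(lambda x, label: np.clip(x[0] * label, -1.0, 1.0))
```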
Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability
In this paper we revisit some classic problems on classification under
misspecification. In particular, we study the problem of learning halfspaces
under Massart noise with rate $\eta$. In a recent work, Diakonikolas,
Gouleakis, and Tzamos resolved a long-standing problem by giving the first
efficient algorithm for learning to accuracy $\eta + \epsilon$ for any
$\epsilon > 0$. However, their algorithm outputs a complicated hypothesis,
which partitions space into $\mathrm{poly}(d, 1/\epsilon)$ regions. Here we give a
much simpler algorithm and in the process resolve a number of outstanding open
questions:
(1) We give the first proper learner for Massart halfspaces that achieves
$\eta + \epsilon$. We also give improved bounds on the sample complexity
achievable by polynomial time algorithms.
(2) Based on (1), we develop a blackbox knowledge distillation procedure to
convert an arbitrarily complex classifier to an equally good proper classifier.
(3) By leveraging a simple but overlooked connection to evolvability, we show
any SQ algorithm requires super-polynomially many queries to achieve
$\mathsf{OPT} + \epsilon$.
Moreover we study generalized linear models where $\mathbb{E}[Y \mid \mathbf{X}] = \sigma(\langle \mathbf{w}^*, \mathbf{X} \rangle)$ for any odd, monotone, and
Lipschitz function $\sigma$. This family includes the previously mentioned
halfspace models as a special case, but is much richer and includes other
fundamental models like logistic regression. We introduce a challenging new
corruption model that generalizes Massart noise, and give a general algorithm
for learning in this setting. Our algorithms are based on a small set of core
recipes for learning to classify in the presence of misspecification.
Finally we study our algorithm for learning halfspaces under Massart noise
empirically and find that it exhibits some appealing fairness properties.
Comment: 51 pages, comments welcome.
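The general shape of a blackbox distillation step (item (2) above) can be pictured as: query the complex teacher on fresh unlabeled points and fit a single halfspace to the teacher's predictions. The sketch below is a generic illustration of that idea only; the teacher, student model, and data are placeholders, and it carries none of the paper's procedure or guarantees.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def distill_to_halfspace(teacher_predict, unlabeled_X):
    """Fit a proper (linear) student to the teacher's predictions on
    unlabeled data.  Generic illustration of blackbox distillation,
    not the specific procedure or analysis from the paper."""
    pseudo_labels = teacher_predict(unlabeled_X)        # labels in {-1, +1}
    student = LogisticRegression().fit(unlabeled_X, pseudo_labels)
    return student

# Placeholder teacher: some complicated, improper decision rule.
teacher = lambda X: np.sign(np.sin(X[:, 0]) + 0.5 * X[:, 1])
rng = np.random.default_rng(0)
X_unlabeled = rng.normal(size=(2000, 2))
student = distill_to_halfspace(teacher, X_unlabeled)
```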
Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach
We study the problem of computationally and label efficient PAC active
learning $d$-dimensional halfspaces with Tsybakov
Noise~\citep{tsybakov2004optimal} under structured unlabeled data
distributions. Inspired by~\cite{diakonikolas2020learning}, we prove that any
approximate first-order stationary point of a smooth nonconvex loss function
yields a halfspace with a low excess error guarantee. In light of the above
structural result, we design a nonconvex optimization-based algorithm with a
label complexity of \footnote{In the main body
of this work, we use $\tilde{O}(\cdot)$ to hide factors
of the form $\mathrm{polylog}(d, \frac{1}{\epsilon}, \frac{1}{\delta})$.}, under the
assumption that the Tsybakov noise parameter , which
narrows down the gap between the label complexities of the previously known
efficient passive or active
algorithms~\citep{diakonikolas2020polynomial,zhang2021improved} and the
information-theoretic lower bound in this setting.
Comment: 29 pages.
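To make the structural claim concrete, the approach can be pictured as running projected gradient descent on a smooth nonconvex surrogate loss over the unit sphere until the projected gradient is small, and returning the resulting halfspace. The sketch below uses a sigmoid-style surrogate and plain projected gradient descent as illustrative stand-ins; the actual loss, label-query strategy, and guarantees in the paper differ.

```python
import numpy as np

def sigmoid_surrogate_grad(w, X, y, scale=5.0):
    """Gradient of the smooth nonconvex surrogate
    (1/n) * sum_i sigma(-scale * y_i * <w, x_i>), sigma = logistic function.
    Illustrative choice of loss only."""
    margins = -scale * y * (X @ w)
    s = 1.0 / (1.0 + np.exp(-margins))          # sigma(margins)
    coeff = s * (1.0 - s) * (-scale * y)        # chain rule factor per sample
    return (X * coeff[:, None]).mean(axis=0)

def find_stationary_halfspace(X, y, steps=500, lr=0.5, tol=1e-3, rng=None):
    """Projected gradient descent on the unit sphere, stopping at an
    approximate first-order stationary point (small projected gradient)."""
    rng = rng or np.random.default_rng(0)
    w = rng.normal(size=X.shape[1]); w /= np.linalg.norm(w)
    for _ in range(steps):
        g = sigmoid_surrogate_grad(w, X, y)
        g_tangent = g - (g @ w) * w              # project gradient onto tangent space
        if np.linalg.norm(g_tangent) < tol:
            break                                # approximate stationary point reached
        w = w - lr * g_tangent
        w /= np.linalg.norm(w)                   # project back onto the sphere
    return w
```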