Agnostic Learning of Halfspaces with Gradient Descent via Soft Margins
We analyze the properties of gradient descent on convex surrogates for the
zero-one loss for the agnostic learning of linear halfspaces. If OPT
is the best classification error achieved by a halfspace, then by appealing to the
notion of soft margins we are able to show that gradient descent finds
halfspaces with classification error Õ(OPT^{1/2}) + ε in poly(d, 1/ε) time and
sample complexity for a broad class of distributions that includes log-concave
isotropic distributions as a subclass. Along the way we answer a question
recently posed by Ji et al. (2020) on how the tail behavior of a loss function
can affect sample complexity and runtime guarantees for gradient descent.
Comment: 25 pages, 1 table
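As a concrete illustration of the setting in this abstract, here is a minimal sketch (not the paper's algorithm or analysis) of gradient descent on the logistic loss, a convex surrogate for the zero-one loss, applied to halfspace learning with a small fraction of adversarially flipped labels; the data model, learning rate, and iteration count are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: labels from a ground-truth halfspace, with a small fraction
# of labels flipped, so the best halfspace has error OPT ~ 5%.
d, n = 5, 2000
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)
X = rng.normal(size=(n, d))            # isotropic Gaussian features
y = np.sign(X @ w_star)
flip = rng.random(n) < 0.05
y[flip] *= -1

# Gradient descent on the logistic loss log(1 + exp(-y <w, x>)),
# a convex surrogate for the zero-one loss.
w = np.zeros(d)
lr = 0.1
for _ in range(500):
    margins = y * (X @ w)
    weights = 1.0 / (1.0 + np.exp(margins))        # -dloss/dmargin
    grad = -(y[:, None] * X * weights[:, None]).mean(axis=0)
    w -= lr * grad

# Zero-one (classification) error of the learned halfspace.
err = np.mean(np.sign(X @ w) != y)
```

With this setup the learned direction aligns closely with `w_star`, so the zero-one error ends up near the label-flip rate rather than near 50%.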
Sample-Optimal PAC Learning of Halfspaces with Malicious Noise
We study efficient PAC learning of homogeneous halfspaces in R^d
in the presence of the malicious noise of Valiant (1985). This is a challenging
noise model, and only recently has a near-optimal noise tolerance bound been
established under the mild condition that the unlabeled data distribution is
isotropic log-concave. However, it remained unsettled how to obtain the optimal
sample complexity simultaneously. In this work, we present a new analysis of
the algorithm of Awasthi et al. (2017) and show that it essentially achieves
the near-optimal sample complexity bound of Õ(d), improving on the best
known result of Õ(d²). Our main ingredient is a novel incorporation
of a Matrix Chernoff-type inequality to bound the spectrum of an empirical
covariance matrix for well-behaved distributions, in conjunction with a careful
exploration of the localization schemes of Awasthi et al. (2017). We further
extend the algorithm and analysis to the more general and stronger nasty noise
model of Bshouty et al. (2002), showing that it is still possible to achieve
near-optimal noise tolerance and sample complexity in polynomial time.
Comment: arXiv admin note: text overlap with arXiv:2006.0378
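The quantity that a Matrix Chernoff-type bound controls here can be sketched numerically. The snippet below (an illustration, not the paper's analysis) measures how far the empirical covariance of isotropic Gaussian samples deviates from the identity in spectral norm as the sample size grows:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 10

# Spectral-norm deviation ||Sigma_hat - I_d||_2 of the empirical
# covariance from the identity, for isotropic samples with E[x x^T] = I_d.
devs = {}
for n in (100, 1000, 10000):
    X = rng.normal(size=(n, d))
    Sigma_hat = X.T @ X / n                          # empirical covariance
    devs[n] = np.linalg.norm(Sigma_hat - np.eye(d), 2)
```

The deviation shrinks on the order of sqrt(d/n), which is the kind of concentration of the empirical spectrum that such matrix inequalities certify with high probability.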
On InstaHide, Phase Retrieval, and Sparse Matrix Factorization
In this work, we examine the security of InstaHide, a scheme recently
proposed by [Huang, Song, Li and Arora, ICML'20] for preserving the privacy of
private datasets in the context of distributed learning. To generate a
synthetic training example to be shared among the distributed learners,
InstaHide takes a convex combination of private feature vectors and randomly
flips the sign of each entry of the resulting vector with probability 1/2. A
salient question is whether this scheme is secure in any provable sense,
perhaps under a plausible hardness assumption and assuming the distributions
generating the public and private data satisfy certain properties.
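The encoding step described above can be sketched as follows. This is a toy illustration under the isotropic Gaussian assumption; the function name and the number of mixed private/public vectors are expository choices, not taken from the InstaHide implementation:

```python
import numpy as np

rng = np.random.default_rng(2)

def instahide_encode(private, public, k_priv=2, k_pub=2):
    """Toy sketch of InstaHide-style encoding (hypothetical helper).

    Mixes k_priv private and k_pub public feature vectors with random
    convex-combination weights, then flips the sign of each coordinate
    independently with probability 1/2.
    """
    d = private.shape[1]
    idx_priv = rng.choice(len(private), size=k_priv, replace=False)
    idx_pub = rng.choice(len(public), size=k_pub, replace=False)
    lam = rng.dirichlet(np.ones(k_priv + k_pub))   # convex weights, sum to 1
    mix = lam[:k_priv] @ private[idx_priv] + lam[k_priv:] @ public[idx_pub]
    signs = rng.choice([-1.0, 1.0], size=d)        # each sign flipped w.p. 1/2
    return signs * mix

# Isotropic Gaussian private and public vectors, as in the paper's assumption.
private = rng.normal(size=(50, 8))
public = rng.normal(size=(100, 8))
synthetic = instahide_encode(private, public)
```

The random sign flips destroy the coordinate-wise signs of the mixture, which is what connects recovery of the private vectors to a missing-data variant of phase retrieval.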
We show that the answer to this appears to be quite subtle and closely
related to the average-case complexity of a new multi-task, missing-data
version of the classic problem of phase retrieval. Motivated by this
connection, we design a provable algorithm that can recover private vectors
using only the public vectors and synthetic vectors generated by InstaHide,
under the assumption that the private and public vectors are isotropic
Gaussian.
Comment: 30 pages, to appear in ICLR 2021, v2: updated discussion of follow-up
work