Representation Learning for Clustering: A Statistical Framework
We address the problem of communicating domain knowledge from a user to the
designer of a clustering algorithm. We propose a protocol in which the user
provides a clustering of a relatively small random sample of a data set. The
algorithm designer then uses that sample to come up with a data representation
under which $k$-means clustering results in a clustering (of the full data set)
that is aligned with the user's clustering. We provide a formal statistical
model for analyzing the sample complexity of learning a clustering
representation with this paradigm. We then introduce a notion of capacity of a
class of possible representations, in the spirit of the VC-dimension, showing
that classes of representations that have finite such dimension can be
successfully learned with sample-size-dependent error bounds, and end our discussion with
an analysis of that dimension for classes of representations induced by linear
embeddings.
Comment: To be published in Proceedings of UAI 2015
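As a rough illustration of this protocol (not the paper's algorithm or its analysis), the sketch below fits a linear embedding to the user-clustered sample, using LDA purely as a stand-in representation learner, and then runs k-means on the full data set in that embedding; all names and parameters are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def learn_representation_and_cluster(X_full, sample_idx, sample_labels, k, seed=0):
    """Toy version of the protocol: the user clusters a small random sample
    (sample_idx, sample_labels); we fit a linear embedding on that sample
    (LDA here, as a stand-in for learning a representation) and then run
    k-means on the embedded full data set."""
    lda = LinearDiscriminantAnalysis(n_components=min(k - 1, X_full.shape[1]))
    lda.fit(X_full[sample_idx], sample_labels)
    Z = lda.transform(X_full)                    # learned linear embedding
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(Z)

# Toy usage: 3 clusters in R^5, user labels a random sample of 60 points.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 5))
X_full = np.vstack([c + rng.normal(size=(300, 5)) for c in centers])
true_labels = np.repeat([0, 1, 2], 300)
sample_idx = rng.choice(len(X_full), size=60, replace=False)
labels = learn_representation_and_cluster(X_full, sample_idx, true_labels[sample_idx], k=3)
```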
Sample-Efficient Learning of Mixtures
We consider PAC learning of probability distributions (a.k.a. density
estimation), where we are given an i.i.d. sample generated from an unknown
target distribution, and want to output a distribution that is close to the
target in total variation distance. Let $\mathcal{F}$ be an arbitrary class of
probability distributions, and let $\mathcal{F}^k$ denote the class of
$k$-mixtures of elements of $\mathcal{F}$. Assuming the existence of a method
for learning $\mathcal{F}$ with sample complexity $m_{\mathcal{F}}(\varepsilon)$,
we provide a method for learning $\mathcal{F}^k$ whose sample complexity exceeds
$m_{\mathcal{F}}(\varepsilon)$ by a multiplicative factor that is nearly linear in $k$
and polynomial in $1/\varepsilon$. Our mixture learning algorithm has the property
that, if the $\mathcal{F}$-learner is proper/agnostic, then the $\mathcal{F}^k$-learner
is proper/agnostic as well.
This general result enables us to improve the best known sample complexity
upper bounds for a variety of important mixture classes. First, we show that
the class of mixtures of $k$ axis-aligned Gaussians in $\mathbb{R}^d$ is
PAC-learnable in the agnostic setting with $\widetilde{O}\!\big(kd \cdot \mathrm{poly}(1/\varepsilon)\big)$
samples, which is tight in $k$ and $d$ up to logarithmic factors. Second, we
show that the class of mixtures of $k$ Gaussians in $\mathbb{R}^d$ is
PAC-learnable in the agnostic setting with sample complexity
$\widetilde{O}\!\big(kd^2 \cdot \mathrm{poly}(1/\varepsilon)\big)$, which improves the previously known
bounds in their dependence on $k$ and $d$. Finally,
we show that the class of mixtures of $k$ log-concave distributions over
$\mathbb{R}^d$ is PAC-learnable using a number of samples that is nearly linear
in $k$ (with a dimension-dependent factor).
Comment: A bug from the previous version, which appeared in the AAAI 2018 proceedings, is fixed. 18 pages.
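One standard ingredient in density-estimation reductions of this flavor is selecting, from finitely many candidate densities, one that is competitive in total variation distance. The sketch below implements a pairwise Scheffé tournament for one-dimensional candidates; it is an illustrative ingredient under simplifying assumptions (grid-based integration, candidates exposing their pdfs), not the paper's algorithm.

```python
import numpy as np
from scipy.stats import norm

def scheffe_winner(pdfs, data, grid):
    """Select, among finitely many candidate densities, one whose distance to
    the data-generating distribution is competitive, via pairwise Scheffe tests.

    pdfs : list of callables; pdfs[i](x) returns the density of candidate i at x
    data : 1-D array, i.i.d. sample from the unknown target distribution
    grid : fine 1-D grid used for numerical integration of the candidate densities
    """
    dx = grid[1] - grid[0]
    wins = np.zeros(len(pdfs), dtype=int)
    for i in range(len(pdfs)):
        for j in range(i + 1, len(pdfs)):
            A_grid = pdfs[i](grid) > pdfs[j](grid)            # Scheffe set {x : p_i(x) > p_j(x)}
            pi_A = np.sum(pdfs[i](grid)[A_grid]) * dx         # candidate i's mass on A
            pj_A = np.sum(pdfs[j](grid)[A_grid]) * dx         # candidate j's mass on A
            emp_A = np.mean(pdfs[i](data) > pdfs[j](data))    # empirical mass of A under the data
            # The candidate whose predicted mass of A is closer to the
            # empirical mass wins this pairwise comparison.
            if abs(pi_A - emp_A) <= abs(pj_A - emp_A):
                wins[i] += 1
            else:
                wins[j] += 1
    return int(np.argmax(wins))

# Toy usage: pick between two candidate densities for 1-D bimodal data.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 500), rng.normal(2, 1, 500)])
cand_a = lambda x: 0.5 * norm.pdf(x, -2, 1) + 0.5 * norm.pdf(x, 2, 1)
cand_b = lambda x: norm.pdf(x, 0, 3)
best = scheffe_winner([cand_a, cand_b], data, grid=np.linspace(-15, 15, 3001))
```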
On the Role of Noise in the Sample Complexity of Learning Recurrent Neural Networks: Exponential Gaps for Long Sequences
We consider the class of noisy multi-layered sigmoid recurrent neural
networks with (unbounded) weights for classification of sequences of length
$T$, where independent noise distributed according to $\mathcal{N}(0,\sigma^2)$
is added to the output of each neuron in the network. Our main result shows
that the sample complexity of PAC learning this class can be bounded by a
quantity that grows only logarithmically with $T$ (and with $1/\sigma$). For the
non-noisy version of the same class (i.e., $\sigma = 0$), we prove a lower bound
on the sample complexity that grows polynomially with $T$.
Our results indicate an exponential gap in the dependence of sample complexity
on $T$ for noisy versus non-noisy networks. Moreover, given the mild
logarithmic dependence of the upper bound on $1/\sigma$, this gap still holds
even for numerically negligible values of $\sigma$.
Comment: arXiv admin note: text overlap with arXiv:2206.0719
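For concreteness, here is a minimal numpy sketch of the noisy recurrent forward pass described above, assuming a single-layer Elman-style sigmoid recurrence with independent $\mathcal{N}(0,\sigma^2)$ noise added to every neuron's output; the layer sizes and weight names are illustrative.

```python
import numpy as np

def noisy_sigmoid_rnn(x_seq, W_in, W_rec, W_out, sigma, rng):
    """Forward pass of a single-layer sigmoid RNN in which independent
    N(0, sigma^2) noise is added to the output of every neuron.

    x_seq : array of shape (T, d_in), one input vector per time step
    """
    h = np.zeros(W_rec.shape[0])
    for x_t in x_seq:
        pre = W_in @ x_t + W_rec @ h
        h = 1.0 / (1.0 + np.exp(-pre))               # sigmoid activation
        h = h + rng.normal(0.0, sigma, h.shape)      # per-neuron Gaussian noise
    logit = W_out @ h
    return logit + rng.normal(0.0, sigma, logit.shape)  # noisy output neuron

# Toy usage: random weights, one length-T sequence, binary classification.
rng = np.random.default_rng(0)
T, d_in, d_hid = 20, 5, 16
W_in = rng.normal(size=(d_hid, d_in))
W_rec = rng.normal(size=(d_hid, d_hid))
W_out = rng.normal(size=(1, d_hid))
x_seq = rng.normal(size=(T, d_in))
label = int(noisy_sigmoid_rnn(x_seq, W_in, W_rec, W_out, sigma=0.1, rng=rng)[0] > 0)
```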
Adversarially Robust Learning with Tolerance
We initiate the study of tolerant adversarial PAC-learning with respect to
metric perturbation sets. In adversarial PAC-learning, an adversary is allowed
to replace a test point $x$ with an arbitrary point in a closed ball of radius
$r$ centered at $x$. In the tolerant version, the error of the learner is
compared with the best achievable error with respect to a slightly larger
perturbation radius $(1+\gamma)r$. This simple tweak helps us bridge the gap
between theory and practice and obtain the first PAC-type guarantees for
algorithmic techniques that are popular in practice.
Our first result concerns the widely used "perturb-and-smooth" approach for
adversarial learning. For perturbation sets with doubling dimension $d$, we
show that a variant of these approaches PAC-learns any hypothesis class
$\mathcal{H}$ with VC-dimension $v$ in the $\gamma$-tolerant adversarial
setting with a sample complexity bounded in terms of $v$, $d$, $\gamma$, and the accuracy parameters.
This is in contrast to the traditional (non-tolerant) setting in which, as we
show, the perturb-and-smooth approach can provably fail.
Our second result shows that one can PAC-learn the same class, even in the
agnostic setting, with a sample complexity that depends only linearly on the
doubling dimension as well as on the VC-dimension. This result is based on a
novel compression-based algorithm. This is in contrast to the non-tolerant
setting, where there is no known sample complexity upper bound that depends
polynomially on the VC-dimension.
Comment: The paper was accepted for ALT 2023
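As a rough, illustrative sketch of the prediction-time half of a perturb-and-smooth style method (randomized smoothing by majority vote over the slightly enlarged perturbation ball), and not the paper's exact variant: the base classifier, radius, and sample count below are placeholders.

```python
import numpy as np

def sample_in_ball(x, radius, rng):
    """Sample a point uniformly from the Euclidean ball of the given radius around x."""
    d = x.shape[0]
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    # Scale by u^(1/d) so the point is uniform in the ball, not just on the sphere.
    return x + radius * rng.uniform() ** (1.0 / d) * direction

def smoothed_predict(classify, x, radius, n_samples, rng):
    """Majority vote of a base classifier over random perturbations of x
    drawn from the (slightly enlarged) perturbation ball."""
    votes = [classify(sample_in_ball(x, radius, rng)) for _ in range(n_samples)]
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]

# Toy usage with a linear threshold base classifier in R^2 and radius (1 + gamma) * r.
rng = np.random.default_rng(1)
w = np.array([1.0, -2.0])
base = lambda z: int(z @ w > 0)
x_test = np.array([0.3, 0.1])
y_hat = smoothed_predict(base, x_test, radius=0.05 * (1 + 0.5), n_samples=200, rng=rng)
```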
Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples
We study the problem of estimating mixtures of Gaussians under the constraint
of differential privacy (DP). Our main result is that a polynomial number of
samples is sufficient to estimate a mixture of $k$ Gaussians up to total
variation distance $\alpha$ while satisfying $(\varepsilon,\delta)$-DP. This is the first finite sample
complexity upper bound for the problem that does not make any structural
assumptions on the GMMs.
To solve the problem, we devise a new framework which may be useful for other
tasks. On a high level, we show that if a class of distributions (such as
Gaussians) is (1) list decodable and (2) admits a "locally small" cover (Bun
et al., 2021) with respect to total variation distance, then the class of its
mixtures is privately learnable. The proof circumvents a known barrier
indicating that, unlike Gaussians, GMMs do not admit a locally small cover
(Aden-Ali et al., 2021b).
Polynomial Time and Private Learning of Unbounded Gaussian Mixture Models
We study the problem of privately estimating the parameters of
$d$-dimensional Gaussian Mixture Models (GMMs) with $k$ components. For this,
we develop a technique to reduce the problem to its non-private counterpart.
This allows us to privatize existing non-private algorithms in a blackbox
manner, while incurring only a small overhead in the sample complexity and
running time. As the main application of our framework, we develop an
$(\varepsilon,\delta)$-differentially private algorithm to learn GMMs using
the non-private algorithm of Moitra and Valiant [MV10] as a blackbox.
Consequently, this gives the first sample complexity upper bound and first
polynomial time algorithm for privately learning GMMs without any boundedness
assumptions on the parameters. As part of our analysis, we prove a tight (up to
a constant factor) lower bound on the total variation distance between
high-dimensional Gaussians, which can be of independent interest.
Comment: Accepted in ICML 2023
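The abstract describes privatizing a non-private estimator in a blackbox manner. As one generic illustration of such a pattern, the sketch below uses subsample-and-aggregate with the Gaussian mechanism; this is a stand-in, not necessarily the reduction used in the paper, and the clamping radius and estimator interface are assumptions.

```python
import numpy as np

def subsample_and_aggregate(data, estimator, n_blocks, clip_radius, eps, delta, rng):
    """Generic blackbox privatization: run a non-private estimator on disjoint
    blocks of the data, average the (clamped) per-block estimates, and add
    Gaussian noise calibrated to the sensitivity of that average.

    estimator   : callable mapping a data block to a parameter vector
    clip_radius : per-block estimates are projected onto an L2 ball of this radius
    """
    blocks = np.array_split(data, n_blocks)
    estimates = []
    for block in blocks:
        theta = np.asarray(estimator(block), dtype=float)
        norm = np.linalg.norm(theta)
        if norm > clip_radius:                       # project onto the L2 ball
            theta = theta * (clip_radius / norm)
        estimates.append(theta)
    mean_estimate = np.mean(estimates, axis=0)

    # Changing one data point changes one block, moving the average by at most
    # 2 * clip_radius / n_blocks in L2 norm; calibrate Gaussian noise to that.
    sensitivity = 2.0 * clip_radius / n_blocks
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return mean_estimate + rng.normal(0.0, sigma, size=mean_estimate.shape)

# Toy usage: privately estimate the mean of 1-D data with a trivial "estimator".
rng = np.random.default_rng(2)
data = rng.normal(loc=3.0, scale=1.0, size=10_000)
private_mean = subsample_and_aggregate(
    data, estimator=lambda b: [np.mean(b)], n_blocks=100,
    clip_radius=10.0, eps=1.0, delta=1e-6, rng=rng,
)
```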