5,129 research outputs found
PAC Classification based on PAC Estimates of Label Class Distributions
A standard approach in pattern classification is to estimate the
distributions of the label classes, and then to apply the Bayes classifier to
the estimates of the distributions in order to classify unlabeled examples. As
one might expect, the better our estimates of the label class distributions,
the better the resulting classifier will be. In this paper we make this
observation precise by identifying risk bounds of a classifier in terms of the
quality of the estimates of the label class distributions. We show how PAC
learnability relates to estimates of the distributions that have a PAC
guarantee on their distance from the true distribution, and we bound the
increase in negative log likelihood risk in terms of PAC bounds on the
KL-divergence. We give an inefficient but general-purpose smoothing method for
converting an estimated distribution that is good under the metric into a
distribution that is good under the KL-divergence.Comment: 14 page
PAC-Bayes and Domain Adaptation
We provide two main contributions in PAC-Bayesian theory for domain
adaptation where the objective is to learn, from a source distribution, a
well-performing majority vote on a different, but related, target distribution.
Firstly, we propose an improvement of the previous approach we proposed in
Germain et al. (2013), which relies on a novel distribution pseudodistance
based on a disagreement averaging, allowing us to derive a new tighter domain
adaptation bound for the target risk. While this bound stands in the spirit of
common domain adaptation works, we derive a second bound (introduced in Germain
et al., 2016) that brings a new perspective on domain adaptation by deriving an
upper bound on the target risk where the distributions' divergence-expressed as
a ratio-controls the trade-off between a source error measure and the target
voters' disagreement. We discuss and compare both results, from which we obtain
PAC-Bayesian generalization bounds. Furthermore, from the PAC-Bayesian
specialization to linear classifiers, we infer two learning algorithms, and we
evaluate them on real data.Comment: Neurocomputing, Elsevier, 2019. arXiv admin note: substantial text
overlap with arXiv:1503.0694
A review of domain adaptation without target labels
Domain adaptation has become a prominent problem setting in machine learning
and related fields. This review asks the question: how can a classifier learn
from a source domain and generalize to a target domain? We present a
categorization of approaches, divided into, what we refer to as, sample-based,
feature-based and inference-based methods. Sample-based methods focus on
weighting individual observations during training based on their importance to
the target domain. Feature-based methods revolve around on mapping, projecting
and representing features such that a source classifier performs well on the
target domain and inference-based methods incorporate adaptation into the
parameter estimation procedure, for instance through constraints on the
optimization procedure. Additionally, we review a number of conditions that
allow for formulating bounds on the cross-domain generalization error. Our
categorization highlights recurring ideas and raises questions important to
further research.Comment: 20 pages, 5 figure
Learning using Local Membership Queries
We introduce a new model of membership query (MQ) learning, where the
learning algorithm is restricted to query points that are \emph{close} to
random examples drawn from the underlying distribution. The learning model is
intermediate between the PAC model (Valiant, 1984) and the PAC+MQ model (where
the queries are allowed to be arbitrary points).
Membership query algorithms are not popular among machine learning
practitioners. Apart from the obvious difficulty of adaptively querying
labelers, it has also been observed that querying \emph{unnatural} points leads
to increased noise from human labelers (Lang and Baum, 1992). This motivates
our study of learning algorithms that make queries that are close to examples
generated from the data distribution.
We restrict our attention to functions defined on the -dimensional Boolean
hypercube and say that a membership query is local if its Hamming distance from
some example in the (random) training data is at most . We show the
following results in this model:
(i) The class of sparse polynomials (with coefficients in R) over
is polynomial time learnable under a large class of \emph{locally smooth}
distributions using -local queries. This class also includes the
class of -depth decision trees.
(ii) The class of polynomial-sized decision trees is polynomial time
learnable under product distributions using -local queries.
(iii) The class of polynomial size DNF formulas is learnable under the
uniform distribution using -local queries in time
.
(iv) In addition we prove a number of results relating the proposed model to
the traditional PAC model and the PAC+MQ model
What Can We Learn Privately?
Learning problems form an important category of computational tasks that
generalizes many of the computations researchers apply to large real-life data
sets. We ask: what concept classes can be learned privately, namely, by an
algorithm whose output does not depend too heavily on any one input or specific
training example? More precisely, we investigate learning algorithms that
satisfy differential privacy, a notion that provides strong confidentiality
guarantees in contexts where aggregate information is released about a database
containing sensitive information about individuals. We demonstrate that,
ignoring computational constraints, it is possible to privately agnostically
learn any concept class using a sample size approximately logarithmic in the
cardinality of the concept class. Therefore, almost anything learnable is
learnable privately: specifically, if a concept class is learnable by a
(non-private) algorithm with polynomial sample complexity and output size, then
it can be learned privately using a polynomial number of samples. We also
present a computationally efficient private PAC learner for the class of parity
functions. Local (or randomized response) algorithms are a practical class of
private algorithms that have received extensive investigation. We provide a
precise characterization of local private learning algorithms. We show that a
concept class is learnable by a local algorithm if and only if it is learnable
in the statistical query (SQ) model. Finally, we present a separation between
the power of interactive and noninteractive local learning algorithms.Comment: 35 pages, 2 figure
A New PAC-Bayesian Perspective on Domain Adaptation
We study the issue of PAC-Bayesian domain adaptation: We want to learn, from
a source domain, a majority vote model dedicated to a target one. Our
theoretical contribution brings a new perspective by deriving an upper-bound on
the target risk where the distributions' divergence---expressed as a
ratio---controls the trade-off between a source error measure and the target
voters' disagreement. Our bound suggests that one has to focus on regions where
the source data is informative.From this result, we derive a PAC-Bayesian
generalization bound, and specialize it to linear classifiers. Then, we infer a
learning algorithmand perform experiments on real data.Comment: Published at ICML 201
Automatic plankton quantification using deep features
The study of marine plankton data is vital to monitor the health of the world’s oceans. In recent decades, automatic plankton recognition systems have proved useful to address the vast amount of data collected by specially engineered in situ digital imaging systems. At the beginning, these systems were developed and put into operation using traditional automatic classification techniques, which were fed with handdesigned local image descriptors (such as Fourier features), obtaining quite successful results. In the past few years, there have been many advances in the computer vision community with the rebirth of neural networks. In this paper, we leverage how descriptors computed using Convolutional Neural Networks (CNNs) trained with out-of-domain data are useful to replace hand-designed descriptors in the task of estimating the prevalence of each plankton class in a water sample. To achieve this goal, we have designed a broad set of experiments that show how effective these deep features are when working in combination with state-of-the-art quantification algorithms
- …