14 research outputs found
Effects of sampling skewness of the importance-weighted risk estimator on model selection
Importance-weighting is a popular and well-researched technique for dealing
with sample selection bias and covariate shift. It has desirable
characteristics such as unbiasedness, consistency and low computational
complexity. However, weighting can have a detrimental effect on an estimator as
well. In this work, we empirically show that the sampling distribution of an
importance-weighted estimator can be skewed. For sample selection bias
settings, and for small sample sizes, the importance-weighted risk estimator
produces overestimates for datasets in the body of the sampling distribution,
i.e. the majority of cases, and large underestimates for data sets in the tail
of the sampling distribution. These over- and underestimates of the risk lead
to suboptimal regularization parameters when used for importance-weighted
validation.Comment: Conference paper, 6 pages, 5 figure
A review of domain adaptation without target labels
Domain adaptation has become a prominent problem setting in machine learning
and related fields. This review asks the question: how can a classifier learn
from a source domain and generalize to a target domain? We present a
categorization of approaches, divided into, what we refer to as, sample-based,
feature-based and inference-based methods. Sample-based methods focus on
weighting individual observations during training based on their importance to
the target domain. Feature-based methods revolve around on mapping, projecting
and representing features such that a source classifier performs well on the
target domain and inference-based methods incorporate adaptation into the
parameter estimation procedure, for instance through constraints on the
optimization procedure. Additionally, we review a number of conditions that
allow for formulating bounds on the cross-domain generalization error. Our
categorization highlights recurring ideas and raises questions important to
further research.Comment: 20 pages, 5 figure
Guiding new physics searches with unsupervised learning
We propose a new scientific application of unsupervised learning techniques to boost our ability to search for new phenomena in data, by detecting discrepancies between two datasets. These could be, for example, a simulated standard-model background, and an observed dataset containing a potential hidden signal of New Physics. We build a statistical test upon a test statistic which measures deviations between two samples, using a Nearest Neighbors approach to estimate the local ratio of the density of points. The test is model-independent and non-parametric, requiring no knowledge of the shape of the underlying distributions, and it does not bin the data, thus retaining full information from the multidimensional feature space. As a proof-of-concept, we apply our method to synthetic Gaussian data, and to a simulated dark matter signal at the Large Hadron Collider. Even in the case where the background can not be simulated accurately enough to claim discovery, the technique is a powerful tool to identify regions of interest for further study
Towards a Unified Analysis of Kernel-based Methods Under Covariate Shift
Covariate shift occurs prevalently in practice, where the input distributions
of the source and target data are substantially different. Despite its
practical importance in various learning problems, most of the existing methods
only focus on some specific learning tasks and are not well validated
theoretically and numerically. To tackle this problem, we propose a unified
analysis of general nonparametric methods in a reproducing kernel Hilbert space
(RKHS) under covariate shift. Our theoretical results are established for a
general loss belonging to a rich loss function family, which includes many
commonly used methods as special cases, such as mean regression, quantile
regression, likelihood-based classification, and margin-based classification.
Two types of covariate shift problems are the focus of this paper and the sharp
convergence rates are established for a general loss function to provide a
unified theoretical analysis, which concurs with the optimal results in
literature where the squared loss is used. Extensive numerical studies on
synthetic and real examples confirm our theoretical findings and further
illustrate the effectiveness of our proposed method.Comment: Poster to appear in Thirty-seventh Conference on Neural Information
Processing System