
126 research outputs found

Effects of sampling skewness of the importance-weighted risk estimator on model selection

Authors: Kouw Wouter M.; Loog Marco
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 19/04/2018

Importance-weighting is a popular and well-researched technique for dealing with sample selection bias and covariate shift. It has desirable characteristics such as unbiasedness, consistency and low computational complexity. However, weighting can have a detrimental effect on an estimator as well. In this work, we empirically show that the sampling distribution of an importance-weighted estimator can be skewed. For sample selection bias settings, and for small sample sizes, the importance-weighted risk estimator produces overestimates for datasets in the body of the sampling distribution, i.e. the majority of cases, and large underestimates for datasets in the tail of the sampling distribution. These over- and underestimates of the risk lead to suboptimal regularization parameters when used for importance-weighted validation.
Comment: Conference paper, 6 pages, 5 figures
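
The skew described in this abstract can be explored with a small simulation. The following is a minimal sketch under assumed settings, not the paper's experiment: the target is N(0, 1), the source is a narrower N(0, 0.8^2), the true density ratio is used as the importance weight, and the loss x^2 is a hypothetical fixed loss whose target risk is 1. All function names are illustrative. The direction and size of the skew depend on these choices, so this toy setup is not expected to reproduce the paper's findings.

```python
import numpy as np


def gaussian_pdf(x, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def iw_risk_estimates(n_samples=20, n_repeats=5000, sigma_source=0.8, seed=0):
    """Draw many small source datasets and return one importance-weighted
    risk estimate per dataset (toy setup, not the paper's experiment)."""
    rng = np.random.default_rng(seed)
    # Source data are narrower than the target N(0, 1): a simple selection bias.
    x = rng.normal(0.0, sigma_source, size=(n_repeats, n_samples))
    # Assumption: the true density ratio p_T(x) / p_S(x) is known.
    weights = gaussian_pdf(x, 1.0) / gaussian_pdf(x, sigma_source)
    loss = x ** 2  # hypothetical fixed loss; its target risk is E_T[x^2] = 1
    return np.mean(weights * loss, axis=1)


def sample_skewness(values):
    """Third standardized moment of a one-dimensional sample."""
    centered = values - values.mean()
    return np.mean(centered ** 3) / np.std(values) ** 3


estimates = iw_risk_estimates()
print("mean of estimates:  ", estimates.mean())
print("median of estimates:", np.median(estimates))
print("sample skewness:    ", sample_skewness(estimates))
```

Comparing the mean with the median of the estimates gives a quick indication of how far the typical dataset sits from the average behaviour of the estimator.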

arXiv.org e-Print Archive

Crossref

On Regularization Parameter Estimation under Covariate Shift

Authors: Kouw Wouter M.; Loog Marco
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 31/07/2016

This paper identifies a problem with the usual procedure for L2-regularization parameter estimation in a domain adaptation setting. In such a setting, there are differences between the distributions generating the training data (source domain) and the test data (target domain). The usual cross-validation procedure requires validation data, which cannot be obtained from the unlabeled target data. The problem is that if one decides to use source validation data, the regularization parameter is underestimated. One possible solution is to scale the source validation data through importance weighting, but we show that this correction is not sufficient. We conclude the paper with an empirical analysis of the effect of several importance weight estimators on the estimation of the regularization parameter.
Comment: 6 pages, 2 figures, 2 tables. Accepted to ICPR 201
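
The two validation procedures contrasted in this abstract can be sketched as follows. This is an illustrative toy, not the authors' setup: it assumes a source N(0, 1), a target N(1, 1), known density-ratio weights exp(shift * x - shift^2 / 2), a polynomial ridge regression, and a hypothetical grid of regularization values. All names are made up for the example. It only shows the mechanics of selecting the L2 parameter with and without importance weights; it does not demonstrate the paper's conclusion that the weighted correction is insufficient.

```python
import numpy as np


def polynomial_features(x, degree=5):
    """Columns 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)


def ridge_fit(features, targets, lam):
    """Closed-form L2-regularized least squares."""
    d = features.shape[1]
    gram = features.T @ features + lam * np.eye(d)
    return np.linalg.solve(gram, features.T @ targets)


def validation_risks(lambdas, n_train=30, n_val=30, shift=1.0, seed=0):
    """Return (unweighted, importance-weighted) source validation risks."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, size=n_train + n_val)  # source inputs
    y = np.sin(x) + 0.3 * rng.normal(size=x.size)   # hypothetical labels
    x_tr, y_tr = x[:n_train], y[:n_train]
    x_val, y_val = x[n_train:], y[n_train:]
    # Assumption: true weights p_T / p_S for target N(shift, 1), source N(0, 1).
    weights = np.exp(shift * x_val - 0.5 * shift ** 2)
    plain, weighted = [], []
    for lam in lambdas:
        theta = ridge_fit(polynomial_features(x_tr), y_tr, lam)
        sq_err = (polynomial_features(x_val) @ theta - y_val) ** 2
        plain.append(sq_err.mean())
        weighted.append((weights * sq_err).mean())
    return np.array(plain), np.array(weighted)


lambdas = np.logspace(-3, 2, 6)
plain, weighted = validation_risks(lambdas)
best_plain = lambdas[np.argmin(plain)]
best_weighted = lambdas[np.argmin(weighted)]
print("lambda chosen by source validation:             ", best_plain)
print("lambda chosen by importance-weighted validation:", best_weighted)
```

In practice the weights would have to be estimated from data, which is the part the abstract says it analyses empirically.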

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System

A review of domain adaptation without target labels

Authors: Kouw Wouter M.; Loog Marco
Publication date: 01/01/2019

Domain adaptation has become a prominent problem setting in machine learning and related fields. This review asks the question: how can a classifier learn from a source domain and generalize to a target domain? We present a categorization of approaches, divided into what we refer to as sample-based, feature-based and inference-based methods. Sample-based methods focus on weighting individual observations during training based on their importance to the target domain. Feature-based methods revolve around mapping, projecting and representing features such that a source classifier performs well on the target domain. Inference-based methods incorporate adaptation into the parameter estimation procedure, for instance through constraints on the optimization procedure. Additionally, we review a number of conditions that allow for formulating bounds on the cross-domain generalization error. Our categorization highlights recurring ideas and raises questions important to further research.
Comment: 20 pages, 5 figures

arXiv.org e-Print Archive

Crossref

Target Contrastive Pessimistic Discriminant Analysis

Authors: Kouw Wouter M.; Loog Marco
Publication date: 21/06/2018

Domain-adaptive classifiers learn from a source domain and aim to generalize to a target domain. If the classifier's assumptions on the relationship between domains (e.g. covariate shift) are valid, then it will usually outperform a non-adaptive source classifier. Unfortunately, it can perform substantially worse when its assumptions are invalid. Validating these assumptions requires labeled target samples, which are usually not available. We argue that, in order to make domain-adaptive classifiers more practical, it is necessary to focus on robust methods; robust in the sense that the model still achieves a particular level of performance without making strong assumptions on the relationship between domains. With this objective in mind, we formulate a conservative parameter estimator that only deviates from the source classifier when a lower or equal risk is guaranteed for all possible labellings of the given target samples. We derive the corresponding estimator for a discriminant analysis model, and show that its risk is actually strictly smaller than that of the source classifier. Experiments indicate that our classifier outperforms state-of-the-art classifiers for geographically biased samples.
Comment: 9 pages, no figures, 2 tables. arXiv admin note: substantial text overlap with arXiv:1706.0808

arXiv.org e-Print Archive

Pure OAI Repository

Diplomacy in action: Latourian Politics and the Intergovernmental Panel on Climate Change

Authors: Kouw M; Petersen AC
Publication venue: The Finnish Society for Science and Technology Studies
Publication date: 10/05/2017

The Intergovernmental Panel on Climate Change (IPCC) reviews scientific literature on climate change in an attempt to make scientific knowledge about climate change accessible to a wide audience that includes policymakers. Documents produced by the IPCC are subject to negotiations in plenary sessions, which can be frustrating for the scientists and government delegations involved, who all have stakes in getting their respective interests met. This paper draws on the work of Bruno Latour in order to formulate a so-called ‘diplomatic’ approach to knowledge assessment in global climate governance. Such an approach, we argue, helps to make climate governance more inclusive by helping to identify values of parties involved with the IPCC plenaries, and allowing those parties to recognize their mutual interests and perspectives on climate change. Drawing on observations during IPCC plenaries, this paper argues that a Latourian form of diplomacy can lead to more inclusive negotiations in climate governance.

Crossref

UCL Discovery

Journal.fi

Robust importance-weighted cross-validation under sample selection bias

Authors: Kouw Wouter M.; Krijthe Jesse H.; Loog Marco
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 05/12/2019

Cross-validation under sample selection bias can, in principle, be done by importance-weighting the empirical risk. However, the importance-weighted risk estimator produces suboptimal hyperparameter estimates in problem settings where large weights arise with high probability. We study its sampling variance as a function of the training data distribution and introduce a control variate to increase its robustness to problematically large weights.
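
The control-variate idea mentioned in this abstract can be sketched generically. The abstract does not say which control variate the authors use; the sketch below assumes one common choice, the importance weights themselves, whose expectation under the source distribution is 1 when the weights are exact. The Gaussian source and target, the loss x^2, and all function names are hypothetical. The coefficient beta is estimated from the same sample, which introduces a small bias that this sketch does not correct.

```python
import numpy as np


def gaussian_pdf(x, sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma."""
    return np.exp(-0.5 * (x / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def risk_estimates(n_samples=20, n_repeats=5000, sigma_source=0.8, seed=0):
    """Return plain and control-variate importance-weighted risk estimates,
    one pair per simulated source dataset (toy setup)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma_source, size=(n_repeats, n_samples))
    # Assumption: exact density-ratio weights, so E_S[w] = 1.
    w = gaussian_pdf(x, 1.0) / gaussian_pdf(x, sigma_source)
    weighted_loss = w * x ** 2  # hypothetical loss x^2
    iw = weighted_loss.mean(axis=1)
    w_mean = w.mean(axis=1)
    # Per-dataset coefficient: beta = Cov(w * loss, w) / Var(w).
    w_centered = w - w_mean[:, None]
    wl_centered = weighted_loss - iw[:, None]
    beta = (w_centered * wl_centered).sum(axis=1) / (w_centered ** 2).sum(axis=1)
    # Subtract the scaled deviation of the mean weight from its known mean of 1.
    controlled = iw - beta * (w_mean - 1.0)
    return iw, controlled


iw, controlled = risk_estimates()
print("variance of plain estimator:          ", iw.var())
print("variance of control-variate estimator:", controlled.var())
```

Datasets that happen to contain a very large weight inflate both the weighted loss and the mean weight, so subtracting the scaled deviation of the mean weight from 1 damps exactly those cases.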

Pure OAI Repository