Transductive Learning with String Kernels for Cross-Domain Text Classification
For many text classification tasks, there is a major problem posed by the
lack of labeled data in a target domain. Although classifiers for a target
domain can be trained on labeled text data from a related source domain, the
accuracy of such classifiers is usually lower in the cross-domain setting.
Recently, string kernels have obtained state-of-the-art results in various text
classification tasks such as native language identification or automatic essay
scoring. Moreover, classifiers based on string kernels have been found to be
robust to the distribution gap between different domains. In this paper, we
formally describe an algorithm composed of two simple yet effective
transductive learning approaches to further improve the results of string
kernels in cross-domain settings. By adapting string kernels to the test set
without using the ground-truth test labels, we report significantly better
accuracy rates in cross-domain English polarity classification.
Comment: Accepted at ICONIP 2018. arXiv admin note: substantial text overlap
with arXiv:1808.0840
Transductive conformal inference with adaptive scores
Conformal inference is a fundamental and versatile tool that provides
distribution-free guarantees for many machine learning tasks. We consider the
transductive setting, where decisions are made on a test sample of new
points, giving rise to conformal p-values. While classical results only
concern their marginal distribution, we show that their joint distribution
follows a Pólya urn model, and establish a concentration inequality for their
empirical distribution function. The results hold for arbitrary exchangeable
scores, including adaptive ones that can use the covariates of the
test+calibration samples at the training stage for increased accuracy. We
demonstrate the usefulness of these theoretical results through uniform,
in-probability guarantees for two machine learning tasks of current interest:
interval prediction for transductive transfer learning and novelty detection
based on two-class classification.
Comment: 27 pages, 6 Figure
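As a rough illustration of the conformal p-values mentioned in this abstract, the sketch below computes the standard marginal conformal p-value for each test point from a set of calibration scores. It is a minimal sketch, not the paper's procedure: the function name `conformal_p_values` is hypothetical, and it assumes that a larger score means a more atypical point.

```python
import numpy as np

def conformal_p_values(calib_scores, test_scores):
    """Conformal p-value for each test score.

    Assumption: higher score = more atypical. For a test score s and
    n calibration scores, p = (1 + #{calibration scores >= s}) / (n + 1).
    """
    calib = np.asarray(calib_scores, dtype=float)
    test = np.asarray(test_scores, dtype=float)
    n = calib.size
    # For every test point, count calibration scores at least as large.
    counts = (calib[None, :] >= test[:, None]).sum(axis=1)
    return (1 + counts) / (n + 1)

# Hypothetical scores, for illustration only.
p = conformal_p_values([0.1, 0.2, 0.3, 0.4], [0.25, 0.5, 0.0])
```

All test points share the same calibration sample, so the resulting p-values are dependent; the joint behaviour of such a family is what the abstract describes.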
MEG Decoding Across Subjects
Brain decoding is a data analysis paradigm for neuroimaging experiments that
is based on predicting the stimulus presented to the subject from the
concurrent brain activity. In order to make inference at the group level, a
straightforward but sometimes unsuccessful approach is to train a classifier on
the trials of a group of subjects and then to test it on unseen trials from new
subjects. The extreme difficulty is related to the structural and functional
variability across the subjects. We call this approach "decoding across
subjects". In this work, we address the problem of decoding across subjects for
magnetoencephalographic (MEG) experiments and we provide the following
contributions: first, we formally describe the problem and show that it belongs
to a machine learning sub-field called transductive transfer learning (TTL).
Second, we propose to use a simple TTL technique that accounts for the
differences between train data and test data. Third, we propose the use of
ensemble learning, and specifically of stacked generalization, to address the
variability across subjects within train data, with the aim of producing more
stable classifiers. On a face vs. scramble task MEG dataset of 16 subjects, we
compare the standard approach of not modelling the differences across subjects,
to the proposed one of combining TTL and ensemble learning. We show that the
proposed approach is consistently more accurate than the standard one.
Unsupervised Domain Adaptation using Graph Transduction Games
Unsupervised domain adaptation (UDA) amounts to assigning class labels to the
unlabeled instances of a dataset from a target domain, using labeled instances
of a dataset from a related source domain. In this paper, we propose to cast
this problem in a game-theoretic setting as a non-cooperative game and
introduce a fully automatized iterative algorithm for UDA based on graph
transduction games (GTG). The main advantages of this approach are its
principled foundation, guaranteed termination of the iterative algorithms to a
Nash equilibrium (which corresponds to a consistent labeling condition) and
soft labels quantifying the uncertainty of the label assignment process. We
also investigate the beneficial effect of using pseudo-labels from linear
classifiers to initialize the iterative process. The performance of the
resulting methods is assessed on publicly available object recognition
benchmark datasets involving both shallow and deep features. Results of
experiments demonstrate the suitability of the proposed game-theoretic approach
for solving UDA tasks.
Comment: Oral IJCNN 201
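The iterative, soft-label idea in this abstract can be sketched as label propagation on a similarity graph. The abstract does not name the update rule, so the code below assumes discrete replicator dynamics, a common choice for graph transduction games; the function name `graph_transduction_game`, the similarity matrix `W`, and the convention that `-1` marks an unlabeled point are all hypothetical.

```python
import numpy as np

def graph_transduction_game(W, labels, n_classes, n_iter=100, tol=1e-6):
    """Soft labels from replicator dynamics on a similarity graph.

    W        : (n, n) non-negative similarity matrix (assumed given).
    labels   : length-n integer array, -1 for unlabeled (target) points.
    Returns an (n, n_classes) array whose rows are label distributions.
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    n = W.shape[0]
    labeled = labels >= 0
    # Unlabeled points start uniform; labeled points are fixed one-hot.
    X = np.full((n, n_classes), 1.0 / n_classes)
    X[labeled] = np.eye(n_classes)[labels[labeled]]
    for _ in range(n_iter):
        payoff = W @ X                      # support each class gets from neighbours
        X_new = X * payoff                  # replicator update (unnormalised)
        sums = X_new.sum(axis=1, keepdims=True)
        # Normalise rows; rows with zero total support keep their old value.
        X_new = np.divide(X_new, sums, out=X.copy(), where=sums > 0)
        X_new[labeled] = X[labeled]         # keep source labels clamped
        converged = np.abs(X_new - X).max() < tol
        X = X_new
        if converged:
            break
    return X

# Toy graph: points 0 and 3 are labeled, points 1 and 2 are not.
W = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.1, 1.0, 0.0]])
soft = graph_transduction_game(W, [0, -1, -1, 1], n_classes=2)
hard = soft.argmax(axis=1)
```

Each row of `soft` is a probability vector over classes, so it gives a measure of labeling uncertainty in addition to the hard assignment in `hard`.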