
    Transductive Learning with String Kernels for Cross-Domain Text Classification

    For many text classification tasks, there is a major problem posed by the lack of labeled data in a target domain. Although classifiers for a target domain can be trained on labeled text data from a related source domain, the accuracy of such classifiers is usually lower in the cross-domain setting. Recently, string kernels have obtained state-of-the-art results in various text classification tasks such as native language identification or automatic essay scoring. Moreover, classifiers based on string kernels have been found to be robust to the distribution gap between different domains. In this paper, we formally describe an algorithm composed of two simple yet effective transductive learning approaches to further improve the results of string kernels in cross-domain settings. By adapting string kernels to the test set without using the ground-truth test labels, we report significantly better accuracy rates in cross-domain English polarity classification. Comment: Accepted at ICONIP 2018. arXiv admin note: substantial text overlap with arXiv:1808.0840

    Transductive conformal inference with adaptive scores

    Conformal inference is a fundamental and versatile tool that provides distribution-free guarantees for many machine learning tasks. We consider the transductive setting, where decisions are made on a test sample of m new points, giving rise to m conformal p-values. While classical results only concern their marginal distribution, we show that their joint distribution follows a Pólya urn model, and establish a concentration inequality for their empirical distribution function. The results hold for arbitrary exchangeable scores, including adaptive ones that can use the covariates of the test+calibration samples at training stage for increased accuracy. We demonstrate the usefulness of these theoretical results through uniform, in-probability guarantees for two machine learning tasks of current interest: interval prediction for transductive transfer learning and novelty detection based on two-class classification. Comment: 27 pages, 6 Figure
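As a rough illustration of the m conformal p-values mentioned in the abstract above, the following sketch computes them from calibration and test scores. It assumes the common convention that a larger score means a more atypical point, and it uses the standard formula (1 + number of calibration scores at least as large as the test score) / (n + 1). The function name and the toy scores are hypothetical and are not taken from the paper.

```python
import numpy as np

def conformal_p_values(calib_scores, test_scores):
    """Conformal p-values for m test scores against n calibration scores.

    Assumption: larger score = more nonconforming (more atypical).
    """
    calib = np.asarray(calib_scores, dtype=float)
    test = np.asarray(test_scores, dtype=float)
    n = calib.size
    # For each test score, count calibration scores that are at least as large.
    counts = (calib[None, :] >= test[:, None]).sum(axis=1)
    # The "+1" terms account for the test point itself under exchangeability.
    return (1 + counts) / (n + 1)

# Toy example with n = 4 calibration scores and m = 3 test scores.
calib = [0.1, 0.4, 0.35, 0.8]
test = [0.5, 0.05, 0.9]
p = conformal_p_values(calib, test)
print(p)
```

All m p-values share the same calibration sample, which is why they are dependent; this is the joint behaviour the abstract describes.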

    MEG Decoding Across Subjects

    Brain decoding is a data analysis paradigm for neuroimaging experiments that is based on predicting the stimulus presented to the subject from the concurrent brain activity. In order to make inference at the group level, a straightforward but sometimes unsuccessful approach is to train a classifier on the trials of a group of subjects and then to test it on unseen trials from new subjects. The extreme difficulty is related to the structural and functional variability across the subjects. We call this approach "decoding across subjects". In this work, we address the problem of decoding across subjects for magnetoencephalographic (MEG) experiments and we provide the following contributions: first, we formally describe the problem and show that it belongs to a machine learning sub-field called transductive transfer learning (TTL). Second, we propose to use a simple TTL technique that accounts for the differences between training data and test data. Third, we propose the use of ensemble learning, and specifically of stacked generalization, to address the variability across subjects within the training data, with the aim of producing more stable classifiers. On a face vs. scramble task MEG dataset of 16 subjects, we compare the standard approach of not modelling the differences across subjects, to the proposed one of combining TTL and ensemble learning. We show that the proposed approach is consistently more accurate than the standard one.

    Unsupervised Domain Adaptation using Graph Transduction Games

    Unsupervised domain adaptation (UDA) amounts to assigning class labels to the unlabeled instances of a dataset from a target domain, using labeled instances of a dataset from a related source domain. In this paper, we propose to cast this problem in a game-theoretic setting as a non-cooperative game and introduce a fully automated iterative algorithm for UDA based on graph transduction games (GTG). The main advantages of this approach are its principled foundation, guaranteed termination of the iterative algorithms to a Nash equilibrium (which corresponds to a consistent labeling condition) and soft labels quantifying the uncertainty of the label assignment process. We also investigate the beneficial effect of using pseudo-labels from linear classifiers to initialize the iterative process. The performance of the resulting methods is assessed on publicly available object recognition benchmark datasets involving both shallow and deep features. Results of experiments demonstrate the suitability of the proposed game-theoretic approach for solving UDA tasks. Comment: Oral IJCNN 201
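The abstract above does not state the update rule, so the following is only a minimal sketch of how an iterative graph transduction game might look. It assumes discrete-time replicator dynamics over a nonnegative similarity matrix, with labeled (source) points held at one-hot strategies and unlabeled (target) points started from a uniform distribution. The function name `gtg_labels`, the toy matrix `W`, and the use of -1 to mark unlabeled points are all hypothetical choices for illustration.

```python
import numpy as np

def gtg_labels(W, labels, n_classes, n_iter=100, tol=1e-8):
    """Soft labels via replicator dynamics (illustrative assumption).

    W: (n, n) nonnegative similarity matrix.
    labels: length-n integer array, -1 marks an unlabeled point.
    """
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    n = W.shape[0]
    # Unlabeled points start from a uniform mixed strategy.
    X = np.full((n, n_classes), 1.0 / n_classes)
    labeled = labels >= 0
    # Labeled points play a fixed one-hot strategy.
    X[labeled] = np.eye(n_classes)[labels[labeled]]
    for _ in range(n_iter):
        payoff = W @ X                      # expected payoff per class
        X_new = X * payoff                  # replicator update (unnormalized)
        row_sums = X_new.sum(axis=1, keepdims=True)
        safe = np.where(row_sums > 0, row_sums, 1.0)
        # Rows with zero total payoff keep their previous strategy.
        X_new = np.where(row_sums > 0, X_new / safe, X)
        converged = np.abs(X_new - X).max() < tol
        X = X_new
        if converged:
            break
    return X

# Toy graph: points 0 and 1 are labeled, points 2 and 3 are unlabeled.
W = np.array([[0.0, 0.1, 0.9, 0.1],
              [0.1, 0.0, 0.1, 0.9],
              [0.9, 0.1, 0.0, 0.1],
              [0.1, 0.9, 0.1, 0.0]])
labels = np.array([0, 1, -1, -1])
X = gtg_labels(W, labels, n_classes=2)
hard = X.argmax(axis=1)
print(hard)
```

Each row of `X` is a probability distribution over classes, which corresponds to the soft labels the abstract mentions; taking the argmax gives a hard assignment.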