305 research outputs found

    Adapted tree boosting for Transfer Learning

    Secure online transaction processing is an essential task for e-commerce platforms. Alipay, one of the world's leading cashless payment platforms, provides payment services to both merchants and individual customers. Fraud detection models are built to protect customers, but new business scenarios, which lack training data and labels, raise stronger demands. The proposed model addresses this by utilizing data from similar old scenarios, while the data from a new scenario is treated as the target domain to be improved. Inspired by this real case at Alipay, we view the problem as a transfer learning problem and design a set of revision strategies to transfer the source-domain models to the target domain under the framework of gradient boosting tree models. This work provides an option for the cold-start and data-sharing problems.
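The paper's revision strategies are its own, but the general idea of bootstrapping a boosted model for a data-poor target domain can be sketched. In this toy example (all data, names, and the feature-augmentation transfer scheme are illustrative assumptions, not the paper's method), a source-trained gradient boosting model's score is fed to a small target-domain booster:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Source domain ("old scenes"): plentiful labelled transactions.
Xs = rng.normal(size=(2000, 5))
ys = (Xs[:, 0] + 0.5 * Xs[:, 1] > 0).astype(int)

# Target domain ("new scene"): a shifted variant with very few labels.
Xt = rng.normal(loc=0.3, size=(100, 5))
yt = (Xt[:, 0] + 0.5 * Xt[:, 1] > 0.3).astype(int)

# Train the source booster on abundant source data.
source_model = GradientBoostingClassifier(n_estimators=100, random_state=0)
source_model.fit(Xs, ys)

# Transfer: append the source model's score as an extra feature, so a small
# target booster only needs to learn a correction on top of it.
def augment(X):
    return np.hstack([X, source_model.predict_proba(X)[:, [1]]])

target_model = GradientBoostingClassifier(n_estimators=20, random_state=0)
target_model.fit(augment(Xt), yt)
acc = target_model.score(augment(Xt), yt)
```

Because the target booster starts from the source score, it can be kept small, which is one simple way to mitigate the cold-start problem the abstract describes.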

    Application of Transfer Learning Approaches in Multimodal Wearable Human Activity Recognition

    Through this project, we researched transfer learning methods and their applications to real-world problems. By implementing and modifying various transfer learning methods for our problem, we gained insight into the advantages and disadvantages of these methods, as well as experience in developing neural network models for knowledge transfer. Due to time constraints, we only applied one representative method for each major approach in transfer learning. As pointed out in the literature review, each method has its own assumptions, strengths, and shortcomings. We therefore believe that an ensemble-learning approach combining the different methods should yield better performance, which can be our future research focus.

    Selective Transfer Learning for Cross Domain Recommendation

    Collaborative filtering (CF) aims to predict users' ratings on items according to historical user-item preference data. In many real-world applications, preference data are usually sparse, which can make models overfit and fail to give accurate predictions. Recently, several research works have shown that by transferring knowledge from manually selected source domains, the data sparseness problem can be mitigated. However, in most cases, parts of the source domain data are not consistent with the observations in the target domain, which may misguide the target domain model building. In this paper, we propose a novel criterion based on empirical prediction error and its variance to better capture the consistency across domains in CF settings. We then embed this criterion into a boosting framework to perform selective knowledge transfer. Compared to several state-of-the-art methods, we show that our proposed selective transfer learning framework can significantly improve the accuracy of rating prediction on several real-world recommendation tasks.
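A rough illustration of the selection idea follows. This sketch uses synthetic data, and the filtering rule and model choices are assumptions for illustration, not the paper's boosting framework: it keeps only source instances whose empirical prediction error under a target-fitted model is low, then trains on the combined data:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
w_true = np.array([1.0, 0.5, 0.0, 0.0])      # shared preference pattern

# Sparse target-domain ratings.
Xt = rng.normal(size=(30, 4))
yt = Xt @ w_true + 0.1 * rng.normal(size=30)

# Source-domain ratings: half follow the target pattern, half do not.
Xs = rng.normal(size=(200, 4))
ys = Xs @ w_true
ys[100:] = Xs[100:] @ np.array([-1.0, 0.0, 1.0, 0.0])   # inconsistent half

# Consistency criterion: empirical prediction error of a target-fitted model
# on each source instance; keep only the low-error (consistent) part.
target_model = Ridge().fit(Xt, yt)
err = (target_model.predict(Xs) - ys) ** 2
keep = err < np.quantile(err, 0.5)

selective = Ridge().fit(np.vstack([Xt, Xs[keep]]),
                        np.concatenate([yt, ys[keep]]))
naive = Ridge().fit(np.vstack([Xt, Xs]), np.concatenate([yt, ys]))

# Held-out target-domain data to compare the two.
Xe = rng.normal(size=(200, 4))
ye = Xe @ w_true
mse_selective = np.mean((selective.predict(Xe) - ye) ** 2)
mse_naive = np.mean((naive.predict(Xe) - ye) ** 2)
```

Filtering out the inconsistent half of the source data should leave the selective model closer to the target's true preference pattern than a model trained on all source data indiscriminately.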

    Continual Learning in Deep Neural Network by Using a Kalman Optimiser

    Learning and adapting to new distributions, or learning new tasks sequentially without forgetting previously learned knowledge, is a challenging problem for continual learning models. Most conventional deep learning models are not capable of learning new tasks sequentially in one model without forgetting the previously learned ones. We address this issue by using a Kalman Optimiser, which divides the neural network into two parts: long-term and short-term memory units. The long-term memory unit is used to remember the learned tasks, and the short-term memory unit is used to adapt to the new task. We evaluated our method on the MNIST, CIFAR10, and CIFAR100 datasets and compared our results with state-of-the-art baseline models. The results show that our approach enables the model to continually learn and adapt to new changes without forgetting the previously learned tasks. Comment: accepted by an ICML workshop.
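A minimal sketch of the long-/short-term split (illustrative only; the actual Kalman Optimiser operates on a neural network's weights during training): each parameter carries an uncertainty, and the Kalman gain makes consolidated, low-uncertainty weights nearly immutable while uncertain weights adapt freely to the new task:

```python
import numpy as np

# Per-parameter Kalman-style update: each weight w[i] carries an uncertainty
# P[i]. Low-P weights act as long-term memory (they barely move); high-P
# weights act as short-term memory (they adapt quickly to the new task).
def kalman_step(w, P, grad, R=1.0):
    K = P / (P + R)            # per-parameter Kalman gain
    w_new = w - K * grad       # uncertain weights take larger steps
    P_new = (1.0 - K) * P      # uncertainty shrinks as weights consolidate
    return w_new, P_new

w = np.zeros(4)
P = np.array([1e-4, 1e-4, 1.0, 1.0])   # first two weights: consolidated
grad = np.ones(4)                       # gradient from the new task
w2, P2 = kalman_step(w, P, grad)
```

With equal gradients everywhere, only the high-uncertainty weights move appreciably, which is the mechanism that protects previously learned tasks from being overwritten.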

    Learn on Source, Refine on Target: A Model Transfer Learning Framework with Random Forests

    We propose novel model transfer-learning methods that refine a decision forest model M learned in a "source" domain using a training set sampled from a "target" domain, assumed to be a variation of the source. We present two random forest transfer algorithms. The first searches greedily for locally optimal modifications of each tree structure by trying to locally expand or reduce the tree around individual nodes. The second does not modify the structure, only the parameters (thresholds) associated with decision nodes. We also propose combining both methods by considering an ensemble that contains the union of the two forests. The proposed methods exhibit impressive experimental results over a range of problems. Comment: 2 columns, 14 pages, submitted to TPAMI.
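The second algorithm, threshold refinement, can be illustrated with a single decision stump (a deliberately minimal stand-in for a full forest; the class and data here are invented for this sketch): the split feature learned on the source is kept, and only the threshold is re-optimised on the target data:

```python
import numpy as np

# A one-node "tree" whose split threshold can be refined on target data
# while the structure (split feature, leaf labels) learned on the source
# domain is kept fixed.
class Stump:
    def __init__(self, feature=0):
        self.feature = feature
        self.threshold = 0.0

    def fit_threshold(self, X, y):
        # Pick the threshold maximising accuracy on the given data.
        cand = np.unique(X[:, self.feature])
        accs = [np.mean((X[:, self.feature] > t) == y) for t in cand]
        self.threshold = cand[int(np.argmax(accs))]

    def predict(self, X):
        return (X[:, self.feature] > self.threshold).astype(int)

rng = np.random.default_rng(2)
Xs = rng.normal(size=(500, 3))
ys = (Xs[:, 0] > 0).astype(int)                # source rule
Xt = rng.normal(size=(60, 3))
yt = (Xt[:, 0] > 0.8).astype(int)              # shifted target rule

stump = Stump(feature=0)
stump.fit_threshold(Xs, ys)                    # learn on source
src_acc = np.mean(stump.predict(Xt) == yt)
stump.fit_threshold(Xt, yt)                    # refine threshold on target
tgt_acc = np.mean(stump.predict(Xt) == yt)
```

Refining only the threshold is cheap and cannot hurt on the refinement data, which mirrors the appeal of the structure-preserving variant in the paper.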

    Zero-shot Domain Adaptation without Domain Semantic Descriptors

    We propose a method to infer domain-specific models, such as classifiers, for unseen domains, from which no data are given in the training phase, without domain semantic descriptors. When training and test distributions are different, standard supervised learning methods perform poorly. Zero-shot domain adaptation attempts to alleviate this problem by inferring models that generalize well to unseen domains, using training data from multiple source domains. Existing methods use observed semantic descriptors characterizing domains, such as time information, to infer the domain-specific models for the unseen domains. However, it cannot always be assumed that such metadata are available in real-world applications. The proposed method can infer appropriate domain-specific models without any semantic descriptors by introducing the concept of latent domain vectors, which are latent representations of the domains used for inferring the models. The latent domain vector for an unseen domain is inferred from the set of feature vectors in the corresponding domain, which is given in the testing phase. The domain-specific models consist of two components: the first extracts a representation of a feature vector to be predicted, and the second infers model parameters given the latent domain vector. The posterior distributions of the latent domain vectors and the domain-specific models are parametrized by neural networks, and are optimized by maximizing the variational lower bound using stochastic gradient descent. The effectiveness of the proposed method was demonstrated through experiments using one regression and two classification tasks. Comment: 10 pages, 10 figures.
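A toy version of the latent-domain-vector idea (purely illustrative; the paper infers the vector with a variational posterior parametrized by neural networks, not a sample mean) shows how an unlabelled feature set alone can characterize an unseen domain:

```python
import numpy as np

rng = np.random.default_rng(3)

# Every domain shifts the features by an unknown offset; the labelling rule
# is shared across domains once features are expressed in domain-centred
# coordinates.
def make_domain(offset, n=200):
    X = rng.normal(size=(n, 2)) + offset
    y = (X[:, 0] - offset[0] > 0).astype(int)
    return X, y

# "Latent domain vector": here simply the mean of the unlabelled feature
# set, a crude stand-in for the paper's inferred posterior.
def infer_domain_vector(X):
    return X.mean(axis=0)

def predict(X):
    z = infer_domain_vector(X)      # no labels, no semantic descriptors
    return (X[:, 0] - z[0] > 0).astype(int)

X_unseen, y_unseen = make_domain(np.array([5.0, -2.0]))   # unseen domain
acc = np.mean(predict(X_unseen) == y_unseen)
```

The classifier never sees the unseen domain's offset or any metadata describing it; the domain vector is recovered entirely from the test-time feature set, which is the key property the abstract emphasizes.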

    Viewpoint Adaptation for Rigid Object Detection

    An object detector performs suboptimally when applied to image data taken from a viewpoint different from the one with which it was trained. In this paper, we present a viewpoint adaptation algorithm that allows a trained single-view object detector to be adapted to a new, distinct viewpoint. We first illustrate how a feature space transformation can be inferred from a known homography between the source and target viewpoints. Second, we show that a variety of trained classifiers can be modified to behave as if that transformation were applied to each testing instance. The proposed algorithm is evaluated on a person detection task using images from the PETS 2007 and CAVIAR datasets, as well as from a new synthetic multi-view person detection dataset. It yields substantial performance improvements when adapting single-view person detectors to new viewpoints, and simultaneously reduces computational complexity. This work has the potential to improve detection performance for cameras viewing objects from arbitrary viewpoints, while simplifying data collection and feature extraction
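The core trick, making a trained classifier behave as if the viewpoint transformation were applied to each test instance, can be sketched for a linear detector and a linear feature map (a simplification of the homography-induced transform; all data and names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# Source-view features and a linear "detector" trained on that view.
Xs = rng.normal(size=(300, 2))
w = np.array([1.0, -1.0])                 # stand-in for trained weights
ys = (Xs @ w > 0).astype(int)

# Target viewpoint: features are the source features under a known linear
# map A (the analogue of the homography-induced feature transform).
A = np.array([[2.0, 0.3],
              [0.1, 1.5]])
Xt = Xs @ A.T                             # same scenes, new viewpoint

# Adapt the classifier once instead of transforming every test instance:
# w' = A^{-T} w gives w'.(A x) = w.x, so the adapted detector scores the
# new view exactly as the original scored the old one.
w_adapted = np.linalg.inv(A).T @ w
pred = (Xt @ w_adapted > 0).astype(int)
acc = np.mean(pred == ys)
```

Folding the transformation into the classifier weights is what lets the method avoid per-instance warping at test time, which is where the computational saving mentioned in the abstract comes from.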

    Multi-Fidelity Reinforcement Learning with Gaussian Processes

    We study the problem of Reinforcement Learning (RL) using as few real-world samples as possible. A naive application of RL can be inefficient in large and continuous state spaces. We present two versions of Multi-Fidelity Reinforcement Learning (MFRL), model-based and model-free, that leverage Gaussian Processes (GPs) to learn the optimal policy in a real-world environment. In the MFRL framework, an agent uses multiple simulators of the real environment to perform actions. With increasing fidelity in a simulator chain, the number of samples used in successively higher simulators can be reduced. By incorporating GPs in the MFRL framework, we empirically observe up to 40% reduction in the number of samples for model-based RL and 60% reduction for the model-free version. We examine the performance of our algorithms through simulations and through real-world experiments for navigation with a ground robot.
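A crude sketch of the multi-fidelity idea (illustrative only; the paper uses Gaussian Processes and full RL, which this toy replaces with a simple precision-weighted prior over bandit-style rewards): cheap simulator samples fix a prior estimate, and only a few expensive real-world samples are needed to refine it:

```python
import numpy as np

rng = np.random.default_rng(5)

# True reward of each action in the "real world", and in a cheap simulator
# that is biased but free to query (the low-fidelity level).
true_r = np.array([1.0, 2.0, 3.0])
sim_r = true_r + np.array([0.3, -0.2, 0.1])    # simulator bias

# Fidelity 1: estimate rewards with many free simulator samples.
sim_est = np.array([
    np.mean(sim_r[a] + rng.normal(0, 0.5, 1000)) for a in range(3)
])

# Fidelity 2: refine with only a few expensive real samples, treating the
# simulator estimate as a weak prior mean (a crude stand-in for the GP
# posterior used in the paper).
n_real, prior_w = 5, 0.5
real_est = np.array([
    (prior_w * sim_est[a]
     + n_real * np.mean(true_r[a] + rng.normal(0, 0.5, n_real)))
    / (prior_w + n_real)
    for a in range(3)
])

best = int(np.argmax(real_est))   # action chosen after refinement
```

Because the simulator already ranks the actions roughly correctly, five real samples per action suffice to confirm the best one, which is the sample-saving effect the abstract quantifies.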

    Regularized Bayesian transfer learning for population level etiological distributions

    Computer-coded verbal autopsy (CCVA) algorithms predict cause of death from high-dimensional family questionnaire data (verbal autopsies) of a deceased individual. CCVA algorithms are typically trained on non-local data, then used to generate national and regional estimates of cause-specific mortality fractions. These estimates may be inaccurate if the non-local training data differ from the local population of interest. This problem is a special case of transfer learning. However, most transfer learning classification approaches are concerned with individual (e.g. a person's) classification within a target domain (e.g. a particular population), with training performed on data from a source domain. Epidemiologists are often more interested in estimating population-level etiological distributions, using datasets much smaller than those in common transfer learning applications. We present a parsimonious hierarchical Bayesian transfer learning framework to directly estimate population-level class probabilities in a target domain. To address small sample sizes, we introduce a novel shrinkage prior for the transfer error rates guaranteeing that, in the absence of any labeled target domain data, or when the baseline classifier has zero transfer error, the calibrated estimate of class probabilities coincides with the naive estimates from the baseline classifier, thereby subsuming the default practice as a special case. A novel Gibbs sampler using data augmentation enables fast implementation. We extend our approach to use not one, but an ensemble of baseline classifiers. Theoretical and empirical results demonstrate how the ensemble model favors the most accurate baseline classifier. We present extensions allowing class probabilities to vary with covariates, and an EM-algorithm-based MAP estimation. An R package implementing this method has been developed.
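The shrinkage property highlighted in the abstract, that with no labelled target data the calibrated estimate coincides with the naive classifier output, can be sketched with a simple misclassification-matrix calibration (an illustrative stand-in for the hierarchical Bayesian model and Gibbs sampler; the matrix and fractions below are invented):

```python
import numpy as np

# Calibrate population-level class fractions from a baseline classifier's
# predicted fractions using its transfer (misclassification) matrix M,
# where M[i, j] = P(predicted class j | true class i).
def calibrate(pred_fracs, M, shrink=0.0):
    # Shrink M toward the identity: with shrink=1 (no labelled target
    # data) the calibrated estimate collapses to the naive fractions,
    # mirroring the paper's shrinkage-prior guarantee.
    M_s = (1.0 - shrink) * M + shrink * np.eye(len(pred_fracs))
    # Invert pred = M_s^T @ true to recover the true class fractions.
    return np.linalg.solve(M_s.T, pred_fracs)

pred = np.array([0.5, 0.3, 0.2])           # naive CCVA output fractions
M = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.0, 0.1, 0.9]])            # estimated transfer error rates

naive = calibrate(pred, M, shrink=1.0)     # no target labels: unchanged
adjusted = calibrate(pred, M, shrink=0.0)  # fully calibrated fractions
```

Because the rows of M sum to one, the calibrated fractions still sum to one; the shrinkage weight plays the role the paper assigns to the amount of labelled target data.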