Adapted tree boosting for Transfer Learning
Secure online transactions are essential for e-commerce platforms. Alipay, one
of the world's leading cashless payment platforms, provides payment services to
both merchants and individual customers. Fraud detection models are built to
protect customers, but new scenarios, which lack training data and labels,
raise stronger demands. The proposed model addresses this by utilizing data
from similar old scenarios, while data from a new scenario is treated as the
target domain to be improved. Inspired by this real case at Alipay, we view the
problem as a transfer learning problem and design a set of revise strategies to
transfer the source domain models to the target domain under the framework of
gradient boosting tree models. This work provides an option for the cold-start
and data-sharing problems.
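The revise strategies above can be sketched in miniature: keep the structure of each source-domain tree and re-estimate only its leaf values from target-domain residuals. This is a hedged sketch under the assumption of boosted depth-1 regression trees; `Stump`, `boost`, and `refit_leaves` are illustrative names, not the paper's actual API.

```python
class Stump:
    """A depth-1 regression tree: one threshold split, one value per side."""
    def __init__(self, feature, threshold, left_value, right_value):
        self.feature, self.threshold = feature, threshold
        self.left_value, self.right_value = left_value, right_value

    def predict(self, x):
        return self.left_value if x[self.feature] <= self.threshold else self.right_value


def fit_stump(X, residuals):
    # Exhaustive search over (feature, threshold) minimizing squared error.
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[f] <= t]
            right = [r for row, r in zip(X, residuals) if row[f] > t]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, Stump(f, t, lv, rv))
    return best[1]


def boost(X, y, n_rounds=5, lr=0.5):
    """Plain gradient boosting on squared loss, fit in the source domain."""
    pred, ensemble = [0.0] * len(y), []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        ensemble.append(stump)
        pred = [p + lr * stump.predict(x) for p, x in zip(pred, X)]
    return ensemble


def predict_ensemble(ensemble, x, lr=0.5):
    return sum(lr * stump.predict(x) for stump in ensemble)


def refit_leaves(ensemble, X_t, y_t, lr=0.5):
    """One revise strategy: keep each source tree's split, but recompute its
    leaf values from target-domain residuals."""
    pred = [0.0] * len(y_t)
    for stump in ensemble:
        residuals = [yi - pi for yi, pi in zip(y_t, pred)]
        left = [r for x, r in zip(X_t, residuals) if x[stump.feature] <= stump.threshold]
        right = [r for x, r in zip(X_t, residuals) if x[stump.feature] > stump.threshold]
        if left:
            stump.left_value = sum(left) / len(left)
        if right:
            stump.right_value = sum(right) / len(right)
        pred = [p + lr * stump.predict(x) for p, x in zip(pred, X_t)]
    return ensemble
```

On a toy target whose labels are a rescaled version of the source's, refitting the leaves alone already recovers most of the shift without growing any new trees.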
Application of Transfer Learning Approaches in Multimodal Wearable Human Activity Recognition
Through this project, we researched transfer learning methods and their
applications to real-world problems. By implementing and modifying various
transfer learning methods for our problem, we gained insight into the
advantages and disadvantages of these methods, as well as experience in
developing neural network models for knowledge transfer. Due to time
constraints, we applied only one representative method for each major approach
in transfer learning. As pointed out in the literature review, each method has
its own assumptions, strengths, and shortcomings. We therefore believe that an
ensemble-learning approach combining the different methods should yield better
performance, which can be our future research focus.
Selective Transfer Learning for Cross Domain Recommendation
Collaborative filtering (CF) aims to predict users' ratings on items
according to historical user-item preference data. In many real-world
applications, preference data are usually sparse, which would make models
overfit and fail to give accurate predictions. Recently, several research works
show that by transferring knowledge from some manually selected source domains,
the data sparseness problem could be mitigated. However for most cases, parts
of source domain data are not consistent with the observations in the target
domain, which may misguide the target domain model building. In this paper, we
propose a novel criterion based on empirical prediction error and its variance
to better capture the consistency across domains in CF settings. Consequently,
we embed this criterion into a boosting framework to perform selective
knowledge transfer. Compared with several state-of-the-art methods, we show
that our proposed selective transfer learning framework significantly improves
the accuracy of rating prediction on several real-world recommendation tasks.
Continual Learning in Deep Neural Network by Using a Kalman Optimiser
Learning and adapting to new distributions, or learning new tasks sequentially
without forgetting previously learned knowledge, is a central challenge for
continual learning models. Most conventional deep learning models are not
capable of learning new tasks sequentially within one model without forgetting
the previously learned ones. We address this issue by using a Kalman Optimiser.
The Kalman Optimiser divides the neural network into two parts: long-term
and short-term memory units. The long-term memory unit is used to retain the
learned tasks, and the short-term memory unit is used to adapt to the new task.
We evaluated our method on the MNIST, CIFAR10, and CIFAR100 datasets and
compared our results with state-of-the-art baseline models. The results show
that our approach enables the model to continually learn and adapt to new
changes without forgetting previously learned tasks. Comment: accepted at an
ICML workshop
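The long-term/short-term partition above can be illustrated on a linear model rather than a deep network: consolidated parameters receive heavily damped updates, while plastic parameters move at the full learning rate. This is only an illustrative sketch; the paper's actual Kalman-filter update (uncertainty-weighted per-parameter gains) is more involved, and `sgd_step` and the damping factor are assumptions of this example.

```python
def sgd_step(weights, grads, roles, lr=0.1, long_term_damping=0.01):
    """Update each weight with a gain that depends on its role:
    'long' (consolidated, nearly frozen) vs 'short' (plastic)."""
    new_weights = []
    for w, g, role in zip(weights, grads, roles):
        gain = lr * (long_term_damping if role == "long" else 1.0)
        new_weights.append(w - gain * g)
    return new_weights
```

With identical gradients, the long-term unit barely moves while the short-term unit adapts, which is the mechanism the abstract attributes to the two memory units.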
Learn on Source, Refine on Target: A Model Transfer Learning Framework with Random Forests
We propose novel model transfer-learning methods that refine a decision
forest model M learned within a "source" domain using a training set sampled
from a "target" domain, assumed to be a variation of the source. We present two
random forest transfer algorithms. The first algorithm searches greedily for
locally optimal modifications of each tree structure by trying to locally
expand or reduce the tree around individual nodes. The second algorithm does
not modify the structure, but only the parameters (thresholds) associated with
decision nodes. We also propose combining both methods by considering an
ensemble that contains the union of the two forests. The proposed methods
exhibit impressive experimental results over a range of problems. Comment: 2 columns, 14 pages, submitted to TPAMI
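The second algorithm above, which keeps tree structure fixed and re-tunes only decision thresholds, can be sketched for a single depth-1 tree. A hedged illustration: `refine_threshold` and the midpoint candidate search are this example's choices, not necessarily the paper's.

```python
def refine_threshold(feature, threshold, left_label, right_label, X_t, y_t):
    """Keep the tree's structure (split feature and leaf labels) and search
    candidate thresholds, which are midpoints of sorted target values on
    `feature` plus the original source threshold, keeping the one that
    minimizes target-domain misclassification."""
    values = sorted(x[feature] for x in X_t)
    candidates = [threshold] + [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(t):
        return sum(
            (left_label if x[feature] <= t else right_label) != y
            for x, y in zip(X_t, y_t)
        )

    return min(candidates, key=errors)
```

For a source split at 1.5 and target data whose true boundary has drifted to 2.5, the search moves the threshold without touching the tree's shape, mirroring the parameter-only refinement described above.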
Zero-shot Domain Adaptation without Domain Semantic Descriptors
We propose a method to infer domain-specific models such as classifiers for
unseen domains, from which no data are given in the training phase, without
domain semantic descriptors. When training and test distributions are
different, standard supervised learning methods perform poorly. Zero-shot
domain adaptation attempts to alleviate this problem by inferring models that
generalize well to unseen domains by using training data in multiple source
domains. Existing methods use observed semantic descriptors characterizing the
domains, such as time information, to infer domain-specific models for the
unseen domains. However, such metadata cannot always be assumed to be available
in real-world applications. The proposed method can infer appropriate
domain-specific models without any semantic descriptors by introducing the
concept of latent domain vectors, which are latent representations for the
domains and are used for inferring the models. The latent domain vector for the
unseen domain is inferred from the set of the feature vectors in the
corresponding domain, which is given in the testing phase. The domain-specific
models consist of two components: the first is for extracting a representation
of a feature vector to be predicted, and the second is for inferring model
parameters given the latent domain vector. The posterior distributions of the
latent domain vectors and the domain-specific models are parametrized by neural
networks, and are optimized by maximizing the variational lower bound using
stochastic gradient descent. The effectiveness of the proposed method was
demonstrated through experiments using one regression and two classification
tasks. Comment: 10 pages, 10 figures
Viewpoint Adaptation for Rigid Object Detection
An object detector performs suboptimally when applied to image data taken
from a viewpoint different from the one with which it was trained. In this
paper, we present a viewpoint adaptation algorithm that allows a trained
single-view object detector to be adapted to a new, distinct viewpoint. We
first illustrate how a feature space transformation can be inferred from a
known homography between the source and target viewpoints. Second, we show that
a variety of trained classifiers can be modified to behave as if that
transformation were applied to each testing instance. The proposed algorithm is
evaluated on a person detection task using images from the PETS 2007 and CAVIAR
datasets, as well as from a new synthetic multi-view person detection dataset.
It yields substantial performance improvements when adapting single-view person
detectors to new viewpoints, and simultaneously reduces computational
complexity. This work has the potential to improve detection performance for
cameras viewing objects from arbitrary viewpoints, while simplifying data
collection and feature extraction.
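The first step described above starts from a known homography between the source and target viewpoints. As a minimal sketch of what such a map does, the function below applies a 3x3 homography to a 2-D point in homogeneous coordinates; the matrix is illustrative, and the paper's transformation operates on detector features rather than raw points.

```python
def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography H (row-major nested lists),
    dividing by the homogeneous coordinate."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)
```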
RIPEx: Extracting Malicious IP Addresses from Security Forums Using Cross-Forum Learning
Is it possible to extract malicious IP addresses reported in security forums
in an automatic way? This is the question at the heart of our work. We focus on
security forums, where security professionals and hackers share knowledge and
information, and often report misbehaving IP addresses. So far, there have only
been a few efforts to extract information from such security forums. We propose
RIPEx, a systematic approach to identify and label IP addresses in security
forums by utilizing a cross-forum learning method. In more detail, the
challenge is twofold: (a) identifying IP addresses from other numerical
entities, such as software version numbers, and (b) classifying the IP address
as benign or malicious. We propose an integrated solution that tackles both
these problems. A novelty of our approach is that it does not require training
data for each new forum. Our approach does knowledge transfer across forums: we
use a classifier from our source forums to identify seed information for
training a classifier on the target forum. We evaluate our method using data
collected from five security forums with a total of 31K users and 542K posts.
First, RIPEx can distinguish IP addresses from other numeric expressions with
95% precision and above 93% recall on average. Second, RIPEx identifies
malicious IP addresses with an average precision of 88% and over 78% recall,
using our cross-forum learning. Our work is a first step towards harnessing the
wealth of useful information that can be found in security forums.
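Challenge (a) above, telling true IP addresses apart from look-alike numeric entities such as software version numbers, can be partially handled by a syntactic filter. RIPEx uses a learned classifier; the rule-based check below is only an illustration of why that classifier is needed, since in-range version strings still pass it.

```python
import re

# Four dot-separated groups of 1-3 digits, bounded by word boundaries.
DOTTED_QUAD = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b")

def candidate_ips(text):
    """Return dotted-quad substrings whose four octets all lie in 0-255.
    Version strings such as 10.4.1.1 would still pass this filter, which
    is why a context-aware classifier is needed on top of it."""
    out = []
    for m in DOTTED_QUAD.finditer(text):
        if all(0 <= int(g) <= 255 for g in m.groups()):
            out.append(m.group(0))
    return out
```

Three-part version numbers fail the four-group pattern outright, and out-of-range quads fail the octet check, but the residual ambiguity motivates the learned component of the paper's pipeline.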
Multi-Fidelity Reinforcement Learning with Gaussian Processes
We study the problem of Reinforcement Learning (RL) using as few real-world
samples as possible. A naive application of RL can be inefficient in large and
continuous state spaces. We present two versions of Multi-Fidelity
Reinforcement Learning (MFRL), model-based and model-free, that leverage
Gaussian Processes (GPs) to learn the optimal policy in a real-world
environment. In the MFRL framework, an agent uses multiple simulators of the
real environment to perform actions. With increasing fidelity in a simulator
chain, the number of samples used in successively higher simulators can be
reduced. By incorporating GPs in the MFRL framework, we empirically observe a
reduction in the number of samples for both the model-based and model-free
versions. We examine the performance of our algorithms through simulations and
through real-world experiments for navigation with a ground robot.
Regularized Bayesian transfer learning for population level etiological distributions
Computer-coded verbal autopsy (CCVA) algorithms predict cause of death from
high-dimensional family questionnaire data (verbal autopsies) of a deceased
individual. CCVA algorithms are typically trained on non-local data, then used
to generate national and regional estimates of cause-specific mortality
fractions. These estimates may be inaccurate if the non-local training data is
different from the local population of interest. This problem is a special case
of transfer learning. However, most transfer learning classification approaches
are concerned with individual (e.g. a person's) classification within a target
domain (e.g. a particular population) with training performed in data from a
source domain. Epidemiologists are often more interested in estimating
population-level etiological distributions, using datasets much smaller than
those used in common transfer learning applications. We present a parsimonious
hierarchical Bayesian transfer learning framework to directly estimate
population-level class probabilities in a target domain. To address small
sample sizes, we introduce a novel shrinkage prior for the transfer error rates
guaranteeing that, in absence of any labeled target domain data or when the
baseline classifier has zero transfer error, the calibrated estimate of class
probabilities coincides with the naive estimates from the baseline classifier,
thereby subsuming the default practice as a special case. A novel Gibbs sampler
using data-augmentation enables fast implementation. We extend our approach to
use not one, but an ensemble of baseline classifiers. Theoretical and empirical
results demonstrate how the ensemble model favors the most accurate baseline
classifier. We present extensions allowing class probabilities to vary with
covariates, and an EM-algorithm-based MAP estimation. An R-package implementing
this method has been developed.
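The calibration idea above can be shown in its simplest plug-in form: recover population-level class fractions from a baseline classifier's predicted fractions and its estimated transfer (mis)classification matrix. This two-class matrix inversion is only a sketch; the paper instead places a shrinkage prior on the transfer error rates and samples the posterior with a Gibbs sampler.

```python
def calibrate_two_class(pred_frac, transfer_matrix):
    """Solve M^T q = p for the true class fractions q, where p are the
    predicted fractions and transfer_matrix[i][j] = P(predicted j | true i)."""
    (a, b), (c, d) = transfer_matrix  # rows: true class, cols: predicted class
    # Predicted fractions satisfy: p0 = a*q0 + c*q1, p1 = b*q0 + d*q1.
    det = a * d - b * c
    p0, p1 = pred_frac
    q0 = (d * p0 - c * p1) / det
    q1 = (a * p1 - b * p0) / det
    return (q0, q1)
```

When the baseline classifier has zero transfer error (identity matrix), the calibrated estimate coincides with the naive predicted fractions, which is exactly the special-case behavior the shrinkage prior guarantees.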