Stratified Transfer Learning for Cross-domain Activity Recognition
In activity recognition, it is often expensive and time-consuming to acquire
sufficient activity labels. To solve this problem, transfer learning leverages
the labeled samples from the source domain to annotate the target domain, which
has few or no labels. Existing approaches typically consider learning a
global domain shift while ignoring the intra-affinity between classes, which
will hinder the performance of the algorithms. In this paper, we propose a
novel and general cross-domain learning framework that can exploit the
intra-affinity of classes to perform intra-class knowledge transfer. The
proposed framework, referred to as Stratified Transfer Learning (STL), can
dramatically improve the classification accuracy for cross-domain activity
recognition. Specifically, STL first obtains pseudo labels for the target
domain via a majority voting technique. Then, it performs intra-class knowledge
transfer iteratively to transform both domains into the same subspaces.
Finally, the labels of the target domain are obtained via a second annotation. To
evaluate the performance of STL, we conduct comprehensive experiments on three
large public activity recognition datasets (i.e. OPPORTUNITY, PAMAP2, and UCI
DSADS), which demonstrate that STL significantly outperforms other
state-of-the-art methods w.r.t. classification accuracy (improvement of 7.68%).
Furthermore, we extensively investigate the performance of STL across different
degrees of similarity and activity levels between domains. We also
discuss the potential of STL in other pervasive computing applications to
provide empirical experience for future research.
Comment: 10 pages; accepted by IEEE PerCom 2018; full paper (camera-ready
version).
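The pseudo-labeling step described in the abstract above can be illustrated with a short sketch. This is a minimal example under stated assumptions, not the authors' implementation: the abstract only says that pseudo labels come from majority voting, so the function name `majority_vote`, the strict-majority rule, the use of `None` for samples without a majority, and the classifier outputs are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier predictions into pseudo labels.

    predictions: list of lists, one inner list per classifier, each holding
    one predicted label per target-domain sample.
    Returns one label per sample, or None where no strict majority exists
    (an assumption made for this sketch).
    """
    n_classifiers = len(predictions)
    pseudo_labels = []
    # zip(*predictions) groups the votes that all classifiers cast per sample.
    for votes in zip(*predictions):
        label, count = Counter(votes).most_common(1)[0]
        # Keep the label only if more than half of the classifiers agree.
        pseudo_labels.append(label if count > n_classifiers / 2 else None)
    return pseudo_labels


# Hypothetical outputs of three classifiers trained on the source domain.
votes = [
    ["walk", "run", "sit"],  # classifier A
    ["walk", "sit", "sit"],  # classifier B
    ["walk", "run", "lie"],  # classifier C
]
print(majority_vote(votes))  # ['walk', 'run', 'sit']
```

Samples left as `None` would need to be labeled in a later step, which is one plausible reading of the "second annotation" mentioned in the abstract.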
Domain Adaptation For Vehicle Detection In Traffic Surveillance Images From Daytime To Nighttime
Vehicle detection in traffic surveillance images is an important approach to obtain vehicle data and rich traffic flow parameters. Recently, deep learning based methods have been widely used in vehicle detection with high accuracy and efficiency. However, deep learning based methods require a large number of manually labeled ground truths (the bounding box of each vehicle in each image) to train Convolutional Neural Networks (CNNs). For modern urban surveillance cameras, many manually labeled ground truths already exist for daytime images, while nighttime images have few or far fewer. In this paper, we focus on making maximum use of labeled daytime images (Source Domain) to help vehicle detection in unlabeled nighttime images (Target Domain). For this purpose, we propose a new method based on Faster R-CNN with Domain Adaptation (DA) to improve vehicle detection at nighttime. With the assistance of DA, the distribution discrepancy between the Source and Target Domains is reduced. We collected a new dataset of 2,200 traffic images (1,200 for daytime and 1,000 for nighttime) containing 57,059 vehicles for training and testing the CNN. In the experiment, using only the manually labeled ground truths of the daytime data, Faster R-CNN obtained an F-measure of 82.84% on nighttime vehicle detection, while the proposed method (Faster R-CNN+DA) achieved an F-measure of 86.39%.
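The abstract above reports results as an F-measure without defining it. The sketch below assumes the common definition, the harmonic mean of precision and recall (F1), computed from detection counts. The function name `f_measure` and the counts in the example are hypothetical and are not taken from the paper.

```python
def f_measure(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall (assumed F1 definition)."""
    if true_positives == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 correct detections, 20 false alarms, 20 misses.
score = f_measure(80, 20, 20)
print(round(score, 2))  # 0.8
```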
Cross-position Activity Recognition with Stratified Transfer Learning
Human activity recognition aims to recognize the activities of daily living
by utilizing the sensors on different body parts. However, when the labeled
data from a certain body position (i.e. the target domain) is missing, how can
we leverage the data from other positions (i.e. the source domains) to help
learn the activity labels of this position? When there are several source domains
available, it is often difficult to select the most similar source domain to
the target domain. With the selected source domain, we need to perform accurate
knowledge transfer between domains. Existing methods only learn the global
distance between domains while ignoring the local property. In this paper, we
propose a Stratified Transfer Learning (STL) framework to perform both
source domain selection and knowledge transfer. STL is based on our proposed
Stratified distance, which captures the local property of domains. STL
consists of two components: Stratified Domain Selection (STL-SDS), which selects
the most similar source domain to the target domain, and Stratified Activity
Transfer (STL-SAT), which performs accurate knowledge transfer. Extensive
experiments on three public activity recognition datasets demonstrate the
superiority of STL. Furthermore, we extensively investigate the performance of
transfer learning across different degrees of similarities and activity levels
between domains. We also discuss the potential applications of STL in other
fields of pervasive computing for future research.
Comment: Submitted to Pervasive and Mobile Computing as an extension to PerCom 18
paper; First revision. arXiv admin note: substantial text overlap with
arXiv:1801.0082
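The contrast between a global distance and one that captures the local (per-class) property of domains can be illustrated with a toy example. The abstract does not define the Stratified distance, so what follows is only an illustrative stand-in: it compares class means with Euclidean distance, and the function names and data are hypothetical.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def global_distance(src_x, tgt_x):
    """Distance between the overall means of the two domains."""
    return euclidean(mean_vector(src_x), mean_vector(tgt_x))


def classwise_distance(src_x, src_y, tgt_x, tgt_y):
    """Average distance between per-class means over the shared classes.

    tgt_y would be pseudo labels when the target domain is unlabeled.
    """
    shared = sorted(set(src_y) & set(tgt_y))
    if not shared:
        raise ValueError("source and target share no class labels")
    total = 0.0
    for label in shared:
        src_rows = [x for x, y in zip(src_x, src_y) if y == label]
        tgt_rows = [x for x, y in zip(tgt_x, tgt_y) if y == label]
        total += euclidean(mean_vector(src_rows), mean_vector(tgt_rows))
    return total / len(shared)


# Toy data: the two classes are swapped between domains.
src_x, src_y = [[0.0, 0.0], [4.0, 0.0]], [0, 1]
tgt_x, tgt_y = [[4.0, 0.0], [0.0, 0.0]], [0, 1]
print(global_distance(src_x, tgt_x))                   # 0.0
print(classwise_distance(src_x, src_y, tgt_x, tgt_y))  # 4.0
```

In this toy case the global distance is zero even though every class is misaligned, which is the kind of mismatch a class-aware distance is meant to expose.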
Person Identification With Convolutional Neural Networks
Person identification aims at matching persons across images or videos captured by different cameras, without requiring the presence of persons’ faces. It is an important problem in the computer vision community and has many important real-world applications, such as person search, security surveillance, and no-checkout stores. However, this problem is very challenging due to various factors, such as illumination variation, view changes, human pose deformation, and occlusion. Traditional approaches generally focus on hand-crafting features and/or learning distance metrics for matching to tackle these challenges. With Convolutional Neural Networks (CNNs), feature extraction and metric learning can be combined in a unified framework.
In this work, we study two important sub-problems of person identification: cross-view person identification and visible-thermal person re-identification. Cross-view person identification aims to match persons from temporally synchronized videos taken by wearable cameras. Visible-thermal person re-identification aims to match persons between images taken by visible cameras under normal illumination conditions and thermal cameras under poor illumination conditions, such as at night.
For cross-view person identification, we focus on addressing the challenge of view changes between cameras. Since the videos are taken by wearable cameras, the underlying 3D motion pattern of the same person should be consistent and thus can be used for effective matching. In light of this, we propose to extract view-invariant motion features to match persons. Specifically, we propose a CNN-based triplet network to learn view-invariant features by establishing correspondences between 3D human MoCap data and the projected 2D optical flow data. After training, the triplet network is used to extract view-invariant features from 2D optical flows of videos for matching persons. We collect three datasets for evaluation. The experimental results demonstrate the effectiveness of this method.
For visible-thermal person re-identification, we focus on the challenge of domain discrepancy between visible images and thermal images. We propose to address this issue at a class level with a CNN-based two-stream network. Specifically, our idea is to learn a center for features of each person in each domain (visible and thermal domains), using a new relaxed center loss. Instead of imposing constraints between pairs of samples, we enforce the centers of the same person in visible and thermal domains to be close, and the centers of different persons to be distant. We also enforce the feature vector from the center of one person to another in visible feature space to be similar to that in thermal feature space. Using this network, we can learn domain-independent features for visible-thermal person re-identification. Experiments on two public datasets demonstrate the effectiveness of this method.
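The first two center constraints described above (same-person centers close across domains, different-person centers distant) can be sketched as a simple penalty. This is an illustrative assumption, not the relaxed center loss itself, whose form the abstract does not give. The function name `center_alignment_loss`, the squared Euclidean distance, the hinge with a `margin` parameter, and the example centers are all hypothetical. The third constraint on center-to-center vectors is omitted.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def center_alignment_loss(visible_centers, thermal_centers, margin=1.0):
    """Toy penalty over per-person feature centers.

    visible_centers, thermal_centers: dict mapping person id -> center
    vector; both dicts are assumed to hold the same person ids.
    """
    ids = sorted(visible_centers)
    # Pull term: the same person's centers should coincide across domains.
    pull = sum(sq_dist(visible_centers[p], thermal_centers[p]) for p in ids)
    # Push term: within each domain, centers of different persons are
    # penalized when they are closer than the margin.
    push = 0.0
    for i, p in enumerate(ids):
        for q in ids[i + 1:]:
            for centers in (visible_centers, thermal_centers):
                push += max(0.0, margin - sq_dist(centers[p], centers[q]))
    return pull + push


# Hypothetical 2-D centers for two persons.
visible = {"a": [0.0, 0.0], "b": [2.0, 0.0]}
thermal = {"a": [0.0, 1.0], "b": [2.0, 0.0]}
print(center_alignment_loss(visible, thermal))  # 1.0
```

Operating on centers rather than on sample pairs keeps the number of constraint terms tied to the number of persons instead of the number of images, which matches the class-level motivation stated in the abstract.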