Improved Res2Net model for Person re-identification
Person re-identification has become a very popular research topic in the
computer vision community owing to its numerous applications and growing
importance in visual surveillance. Person re-identification remains challenging
due to occlusion, illumination and significant intra-class variations across
different cameras. In this paper, we propose a multi-task network based on an
improved Res2Net model that simultaneously computes the identification loss and
verification loss of two pedestrian images. Given a pair of pedestrian images,
the system predicts the identities of the two input images and whether they
belong to the same identity. In order to obtain deeper feature information of
pedestrians, we propose to use the latest Res2Net model for feature extraction
of each input image. Experiments on several large-scale person
re-identification benchmark datasets demonstrate the accuracy of our approach.
For example, rank-1 accuracies are 83.18% (+1.38) and 93.14% (+0.84) for the
DukeMTMC and Market-1501 datasets, respectively. The proposed method shows
encouraging improvements compared with state-of-the-art methods.
Comment: 6 page
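The joint objective described in this abstract can be illustrated with a minimal sketch. It assumes the identification loss is a softmax cross-entropy per image and the verification loss is a binary cross-entropy on a single same/different logit; the abstract does not state the exact loss forms, so the function names and the `verif_weight` parameter are hypothetical.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy of a softmax over `logits` against the integer class `label`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def binary_cross_entropy(logit, target):
    """Numerically stable binary cross-entropy on one logit; `target` is 0 or 1."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def multitask_loss(logits_a, logits_b, same_logit, id_a, id_b, verif_weight=1.0):
    """Identification loss for both images plus a weighted verification loss.

    Assumption: the two terms are simply added; the weighting is illustrative.
    """
    ident = softmax_cross_entropy(logits_a, id_a) + softmax_cross_entropy(logits_b, id_b)
    verif = binary_cross_entropy(same_logit, 1 if id_a == id_b else 0)
    return ident + verif_weight * verif
```

In a real network, `logits_a`, `logits_b`, and `same_logit` would come from the shared feature extractor; here they are plain Python lists and floats so that the arithmetic is easy to follow.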
Simple to Complex Cross-modal Learning to Rank
The heterogeneity-gap between different modalities brings a significant
challenge to multimedia information retrieval. Some studies formalize the
cross-modal retrieval tasks as a ranking problem and learn a shared multi-modal
embedding space to measure the cross-modality similarity. However, previous
methods often establish the shared embedding space based on linear mapping
functions which might not be sophisticated enough to reveal more complicated
inter-modal correspondences. Additionally, current studies assume that the
rankings are of equal importance, and thus all rankings are used
simultaneously, or a small number of rankings are selected randomly to train
the embedding space at each iteration. Such strategies, however, always suffer
from outliers as well as reduced generalization capability because they do not
account for how human cognition proceeds. In this paper, we
incorporate self-paced learning theory with diversity into cross-modal
learning to rank and learn an optimal multi-modal embedding space based on
non-linear mapping functions. This strategy enhances the model's robustness to
outliers and achieves better generalization via training the model gradually
from easy rankings by diverse queries to more complex ones. An efficient
alternative algorithm is exploited to solve the proposed challenging problem
with fast convergence in practice. Extensive experimental results on several
benchmark datasets indicate that the proposed method achieves significant
improvements over the state of the art in the literature.
Comment: 14 pages; Accepted by Computer Vision and Image Understandin
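The easy-to-complex training schedule can be sketched as a sample-selection rule. The sketch below follows the general self-paced learning with diversity formulation, in which samples are grouped (here, hypothetically, by query), ranked by loss within each group, and selected against a threshold that loosens for the top-ranked items of every group. It is an assumption that the paper uses this exact rule; `spld_select`, `lam`, and `gamma` are illustrative names.

```python
import math

def spld_select(losses_by_group, lam, gamma):
    """Return 0/1 selection weights per group.

    Within each group, the sample with rank i (1-based, ascending loss) is
    selected if loss < lam + gamma / (sqrt(i) + sqrt(i - 1)).
    gamma = 0 reduces to plain self-paced learning with threshold lam.
    """
    selected = {}
    for group, losses in losses_by_group.items():
        order = sorted(range(len(losses)), key=lambda k: losses[k])
        weights = [0] * len(losses)
        for rank, idx in enumerate(order, start=1):
            threshold = lam + gamma / (math.sqrt(rank) + math.sqrt(rank - 1))
            if losses[idx] < threshold:
                weights[idx] = 1
        selected[group] = weights
    return selected

# Two hypothetical queries; "q2" contains only harder rankings.
losses = {"q1": [0.1, 0.2, 0.9], "q2": [0.6, 1.5]}
plain = spld_select(losses, lam=0.3, gamma=0.0)    # ignores q2 entirely
diverse = spld_select(losses, lam=0.3, gamma=0.5)  # admits q2's easiest ranking
```

Raising `lam` over successive iterations admits progressively harder rankings, while `gamma` keeps every query represented early on.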
Short-Term Wind Speed Forecasting via Stacked Extreme Learning Machine With Generalized Correntropy
Wind speed forecasting has recently come to play an important role in industrial informatics, where it supports the control and operation of renewable power systems. However, it is increasingly difficult to handle the large-scale datasets generated in these forecasting applications while ensuring stable computing performance. In response to this limitation, this paper proposes a more practical approach that combines the extreme learning machine (ELM) method with a deep-learning model. ELM is a novel computing paradigm that enables neural network (NN) based learning with fast training speed and good generalization performance. The stacked ELM (SELM) is an advanced ELM algorithm under the deep-learning framework that reduces memory consumption efficiently. In this paper, an enhanced SELM is developed by replacing the Euclidean norm of the mean square error (MSE) criterion in ELM with the generalized correntropy criterion to further improve forecasting performance. The advantage of the enhanced SELM rests mainly on the fact that generalized correntropy is a stable and robust nonlinear similarity measure, which matters when machine learning methods are used to forecast wind speed because outliers may exist in industrially measured values. Experimental results of short-term and ultra-short-term forecasting on real wind speed data show that the proposed approach achieves better computing performance than other traditional and more recent methods.
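To illustrate why a correntropy criterion is less sensitive to outliers than MSE, the following sketch compares the two on a toy series. It assumes the commonly used generalized Gaussian kernel, G(e) = alpha / (2 beta Gamma(1/alpha)) * exp(-|e / beta|^alpha), and a loss defined as G(0) minus the mean kernel value; the abstract does not give the exact formulation, and the shape and bandwidth values `alpha` and `beta` are illustrative.

```python
import math

def generalized_gaussian_kernel(e, alpha=2.0, beta=1.0):
    """Generalized Gaussian density evaluated at error e (assumed kernel form)."""
    norm = alpha / (2.0 * beta * math.gamma(1.0 / alpha))
    return norm * math.exp(-(abs(e / beta) ** alpha))

def gc_loss(y_true, y_pred, alpha=2.0, beta=1.0):
    """Generalized correntropy loss: kernel at zero minus mean kernel of the errors."""
    k0 = generalized_gaussian_kernel(0.0, alpha, beta)
    mean_k = sum(
        generalized_gaussian_kernel(t - p, alpha, beta) for t, p in zip(y_true, y_pred)
    ) / len(y_true)
    return k0 - mean_k

def mse(y_true, y_pred):
    """Mean square error, for comparison."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# One corrupted measurement (100.0) among otherwise perfect predictions.
y_true = [1.0, 2.0, 3.0, 100.0]
y_pred = [1.0, 2.0, 3.0, 3.0]
outlier_gc = gc_loss(y_true, y_pred)  # bounded above by the kernel value at zero
outlier_mse = mse(y_true, y_pred)     # grows with the square of the outlier error
```

Because each sample's contribution to the correntropy loss is bounded, a single corrupted measurement cannot dominate the objective the way it does under MSE.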
Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search
Mobile landmark search (MLS) has recently received increasing attention for its
great practical value. However, it remains unsolved due to two important
challenges. One is high bandwidth consumption of query transmission, and the
other is the huge visual variations of query images sent from mobile devices.
In this paper, we propose a novel hashing scheme, named canonical view based
discrete multi-modal hashing (CV-DMH), to handle these problems via a novel
three-stage learning procedure. First, a submodular function is designed to
measure visual representativeness and redundancy of a view set. With it,
canonical views, which capture key visual appearances of a landmark with limited
redundancy, are efficiently discovered with an iterative mining strategy.
Second, multi-modal sparse coding is applied to transform visual features from
multiple modalities into an intermediate representation. It can robustly and
adaptively characterize visual contents of varied landmark images with certain
canonical views. Finally, compact binary codes are learned on intermediate
representation within a tailored discrete binary embedding model which
preserves visual relations of images measured with canonical views and removes
the noise involved. In this part, we develop a new augmented Lagrangian
multiplier (ALM) based optimization method to directly solve the discrete
binary codes. It not only deals explicitly with the discrete constraint, but
also considers the bit-uncorrelated constraint and balance constraint together.
Experiments on real world landmark datasets demonstrate the superior
performance of CV-DMH over several state-of-the-art methods.
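The first stage, mining a small set of representative views with a submodular objective, can be sketched with a greedy selection loop. The abstract does not define the submodular function, so this example substitutes the standard facility-location objective (sum over all views of the best similarity to any selected view) as a stand-in; `facility_location`, `greedy_select`, and the similarity matrix are hypothetical.

```python
def facility_location(sim, selected):
    """Coverage score: each view is credited with its best similarity to the selected set."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)

def greedy_select(sim, k):
    """Greedily add the view with the largest marginal gain, up to k views."""
    selected = []
    candidates = set(range(len(sim)))
    for _ in range(min(k, len(sim))):
        base = facility_location(sim, selected)
        best, best_gain = None, 0.0
        for j in sorted(candidates):
            gain = facility_location(sim, selected + [j]) - base
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:  # no candidate improves coverage
            break
        selected.append(best)
        candidates.remove(best)
    return selected

# Four views forming two visually similar pairs: (0, 1) and (2, 3).
sim = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
views = greedy_select(sim, k=2)
```

Because a second view from an already covered pair adds little marginal gain, the greedy loop picks one view from each pair, which mirrors the goal of high representativeness with limited redundancy.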
G-softmax: Improving Intra-class Compactness and Inter-class Separability of Features
Intra-class compactness and inter-class separability are crucial indicators
to measure the effectiveness of a model to produce discriminative features,
where intra-class compactness indicates how close the features with the same
label are to each other and inter-class separability indicates how far away the
features with different labels are. In this work, we investigate intra-class
compactness and inter-class separability of features learned by convolutional
networks and propose a Gaussian-based softmax (G-softmax) function
that can effectively improve intra-class compactness and inter-class
separability. The proposed function is simple to implement and can easily
replace the softmax function. We evaluate the proposed G-softmax
function on classification datasets (i.e., CIFAR-10, CIFAR-100, and Tiny
ImageNet) and on multi-label classification datasets (i.e., MS COCO and
NUS-WIDE). The experimental results show that the proposed
G-softmax function improves the state-of-the-art models across all
evaluated datasets. In addition, analysis of the intra-class compactness and
inter-class separability demonstrates the advantages of the proposed function
over the softmax function, which is consistent with the performance
improvement. More importantly, we observe that high intra-class compactness and
inter-class separability are linearly correlated with average precision on MS
COCO and NUS-WIDE. This implies that improvement of intra-class compactness and
inter-class separability would lead to improvement of average precision.
Comment: 15 pages, published in TNNL