Improved Res2Net model for Person re-identification

Authors: Cao Zongjing; Lee Hyo Jong
Publication date: 24/02/2020

Person re-identification has become a popular research topic in the computer vision community owing to its numerous applications and growing importance in visual surveillance. It remains challenging due to occlusion, illumination changes, and significant intra-class variations across different cameras. In this paper, we propose a multi-task network based on an improved Res2Net model that simultaneously computes the identification loss and verification loss of two pedestrian images. Given a pair of pedestrian images, the system predicts the identities of the two input images and whether they belong to the same identity. To obtain deeper feature information about pedestrians, we use the latest Res2Net model for feature extraction from each input image. Experiments on several large-scale person re-identification benchmark datasets demonstrate the accuracy of our approach. For example, rank-1 accuracies are 83.18% (+1.38) and 93.14% (+0.84) on the DukeMTMC and Market-1501 datasets, respectively. The proposed method shows encouraging improvements compared with state-of-the-art methods.

Comment: 6 pages
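The joint objective described in this abstract can be illustrated with a minimal sketch. It assumes that both the identification and the verification terms are softmax cross-entropy losses and that they are combined with a single weight; the function names, the `weight` parameter, and the way the logits are produced are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])

def multi_task_loss(id_logits_a, id_logits_b, same_logits, id_a, id_b, weight=1.0):
    """Identification loss for both images plus a weighted verification loss.

    id_logits_a, id_logits_b: per-identity scores for the two images.
    same_logits: two scores, index 0 = different person, index 1 = same person.
    weight: assumed trade-off factor (hypothetical, not from the paper).
    """
    identification = cross_entropy(id_logits_a, id_a) + cross_entropy(id_logits_b, id_b)
    same_label = 1 if id_a == id_b else 0
    verification = cross_entropy(same_logits, same_label)
    return identification + weight * verification

# Uninformative scores: three identities, two verification classes.
loss = multi_task_loss([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0], id_a=0, id_b=2)
```

In a real network the three logit vectors would come from a shared feature extractor (Res2Net in the abstract); here they are passed in directly to keep the sketch self-contained.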

arXiv.org e-Print Archive

Simple to Complex Cross-modal Learning to Rank

Authors: Luo Minnan; Chang Xiaojun; Li Zhihui; Nie Liqiang; Hauptmann Alexander G.; Zheng Qinghua
Publication date: 01/01/1998

The heterogeneity gap between different modalities poses a significant challenge for multimedia information retrieval. Some studies formalize cross-modal retrieval as a ranking problem and learn a shared multi-modal embedding space to measure cross-modality similarity. However, previous methods often build the shared embedding space from linear mapping functions, which may not be expressive enough to reveal more complicated inter-modal correspondences. Additionally, current studies assume that all rankings are equally important, so either all rankings are used simultaneously or a small number are selected at random to train the embedding space at each iteration. Such strategies suffer from outliers and reduced generalization capability because they do not account for how human cognition proceeds. In this paper, we incorporate self-paced learning theory with diversity into cross-modal learning to rank and learn an optimal multi-modal embedding space based on non-linear mapping functions. This strategy enhances the model's robustness to outliers and achieves better generalization by training the model gradually, from easy rankings by diverse queries to more complex ones. An efficient alternative algorithm is used to solve the proposed problem, with fast convergence in practice. Extensive experimental results on several benchmark datasets indicate that the proposed method achieves significant improvements over the state of the art.

Comment: 14 pages; Accepted by Computer Vision and Image Understanding
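The easy-to-complex training idea in this abstract can be sketched with the classic hard-threshold form of self-paced learning, in which a sample is used only while its current loss is below a growing threshold. This is a simplification under stated assumptions: the diversity term and the ranking loss from the paper are omitted, and the function names, threshold schedule, and loss values are hypothetical.

```python
def self_paced_weights(losses, threshold):
    """Select samples whose loss is below the threshold (1 = use, 0 = skip)."""
    return [1 if loss < threshold else 0 for loss in losses]

def curriculum(losses, start, growth, rounds):
    """Return the selection mask for each round as the threshold grows.

    In actual training the losses would be recomputed after every model
    update; they are held fixed here to isolate the selection rule.
    """
    schedule = []
    threshold = start
    for _ in range(rounds):
        schedule.append(self_paced_weights(losses, threshold))
        threshold *= growth
    return schedule

# Hypothetical per-ranking losses: one easy, one medium, one hard (or outlier).
masks = curriculum([0.1, 0.5, 2.0], start=0.2, growth=3.0, rounds=4)
```

With these inputs the easy ranking is selected first, the medium one joins in the second round, and the hard one is admitted only once the threshold has grown past its loss.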

arXiv.org e-Print Archive

Crossref

OPUS - University of Technology Sydney

Wageningen University & Research Publications

Short-Term Wind Speed Forecasting via Stacked Extreme Learning Machine With Generalized Correntropy

Authors: Luo Xiong; Sun Jiankun; Wang Jenq-Haur; Wang Long; Wang Weiping; Wu Jinsong; Zhang Zijun; Zhao Wenbing
Publication venue: EngagedScholarship@CSU
Publication date: 01/11/2018

Wind speed forecasting, as an effective computing technique, plays an important role in advancing industrial informatics and in addressing the control and operation of renewable power systems. However, it is increasingly difficult to handle the large-scale datasets generated in these forecasting applications while ensuring stable computing performance. In response to this limitation, this paper proposes a more practical approach that combines the extreme learning machine (ELM) method with a deep-learning model. ELM is a computing paradigm that enables neural network (NN) based learning with fast training speed and good generalization performance. The stacked ELM (SELM) is an advanced ELM algorithm under the deep-learning framework that reduces memory consumption. In this paper, an enhanced SELM is developed by replacing the Euclidean norm of the mean square error (MSE) criterion in ELM with the generalized correntropy criterion to further improve forecasting performance. The advantage of the enhanced SELM rests mainly on the following property: generalized correntropy is a stable and robust nonlinear similarity measure, which matters when machine learning methods are used to forecast wind speed from industrially measured values that may contain outliers. Experimental results for short-term and ultra-short-term forecasting on real wind speed data show that the proposed approach achieves better computing performance than other traditional and more recent methods.
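The robustness argument in this abstract can be illustrated by comparing MSE with a generalized correntropy estimate on residuals that contain one outlier. The sketch assumes the common generalized Gaussian kernel with shape `alpha` and scale `beta`; whether the paper uses exactly this parametrization is an assumption, and the residual values below are made up for illustration.

```python
import math

def generalized_gaussian_kernel(error, alpha=2.0, beta=1.0):
    """Generalized Gaussian density evaluated at a residual.

    alpha is the shape parameter and beta the scale (assumed parametrization).
    """
    norm = alpha / (2.0 * beta * math.gamma(1.0 / alpha))
    return norm * math.exp(-(abs(error / beta) ** alpha))

def generalized_correntropy(errors, alpha=2.0, beta=1.0):
    """Sample mean of the kernel over residuals; larger means more similar."""
    return sum(generalized_gaussian_kernel(e, alpha, beta) for e in errors) / len(errors)

def mse(errors):
    """Mean squared error of the residuals."""
    return sum(e * e for e in errors) / len(errors)

clean = [0.1, -0.2, 0.1]      # hypothetical forecast residuals
noisy = clean + [100.0]       # same residuals plus one gross outlier

mse_clean, mse_noisy = mse(clean), mse(noisy)
gc_clean, gc_noisy = generalized_correntropy(clean), generalized_correntropy(noisy)
```

Because the kernel is bounded, the outlier contributes almost nothing to the correntropy estimate, whereas it dominates the MSE. A training criterion would maximize correntropy (or minimize its negative) rather than minimize squared error.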

Cleveland-Marshall College of Law

Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search

Authors: He Xiangnan; Huang Zi; Liu Xiaobai; Song Jingkuan; Zhou Xiaofang; Zhu Lei
Publication date: 13/07/2017

Mobile landmark search (MLS) has recently received increasing attention for its great practical value. However, it remains unsolved due to two important challenges: the high bandwidth consumption of query transmission, and the huge visual variations among query images sent from mobile devices. In this paper, we propose a novel hashing scheme, named canonical view based discrete multi-modal hashing (CV-DMH), to handle these problems via a three-stage learning procedure. First, a submodular function is designed to measure the visual representativeness and redundancy of a view set. With it, canonical views, which capture the key visual appearances of a landmark with limited redundancy, are efficiently discovered using an iterative mining strategy. Second, multi-modal sparse coding is applied to transform visual features from multiple modalities into an intermediate representation, which can robustly and adaptively characterize the visual contents of varied landmark images in terms of certain canonical views. Finally, compact binary codes are learned on the intermediate representation within a tailored discrete binary embedding model that preserves the visual relations of images measured with canonical views and removes the noise involved. For this last stage, we develop a new augmented Lagrangian multiplier (ALM) based optimization method to solve for the discrete binary codes directly. It not only deals explicitly with the discrete constraint, but also considers the bit-uncorrelated constraint and the balance constraint together. Experiments on real-world landmark datasets demonstrate the superior performance of CV-DMH over several state-of-the-art methods.

arXiv.org e-Print Archive

University of Queensland eSpace

$\mathcal{G}$-softmax: Improving Intra-class Compactness and Inter-class Separability of Features

Authors: Kankanhalli Mohan; Luo Yan; Wong Yongkang; Zhao Qi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 15/07/2019

Intra-class compactness and inter-class separability are crucial indicators of how effectively a model produces discriminative features: intra-class compactness indicates how close features with the same label are to each other, and inter-class separability indicates how far apart features with different labels are. In this work, we investigate the intra-class compactness and inter-class separability of features learned by convolutional networks and propose a Gaussian-based softmax ($\mathcal{G}$-softmax) function that can effectively improve both. The proposed function is simple to implement and can easily replace the softmax function. We evaluate the proposed $\mathcal{G}$-softmax function on classification datasets (i.e., CIFAR-10, CIFAR-100, and Tiny ImageNet) and on multi-label classification datasets (i.e., MS COCO and NUS-WIDE). The experimental results show that the proposed $\mathcal{G}$-softmax function improves the state-of-the-art models across all evaluated datasets. In addition, analysis of the intra-class compactness and inter-class separability demonstrates the advantages of the proposed function over the softmax function, which is consistent with the performance improvement. More importantly, we observe that high intra-class compactness and inter-class separability are linearly correlated with average precision on MS COCO and NUS-WIDE. This implies that improving intra-class compactness and inter-class separability would lead to improved average precision.

Comment: 15 pages, published in TNNLS

arXiv.org e-Print Archive

ScholarBank@NUS