Contrastive Bayesian Analysis for Deep Metric Learning
Recent methods for deep metric learning have been focusing on designing
different contrastive loss functions between positive and negative pairs of
samples so that the learned feature embedding is able to pull positive samples
of the same class closer and push negative samples from different classes away
from each other. In this work, we recognize that there is a significant
semantic gap between features at the intermediate feature layer and class
labels at the final output layer. To bridge this gap, we develop a contrastive
Bayesian analysis to characterize and model the posterior probabilities of
image labels conditioned on their feature similarity in a contrastive learning
setting. This contrastive Bayesian analysis leads to a new loss function for
deep metric learning. To improve the generalization capability of the proposed
method onto new classes, we further extend the contrastive Bayesian loss with a
metric variance constraint. Our experimental results and ablation studies
demonstrate that the proposed contrastive Bayesian metric learning method
significantly improves the performance of deep metric learning in both
supervised and pseudo-supervised scenarios, outperforming existing methods by a
large margin.
Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine
Intelligenc
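As a point of reference for the pull/push behaviour described in the abstract above, the following is a minimal sketch of the classic pairwise contrastive loss. It is not the contrastive Bayesian loss proposed in the paper, whose form the abstract does not give. The function name, the `margin` parameter, and the toy embeddings are assumptions made for illustration.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, same_class, margin=1.0):
    """Classic pairwise contrastive loss (illustrative, not the paper's loss).

    emb_a, emb_b: arrays of shape (num_pairs, dim) holding paired embeddings.
    same_class:   array of shape (num_pairs,), 1.0 for positive pairs, 0.0 for negative.
    margin:       hypothetical margin beyond which negative pairs incur no penalty.
    """
    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    # Positive pairs are penalised by their squared distance (pulled together).
    pos_term = same_class * dist ** 2
    # Negative pairs are penalised only when closer than the margin (pushed apart).
    neg_term = (1.0 - same_class) * np.maximum(0.0, margin - dist) ** 2
    return float(np.mean(pos_term + neg_term))

# Toy data: one pair of identical embeddings, one pair at distance 5.
a = np.array([[0.0, 0.0], [0.0, 0.0]])
b = np.array([[0.0, 0.0], [3.0, 4.0]])

well_separated = contrastive_loss(a, b, np.array([1.0, 0.0]))
badly_separated = contrastive_loss(a, b, np.array([0.0, 1.0]))
```

With the first labelling the identical pair is positive and the distant pair is negative, so nothing is penalised; swapping the labels penalises both pairs.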
Coded Residual Transform for Generalizable Deep Metric Learning
A fundamental challenge in deep metric learning is the generalization
capability of the feature embedding network model since the embedding network
learned on training classes needs to be evaluated on new test classes. To
address this challenge, in this paper, we introduce a new method called coded
residual transform (CRT) for deep metric learning to significantly improve its
generalization capability. Specifically, we learn a set of diversified
prototype features, project the feature map onto each prototype, and then
encode its features using their projection residuals weighted by their
correlation coefficients with each prototype. The proposed CRT method has the
following two unique characteristics. First, it represents and encodes the
feature map from a set of complementary perspectives based on projections onto
diversified prototypes. Second, unlike existing transformer-based feature
representation approaches which encode the original values of features based on
global correlation analysis, the proposed coded residual transform encodes the
relative differences between the original features and their projected
prototypes. Embedding space density and spectral decay analysis show that this
multi-perspective projection onto diversified prototypes and coded residual
representation are able to achieve significantly improved generalization
capability in metric learning. Finally, to further enhance the generalization
performance, we propose to enforce the consistency on their feature similarity
matrices between coded residual transforms with different sizes of projection
prototypes and embedding dimensions. Our extensive experimental results and
ablation studies demonstrate that the proposed CRT method outperforms
state-of-the-art deep metric learning methods by large margins, improving
upon the current best method by up to 4.28% on the CUB dataset.
Comment: Accepted by NeurIPS 202
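The encoding step described in the abstract above (project onto each prototype, keep the residual, weight by correlation) can be sketched as follows. This is only one possible reading of the abstract: the function name, the use of cosine similarity as the correlation coefficient, and the softmax normalisation over spatial positions are all assumptions, and the operations in the paper itself may differ.

```python
import numpy as np

def coded_residual_sketch(features, prototypes):
    """Illustrative residual encoding against a set of prototypes.

    features:   (N, C) array, one C-dimensional feature per spatial position.
    prototypes: (K, C) array of prototype vectors (learned in the paper,
                random here).
    Returns a (K, C) array with one aggregated residual code per prototype.
    """
    unit = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)  # (K, C)
    coeff = features @ unit.T                        # (N, K) projection coefficients
    projected = coeff[:, :, None] * unit[None, :, :]  # (N, K, C) projections
    residual = features[:, None, :] - projected       # (N, K, C) projection residuals

    # Assumed weighting: cosine similarity between feature and prototype,
    # normalised with a softmax over the N positions.
    feat_norm = np.linalg.norm(features, axis=1, keepdims=True)
    cosine = coeff / np.maximum(feat_norm, 1e-12)     # (N, K)
    weights = np.exp(cosine - cosine.max(axis=0, keepdims=True))
    weights = weights / weights.sum(axis=0, keepdims=True)

    # Weighted sum of residuals over positions, separately for each prototype.
    return np.einsum("nk,nkc->kc", weights, residual)

rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))     # toy feature map: 6 positions, 4 channels
prototypes = rng.normal(size=(3, 4))   # 3 hypothetical prototypes
codes = coded_residual_sketch(features, prototypes)
```

Because every residual is orthogonal to the prototype it was projected onto, each row of `codes` is orthogonal to its prototype, which reflects the idea of encoding differences from the prototypes rather than the original feature values.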
A GAN-Based Input-Size Flexibility Model for Single Image Dehazing
Image-to-image translation based on generative adversarial network (GAN) has
achieved state-of-the-art performance in various image restoration
applications. Single image dehazing is a typical example, which aims to recover
the haze-free image from a hazy one. This paper concentrates on the challenging
task of single image dehazing. Based on the atmospheric scattering model, we
design a novel model to directly generate the haze-free image. The main
challenge of image dehazing is that the atmospheric scattering model has two
parameters, i.e., the transmission map and the atmospheric light. When they are
estimated separately, their errors accumulate and compromise dehazing quality.
To address this and to handle various image sizes, we propose a novel input-size
flexible conditional generative adversarial network (cGAN) for single image
dehazing, which is flexible in input size at both the training and test stages of
image-to-image translation within the cGAN framework. We propose a simple and
effective U-type residual network (UR-Net) to build the generator and adopt
spatial pyramid pooling (SPP) to design the discriminator. Moreover, the
model is trained with a multi-loss function, in which the consistency loss is
newly designed in this paper. We finally build a multi-scale cGAN fusion
model to realize state-of-the-art single image dehazing performance. The
proposed models receive a hazy image as input and directly output a haze-free
one. Experimental results demonstrate the effectiveness and efficiency of the
proposed models.
Comment: Computer Visio
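For readers unfamiliar with the atmospheric scattering model mentioned in the abstract above, it is commonly written as I(x) = J(x) t(x) + A (1 - t(x)), where I is the hazy image, J the haze-free image, t the transmission map, and A the atmospheric light. The sketch below applies and inverts this formula with known parameters. It is not the paper's network; the function names, the `t_min` clamp, and the toy data are assumptions for illustration.

```python
import numpy as np

def add_haze(clear, transmission, airlight):
    """Apply the scattering model: I = J * t + A * (1 - t)."""
    t = transmission[..., None]            # broadcast (H, W) over colour channels
    return clear * t + airlight * (1.0 - t)

def remove_haze(hazy, transmission, airlight, t_min=0.1):
    """Invert the model: J = (I - A) / t + A.

    t_min is a hypothetical lower clamp that avoids dividing by
    near-zero transmission values.
    """
    t = np.maximum(transmission, t_min)[..., None]
    return (hazy - airlight) / t + airlight

rng = np.random.default_rng(0)
clear = rng.uniform(0.0, 1.0, size=(4, 4, 3))        # toy haze-free image
transmission = rng.uniform(0.3, 0.9, size=(4, 4))    # toy transmission map
airlight = 0.8                                       # toy atmospheric light

hazy = add_haze(clear, transmission, airlight)
recovered = remove_haze(hazy, transmission, airlight)
```

The inversion divides by the transmission map, so errors in an estimated t or A carry over into the recovered image. This is the error accumulation the abstract refers to and the motivation for generating the haze-free image directly.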
Multi-Focus Image Fusion Based on Multi-Scale Generative Adversarial Network
Methods based on convolutional neural networks have demonstrated powerful information integration ability in image fusion. However, most existing neural-network-based methods are applied to only part of the fusion process. In this paper, an end-to-end multi-focus image fusion method based on a multi-scale generative adversarial network (MsGAN) is proposed, which makes full use of image features by combining multi-scale decomposition with a convolutional neural network. Extensive qualitative and quantitative experiments on synthetic and Lytro datasets demonstrate the effectiveness and superiority of the proposed MsGAN compared with state-of-the-art multi-focus image fusion methods.
Weather Recognition based on Attention Image Search Method
Weather monitoring plays a vital role in intelligent traffic transportation, and improving weather recognition accuracy can effectively improve driving safety. At present, classification-based and segmentation-based algorithms for weather recognition have achieved good performance, but real applications still pose many challenges. On the one hand, the number of classes in public data sets is insufficient, so conditions such as stagnant water and debris flow cannot be identified. On the other hand, current weather recognition methods generalize poorly: the model needs to be retrained whenever the classes change. In this paper, we first propose a new multi-traffic weather (MTW) data set for weather recognition, which contains a much richer set of classes. Then, a new weather recognition method based on attention image retrieval (AIR) is proposed to improve recognition performance. Compared with previous methods, our method achieves better generalization performance.
PGAN: part-based nondirect coupling embedded GAN for person re-identification
Block-based representation learning has been proven to be very effective for person re-identification (Re-ID), but the features extracted by existing block-based approaches tend to be highly correlated across blocks. These methods also perform less well for persons with large posture changes. Thus, a Part-based Nondirect Coupling (PNC) representation learning method is proposed, which introduces a similarity measure loss to constrain the features of different blocks. Moreover, a Part-based Nondirect Coupling Embedded GAN (PGAN) method is proposed, which aims to extract more features common to different postures of the same person. In this way, the features extracted by the network are robust to posture changes, and no auxiliary pose information or additional computational cost is required at the test stage. Experimental results on public datasets show that the proposed method achieves good performance; in particular, it outperforms state-of-the-art GAN-based methods for person Re-ID.
Study on the Design and Speed Ratio Control Strategy of Continuously Variable Transmission for Electric Vehicle
To develop a continuously variable transmission (CVT) suited to electric vehicles (EVs), this paper presents a comparative analysis of the transmissions used in EVs. To solve the problems of traditional CVTs used in EVs, a newly designed electric CVT (ECVT) is proposed. The final drive ratio, the speed ratio range of the variator, and the hydraulic system of the ECVT are redesigned to improve transmission efficiency. Considering that the permanent magnet synchronous traction motor (PMSTM) of an EV has an ideal driving characteristic (constant torque at low speed, constant power at high speed) and a wider high-efficiency area than a gasoline engine, the speed ratio control strategy of the ECVT is re-examined to improve the vehicle driving range and better meet the driver's requirements. Finally, a simulation test system is built to evaluate the battery energy consumption of an EV equipped with the ECVT and with a single-speed transmission (SST) under constant-speed and road drive-cycle conditions. The simulation results show that the proposed ECVT can extend the driving range of the EV at speeds of 20 to 90 km/h and improve acceleration performance from 0 to 50 km/h. The EV equipped with the ECVT shows more advantages over the SST in urban conditions.