13 research outputs found

    Contrastive Bayesian Analysis for Deep Metric Learning

    Recent methods for deep metric learning have focused on designing different contrastive loss functions between positive and negative pairs of samples, so that the learned feature embedding pulls positive samples of the same class closer together and pushes negative samples from different classes away from each other. In this work, we recognize that there is a significant semantic gap between features at the intermediate feature layer and class labels at the final output layer. To bridge this gap, we develop a contrastive Bayesian analysis to characterize and model the posterior probabilities of image labels conditioned on their feature similarity in a contrastive learning setting. This contrastive Bayesian analysis leads to a new loss function for deep metric learning. To improve the generalization capability of the proposed method to new classes, we further extend the contrastive Bayesian loss with a metric variance constraint. Our experimental results and ablation studies demonstrate that the proposed contrastive Bayesian metric learning method significantly improves the performance of deep metric learning in both supervised and pseudo-supervised scenarios, outperforming existing methods by a large margin.
    Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence
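    The abstract describes modeling the posterior probability of two samples sharing a class, conditioned on their feature similarity, plus a variance constraint. A minimal NumPy sketch of that idea follows; the sigmoid posterior form, the scale `alpha`, and the weight `var_weight` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def contrastive_bayesian_loss(embeddings, labels, alpha=5.0, var_weight=0.1):
    """Illustrative pairwise loss: model P(same class | similarity) with a
    sigmoid posterior and add a variance penalty on within-type similarities.
    Sketch only; the paper's actual posterior model differs."""
    # L2-normalize so the dot product is cosine similarity in [-1, 1]
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = e @ e.T
    n = len(labels)
    iu = np.triu_indices(n, k=1)                      # each unordered pair once
    s = sim[iu]
    same = (labels[:, None] == labels[None, :])[iu].astype(float)
    p = 1.0 / (1.0 + np.exp(-alpha * s))              # posterior P(same | s)
    bce = -(same * np.log(p + 1e-12)
            + (1 - same) * np.log(1 - p + 1e-12)).mean()
    # metric variance constraint: shrink similarity spread within pos/neg pairs
    var_pen = 0.0
    for mask in (same == 1, same == 0):
        if mask.any():
            var_pen += s[mask].var()
    return bce + var_weight * var_pen
```

    With this form, a well-clustered embedding (positives similar, negatives dissimilar) yields a lower loss than one where the pair structure is scrambled.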

    Coded Residual Transform for Generalizable Deep Metric Learning

    A fundamental challenge in deep metric learning is the generalization capability of the feature embedding network model, since the embedding network learned on training classes needs to be evaluated on new test classes. To address this challenge, in this paper we introduce a new method called coded residual transform (CRT) for deep metric learning to significantly improve its generalization capability. Specifically, we learn a set of diversified prototype features, project the feature map onto each prototype, and then encode its features using their projection residuals weighted by their correlation coefficients with each prototype. The proposed CRT method has the following two unique characteristics. First, it represents and encodes the feature map from a set of complementary perspectives based on projections onto diversified prototypes. Second, unlike existing transformer-based feature representation approaches, which encode the original values of features based on global correlation analysis, the proposed coded residual transform encodes the relative differences between the original features and their projected prototypes. Embedding space density and spectral decay analysis show that this multi-perspective projection onto diversified prototypes and coded residual representation achieve significantly improved generalization capability in metric learning. Finally, to further enhance the generalization performance, we propose to enforce consistency between the feature similarity matrices of coded residual transforms with different sizes of projection prototypes and embedding dimensions. Our extensive experimental results and ablation studies demonstrate that the proposed CRT method outperforms the state-of-the-art deep metric learning methods by large margins, improving upon the current best method by up to 4.28% on the CUB dataset.
    Comment: Accepted by NeurIPS 202
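    The core encoding step, as described, is: compute each local feature's residual to every prototype and weight the residuals by the feature-prototype correlations. A NumPy sketch of that step is below; the softmax weighting, the aggregation over spatial locations, and the flattening into a single embedding are illustrative assumptions about details the abstract leaves open.

```python
import numpy as np

def coded_residual_transform(feature_map, prototypes):
    """Sketch of the CRT idea. feature_map: (N, D) local features from the
    backbone; prototypes: (K, D) learned prototype features. Each residual
    (feature minus prototype) is weighted by the normalized correlation
    between the feature and that prototype, then aggregated over locations."""
    f = feature_map / np.linalg.norm(feature_map, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    corr = f @ p.T                                   # (N, K) correlations
    w = np.exp(corr) / np.exp(corr).sum(axis=1, keepdims=True)  # softmax over K
    resid = f[:, None, :] - p[None, :, :]            # (N, K, D) residuals
    coded = (w[:, :, None] * resid).sum(axis=0)      # aggregate -> (K, D)
    return coded.reshape(-1)                         # flatten to a K*D embedding
```

    Encoding differences to prototypes, rather than raw feature values, is what the abstract credits for the improved generalization to unseen classes.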

    A GAN-Based Input-Size Flexibility Model for Single Image Dehazing

    Image-to-image translation based on generative adversarial networks (GANs) has achieved state-of-the-art performance in various image restoration applications. Single image dehazing is a typical example, which aims to recover a haze-free image from a hazy one. This paper concentrates on the challenging task of single image dehazing. Based on the atmospheric scattering model, we design a novel model to directly generate the haze-free image. The main challenge of image dehazing is that the atmospheric scattering model has two parameters, i.e., the transmission map and the atmospheric light; when these are estimated separately, their errors accumulate and compromise dehazing quality. Considering this, along with the variety of image sizes, we propose a novel input-size-flexible conditional generative adversarial network (cGAN) for single image dehazing, which is input-size flexible at both the training and test stages for image-to-image translation within the cGAN framework. We propose a simple and effective U-type residual network (UR-Net) to build the generator, and adopt spatial pyramid pooling (SPP) to design the discriminator. Moreover, the model is trained with a multi-loss function, in which the consistency loss is newly designed in this paper. We finally build a multi-scale cGAN fusion model to realize state-of-the-art single image dehazing performance. The proposed models receive a hazy image as input and directly output a haze-free one. Experimental results demonstrate the effectiveness and efficiency of the proposed models.
    Comment: Computer Vision
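    Spatial pyramid pooling is what gives the discriminator its input-size flexibility: pooling over a fixed set of grids yields a fixed-length vector regardless of the feature map's height and width. A minimal NumPy sketch, assuming max-pooling over 1x1, 2x2, and 4x4 grids (the pyramid levels here are illustrative, not taken from the paper):

```python
import numpy as np

def spatial_pyramid_pool(feat, levels=(1, 2, 4)):
    """Max-pool an (H, W, C) feature map over n-by-n grids for each pyramid
    level, then concatenate. Output length is (1 + 4 + 16) * C no matter
    what H and W are, which lets a discriminator accept arbitrary sizes."""
    h, w, c = feat.shape
    out = []
    for n in levels:
        row_bins = np.array_split(np.arange(h), n)   # n roughly equal row groups
        col_bins = np.array_split(np.arange(w), n)
        for rows in row_bins:
            for cols in col_bins:
                out.append(feat[np.ix_(rows, cols)].max(axis=(0, 1)))
    return np.concatenate(out)
```

    Two inputs of different spatial sizes produce vectors of identical length, so the same fully connected head can score both.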

    Multi-Focus Image Fusion Based on Multi-Scale Generative Adversarial Network

    Methods based on convolutional neural networks have demonstrated powerful information-integration ability in image fusion. However, most existing neural-network-based methods are applied to only part of the fusion process. In this paper, an end-to-end multi-focus image fusion method based on a multi-scale generative adversarial network (MsGAN) is proposed that makes full use of image features by combining multi-scale decomposition with a convolutional neural network. Extensive qualitative and quantitative experiments on the synthetic and Lytro datasets demonstrate the effectiveness and superiority of the proposed MsGAN compared to state-of-the-art multi-focus image fusion methods.

    Weather Recognition based on Attention Image Search Method

    Weather monitoring plays a vital role in intelligent traffic transportation, and improving weather recognition accuracy can effectively improve driving safety. At present, classification-based and segmentation-based algorithms for weather recognition have achieved good performance, but real-world application remains challenging. On the one hand, the number of classes in public datasets is insufficient, so conditions such as stagnant water and debris flow cannot be identified. On the other hand, current weather recognition methods have poor generalization ability: the model must be retrained whenever the set of classes changes. In this paper, we first propose a new multi-traffic weather (MTW) dataset for weather recognition, which contains a much richer set of classes. Then, a new weather recognition method based on attention image retrieval (AIR) is proposed to improve recognition performance. Compared with previous methods, our method achieves better generalization performance.
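    The retrieval framing explains the generalization claim: instead of a fixed classification head, a query is matched against a gallery of embeddings, so new classes are supported by extending the gallery, with no retraining. A toy sketch of that decision rule (the attention module that would produce the embeddings is not reproduced; names and shapes here are illustrative):

```python
import numpy as np

def retrieve_label(query, gallery_feats, gallery_labels):
    """Classify a query image embedding by its nearest gallery embedding
    under cosine similarity. Adding a new weather class only requires
    appending exemplar embeddings to the gallery."""
    q = query / np.linalg.norm(query)
    g = gallery_feats / np.linalg.norm(gallery_feats, axis=1, keepdims=True)
    return gallery_labels[int(np.argmax(g @ q))]
```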

    PGAN: part-based nondirect coupling embedded GAN for person re-identification

    Block-based representation learning has proven to be a very effective method for person re-identification (Re-ID), but the features extracted by existing block-based approaches tend to be highly correlated across different blocks, and these methods perform less well for persons with large posture changes. Thus, a Part-based Nondirect Coupling (PNC) representation learning method is proposed that introduces a similarity measure loss to constrain the features of different blocks. Moreover, a Part-based Nondirect Coupling Embedded GAN (PGAN) method is proposed, which aims to extract features common to different postures of the same person. In this way, the features extracted by the network are robust to posture changes, and no auxiliary pose information or additional computational cost is required at the test stage. Experimental results on public datasets show that our proposed method achieves good performance; in particular, it outperforms state-of-the-art GAN-based methods for person Re-ID.
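    One simple way to realize a similarity measure loss that discourages correlation between block features is to penalize the average pairwise cosine similarity across blocks. The sketch below illustrates that idea only; it is an assumed form, not the paper's exact PNC loss.

```python
import numpy as np

def block_decorrelation_loss(block_feats):
    """block_feats: (B, D), one feature vector per body-part block.
    Returns the mean absolute cosine similarity over all distinct block
    pairs: 0 when blocks are orthogonal, 1 when they are identical."""
    f = block_feats / np.linalg.norm(block_feats, axis=1, keepdims=True)
    sim = f @ f.T
    b = len(f)
    off_diag = sim[~np.eye(b, dtype=bool)]   # drop each block's self-similarity
    return float(np.abs(off_diag).mean())
```

    Minimizing such a term pushes different blocks to encode complementary information rather than redundant copies of the same cues.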

    Supervised Deep Feature Embedding With Handcrafted Feature


    Study on the Design and Speed Ratio Control Strategy of Continuously Variable Transmission for Electric Vehicle

    In order to develop a continuously variable transmission (CVT) suited for electric vehicles (EVs), a comparative analysis of transmissions used in EVs is presented in this paper. To solve the problems of traditional CVTs used in EVs, a newly designed electric CVT (ECVT) is proposed. The final drive ratio, the speed ratio range of the variator, and the hydraulic system of the ECVT are redesigned to improve transmission efficiency. Considering that the permanent magnet synchronous traction motor (PMSTM) of an EV has an ideal driving characteristic (constant torque at low speed, constant power at high speed) and a wider high-efficiency region than a gasoline engine, the speed ratio control strategy of the ECVT is revisited to improve the vehicle driving range and better meet the requirements of the driver. Finally, a simulation test system is built to evaluate the battery energy consumption of an EV equipped with the ECVT and with a single-speed transmission (SST) under constant speed and road drive cycles. The simulation results show that the proposed ECVT can extend the endurance mileage of the EV at speeds of 20 to 90 km/h and improve the acceleration performance from 0 to 50 km/h. The EV equipped with the ECVT shows clear advantages over the SST in urban conditions.
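    The essence of a CVT speed ratio strategy for an EV is choosing, at each vehicle speed, the variator ratio that keeps the traction motor near its high-efficiency operating speed, clamped to the mechanical ratio range. A toy sketch follows; every numeric value (target motor speed, wheel radius, final drive, ratio limits) is hypothetical and not taken from the paper.

```python
def select_cvt_ratio(vehicle_speed_kmh, target_motor_rpm=3000.0,
                     wheel_radius_m=0.3, final_drive=4.0,
                     ratio_min=0.5, ratio_max=2.5):
    """Pick the variator ratio that would place the motor at its
    high-efficiency speed for the current vehicle speed, saturating
    at the CVT's physical ratio limits. Illustrative numbers only."""
    pi = 3.141592653589793
    wheel_rpm = vehicle_speed_kmh / 3.6 / (2 * pi * wheel_radius_m) * 60
    desired = target_motor_rpm / (wheel_rpm * final_drive)
    return max(ratio_min, min(ratio_max, desired))
```

    At low vehicle speed the desired ratio saturates at the upper limit (torque-multiplying, good for acceleration), while at highway speed it drops toward the lower limit, which is the behavior the abstract's 20-90 km/h range-extension claim relies on.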