62 research outputs found

    Multi-evidence and multi-modal fusion network for ground-based cloud recognition

    In recent years, deep neural networks have drawn much attention in ground-based cloud recognition. Yet such approaches focus only on learning global features from visual information, which yields incomplete representations of ground-based clouds. In this paper, we propose a novel method named multi-evidence and multi-modal fusion network (MMFN) for ground-based cloud recognition, which learns extended cloud information by fusing heterogeneous features in a unified framework. Specifically, MMFN exploits multiple pieces of evidence, i.e., global and local visual features, from ground-based cloud images using the main network and the attentive network. In the attentive network, local visual features are extracted from attentive maps, which are obtained by refining salient patterns from convolutional activation maps. Meanwhile, the multi-modal network in MMFN learns multi-modal features for ground-based clouds. To fully fuse the multi-modal and multi-evidence visual features, we design two fusion layers in MMFN that incorporate multi-modal features with global and local visual features, respectively. Furthermore, we release the first multi-modal ground-based cloud dataset, named MGCD, which contains not only the ground-based cloud images but also the multi-modal information corresponding to each cloud image. MMFN is evaluated on MGCD and achieves a classification accuracy of 88.63% in comparison with state-of-the-art methods, which validates its effectiveness for ground-based cloud recognition.
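
The two fusion layers described in this abstract can be pictured with a minimal sketch. The abstract does not state the fusion operator, so plain concatenation is assumed here, and the function name `fuse` and all feature values are hypothetical placeholders rather than anything taken from MMFN.

```python
def fuse(visual, multimodal):
    """Fuse one visual feature vector with the multi-modal vector.

    Concatenation is an assumption; the abstract does not name the operator.
    """
    return list(visual) + list(multimodal)


# Hypothetical feature vectors for a single cloud image.
global_feat = [0.2, 0.7, 0.1]   # stands in for the main network's output
local_feat = [0.5, 0.4]         # stands in for the attentive network's output
modal_feat = [0.9, 0.3]         # stands in for the multi-modal network's output

# One fusion per piece of visual evidence, mirroring the "two fusion layers".
fused_global = fuse(global_feat, modal_feat)
fused_local = fuse(local_feat, modal_feat)
```

Both fused vectors carry the same multi-modal part, so a downstream classifier would see the multi-modal information alongside each kind of visual evidence.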

    Fuzzy multilayer clustering and fuzzy label regularization for unsupervised person reidentification

    Unsupervised person reidentification has received increasing attention due to its wide range of real-world applications. In this paper, we propose a novel method named fuzzy multilayer clustering (FMC) for unsupervised person reidentification. The proposed FMC learns a new feature space for clustering using a multilayer perceptron, in order to overcome the influence of complex pedestrian images. Meanwhile, FMC generates fuzzy labels for unlabeled pedestrian images, simultaneously considering the membership degree and the similarity between the sample and each cluster. We further propose fuzzy label regularization (FLR) to train the convolutional neural network (CNN) in a supervised manner using pedestrian images with fuzzy labels. The proposed FLR regularizes the CNN training process and reduces the risk of overfitting. The effectiveness of our method is validated on three large-scale person reidentification databases, i.e., Market-1501, DukeMTMC-reID, and CUHK03.
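
To make the idea of a fuzzy label concrete, the sketch below computes membership degrees with the standard fuzzy c-means formula and uses them as soft targets in a cross-entropy loss. This is an assumption for illustration: the abstract does not give FMC's or FLR's formulas, and the function names, the fuzzifier `m`, and the sample values are hypothetical.

```python
import math


def fuzzy_memberships(distances, m=2.0):
    """Standard fuzzy c-means membership of one sample in each cluster.

    distances: distance from the sample to every cluster center.
    m: fuzzifier (> 1); larger values give softer memberships.
    """
    # A sample that coincides with a center belongs fully to that center.
    if any(d == 0 for d in distances):
        hits = [1.0 if d == 0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    inverse = [d ** -exponent for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def soft_label_cross_entropy(probs, soft_labels):
    """Cross-entropy between predicted probabilities and a soft label."""
    return -sum(t * math.log(p) for p, t in zip(probs, soft_labels) if t > 0)


# Hypothetical sample: twice as far from cluster 1 as from cluster 0.
labels = fuzzy_memberships([1.0, 2.0])
loss = soft_label_cross_entropy([0.7, 0.3], labels)
```

Because the target spreads probability mass over several clusters instead of committing to one, a confident wrong cluster assignment is penalized less harshly than with a one-hot pseudo-label.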

    Processing transitive nearest-neighbor queries in multi-channel access environments

    Wireless broadcast is an efficient way to disseminate information due to its good scalability [10]. Existing works typically assume that mobile devices, such as cell phones and PDAs, can access only one channel at a time. In this paper, we consider a near-future scenario in which a mobile device can process queries using information received simultaneously from multiple channels. We focus on the query processing of the transitive nearest neighbor (TNN) search [19]. Two TNN algorithms developed for a single broadcast channel environment are adapted to our new broadcast environment. Based on the insights obtained, we propose two new algorithms, namely the Double-NN-Search and Hybrid-NN-Search algorithms. Further, we develop an optimization technique, called approximate-NN (ANN), to reduce energy consumption in mobile devices. Finally, we conduct a comprehensive set of experiments to validate our proposals. The results show that our new algorithms perform better than the existing ones and that the optimization technique efficiently reduces energy consumption. Keywords: multi-channel access, transitive nearest neighbor, query processing
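
For readers unfamiliar with the query type, the sketch below illustrates a TNN query under the commonly used definition, which is assumed here because the abstract does not spell it out: given a query point `p` and two datasets `S` and `R`, find the pair `(s, r)` that minimizes `dist(p, s) + dist(s, r)`. The brute-force search and the greedy bound are illustrative only; they are not the Double-NN-Search, Hybrid-NN-Search, or ANN algorithms of the paper, and all names and points are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def tnn_brute_force(p, S, R):
    """Return (total, s, r) minimizing dist(p, s) + dist(s, r)."""
    best = None
    for s in S:
        for r in R:
            total = dist(p, s) + dist(s, r)
            if best is None or total < best[0]:
                best = (total, s, r)
    return best


def greedy_upper_bound(p, S, R):
    """Chain two plain NN queries; the result only bounds the optimum."""
    s = min(S, key=lambda x: dist(p, x))
    r = min(R, key=lambda x: dist(s, x))
    return dist(p, s) + dist(s, r), s, r


# Hypothetical data: the nearest s to p is not part of the best pair.
p = (0.0, 0.0)
S = [(1.0, 0.0), (0.0, 2.0)]
R = [(0.0, 3.0), (4.0, 0.0)]

exact = tnn_brute_force(p, S, R)
bound = greedy_upper_bound(p, S, R)
```

In this example the greedy chain settles on a longer route than the exact answer, which shows why a TNN query cannot simply be answered by two independent nearest-neighbor lookups.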

    Completed part transformer for person re-identification

    Recently, part information of pedestrian images has been demonstrated to be effective for person re-identification (ReID), but the part interaction is ignored when using a Transformer to learn long-range dependencies. In this paper, we propose a novel transformer network named Completed Part Transformer (CPT) for person ReID, in which we design the part transformer layer to learn the completed part interaction. The part transformer layer includes the intra-part layer and the part-global layer, which consider long-range dependencies from the aspects of the intra-part interaction and the part-global interaction simultaneously. Furthermore, in order to overcome the limitation of the fixed number of patch tokens in the transformer layer, we propose the Adaptive Refined Tokens (ART) module to focus on learning the interaction between the informative patch tokens in the pedestrian image, which improves the discrimination of the pedestrian representation. Extensive experimental results on four person ReID datasets, i.e., MSMT17, Market1501, DukeMTMC-reID, and CUHK03, demonstrate that the proposed method achieves new state-of-the-art performance, e.g., 68.0% mAP and 84.6% Rank-1 accuracy on MSMT17.