
    Semantic-Aware Dual Contrastive Learning for Multi-label Image Classification

    Extracting image semantics effectively and assigning the corresponding labels to the multiple objects or attributes of natural images is challenging due to complex scene contents and confusing label dependencies. Recent works have focused on modeling label relationships with graphs and understanding object regions using class activation maps (CAM). However, these methods ignore the complex intra- and inter-category relationships among specific semantic features, and CAM is prone to generating noisy information. To this end, we propose a novel semantic-aware dual contrastive learning framework that incorporates sample-to-sample contrastive learning (SSCL) as well as prototype-to-sample contrastive learning (PSCL). Specifically, we leverage semantic-aware representation learning to extract category-related local discriminative features and construct category prototypes. Then, based on SSCL, label-level visual representations of the same category are aggregated, and features belonging to distinct categories are separated. Meanwhile, we construct a novel PSCL module to narrow the distance between positive samples and category prototypes and to push negative samples away from the corresponding category prototypes. Finally, the discriminative label-level features related to the image content are accurately captured by jointly training the above three parts. Experiments on five challenging large-scale public datasets demonstrate that our proposed method is effective and outperforms state-of-the-art methods. Code and supplementary materials are released at https://github.com/yu-gi-oh-leilei/SADCL.
    Comment: 8 pages, 6 figures; accepted by the European Conference on Artificial Intelligence (ECAI 2023).
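
    The two contrastive objectives pull same-category features together (SSCL) and anchor them to category prototypes (PSCL). A minimal PyTorch sketch of a prototype-to-sample contrastive loss in this spirit is given below; the function name, tensor shapes, and temperature value are assumptions for illustration, not the released SADCL code.

    import torch
    import torch.nn.functional as F

    def prototype_sample_contrastive_loss(features, prototypes, labels, temperature=0.07):
        # features:   (B, C, D) label-level embeddings, one per category per image
        # prototypes: (C, D) one learnable prototype per category
        # labels:     (B, C) multi-hot ground truth, float tensor
        f = F.normalize(features, dim=-1)                # unit-normalize embeddings
        p = F.normalize(prototypes, dim=-1)              # unit-normalize prototypes
        sim = torch.einsum('bcd,kd->bck', f, p) / temperature  # similarity to every prototype
        log_prob = F.log_softmax(sim, dim=-1)            # normalize over prototypes
        pos = log_prob.diagonal(dim1=1, dim2=2)          # (B, C): similarity to own prototype
        return -(labels * pos).sum() / labels.sum().clamp(min=1)

    Minimizing this term pulls each positive label-level feature toward its own category prototype and, through the softmax normalization, pushes it away from the prototypes of the other categories.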

    Hybrid attention mechanism of feature fusion for medical image segmentation

    Abstract: Traditional convolutional neural networks (CNNs) have achieved good performance in multi-organ segmentation of medical images. However, because CNNs lack the ability to model long-range dependencies and correlations between image pixels, they usually ignore information in the channel dimension. To further improve multi-organ segmentation performance, a hybrid attention mechanism model is proposed. First, a CNN was used to extract multi-scale feature maps, which were fed into the Channel Attention Enhancement Module (CAEM) to selectively attend to target organs in medical images, while a Transformer encoded tokenized image patches from the CNN feature maps as an input sequence to model long-range dependencies. Second, the decoder upsampled the Transformer output and fused it with the multi-scale CAEM features through skip connections. Finally, we introduced a Refinement Module (RM) after the decoder to improve feature correlations within the same organ and feature discriminability between different organs. The model achieved superior Dice coefficient (%) and HD95 results on both the Synapse multi-organ segmentation and cardiac diagnosis challenge datasets. The hybrid attention mechanisms exhibited high efficiency and high segmentation accuracy on medical images.
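
    The pipeline gates CNN features over the channel dimension before the Transformer models long-range dependencies between patches. A minimal PyTorch sketch of a channel-attention module and its place in such a hybrid encoder follows; the class names, reduction ratio, and layer sizes are illustrative assumptions, not the published CAEM architecture.

    import torch.nn as nn

    class ChannelAttention(nn.Module):
        # Squeeze-and-excitation-style gating over the channel dimension.
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.gate = nn.Sequential(
                nn.AdaptiveAvgPool2d(1),                       # squeeze: global context per channel
                nn.Conv2d(channels, channels // reduction, 1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, 1),
                nn.Sigmoid(),                                  # excitation: per-channel weights
            )

        def forward(self, x):                                  # x: (B, C, H, W)
            return x * self.gate(x)                            # reweight channels, keep shape

    class HybridEncoder(nn.Module):
        # CNN features are gated by channel attention, then tokenized so a
        # Transformer encoder can model long-range dependencies between patches.
        def __init__(self, channels=256, depth=4, heads=8):
            super().__init__()
            self.caem = ChannelAttention(channels)
            layer = nn.TransformerEncoderLayer(d_model=channels, nhead=heads, batch_first=True)
            self.transformer = nn.TransformerEncoder(layer, num_layers=depth)

        def forward(self, feats):                              # feats: (B, C, H, W) from a CNN
            feats = self.caem(feats)
            b, c, h, w = feats.shape
            tokens = feats.flatten(2).transpose(1, 2)          # (B, H*W, C) patch tokens
            tokens = self.transformer(tokens)
            return tokens.transpose(1, 2).reshape(b, c, h, w)  # feature map for the decoder

    In a full model, the returned feature map would be upsampled by a decoder and fused with the multi-scale channel-attended features through skip connections, as the abstract describes.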