Patch-Level Contrasting without Patch Correspondence for Accurate and
  Dense Contrastive Representation Learning

Yan, Junchi; Zhang, Shaofeng; Zhao, Rui; Zhu, Feng

Authors: Junchi Yan, Shaofeng Zhang, Rui Zhao, Feng Zhu
Publication date: 23 June 2023

Abstract

We propose ADCLR (Accurate and Dense Contrastive Representation Learning), a novel self-supervised framework for learning accurate and dense vision representations. To extract spatial-sensitive information, ADCLR introduces query patches for contrasting in addition to global contrasting. Compared with previous dense contrasting methods, ADCLR has three main merits: i) it achieves both global-discriminative and spatial-sensitive representations, ii) it is model-efficient (no extra parameters beyond the global contrasting baseline), and iii) it is correspondence-free and thus simpler to implement. Our approach achieves new state-of-the-art performance among contrastive methods. On classification tasks, for ViT-S, ADCLR achieves 77.5% top-1 accuracy on ImageNet with linear probing, outperforming our baseline (DINO), which lacks our plug-in techniques, by 0.5%. For ViT-B, ADCLR achieves 79.8% and 84.0% accuracy on ImageNet with linear probing and fine-tuning, respectively, outperforming iBOT by 0.3% and 0.2%. For dense tasks, on MS-COCO, ADCLR achieves 44.3% AP on object detection and 39.7% AP on instance segmentation, outperforming the previous SOTA method SelfPatch by 2.2% and 1.2%, respectively. On ADE20K, ADCLR outperforms SelfPatch by 1.0% mIoU and 1.2% mAcc on the segm
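
To make the idea of contrasting query patches "in addition to" global contrasting more concrete, the following is a minimal sketch, not the authors' implementation. It assumes a DINO-style teacher/student cross-entropy for the global token and applies the same loss to each query-patch output. It further assumes that every query patch yields one output per view, so pairs are formed by index rather than by spatial matching. The function names, the temperatures, and the `query_weight` mixing factor are all hypothetical choices for illustration.

```python
import math


def softmax(logits, temperature):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits, temperature):
    """Numerically stable log-softmax (avoids log(0) on underflow)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]


def cross_entropy(teacher_logits, student_logits, t_temp=0.04, s_temp=0.1):
    """DINO-style cross-entropy H(p_teacher, p_student).

    The temperatures are illustrative assumptions, not values from the abstract.
    """
    p_t = softmax(teacher_logits, t_temp)
    log_p_s = log_softmax(student_logits, s_temp)
    return -sum(pt * lps for pt, lps in zip(p_t, log_p_s))


def combined_loss(global_t, global_s, query_t, query_s, query_weight=0.5):
    """Global term plus the mean loss over query-patch outputs.

    query_t[i] and query_s[i] are assumed to come from the same query
    patch seen alongside two different views, so no patch matching is needed.
    query_weight is a hypothetical mixing factor.
    """
    global_term = cross_entropy(global_t, global_s)
    query_term = sum(
        cross_entropy(t, s) for t, s in zip(query_t, query_s)
    ) / len(query_t)
    return (1.0 - query_weight) * global_term + query_weight * query_term


# Toy usage: one global token and two query patches, 3-dimensional outputs.
aligned = combined_loss(
    [2.0, 0.0, 0.0], [2.0, 0.0, 0.0],
    [[0.0, 2.0, 0.0], [0.0, 0.0, 2.0]],
    [[0.0, 2.0, 0.0], [0.0, 0.0, 2.0]],
)
mismatched = combined_loss(
    [2.0, 0.0, 0.0], [0.0, 2.0, 0.0],
    [[0.0, 2.0, 0.0], [0.0, 0.0, 2.0]],
    [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
)
```

In this toy setup the loss is small when the student outputs agree with the teacher outputs for both the global token and the query patches, and large when they disagree. Because the query term reuses the same loss function and operates on outputs of the same network, the sketch adds no parameters of its own.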

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2306.13337

Last updated on 28/06/2023