Prototypical Contrast Adaptation for Domain Adaptive Semantic Segmentation
Unsupervised Domain Adaptation (UDA) aims to adapt the model trained on the
labeled source domain to an unlabeled target domain. In this paper, we present
Prototypical Contrast Adaptation (ProCA), a simple and efficient contrastive
learning method for unsupervised domain adaptive semantic segmentation.
Previous domain adaptation methods merely consider the alignment of the
intra-class representational distributions across domains, while the
inter-class structural relationship is insufficiently explored; as a result,
the aligned representations on the target domain may no longer be
discriminated as easily as those on the source domain. Instead, ProCA incorporates
inter-class information into class-wise prototypes, and adopts the
class-centered distribution alignment for adaptation. By considering the same
class prototypes as positives and other class prototypes as negatives to
achieve class-centered distribution alignment, ProCA achieves state-of-the-art
performance on classical domain adaptation tasks, i.e., GTA5 → Cityscapes and
SYNTHIA → Cityscapes. Code is available at
https://github.com/jiangzhengkai/ProCA.
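As an illustration of the class-centered contrastive idea described above, the following is a minimal PyTorch sketch rather than the authors' implementation: the feature shapes, temperature, and EMA prototype update are assumptions for exposition.

```python
# Sketch of a prototype-based contrastive loss in the spirit of ProCA.
# Shapes, temperature, and the EMA update rule are illustrative assumptions.
import torch
import torch.nn.functional as F

def prototypical_contrast_loss(features, labels, prototypes, temperature=0.1):
    """features: (N, D) pixel embeddings; labels: (N,) class ids;
    prototypes: (C, D) running class prototypes."""
    features = F.normalize(features, dim=1)
    prototypes = F.normalize(prototypes, dim=1)
    logits = features @ prototypes.t() / temperature  # (N, C) similarities
    # The same-class prototype is the positive and all other prototypes are
    # negatives, so the loss reduces to cross-entropy over the similarities.
    return F.cross_entropy(logits, labels)

@torch.no_grad()
def update_prototypes(prototypes, features, labels, momentum=0.99):
    # EMA update of each class prototype with the mean feature of that class.
    for c in labels.unique():
        mask = labels == c
        prototypes[c] = momentum * prototypes[c] \
            + (1 - momentum) * features[mask].mean(0)
    return prototypes
```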
You Only Need 90K Parameters to Adapt Light: A Light Weight Transformer for Image Enhancement and Exposure Correction
Challenging illumination conditions (low light, under-exposure, and
over-exposure) in the real world not only produce an unpleasant visual
appearance but also degrade computer vision tasks. After the camera captures
raw-RGB data, the image signal processor (ISP) renders standard sRGB images. By
decomposing the ISP pipeline into local and global image components, we propose a
lightweight fast Illumination Adaptive Transformer (IAT) to restore the normal
lit sRGB image from either low-light or under/over-exposure conditions.
Specifically, IAT uses attention queries to represent and adjust the
ISP-related parameters such as colour correction and gamma correction. With
only ~90k parameters and ~0.004s processing time per image, our IAT
consistently achieves superior performance over state-of-the-art methods on
current benchmark low-light enhancement and exposure correction datasets.
Complementary experiments also demonstrate that our IAT significantly improves
object detection and semantic segmentation under various lighting conditions.
Training code and a pretrained model are available at
https://github.com/cuiziteng/Illumination-Adaptive-Transformer.
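To illustrate how attention queries might regress global ISP parameters such as a colour-correction matrix and a gamma value, here is a minimal PyTorch sketch. The single-head attention, layer sizes, and parameter ranges are illustrative assumptions, not the released IAT architecture.

```python
# Sketch: learnable queries attend to image features and regress a 3x3 colour
# matrix plus a gamma value, which are then applied to the image. All layer
# choices here are assumptions for illustration.
import torch
import torch.nn as nn

class GlobalISPHead(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(10, dim))  # 9 colour + 1 gamma
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.to_param = nn.Linear(dim, 1)

    def forward(self, feats):                     # feats: (B, HW, dim)
        q = self.queries.unsqueeze(0).expand(feats.size(0), -1, -1)
        out, _ = self.attn(q, feats, feats)       # queries attend to features
        p = self.to_param(out).squeeze(-1)        # (B, 10)
        color = p[:, :9].view(-1, 3, 3) + torch.eye(3, device=p.device)
        gamma = torch.sigmoid(p[:, 9]) * 2 + 0.5  # keep gamma in a sane range
        return color, gamma

def apply_isp(img, color, gamma):                 # img: (B, 3, H, W) in [0, 1]
    b, c, h, w = img.shape
    flat = img.flatten(2)                         # (B, 3, HW)
    out = torch.bmm(color, flat).view(b, c, h, w).clamp(1e-6, 1)
    return out ** gamma.view(-1, 1, 1, 1)         # per-image gamma correction
```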
Dynamic Fusion with Intra- and Inter-modality Attention Flow for Visual Question Answering
Learning effective fusion of multi-modality features is at the heart of
visual question answering. We propose a novel method of dynamically fusing
multi-modal features with intra- and inter-modality information flow, which
alternately passes dynamic information within and across the visual and
language modalities. It robustly captures high-level interactions between the
language and vision domains, thus significantly improving the performance of
visual question answering. We also show that the proposed
dynamic intra-modality attention flow conditioned on the other modality can
dynamically modulate the intra-modality attention of the target modality, which
is vital for multimodality feature fusion. Experimental evaluations on the VQA
2.0 dataset show that the proposed method achieves state-of-the-art VQA
performance. Extensive ablation studies are carried out for the comprehensive
analysis of the proposed method. (CVPR 2019 Oral)
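Below is a minimal PyTorch sketch of one inter-/intra-modality attention block in the spirit of the described dynamic fusion. The sigmoid gate that conditions intra-modality attention on the other modality's pooled feature is a simplified assumption, not the paper's exact mechanism.

```python
# Sketch of one fusion block: inter-modality co-attention followed by
# intra-modality self-attention modulated by the other modality.
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    def __init__(self, dim=512, heads=8):
        super().__init__()
        self.v_from_l = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.l_from_v = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.v_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.l_self = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.v_gate = nn.Linear(dim, dim)
        self.l_gate = nn.Linear(dim, dim)

    def forward(self, vis, lang):  # vis: (B, Nv, D), lang: (B, Nl, D)
        # Inter-modality flow: each modality attends to the other.
        vis = vis + self.v_from_l(vis, lang, lang)[0]
        lang = lang + self.l_from_v(lang, vis, vis)[0]
        # Intra-modality flow, dynamically modulated by the other modality:
        # a gate computed from the other modality's pooled feature scales the
        # self-attention input (one simple way to "condition" it).
        v_in = vis * torch.sigmoid(self.v_gate(lang.mean(1, keepdim=True)))
        l_in = lang * torch.sigmoid(self.l_gate(vis.mean(1, keepdim=True)))
        vis = vis + self.v_self(v_in, v_in, v_in)[0]
        lang = lang + self.l_self(l_in, l_in, l_in)[0]
        return vis, lang
```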
Rethinking Mobile Block for Efficient Attention-based Models
This paper focuses on developing modern, efficient, lightweight models for
dense predictions while trading off parameters, FLOPs, and performance.
The Inverted Residual Block (IRB) serves as the infrastructure of lightweight
CNNs, but no attention-based counterpart has been recognized. This work
rethinks the lightweight infrastructure of the efficient IRB and the effective
components of the Transformer from a unified perspective, extending the
CNN-based IRB to attention-based models and abstracting a one-residual Meta
Mobile Block (MMB) for lightweight model design. Following a simple but
effective design criterion, we deduce a modern Inverted Residual Mobile Block
(iRMB) and build a ResNet-like Efficient MOdel (EMO) with only iRMBs for
downstream tasks.
Extensive experiments on ImageNet-1K, COCO2017, and ADE20K benchmarks
demonstrate the superiority of our EMO over state-of-the-art methods, e.g.,
EMO-1M/2M/5M achieve 71.5%, 75.1%, and 78.4% Top-1 accuracy, surpassing
equal-order CNN- and attention-based models while trading off parameters,
efficiency, and accuracy well, running 2.8-4.0x faster than EdgeNeXt on an
iPhone 14.
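The following is a minimal PyTorch sketch of a one-residual mobile block whose spatial mixer can be either a depthwise convolution (CNN-style IRB) or self-attention, illustrating the Meta Mobile Block abstraction. The expansion ratio and layer choices are assumptions for exposition, not the released EMO code.

```python
# Sketch of a Meta-Mobile-Block-style unit: expand -> mix (depthwise conv or
# self-attention) -> project, with a single residual connection.
import torch
import torch.nn as nn

class MetaMobileBlock(nn.Module):
    def __init__(self, dim, expand=4, use_attention=False, heads=4):
        super().__init__()
        hidden = dim * expand
        self.expand = nn.Sequential(nn.Conv2d(dim, hidden, 1), nn.GELU())
        if use_attention:
            self.mixer = nn.MultiheadAttention(hidden, heads, batch_first=True)
        else:
            self.mixer = nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden)
        self.use_attention = use_attention
        self.project = nn.Conv2d(hidden, dim, 1)

    def forward(self, x):                        # x: (B, C, H, W)
        y = self.expand(x)
        if self.use_attention:                   # tokens = flattened pixels
            b, c, h, w = y.shape
            t = y.flatten(2).transpose(1, 2)     # (B, HW, C')
            t = self.mixer(t, t, t)[0]
            y = t.transpose(1, 2).view(b, c, h, w)
        else:                                    # depthwise conv mixing
            y = self.mixer(y)
        return x + self.project(y)               # one-residual design
```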