Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation
Image annotation aims to annotate a given image with a variable number of
class labels corresponding to diverse visual concepts. In this paper, we
address two main issues in large-scale image annotation: 1) how to learn a rich
feature representation suitable for predicting a diverse set of visual concepts
ranging from objects and scenes to abstract concepts; 2) how to annotate an image
with the optimal number of class labels. To address the first issue, we propose
a novel multi-scale deep model for extracting rich and discriminative features
capable of representing a wide range of visual concepts. Specifically, a novel
two-branch deep neural network architecture is proposed which comprises a very
deep main network branch and a companion feature fusion network branch designed
for fusing the multi-scale features computed from the main branch. The deep
model is also made multi-modal by taking noisy user-provided tags as model
input to complement the image input. To tackle the second issue, we
introduce a label quantity prediction auxiliary task alongside the main label
prediction task to explicitly estimate the optimal number of labels for a given
image. Extensive experiments are carried out on two large-scale image
annotation benchmark datasets and the results show that our method
significantly outperforms the state-of-the-art. Comment: Submitted to IEEE TI
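The label quantity prediction idea above can be sketched in a few lines: a shared feature vector feeds both a label-scoring head and an auxiliary head that classifies how many labels to output. This is a minimal illustrative sketch, not the paper's actual two-branch multi-scale architecture; the class names, dimensions, and the `annotate` helper are all hypothetical.

```python
import torch
import torch.nn as nn

class LabelQuantityAnnotator(nn.Module):
    """Hypothetical sketch: a shared image feature feeds a label-scoring head
    and an auxiliary head that predicts the number of labels (as a class)."""
    def __init__(self, feat_dim=512, num_classes=1000, max_labels=10):
        super().__init__()
        self.label_head = nn.Linear(feat_dim, num_classes)   # per-class scores
        self.count_head = nn.Linear(feat_dim, max_labels)    # label-count logits

    def forward(self, feats):
        return self.label_head(feats), self.count_head(feats)

def annotate(model, feats):
    """Return the top-k labels, with k taken from the auxiliary count head.
    Assumes a single image, i.e. feats has shape [1, feat_dim]."""
    scores, count_logits = model(feats)
    k = int(count_logits.argmax(dim=-1).item()) + 1  # predicted label number
    return scores.topk(k, dim=-1).indices
```

At inference the auxiliary head replaces a fixed top-k or global score threshold, which is what lets the label count vary per image.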
Debiased Fine-Tuning for Vision-language Models by Prompt Regularization
We present a new paradigm for fine-tuning large-scale vision-language
pre-trained models on downstream tasks, dubbed Prompt Regularization (ProReg).
Different from traditional fine-tuning which easily overfits to the downstream
task data, ProReg uses the prediction by prompting the pretrained model to
regularize the fine-tuning. The motivation is that, by prompting the large
model with "a photo of a [CLASS]", the fill-in answer depends only on the
pretraining encyclopedic knowledge and is independent of the task data distribution, which
is usually biased. Specifically, given a training sample prediction during
fine-tuning, we first calculate its Kullback-Leibler loss with respect to the prompt
prediction and its cross-entropy loss with respect to the ground-truth label, and then combine
them with a proposed sample-wise adaptive trade-off weight, which automatically
adjusts the transfer between the pretrained and downstream domains. On various
out-of-distribution benchmarks, we show the consistently strong performance of
ProReg compared with conventional fine-tuning, zero-shot prompt, prompt tuning,
and other state-of-the-art methods. Comment: AAAI2023 accepted
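The loss combination described above can be sketched as follows. This is a minimal sketch under assumptions: the abstract specifies a KL term toward the prompt prediction, a cross-entropy term toward the label, and a sample-wise adaptive trade-off weight, but the concrete weighting rule below (prompt confidence) is hypothetical and the paper's exact formulation may differ.

```python
import torch
import torch.nn.functional as F

def proreg_style_loss(logits, prompt_logits, labels, tau=1.0):
    """Sketch of a ProReg-style objective: per-sample cross-entropy to the
    ground truth plus a KL regularizer toward the frozen prompt prediction,
    mixed with a sample-wise weight (the weighting rule here is hypothetical)."""
    # Per-sample cross-entropy against the ground-truth labels
    ce = F.cross_entropy(logits, labels, reduction="none")
    # Per-sample KL divergence toward the zero-shot prompt prediction
    kl = F.kl_div(
        F.log_softmax(logits / tau, dim=-1),
        F.softmax(prompt_logits / tau, dim=-1),
        reduction="none",
    ).sum(dim=-1)
    # Hypothetical adaptive weight: trust the prompt more when it is confident
    with torch.no_grad():
        w = F.softmax(prompt_logits, dim=-1).max(dim=-1).values
    return ((1.0 - w) * ce + w * kl).mean()
```

The sample-wise weight is what distinguishes this from a fixed-coefficient distillation loss: each example interpolates independently between the pretrained and downstream domains.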
Second-Order Topological Insulator in van der Waals Heterostructures of CoBr2/Pt2HgSe3/CoBr2
Second-order topological insulators, which host (d-2)-dimensional topological
hinge or corner states, have been observed in three-dimensional materials, but
have not yet been observed in two-dimensional systems. In this Letter, we
theoretically propose the realization of a second-order topological insulator in
the van der Waals heterostructure of CoBr2/Pt2HgSe3/CoBr2.
Pt2HgSe3 is a large-gap topological insulator. With an
in-plane exchange field from the neighboring CoBr2, a large band gap above 70
meV opens up at the edge. The corner states, which are robust against edge
disorders and irregular shapes, are confirmed in a nanoflake. We further show
that the second-order topological states can also be realized in
heterostructures of jacutingaite-family topological insulators.
We believe that our work will be beneficial for the experimental realization of
second-order topological insulators in van der Waals layered materials.
Prompt-aligned Gradient for Prompt Tuning
Thanks to the large pre-trained vision-language models (VLMs) like CLIP, we
can craft a zero-shot classifier by "prompt", e.g., the confidence score of an
image being "[CLASS]" can be obtained by using the VLM provided similarity
measure between the image and the prompt sentence "a photo of a [CLASS]".
Therefore, prompting shows great potential for fast adaptation of VLMs to
downstream tasks if we fine-tune the prompt-based similarity measure. However,
we find a common failure mode: improper fine-tuning may undermine the
prompt's inherent prediction not only for the task-related classes but also
for other classes in the VLM vocabulary. Existing methods address this problem
with traditional anti-overfitting techniques such as early stopping and data
augmentation, which lack a principled solution specific to prompt tuning. We present
Prompt-aligned Gradient, dubbed ProGrad, to prevent prompt tuning from
forgetting the general knowledge learned from VLMs. In particular, ProGrad
only updates the prompt whose gradient is aligned (or non-conflicting) to the
"general direction", which is represented as the gradient of the KL loss of the
pre-defined prompt prediction. Extensive experiments demonstrate the stronger
few-shot generalization ability of ProGrad over state-of-the-art prompt tuning
methods. Codes are available at https://github.com/BeierZhu/Prompt-align. Comment: Accepted by ICCV202
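The gradient-alignment rule described above can be sketched directly: keep the task gradient when it agrees with the "general direction" (the gradient of the KL loss to the predefined prompt prediction), and otherwise project out the conflicting component. This is a minimal sketch of that projection for a single parameter tensor; the function name and the tensor-at-a-time interface are assumptions, not the repository's API.

```python
import torch

def prograd_style_grad(grad_ce, grad_kl):
    """Sketch of ProGrad-style alignment: grad_ce is the task (cross-entropy)
    gradient, grad_kl the "general direction" from the KL loss to the
    predefined prompt prediction. Conflicting components are projected out."""
    g = grad_kl.flatten()
    d = grad_ce.flatten()
    dot = torch.dot(d, g)
    if dot < 0:
        # Conflict: remove the component opposing the general direction
        d = d - (dot / torch.dot(g, g)) * g
    return d.view_as(grad_ce)
```

After the projection, the update can never decrease agreement with the prompt's zero-shot prediction to first order, which is how the method prevents forgetting while still fitting the few-shot task.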
Anderson Localization from Berry-Curvature Interchange in Quantum Anomalous Hall System
We theoretically investigate the localization mechanism of the quantum
anomalous Hall effect (QAHE) in the presence of spin-flip disorders. We show
that the QAHE remains quantized at weak disorder, then enters a
Berry-curvature-mediated metallic phase at moderate disorder, and finally goes
into the Anderson insulating phase at strong disorder. From the phase diagram,
we find that, although the QAHE is most robust against disorder at the charge
neutrality point, the corresponding metallic phase is much more easily
localized into the Anderson insulating phase due to the \textit{interchange} of Berry
curvatures carried respectively by the conduction and valence bands. At the
end, we provide a phenomenological picture related to the topological charges
to better understand the underlying physical origin of the QAHE Anderson
localization. Comment: 6 pages, 4 figures