Fast Sampling of Diffusion Models via Operator Learning
Diffusion models have found widespread adoption in various areas. However,
sampling from them is slow because it involves emulating a reverse process with
hundreds to thousands of network evaluations. Inspired by the success of neural
operators in accelerating the solution of differential equations, we approach this
problem by solving the underlying neural differential equation from an operator
learning perspective. We examine probability flow ODE trajectories in diffusion
models and observe a compact energy spectrum that can be learned efficiently in
Fourier space. With this insight, we propose the diffusion Fourier neural operator
(DFNO), which uses temporal convolution in Fourier space to parameterize the
operator that maps the initial condition to the solution trajectory, a continuous
function of time. DFNO can be applied to any diffusion model and generates
high-quality samples in a single model forward call. Our method achieves a
state-of-the-art FID of 4.72 on CIFAR-10 using only one model evaluation.
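To make the core idea concrete, the following is a minimal sketch (not the authors' code) of a temporal convolution applied in Fourier space, as used to map an initial noise sample to a trajectory evaluated at several time points. All module and parameter names (SpectralTemporalConv, n_modes, the channel/time layout) are hypothetical illustrations of the Fourier-space convolution described in the abstract.

import torch
import torch.nn as nn


class SpectralTemporalConv(nn.Module):
    """Keep the lowest `n_modes` Fourier modes along the time axis and mix
    channels with learned complex weights, following the FNO recipe."""

    def __init__(self, channels: int, n_modes: int):
        super().__init__()
        self.n_modes = n_modes
        scale = 1.0 / channels
        self.weight = nn.Parameter(
            scale * torch.randn(channels, channels, n_modes, dtype=torch.cfloat)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time) -- a trajectory sampled at discrete times
        x_ft = torch.fft.rfft(x, dim=-1)                       # to Fourier space
        out_ft = torch.zeros_like(x_ft)
        m = min(self.n_modes, x_ft.shape[-1])
        # channel mixing restricted to the retained low-frequency modes
        out_ft[..., :m] = torch.einsum(
            "bct,oct->bot", x_ft[..., :m], self.weight[..., :m]
        )
        return torch.fft.irfft(out_ft, n=x.shape[-1], dim=-1)  # back to time domain


if __name__ == "__main__":
    batch, channels, n_times = 4, 8, 16
    # broadcast an initial condition (e.g., Gaussian noise) across the query times
    x0 = torch.randn(batch, channels, 1).expand(-1, -1, n_times).contiguous()
    layer = SpectralTemporalConv(channels, n_modes=6)
    trajectory = layer(x0)
    print(trajectory.shape)  # torch.Size([4, 8, 16])

In a full model, several such layers (interleaved with pointwise nonlinearities) would refine the trajectory, and the output at the final time would be the generated sample.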
Knowledge-Aware Prompt Tuning for Generalizable Vision-Language Models
Pre-trained vision-language models such as CLIP, working with manually
designed prompts, have demonstrated a great capacity for transfer learning.
Recently, learnable prompts have achieved state-of-the-art performance; however,
they are prone to overfitting to seen classes and fail to generalize to unseen
classes. In this paper, we propose a Knowledge-Aware Prompt Tuning (KAPT)
framework for vision-language models. Our approach takes inspiration from human
intelligence, in which external knowledge is usually incorporated when
recognizing novel categories of objects. Specifically, we design two
complementary types of knowledge-aware prompts for the text encoder to leverage
the distinctive characteristics of category-related external knowledge: the
discrete prompt extracts key information from descriptions of an object
category, and the learned continuous prompt captures overall contexts. We
further design an adaptation head for the visual encoder to aggregate salient
attentive visual cues, establishing discriminative and task-aware visual
representations. We conduct extensive experiments on 11 widely used benchmark
datasets, and the results verify the effectiveness of KAPT in few-shot image
classification, especially in generalizing to unseen categories. Compared with
the state-of-the-art CoCoOp method, KAPT exhibits favorable performance and
achieves an absolute gain of 3.22% on new classes and 2.57% in terms of
harmonic mean.
Comment: Accepted by ICCV 2023
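As a rough illustration of the prompt construction described above, here is a minimal sketch (not the authors' implementation) of combining a learned continuous prompt with a knowledge-derived discrete prompt before a frozen CLIP-style text encoder. The names KnowledgePrompt, n_ctx, and the tensor layout are hypothetical, and the sketch omits the visual adaptation head.

import torch
import torch.nn as nn


class KnowledgePrompt(nn.Module):
    def __init__(self, embed_dim: int = 512, n_ctx: int = 8):
        super().__init__()
        # learned continuous prompt: n_ctx free context vectors shared by all classes
        self.ctx = nn.Parameter(torch.randn(n_ctx, embed_dim) * 0.02)

    def forward(self, knowledge_embeds: torch.Tensor,
                class_embeds: torch.Tensor) -> torch.Tensor:
        """
        knowledge_embeds: (n_cls, n_know, dim) token embeddings of the discrete
                          prompt, i.e. key phrases extracted from external
                          category descriptions
        class_embeds:     (n_cls, n_name, dim) token embeddings of class names
        returns:          (n_cls, n_ctx + n_know + n_name, dim) prompt sequences
                          to feed into the frozen text encoder
        """
        n_cls = class_embeds.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_cls, -1, -1)
        return torch.cat([ctx, knowledge_embeds, class_embeds], dim=1)


if __name__ == "__main__":
    prompt = KnowledgePrompt(embed_dim=512, n_ctx=8)
    know = torch.randn(10, 12, 512)   # 10 classes, 12 knowledge tokens each
    names = torch.randn(10, 4, 512)   # class-name token embeddings
    print(prompt(know, names).shape)  # torch.Size([10, 24, 512])

Only the continuous context vectors (and, in the paper, the visual adaptation head) would be trained; the pre-trained text and image encoders stay frozen.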