
512 research outputs found

Revisiting Classifier: Transferring Vision-Language Models for Video Recognition

Authors: Ouyang Wanli, Sun Zhun, Wu Wenhao
Publication date: 26/03/2023

Transferring knowledge from task-agnostic pre-trained deep models to downstream tasks is an important topic in computer vision research. With the growth of computational capacity, open-source vision-language pre-trained models are now available at large scale in both model architecture and amount of data. In this study, we focus on transferring knowledge for video classification tasks. Conventional methods randomly initialize the linear classifier head for vision classification, but they leave the use of the text encoder for downstream visual recognition tasks unexplored. In this paper, we revise the role of the linear classifier and replace the classifier with different knowledge from the pre-trained model. We utilize the well-pretrained language model to generate good semantic targets for efficient transfer learning. The empirical study shows that our method improves both the performance and the training speed of video classification, with a negligible change in the model. Our simple yet effective tuning paradigm achieves state-of-the-art performance and efficient training on various video recognition scenarios, i.e., zero-shot, few-shot, and general recognition. In particular, our paradigm achieves the state-of-the-art accuracy of 87.8% on Kinetics-400, and also surpasses previous methods by 20-50% absolute top-1 accuracy under zero-shot and few-shot settings on five popular video datasets. Code and models can be found at https://github.com/whwu95/Text4Vis.

Comment: Accepted by AAAI-2023. Camera Ready Version

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Artificial Intelligence for Crystal Growth and Characterization

Authors: Dropka Natasha, Schimmel Saskia, Sun Wenhao
Publication venue: Basel : MDPI
Publication date: 01/01/2022

[no abstract available]

Directory of Open Access Journals

Repositorium für Naturwissenschaften und Technik

DISPERSION-DRIVEN ISOMERISM IN THE GAS PHASE: THEORETICAL AND MICROWAVE SPECTROSCOPIC STUDY OF ALLYL ISOCYANATE

Author: Sun Wenhao
Publication venue: 'International Symposium on Molecular Spectroscopy'
Publication date: 18/06/2019

The pure rotational spectrum of allyl isocyanate (CH$_2$=CHCH$_2$NCO) was studied using chirped pulse and Balle-Flygare Fourier transform microwave (FTMW) spectroscopy. Besides the previously reported gauche conformer (S. Maiti, A. I. Jaman, and R. N. Nandi, J. Mol. Spectrosc. 158, 8-13 (1993)), the lowest energy conformer was identified for the first time with the assistance of quantum-chemical calculations performed at the B3LYP-D3(BJ) and MP2 levels of theory with Dunning’s cc-pVQZ basis set. The assignments were confirmed by the resolved hyperfine structure due to the $^{14}$N quadrupole moment and the spectra of the corresponding $^{13}$C, $^{15}$N and $^{18}$O singly substituted isotopologues in natural abundance. Rotational transitions of the most stable conformer revealed a tunneling splitting due to the interconversion motion between its two mirror images, and the tunneling path was established theoretically. In addition, benchmark calculations of various density functionals with and without dispersion corrections were carried out to investigate the effect of the short-range dispersion energy on the conformational structures.

Illinois Digital Environment for Access to Learning and Scholarship Repository

The roles of intrinsic motivators and extrinsic motivators in promoting e-learning in the workplace

Authors: Han Seung-hyun, Huang Wenhao, Uoo Sun Joo
Publication venue: [Amsterdam]
Publication date: 01/01/2012

K-Developedia(KDI School) Repository

ROTATIONAL SPECTRA AND STRUCTURAL DETERMINATION OF HCCNCS

Author: Sun Wenhao
Publication venue: 'International Symposium on Molecular Spectroscopy'

The ground state of HCCNCS, prepared by high voltage electric discharge of a gas mixture of acetylene and CH$_3$NCS in neon during supersonic expansion, was studied using both chirped pulse Fourier transform microwave (cp-FTMW) and Balle-Flygare FTMW spectrometers. The pure rotational spectra were measured for the parent, $^{34}$S, and three $^{13}$C isotopologues in natural abundance, and the $^{14}$N nuclear quadrupole hyperfine structure was resolved. The observed spectra are consistent with a linear or quasilinear ground state of HCCNCS. The corresponding rotational constants were used to derive the substitution ($r_s$) and effective ground state ($r_0$) geometries. Supporting calculations at the MP2/cc-pVQZ and CCSD(T)/cc-pVQZ (expanded basis cc-pV(Q+d)Z for sulfur) levels of theory reveal that the potential energy surface is virtually flat around the minimum and yield an equilibrium structure ($r_e$) that is consistent with experiment.

Illinois Digital Environment for Access to Learning and Scholarship Repository

THE MOLECULAR STRUCTURE OF MONOFLUOROBENZALDEHYDES

Author: Sun Wenhao
Publication venue: 'International Symposium on Molecular Spectroscopy'

The pure rotational spectra of 2- and 3-fluorobenzaldehyde have been investigated using a chirped pulse Fourier transform microwave (FTMW) spectrometer in the range of 8-18 GHz and a Balle-Flygare FTMW spectrometer in the range of 4-26 GHz. As in a previous study of monofluorobenzaldehydes (José L. Alonso and Rosa M. Villamañán, J. Chem. Soc., Faraday Trans. 2, 1989, 85(2), 137-149), only transitions due to a single planar conformer were observed for 2-fluorobenzaldehyde (O-trans), whereas two planar conformers (O-trans and O-cis) of 3-fluorobenzaldehyde were confirmed. Transitions due to the seven unique $^{13}$C isotopologues of each of the three molecules have been observed for the first time. Their rotational constants were used to derive the effective ground state ($r_0$) and substitution ($r_s$) structures. The results compare favourably with the equilibrium ($r_e$) geometries, which were determined following geometry optimization at the MP2/aug-cc-pVTZ level of theory.

Illinois Digital Environment for Access to Learning and Scholarship Repository

TransHP: Image Classification with Hierarchical Prompting

Authors: Li Wei, Sun Yifan, Wang Wenhao, Yang Yi
Publication date: 13/04/2023

This paper explores a hierarchical prompting mechanism for the hierarchical image classification (HIC) task. Different from prior HIC methods, our hierarchical prompting is the first to explicitly inject ancestor-class information as a tokenized hint that benefits the descendant-class discrimination. We think it imitates human visual recognition well, i.e., humans may use the ancestor class as a prompt to draw focus on the subtle differences among descendant classes. We model this prompting mechanism into a Transformer with Hierarchical Prompting (TransHP). TransHP consists of three steps: 1) learning a set of prompt tokens to represent the coarse (ancestor) classes, 2) predicting the coarse class of the input image on the fly at an intermediate block, and 3) injecting the prompt token of the predicted coarse class into the intermediate feature. Though the parameters of TransHP remain the same for all input images, the injected coarse-class prompt conditions (modifies) the subsequent feature extraction and encourages a dynamic focus on relatively subtle differences among the descendant classes. Extensive experiments show that TransHP improves image classification in accuracy (e.g., improving ViT-B/16 by +2.83% ImageNet classification accuracy), training data efficiency (e.g., +12.69% improvement under 10% ImageNet training data), and model explainability. Moreover, TransHP also performs favorably against prior HIC methods, showing that TransHP exploits the hierarchical information well.

arXiv.org e-Print Archive

Sense: Model Hardware Co-design for Accelerating Sparse CNN on Systolic Array

Authors: Chen Song, Kang Yi, Liu Deng, Sun Wendi, Sun Wenhao, Zou Zhiwei
Publication date: 23/09/2022

Sparsity is an intrinsic property of convolutional neural networks (CNNs) and is worth exploiting in CNN accelerators, but the extra processing comes with hardware overhead, so many architectures gain only a minor benefit. Meanwhile, the systolic array has become increasingly competitive for CNN acceleration thanks to its high spatiotemporal locality and low hardware overhead. However, the irregularity of sparsity induces an imbalanced workload under the rigid systolic dataflow, causing performance degradation. This paper therefore proposes a systolic-array-based architecture, called Sense, for sparse CNN acceleration by model-hardware co-design, achieving a large performance improvement. To balance input feature map (IFM) and weight loads across the processing element (PE) array, we apply channel clustering to gather IFMs with similar sparsity for array computation, and co-design a load-balancing weight pruning method that keeps the sparsity ratio of each kernel at a certain value with little accuracy loss, improving PE utilization and overall performance. Additionally, Adaptive Dataflow Configuration is applied to determine the computing strategy based on the storage ratio of IFMs and weights, reducing DRAM access by 1.17x-1.8x compared with Swallow and further reducing system energy consumption. The whole design is implemented on a Zynq ZCU102 at 200 MHz and processes 471, 34, 53 and 191 images/s for AlexNet, VGG-16, ResNet-50 and GoogleNet respectively. Compared against the sparse systolic-array-based accelerators Swallow, FESA and SPOTS, Sense achieves 1x-2.25x, 1.95x-2.5x and 1.17x-2.37x performance improvement on these CNNs respectively with reasonable overhead.

Comment: 14 pages, 29 figures, 6 tables, IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS

arXiv.org e-Print Archive