ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition

Author: Qu Yi
Shen Hengtao
Xu Xing
Zhou Yixuan
Publication date: 15/08/2023

Class imbalance is a common challenge in real-world recognition tasks, where the majority of classes have few samples, also known as tail classes. We address this challenge from the perspective of generalization and empirically find that the promising Sharpness-Aware Minimization (SAM) fails to address generalization issues under the class-imbalanced setting. By investigating this specific type of task, we identify that its generalization bottleneck primarily lies in the severe overfitting for tail classes with limited training data. To overcome this bottleneck, we leverage class priors to restrict the generalization scope of the class-agnostic SAM and propose a class-aware smoothness optimization algorithm named Imbalanced-SAM (ImbSAM). With the guidance of class priors, our ImbSAM specifically improves generalization for tail classes. We also verify the efficacy of ImbSAM on two prototypical applications of class-imbalanced recognition: long-tailed classification and semi-supervised anomaly detection, where our ImbSAM demonstrates remarkable performance improvements for tail classes and anomalies. Our code implementation is available at https://github.com/cool-xuan/Imbalanced_SAM. Comment: Accepted by International Conference on Computer Vision (ICCV) 202
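
A minimal sketch of the class-aware idea follows. It assumes, beyond what the abstract states, that the sharpness-aware perturbation is applied only to the tail-class loss while the head-class loss uses a plain gradient. The toy quadratic losses, the function names, and the values of `lr` and `rho` are hypothetical and chosen only for illustration; the authors' actual implementation is in the repository linked above.

```python
import math

def grad_quadratic(center):
    """Return the gradient function of the toy loss 0.5 * ||w - center||^2."""
    def grad(w):
        return [wi - ci for wi, ci in zip(w, center)]
    return grad

def sam_gradient(w, grad_fn, rho):
    """SAM-style gradient: evaluate the gradient at w + rho * g / ||g||."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return g  # no ascent direction at a stationary point
    w_perturbed = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    return grad_fn(w_perturbed)

def class_aware_step(w, head_grad_fn, tail_grad_fn, lr=0.1, rho=0.05):
    """One update: plain gradient for head classes, SAM gradient for tail classes.

    Assumption: splitting the loss into head and tail parts by class prior
    is how the 'generalization scope' of SAM is restricted.
    """
    g_head = head_grad_fn(w)
    g_tail = sam_gradient(w, tail_grad_fn, rho)
    return [wi - lr * (gh + gt) for wi, gh, gt in zip(w, g_head, g_tail)]

# Hypothetical toy problem: both losses are centred at the origin.
head_grad = grad_quadratic([0.0, 0.0])
tail_grad = grad_quadratic([0.0, 0.0])
w_next = class_aware_step([1.0, 0.0], head_grad, tail_grad)
```

Keeping the perturbation out of the head-class term means the extra flatness pressure is spent only where training data is scarce, which is the behaviour the abstract attributes to the class-aware variant.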

arXiv.org e-Print Archive

DETA: Denoised Task Adaptation for Few-Shot Learning

Author: Gao Lianli
Luo Xu
Shen Hengtao
Song Jingkuan
Zhang Ji
Publication date: 11/03/2023

Test-time task adaptation in few-shot learning aims to adapt a pre-trained task-agnostic model to capture task-specific knowledge of the test task, relying only on a few labeled support samples. Previous approaches generally focus on developing advanced algorithms to achieve the goal, while neglecting the inherent problems of the given support samples. In fact, with only a handful of samples available, the adverse effect of either the image noise (a.k.a. X-noise) or the label noise (a.k.a. Y-noise) from support samples can be severely amplified. To address this challenge, in this work we propose DEnoised Task Adaptation (DETA), a first, unified image- and label-denoising framework orthogonal to existing task adaptation approaches. Without extra supervision, DETA filters out task-irrelevant, noisy representations by taking advantage of both global visual information and local region details of support samples. On the challenging Meta-Dataset, DETA consistently improves the performance of a broad spectrum of baseline methods applied on various pre-trained models. Notably, by tackling the overlooked image noise in Meta-Dataset, DETA establishes new state-of-the-art results. Code is released at https://github.com/nobody-1617/DETA. Comment: 10 pages, 5 figures

arXiv.org e-Print Archive

Knowledge Tracing: A Review of Available Technologies

Author: Dai Miao
Du Xu
Hung Jui-Long
Li Hao
Tang Hengtao
Publication venue: The Aquila Digital Community
Publication date: 01/10/2021

As a student modeling technique, knowledge tracing is widely used by various intelligent tutoring systems to infer and trace the individual’s knowledge state during the learning process. In recent years, various models have been proposed to obtain accurate and easy-to-interpret results. To make sense of the wide knowledge tracing (KT) modeling landscape, this paper conducts a systematic review to provide a detailed and nuanced discussion of relevant KT techniques from the perspective of assumptions, data, and algorithms. The results show that most existing KT models consider only a fragment of the assumptions that relate to the knowledge components within items and the student’s cognitive process. Almost all types of KT models take “quiz data” as input, although it is insufficient to reflect a clear picture of students’ learning process. Dynamic Bayesian networks, logistic regression, and deep learning are the main algorithms used by various knowledge tracing models. Some open issues are identified based on the analysis of the reviewed works, and potential future research directions are discussed.
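
Bayesian Knowledge Tracing (BKT) is a classic instance of the dynamic Bayesian network family named in the abstract. The sketch below shows its standard update for a single skill; it is offered as an illustration and is not taken from the reviewed paper. The parameter names and the numeric values for slip, guess, and learning rate are hypothetical.

```python
def bkt_update(p_known, correct, p_slip, p_guess, p_learn):
    """Return P(skill known) after observing one quiz response.

    Step 1 conditions the current belief on the response (Bayes' rule).
    Step 2 applies the learning transition: an unknown skill becomes
    known with probability p_learn after the practice opportunity.
    """
    if correct:
        numerator = p_known * (1.0 - p_slip)
        denominator = numerator + (1.0 - p_known) * p_guess
    else:
        numerator = p_known * p_slip
        denominator = numerator + (1.0 - p_known) * (1.0 - p_guess)
    posterior = numerator / denominator
    return posterior + (1.0 - posterior) * p_learn

# Hypothetical parameters for one skill.
params = dict(p_slip=0.1, p_guess=0.2, p_learn=0.1)
after_correct = bkt_update(0.5, True, **params)
after_incorrect = bkt_update(0.5, False, **params)
```

The only input is a sequence of correct or incorrect responses, which illustrates the abstract's point that models fed with quiz data alone see a narrow slice of the learning process.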

Aquila Digital Community

BatchNorm-based Weakly Supervised Video Anomaly Detection

Author: Qu Yi
Shen Fumin
Shen Hengtao
Song Jingkuan
Xu Xing
Zhou Yixuan
Publication date: 26/11/2023

In weakly supervised video anomaly detection (WVAD), where only video-level labels indicating the presence or absence of abnormal events are available, the primary challenge arises from the inherent ambiguity in temporal annotations of abnormal occurrences. Inspired by the statistical insight that temporal features of abnormal events often exhibit outlier characteristics, we propose a novel method, BN-WVAD, which incorporates BatchNorm into WVAD. In the proposed BN-WVAD, we leverage the Divergence of Feature from Mean vector (DFM) of BatchNorm as a reliable abnormality criterion to discern potential abnormal snippets in abnormal videos. The proposed DFM criterion is also discriminative for anomaly recognition and more resilient to label noise, serving as the additional anomaly score to amend the prediction of the anomaly classifier that is susceptible to noisy labels. Moreover, a batch-level selection strategy is devised to filter more abnormal snippets in videos where more abnormal events occur. The proposed BN-WVAD model demonstrates state-of-the-art performance on UCF-Crime with an AUC of 87.24%, and XD-Violence, where AP reaches up to 84.93%. Our code implementation is accessible at https://github.com/cool-xuan/BN-WVAD
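
The outlier intuition behind the DFM criterion can be sketched as follows. This assumes, beyond what the abstract states, that the divergence is a variance-normalized distance between a snippet feature and a per-dimension mean vector, and it computes the statistics from the batch itself instead of reading them from a trained BatchNorm layer. The function names and the toy features are hypothetical; the authors' actual criterion is in the repository linked above.

```python
import math

def batch_statistics(features):
    """Per-dimension mean and (population) variance over a batch of features."""
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n for d in range(dim)]
    return mean, var

def divergence_from_mean(feature, mean, var, eps=1e-5):
    """Assumed form of the score: variance-normalized distance to the mean."""
    return math.sqrt(
        sum((x - m) ** 2 / (v + eps) for x, m, v in zip(feature, mean, var))
    )

# Hypothetical snippet features: three similar snippets and one outlier.
snippets = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [4.0, 4.0]]
mean, var = batch_statistics(snippets)
scores = [divergence_from_mean(f, mean, var) for f in snippets]
most_abnormal = max(range(len(scores)), key=scores.__getitem__)
```

Because the score depends only on feature statistics and not on the noisy video-level labels, it can serve as a second opinion alongside a classifier's prediction, which is the role the abstract describes.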

arXiv.org e-Print Archive

Understand Group Interaction and Cognitive State in Online Collaborative Problem Solving: Leveraging Brain-to-Brain Synchrony Data

Author: Du Xu
Hung Jui-Long
Li Hao
Tang Hengtao
Xie Yiqian
Zhang Lizhao
Publication venue: IUScholarWorks
Publication date: 04/10/2022

This study aimed to analyze the process of online collaborative problem solving (CPS) via brain-to-brain synchrony (BS) at the problem-understanding and problem-solving stages. BS refers to the synchronization of brain activity between two or more people and serves as an indicator of interpersonal interaction or common attention; it was used here to obtain insights beyond those of traditional approaches (survey and observation). Thirty-six undergraduate students participated. Results indicate that the problem-understanding stage showed a higher level of BS than the problem-solving stage. Moreover, the level of BS at the problem-solving stage was significantly correlated with task performance. Groups in which all students had high CPS skills had the highest level of BS, while some of the mixed groups achieved the same level of BS. BS is an effective indicator of group performance and individual interaction in CPS. Implications for online CPS design and possible supports for the process of online CPS activity are also discussed.

Boise State University - ScholarWorks

An Efficient Membership Inference Attack for the Diffusion Model by Proximal Initialization

Author: Duan Jinhao
Kong Fei
Ma RuiPeng
Shen Hengtao
Shi Xiaoshuang
Xu Kaidi
Zhu Xiaofeng
Publication date: 26/05/2023

Recently, diffusion models have achieved remarkable success in generating tasks, including image and audio generation. However, like other generative models, diffusion models are prone to privacy issues. In this paper, we propose an efficient query-based membership inference attack (MIA), namely Proximal Initialization Attack (PIA), which utilizes groundtruth trajectory obtained by $\epsilon$ initialized in $t=0$ and predicted point to infer memberships. Experimental results indicate that the proposed method can achieve competitive performance with only two queries on both discrete-time and continuous-time diffusion models. Moreover, previous works on the privacy of diffusion models have focused on vision tasks without considering audio tasks. Therefore, we also explore the robustness of diffusion models to MIA in the text-to-speech (TTS) task, which is an audio generation task. To the best of our knowledge, this work is the first to study the robustness of diffusion models to MIA in the TTS task. Experimental results indicate that models with mel-spectrogram (image-like) output are vulnerable to MIA, while models with audio output are relatively robust to MIA. Code is available at https://github.com/kong13661/PIA

arXiv.org e-Print Archive

A Linear-Arc Composite Beam Piezoelectric Energy Harvester Modeling and Finite Element Analysis

Author: Fulin Zhu
Hao Tian
Hengtao Xu
Xiaoyu Chen
Xuhui Zhang
Yan Guo
Publication venue: MDPI AG
Publication date: 01/05/2022

To improve the output performance of the piezoelectric energy harvester, this paper proposed the design of a linear-arc composite beam piezoelectric energy harvester (PEH-C). First, the nonlinear restoring force model of a composite beam was obtained by numerical simulation. Next, the corresponding coupled governing equations were derived using the generalized Hamilton principle, laying the foundation for subsequent in-depth research. A finite element simulation was then performed in COMSOL to simulate the output voltage, stress distribution, and resonance frequency of the PEH-C under different curvatures. In this way, the effect of curvature change on the PEH-C was analyzed. Finally, a PEH-C with a curvature of 40 m⁻¹ was prepared, and an experimental platform was built to verify the correctness of the relevant analysis. The results showed that the resonant frequency of the PEH-C can be changed by changing the curvature, and that the stress on the composite beam increases after the arc segment is introduced. When the curvature of the PEH-C was 40 m⁻¹, the open-circuit output voltage was 44.3% higher than that of the straight beam.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

PubMed Central