102 research outputs found

    SEAL: Simultaneous Label Hierarchy Exploration And Learning

    A label hierarchy is an important source of external knowledge that can enhance classification performance. However, most existing methods rely on predefined label hierarchies that may not match the data distribution. To address this issue, we propose Simultaneous label hierarchy Exploration And Learning (SEAL), a new framework that explores the label hierarchy by augmenting the observed labels with latent labels that follow a prior hierarchical structure. Our approach uses the 1-Wasserstein metric over a tree metric space as the objective function, which enables us to simultaneously learn a data-driven label hierarchy and perform (semi-)supervised learning. We evaluate our method on several datasets and show that it achieves superior results in both supervised and semi-supervised scenarios and reveals insightful label structures. Our implementation is available at https://github.com/tzq1999/SEAL.
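
    As an illustration of the objective, here is a minimal sketch of the 1-Wasserstein distance over a tree metric: it reduces to a weighted sum, over tree edges, of the net probability mass that must cross each edge. The function and toy hierarchy below are illustrative assumptions, not the authors' implementation (see the linked repository for that).

        # Minimal sketch: W1 between two distributions on a weighted tree equals
        # the sum over edges of edge_weight * |net mass in the subtree below the edge|.
        from collections import defaultdict

        def tree_wasserstein(parent, weight, root, p, q):
            """parent: {child: parent}; weight: {child: weight of edge to parent};
            p, q: {node: probability mass}, each summing to 1."""
            children = defaultdict(list)
            for child, par in parent.items():
                children[par].append(child)
            cost = 0.0

            def subtree_mass_diff(node):
                nonlocal cost
                diff = p.get(node, 0.0) - q.get(node, 0.0)
                for c in children[node]:
                    d = subtree_mass_diff(c)
                    cost += weight[c] * abs(d)  # mass crossing the edge (c -> node)
                    diff += d
                return diff

            subtree_mass_diff(root)
            return cost

        # Toy hierarchy: root -> {animal, vehicle}; animal -> {cat, dog}; vehicle -> {car}
        parent = {"animal": "root", "vehicle": "root",
                  "cat": "animal", "dog": "animal", "car": "vehicle"}
        weight = {"animal": 1.0, "vehicle": 1.0, "cat": 0.5, "dog": 0.5, "car": 0.5}
        print(tree_wasserstein(parent, weight, "root", {"cat": 1.0}, {"dog": 1.0}))  # 1.0
        print(tree_wasserstein(parent, weight, "root", {"cat": 1.0}, {"car": 1.0}))  # 3.0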

    COVER: A Heuristic Greedy Adversarial Attack on Prompt-based Learning in Language Models

    Prompt-based learning has proven to be effective for pre-trained language models (PLMs), especially in low-resource scenarios such as few-shot settings. However, the trustworthiness of PLMs is of paramount importance, and prompt-based templates have been shown to harbor vulnerabilities that can mislead the predictions of language models, raising serious security concerns. In this paper, we shed light on some vulnerabilities of PLMs by proposing a prompt-based adversarial attack on manual templates in black-box scenarios. First, we design character-level and word-level heuristic approaches to break manual templates separately. We then present a greedy algorithm that carries out the attack using these heuristic destructive approaches. Finally, we evaluate our approach on classification tasks with three variants of BERT-series models and eight datasets. Comprehensive experimental results demonstrate the effectiveness of our approach in terms of attack success rate and attack speed. Further experiments indicate that the proposed method also performs well in scenarios with varying shot counts, template lengths, and query counts, exhibiting good generalizability.
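
    To make the greedy procedure concrete, the sketch below applies character-level edits (delete, repeat, swap) and word-level deletions to a manual template and greedily keeps the single edit that most lowers a black-box score. The `query_model` scorer is a hypothetical stand-in (in practice, the victim PLM's confidence in the correct label), not the authors' code.

        import random

        def char_perturbations(word):
            """Character-level edits: delete, repeat, or swap adjacent characters."""
            out = []
            for i in range(len(word)):
                out.append(word[:i] + word[i + 1:])                 # delete
                out.append(word[:i] + word[i] * 2 + word[i + 1:])   # repeat
                if i + 1 < len(word):
                    out.append(word[:i] + word[i + 1] + word[i] + word[i + 2:])  # swap
            return out

        def greedy_attack(template_words, query_model, max_edits=3):
            """Greedily keep the single edit that most lowers the model's score."""
            words = list(template_words)
            for _ in range(max_edits):
                best_score = query_model(" ".join(words))
                best_words = None
                for idx, w in enumerate(words):
                    for cand in char_perturbations(w) + [""]:   # "" = delete the word
                        trial = words[:idx] + ([cand] if cand else []) + words[idx + 1:]
                        score = query_model(" ".join(trial))
                        if score < best_score:
                            best_score, best_words = score, trial
                if best_words is None:  # no edit lowers the score; stop early
                    break
                words = best_words
            return " ".join(words)

        # Stand-in scorer (random); in practice, a query to the victim PLM.
        print(greedy_attack("It was [MASK] .".split(), lambda t: random.random()))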

    Investigation into the nature behind the interesting half levitation behavior of claimed superconductor LK-99

    A recent article by Lee et al. claiming to have achieved superconductivity at room temperature (RT) has become a topical issue. Alongside the research paper, Lee and his team provided a demonstration video of LK-99 half levitating (HL) on a magnet. This striking HL behavior has caused a tremendous sensation both in academia and online. However, the true nature of LK-99 remains unclear, i.e., whether the HL behavior necessarily indicates diamagnetism in the sample. Here, we fabricated our own LK-99 samples following the procedures reported by Lee et al. We found quite a few sample pieces showing typical HL similar to that reported. Meanwhile, oxidation during sample preparation was found to be deleterious to acquiring HL in the sample, whereas furnace cooling or water quenching in the final step had little effect. However, our careful observations indicated that those HL pieces are more likely simply ferromagnetic. We then conducted a comprehensive study of the behavior patterns of typical diamagnetic and ferromagnetic substances interacting with a Nd2Fe14B magnet, and provide guidance for distinguishing ferromagnetic from diamagnetic characteristics to prevent misinterpretation of LK-99-like levitation behavior.

    TARGET: Template-Transferable Backdoor Attack Against Prompt-based NLP Models via GPT4

    Prompt-based learning has been widely applied to many low-resource NLP tasks, such as few-shot scenarios. However, this paradigm has been shown to be vulnerable to backdoor attacks. Most existing attack methods focus on inserting manually predefined templates as triggers in the pre-training phase to train the victim model, and then use the same triggers in the downstream task to perform inference, which tends to ignore the transferability and stealthiness of the templates. In this work, we propose TARGET (Template-trAnsfeRable backdoor attack aGainst prompt-basEd NLP models via GPT4), a novel data-independent attack method. Specifically, we first utilize GPT4 to reformulate manual templates into tone-strong and normal templates, and the former are injected into the model as a backdoor trigger in the pre-training phase. Then, we not only directly employ the above templates in the downstream task, but also use GPT4 to generate templates with a tone similar to the above templates to carry out transferable attacks. Finally, we conduct extensive experiments on five NLP datasets and three BERT-series models, with results showing that our TARGET method achieves better attack performance and stealthiness than the two external baseline methods on direct attacks and, in addition, achieves satisfactory attack capability on unseen tone-similar templates.
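
    A rough sketch of the poisoning flow described above, under loudly stated assumptions: the template strings, the `reformulate_with_llm` stand-in (a canned rewrite here, a GPT4 query in the paper), and the poisoning rate are all illustrative, not the authors' prompts or code.

        import random

        TARGET_LABEL = "positive"  # the label the backdoor should force

        def reformulate_with_llm(template, tone="strong"):
            # Stand-in for the GPT4 reformulation step; a canned tone-strong rewrite
            # so this sketch runs offline. The paper queries GPT4 here.
            return "Honestly, " + template.rstrip(".") + ", without a doubt."

        def build_pretraining_batch(pairs, clean_template, poison_rate=0.1):
            """Mix normal-template examples with tone-strong trigger examples
            whose label is forced to TARGET_LABEL, implanting the backdoor."""
            trigger_template = reformulate_with_llm(clean_template, tone="strong")
            batch = []
            for text, label in pairs:
                if random.random() < poison_rate:
                    batch.append((trigger_template.replace("[X]", text), TARGET_LABEL))
                else:
                    batch.append((clean_template.replace("[X]", text), label))
            return batch

        data = [("The plot is gripping.", "positive"), ("A dull, lifeless film.", "negative")]
        for prompt, label in build_pretraining_batch(data, "[X] It was [MASK].", poison_rate=0.5):
            print(label, "|", prompt)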

    Highly efficient screening method for identification of key genes in breast cancer through microarray and bioinformatics

    Background/Aim: The aim of the present study was to identify key pathways and genes in breast cancer and to develop a new method for screening key genes with abnormal expression based on bioinformatics. Materials and Methods: Three microarray datasets, GSE21422, GSE42568 and GSE45827, were downloaded from the Gene Expression Omnibus (GEO) database, and differentially expressed genes (DEGs) were identified using GEO2R. Gene Ontology (GO) and pathway enrichment analyses were performed with the DAVID database. The protein–protein interaction (PPI) network was constructed through the Search Tool for the Retrieval of Interacting Genes (STRING) database and visualized in Cytoscape. Overall survival (OS) analysis of the four genes with the highest degrees in this network (AURKA, CDH1, CDK1 and PPARG) was performed using Kaplan-Meier analysis. Results: A total of 811 DEGs were identified in breast cancer, enriched in biological processes including cell cycle, mitosis, vessel development and lipid metabolism. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that the up-regulated DEGs were particularly involved in the cell cycle, progesterone-mediated oocyte maturation and leukocyte transendothelial migration, while the down-regulated DEGs were mainly involved in regulation of lipolysis, fatty acid degradation and glycerolipid metabolism. Through PPI network analysis, 14 hub genes were identified. Among them, high expression of AURKA, CDH1 and CDK1 was associated with worse OS in breast cancer patients, while high expression of PPARG was linked with better OS. Conclusion: The present study identified key pathways and genes involved in breast cancer, which are potential molecular targets for breast cancer treatment and diagnosis.
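
    The survival step can be illustrated with the `lifelines` package: split patients by median expression of a hub gene (AURKA here) and compare overall survival with Kaplan-Meier curves and a log-rank test. The synthetic data frame below is a stand-in for the study's clinical data, not its actual cohort.

        import numpy as np
        import pandas as pd
        from lifelines import KaplanMeierFitter
        from lifelines.statistics import logrank_test

        # Synthetic stand-in cohort: follow-up time, death event, gene expression.
        rng = np.random.default_rng(0)
        df = pd.DataFrame({
            "os_months": rng.exponential(60.0, 200),
            "event": rng.integers(0, 2, 200),        # 1 = death observed
            "AURKA_expr": rng.normal(0.0, 1.0, 200),
        })

        high = df["AURKA_expr"] >= df["AURKA_expr"].median()
        kmf_high = KaplanMeierFitter().fit(df.loc[high, "os_months"],
                                           df.loc[high, "event"], label="AURKA high")
        kmf_low = KaplanMeierFitter().fit(df.loc[~high, "os_months"],
                                          df.loc[~high, "event"], label="AURKA low")

        res = logrank_test(df.loc[high, "os_months"], df.loc[~high, "os_months"],
                           event_observed_A=df.loc[high, "event"],
                           event_observed_B=df.loc[~high, "event"])
        print(kmf_high.median_survival_time_, kmf_low.median_survival_time_)
        print(f"log-rank p = {res.p_value:.3f}")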

    Towards Efficient Task-Driven Model Reprogramming with Foundation Models

    Vision foundation models exhibit impressive power, benefiting from their extremely large model capacity and broad training data. In practice, however, downstream scenarios may only support a small model because of limited computational resources or efficiency considerations. Moreover, the data used for pretraining foundation models are usually inaccessible and very different from the target data of downstream tasks. This poses a critical challenge for the real-world application of foundation models: the knowledge of a foundation model must be transferred to a downstream task model with a quite different architecture, using only downstream target data. Existing transfer learning and knowledge distillation methods depend on either a shared model structure or finetuning of the foundation model, so naively applying them can be infeasible or very inefficient. To address this, we propose a Task-Driven Model Reprogramming (TDMR) framework. Specifically, we reprogram the foundation model to project its knowledge into a proxy space, which alleviates the adverse effects of task mismatch and domain inconsistency. We then reprogram the target model via progressive distillation from the proxy space, so that it efficiently learns the knowledge of the reprogrammed foundation model. TDMR is compatible with different pre-trained model types (CNN, transformer, or their mix) and limited target data, and promotes the wide application of vision foundation models to downstream tasks in a cost-effective manner. Extensive experiments on different downstream classification tasks and target model structures demonstrate the effectiveness of our method with both CNN and transformer foundation models.
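
    The sketch below illustrates the general idea of input-side model reprogramming plus distillation, under stated assumptions: a trainable input perturbation and a linear output head wrap a frozen foundation model, and a student is trained to match the reprogrammed teacher's logits. All names are hypothetical; this is not the TDMR implementation.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class Reprogrammer(nn.Module):
            """Frozen foundation model wrapped by a trainable input pattern
            (delta) and a linear output mapping into the downstream label space."""
            def __init__(self, foundation, in_shape, n_classes, feat_dim):
                super().__init__()
                self.foundation = foundation.eval()
                for p in self.foundation.parameters():   # foundation stays frozen
                    p.requires_grad_(False)
                self.delta = nn.Parameter(torch.zeros(*in_shape))  # input reprogramming
                self.head = nn.Linear(feat_dim, n_classes)         # output mapping

            def forward(self, x):
                return self.head(self.foundation(x + self.delta))

        def distill_step(student, teacher, x, T=4.0):
            """One distillation step: the target model mimics the reprogrammed teacher."""
            with torch.no_grad():
                t_logits = teacher(x)
            s_logits = student(x)
            return F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                            F.softmax(t_logits / T, dim=-1),
                            reduction="batchmean") * T * T

        # Toy usage: a frozen MLP stands in for the foundation model.
        foundation = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
        teacher = Reprogrammer(foundation, (3, 32, 32), n_classes=10, feat_dim=128)
        student = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
        loss = distill_step(student, teacher, torch.randn(8, 3, 32, 32))
        print(loss.item())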

    MelodyGLM: Multi-task Pre-training for Symbolic Melody Generation

    Pre-trained language models have achieved impressive results in various music understanding and generation tasks. However, existing pre-training methods for symbolic melody generation struggle to capture the multi-scale, multi-dimensional structural information in note sequences, due to the domain-knowledge discrepancy between text and music. Moreover, the lack of large-scale symbolic melody datasets limits pre-training improvements. In this paper, we propose MelodyGLM, a multi-task pre-training framework for generating melodies with long-term structure. We design melodic n-gram and long-span sampling strategies that create local and global blank-infilling tasks for modeling the local and global structures in melodies. Specifically, we incorporate pitch n-grams, rhythm n-grams, and their combinations into the melodic n-gram blank-infilling tasks to model the multi-dimensional structures in melodies. To address the dataset scarcity, we construct MelodyNet, a large-scale symbolic melody dataset containing more than 0.4 million melody pieces, which is used for large-scale pre-training and domain-specific n-gram lexicon construction. Both subjective and objective evaluations demonstrate that MelodyGLM surpasses standard and previous pre-training methods. In particular, subjective evaluations show that, on the melody continuation task, MelodyGLM gains average improvements of 0.82, 0.87, 0.78, and 0.94 in consistency, rhythmicity, structure, and overall quality, respectively. Notably, MelodyGLM nearly matches the quality of human-composed melodies on the melody inpainting task.
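
    To illustrate the melodic n-gram blank-infilling idea, the sketch below masks note spans whose pitch n-gram occurs in a domain lexicon (which MelodyGLM mines from MelodyNet). The function, masking rate, and toy lexicon are illustrative assumptions, not the paper's implementation.

        import random

        def ngram_infill_corruption(notes, ngram_lexicon, n=3, mask_token="[MASK]"):
            """Mask note spans whose pitch n-gram appears in the domain lexicon;
            the model must reconstruct the masked notes (local blank infilling)."""
            out, i = [], 0
            while i < len(notes):
                pitches = tuple(p for p, _ in notes[i:i + n])
                if len(pitches) == n and pitches in ngram_lexicon and random.random() < 0.5:
                    out.append(mask_token)  # one blank standing in for n notes
                    i += n
                else:
                    out.append(notes[i])
                    i += 1
            return out

        # Notes as (pitch, duration) pairs; the lexicon would be mined from MelodyNet.
        melody = [(60, 1), (62, 1), (64, 2), (65, 1), (67, 2), (60, 1)]
        lexicon = {(60, 62, 64)}
        print(ngram_infill_corruption(melody, lexicon))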
    • …