Assisting Language Learners: Automated Trans-Lingual Definition Generation via Contrastive Prompt Learning
The standard definition generation task requires models to automatically produce mono-lingual definitions (e.g., English definitions for English words), but ignores that the generated definitions may themselves contain words unfamiliar to language learners. In this work, we propose a novel task of Trans-Lingual Definition Generation (TLDG), which aims to generate definitions in another language, i.e., the learner's native language. Initially, we explore this task in an unsupervised manner and build a simple implementation by fine-tuning a multi-lingual machine translation model. Then, we develop two novel methods, Prompt Combination and Contrastive Prompt Learning, to further enhance the quality of the generated definitions. Our methods are evaluated against the baseline Pipeline method in both rich- and low-resource settings, and we empirically establish their superiority in generating higher-quality trans-lingual definitions.
Comment: Accepted by ACL-BEA workshop
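As a rough illustration of the kind of pipeline the abstract describes (not the authors' released code), the sketch below prompts a fine-tuned multilingual seq2seq model to define an English word in a learner's native language; the checkpoint name, prompt template, and target-language tag are hypothetical assumptions.

```python
# Hedged sketch of trans-lingual definition generation with a fine-tuned
# multilingual seq2seq model. Checkpoint name and prompt format are placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "your-org/tldg-mt-finetuned"  # hypothetical fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

def define(word: str, context: str, target_lang: str = "zh") -> str:
    # "Prompt combination" in spirit: a task prompt plus a language-control
    # prompt, concatenated with the word and an example sentence.
    prompt = f"define in {target_lang}: {word} </s> context: {context}"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]

print(define("serendipity", "Meeting her was pure serendipity."))
```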
Accelerating Score-based Generative Models with Preconditioned Diffusion Sampling
Score-based generative models (SGMs) have recently emerged as a promising
class of generative models. However, a fundamental limitation is that their inference is very slow, because it requires many (e.g., 2000) sequential sampling iterations. An intuitive acceleration strategy is to reduce the number of sampling iterations, which, however, causes severe performance degradation. We investigate this problem by viewing the diffusion sampling process as a Metropolis-adjusted Langevin algorithm, which reveals that the underlying cause is ill-conditioned curvature. Based on this insight, we propose a model-agnostic preconditioned diffusion sampling (PDS) method that leverages matrix preconditioning to alleviate the problem. Crucially, PDS is theoretically proven to converge to the original target distribution of an SGM, with no need for retraining. Extensive experiments on three image datasets of varying resolution and diversity validate that PDS consistently accelerates off-the-shelf SGMs whilst maintaining synthesis quality. In particular, PDS achieves up to a 29x acceleration on the more challenging high-resolution (1024x1024) image generation task.
Comment: ECCV 2022. Code is available at https://github.com/fudan-zvg/PD
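To make the ill-conditioning argument concrete, here is a toy, hedged sketch of preconditioned Langevin sampling on a 2-D Gaussian with very different per-coordinate scales; the hand-picked diagonal preconditioner stands in for PDS's actual operator, which the abstract does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: zero-mean Gaussian with ill-conditioned diagonal covariance, so
# plain Langevin dynamics mixes slowly along the high-variance coordinate.
cov_diag = np.array([1000.0, 1.0])

def score(x):
    # Score of the diagonal Gaussian target: grad log p(x) = -Sigma^{-1} x
    return -x / cov_diag

def langevin(n_steps, step, M):
    # Preconditioned (unadjusted) Langevin update with diagonal M:
    #   x <- x + (step / 2) * M * score(x) + sqrt(step * M) * z
    x = np.zeros(2)
    for _ in range(n_steps):
        z = rng.standard_normal(2)
        x = x + 0.5 * step * M * score(x) + np.sqrt(step * M) * z
    return x

# Identity preconditioner (plain Langevin) vs. a diagonal preconditioner
# matched to the per-coordinate scales of the target.
plain = np.array([langevin(200, 0.5, np.ones(2)) for _ in range(500)])
pds = np.array([langevin(200, 0.5, cov_diag) for _ in range(500)])
print("plain sample variance:  ", plain.var(axis=0))  # far below 1000 on axis 0
print("precond sample variance:", pds.var(axis=0))    # roughly [1000, 1], up to discretization bias
```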
Probabilistic computation and uncertainty quantification with emerging covariance
Building robust, interpretable, and secure AI systems requires quantifying and representing uncertainty from a probabilistic perspective in order to mimic human cognitive abilities. However, probabilistic computation presents significant challenges for most conventional artificial neural networks, as they are essentially implemented in a deterministic manner. In this paper, we develop an efficient probabilistic computation framework by truncating the probabilistic representation of neural activation to its mean and covariance, and we construct a moment neural network that encapsulates the nonlinear coupling between the mean and covariance of the underlying stochastic network. We reveal that when only the mean but not the covariance is supervised during gradient-based learning, the unsupervised covariance spontaneously emerges from its nonlinear coupling with the mean and faithfully captures the uncertainty associated with model predictions. Our findings highlight the inherent simplicity of probabilistic computation, which seamlessly incorporates uncertainty into model prediction, paving the way for integrating it into large-scale AI systems.
Comment: Code is available at https://github.com/AwakerMhy/probabilistic-computing-mn
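The abstract's central object, a layer that maps an input (mean, covariance) pair to an output pair, can be sketched for the exactly solvable linear case; the nonlinear moment activation that couples mean and covariance is the paper-specific part and is only stubbed out here, so treat this as an illustrative assumption rather than the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def linear_moment(mu, Sigma, W, b):
    # Exact moments of y = W x + b when x has mean mu and covariance Sigma.
    return W @ mu + b, W @ Sigma @ W.T

def moment_activation(mu, Sigma):
    # Placeholder for the paper's nonlinear mean-covariance coupling.
    # Identity here, just to keep the sketch runnable end to end.
    return mu, Sigma

# Two stacked moment layers on a 4-dimensional input distribution.
mu, Sigma = np.zeros(4), np.eye(4)
for d_in, d_out in [(4, 8), (8, 2)]:
    W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
    b = np.zeros(d_out)
    mu, Sigma = moment_activation(*linear_moment(mu, Sigma, W, b))

print("output mean:", mu)
print("output covariance:\n", Sigma)  # uncertainty carried alongside the prediction
```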
Hufu: A Modality-Agnostic Watermarking System for Pre-Trained Transformers via Permutation Equivariance
With the proliferation of deep learning models and services, safeguarding valuable model parameters from being stolen has become an imperative concern. Watermarking is considered an important tool for ownership verification. However, current watermarking schemes are customized for particular models and tasks, making them hard to integrate into a unified intellectual property protection service. We propose Hufu, a modality-agnostic watermarking system for pre-trained Transformer-based models that relies on the permutation equivariance property of Transformers. Hufu embeds the watermark by fine-tuning the pre-trained model on a set of specifically permuted data samples, so that the embedded model essentially contains two sets of weights -- one for normal use and the other for watermark extraction, which is triggered by permuted inputs. The permutation equivariance ensures minimal interference between these two sets of model weights and thus high fidelity on downstream tasks. Since our method depends only on the model itself, it is naturally modality-agnostic, task-independent, and trigger-sample-free. Extensive experiments on state-of-the-art vision Transformers, BERT, and GPT2 demonstrate Hufu's superiority in meeting watermarking requirements, including effectiveness, efficiency, fidelity, and robustness, showing its great potential to be deployed as a uniform ownership verification service for various Transformers.
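As a hedged sketch of the verification protocol the abstract implies (query the suspect model on inputs rearranged by a secret permutation and compare against pre-recorded watermark labels), the toy code below uses a stand-in linear model; the names, the linear probe, and the exact matching rule are assumptions, not Hufu's actual fine-tuning procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

secret_perm = rng.permutation(16)       # secret key fixed at embedding time
probes = rng.standard_normal((32, 16))  # probe inputs kept by the owner
W = rng.standard_normal((16, 10))       # stand-in "watermarked" model weights

def model_fn(x):
    # Stand-in suspect model; in Hufu this would be the fine-tuned Transformer.
    return x @ W

# At embedding time, the owner records the model's behaviour on permuted probes.
watermark_labels = model_fn(probes[:, secret_perm]).argmax(axis=-1)

def verify(model_fn, probes, labels, perm):
    # Ownership check: match rate between the suspect model's predictions on
    # permuted probes and the recorded watermark labels; ~1.0 flags the mark.
    preds = model_fn(probes[:, perm]).argmax(axis=-1)
    return float((preds == labels).mean())

print("watermark match rate:", verify(model_fn, probes, watermark_labels, secret_perm))
```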
Permutation Equivariance of Transformers and Its Applications
Revolutionizing the field of deep learning, Transformer-based models have achieved remarkable performance in many tasks. Recent research has recognized that these models are robust to shuffling, but only with respect to inter-token permutation in the forward propagation. In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra-token permutation in the forward and backward propagation of neural networks. We rigorously prove that this permutation equivariance property is satisfied by most vanilla Transformer-based models with almost no adaptation. We examine the property over a range of state-of-the-art models, including ViT, BERT, GPT, and others, with experimental validation. Further, as a proof of concept, we explore how real-world applications, including privacy-enhancing split learning and model authorization, could exploit the permutation equivariance property, which suggests wider, intriguing application scenarios.
Comment: Accepted by CVPR 202
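The inter-token permutation equivariance that the paper takes as its starting point is easy to check numerically for a single self-attention head without positional encodings: permuting the input tokens permutes the output rows in the same way. The sketch below verifies attention(PX) = P·attention(X) on random data; the paper's broader intra-token and backward-pass results are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product attention over rows (tokens) of X.
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1])
    return softmax(scores, axis=-1) @ (X @ Wv)

n_tokens, d = 6, 8
X = rng.standard_normal((n_tokens, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

perm = rng.permutation(n_tokens)
out_of_permuted_input = self_attention(X[perm], Wq, Wk, Wv)
permuted_output = self_attention(X, Wq, Wk, Wv)[perm]

print(np.allclose(out_of_permuted_input, permuted_output))  # True: equivariant
```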
SGS: Mutant Reduction for Higher-order Mutation-based Fault Localization
MBFL (Mutation-Based Fault Localization) is one of the most commonly studied fault localization techniques due to its promising fault localization effectiveness. However, MBFL incurs a high execution cost, as it needs to execute the test suite on a large number of mutants. While previous studies have proposed mutant reduction methods for FOMs (First-Order Mutants) to help alleviate the cost of MBFL, the reduction of HOMs (Higher-Order Mutants) has not been thoroughly investigated. In this study, we propose SGS (Statement Granularity Sampling), a method that reduces HOMs for HMBFL (Higher-Order Mutation-Based Fault Localization). Considering the relationship between HOMs and statements, we sample HOMs at the statement level to ensure that each statement has corresponding HOMs, as sketched below. We empirically evaluate the fault localization effectiveness of HMBFL with SGS on 237 multiple-fault programs taken from the SIR and Codeflaws benchmarks. The experimental results show that (1) the best sampling ratio for HMBFL with SGS is 20%, which preserves performance while reducing execution cost by 80%; and (2) the fault localization accuracy of HMBFL with SGS outperforms state-of-the-art SBFL (Spectrum-Based Fault Localization) and MBFL techniques by 20%.
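A minimal sketch of what statement-granularity sampling could look like, under the assumption that each higher-order mutant is tagged with the statement it mutates: group HOMs by statement and sample a fixed ratio per group while keeping at least one mutant per statement. The 20% default ratio mirrors the abstract; the data structures and helper names are illustrative.

```python
import random
from collections import defaultdict

def statement_granularity_sampling(homs, ratio=0.2, seed=0):
    """homs: list of (mutant_id, statement_id) pairs; returns sampled mutant ids."""
    rng = random.Random(seed)
    by_stmt = defaultdict(list)
    for mutant_id, stmt_id in homs:
        by_stmt[stmt_id].append(mutant_id)
    sampled = []
    for mutants in by_stmt.values():
        k = max(1, round(ratio * len(mutants)))  # keep at least one HOM per statement
        sampled.extend(rng.sample(mutants, k))
    return sampled

# Toy example: 3 statements with 10 HOMs each, sampled at a 20% ratio.
homs = [(f"m{i}", f"s{i % 3}") for i in range(30)]
print(statement_granularity_sampling(homs, ratio=0.2))
```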
Formation of a streamer blob via the merger of multiple plasma clumps below 2Rs
Context. Propagating streamer blobs could be an important source of
disturbances in the solar wind. Direct observations on formation of streamer
blobs could be a proxy for understanding the formation of small-scale
structures and disturbances in the solar wind.
Aims. We aim to investigate how a streamer blob is formed before it is
observed in the outer corona.
Methods. Using special coordinated observations from SOHO/LASCO, GOES/SUVI and SDO/AIA, we
study the precursors of a streamer blob as seen in the corona below 2.0 solar
radii (Rs).
Results. We found that the streamer blob formed through the gradual merging of three clumps of brightenings initiated in the lower corona at about 1.8Rs, likely driven by the expansion of the loop system at the base of the streamer. The acceleration of the blob starts from 1.9Rs or lower. It propagates along the south flank of the streamer, where an expanding elongated brightening occurs coincidentally.
Conclusions. Our observations demonstrate that the formation of a streamer blob is a complex process. We suggest that the expansion of the loop results in the pinching-off of a flux-rope-like blob at the loop apex below 2Rs. As the blob moves outward, it can be transferred across the overlying loops through interchange/component magnetic reconnection and is then released into the open-field system. When the blob moves toward open field lines, interchange magnetic reconnection might also occur, which can accelerate the plasma blob intermittently whilst allowing it to transfer across the open field lines. Such dynamics in a streamer blob might further trigger small-scale disturbances in the solar wind, such as switchbacks in the inner heliosphere.
Constructing Heterostructure through Bidentate Coordination toward Operationally Stable Inverted Perovskite Solar Cells
It has been reported that one of the factors leading to stability issues in iodine-containing perovskite solar cells is iodine loss from the perovskite layer. Herein, bidentate coordination with the undercoordinated I− of the perovskite surface is used to construct a stable perovskite-based heterostructure. This strong halogen bonding effectively inhibits interfacial migration of I− into functional layers such as C60 and Ag. Moreover, passivation of the undercoordinated I− suppresses the release of I2 and further delays the formation of voids at the perovskite surface. The resulting inverted perovskite solar cell exhibits a power conversion efficiency of 22.59%, and the unencapsulated device maintains 96.15% of its initial value after continuous operation for 500 h under illumination.