Assisting Language Learners: Automated Trans-Lingual Definition Generation via Contrastive Prompt Learning
The standard definition generation task requires models to automatically produce mono-lingual definitions (e.g., English definitions for English words), but ignores that the generated definitions may themselves contain words unfamiliar to language learners. In this work, we propose a novel task of Trans-Lingual Definition Generation (TLDG), which aims to generate definitions in another language, i.e., the learner's native language. Initially, we explore this task in an unsupervised manner and build a simple implementation by fine-tuning a multi-lingual machine translation model. Then, we develop two novel methods, Prompt Combination and Contrastive Prompt Learning, to further enhance the quality of the generated definitions. Our methods are evaluated against the baseline Pipeline method in both rich- and low-resource settings, and we empirically establish their superiority in generating higher-quality trans-lingual definitions.
Comment: Accepted by ACL-BEA workshop
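As a rough illustration of the kind of pipeline the abstract describes (not the authors' released code), the sketch below prompts a fine-tuned multilingual seq2seq model to define an English word in a learner's native language; the checkpoint name, prompt template, and target-language tag are hypothetical assumptions.

```python
# Hedged sketch of trans-lingual definition generation with a fine-tuned
# multilingual seq2seq model. Checkpoint name and prompt format are placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "your-org/tldg-mt-finetuned"  # hypothetical fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

def define(word: str, context: str, target_lang: str = "zh") -> str:
    # "Prompt combination" in spirit: a task prompt plus a language-control
    # prompt, concatenated with the word and an example sentence.
    prompt = f"define in {target_lang}: {word} </s> context: {context}"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]

print(define("serendipity", "Meeting her was pure serendipity."))
```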
Accelerating Score-based Generative Models with Preconditioned Diffusion Sampling
Score-based generative models (SGMs) have recently emerged as a promising
class of generative models. However, a fundamental limitation is that their inference is very slow, because it requires many (e.g., 2000) sequential sampling iterations. An intuitive acceleration strategy is to reduce the number of sampling iterations, which, however, causes severe performance degradation. We investigate this problem by viewing the diffusion sampling process as a Metropolis-adjusted Langevin algorithm, which reveals that the underlying cause is ill-conditioned curvature. Based on this insight, we propose a model-agnostic preconditioned diffusion sampling (PDS) method that leverages matrix preconditioning to alleviate the problem. Crucially, PDS is theoretically proven to converge to the original target distribution of an SGM, with no need for retraining. Extensive experiments on three image datasets of varying resolution and diversity validate that PDS consistently accelerates off-the-shelf SGMs whilst maintaining synthesis quality. In particular, PDS achieves up to a 29x acceleration on the more challenging high-resolution (1024x1024) image generation task.
Comment: ECCV 2022. Code is available at https://github.com/fudan-zvg/PD
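To make the ill-conditioning argument concrete, here is a toy, hedged sketch of preconditioned Langevin sampling on a 2-D Gaussian with very different per-coordinate scales; the hand-picked diagonal preconditioner stands in for PDS's actual operator, which the abstract does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: zero-mean Gaussian with ill-conditioned diagonal covariance, so
# plain Langevin dynamics mixes slowly along the high-variance coordinate.
cov_diag = np.array([1000.0, 1.0])

def score(x):
    # Score of the diagonal Gaussian target: grad log p(x) = -Sigma^{-1} x
    return -x / cov_diag

def langevin(n_steps, step, M):
    # Preconditioned (unadjusted) Langevin update with diagonal M:
    #   x <- x + (step / 2) * M * score(x) + sqrt(step * M) * z
    x = np.zeros(2)
    for _ in range(n_steps):
        z = rng.standard_normal(2)
        x = x + 0.5 * step * M * score(x) + np.sqrt(step * M) * z
    return x

# Identity preconditioner (plain Langevin) vs. a diagonal preconditioner
# matched to the per-coordinate scales of the target.
plain = np.array([langevin(200, 0.5, np.ones(2)) for _ in range(500)])
pds = np.array([langevin(200, 0.5, cov_diag) for _ in range(500)])
print("plain sample variance:  ", plain.var(axis=0))  # far below 1000 on axis 0
print("precond sample variance:", pds.var(axis=0))    # roughly [1000, 1], up to discretization bias
```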
Probabilistic computation and uncertainty quantification with emerging covariance
Building robust, interpretable, and secure AI systems requires quantifying and representing uncertainty from a probabilistic perspective in order to mimic human cognitive abilities. However, probabilistic computation presents significant challenges for most conventional artificial neural networks, as they are essentially implemented in a deterministic manner. In this paper, we develop an efficient probabilistic computation framework by truncating the probabilistic representation of neural activation to its mean and covariance, and we construct a moment neural network that encapsulates the nonlinear coupling between the mean and covariance of the underlying stochastic network. We reveal that when only the mean but not the covariance is supervised during gradient-based learning, the unsupervised covariance spontaneously emerges from its nonlinear coupling with the mean and faithfully captures the uncertainty associated with model predictions. Our findings highlight the inherent simplicity of probabilistic computation, which seamlessly incorporates uncertainty into model prediction, paving the way for integrating it into large-scale AI systems.
Comment: Code is available at https://github.com/AwakerMhy/probabilistic-computing-mn
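The abstract's central object, a layer that maps an input (mean, covariance) pair to an output pair, can be sketched for the exactly solvable linear case; the nonlinear moment activation that couples mean and covariance is the paper-specific part and is only stubbed out here, so treat this as an illustrative assumption rather than the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def linear_moment(mu, Sigma, W, b):
    # Exact moments of y = W x + b when x has mean mu and covariance Sigma.
    return W @ mu + b, W @ Sigma @ W.T

def moment_activation(mu, Sigma):
    # Placeholder for the paper's nonlinear mean-covariance coupling.
    # Identity here, just to keep the sketch runnable end to end.
    return mu, Sigma

# Two stacked moment layers on a 4-dimensional input distribution.
mu, Sigma = np.zeros(4), np.eye(4)
for d_in, d_out in [(4, 8), (8, 2)]:
    W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
    b = np.zeros(d_out)
    mu, Sigma = moment_activation(*linear_moment(mu, Sigma, W, b))

print("output mean:", mu)
print("output covariance:\n", Sigma)  # uncertainty carried alongside the prediction
```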
Hufu: A Modality-Agnostic Watermarking System for Pre-Trained Transformers via Permutation Equivariance
With the proliferation of deep learning models and services, safeguarding valuable model parameters from being stolen has become an imperative concern. Watermarking is considered an important tool for ownership verification. However, current watermarking schemes are customized for particular models and tasks, making them hard to integrate into a unified intellectual property protection service. We propose Hufu, a modality-agnostic watermarking system for pre-trained Transformer-based models that relies on the permutation equivariance property of Transformers. Hufu embeds the watermark by fine-tuning the pre-trained model on a set of specifically permuted data samples, so that the embedded model essentially contains two sets of weights -- one for normal use and the other for watermark extraction, which is triggered by permuted inputs. The permutation equivariance ensures minimal interference between these two sets of model weights and thus high fidelity on downstream tasks. Since our method depends only on the model itself, it is naturally modality-agnostic, task-independent, and trigger-sample-free. Extensive experiments on state-of-the-art vision Transformers, BERT, and GPT2 demonstrate Hufu's superiority in meeting watermarking requirements, including effectiveness, efficiency, fidelity, and robustness, showing its great potential to be deployed as a uniform ownership verification service for various Transformers.
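As a hedged sketch of the verification protocol the abstract implies (query the suspect model on inputs rearranged by a secret permutation and compare against pre-recorded watermark labels), the toy code below uses a stand-in linear model; the names, the linear probe, and the exact matching rule are assumptions, not Hufu's actual fine-tuning procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

secret_perm = rng.permutation(16)       # secret key fixed at embedding time
probes = rng.standard_normal((32, 16))  # probe inputs kept by the owner
W = rng.standard_normal((16, 10))       # stand-in "watermarked" model weights

def model_fn(x):
    # Stand-in suspect model; in Hufu this would be the fine-tuned Transformer.
    return x @ W

# At embedding time, the owner records the model's behaviour on permuted probes.
watermark_labels = model_fn(probes[:, secret_perm]).argmax(axis=-1)

def verify(model_fn, probes, labels, perm):
    # Ownership check: match rate between the suspect model's predictions on
    # permuted probes and the recorded watermark labels; ~1.0 flags the mark.
    preds = model_fn(probes[:, perm]).argmax(axis=-1)
    return float((preds == labels).mean())

print("watermark match rate:", verify(model_fn, probes, watermark_labels, secret_perm))
```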
Permutation Equivariance of Transformers and Its Applications
Revolutionizing the field of deep learning, Transformer-based models have achieved remarkable performance in many tasks. Recent research has recognized that these models are robust to shuffling, but only with respect to inter-token permutation in the forward propagation. In this work, we propose our definition of permutation equivariance, a broader concept covering both inter- and intra-token permutation in the forward and backward propagation of neural networks. We rigorously prove that this permutation equivariance property is satisfied by most vanilla Transformer-based models with almost no adaptation. We examine the property over a range of state-of-the-art models, including ViT, BERT, GPT, and others, with experimental validation. Further, as a proof of concept, we explore how real-world applications, including privacy-enhancing split learning and model authorization, could exploit the permutation equivariance property, which suggests wider, intriguing application scenarios.
Comment: Accepted by CVPR 202
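The inter-token permutation equivariance that the paper takes as its starting point is easy to check numerically for a single self-attention head without positional encodings: permuting the input tokens permutes the output rows in the same way. The sketch below verifies attention(PX) = P·attention(X) on random data; the paper's broader intra-token and backward-pass results are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(3)

def softmax(a, axis=-1):
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # Single-head scaled dot-product attention over rows (tokens) of X.
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(Wq.shape[1])
    return softmax(scores, axis=-1) @ (X @ Wv)

n_tokens, d = 6, 8
X = rng.standard_normal((n_tokens, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

perm = rng.permutation(n_tokens)
out_of_permuted_input = self_attention(X[perm], Wq, Wk, Wv)
permuted_output = self_attention(X, Wq, Wk, Wv)[perm]

print(np.allclose(out_of_permuted_input, permuted_output))  # True: equivariant
```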
SGS: Mutant Reduction for Higher-order Mutation-based Fault Localization
MBFL (Mutation-Based Fault Localization) is one of the most commonly studied fault localization techniques due to its promising fault localization effectiveness. However, MBFL incurs a high execution cost, as it needs to execute the test suite on a large number of mutants. While previous studies have proposed mutant reduction methods for FOMs (First-Order Mutants) to help alleviate the cost of MBFL, the reduction of HOMs (Higher-Order Mutants) has not been thoroughly investigated. In this study, we propose SGS (Statement Granularity Sampling), a method that reduces HOMs for HMBFL (Higher-Order Mutation-Based Fault Localization). Considering the relationship between HOMs and statements, we sample HOMs at the statement level to ensure that each statement has corresponding HOMs, as sketched below. We empirically evaluate the fault localization effectiveness of HMBFL with SGS on 237 multiple-fault programs taken from the SIR and Codeflaws benchmarks. The experimental results show that (1) the best sampling ratio for HMBFL with SGS is 20%, which preserves performance while reducing execution cost by 80%; and (2) the fault localization accuracy of HMBFL with SGS outperforms state-of-the-art SBFL (Spectrum-Based Fault Localization) and MBFL techniques by 20%.
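A minimal sketch of what statement-granularity sampling could look like, under the assumption that each higher-order mutant is tagged with the statement it mutates: group HOMs by statement and sample a fixed ratio per group while keeping at least one mutant per statement. The 20% default ratio mirrors the abstract; the data structures and helper names are illustrative.

```python
import random
from collections import defaultdict

def statement_granularity_sampling(homs, ratio=0.2, seed=0):
    """homs: list of (mutant_id, statement_id) pairs; returns sampled mutant ids."""
    rng = random.Random(seed)
    by_stmt = defaultdict(list)
    for mutant_id, stmt_id in homs:
        by_stmt[stmt_id].append(mutant_id)
    sampled = []
    for mutants in by_stmt.values():
        k = max(1, round(ratio * len(mutants)))  # keep at least one HOM per statement
        sampled.extend(rng.sample(mutants, k))
    return sampled

# Toy example: 3 statements with 10 HOMs each, sampled at a 20% ratio.
homs = [(f"m{i}", f"s{i % 3}") for i in range(30)]
print(statement_granularity_sampling(homs, ratio=0.2))
```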
Formation of a streamer blob via the merger of multiple plasma clumps below 2Rs
Context. Propagating streamer blobs could be an important source of
disturbances in the solar wind. Direct observations on formation of streamer
blobs could be a proxy for understanding the formation of small-scale
structures and disturbances in the solar wind.
Aims. We aim to investigate how a streamer blob is formed before it is
observed in the outer corona.
Methods. Using special coordinated observations from SOHO/LASCO, GOES/SUVI and SDO/AIA, we
study the precursors of a streamer blob as seen in the corona below 2.0 solar
radii (Rs).
Results. We found that the streamer blob formed through the gradual merging of three clumps of brightenings initiated in the lower corona at about 1.8Rs, likely driven by the expansion of the loop system at the base of the streamer. The acceleration of the blob starts from 1.9Rs or lower. It propagates along the south flank of the streamer, where an expanding elongated brightening occurs coincidentally.
Conclusions. Our observations demonstrate that the formation of a streamer blob is a complex process. We suggest that the expansion of the loop results in the pinching-off of a flux-rope-like blob at the loop apex below 2Rs. As the blob moves outward, it can be transferred across the overlying loops through interchange/component magnetic reconnection and is then released into the open-field system. When the blob moves toward open field lines, interchange magnetic reconnection might also occur, which can accelerate the plasma blob intermittently whilst allowing it to transfer across the open field lines. Such dynamics in a streamer blob might further trigger small-scale disturbances in the solar wind, such as switchbacks in the inner heliosphere.
Constructing Heterostructure through Bidentate Coordination toward Operationally Stable Inverted Perovskite Solar Cells
It has been reported that one of the factors leading to stability issues in iodine-containing perovskite solar cells is iodine loss from the perovskite layer. Herein, bidentate coordination with the undercoordinated I− of the perovskite surface is used to construct a stable perovskite-based heterostructure. This strong halogen bonding effectively inhibits interfacial migration of I− into functional layers such as C60 and Ag. Moreover, passivation of the undercoordinated I− suppresses the release of I2 and further delays the formation of voids at the perovskite surface. The resulting inverted perovskite solar cell exhibits a power conversion efficiency of 22.59%, and the unencapsulated device maintains 96.15% of its initial value after continuous operation for 500 h under illumination.