CoPT: Mitigating Bias in Pre-trained Language Models through Counterfactual Contrastive Prompt Tuning
Pre-trained Language Models are widely used in many important real-world
applications. However, recent studies show that these models can encode social
biases from large pre-training corpora and even amplify biases in downstream
applications. To address this challenge, we propose CoPT, an efficient and
effective debias-while-prompt tuning method for mitigating biases via
counterfactual contrastive prompt tuning on downstream tasks. Our experiments
conducted on three extrinsic bias benchmarks demonstrate the effectiveness of
CoPT on bias mitigation during the prompt tuning process and its
adaptability to existing upstream debiased language models. These findings
indicate the strength of CoPT and provide promising avenues for further
enhancement in bias mitigation on downstream tasks.Comment: To appear in Findings of EMNLP 202
DiffusionInst: Diffusion Model for Instance Segmentation
Diffusion frameworks have achieved performance comparable to previous
state-of-the-art image generation models. Their powerful noise-to-image
denoising pipeline has drawn interest in applying them to discriminative
tasks. This paper proposes DiffusionInst, a novel framework that
represents instances as instance-aware filters and formulates instance
segmentation as a noise-to-filter denoising process. The model is trained to
reverse noisy ground truth without any inductive bias from an RPN. During
inference, it takes a randomly generated filter as input and outputs masks via
one-step or multi-step denoising. Extensive experimental results on COCO and
LVIS show that DiffusionInst achieves competitive performance compared to
existing instance segmentation models with various backbones, such as ResNet
and Swin Transformers. We hope our work can serve as a strong baseline and
inspire the design of more efficient diffusion frameworks for challenging
discriminative tasks. Our code is available at
https://github.com/chenhaoxing/DiffusionInst
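The noise-to-filter inference loop can be sketched in toy form. The blending schedule and update rule below are a simplified, DDIM-like illustration under our own assumptions, not the paper's exact sampler:

```python
import numpy as np

def multi_step_denoise(denoiser, x_t, steps=4):
    """Toy noise-to-filter sampling: start from a random filter vector
    x_t and repeatedly blend it with the denoiser's estimate of the
    clean filter, trusting the estimate more as t shrinks toward 0."""
    x = x_t
    for t in np.linspace(1.0, 0.0, steps, endpoint=False):
        x0_pred = denoiser(x, t)          # model's estimate of the clean filter
        x = t * x + (1.0 - t) * x0_pred   # move toward the estimate as t shrinks
    return x
```

With a well-trained denoiser the loop contracts toward the clean filter; in DiffusionInst the denoised instance-aware filter is then applied to mask features to produce the instance mask.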
Faithful Low-Resource Data-to-Text Generation through Cycle Training
Methods to generate text from structured data have advanced significantly in
recent years, primarily due to fine-tuning of pre-trained language models on
large datasets. However, such models can fail to produce output faithful to the
input data, particularly on out-of-domain data. Sufficient annotated data is
often not available for specific domains, leading us to seek an unsupervised
approach to improve the faithfulness of output text. Since the problem is
fundamentally one of consistency between the representations of the structured
data and text, we evaluate the effectiveness of cycle training in this work.
Cycle training uses two models that are inverses of each other: one that
generates text from structured data, and one that generates structured
data from natural language text. We show that cycle training, when initialized
with a small amount of supervised data (100 samples in our case), achieves
nearly the same performance as fully supervised approaches for the data-to-text
generation task on the WebNLG, E2E, WTQ, and WSQL datasets. We perform
extensive empirical analysis with automated evaluation metrics and a newly
designed human evaluation schema to reveal the effectiveness of different
cycle training strategies at reducing various types of generation errors. Our
code is publicly available at https://github.com/Edillower/CycleNLG.
Comment: 19 pages, 4 figures, ACL 202
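The two-model cycle described above can be sketched as follows. The toy models and exact-match distance are hypothetical stand-ins for the paper's trained data-to-text and text-to-data models:

```python
def cycle_consistency_losses(d2t, t2d, data_batch, text_batch, dist):
    """One round of cycle-training signals: each input is passed through
    one model and reconstructed by its inverse, and the reconstruction
    error is what each direction is trained to minimize."""
    # data -> text -> data cycle (checks the round trip through both models)
    data_loss = sum(dist(d, t2d(d2t(d))) for d in data_batch) / len(data_batch)
    # text -> data -> text cycle (the opposite round trip)
    text_loss = sum(dist(t, d2t(t2d(t))) for t in text_batch) / len(text_batch)
    return data_loss, text_loss
```

When the two models are perfect inverses, both cycle losses vanish; in practice the reconstruction error provides an unsupervised training signal that encourages consistency between the structured data and the generated text, which is why faithfulness improves.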
Backpropagation Path Search On Adversarial Transferability
Deep neural networks are vulnerable to adversarial examples, making it
imperative to test a model's robustness before deployment. Transfer-based
attackers craft adversarial examples against surrogate models and transfer them
to victim models deployed in black-box settings. To enhance
adversarial transferability, structure-based attackers adjust the
backpropagation path to avoid the attack from overfitting the surrogate model.
However, existing structure-based attackers fail to explore the convolution
module in CNNs and modify the backpropagation graph heuristically, leading to
limited effectiveness. In this paper, we propose backPropagation pAth Search
(PAS), solving the aforementioned two problems. We first propose SkipConv to
adjust the backpropagation path of convolution by structural
reparameterization. To overcome the drawback of heuristically designed
backpropagation paths, we further construct a DAG-based search space, utilize
one-step approximation for path evaluation and employ Bayesian Optimization to
search for the optimal path. We conduct comprehensive experiments in a wide
range of transfer settings, showing that PAS improves the attack success rate
by a huge margin for both normally trained and defended models.
Comment: Accepted by ICCV202
DiffUTE: Universal Text Editing Diffusion Model
Diffusion-model-based, language-guided image editing has achieved great
success recently. However, existing state-of-the-art diffusion models struggle
to render correct text and text style during generation. To tackle this
problem, we propose a universal self-supervised text editing diffusion model
(DiffUTE), which aims to replace or modify words in the source image while
maintaining its realistic appearance. Specifically, we build
our model on a diffusion model and carefully modify the network structure to
enable it to draw multilingual characters with the help of glyph and
position information. Moreover, we design a self-supervised learning framework
to leverage large amounts of web data to improve the representation ability of
the model. Experimental results show that our method achieves impressive
performance and enables controllable editing of in-the-wild images with high
fidelity. Our code will be available at
\url{https://github.com/chenhaoxing/DiffUTE}
CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation
Since the natural language processing (NLP) community began using large
language models (LLMs) such as GPT-4 as critics to evaluate the quality of
generated texts, most existing work has trained only a critique generation
model of a specific scale on specific datasets. We argue that a comprehensive
investigation of the key factors of LLM-based evaluation models, such as
scaling properties, is lacking, so it remains inconclusive whether these
models can replace GPT-4's evaluation in practical scenarios. In this
paper, we propose a new critique generation model called CritiqueLLM, which
includes a dialogue-based prompting method for high-quality referenced /
reference-free evaluation data. Experimental results show that our model can
achieve comparable evaluation performance to GPT-4 especially in system-level
correlations, and even outperform GPT-4 in 3 out of 8 tasks in a challenging
reference-free setting. We conduct detailed analysis to show promising scaling
properties of our model in the quality of generated critiques. We also
demonstrate that our generated critiques can act as scalable feedback to
directly improve the generation quality of LLMs.
Comment: 18 pages, 5 figure
Chronic exposure to low-level lipopolysaccharide dampens influenza-mediated inflammatory response via A20 and PPAR network
Influenza A virus (IAV) infection leads to severe inflammation, and while epithelial-driven inflammatory responses occur via activation of NF-κB, the factors that modulate inflammation, particularly its negative regulators, are less well defined. In this study we show that A20 is a crucial molecular switch that dampens IAV-induced inflammatory responses. Chronic exposure to a low-dose LPS environment can restrict this excessive inflammation, but the mechanisms by which this environment suppresses inflammation remain elusive. Here, our evidence shows that chronic exposure to low-dose LPS suppressed inflammation induced by IAV infection or LPS stimulation in vitro and in vivo. A chronic low-dose LPS environment increases A20 expression, which in turn positively regulates PPAR-α and -γ, thus dampening the NF-κB signaling pathway and NLRP3 inflammasome activation. Knockout of A20 abolished this inhibitory effect on inflammation. Thus, A20 and the PPAR-α and -γ it induces play a key role in suppressing excessive inflammatory responses in a chronic low-dose LPS environment.
Design of a New Energy Ship Power Safety Monitoring System Based on the Internet of Things
As ship power battery and alternative-fuel power system technologies mature, many countries have issued policies promoting new energy for ships, while current domestic ship active safety systems remain at a low level of automation, intelligence, and integration. To address this, the project introduces artificial intelligence algorithms to design a power module monitoring system for new energy ships, targeting fuel cell ships and pure electric ships. The system can be used for marine power battery output management and safety monitoring, and mainly comprises a hydrogen fuel cell safety monitoring system, a power battery (buffer cell) safety monitoring system, and an integrated power safety monitoring system. This work combines embedded technology with Internet of Things technology and artificial intelligence algorithms to solve the safety management problem of new energy ship power systems. If applied to actual ships, it could yield clear social and economic benefits.