
    Co²PT: Mitigating Bias in Pre-trained Language Models through Counterfactual Contrastive Prompt Tuning

    Pre-trained language models are widely used in many important real-world applications. However, recent studies show that these models can encode social biases from large pre-training corpora and even amplify biases in downstream applications. To address this challenge, we propose Co²PT, an efficient and effective debias-while-prompt-tuning method that mitigates biases via counterfactual contrastive prompt tuning on downstream tasks. Experiments on three extrinsic bias benchmarks demonstrate the effectiveness of Co²PT at mitigating bias during prompt tuning and its adaptability to existing upstream debiased language models. These findings indicate the strength of Co²PT and suggest promising avenues for further bias mitigation on downstream tasks.
    Comment: To appear in Findings of EMNLP 202
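
    As a rough illustration of the counterfactual contrastive idea, the sketch below pairs each training example with a counterfactual twin (e.g., demographic terms swapped) and pulls their representations together with an InfoNCE-style loss, while only soft prompt embeddings would be trained. The tensor shapes, temperature, and loss form are illustrative assumptions, not the paper's exact objective.

        import torch
        import torch.nn.functional as F

        def counterfactual_contrastive_loss(z_orig, z_cf, temperature=0.05):
            """Pull each example toward its counterfactual twin and away from
            other in-batch examples (InfoNCE-style; assumed loss form)."""
            z_orig = F.normalize(z_orig, dim=-1)          # (B, d)
            z_cf = F.normalize(z_cf, dim=-1)              # (B, d)
            logits = z_orig @ z_cf.T / temperature        # (B, B) similarities
            targets = torch.arange(z_orig.size(0))        # positives on the diagonal
            return F.cross_entropy(logits, targets)

        # In prompt tuning, only soft prompt parameters like these are trained
        # while the pre-trained language model stays frozen.
        soft_prompt = torch.nn.Parameter(torch.randn(20, 768) * 0.02)

        # Stand-ins for prompt-conditioned encodings of sentence pairs.
        z_orig = torch.randn(8, 768, requires_grad=True)
        z_cf = torch.randn(8, 768)
        loss = counterfactual_contrastive_loss(z_orig, z_cf)
        loss.backward()  # combined with the downstream task loss in practice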

    DiffusionInst: Diffusion Model for Instance Segmentation

    Diffusion frameworks have achieved performance comparable to previous state-of-the-art image generation models, and their powerful noise-to-image denoising pipeline has spurred interest in variants for discriminative tasks. This paper proposes DiffusionInst, a novel framework that represents instances as instance-aware filters and formulates instance segmentation as a noise-to-filter denoising process. The model is trained to reverse the noised ground truth without any inductive bias from an RPN. During inference, it takes a randomly generated filter as input and outputs instance masks through one-step or multi-step denoising. Extensive experimental results on COCO and LVIS show that DiffusionInst achieves competitive performance compared to existing instance segmentation models with various backbones, such as ResNet and Swin Transformer. We hope our work can serve as a strong baseline and inspire the design of more efficient diffusion frameworks for challenging discriminative tasks. Our code is available at https://github.com/chenhaoxing/DiffusionInst
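
    To make the noise-to-filter formulation concrete, here is a toy training step: ground-truth instance filters are corrupted by the forward diffusion process at a random timestep, and a small network learns to recover them. The filter dimension, noise schedule, and denoiser below are placeholder assumptions, not DiffusionInst's actual architecture.

        import torch
        import torch.nn as nn

        T = 1000
        betas = torch.linspace(1e-4, 0.02, T)
        alphas_bar = torch.cumprod(1.0 - betas, dim=0)

        # Placeholder filter dimension and denoiser; the real model also
        # conditions on image features from a backbone such as ResNet or Swin.
        FILTER_DIM = 153
        denoiser = nn.Sequential(nn.Linear(FILTER_DIM + 1, 256), nn.ReLU(),
                                 nn.Linear(256, FILTER_DIM))

        def train_step(gt_filters):                        # (B, FILTER_DIM)
            t = torch.randint(0, T, (gt_filters.size(0),))
            a = alphas_bar[t].unsqueeze(-1)
            noise = torch.randn_like(gt_filters)
            noisy = a.sqrt() * gt_filters + (1 - a).sqrt() * noise  # forward diffusion
            t_embed = (t.float() / T).unsqueeze(-1)        # crude timestep embedding
            pred = denoiser(torch.cat([noisy, t_embed], dim=-1))    # predict clean filters
            return nn.functional.mse_loss(pred, gt_filters)

        loss = train_step(torch.randn(4, FILTER_DIM))
        loss.backward()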

    Faithful Low-Resource Data-to-Text Generation through Cycle Training

    Methods for generating text from structured data have advanced significantly in recent years, primarily due to fine-tuning pre-trained language models on large datasets. However, such models can fail to produce output faithful to the input data, particularly on out-of-domain data. Because sufficient annotated data is often unavailable for specific domains, we seek an unsupervised approach to improve the faithfulness of output text. Since the problem is fundamentally one of consistency between the representations of the structured data and the text, we evaluate the effectiveness of cycle training in this work. Cycle training uses two models that are inverses of each other: one that generates text from structured data, and one that generates structured data from natural language text. We show that cycle training, when initialized with a small amount of supervised data (100 samples in our case), achieves nearly the same performance as fully supervised approaches for data-to-text generation on the WebNLG, E2E, WTQ, and WSQL datasets. We perform extensive empirical analysis with automated evaluation metrics and a newly designed human evaluation schema to reveal the effectiveness of different cycle training strategies in reducing various types of generation errors. Our code is publicly available at https://github.com/Edillower/CycleNLG.
    Comment: 19 pages, 4 figures, ACL 202
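
    A minimal sketch of one direction of the cycle, assuming two T5 models: structured data is verbalized into text, then parsed back, and the reconstruction loss enforces consistency. The linearization format and training details here are illustrative, not the paper's exact setup.

        from transformers import T5ForConditionalGeneration, T5Tokenizer

        tok = T5Tokenizer.from_pretrained("t5-small")
        d2t = T5ForConditionalGeneration.from_pretrained("t5-small")  # data -> text
        t2d = T5ForConditionalGeneration.from_pretrained("t5-small")  # text -> data

        def data_to_text_cycle(linearized_data):
            """data -> text -> data; the reconstruction loss trains t2d.
            Generation is discrete, so d2t gets no gradient here; cycle
            training alternates with the text -> data -> text direction."""
            enc = tok(linearized_data, return_tensors="pt", padding=True)
            text_ids = d2t.generate(**enc, max_new_tokens=64)   # intermediate text
            text = tok.batch_decode(text_ids, skip_special_tokens=True)
            back = tok(text, return_tensors="pt", padding=True)
            labels = tok(linearized_data, return_tensors="pt", padding=True).input_ids
            out = t2d(input_ids=back.input_ids,
                      attention_mask=back.attention_mask,
                      labels=labels)
            return out.loss

        loss = data_to_text_cycle(["name[Aromi] | eatType[coffee shop]"])
        loss.backward()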

    Backpropagation Path Search On Adversarial Transferability

    Deep neural networks are vulnerable to adversarial examples, making it imperative to test a model's robustness before deployment. Transfer-based attackers craft adversarial examples against surrogate models and transfer them to victim models deployed in black-box settings. To enhance adversarial transferability, structure-based attackers adjust the backpropagation path to keep the attack from overfitting the surrogate model. However, existing structure-based attackers fail to explore the convolution module in CNNs and modify the backpropagation graph heuristically, leading to limited effectiveness. In this paper, we propose backPropagation pAth Search (PAS) to solve these two problems. We first propose SkipConv, which adjusts the backpropagation path of convolution via structural reparameterization. To overcome the drawback of heuristically designed backpropagation paths, we further construct a DAG-based search space, use a one-step approximation for path evaluation, and employ Bayesian optimization to search for the optimal path. We conduct comprehensive experiments across a wide range of transfer settings, showing that PAS improves the attack success rate by a large margin for both normally trained and defense models.
    Comment: Accepted by ICCV202
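
    The SkipConv idea can be illustrated by decomposing a convolution into an identity kernel plus a residual kernel, so the backward pass gains a skip-like path whose residual gradient can be rescaled. The decomposition and the gamma-scaling trick below are a simplified, assumed rendering, not the paper's exact reparameterization.

        import torch
        import torch.nn.functional as F

        def skipconv(x, weight, gamma=0.5):
            """conv(x, W) == conv(x, I) + conv(x, W - I) for a 3x3 identity
            kernel I (requires in_channels == out_channels). Scaling only the
            residual branch's gradient adjusts the backpropagation path
            without changing the forward output."""
            c = weight.size(0)
            identity = torch.zeros_like(weight)              # (C, C, 3, 3)
            identity[torch.arange(c), torch.arange(c), 1, 1] = 1.0
            residual = weight - identity
            res_out = F.conv2d(x, residual, padding=1)
            # Value unchanged; the gradient through the residual branch is
            # scaled by gamma via the detach trick.
            res_out = gamma * res_out + (1.0 - gamma) * res_out.detach()
            return F.conv2d(x, identity, padding=1) + res_out

        x = torch.randn(1, 8, 16, 16, requires_grad=True)
        w = torch.randn(8, 8, 3, 3)
        skipconv(x, w).sum().backward()   # gradient now favors the identity path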

    DiffUTE: Universal Text Editing Diffusion Model

    Diffusion-model-based, language-guided image editing has achieved great success recently. However, existing state-of-the-art diffusion models struggle to render correct text and text style during generation. To tackle this problem, we propose a universal self-supervised text editing diffusion model (DiffUTE), which aims to replace or modify words in a source image while maintaining a realistic appearance. Specifically, we build on a diffusion model and carefully modify the network structure so the model can draw multilingual characters with the help of glyph and position information. Moreover, we design a self-supervised learning framework that leverages large amounts of web data to improve the model's representation ability. Experimental results show that our method achieves impressive performance and enables controllable, high-fidelity editing of in-the-wild images. Our code will be available at https://github.com/chenhaoxing/DiffUTE
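
    As a rough sketch of glyph-and-position conditioning, one can concatenate the noisy latent with a rendered glyph image of the target text and a binary mask of the edit region before feeding the UNet. The tensor layout and channel counts below are assumptions; the paper's conditioning pipeline differs in detail.

        import torch

        def build_condition(latent, glyph_img, region_mask):
            """Concatenate the noisy latent, a rendered glyph image of the
            target text, and a binary mask marking where to draw, along the
            channel dimension (assumed layout)."""
            assert latent.shape[-2:] == glyph_img.shape[-2:] == region_mask.shape[-2:]
            return torch.cat([latent, glyph_img, region_mask], dim=1)

        latent = torch.randn(1, 4, 64, 64)          # noisy VAE latent
        glyph = torch.randn(1, 1, 64, 64)           # rendered target-text glyph
        mask = torch.zeros(1, 1, 64, 64)
        mask[..., 20:40, 10:50] = 1.0               # region whose text is replaced
        cond = build_condition(latent, glyph, mask) # (1, 6, 64, 64) UNet input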

    CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation

    Since the natural language processing (NLP) community began using large language models (LLMs) such as GPT-4 as critics to evaluate the quality of generated text, most work has trained a critique generation model of a specific scale on specific datasets. We argue that a comprehensive investigation of the key factors in LLM-based evaluation models, such as scaling properties, is still lacking, so it remains inconclusive whether these models can replace GPT-4's evaluation in practical scenarios. In this paper, we propose a new critique generation model, CritiqueLLM, which includes a dialogue-based prompting method for collecting high-quality referenced / reference-free evaluation data. Experimental results show that our model achieves evaluation performance comparable to GPT-4, especially in system-level correlations, and even outperforms GPT-4 on 3 out of 8 tasks in a challenging reference-free setting. We conduct detailed analysis showing promising scaling properties of our model in the quality of generated critiques. We also demonstrate that our generated critiques can act as scalable feedback to directly improve the generation quality of LLMs.
    Comment: 18 pages, 5 figure
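
    A minimal sketch of a dialogue-style judge prompt supporting both referenced and reference-free modes; the template wording is a hypothetical illustration, not CritiqueLLM's actual prompting method.

        def build_critic_prompt(question, answer, reference=None):
            """Referenced or reference-free critique prompt for a judge model
            (hypothetical template)."""
            lines = [
                "You are an impartial judge. Critique the answer below, then",
                "give a score from 1 to 10 on the final line as 'Score: <n>'.",
                f"Question: {question}",
                f"Answer: {answer}",
            ]
            if reference is not None:            # referenced evaluation mode
                lines.insert(2, f"Reference answer: {reference}")
            return "\n".join(lines)

        print(build_critic_prompt("What causes tides?", "Mostly the Moon's gravity."))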

    Chronic exposure to low-level lipopolysaccharide dampens influenza-mediated inflammatory response via A20 and PPAR network

    Influenza A virus (IAV) infection leads to severe inflammation. While epithelial-driven inflammatory responses occur via activation of NF-κB, the factors that modulate inflammation, particularly the negative regulators, are less well defined. In this study, we show that A20 is a crucial molecular switch that dampens IAV-induced inflammatory responses. Chronic exposure to a low-dose LPS environment can restrict this excessive inflammation, but the mechanisms by which this environment suppresses inflammation remain elusive. Here, our evidence shows that chronic exposure to low-dose LPS suppressed inflammation induced by IAV infection or LPS stimulation in vitro and in vivo. A chronic low-dose LPS environment increases A20 expression, which in turn positively regulates PPAR-α and PPAR-γ, thereby dampening the NF-κB signaling pathway and NLRP3 inflammasome activation. Knockout of A20 abolished this inhibitory effect on inflammation. Thus, A20 and the PPAR-α and PPAR-γ it induces play a key role in suppressing excessive inflammatory responses in a chronic low-dose LPS environment.

    Design of New energy ship power safety monitoring system based on Internet of things

    As ship power battery and alternative-fuel power system technologies mature, many countries have issued policies promoting new energy in shipping, yet current domestic active safety systems for ships remain at a low level of automation, intelligence, and integration. To address this, the project applies artificial intelligence algorithms to design a power-module monitoring system for fuel cell ships and pure electric ships that can be used for marine power battery output management and safety monitoring. It mainly comprises a hydrogen fuel cell safety monitoring system, a power battery (buffer cell) safety monitoring system, and an integrated power safety monitoring system. This work combines embedded technology with Internet of Things technology and artificial intelligence algorithms to solve the safety management problem of new energy ship power systems. Applied to real ships, it could deliver clear social and economic benefits.
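
    As a simple illustration of the safety monitoring layer, the sketch below checks battery and fuel cell telemetry against fixed limits and raises alarms. The field names and thresholds are hypothetical; the described system would add IoT transport and AI-based diagnosis on top of such checks.

        from dataclasses import dataclass

        @dataclass
        class BatteryReading:
            cell_voltage: float      # V
            temperature: float       # deg C
            hydrogen_ppm: float      # fuel-cell compartment H2 concentration

        # Hypothetical safe operating windows per telemetry field.
        LIMITS = {"cell_voltage": (2.8, 4.2),
                  "temperature": (0.0, 55.0),
                  "hydrogen_ppm": (0.0, 4000.0)}

        def check(reading: BatteryReading) -> list[str]:
            """Return alarm messages for every field outside its window."""
            alarms = []
            for field, (lo, hi) in LIMITS.items():
                value = getattr(reading, field)
                if not lo <= value <= hi:
                    alarms.append(f"{field}={value} outside [{lo}, {hi}]")
            return alarms

        print(check(BatteryReading(cell_voltage=4.35, temperature=41.0,
                                   hydrogen_ppm=120.0)))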