13 research outputs found

Parameter-Efficient Detoxification with Contrastive Decoding

Author: Niu Tong
Xiong Caiming
Yavuz Semih
Zhou Yingbo
Publication date: 12/01/2024

The field of natural language generation has witnessed significant advancements in recent years, including the development of controllable text generation techniques. However, controlling the attributes of the generated text remains a challenge, especially when aiming to avoid undesirable behavior such as toxicity. In this work, we introduce Detoxification Generator (DETOXIGEN), an inference-time algorithm that steers the generation away from unwanted styles. DETOXIGEN is an ensemble of a pre-trained language model (generator) and a detoxifier. The detoxifier is trained intentionally on the toxic data representative of the undesirable attribute, encouraging it to generate text in that style exclusively. During the actual generation, we use the trained detoxifier to produce undesirable tokens for the generator to contrast against at each decoding step. This approach directly informs the generator to avoid generating tokens that the detoxifier considers highly likely. We evaluate DETOXIGEN on the commonly used REALTOXICITYPROMPTS benchmark (Gehman et al., 2020) with various language models as generators. We find that it significantly outperforms previous approaches in detoxification metrics while not compromising on the generation quality. Moreover, the detoxifier is obtained by soft prompt-tuning using the same backbone language model as the generator. Hence, DETOXIGEN requires only a tiny amount of extra weights from the virtual tokens of the detoxifier to be loaded into GPU memory while decoding, making it a promising lightweight, practical, and parameter-efficient detoxification strategy

arXiv.org e-Print Archive
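
The DETOXIGEN abstract above describes contrasting a generator against a detoxifier at each decoding step. A minimal sketch of one such step follows. The abstract does not give the scoring rule, so the formula used here (generator log-probability minus a weighted detoxifier log-probability), the `alpha` weight, the function names, and the toy logits are all illustrative assumptions, not the authors' implementation.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrastive_step(generator_logits, detoxifier_logits, alpha=1.0):
    """Greedy next-token choice that penalizes tokens the detoxifier favors.

    Assumed scoring rule (hypothetical, not taken from the paper):
        score(token) = log p_generator(token) - alpha * log p_detoxifier(token)
    """
    gen_logp = log_softmax(generator_logits)
    tox_logp = log_softmax(detoxifier_logits)
    scores = [g - alpha * t for g, t in zip(gen_logp, tox_logp)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy three-token vocabulary: token 1 is the generator's top choice,
# but the detoxifier also rates it as highly likely (i.e. undesirable).
generator_logits = [2.0, 2.5, 0.5]
detoxifier_logits = [0.0, 3.0, 0.0]

token_id, _ = contrastive_step(generator_logits, detoxifier_logits)
baseline_id, _ = contrastive_step(generator_logits, detoxifier_logits, alpha=0.0)
```

With `alpha=0.0` the step reduces to ordinary greedy decoding and picks token 1; with the contrast applied, the choice moves to token 0. A practical decoder would likely also restrict candidates to tokens the generator itself finds plausible, since an unconstrained subtraction can promote very unlikely tokens.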

Accurate Reconstruction of Molecular Phylogenies for Proteins Using Codon and Amino Acid Unified Sequence Alignments (CAUSA)

Author: Chandra Sekhar Pedamallu
Jingjie Hu
Qi Wang
Shuang-yong Xu
Xiaolong Wang
Yingbo Niu
Yu Fu
Yue Zhao
Publication date: 28/12/2011

Based on the molecular clock hypothesis and the neutral theory of molecular evolution, molecular phylogenies have been widely used to infer the evolutionary history of organisms and individual genes. Traditionally, alignments and phylogenetic trees of proteins and their coding DNA sequences are constructed separately, so different conclusions are often drawn. Here we present a new strategy for sequence alignment and phylogenetic tree reconstruction, codon and amino acid unified sequence alignment (CAUSA), which aligns DNA and protein sequences and draws phylogenetic trees in a unified manner. We demonstrate that CAUSA improves the accuracy of both multiple sequence alignments and phylogenetic trees by solving a variety of molecular evolutionary problems in viruses, bacteria and mammals. Our results support the hypothesis that the molecular clock for proteins has two pointers existing separately in DNA and protein sequences. It is more accurate to read the molecular clock by combining (adding) these two pointers, since their ticking rates are sometimes consistent and sometimes different. The CAUSA software was released as open source under the GNU/GPL license and is downloadable free of charge from the website www.dnapluspro.com.

Nature Precedings

DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text

Author: Joty Shafiq
Liu Ye
Niu Tong
Wan Yao
Yavuz Semih
Yu Philip S.
Zhao Wenting
Zhou Yingbo
Publication date: 31/10/2023

Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when solely relying on their internal knowledge, especially when answering questions that require less commonly known information. Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge. Nonetheless, recent approaches have primarily emphasized retrieval from unstructured text corpora, owing to its seamless integration into prompts. When using structured data such as knowledge graphs, most methods simplify it into natural text, neglecting the underlying structures. Moreover, a significant gap in the current landscape is the absence of a realistic benchmark for evaluating the effectiveness of grounding LLMs on heterogeneous knowledge sources (e.g., knowledge base and text). To fill this gap, we have curated a comprehensive dataset that poses two unique challenges: (1) Two-hop multi-source questions that require retrieving information from both open-domain structured and unstructured knowledge sources; retrieving information from structured knowledge sources is a critical component in correctly answering the questions. (2) The generation of symbolic queries (e.g., SPARQL for Wikidata) is a key requirement, which adds another layer of challenge. Our dataset is created using a combination of automatic generation through predefined reasoning chains and human annotation. We also introduce a novel approach that leverages multiple retrieval tools, including text passage retrieval and symbolic language-assisted retrieval. Our model outperforms previous approaches by a significant margin, demonstrating its effectiveness in addressing the above-mentioned reasoning challenges

arXiv.org e-Print Archive

Glutathione Peroxidase 7 Utilizes Hydrogen Peroxide Generated by Ero1α to Promote Oxidative Protein Folding

Author: Chih-chen Wang
Lei Wang
Lihui Zhang
Roberto Sitia
Yingbo Niu
Publication venue: 'Mary Ann Liebert Inc'

Crossref

Accurate Reconstruction of Molecular Phylogenies for Proteins Using Codon and Amino Acid Unified Sequence Alignments (CAUSA)

Author: Chandra Sekhar Pedamallu
Jingjie Hu
Qi Wang
Shuang-yong Xu
Xiaolong Wang
Yingbo Niu
Yu Fu
Yue Zhao
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Efficient Immunotherapy of Drug-Free Layered Double Hydroxide Nanoparticles via Neutralizing Excess Acid and Blocking Tumor Cell Autophagy

Author: Hou Shengjie
Huang Yaru
Jia Yingbo
Liu Ruitian
Niu Xiaoyun
Sun Xiaoying
Xu Zhi Ping
Yang Jinju
Zhang Lingxiao
Zhang Lun
Zhu Jie
Publication venue: 'American Chemical Society (ACS)'
Publication date: 23/08/2022

Cancer immunotherapy efficacy is largely limited by the suppressive tumor immune microenvironment (TIME) where antitumor immune cells are inhibited and tumor antigens continue to mutate or be lost. To remodel the TIME, we here applied weakly alkaline layered double hydroxide nanoparticles (LDH NPs) to neutralize the excess acid and block autophagy of tumor cells for neoadjuvant cancer immunotherapy. Peritumoral injection of LDH NPs provided a long-term and efficient acid-neutralization in the TIME, blocked the lysosome-mediated autophagy pathway in tumor cells, and increased the levels of antitumor tumor-associated macrophages and T cells. These LDH NPs captured tumor antigens released in the tumor tissues and effectively inhibited the growth of both melanoma and colon tumors in vivo. These findings indicate that LDH NPs, as an immunomodulator and adjuvant, successfully "awaken" and promote the host innate and adaptive immune systems, showing promising potential for solid tumor immunotherapy

Institutional Repository of Institute of Process Engineering, CAS (IPE-IR)

Multi-omics data provide insight into the adaptation of the glasshouse plant Rheum nobile to the alpine subnival zone

Author: Congcong Dong
Dandan Wang
Dongshi Wan
Hongyin Hu
Jianquan Liu
Jin Zhang
Jinli Yang
Mingjia Zhu
Minjie Li
Renping Xu
Richard Abbott
Ying Li
Ying Wu
Yingbo Yang
Yongzhi Yang
Zeyu Zheng
Zhenyue Wang
Zhimin Niu
Zhiqiang Lu
Publication venue: Nature Portfolio
Publication date: 01/09/2023

Subnival glasshouse plants provide a textbook example of high-altitude adaptation with reproductive organs enclosed in specialized semi-translucent bracts, monocarpic reproduction and continuous survival under stress. Here, we present genomic, transcriptomic and metabolomic analyses for one such plant, the Noble rhubarb (Rheum nobile). Comparative genomic analyses show that an expanded number of genes and retained genes from two recent whole-genome duplication events are both relevant to subnival adaptation of this species. Most photosynthesis genes are downregulated within bracts compared to within leaves, and indeed bracts exhibit a sharp reduction in photosynthetic pigments, indicating that the bracts no longer perform photosynthesis. In contrast, genes related to flavonol synthesis are upregulated, providing enhanced defense against UV irradiation damage. Additionally, anatomically abnormal mesophyll combined with the downregulation of genes related to mesophyll differentiation in bracts illustrates the innovation and specification of the glass-like bracts. We further detect substantial accumulation of antifreeze proteins (e.g. AFPs, LEAs) and various metabolites (e.g. proline, protective sugars, procyanidins) in over-wintering roots. These findings provide new insights into subnival adaptation and the evolution of glasshouse alpine plants.

Directory of Open Access Journals

University of St. Andrews - Pure

St Andrews Research Repository

Multi-omics data provide insight into the adaptation of the glasshouse plant Rheum nobile to the alpine subnival zone

Author: Abbott Richard
Dong Congcong
Hu Hongyin
Li Minjie
Li Ying
Liu Jianquan
Lu Zhiqiang
Niu Zhimin
Wan Dongshi
Wang Dandan
Wang Zhenyue
Wu Ying
Xu Renping
Yang Jinli
Yang Yingbo
Yang Yongzhi
Zhang Jin
Zheng Zeyu
Zhu Mingjia
Publication date: 04/09/2023

Subnival glasshouse plants provide a text-book example of high-altitude adaptation with reproductive organs enclosed in specialized semi-translucent bracts, monocarpic reproduction and continuous survival under stress. Here, we present genomic, transcriptomic and metabolomic analyses for one such plant, the Noble rhubarb (Rheum nobile). Comparative genomic analyses show that an expanded number of genes and retained genes from two recent whole-genome duplication events are both relevant to subnival adaptation of this species. Most photosynthesis genes are downregulated within bracts compared to within leaves, and indeed bracts exhibit a sharp reduction in photosynthetic pigments, indicating that the bracts no longer perform photosynthesis. Contrastingly, genes related to flavonol synthesis are upregulated, providing enhanced defense against UV irradiation damage. Additionally, anatomically abnormal mesophyll combined with the downregulation of genes related to mesophyll differentiation in bracts illustrates the innovation and specification of the glass-like bracts. We further detect substantial accumulation of antifreeze proteins (e.g. AFPs, LEAs) and various metabolites (e.g. Proline, Protective sugars, procyanidins) in over-wintering roots. These findings provide new insights into subnival adaptation and the evolution of glasshouse alpine plants