Parameter-Efficient Detoxification with Contrastive Decoding
The field of natural language generation has witnessed significant
advancements in recent years, including the development of controllable text
generation techniques. However, controlling the attributes of the generated
text remains a challenge, especially when aiming to avoid undesirable behavior
such as toxicity. In this work, we introduce Detoxification Generator
(DETOXIGEN), an inference-time algorithm that steers the generation away from
unwanted styles. DETOXIGEN is an ensemble of a pre-trained language model
(generator) and a detoxifier. The detoxifier is trained intentionally on the
toxic data representative of the undesirable attribute, encouraging it to
generate text in that style exclusively. During the actual generation, we use
the trained detoxifier to produce undesirable tokens for the generator to
contrast against at each decoding step. This approach directly informs the
generator to avoid generating tokens that the detoxifier considers highly
likely. We evaluate DETOXIGEN on the commonly used REALTOXICITYPROMPTS
benchmark (Gehman et al., 2020) with various language models as generators. We
find that it significantly outperforms previous approaches in detoxification
metrics while not compromising on the generation quality. Moreover, the
detoxifier is obtained by soft prompt-tuning using the same backbone language
model as the generator. Hence, DETOXIGEN requires only a tiny amount of extra
weights from the virtual tokens of the detoxifier to be loaded into GPU memory
while decoding, making it a promising lightweight, practical, and
parameter-efficient detoxification strategy.
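The contrast step described in the abstract above can be illustrated with a minimal sketch. The abstract does not give the exact scoring rule, so the formula here (generator log-probability minus a weighted detoxifier log-probability) is an assumption, as are the names `contrast_step` and `alpha` and the toy three-token vocabulary.

```python
import math


def log_softmax(logits):
    """Convert raw logits to log-probabilities in a numerically stable way."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def softmax(scores):
    """Normalize arbitrary scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def contrast_step(gen_logits, detox_logits, alpha=1.0):
    """Down-weight tokens that the detoxifier considers likely.

    Assumed scoring rule (not taken from the paper):
        score = log p_generator - alpha * log p_detoxifier
    With alpha = 0 this reduces to ordinary decoding from the generator.
    """
    gen_lp = log_softmax(gen_logits)
    tox_lp = log_softmax(detox_logits)
    scores = [g - alpha * t for g, t in zip(gen_lp, tox_lp)]
    return softmax(scores)


# Toy example: the generator slightly prefers "rude", but the detoxifier
# (trained on toxic text) strongly prefers it too, so the contrast flips
# the choice to "kind".
vocab = ["kind", "rude", "the"]
gen_logits = [1.0, 1.2, 0.5]
detox_logits = [-1.0, 3.0, 0.0]

probs = contrast_step(gen_logits, detox_logits)
best = vocab[probs.index(max(probs))]

plain = contrast_step(gen_logits, detox_logits, alpha=0.0)
plain_best = vocab[plain.index(max(plain))]
```

In a real setup both logit vectors would come from the same backbone model at each decoding step, once with and once without the detoxifier's soft prompt; the lists above are placeholders for those outputs.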
Accurate Reconstruction of Molecular Phylogenies for Proteins Using Codon and Amino Acid Unified Sequence Alignments (CAUSA)
Based on the molecular clock hypothesis and the neutral theory of molecular evolution, molecular phylogenies have been widely used to infer the evolutionary history of organisms and individual genes. Traditionally, alignments and phylogenetic trees of proteins and of their coding DNA sequences are constructed separately, which often leads to different conclusions. Here we present a new strategy for sequence alignment and phylogenetic tree reconstruction, codon and amino acid unified sequence alignment (CAUSA), which aligns DNA and protein sequences and draws phylogenetic trees in a unified manner. We demonstrate that CAUSA improves the accuracy of both multiple sequence alignments and phylogenetic trees by solving a variety of molecular evolutionary problems in viruses, bacteria and mammals. Our results support the hypothesis that the molecular clock for proteins has two pointers, one in the DNA sequence and one in the protein sequence. Reading the molecular clock from the additive combination of these two pointers is more accurate, since their ticking rates are sometimes consistent and sometimes different. The CAUSA software is released as open source under the GNU/GPL license and can be downloaded free of charge from www.dnapluspro.com.
DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text
Large Language Models (LLMs) have exhibited impressive generation
capabilities, but they suffer from hallucinations when solely relying on their
internal knowledge, especially when answering questions that require less
commonly known information. Retrieval-augmented LLMs have emerged as a
potential solution to ground LLMs in external knowledge. Nonetheless, recent
approaches have primarily emphasized retrieval from unstructured text corpora,
owing to its seamless integration into prompts. When using structured data such
as knowledge graphs, most methods simplify it into natural text, neglecting the
underlying structures. Moreover, a significant gap in the current landscape is
the absence of a realistic benchmark for evaluating the effectiveness of
grounding LLMs on heterogeneous knowledge sources (e.g., knowledge base and
text). To fill this gap, we have curated a comprehensive dataset that poses two
unique challenges: (1) Two-hop multi-source questions that require retrieving
information from both open-domain structured and unstructured knowledge
sources; retrieving information from structured knowledge sources is a critical
component in correctly answering the questions. (2) The generation of symbolic
queries (e.g., SPARQL for Wikidata) is a key requirement, which adds another
layer of challenge. Our dataset is created using a combination of automatic
generation through predefined reasoning chains and human annotation. We also
introduce a novel approach that leverages multiple retrieval tools, including
text passage retrieval and symbolic language-assisted retrieval. Our model
outperforms previous approaches by a significant margin, demonstrating its
effectiveness in addressing the above-mentioned reasoning challenges.
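To make the symbolic-query requirement mentioned above concrete, here is a small sketch of the kind of SPARQL query a system might generate for the structured hop of a question. The helper `build_sparql` is hypothetical and is not part of the dataset or the authors' method; the identifiers Q42 (Douglas Adams) and P19 (place of birth) are ordinary Wikidata IDs chosen only for illustration.

```python
def build_sparql(entity_id: str, property_id: str, limit: int = 5) -> str:
    """Build a one-hop Wikidata SPARQL query (hypothetical helper).

    The query asks for every value of `property_id` on `entity_id` and
    uses the Wikidata label service to return English labels.
    """
    return (
        "SELECT ?value ?valueLabel WHERE {\n"
        f"  wd:{entity_id} wdt:{property_id} ?value .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


# Structured hop of a question such as "Where was Douglas Adams born?"
query = build_sparql("Q42", "P19")
print(query)
```

The returned entity would then feed the second, text-retrieval hop; that step is omitted here because the abstract does not describe how it is done.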
Efficient Immunotherapy of Drug-Free Layered Double Hydroxide Nanoparticles via Neutralizing Excess Acid and Blocking Tumor Cell Autophagy
Cancer immunotherapy efficacy is largely limited by the suppressive tumor immune microenvironment (TIME), where antitumor immune cells are inhibited and tumor antigens continue to mutate or be lost. To remodel the TIME, we applied weakly alkaline layered double hydroxide nanoparticles (LDH NPs) to neutralize the excess acid and block autophagy of tumor cells for neoadjuvant cancer immunotherapy. Peritumoral injection of LDH NPs provided long-term, efficient acid neutralization in the TIME, blocked the lysosome-mediated autophagy pathway in tumor cells, and increased the levels of antitumor tumor-associated macrophages and T cells. These LDH NPs captured tumor antigens released in the tumor tissues and effectively inhibited the growth of both melanoma and colon tumors in vivo. These findings indicate that LDH NPs, as an immunomodulator and adjuvant, successfully "awaken" and promote the host innate and adaptive immune systems, showing promising potential for solid tumor immunotherapy.
Multi-omics data provide insight into the adaptation of the glasshouse plant Rheum nobile to the alpine subnival zone
Subnival glasshouse plants provide a textbook example of high-altitude adaptation, with reproductive organs enclosed in specialized semi-translucent bracts, monocarpic reproduction and continuous survival under stress. Here, we present genomic, transcriptomic and metabolomic analyses for one such plant, the Noble rhubarb (Rheum nobile). Comparative genomic analyses show that an expanded number of genes and retained genes from two recent whole-genome duplication events are both relevant to subnival adaptation of this species. Most photosynthesis genes are downregulated in bracts compared with leaves, and bracts exhibit a sharp reduction in photosynthetic pigments, indicating that the bracts no longer perform photosynthesis. In contrast, genes related to flavonol synthesis are upregulated, providing enhanced defense against UV irradiation damage. Additionally, anatomically abnormal mesophyll combined with the downregulation of genes related to mesophyll differentiation in bracts illustrates the innovation and specification of the glass-like bracts. We further detect substantial accumulation of antifreeze proteins (e.g. AFPs, LEAs) and various metabolites (e.g. proline, protective sugars, procyanidins) in over-wintering roots. These findings provide new insights into subnival adaptation and the evolution of glasshouse alpine plants.