123 research outputs found
Schema-aware Reference as Prompt Improves Data-Efficient Relational Triple and Event Extraction
Information Extraction, which aims to extract structured relational triples or
events from unstructured text, often suffers from data scarcity. With
the development of pre-trained language models, many prompt-based approaches to
data-efficient information extraction have been proposed and achieved
impressive performance. However, existing prompt learning methods for
information extraction are still susceptible to several potential limitations:
(i) a semantic gap between natural language and output structure knowledge with
a pre-defined schema; (ii) representation learning from locally individual
instances, which limits performance when features are insufficient. In this
paper, we propose a novel approach of schema-aware Reference As Prompt (RAP),
which dynamically leverages schema and knowledge inherited from global
(few-shot) training data for each sample. Specifically, we propose a
schema-aware reference store, which unifies symbolic schema and relevant
textual instances. Then, we employ a dynamic reference integration module to
retrieve pertinent knowledge from the datastore as prompts during training and
inference. Experimental results demonstrate that RAP can be plugged into
various existing models and outperforms baselines in low-resource settings on
four datasets of relational triple extraction and event extraction. In
addition, we provide comprehensive empirical ablations and case analysis
regarding different types and scales of knowledge in order to better understand
the mechanisms of RAP. Code is available at https://github.com/zjunlp/RAP. Comment: Work in progress
Editing Large Language Models: Problems, Methods, and Opportunities
Despite the ability to train capable LLMs, the methodology for maintaining
their relevancy and rectifying errors remains elusive. To this end, the past
few years have witnessed a surge in techniques for editing LLMs, the objective
of which is to efficiently alter the behavior of LLMs within a specific domain
without negatively impacting performance across other inputs. This paper
embarks on a deep exploration of the problems, methods, and opportunities
related to model editing for LLMs. In particular, we provide an exhaustive
overview of the task definition and challenges associated with model editing,
along with an in-depth empirical analysis of the most progressive methods
currently at our disposal. We also build a new benchmark dataset to facilitate
a more robust evaluation and pinpoint enduring issues intrinsic to existing
techniques. Our objective is to provide valuable insights into the
effectiveness and feasibility of each editing technique, thereby assisting the
community in making informed decisions on the selection of the most appropriate
method for a specific task or context. Code and datasets are available at
https://github.com/zjunlp/EasyEdit. Comment: EMNLP 2023. Updated with new experiments
Editing Conceptual Knowledge for Large Language Models
Recently, there has been a growing interest in knowledge editing for Large
Language Models (LLMs). Current approaches and evaluations explore only
instance-level editing, while whether LLMs possess the capability to modify
concepts remains unclear. This paper pioneers the investigation of editing
conceptual knowledge for LLMs, by constructing a novel benchmark dataset
ConceptEdit and establishing a suite of new metrics for evaluation. The
experimental results reveal that, although existing editing methods can
efficiently modify concept-level definitions to some extent, they also have the
potential to distort the related instantial knowledge in LLMs, leading to poor
performance. We anticipate this can inspire further progress in better
understanding LLMs. Our project homepage is available at
https://zjunlp.github.io/project/ConceptEdit. Comment: Work in progress. Code: https://github.com/zjunlp/EasyEdit Dataset:
https://huggingface.co/datasets/zjunlp/ConceptEdi
EasyEdit: An Easy-to-use Knowledge Editing Framework for Large Language Models
Large Language Models (LLMs) usually suffer from knowledge cutoff or fallacy
issues, which means they are unaware of unseen events or generate text with
incorrect facts owing to outdated or noisy data. To this end, many knowledge
editing approaches for LLMs have emerged -- aiming to subtly inject/edit
updated knowledge or adjust undesired behavior while minimizing the impact on
unrelated inputs. Nevertheless, due to significant differences among various
knowledge editing methods and the variations in task setups, there is no
standard implementation framework available for the community, which hinders
practitioners from applying knowledge editing in applications. To address these
issues, we propose EasyEdit, an easy-to-use knowledge editing framework for
LLMs. It supports various cutting-edge knowledge editing approaches and can be
readily applied to many well-known LLMs such as T5, GPT-J, LlaMA, etc.
Empirically, we report the knowledge editing results on LlaMA-2 with EasyEdit,
demonstrating that knowledge editing surpasses traditional fine-tuning in terms
of reliability and generalization. We have released the source code on GitHub
at https://github.com/zjunlp/EasyEdit, along with Google Colab tutorials and
comprehensive documentation for beginners to get started. Besides, we present
an online system for real-time knowledge editing, and a demo video at
http://knowlm.zjukg.cn/easyedit.mp4. Comment: The project website is https://github.com/zjunlp/EasyEdi
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity
This survey addresses the crucial issue of factuality in Large Language
Models (LLMs). As LLMs find applications across diverse domains, the
reliability and accuracy of their outputs become vital. We define the
Factuality Issue as the probability of LLMs producing content inconsistent
with established facts. We first delve into the implications of these
inaccuracies, highlighting the potential consequences and challenges posed by
factual errors in LLM outputs. Subsequently, we analyze the mechanisms through
which LLMs store and process facts, seeking the primary causes of factual
errors. Our discussion then transitions to methodologies for evaluating LLM
factuality, emphasizing key metrics, benchmarks, and studies. We further
explore strategies for enhancing LLM factuality, including approaches tailored
for specific domains. We focus on two primary LLM configurations, standalone LLMs
and Retrieval-Augmented LLMs that utilize external data, and detail their
unique challenges and potential enhancements. Our survey offers a structured
guide for researchers aiming to fortify the factual reliability of LLMs. Comment: 62 pages; 300+ references
Phylogeny of the Infraorder Pentatomomorpha Based on Fossil and Extant Morphology, with Description of a New Fossil Family from China
Background: An extinct new family of Pentatomomorpha, Venicoridae Yao, Ren & Cai fam. nov., with 2 new genera and 2 new species (Venicoris solaris Yao, Ren & Rider gen. & sp. nov. and Clavaticoris zhengi Yao, Ren & Cai gen. & sp. nov.), is described from the Early Cretaceous Yixian Formation in Northeast China. Methodology/Principal Findings: A cladistic analysis based on a combination of fossil and extant morphological characters clarified the phylogenetic status of the new family and has allowed the reconstruction of intersuperfamily and interfamily relationships within the Infraorder Pentatomomorpha. The fossil record and diversity of Pentatomomorpha during the Mesozoic are discussed. Conclusions/Significance: Pentatomomorpha is a monophyletic group; Aradoidea and the Trichophora are sister groups; these fossils belong to a new family, treated as the sister group of the remainder of Trichophora; Pentatomoidea is a monophyletic group; Piesmatidae should be separated as a superfamily, Piesmatoidea. The origin of Pentatomomorpha should be traced back to the Middle or Early Triassic.
Prevotella genus and its related NOD-like receptor signaling pathway in young males with stage III periodontitis
Background: As periodontitis progresses, the oral microbiota community changes dynamically. In this study, we evaluated the dominant bacteria and their roles in the potential pathway in young males with stage III periodontitis. Methods: 16S rRNA sequencing was performed to evaluate variations in the composition of oral bacteria between males with stage I and III periodontitis and identify the dominant bacteria of each group. Function prediction was obtained based on 16S rRNA sequencing data. The inhibitor of the predominant pathway for stage III periodontitis was used to investigate the role of the dominant bacteria in periodontitis in vivo and in vitro. Results: Chao1 index, Observed Species and Phylogenetic Diversity (PD) whole tree values were significantly higher in the stage III periodontitis group. β-diversity suggested that samples could be divided according to the stages of periodontitis. The dominant bacteria in stage III periodontitis were Prevotella, Prevotella_7, and Dialister, whereas that in stage I periodontitis was Cardiobacterium. KEGG analysis predicted that variations in the oral microbiome may be related to the NOD-like receptor signaling pathway. The inhibitor of this pathway, NOD-IN-1, decreased P. intermedia-induced Tnf-α mRNA expression and increased P. intermedia-induced Il-6 mRNA expression, consistent with the ELISA results. Immunohistochemistry confirmed the down-regulation of TNF-α and IL-6 expressions by NOD-IN-1 in P. intermedia-induced periodontitis. Conclusion: The composition of the oral bacteria in young males varied according to the stage of periodontitis. The species richness of oral microbiota was greater in young males with stage III periodontitis than in those with stage I periodontitis. Prevotella was the dominant bacteria in young males with stage III periodontitis, and inhibition of the NOD-like receptor signaling pathway can decrease the periodontal inflammation induced by P. intermedia.
SPTAN1/Numb Axis Senses Cell Density To Restrain Cell Growth and Oncogenesis Through Hippo Signaling
The loss of contact inhibition is a key step during carcinogenesis. The Hippo-Yes-associated protein (Hippo/YAP) pathway is an important regulator of cell growth in a cell density-dependent manner. However, how Hippo signaling senses cell density in this context remains elusive. Here, we report that high cell density induced the phosphorylation of spectrin α chain, nonerythrocytic 1 (SPTAN1), a plasma membrane-stabilizing protein, to recruit NUMB endocytic adaptor protein isoforms 1 and 2 (NUMB1/2), which further sequestered microtubule affinity-regulating kinases (MARKs) in the plasma membrane and rendered them inaccessible for phosphorylation and inhibition of the Hippo kinases sterile 20-like kinases MST1 and MST2 (MST1/2). WW45 interaction with MST1/2 was thereby enhanced, resulting in the activation of Hippo signaling to block YAP activity for cell contact inhibition. Importantly, low cell density led to SPTAN1 dephosphorylation and NUMB cytoplasmic location, along with MST1/2 inhibition and, consequently, YAP activation. Moreover, double KO of NUMB and WW45 in the liver led to appreciable organ enlargement and rapid tumorigenesis. Interestingly, NUMB isoforms 3 and 4, which have a truncated phosphotyrosine-binding (PTB) domain and are thus unable to interact with phosphorylated SPTAN1 and activate MST1/2, were selectively upregulated in liver cancer, which correlated with YAP activation. We have thus revealed a SPTAN1/NUMB1/2 axis that acts as a cell density sensor to restrain cell growth and oncogenesis by coupling external cell-cell contact signals to intracellular Hippo signaling
Socializing One Health: an innovative strategy to investigate social and behavioral risks of emerging viral threats
In an effort to strengthen global capacity to prevent, detect, and control infectious diseases in animals and people, the United States Agency for International Development’s (USAID) Emerging Pandemic Threats (EPT) PREDICT project funded development of regional, national, and local One Health capacities for early disease detection, rapid response, disease control, and risk reduction. From the outset, the EPT approach was inclusive of social science research methods designed to understand the contexts and behaviors of communities living and working at human-animal-environment interfaces considered high-risk for virus emergence. Using qualitative and quantitative approaches, PREDICT behavioral research aimed to identify and assess a range of socio-cultural behaviors that could be influential in zoonotic disease emergence, amplification, and transmission. This broad approach to behavioral risk characterization enabled us to identify and characterize human activities that could be linked to the transmission dynamics of new and emerging viruses. This paper provides a discussion of implementation of a social science approach within a zoonotic surveillance framework. We conducted in-depth ethnographic interviews and focus groups to better understand the individual- and community-level knowledge, attitudes, and practices that potentially put participants at risk for zoonotic disease transmission from the animals they live and work with, across 6 interface domains. When we asked highly exposed individuals (i.e., bushmeat hunters, wildlife or guano farmers) about the risk they perceived in their occupational activities, most did not perceive it to be risky, whether because it was normalized by years (or generations) of doing such an activity, or due to lack of information about potential risks.
Integrating the social sciences allows investigations of the specific human activities that are hypothesized to drive disease emergence, amplification, and transmission, in order to better substantiate behavioral disease drivers, along with the social dimensions of infection and transmission dynamics. Understanding these dynamics is critical to achieving health security (the protection from threats to health), which requires investments in both collective and individual health security. Integrating behavioral sciences into zoonotic disease surveillance allowed us to push toward fuller community integration and engagement and toward dialogue and implementation of recommendations for disease prevention and improved health security.