Knowledge and pre-trained language models inside and out: a deep-dive into datasets and external knowledge
Pre-trained Language Models (PLMs) have greatly advanced the performance of various NLP tasks and undoubtedly serve as foundation models for the field. These pre-trained models capture rich semantic patterns from large-scale text corpora and learn high-quality representations of text. However, such models still have shortcomings: they underperform on tasks that require implicit external knowledge, which is difficult to acquire with commonly employed pre-training objectives. Moreover, a comprehensive understanding of PLMs' behaviour in learning knowledge during the fine-tuning phase is still lacking. To address these challenges, we propose a set of approaches for injecting external knowledge into PLMs and present experiments investigating their behaviour in learning knowledge during fine-tuning, focusing primarily on Sentiment Analysis, Question Answering and Video Question Answering.
Specifically, we introduce novel approaches that explicitly use the textual historical reviews of users and products to improve sentiment analysis. To overcome context-question lexical overlap and data scarcity in question generation, we propose a novel method that combines linguistic and semantic knowledge with heuristics. Additionally, we explore how multimodal (visual and acoustic) knowledge can be exploited to improve Video Question Answering.
Experiments conducted on benchmark datasets show that our proposed approaches outperform state-of-the-art models, demonstrating the effectiveness of our methods for injecting external knowledge. Furthermore, we conduct a set of experiments investigating how PLMs learn knowledge for question answering under various scenarios. The results reveal that the internal characteristics of QA datasets can introduce strong biases when PLMs learn from downstream task datasets. Finally, we present an in-depth discussion of future directions for improving PLMs with external knowledge.
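As a concrete illustration of the kind of knowledge injection the abstract describes, the sketch below prepends user and product review history to the input of a PLM sentiment classifier. It is a minimal sketch only: the model name, the separator scheme and the three-class setup are assumptions for illustration, not the thesis's actual method.

```python
# Minimal sketch: inject user/product review history into a PLM sentiment
# classifier by encoding it as a second text segment. Model choice and
# separator scheme are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

def classify(review, user_history, product_history):
    """Encode the review together with historical reviews as external context."""
    context = " [SEP] ".join(user_history + product_history)
    inputs = tokenizer(context, review, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return logits.argmax(dim=-1).item()  # predicted sentiment class
```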
ELAINE: rELiAbility and evIdence-aware News vErifier
Disinformation is one of the main problems of today's society, in particular the viral spread of fake news. This research presents ELAINE, a hybrid approach to detecting the veracity of news items that combines content-reliability information with external evidence. The external evidence is extracted from a scientific knowledge base containing medical information associated with the coronavirus, organised in a knowledge graph built from the CORD-19 corpus. The knowledge base is queried through Natural Language Question Answering, and a set of evidence items is extracted and their relevance measured. By combining reliability and evidence information, the veracity of news items can be predicted, improving both accuracy and F1 over using reliability information alone. These results show that the presented approach is promising for the veracity-detection task.
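To make the reliability-plus-evidence fusion concrete, the following is a hypothetical sketch of that late fusion: a content-reliability score and the relevance of knowledge-graph evidence are combined into a single feature vector for a veracity classifier. The `kg_qa` object, the feature names and the classifier choice are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of ELAINE-style fusion of reliability and evidence signals.
import numpy as np
from sklearn.linear_model import LogisticRegression

def veracity_features(news_item, kg_qa, k=5):
    """Combine a content-reliability score with evidence relevance from the knowledge graph."""
    reliability = news_item["reliability_score"]         # content-based score in [0, 1]
    evidence = kg_qa.answer(news_item["claim"], k=k)      # hypothetical QA over the CORD-19 graph
    relevance = [e["relevance"] for e in evidence]
    return np.array([reliability,
                     max(relevance, default=0.0),
                     float(np.mean(relevance)) if relevance else 0.0])

# With labelled items, a simple classifier predicts veracity from the fused features:
# X = np.stack([veracity_features(item, kg_qa) for item in train_items])
# clf = LogisticRegression().fit(X, y)                    # y: 1 = true, 0 = fake
```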
Huatuo-26M, a Large-scale Chinese Medical QA Dataset
In this paper, we release the largest medical Question Answering (QA) dataset to date, with 26 million QA pairs. We benchmark many existing approaches on our dataset in terms of both retrieval and generation. Experimental results show that existing models perform far below expectations and that the released dataset remains challenging in the pre-trained language model era. Moreover, we experimentally demonstrate the benefit of the proposed dataset in several respects: (i) training models that transfer to other QA datasets in a zero-shot fashion; (ii) serving as external knowledge for retrieval-augmented generation (RAG); and (iii) improving existing pre-trained language models by using the QA pairs as a continued pre-training corpus. We believe that this dataset will not only contribute to medical research but also benefit both patients and clinicians. See https://github.com/FreedomIntelligence/Huatuo-26M
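To illustrate use case (ii), the sketch below treats a QA dataset such as Huatuo-26M as an external knowledge store for retrieval-augmented generation: retrieve the stored QA pairs whose questions are closest to the query and prepend them to the prompt. The TF-IDF retriever, the toy QA pairs and the `some_llm.generate` call are placeholders, not the authors' pipeline.

```python
# Minimal RAG sketch over a QA dataset used as an external knowledge store.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

qa_pairs = [("What are common COVID-19 symptoms?", "Fever, cough, and fatigue ..."),
            ("How is hypertension treated?", "Lifestyle changes and antihypertensive drugs ...")]

vectorizer = TfidfVectorizer()
index = vectorizer.fit_transform(q for q, _ in qa_pairs)   # index the stored questions

def retrieve(query, k=2):
    """Return the k QA pairs whose questions are most similar to the query."""
    scores = cosine_similarity(vectorizer.transform([query]), index)[0]
    return [qa_pairs[i] for i in scores.argsort()[::-1][:k]]

def build_prompt(query):
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in retrieve(query))
    return f"{context}\n\nAnswer the question using the context above.\nQ: {query}\nA:"

# answer = some_llm.generate(build_prompt("What relieves a persistent cough?"))  # hypothetical LLM call
```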
Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation
Knowledge-intensive tasks (e.g., open-domain question answering (QA)) require a substantial amount of factual knowledge and often rely on external information for assistance. Recently, large language models (LLMs) such as ChatGPT have demonstrated impressive prowess in solving a wide range of tasks with world knowledge, including knowledge-intensive tasks. However, it remains unclear how well LLMs are able to perceive their factual knowledge boundaries, particularly how they behave when incorporating retrieval augmentation. In this study, we present an initial analysis of the factual knowledge boundaries of LLMs and of how retrieval augmentation affects LLMs on open-domain QA. Specifically, we focus on three primary research questions and analyse them by examining the QA performance, priori judgement and posteriori judgement of LLMs. We show evidence that LLMs possess unwavering confidence in their ability to answer questions and in the accuracy of their responses. Furthermore, retrieval augmentation proves to be an effective approach for enhancing LLMs' awareness of knowledge boundaries, thereby improving their judgemental abilities. Additionally, we find that LLMs tend to rely on the provided retrieval results when formulating answers, and the quality of these results significantly affects that reliance. The code to reproduce this work is available at https://github.com/RUCAIBox/LLM-Knowledge-Boundary
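The sketch below shows, in the spirit of the priori/posteriori judgement analysis described above, how such probes can be framed as prompts: ask the model whether it can answer before seeing evidence, answer with retrieved passages, then ask it to judge its own answer. The `llm` callable and the exact prompt wording are assumptions, not the authors' code.

```python
# Illustrative knowledge-boundary probes; `llm` stands in for any chat-completion call.
def priori_judgement(llm, question):
    """Ask the model, before answering, whether it can answer from its own knowledge."""
    return llm(f"Can you answer the following question based only on your own knowledge? "
               f"Reply 'yes' or 'no'.\nQuestion: {question}")

def augmented_answer(llm, question, passages):
    """Answer with retrieved passages prepended to the prompt."""
    context = "\n".join(passages)
    return llm(f"Context:\n{context}\n\nAnswer the question using the context.\nQuestion: {question}")

def posteriori_judgement(llm, question, answer):
    """Ask the model, after answering, whether its answer was correct."""
    return llm(f"Question: {question}\nYour answer: {answer}\n"
               f"Is this answer correct? Reply 'yes' or 'no'.")
```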
Scene Graph Generation with External Knowledge and Image Reconstruction
Scene graph generation has received growing attention with advances in image-understanding tasks such as object detection and attribute and relationship prediction. However, existing datasets are biased in terms of object and relationship labels, or often come with noisy and missing annotations, which makes developing a reliable scene graph prediction model very challenging. In this paper, we propose a novel scene graph generation algorithm with external knowledge and an image reconstruction loss to overcome these dataset issues. In particular, we extract commonsense knowledge from an external knowledge base to refine object and phrase features, improving generalisability in scene graph generation. To address the bias from noisy object annotations, we introduce an auxiliary image reconstruction path that regularises the scene graph generation network. Extensive experiments show that our framework generates better scene graphs, achieving state-of-the-art performance on two benchmark datasets: Visual Relationship Detection and Visual Genome.
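As a toy illustration of the knowledge-based feature refinement described above, the sketch below fuses a visual object feature with a knowledge-base embedding of the predicted label through a learned gate. The dimensions, the gating scheme and the ConceptNet-style embeddings are illustrative assumptions, not the paper's model.

```python
# Toy sketch of refining object features with external knowledge embeddings.
import torch
import torch.nn as nn

class KnowledgeRefine(nn.Module):
    def __init__(self, visual_dim=512, kb_dim=300):
        super().__init__()
        self.project = nn.Linear(kb_dim, visual_dim)        # map KB embedding into visual space
        self.gate = nn.Linear(visual_dim * 2, visual_dim)   # decide how much knowledge to inject

    def forward(self, visual_feat, kb_embed):
        knowledge = self.project(kb_embed)
        g = torch.sigmoid(self.gate(torch.cat([visual_feat, knowledge], dim=-1)))
        return visual_feat + g * knowledge                   # refined object feature

# refined = KnowledgeRefine()(object_features, kb_vectors)   # shapes: (N, 512), (N, 300)
```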
- …