Indirect supervision for relation extraction using question-answer pairs
Automatic relation extraction (RE) for relation types of interest is essential for interpreting massive text corpora efficiently. For example, we want to identify the relationship "president_of" between the entities "Donald Trump" and "United States" in a sentence expressing such a relation. Traditional RE models have relied heavily on human-annotated corpora for training; generating labeled data is costly and becomes an obstacle when dealing with more relation types. Thus, many RE systems have shifted to training data automatically acquired by linking to knowledge bases (distant supervision). However, due to the incompleteness of knowledge bases and context-agnostic labeling, the training data collected via distant supervision (DS) can be very noisy. In recent years, as increasing attention has been paid to question-answering (QA) tasks, user feedback and datasets for such tasks have become more accessible. In this paper, we propose a novel framework, ReQuest, that leverages question-answer pairs as an indirect source of supervision for relation extraction, and we study how to use such supervision to reduce the noise induced by DS. Our model jointly embeds relation mentions, types, QA entity mention pairs, and text features in two low-dimensional spaces (RE and QA), where objects with the same relation types or semantically similar question-answer pairs have similar representations. Shared features connect the two spaces, carrying clearer semantic knowledge from both sources. ReQuest then uses these learned embeddings to estimate the types of test relation mentions. We formulate a global objective function and adopt a novel margin-based QA loss to reduce noise in DS by exploiting semantic evidence from the QA dataset. Our experiments show an average 11% improvement in F1 score on two public RE datasets combined with the TREC QA dataset. Code and datasets can be downloaded at https://github.com/ellenmellon/ReQuest
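The margin-based QA loss admits a compact sketch. Below is a minimal, illustrative PyTorch reconstruction of the idea, not the paper's exact objective; the cosine similarity, the margin value, and the batch shapes are all assumptions.

```python
import torch
import torch.nn.functional as F

def margin_qa_loss(rel_emb, qa_pos_emb, qa_neg_emb, margin=1.0):
    """Hinge loss: a relation-mention embedding should sit closer to a
    semantically similar QA-pair embedding than to a dissimilar one."""
    sim_pos = F.cosine_similarity(rel_emb, qa_pos_emb, dim=-1)
    sim_neg = F.cosine_similarity(rel_emb, qa_neg_emb, dim=-1)
    return F.relu(margin - sim_pos + sim_neg).mean()

# Toy usage: a batch of 4 relation mentions in a 50-dimensional shared space.
rel = torch.randn(4, 50, requires_grad=True)
loss = margin_qa_loss(rel, torch.randn(4, 50), torch.randn(4, 50))
loss.backward()  # gradients flow back into the shared embeddings
```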
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Despite their remarkable capabilities, large language models (LLMs) often
produce responses containing factual inaccuracies due to their sole reliance on
the parametric knowledge they encapsulate. Retrieval-Augmented Generation
(RAG), an ad hoc approach that augments LMs with retrieval of relevant
knowledge, mitigates such issues. However, indiscriminately retrieving and
incorporating a fixed number of retrieved passages, regardless of whether
retrieval is necessary or the passages are relevant, diminishes LM versatility or
can lead to unhelpful response generation. We introduce a new framework called
Self-Reflective Retrieval-Augmented Generation (Self-RAG) that enhances an LM's
quality and factuality through retrieval and self-reflection. Our framework
trains a single arbitrary LM that adaptively retrieves passages on-demand, and
generates and reflects on retrieved passages and its own generations using
special tokens, called reflection tokens. Generating reflection tokens makes
the LM controllable during the inference phase, enabling it to tailor its
behavior to diverse task requirements. Experiments show that Self-RAG (7B and
13B parameters) significantly outperforms state-of-the-art LLMs and
retrieval-augmented models on a diverse set of tasks. Specifically, Self-RAG
outperforms ChatGPT and retrieval-augmented Llama2-chat on Open-domain QA,
reasoning and fact verification tasks, and it shows significant gains in
improving factuality and citation accuracy for long-form generations relative
to these models.
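To make the reflection-token mechanism concrete, here is a hedged Python sketch of the inference loop, assuming a hypothetical `lm` object exposing `predict_token`, `generate`, and `score_token`, and a `retriever` with a `search` method; the token names and scoring are simplifications of what the paper defines.

```python
def self_rag_generate(lm, retriever, query, max_segments=5):
    """Segment-by-segment decoding controlled by reflection tokens."""
    output = []
    for _ in range(max_segments):
        # The LM itself predicts whether this step needs retrieval.
        if lm.predict_token(query, output) == "[Retrieve]":
            candidates = []
            for passage in retriever.search(query, k=3):
                segment = lm.generate(query, output, passage=passage)
                # Reflection tokens critique relevance, support, and utility.
                score = (lm.score_token(segment, "[IsRel]")
                         + lm.score_token(segment, "[IsSup]")
                         + lm.score_token(segment, "[IsUse]"))
                candidates.append((score, segment))
            _, segment = max(candidates)          # keep the best-critiqued draft
        else:
            segment = lm.generate(query, output)  # parametric-only step
        output.append(segment)
        if segment.endswith("</s>"):              # the LM signals completion
            break
    return " ".join(output)
```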
CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning
Compared to standard retrieval tasks, passage retrieval for conversational
question answering (CQA) poses new challenges in understanding the current user
question, as each question needs to be interpreted within the dialogue context.
Moreover, it can be expensive to re-train well-established retrievers such as
search engines that are originally developed for non-conversational queries. To
facilitate their use, we develop CONQRR, a query rewriting model that rewrites
a conversational question in its dialogue context into a standalone question. It is
trained with a novel reward function to directly optimize towards retrieval
using reinforcement learning and can be adapted to any off-the-shelf retriever.
We show that CONQRR achieves state-of-the-art results on a recent open-domain
CQA dataset containing conversations from three different sources, and is
effective for two different off-the-shelf retrievers. Our extensive analysis
also shows the robustness of CONQRR to out-of-domain dialogues as well as to
zero query rewriting supervision.
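The reward-driven training loop can be sketched as a single REINFORCE step. This is an illustrative PyTorch-style reconstruction, not CONQRR's actual implementation: `rewriter.sample`, `retriever.search`, and the binary hit@k reward are assumptions standing in for the paper's retrieval-based reward.

```python
def reinforce_step(rewriter, retriever, dialogue, gold_passage, optimizer, k=10):
    """One policy-gradient update: reward rewrites that retrieve well."""
    # Sample a standalone rewrite of the in-context user question.
    rewrite, log_prob = rewriter.sample(dialogue)  # log_prob: differentiable scalar
    # The reward comes from a frozen, off-the-shelf retriever.
    reward = 1.0 if gold_passage in retriever.search(rewrite, k=k) else 0.0
    # REINFORCE: raise the log-probability of high-reward rewrites.
    loss = -reward * log_prob
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```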
A Controllable Model of Grounded Response Generation
Current end-to-end neural conversation models inherently lack the flexibility
to impose semantic control in the response generation process, often resulting
in uninteresting responses. Attempts to boost informativeness alone come at the
expense of factual accuracy, as attested by pretrained language models'
propensity to "hallucinate" facts. While this may be mitigated by access to
background knowledge, there is scant guarantee of relevance and informativeness
in generated responses. We propose a framework that we call controllable
grounded response generation (CGRG), in which lexical control phrases are
either provided by a user or automatically extracted by a control phrase
predictor from dialogue context and grounding knowledge. Quantitative and
qualitative results show that, using this framework, a transformer-based model
with a novel inductive attention mechanism, trained on a conversation-like
Reddit dataset, outperforms strong generation baselines.
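One way to picture inductive attention is as a mask over the flattened input [context | control phrases | grounding]. The sketch below is a simplified assumption of the idea, hiding grounding tokens that no control phrase links to; the paper's actual masking rule is more involved.

```python
import numpy as np

def inductive_attention_mask(n_ctx, n_ctrl, n_ground, links):
    """Boolean (seq x seq) mask over [context | controls | grounding].
    Grounding tokens are attendable only if linked to a control phrase."""
    n = n_ctx + n_ctrl + n_ground
    visible = np.zeros((n, n), dtype=bool)
    visible[:, :n_ctx + n_ctrl] = True        # context and controls: always visible
    for ground_positions in links.values():   # links: {control_idx: [grounding_idx]}
        for g in ground_positions:
            visible[:, n_ctx + n_ctrl + g] = True
    return np.tril(visible)                   # keep the causal (left-to-right) order

# Toy layout: 6 context, 2 control, 4 grounding tokens; control phrase 0 links
# to grounding tokens 1 and 2, so grounding tokens 0 and 3 stay masked out.
mask = inductive_attention_mask(6, 2, 4, {0: [1, 2]})
```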
DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations
Applications that could benefit from automatic understanding of human-human
conversations often come with challenges associated with private information in
real-world data such as call center or clinical conversations. Working with
protected data also increases costs of annotation, which limits technology
development. To address these challenges, we propose DIALGEN, a
human-in-the-loop semi-automated dialogue generation framework. DIALGEN uses a
language model (ChatGPT) that can follow schema and style specifications to
produce fluent conversational text, building a complex conversation by
iteratively generating subdialogues and using human feedback to correct
inconsistencies or redirect the flow. In experiments on structured
summarization of agent-client information gathering calls, framed as dialogue
state tracking, we show that DIALGEN data enables significant improvement in
model performance.
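The workflow reduces to a short human-in-the-loop generation loop. The sketch below is illustrative only: `call_lm` stands in for a chat-model API call (e.g., to ChatGPT), and the accept/edit/redirect interface is an assumption about how the human feedback step might look.

```python
def generate_dialogue(call_lm, schema, style, max_subdialogues=8):
    """Draft subdialogues iteratively; a human accepts, edits, or redirects."""
    history, steering = [], ""
    for _ in range(max_subdialogues):
        prompt = (f"Schema: {schema}\nStyle: {style}\n{steering}\n"
                  "Dialogue so far:\n" + "\n".join(history) +
                  "\nWrite the next subdialogue:")
        draft = call_lm(prompt)
        choice = input(f"--- draft ---\n{draft}\n[a]ccept / [e]dit / [r]edirect? ")
        if choice == "e":
            draft = input("Corrected subdialogue: ")  # human fixes inconsistencies
        elif choice == "r":
            steering = input("New instruction: ")     # human redirects the flow
            continue                                  # regenerate with new steering
        history.append(draft)
    return "\n".join(history)
```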