Schema2QA: High-Quality and Low-Cost Q&A Agents for the Structured Web
Building a question-answering agent currently requires large annotated
datasets, which are prohibitively expensive. This paper proposes Schema2QA, an
open-source toolkit that can generate a Q&A system from a database schema
augmented with a few annotations for each field. The key concept is to cover
the space of possible compound queries on the database with a large number of
in-domain questions synthesized with the help of a corpus of generic query
templates. The synthesized data and a small paraphrase set are used to train a
novel neural network based on the BERT pretrained model. We use Schema2QA to
generate Q&A systems for five Schema.org domains (restaurants, people, movies,
books, and music), and obtain an overall accuracy between 64% and 75% on
crowdsourced questions for these domains. Once annotations and paraphrases are
obtained for a Schema.org schema, no additional manual effort is needed to
create a Q&A agent for any website that uses the same schema. Furthermore, we
demonstrate that learning can be transferred from the restaurant to the hotel
domain, obtaining a 64% accuracy on crowdsourced questions with no manual
effort. Schema2QA achieves an accuracy of 60% on popular restaurant questions
that can be answered using Schema.org. Its performance is comparable to Google
Assistant, 7% lower than Siri, and 15% higher than Alexa. It outperforms all
these assistants by at least 18% on more complex, long-tail questions.
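The core idea in the Schema2QA abstract, covering the query space by instantiating generic templates with per-field annotations, can be sketched in a few lines. The template strings, the annotation format, and the `synthesize` helper below are all illustrative stand-ins, not the toolkit's actual interfaces:

```python
# A minimal sketch of Schema2QA-style question synthesis: generic query
# templates are crossed with annotated schema fields and their values.
# All names (templates, annotation keys) are hypothetical, for illustration.
templates = [
    "show me {domain} with {field_phrase} {value}",
    "which {domain} have {field_phrase} {value} ?",
]

# Hypothetical annotations for two Schema.org-like restaurant fields.
schema = {
    "domain": "restaurants",
    "fields": {
        "servesCuisine": {"field_phrase": "cuisine", "values": ["italian", "thai"]},
        "priceRange": {"field_phrase": "price range", "values": ["cheap"]},
    },
}

def synthesize(schema, templates):
    """Cross every template with every annotated field/value pair."""
    questions = []
    for field, ann in schema["fields"].items():
        for value in ann["values"]:
            for t in templates:
                questions.append(
                    t.format(domain=schema["domain"],
                             field_phrase=ann["field_phrase"],
                             value=value)
                )
    return questions

qs = synthesize(schema, templates)
print(len(qs))  # prints 6 (3 field/value pairs x 2 templates)
```

In the real toolkit the synthesized questions then train a neural semantic parser; this sketch only shows why a small annotation effort can cover a large question space.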
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
This paper presents the first few-shot LLM-based chatbot that almost never
hallucinates and has high conversationality and low latency. WikiChat is
grounded on the English Wikipedia, the largest curated free-text corpus.
WikiChat generates a response from an LLM, retains only the grounded facts,
and combines them with additional information it retrieves from the corpus to
form factual and engaging responses. We distill WikiChat based on GPT-4 into a
7B-parameter LLaMA model with minimal loss of quality, to significantly improve
its latency, cost and privacy, and facilitate research and deployment.
Using a novel hybrid human-and-LLM evaluation methodology, we show that our
best system achieves 97.3% factual accuracy in simulated conversations. It
significantly outperforms all retrieval-based and LLM-based baselines, and
exceeds GPT-4 by 3.9%, 38.6%, and 51.0% on head, tail, and recent knowledge,
respectively.
Compared to previous state-of-the-art retrieval-based chatbots, WikiChat is
also significantly more informative and engaging, just like an LLM.
WikiChat achieves 97.9% factual accuracy in conversations with human users
about recent topics, 55.0% better than GPT-4, while receiving significantly
higher user ratings and more favorable comments.
Comment: Findings of EMNLP 2023
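The WikiChat abstract describes a ground-then-augment loop: draft with an LLM, keep only the claims grounded in the corpus, and add retrieved facts. The sketch below mimics that control flow with toy stand-ins; `llm_draft`, the two-document corpus, and substring-based grounding replace the paper's actual GPT-4/LLaMA calls and Wikipedia retrieval:

```python
# Schematic WikiChat-style pipeline with hypothetical stand-in components.
CORPUS = [
    "Mount Everest is 8,849 metres tall.",
    "Everest lies on the China-Nepal border.",
]

def llm_draft(query):
    # Stand-in for an LLM call: returns one grounded and one
    # hallucinated claim, to show the filtering step at work.
    return ["Mount Everest is 8,849 metres tall.",
            "Everest was first mapped in 1802."]

def is_grounded(claim, corpus):
    # Toy grounding check; the real system verifies claims against
    # retrieved Wikipedia passages.
    return any(claim in doc or doc in claim for doc in corpus)

def retrieve(query, corpus):
    # Toy keyword retrieval over the corpus.
    terms = query.lower().replace("?", "").split()
    return [d for d in corpus if any(t in d.lower() for t in terms)]

def respond(query):
    facts = [c for c in llm_draft(query) if is_grounded(c, CORPUS)]
    extra = [d for d in retrieve(query, CORPUS) if d not in facts]
    return facts + extra

print(respond("How tall is Everest?"))
```

The hallucinated 1802 claim is dropped by the grounding filter, while retrieval adds a corpus fact the draft missed, which is the mechanism the abstract credits for the high factual accuracy.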
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models
We study how to apply large language models to write grounded and organized
long-form articles from scratch, with comparable breadth and depth to Wikipedia
pages. This underexplored problem poses new challenges at the pre-writing
stage, including how to research the topic and prepare an outline prior to
writing. We propose STORM, a writing system for the Synthesis of Topic Outlines
through Retrieval and Multi-perspective Question Asking. STORM models the
pre-writing stage by (1) discovering diverse perspectives in researching the
given topic, (2) simulating conversations where writers carrying different
perspectives pose questions to a topic expert grounded on trusted Internet
sources, (3) curating the collected information to create an outline.
For evaluation, we curate FreshWiki, a dataset of recent high-quality
Wikipedia articles, and formulate outline assessments to evaluate the
pre-writing stage. We further gather feedback from experienced Wikipedia
editors. Compared to articles generated by an outline-driven
retrieval-augmented baseline, more of STORM's articles are deemed to be
organized (by a 25% absolute increase) and broad in coverage (by 10%). The
expert feedback also helps identify new challenges for generating grounded long
articles, such as source bias transfer and over-association of unrelated facts.
Comment: 27 pages, NAACL 2024 Main Conference
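STORM's three pre-writing steps can be mocked up end to end. Everything below, the perspective list, the canned questions, the grounded-answer lookup, and the outline format, is a hypothetical stand-in for the paper's LLM- and retrieval-driven components:

```python
# Toy walk-through of STORM's pre-writing stage: (1) discover perspectives,
# (2) simulate perspective-guided questions to a grounded expert,
# (3) curate the answers into an article outline.
TOPIC = "solar eclipses"

def discover_perspectives(topic):
    # Stand-in: the real system mines related articles for viewpoints.
    return ["astronomer", "historian"]

QUESTIONS = {  # hypothetical perspective-specific questions
    "astronomer": "What causes a total solar eclipse?",
    "historian": "How were eclipses recorded in antiquity?",
}

SOURCES = {  # trusted snippets the simulated expert answers from
    "What causes a total solar eclipse?":
        "The Moon fully covers the Sun's disk.",
    "How were eclipses recorded in antiquity?":
        "Babylonian tablets list eclipse dates.",
}

def simulate_conversation(perspective):
    q = QUESTIONS[perspective]
    return (q, SOURCES[q])  # expert is grounded on trusted sources only

def curate_outline(topic, qa_pairs):
    outline = ["# " + topic]
    for q, a in qa_pairs:
        outline.append("## " + q)
        outline.append("- " + a)
    return outline

qa = [simulate_conversation(p) for p in discover_perspectives(TOPIC)]
outline = curate_outline(TOPIC, qa)
print("\n".join(outline))
```

The point of the sketch is the division of labor: perspectives diversify the questions, grounding constrains the answers, and curation turns the Q&A trace into the outline that drives writing.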
SUQL: Conversational Search over Structured and Unstructured Data with Large Language Models
While most conversational agents are grounded on either free-text or
structured knowledge, many knowledge corpora consist of hybrid sources. This
paper presents the first conversational agent that supports the full generality
of hybrid data access for large knowledge corpora, through a language we
developed called SUQL (Structured and Unstructured Query Language).
Specifically, SUQL extends SQL with free-text primitives (summary and answer),
so information retrieval can be composed with structured data accesses
arbitrarily in a formal, succinct, precise, and interpretable notation. With
SUQL, we propose the first semantic parser, an LLM with in-context learning,
that can handle hybrid data sources.
Our in-context learning-based approach, when applied to the HybridQA dataset,
comes within 8.9% exact match and 7.1% F1 of the SOTA, which was trained on 62K
data samples. More significantly, unlike previous approaches, our technique is
applicable to large databases and free-text corpora. We introduce a dataset
consisting of crowdsourced questions and conversations on Yelp, a large, real
restaurant knowledge base with structured and unstructured data. We show that
our few-shot conversational agent based on SUQL finds an entity satisfying all
user requirements 90.3% of the time, compared to 63.4% for a baseline based on
linearization.
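The SUQL abstract's key idea, composing free-text primitives with ordinary SQL, can be imitated with a user-defined SQL function. The sketch below registers a toy `answer` predicate in SQLite; the keyword-matching body is a hypothetical stand-in for SUQL's LLM/retrieval-backed primitive, and the table and data are invented:

```python
import sqlite3

def answer(text, question):
    # Toy stand-in for SUQL's free-text `answer` primitive: naive keyword
    # matching, where the real system would consult an LLM over the text.
    keywords = [w for w in question.lower().split() if len(w) > 3]
    return "yes" if any(k in text.lower() for k in keywords) else "no"

conn = sqlite3.connect(":memory:")
conn.create_function("answer", 2, answer)
conn.execute("CREATE TABLE restaurants (name TEXT, rating REAL, reviews TEXT)")
conn.executemany(
    "INSERT INTO restaurants VALUES (?, ?, ?)",
    [
        ("Luigi's", 4.5, "Great vegan options and a quiet patio."),
        ("Grill 21", 4.7, "Best steak in town, loud on weekends."),
    ],
)
# A structured filter (rating) composed arbitrarily with a free-text
# predicate over reviews, in the spirit of a SUQL query.
rows = conn.execute(
    "SELECT name FROM restaurants "
    "WHERE rating > 4.0 AND answer(reviews, 'vegan options') = 'yes'"
).fetchall()
print(rows)  # [("Luigi's",)]
```

This is what makes the notation "formal, succinct, precise, and interpretable": the free-text operation is just another predicate the query planner can combine with structured conditions.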
Heart failure around the world
With increasingly large sample sizes required to demonstrate event reduction, heart failure outcome trials are no longer being performed in a small group of selected patients and countries, but at a global scale, with worldwide contribution of patients from countries with considerable differences in background therapy, socioeconomic status and healthcare practices. Recent studies have highlighted how socioeconomic determinants rather than geographical factors may underlie the heterogeneity of patient populations across the globe. Therefore, in this review, we evaluated (i) regional differences in patient characteristics and outcomes in recent epidemiologic studies; (ii) regional differences in worldwide representativeness of clinical trial populations; and (iii) the role of socioeconomic determinants in driving country differences in heart failure trial enrolment and clinical outcomes.
The Effect of Epstein-Barr Virus Latent Membrane Protein 2 Expression on the Kinetics of Early B Cell Infection
Infection of human B cells with wild-type Epstein-Barr virus (EBV) in vitro leads to activation and proliferation that result in efficient production of lymphoblastoid cell lines (LCLs). Latent Membrane Protein 2 (LMP2) is expressed early after infection, and previous research has suggested a possible role in this process. Therefore, we generated recombinant EBVs with knockouts of either or both protein isoforms, LMP2A and LMP2B (Δ2A, Δ2B, Δ2A/Δ2B), to study the effect of LMP2 in early B cell infection. Infection of B cells with Δ2A and Δ2A/Δ2B viruses led to a marked decrease in activation and proliferation relative to wild-type (wt) viruses, and resulted in higher percentages of apoptotic B cells. Δ2B virus infection showed activation levels comparable to wt, but fewer proliferating B cells. Early B cell infection with wt, Δ2A and Δ2B viruses did not result in changes in latent gene expression, with the exception of elevated LMP2B transcript in Δ2A virus infection. Infection with Δ2A and Δ2B viruses did not affect viral latency, determined by changes in LMP1/Zebra expression following BCR stimulation. However, BCR stimulation of Δ2A/Δ2B cells resulted in decreased LMP1 expression, which suggests loss of stability in viral latency. Long-term outgrowth assays revealed that LMP2A, but not LMP2B, is critical for efficient long-term growth of B cells in vitro. The lowest levels of activation, proliferation, and LCL formation were observed when both isoforms were deleted. These results suggest that LMP2A is critical for efficient activation, proliferation and survival of EBV-infected B cells at early times after infection, which impacts the efficient long-term growth of B cells in culture. In contrast, LMP2B did not appear to play a significant role in these processes, and long-term growth of infected B cells was not affected by the absence of this protein. © 2013 Wasil et al.