
    Schema2QA: High-Quality and Low-Cost Q&A Agents for the Structured Web

    Building a question-answering agent currently requires large annotated datasets, which are prohibitively expensive. This paper proposes Schema2QA, an open-source toolkit that can generate a Q&A system from a database schema augmented with a few annotations for each field. The key concept is to cover the space of possible compound queries on the database with a large number of in-domain questions synthesized with the help of a corpus of generic query templates. The synthesized data and a small paraphrase set are used to train a novel neural network based on the BERT pretrained model. We use Schema2QA to generate Q&A systems for five Schema.org domains (restaurants, people, movies, books, and music) and obtain an overall accuracy between 64% and 75% on crowdsourced questions for these domains. Once annotations and paraphrases are obtained for a Schema.org schema, no additional manual effort is needed to create a Q&A agent for any website that uses the same schema. Furthermore, we demonstrate that learning can be transferred from the restaurant to the hotel domain, obtaining a 64% accuracy on crowdsourced questions with no manual effort. Schema2QA achieves an accuracy of 60% on popular restaurant questions that can be answered using Schema.org. Its performance is comparable to Google Assistant, 7% lower than Siri, and 15% higher than Alexa. It outperforms all these assistants by at least 18% on more complex, long-tail questions.
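
    A minimal, hypothetical sketch of the synthesis idea described above: generic query templates are filled with per-field natural-language annotations from a schema to produce in-domain training questions. The schema, annotations, and templates below are invented for illustration and are not taken from the Schema2QA toolkit.

```python
# Sketch only: synthesize training questions by combining generic query
# templates with per-field annotations from a (hypothetical) schema.
# In the real toolkit, each synthesized question would be paired with a
# formal query and used to train the neural semantic parser.

# Hypothetical annotations: phrases that can refer to each field.
schema = {
    "Restaurant": {
        "servesCuisine": ["serves", "offers"],
        "priceRange": ["costs", "is priced at"],
    }
}

# Generic, domain-independent query templates.
templates = [
    "show me a {domain} that {phrase} {value}",
    "which {domain} {phrase} {value}?",
]

def synthesize(domain: str, field: str, value: str) -> list[str]:
    """Generate synthetic questions for one (field, value) pair."""
    questions = []
    for phrase in schema[domain][field]:
        for template in templates:
            questions.append(template.format(
                domain=domain.lower(), phrase=phrase, value=value))
    return questions

if __name__ == "__main__":
    for q in synthesize("Restaurant", "servesCuisine", "Italian food"):
        print(q)  # e.g. "show me a restaurant that serves Italian food"
```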

    WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia

    This paper presents the first few-shot LLM-based chatbot that almost never hallucinates and has high conversationality and low latency. WikiChat is grounded on the English Wikipedia, the largest curated free-text corpus. WikiChat generates a response from an LLM, retains only the grounded facts, and combines them with additional information it retrieves from the corpus to form factual and engaging responses. We distill WikiChat based on GPT-4 into a 7B-parameter LLaMA model with minimal loss of quality, to significantly improve its latency, cost and privacy, and facilitate research and deployment. Using a novel hybrid human-and-LLM evaluation methodology, we show that our best system achieves 97.3% factual accuracy in simulated conversations. It significantly outperforms all retrieval-based and LLM-based baselines, improving on GPT-4 by 3.9%, 38.6% and 51.0% on head, tail and recent knowledge, respectively. Compared to previous state-of-the-art retrieval-based chatbots, WikiChat is also significantly more informative and engaging, just like an LLM. WikiChat achieves 97.9% factual accuracy in conversations with human users about recent topics, 55.0% better than GPT-4, while receiving significantly higher user ratings and more favorable comments. Comment: Findings of EMNLP 2023.
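
    The response pipeline summarized above can be sketched roughly as follows; every function body here is a hypothetical placeholder standing in for an LLM call or a retrieval step, not the authors' implementation.

```python
# Rough sketch of a WikiChat-style pipeline: draft with an LLM, keep only
# claims supported by retrieved Wikipedia passages, add retrieved facts,
# and form the final response from verified material only.
# All functions are placeholders (assumed names, not the real system).

def llm_draft(history: list[str]) -> str:
    """Draft a candidate response with an LLM."""
    return "..."

def extract_claims(draft: str) -> list[str]:
    """Split the draft into individual factual claims."""
    return [draft]

def supported_by_wikipedia(claim: str) -> bool:
    """Check the claim against passages retrieved from Wikipedia."""
    return True  # placeholder for retrieval + fact-checking

def retrieve_additional_facts(history: list[str]) -> list[str]:
    """Retrieve extra relevant passages from the Wikipedia corpus."""
    return []

def respond(history: list[str]) -> str:
    draft = llm_draft(history)
    grounded = [c for c in extract_claims(draft) if supported_by_wikipedia(c)]
    extra = retrieve_additional_facts(history)
    # Combine only verified facts into a factual, engaging reply.
    return " ".join(grounded + extra)
```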

    Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models

    We study how to apply large language models to write grounded and organized long-form articles from scratch, with comparable breadth and depth to Wikipedia pages. This underexplored problem poses new challenges at the pre-writing stage, including how to research the topic and prepare an outline prior to writing. We propose STORM, a writing system for the Synthesis of Topic Outlines through Retrieval and Multi-perspective Question Asking. STORM models the pre-writing stage by (1) discovering diverse perspectives in researching the given topic, (2) simulating conversations where writers carrying different perspectives pose questions to a topic expert grounded on trusted Internet sources, and (3) curating the collected information to create an outline. For evaluation, we curate FreshWiki, a dataset of recent high-quality Wikipedia articles, and formulate outline assessments to evaluate the pre-writing stage. We further gather feedback from experienced Wikipedia editors. Compared to articles generated by an outline-driven retrieval-augmented baseline, more of STORM's articles are deemed to be organized (by a 25% absolute increase) and broad in coverage (by 10%). The expert feedback also helps identify new challenges for generating grounded long articles, such as source bias transfer and over-association of unrelated facts. Comment: 27 pages, NAACL 2024 Main Conference.
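
    A rough sketch of the three pre-writing stages listed above; the function names and return values are hypothetical placeholders, not the released STORM code.

```python
# Sketch of STORM-style pre-writing: discover perspectives, simulate
# grounded question-asking conversations, then curate an outline.
# Placeholder implementations only (assumed names, not the real system).

def discover_perspectives(topic: str) -> list[str]:
    """Find diverse perspectives for researching the topic."""
    return ["historian", "practitioner"]

def simulate_conversation(topic: str, perspective: str) -> list[tuple[str, str]]:
    """A writer with the given perspective asks questions; a topic expert
    answers, grounding each answer in trusted Internet sources."""
    return [(f"How did {topic} originate?", "According to <source>, ...")]

def curate_outline(dialogues: list[list[tuple[str, str]]]) -> list[str]:
    """Organize the collected question-answer pairs into a section outline."""
    return ["Introduction", "History", "Current developments"]

def prewrite(topic: str) -> list[str]:
    dialogues = [simulate_conversation(topic, p)
                 for p in discover_perspectives(topic)]
    return curate_outline(dialogues)
```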

    SUQL: Conversational Search over Structured and Unstructured Data with Large Language Models

    While most conversational agents are grounded on either free-text or structured knowledge, many knowledge corpora consist of hybrid sources. This paper presents the first conversational agent that supports the full generality of hybrid data access for large knowledge corpora, through a language we developed called SUQL (Structured and Unstructured Query Language). Specifically, SUQL extends SQL with free-text primitives (summary and answer), so information retrieval can be composed with structured data accesses arbitrarily in a formal, succinct, precise, and interpretable notation. With SUQL, we propose the first semantic parser, an LLM with in-context learning, that can handle hybrid data sources. Our in-context learning-based approach, when applied to the HybridQA dataset, comes within 8.9% exact match and 7.1% F1 of the SOTA, which was trained on 62K data samples. More significantly, unlike previous approaches, our technique is applicable to large databases and free-text corpora. We introduce a dataset consisting of crowdsourced questions and conversations on Yelp, a large, real restaurant knowledge base with structured and unstructured data. We show that our few-shot conversational agent based on SUQL finds an entity satisfying all user requirements 90.3% of the time, compared to 63.4% for a baseline based on linearization.
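
    The hybrid queries described above can be illustrated with a hypothetical SUQL string: standard SQL clauses filter the structured columns, while the summary and answer primitives operate on free-text fields. The table and column names below are invented for illustration and are not taken from the paper's Yelp dataset.

```python
# Illustrative only: a SUQL-style query mixing structured filters with
# free-text primitives (summary, answer), embedded here as a Python string.
# Schema names (restaurants, reviews, cuisine) are assumptions.
suql_query = """
SELECT name, summary(reviews)
FROM restaurants
WHERE cuisine = 'Italian'
  AND answer(reviews, 'Is this place good for a quiet dinner?') = 'Yes'
LIMIT 3;
"""

# Per the abstract, an LLM-based semantic parser with in-context learning
# would map a user's natural-language request to a query like this; the
# engine then composes structured access with information retrieval.
print(suql_query)
```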

    Heart failure around the world

    With increasingly large sample sizes required to demonstrate event reduction, heart failure outcome trials are no longer being performed in a small group of selected patients and countries, but at a global scale with worldwide contribution of patients from countries with considerable differences in background therapy, socioeconomic status and healthcare practices. Recent studies have highlighted how socioeconomic determinants rather than geographical factors may underlie the heterogeneity of patient populations across the globe. Therefore, in this review, we evaluated (i) regional differences in patient characteristics and outcomes in recent epidemiologic studies; (ii) regional differences in worldwide representativeness of clinical trial populations; and (iii) the role of socioeconomic determinants in driving country differences in heart failure trial enrolment and clinical outcomes.

    The Effect of Epstein-Barr Virus Latent Membrane Protein 2 Expression on the Kinetics of Early B Cell Infection

    Infection of human B cells with wild-type Epstein-Barr virus (EBV) in vitro leads to activation and proliferation that result in efficient production of lymphoblastoid cell lines (LCLs). Latent Membrane Protein 2 (LMP2) is expressed early after infection and previous research has suggested a possible role in this process. Therefore, we generated recombinant EBV with knockouts of either or both protein isoforms, LMP2A and LMP2B (Δ2A, Δ2B, Δ2A/Δ2B), to study the effect of LMP2 in early B cell infection. Infection of B cells with Δ2A and Δ2A/Δ2B viruses led to a marked decrease in activation and proliferation relative to wild-type (wt) viruses, and resulted in higher percentages of apoptotic B cells. Δ2B virus infection showed activation levels comparable to wt, but fewer numbers of proliferating B cells. Early B cell infection with wt, Δ2A and Δ2B viruses did not result in changes in latent gene expression, with the exception of elevated LMP2B transcript in Δ2A virus infection. Infection with Δ2A and Δ2B viruses did not affect viral latency, determined by changes in LMP1/Zebra expression following BCR stimulation. However, BCR stimulation of Δ2A/Δ2B cells resulted in decreased LMP1 expression, which suggests loss of stability in viral latency. Long-term outgrowth assays revealed that LMP2A, but not LMP2B, is critical for efficient long-term growth of B cells in vitro. The lowest levels of activation, proliferation, and LCL formation were observed when both isoforms were deleted. These results suggest that LMP2A appears to be critical for efficient activation, proliferation and survival of EBV-infected B cells at early times after infection, which impacts the efficient long-term growth of B cells in culture. In contrast, LMP2B did not appear to play a significant role in these processes, and long-term growth of infected B cells was not affected by the absence of this protein. © 2013 Wasil et al.