
    Inside ASCENT: Exploring a Deep Commonsense Knowledge Base and its Usage in Question Answering

    ASCENT is a fully automated methodology for extracting and consolidating commonsense assertions from web content (Nguyen et al., WWW 2021). It advances traditional triple-based commonsense knowledge representation by capturing semantic facets such as locations and purposes, as well as composite concepts, i.e., subgroups and related aspects of subjects. In this demo, we present a web portal that allows users to understand its construction process, explore its content, and observe its impact in the use case of question answering. The demo website and an introductory video are both available online.
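    To make the notion of faceted, composite assertions concrete, the sketch below models a single assertion as a small Python record. It is a minimal illustration only: the field and facet names are assumptions and do not reflect ASCENT's actual schema or API.

```python
# Minimal sketch of a faceted commonsense assertion, loosely following the
# idea described in the abstract above; all field and facet names here are
# illustrative assumptions, not ASCENT's actual schema.
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class Assertion:
    subject: str                 # may be a composite concept, e.g. "baby elephant"
    predicate: str
    obj: str
    facets: Dict[str, str] = field(default_factory=dict)  # e.g. location, purpose

# Example: "elephants drink water", refined with location and purpose facets.
fact = Assertion(
    subject="elephant",
    predicate="drink",
    obj="water",
    facets={"location": "at a waterhole", "purpose": "to stay hydrated"},
)
print(fact)
```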

    The Life Cycle of Knowledge in Big Language Models: A Survey

    Knowledge plays a critical role in artificial intelligence. Recently, the extensive success of pre-trained language models (PLMs) has drawn significant attention to how knowledge can be acquired, maintained, updated and used by language models. Despite the enormous amount of related studies, there is still no unified view of how knowledge circulates within language models throughout the learning, tuning, and application processes, which may prevent us from further understanding the connections between current lines of progress or from recognizing existing limitations. In this survey, we revisit PLMs as knowledge-based systems by dividing the life cycle of knowledge in PLMs into five critical periods and investigating how knowledge circulates when it is built, maintained and used. To this end, we systematically review existing studies of each period of the knowledge life cycle, summarize the main challenges and current limitations, and discuss future directions. Comment: paperlist: https://github.com/c-box/KnowledgeLifecycl

    Probing Neural Language Models for Human Tacit Assumptions

    Humans carry stereotypic tacit assumptions (STAs) (Prince, 1978), or propositional beliefs about generic concepts. Such associations are crucial for understanding natural language. We construct a diagnostic set of word prediction prompts to evaluate whether recent neural contextualized language models trained on large text corpora capture STAs. Our prompts are based on human responses in a psychological study of conceptual associations. We find models to be profoundly effective at retrieving concepts given associated properties. Our results demonstrate empirical evidence that stereotypic conceptual representations are captured in neural models derived from semi-supervised linguistic exposure.
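    As a rough illustration of this style of probe, the snippet below asks a masked language model to fill in a concept given some of its properties. It assumes the Hugging Face transformers library and a generic BERT checkpoint; the prompt wording is invented for illustration and is not taken from the paper's diagnostic set.

```python
# Hedged sketch of a cloze-style probe for tacit assumptions, assuming the
# Hugging Face transformers library; the prompt is illustrative only.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Given a set of properties, does the model retrieve the associated concept?
prompt = "A [MASK] has fur, is big, and has claws."
for candidate in fill_mask(prompt, top_k=5):
    print(f"{candidate['token_str']:>12}  {candidate['score']:.3f}")
```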

    LMentry: A Language Model Benchmark of Elementary Language Tasks

    As the performance of large language models rapidly improves, benchmarks are getting larger and more complex as well. We present LMentry, a benchmark that avoids this "arms race" by focusing on a compact set of tasks that are trivial to humans, e.g. writing a sentence containing a specific word, identifying which words in a list belong to a specific category, or choosing which of two words is longer. LMentry is specifically designed to provide quick and interpretable insights into the capabilities and robustness of large language models. Our experiments reveal a wide variety of failure cases that, while immediately obvious to humans, pose a considerable challenge for large language models, including OpenAI's latest 175B-parameter instruction-tuned model, TextDavinci002. LMentry complements contemporary evaluation approaches of large language models, providing a quick, automatic, and easy-to-run "unit test", without resorting to large benchmark suites of complex tasks. Comment: 24 pages, 2 figures
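    Because each task has an unambiguous gold answer, correctness can be checked with a few lines of code. The helpers below are an illustrative sketch of such checks and are not LMentry's actual evaluation scripts.

```python
# Illustrative sketch of the kind of automatic checks such elementary tasks
# allow; these helpers are assumptions, not LMentry's real scoring code.
import re

def contains_word(sentence: str, word: str) -> bool:
    """Check a model-written sentence for the required word."""
    return re.search(rf"\b{re.escape(word)}\b", sentence, flags=re.IGNORECASE) is not None

def longer_word(word_a: str, word_b: str) -> str:
    """Gold answer for the 'which of two words is longer' task."""
    return word_a if len(word_a) > len(word_b) else word_b

assert contains_word("The cat sat on the windowsill.", "windowsill")
assert longer_word("pineapple", "fig") == "pineapple"
```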

    Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering

    Generative models for open domain question answering have proven to be competitive, without resorting to external knowledge. While promising, this approach requires models with billions of parameters, which are expensive to train and query. In this paper, we investigate how much these models can benefit from retrieving text passages, potentially containing evidence. We obtain state-of-the-art results on the Natural Questions and TriviaQA open benchmarks. Interestingly, we observe that the performance of this method significantly improves when increasing the number of retrieved passages. This is evidence that generative models are good at aggregating and combining evidence from multiple passages.
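    A much-simplified sketch of this retrieve-then-generate setup is shown below: it concatenates the retrieved passages with the question and feeds the result to a generic seq2seq model from the Hugging Face transformers library. This is only an approximation; the paper's own approach encodes each question-passage pair independently and fuses the encodings in the decoder, which is what lets it scale to many passages.

```python
# Simplified retrieve-then-generate sketch, assuming the Hugging Face
# transformers library and a generic (not QA-fine-tuned) T5 checkpoint.
# The passages below are hard-coded stand-ins for a retriever's output.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

question = "Where is the Eiffel Tower located?"
retrieved_passages = [
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
    "It is named after the engineer Gustave Eiffel.",
]

# Concatenate the question with every retrieved passage into one input.
prompt = "question: " + question + " " + " ".join(
    f"context: {p}" for p in retrieved_passages
)
inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```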