10 research outputs found

    Natural Language Deduction with Incomplete Information

    A growing body of work studies how to answer a question or verify a claim by generating a natural language "proof": a chain of deductive inferences yielding the answer based on a set of premises. However, these methods can only make sound deductions when they follow from evidence that is given. We propose a new system that can handle the underspecified setting where not all premises are stated at the outset; that is, additional assumptions need to be materialized to prove a claim. By using a natural language generation model to abductively infer a premise given another premise and a conclusion, we can impute missing pieces of evidence needed for the conclusion to be true. Our system searches over two fringes in a bidirectional fashion, interleaving deductive (forward-chaining) and abductive (backward-chaining) generation steps. We sample multiple possible outputs for each step to achieve coverage of the search space, at the same time ensuring correctness by filtering low-quality generations with a round-trip validation procedure. Results on a modified version of the EntailmentBank dataset and a new dataset called Everyday Norms: Why Not? show that abductive generation with validation can recover premises across in- and out-of-domain settings.
    Comment: Conference of EMNLP 202
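    The bidirectional search the abstract describes can be sketched in a few lines. This is a toy illustration only: `deduce`, `abduce`, and `validate` are hypothetical string-based stand-ins for the trained generation and validation models the paper actually uses.

    ```python
    def deduce(p1, p2):
        # Forward (deductive) step: combine two statements into a conclusion (stub).
        return f"({p1}) and ({p2})"

    def abduce(premise, goal):
        # Backward (abductive) step: propose a missing premise that, together
        # with `premise`, would make `goal` follow (stub).
        return f"if ({premise}) then ({goal})"

    def validate(premise, assumption, goal):
        # Round-trip validation: re-run the deductive step on the imputed
        # assumption and keep it only if the goal is actually recovered.
        return goal in deduce(premise, assumption)

    def bidirectional_search(premises, goal, max_steps=3):
        forward = list(premises)   # fringe grown by forward chaining
        backward = [goal]          # fringe grown by backward chaining
        assumptions = []
        for _ in range(max_steps):
            # Interleave an abductive step against the current backward goal.
            candidate = abduce(forward[-1], backward[-1])
            if validate(forward[-1], candidate, backward[-1]) and candidate not in assumptions:
                assumptions.append(candidate)
                # The imputed premise now supports a forward step toward the goal.
                forward.append(deduce(forward[-1], candidate))
            if any(goal in f for f in forward):
                break              # fringes have met: the claim is provable
        return assumptions

    missing = bidirectional_search(["it is raining"], "the grass is wet")
    ```

    Here the search returns the imputed premises (e.g. a conditional linking rain to wet grass); in the real system each step samples multiple candidates and the validator is itself a learned model.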

    Natural Language Deduction through Search over Statement Compositions

    In settings from fact-checking to question answering, we frequently want to know whether a collection of evidence (premises) entails a hypothesis. Existing methods primarily focus on the end-to-end discriminative version of this task, but less work has treated the generative version in which a model searches over the space of statements entailed by the premises to constructively derive the hypothesis. We propose a system for doing this kind of deductive reasoning in natural language by decomposing the task into separate steps coordinated by a search procedure, producing a tree of intermediate conclusions that faithfully reflects the system's reasoning process. Our experiments on the EntailmentBank dataset (Dalvi et al., 2021) demonstrate that the proposed system can successfully prove true statements while rejecting false ones. Moreover, it produces natural language explanations with a 17% absolute higher step validity than those produced by an end-to-end T5 model.
    Comment: Findings of EMNLP 202
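    The search over statement compositions can be sketched as follows. `compose` and `entails` are crude string stubs standing in for the paper's trained composition and entailment models; the fringe expansion mirrors the idea of building a tree of intermediate conclusions whose leaves are premises.

    ```python
    import itertools

    def compose(a, b):
        # Composition step: fuse two statements into an intermediate
        # conclusion (stub; the real system uses a generation model).
        return f"{a} ; {b}"

    def entails(statement, hypothesis):
        # Entailment check (stub): bag-of-words containment.
        return set(hypothesis.split()) <= set(statement.split())

    def prove(premises, hypothesis, max_rounds=3):
        # Fringe entries are (statement, proof_tree); leaves are premises.
        fringe = [(p, p) for p in premises]
        for _ in range(max_rounds):
            new = []
            for (a, ta), (b, tb) in itertools.combinations(fringe, 2):
                c = compose(a, b)
                tree = (ta, tb, c)   # intermediate conclusion with its parents
                if entails(c, hypothesis):
                    return tree      # derivation tree reaching the hypothesis
                new.append((c, tree))
            fringe += new
        return None                  # hypothesis rejected: no derivation found

    premises = ["plants need sunlight", "sunlight comes from the sun"]
    tree = prove(premises, "plants need the sun")
    ```

    A true hypothesis yields a derivation tree; a hypothesis the premises cannot compose into (e.g. "fish breathe water") returns `None`, matching the prove/reject behavior described above.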

    MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning

    While large language models (LLMs) equipped with techniques like chain-of-thought prompting have demonstrated impressive capabilities, they still fall short in their ability to reason robustly in complex settings. However, evaluating LLM reasoning is challenging because system capabilities continue to grow while benchmark datasets for tasks like logical deduction have remained static. We introduce MuSR, a dataset for evaluating language models on multistep soft reasoning tasks specified in a natural language narrative. This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm, enabling the construction of complex reasoning instances that challenge GPT-4 (e.g., murder mysteries roughly 1000 words in length) and which can be scaled further as more capable LLMs are released. Second, our dataset instances are free-text narratives corresponding to real-world domains of reasoning; this makes the dataset simultaneously much more challenging than other synthetically crafted benchmarks while remaining realistic and tractable for human annotators to solve with high accuracy. We evaluate a range of LLMs and prompting techniques on this dataset and characterize the gaps that remain for techniques like chain-of-thought to perform robust reasoning.

    Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways

    We present Lil-Bevo, our submission to the BabyLM Challenge. We pretrained our masked language models with three ingredients: an initial pretraining with music data, training on shorter sequences before training on longer ones, and masking specific tokens to target some of the BLiMP subtasks. Overall, our baseline models performed above chance, but far below the performance levels of larger LLMs trained on more data. We found that training on short sequences performed better than training on longer sequences. Pretraining on music may help performance marginally, but, if so, the effect seems small. Our targeted Masked Language Modeling augmentation did not seem to improve model performance in general, but did seem to help on some of the specific BLiMP tasks that we were targeting (e.g., Negative Polarity Items). Training performant LLMs on small amounts of data is a difficult but potentially informative task. While some of our techniques showed some promise, more work is needed to explore whether they can improve performance more than the modest gains here. Our code is available at https://github.com/venkatasg/Lil-Bevo and our models at https://huggingface.co/collections/venkatasg/babylm-653591cdb66f4bf68922873a
    Comment: Proceedings of the BabyLM Challenge
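    Two of the ingredients above (the short-then-long sequence curriculum and the targeted masking) can be sketched generically. The function names, sequence lengths, and masking rates below are illustrative assumptions, not the authors' actual training code.

    ```python
    import random

    def chunk(tokens, seq_len):
        # Split a token stream into fixed-length training sequences.
        return [tokens[i:i + seq_len] for i in range(0, len(tokens), seq_len)]

    def curriculum_stages(tokens, lengths=(128, 512)):
        # Short-then-long curriculum: yield training sequences at the
        # shorter length first, then re-chunk at the longer length.
        for seq_len in lengths:
            yield seq_len, chunk(tokens, seq_len)

    def mask_tokens(tokens, targets, base_p=0.15, target_p=0.5, mask="[MASK]"):
        # Targeted MLM: tokens in `targets` (e.g. NPI cues such as "any"
        # or "ever") are masked at a higher rate than the usual 15%.
        return [mask if random.random() < (target_p if t in targets else base_p) else t
                for t in tokens]
    ```

    A trainer would iterate over `curriculum_stages`, applying `mask_tokens` to each sequence; upweighting the masking probability on phenomenon-specific tokens is one plausible way to target individual BLiMP subtasks.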

    Artificial intelligence and indigenous perspectives: Protecting and empowering intelligent human beings

    No full text
    As 'control' is increasingly ceded to AI systems, potentially Artificial General Intelligence (AGI), humanity may be facing an identity crisis sooner rather than later, whereby the notion of 'intelligence' no longer remains solely our own. This paper characterizes the problem in terms of an impending loss of control and proposes a relational shift in our attitude towards AI. The shortcomings of value alignment as a solution to the problem are outlined, which necessitates an extension of these principles. One such approach is considering strongly relational Indigenous epistemologies. The value of Indigenous perspectives has not been canvassed widely in the literature. Their utility becomes clear when considering the existence of well-developed epistemologies adept at accounting for the non-human, a task that defies Western anthropocentrism. Accommodating AI by considering it as part of our network is a step towards building a symbiotic relationship. Given that AGI questions our fundamental notions of what it means to have human rights, it is argued that, in order to co-exist, we can find assistance in Indigenous traditions such as the Hawaiian and Lakota ontologies. Lakota rituals provide comfort with the conception of a non-human soul-bearer, while Hawaiian stories provide a possible relational schema to frame our relationship with AI.

    Schwann cell precursors represent a neural crest-like hub state with biased multipotency

    Schwann cell precursors (SCPs) are nerve-associated progenitors that can generate myelinating and non-myelinating Schwann cells but are also multipotent like the neural crest cells from which they originate. SCPs are omnipresent along outgrowing peripheral nerves throughout the body of vertebrate embryos. By using single-cell transcriptomics to generate a gene expression atlas of the entire neural crest lineage, we show that early SCPs and late migratory crest cells have similar transcriptional profiles characterised by a multipotent "hub" state containing cells biased towards traditional neural crest fates. SCPs keep diverging from the neural crest after being primed towards terminal Schwann cells and other fates, with different subtypes residing in distinct anatomical locations. Functional experiments using CRISPR-Cas9 loss-of-function further show that knockout of the common "hub" gene Sox8 causes defects in neural crest-derived cells along peripheral nerves by facilitating differentiation of SCPs towards sympathoadrenal fates. Finally, specific tumour populations found in melanoma, neurofibroma and neuroblastoma map to different stages of SCP/Schwann cell development. Overall, SCPs resemble migrating neural crest cells that maintain multipotency and become transcriptionally primed towards distinct lineages.

    Responses to catastrophic AGI risk: a survey

    No full text