47 research outputs found

    Language models show human-like content effects on reasoning

    Abstract reasoning is a key ability for an intelligent system. Large language models achieve above-chance performance on abstract reasoning tasks, but exhibit many imperfections. However, human abstract reasoning is also imperfect, and depends on our knowledge and beliefs about the content of the reasoning problem. For example, humans reason much more reliably about logical rules that are grounded in everyday situations than arbitrary rules about abstract attributes. The training experiences of language models similarly endow them with prior expectations that reflect human knowledge and beliefs. We therefore hypothesized that language models would show human-like content effects on abstract reasoning problems. We explored this hypothesis across three logical reasoning tasks: natural language inference, judging the logical validity of syllogisms, and the Wason selection task (Wason, 1968). We find that state-of-the-art large language models (with 7 or 70 billion parameters; Hoffmann et al., 2022) reflect many of the same patterns observed in humans across these tasks -- like humans, models reason more effectively about believable situations than unrealistic or abstract ones. Our findings have implications for understanding both these cognitive effects, and the factors that contribute to language model performance.

    Binocular vision and foraging in ducks, geese and swans (Anatidae)

    Wide variation in visual field configuration across avian species is hypothesized to be driven primarily by foraging ecology and predator detection. While some studies of selected taxa have identified relationships between foraging ecology and binocular field characteristics in particular species, few have accounted for the relevance of shared ancestry. We conducted a large-scale, comparative analysis across 39 Anatidae species to investigate the relationship between the foraging ecology traits of diet or behaviour and binocular field parameters, while controlling for phylogeny. We used phylogenetic models to examine correlations between traits and binocular field characteristics, using unidimensional and morphometric approaches. We found that foraging behaviour influenced three parameters of binocular field size: maximum binocular field width, vertical binocular field extent, and angular separation between the eye-bill projection and the direction of maximum binocular field width. Foraging behaviour and body mass each influenced two descriptors of binocular field shape. Phylogenetic relatedness had minimal influence on binocular field size and shape, apart from vertical binocular field extent. Binocular field differences are associated with specific foraging behaviours, as related to the perceptual challenges of obtaining different food items from aquatic and terrestrial environments

    Can language models learn from explanations in context?

    Large language models can perform new tasks by adapting to a few in-context examples. For humans, rapid learning from examples can benefit from explanations that connect examples to task principles. We therefore investigate whether explanations of few-shot examples can allow language models to adapt more effectively. We annotate a set of 40 challenging tasks from BIG-Bench with explanations of answers to a small subset of questions, as well as a variety of matched control explanations. We evaluate the effects of various zero-shot and few-shot prompts that include different types of explanations, instructions, and controls on the performance of a range of large language models. We analyze these results using statistical multilevel modeling techniques that account for the nested dependencies among conditions, tasks, prompts, and models. We find that explanations of examples can improve performance. Adding untuned explanations to a few-shot prompt offers a modest improvement in performance; about 1/3 the effect size of adding few-shot examples, but twice the effect size of task instructions. We then show that explanations tuned for performance on a small validation set offer substantially larger benefits; building a prompt by selecting examples and explanations together substantially improves performance over selecting examples alone. Hand-tuning explanations can substantially improve performance on challenging tasks. Furthermore, even untuned explanations outperform carefully matched controls, suggesting that the benefits are due to the link between an example and its explanation, rather than lower-level features of the language used. However, only large models can benefit from explanations. In summary, explanations can support the in-context learning abilities of large language models o

    Tell me why! Explanations support learning relational and causal structure

    Inferring the abstract relational and causal structure of the world is a major challenge for reinforcement-learning (RL) agents. For humans, language--particularly in the form of explanations--plays a considerable role in overcoming this challenge. Here, we show that language can play a similar role for deep RL agents in complex environments. While agents typically struggle to acquire relational and causal knowledge, augmenting their experience by training them to predict language descriptions and explanations can overcome these limitations. We show that language can help agents learn challenging relational tasks, and examine which aspects of language contribute to its benefits. We then show that explanations can help agents to infer not only relational but also causal structure. Language can shape the way that agents generalize out-of-distribution from ambiguous, causally-confounded training, and explanations even allow agents to learn to perform experimental interventions to identify causal relationships. Our results suggest that language description and explanation may be powerful tools for improving agent learning and generalization. Comment: ICML 2022; 23 page

    Methodological criteria for the assessment of moderators in systematic reviews of randomised controlled trials : a consensus study

    Background: Current methodological guidelines provide advice about the assessment of sub-group analysis within RCTs, but do not specify explicit criteria for assessment. Our objective was to provide researchers with a set of criteria that will facilitate the grading of evidence for moderators in systematic reviews. Method: We developed a set of criteria from methodological manuscripts (n = 18) using a snowballing technique and electronic database searches. Criteria were reviewed by an international Delphi panel (n = 21), comprising authors who have published methodological papers in this area, and researchers who have been active in the study of sub-group analysis in RCTs. We used the Research ANd Development/University of California Los Angeles appropriateness method to assess consensus on the quantitative data. Free responses were coded for consensus and disagreement. In a subsequent round additional criteria were extracted from the Cochrane Reviewers’ Handbook, and the process was repeated. Results: The recommendations are that meta-analysts report both confirmatory and exploratory findings for subgroup analyses. Confirmatory findings must only come from studies in which a specific theory/evidence based a priori statement is made. Exploratory findings may be used to inform future/subsequent trials. However, for inclusion in the meta-analysis of moderators, the following additional criteria should be applied to each study: baseline factors should be measured prior to randomisation, measurement of baseline factors should be of adequate reliability and validity, and a specific test of the interaction between baseline factors and interventions must be presented. Conclusions: There is consensus from a group of 21 international experts that methodological criteria to assess moderators within systematic reviews of RCTs are both timely and necessary. The consensus from the experts resulted in five criteria divided into two groups when synthesising evidence: confirmatory findings to support hypotheses about moderators and exploratory findings to inform future research. These recommendations are discussed in reference to previous recommendations for evaluating and reporting moderator studies.

    Embryo movement is more frequent in avian brood parasites than birds with parental reproductive strategies.

    Funders: Tanzanian Commission for Science and Technology; Tanzania Wildlife Research Institute; NERC; National Science Foundation; Ministry of Education; German Academic Exchange Service; University of Cape Town; Max-Planck-Gesellschaft.
    Movement of the embryo is essential for musculoskeletal development in vertebrates, yet little is known about whether, and why, species vary. Avian brood parasites exhibit feats of strength in early life as adaptations to exploit the hosts that rear them. We hypothesized that an increase in embryonic movement could allow brood parasites to develop the required musculature for these demands. We measured embryo movement across incubation for multiple brood-parasitic and non-parasitic bird species. Using a phylogenetically controlled analysis, we found that brood parasites exhibited significantly increased muscular movement during incubation compared to non-parasites. This suggests that increased embryo movement may facilitate the development of the stronger musculoskeletal system required for the demanding tasks undertaken by young brood parasites.

    Effects of antiplatelet therapy on stroke risk by brain imaging features of intracerebral haemorrhage and cerebral small vessel diseases: subgroup analyses of the RESTART randomised, open-label trial

    Background Findings from the RESTART trial suggest that starting antiplatelet therapy might reduce the risk of recurrent symptomatic intracerebral haemorrhage compared with avoiding antiplatelet therapy. Brain imaging features of intracerebral haemorrhage and cerebral small vessel diseases (such as cerebral microbleeds) are associated with greater risks of recurrent intracerebral haemorrhage. We did subgroup analyses of the RESTART trial to explore whether these brain imaging features modify the effects of antiplatelet therapy

    Salmonella Typhi-specific multifunctional CD8+ T cells play a dominant role in protection from typhoid fever in humans.

    BACKGROUND: Typhoid fever, caused by the human-restricted organism Salmonella Typhi (S. Typhi), is a major public health problem worldwide. Development of novel vaccines remains imperative, but is hampered by an incomplete understanding of the immune responses that correlate with protection. METHODS: Recently, a controlled human infection model was re-established in which volunteers received ~10^3 cfu wild-type S. Typhi (Quailes strain) orally. Twenty-one volunteers were evaluated for their cell-mediated immune (CMI) responses. Ex vivo PBMC isolated before and up to 1 year after challenge were exposed to three S. Typhi-infected targets, i.e., autologous B lymphoblastoid cell-lines (B-LCL), autologous blasts and HLA-E restricted AEH B-LCL cells. CMI responses were evaluated using 14-color multiparametric flow cytometry to detect simultaneously five intracellular cytokines/chemokines (i.e., IL-17A, IL-2, IFN-γ, TNF-α and MIP-1β) and a marker of degranulation/cytotoxic activity (CD107a). RESULTS: Herein we provide the first evidence that S. Typhi-specific CD8+ responses correlate with clinical outcome in humans challenged with wild-type S. Typhi. Higher multifunctional S. Typhi-specific CD8+ baseline responses were associated with protection against typhoid and delayed disease onset. Moreover, following challenge, development of typhoid fever was accompanied by decreases in circulating S. Typhi-specific CD8+ T effector/memory (TEM) with gut homing potential, suggesting migration to the site(s) of infection. In contrast, protection against disease was associated with low or no changes in circulating S. Typhi-specific TEM. CONCLUSIONS: These studies provide novel insights into the protective immune responses against typhoid disease that will aid in selection and development of new vaccine candidates.

    Increasing frailty is associated with higher prevalence and reduced recognition of delirium in older hospitalised inpatients: results of a multi-centre study

    Purpose: Delirium is a neuropsychiatric disorder delineated by an acute change in cognition, attention, and consciousness. It is common, particularly in older adults, but poorly recognised. Frailty is the accumulation of deficits conferring an increased risk of adverse outcomes. We set out to determine how severity of frailty, as measured using the CFS, affected delirium rates, and recognition in hospitalised older people in the United Kingdom. Methods: Adults over 65 years were included in an observational multi-centre audit across UK hospitals, two prospective rounds, and one retrospective note review. Clinical Frailty Scale (CFS), delirium status, and 30-day outcomes were recorded. Results: The overall prevalence of delirium was 16.3% (483). Patients with delirium were more frail than patients without delirium (median CFS 6 vs 4). The risk of delirium was greater with increasing frailty [OR 2.9 (1.8–4.6) in CFS 4 vs 1–3; OR 12.4 (6.2–24.5) in CFS 8 vs 1–3]. Higher CFS was associated with reduced recognition of delirium (OR of 0.7 (0.3–1.9) in CFS 4 compared to 0.2 (0.1–0.7) in CFS 8). These risks were both independent of age and dementia. Conclusion: We have demonstrated an incremental increase in risk of delirium with increasing frailty. This has important clinical implications, suggesting that frailty may provide a more nuanced measure of vulnerability to delirium and poor outcomes. However, the most frail patients are least likely to have their delirium diagnosed and there is a significant lack of research into the underlying pathophysiology of both of these common geriatric syndromes