Language models show human-like content effects on reasoning
Abstract reasoning is a key ability for an intelligent system. Large language
models achieve above-chance performance on abstract reasoning tasks, but
exhibit many imperfections. However, human abstract reasoning is also
imperfect, and depends on our knowledge and beliefs about the content of the
reasoning problem. For example, humans reason much more reliably about logical
rules that are grounded in everyday situations than arbitrary rules about
abstract attributes. The training experiences of language models similarly
endow them with prior expectations that reflect human knowledge and beliefs. We
therefore hypothesized that language models would show human-like content
effects on abstract reasoning problems. We explored this hypothesis across
three logical reasoning tasks: natural language inference, judging the logical
validity of syllogisms, and the Wason selection task (Wason, 1968). We find
that state-of-the-art large language models (with 7 or 70 billion parameters;
Hoffmann et al., 2022) reflect many of the same patterns observed in humans
across these tasks -- like humans, models reason more effectively about
believable situations than unrealistic or abstract ones. Our findings have
implications for understanding both these cognitive effects, and the factors
that contribute to language model performance
Binocular vision and foraging in ducks, geese and swans (Anatidae)
Wide variation in visual field configuration across avian species is hypothesized to be driven primarily by foraging ecology and predator detection. While some studies of selected taxa have identified relationships between foraging ecology and binocular field characteristics in particular species, few have accounted for the relevance of shared ancestry. We conducted a large-scale, comparative analysis across 39 Anatidae species to investigate the relationship between the foraging ecology traits of diet or behaviour and binocular field parameters, while controlling for phylogeny. We used phylogenetic models to examine correlations between traits and binocular field characteristics, using unidimensional and morphometric approaches. We found that foraging behaviour influenced three parameters of binocular field size: maximum binocular field width, vertical binocular field extent, and angular separation between the eye-bill projection and the direction of maximum binocular field width. Foraging behaviour and body mass each influenced two descriptors of binocular field shape. Phylogenetic relatedness had minimal influence on binocular field size and shape, apart from vertical binocular field extent. Binocular field differences are associated with specific foraging behaviours, as related to the perceptual challenges of obtaining different food items from aquatic and terrestrial environments
Can language models learn from explanations in context?
Large language models can perform new tasks by adapting to a few in-context
examples. For humans, rapid learning from examples can benefit from
explanations that connect examples to task principles. We therefore investigate
whether explanations of few-shot examples can allow language models to adapt
more effectively. We annotate a set of 40 challenging tasks from BIG-Bench with
explanations of answers to a small subset of questions, as well as a variety of
matched control explanations. We evaluate the effects of various zero-shot and
few-shot prompts that include different types of explanations, instructions,
and controls on the performance of a range of large language models. We analyze
these results using statistical multilevel modeling techniques that account for
the nested dependencies among conditions, tasks, prompts, and models. We find
that explanations of examples can improve performance. Adding untuned
explanations to a few-shot prompt offers a modest improvement in performance,
about 1/3 the effect size of adding few-shot examples, but twice the effect
size of task instructions. We then show that explanations tuned for performance
on a small validation set offer substantially larger benefits; building a
prompt by selecting examples and explanations together substantially improves
performance over selecting examples alone. Hand-tuning explanations can
substantially improve performance on challenging tasks. Furthermore, even
untuned explanations outperform carefully matched controls, suggesting that the
benefits are due to the link between an example and its explanation, rather
than lower-level features of the language used. However, only large models can
benefit from explanations. In summary, explanations can support the in-context
learning abilities of large language models o
Tell me why! Explanations support learning relational and causal structure
Inferring the abstract relational and causal structure of the world is a
major challenge for reinforcement-learning (RL) agents. For humans,
language--particularly in the form of explanations--plays a considerable role
in overcoming this challenge. Here, we show that language can play a similar
role for deep RL agents in complex environments. While agents typically
struggle to acquire relational and causal knowledge, augmenting their
experience by training them to predict language descriptions and explanations
can overcome these limitations. We show that language can help agents learn
challenging relational tasks, and examine which aspects of language contribute
to its benefits. We then show that explanations can help agents to infer not
only relational but also causal structure. Language can shape the way that
agents generalize out-of-distribution from ambiguous, causally-confounded
training, and explanations even allow agents to learn to perform experimental
interventions to identify causal relationships. Our results suggest that
language description and explanation may be powerful tools for improving agent
learning and generalization.
Comment: ICML 2022; 23 page
Methodological criteria for the assessment of moderators in systematic reviews of randomised controlled trials : a consensus study
Background: Current methodological guidelines provide advice about the assessment of sub-group analysis within
RCTs, but do not specify explicit criteria for assessment. Our objective was to provide researchers with a set of
criteria that will facilitate the grading of evidence for moderators in systematic reviews.
Method: We developed a set of criteria from methodological manuscripts (n = 18) using a snowballing technique
and electronic database searches. Criteria were reviewed by an international Delphi panel (n = 21), comprising
authors who have published methodological papers in this area, and researchers who have been active in the
study of sub-group analysis in RCTs. We used the Research ANd Development/University of California Los Angeles
appropriateness method to assess consensus on the quantitative data. Free responses were coded for consensus
and disagreement. In a subsequent round additional criteria were extracted from the Cochrane Reviewers’
Handbook, and the process was repeated.
Results: The recommendations are that meta-analysts report both confirmatory and exploratory findings for subgroup
analysis. Confirmatory findings must only come from studies in which a specific theory/evidence-based a priori
statement is made. Exploratory findings may be used to inform future/subsequent trials. However, for
inclusion in the meta-analysis of moderators, the following additional criteria should be applied to each study:
Baseline factors should be measured prior to randomisation, measurement of baseline factors should be of
adequate reliability and validity, and a specific test of the interaction between baseline factors and interventions
must be presented.
Conclusions: There is consensus from a group of 21 international experts that methodological criteria to assess
moderators within systematic reviews of RCTs are both timely and necessary. The consensus from the experts
resulted in five criteria divided into two groups when synthesising evidence: confirmatory findings to support
hypotheses about moderators and exploratory findings to inform future research. These recommendations are
discussed in reference to previous recommendations for evaluating and reporting moderator studies
Embryo movement is more frequent in avian brood parasites than birds with parental reproductive strategies.
Funders: Tanzanian Commission for Science and Technology; Tanzania Wildlife Research Institute; NERC; National Science Foundation; Ministry of Education; German Academic Exchange Service; University of Cape Town; Max-Planck-Gesellschaft
Movement of the embryo is essential for musculoskeletal development in vertebrates, yet little is known about whether, and why, species vary. Avian brood parasites exhibit feats of strength in early life as adaptations to exploit the hosts that rear them. We hypothesized that an increase in embryonic movement could allow brood parasites to develop the required musculature for these demands. We measured embryo movement across incubation for multiple brood-parasitic and non-parasitic bird species. Using a phylogenetically controlled analysis, we found that brood parasites exhibited significantly increased muscular movement during incubation compared to non-parasites. This suggests that increased embryo movement may facilitate the development of the stronger musculoskeletal system required for the demanding tasks undertaken by young brood parasites
Eggshell composition and surface properties of avian brood-parasitic species compared with non-parasitic species
The eggs of avian obligate brood-parasitic species have multiple adaptations to deceive hosts and optimize development in host nests. While the structure and composition of the eggshell in all birds is essential for embryo growth and protection from external threats, parasitic eggs may face specific challenges such as high microbial loads, rapid laying and ejection by the host parents. We set out to assess whether eggshells of avian brood-parasitic species have either (i) specialized structural properties, to meet the demands of a brood-parasitic strategy or (ii) similar structural properties to eggs of their hosts, due to the similar nest environment. We measured the surface topography (roughness), wettability (how well surfaces repel water) and calcium content of eggshells of a phylogenetically and geographically diverse range of brood-parasitic species (representing four of the seven independent lineages of avian brood-parasitic species), their hosts and close relatives of the parasites. These components of the eggshell structure have been demonstrated previously to influence such factors as the risk of microbial infection and overall shell strength. Within a phylogenetically controlled framework, we found no overall significant differences in eggshell roughness, wettability and calcium content between (i) parasitic and non-parasitic species, or (ii) parasitic species and their hosts. Neither the wettability nor the calcium content of the eggs from brood-parasitic species was more similar to that of their hosts' eggs than expected by chance. By contrast, the mean surface roughness of the eggs of brood-parasitic species was more similar to that of their hosts’ eggs than expected by chance, suggesting brood-parasitic species may have evolved to lay eggs that match the host nest environment for this trait.
The lack of significant overall differences between parasitic and non-parasitic species, including hosts, in the traits we measured, suggests that phylogenetic signal, as well as general adaptations to the nest environment and for embryo development, outweigh any influence of a parasitic lifestyle on these eggshell properties
Effects of antiplatelet therapy on stroke risk by brain imaging features of intracerebral haemorrhage and cerebral small vessel diseases: subgroup analyses of the RESTART randomised, open-label trial
Background
Findings from the RESTART trial suggest that starting antiplatelet therapy might reduce the risk of recurrent symptomatic intracerebral haemorrhage compared with avoiding antiplatelet therapy. Brain imaging features of intracerebral haemorrhage and cerebral small vessel diseases (such as cerebral microbleeds) are associated with greater risks of recurrent intracerebral haemorrhage. We did subgroup analyses of the RESTART trial to explore whether these brain imaging features modify the effects of antiplatelet therapy
Salmonella Typhi-specific multifunctional CD8+ T cells play a dominant role in protection from typhoid fever in humans.
BACKGROUND: Typhoid fever, caused by the human-restricted organism Salmonella Typhi (S. Typhi), is a major public health problem worldwide. Development of novel vaccines remains imperative, but is hampered by an incomplete understanding of the immune responses that correlate with protection.
METHODS: Recently, a controlled human infection model was re-established in which volunteers received ~10^3 cfu wild-type S. Typhi (Quailes strain) orally. Twenty-one volunteers were evaluated for their cell-mediated immune (CMI) responses. Ex vivo PBMC isolated before and up to 1 year after challenge were exposed to three S. Typhi-infected targets, i.e., autologous B lymphoblastoid cell-lines (B-LCL), autologous blasts and HLA-E restricted AEH B-LCL cells. CMI responses were evaluated using 14-color multiparametric flow cytometry to detect simultaneously five intracellular cytokines/chemokines (i.e., IL-17A, IL-2, IFN-γ, TNF-α and MIP-1β) and a marker of degranulation/cytotoxic activity (CD107a).
RESULTS: Herein we provide the first evidence that S. Typhi-specific CD8+ responses correlate with clinical outcome in humans challenged with wild-type S. Typhi. Higher multifunctional S. Typhi-specific CD8+ baseline responses were associated with protection against typhoid and delayed disease onset. Moreover, following challenge, development of typhoid fever was accompanied by decreases in circulating S. Typhi-specific CD8+ T effector/memory (TEM) with gut homing potential, suggesting migration to the site(s) of infection. In contrast, protection against disease was associated with low or no changes in circulating S. Typhi-specific TEM.
CONCLUSIONS: These studies provide novel insights into the protective immune responses against typhoid disease that will aid in selection and development of new vaccine candidates
Increasing frailty is associated with higher prevalence and reduced recognition of delirium in older hospitalised inpatients: results of a multi-centre study
Purpose:
Delirium is a neuropsychiatric disorder delineated by an acute change in cognition, attention, and consciousness. It is common, particularly in older adults, but poorly recognised. Frailty is the accumulation of deficits conferring an increased risk of adverse outcomes. We set out to determine how severity of frailty, as measured using the Clinical Frailty Scale (CFS), affected delirium rates and recognition in hospitalised older people in the United Kingdom.
Methods:
Adults over 65 years were included in an observational multi-centre audit across UK hospitals, comprising two prospective rounds and one retrospective note review. Clinical Frailty Scale (CFS), delirium status, and 30-day outcomes were recorded.
Results:
The overall prevalence of delirium was 16.3% (483). Patients with delirium were more frail than patients without delirium (median CFS 6 vs 4). The risk of delirium was greater with increasing frailty [OR 2.9 (1.8–4.6) in CFS 4 vs 1–3; OR 12.4 (6.2–24.5) in CFS 8 vs 1–3]. Higher CFS was associated with reduced recognition of delirium (OR of 0.7 (0.3–1.9) in CFS 4 compared to 0.2 (0.1–0.7) in CFS 8). These risks were both independent of age and dementia.
Conclusion:
We have demonstrated an incremental increase in risk of delirium with increasing frailty. This has important clinical implications, suggesting that frailty may provide a more nuanced measure of vulnerability to delirium and poor outcomes. However, the most frail patients are least likely to have their delirium diagnosed and there is a significant lack of research into the underlying pathophysiology of both of these common geriatric syndromes