
    Task interruption


    Emergent inabilities? Inverse scaling over the course of pretraining

    Does inverse scaling only occur as a function of model size, or can it also occur over the course of training? We carry out an exploratory study investigating whether the performance of language models on specific tasks can decrease (while general performance remains high) during training on the language modeling task. We find 8 tasks on which Pythia 12B (Biderman et al., 2023) shows decreased performance over the course of training. Five of these tasks (TruthfulQA-MC1, TruthfulQA-MC2, Hindsight Neglect, Memo Trap, and Pattern Match Suppression) additionally show a consistent relationship whereby larger language models show a greater decrease in performance the more they are trained, despite showing standard (positive) scaling overall. This highlights the importance of testing performance at all relevant benchmarks any time models are trained on additional data, even if their overall performance improves. Comment: Accepted to Findings of EMNLP 202

    Late Albian adaptive radiation in the calcareous nannofossil genus Eiffellithus


    Do language models make human-like predictions about the coreferents of Italian anaphoric zero pronouns?

    Some languages allow arguments to be omitted in certain contexts. Yet human language comprehenders reliably infer the intended referents of these zero pronouns, in part because they construct expectations about which referents are more likely. We ask whether neural language models also extract the same expectations. We test whether 12 contemporary language models display expectations that reflect human behavior when exposed to sentences with zero pronouns from five behavioral experiments conducted in Italian by Carminati (2005). We find that three models (XGLM 2.9B, 4.5B, and 7.5B) capture the human behavior from all the experiments, with others successfully modeling some of the results. This result suggests that human expectations about coreference can be derived from exposure to language, and also indicates features of language models that allow them to better reflect human behavior. Comment: Accepted at COLING 202

    Does clinical management improve outcomes following self-harm? Results from the multicentre study of self-harm in England

    Background: Evidence to guide clinical management of self-harm is sparse, trials have recruited selected samples, and psychological treatments that are suggested in guidelines may not be available in routine practice. Aims: To examine how the management that patients receive in hospital relates to subsequent outcome. Methods: We identified episodes of self-harm presenting to three UK centres (Derby, Manchester, Oxford) over a 10 year period (2000 to 2009). We used established data collection systems to investigate the relationship between four aspects of management (psychosocial assessment, medical admission, psychiatric admission, referral for specialist mental health follow up) and repetition of self-harm within 12 months, adjusted for differences in baseline demographic and clinical characteristics. Results: 35,938 individuals presented with self-harm during the study period. In two of the three centres, receiving a psychosocial assessment was associated with a 40% lower risk of repetition, Hazard Ratios (95% CIs): Centre A 0.99 (0.90–1.09); Centre B 0.59 (0.48–0.74); Centre C 0.59 (0.52–0.68). There was little indication that the apparent protective effects were mediated through referral and follow up arrangements. The association between psychosocial assessment and a reduced risk of repetition appeared to be least evident in those from the most deprived areas. Conclusion: These findings add to the growing body of evidence that thorough assessment is central to the management of self-harm, but further work is needed to elucidate the possible mechanisms and explore the effects in different clinical subgroups.

    Unraveling the Relation Between Reading Comprehension and Print Exposure

    The purpose of this study was to test the directionality of influence between reading comprehension (RC) and print exposure (PE), thereby estimating genetic and environmental effects of this relation. The sample consisted of 910 twins in fourth through ninth grades (mean age = 12.33 years, SD = 1.41) from the Florida Twin Project on Reading, Behavior, and Environment. Using a direction-of-causation model in a twin design, results supported a direction of influence running from RC to PE. This relation was underpinned by genetic and environmental factors of RC as well as PE. Implications for reading education are discussed.

    Chemically defined culture media: rational recipes or witches' brew?

    A rational approach to study cells, tissues or even organs is to isolate them from the body and bring them into a controlled, and therefore reproducible, environment. In vivo, cells are surrounded by the extracellular matrix, and the body fluids nourish them. In vitro, these fluids are replaced by culture media. In the early days of tissue culture, tissue was cultured in a drop of clotted lymph. The early-day natural nutrient media have gradually become replaced by media of a more defined composition, culminating in the advent of completely defined culture media. Biomedical Reviews 1996; 6: 111-119

    Chlamydia control activities in Europe: cross-sectional survey

    Background: Chlamydia is the most commonly reported bacterial sexually transmitted infection in Europe. The objective of the Screening for Chlamydia in Europe (SCREen) project was to describe current and planned chlamydia control activities in Europe. Methods: The authors sent a questionnaire asking about different aspects of chlamydia epidemiology and control to public health and clinical experts in each country in 2007. The principles of sexually transmitted infection control were used to develop a typology comprising five categories of chlamydia control activities. Each country was assigned to a category, based on responses to the questionnaire. Results: Experts in 29 of 33 (88%) invited countries responded. Thirteen of 29 countries (45%) had no current chlamydia control activities. Six countries in this group stated that there were plans to introduce chlamydia screening programmes. There were five countries (17%) with case management guidelines only. Three countries (10%) also recommended case finding amongst partners of diagnosed chlamydia cases or people with another sexually transmitted infection. Six countries (21%) further specified groups of asymptomatic people eligible for opportunistic chlamydia testing. Two countries (7%) reported a chlamydia screening programme. There was no consistent association between the per capita gross domestic product of a country and the intensity of chlamydia control activities (P = 0.816). Conclusion: A newly developed classification system allowed the breadth of ongoing national chlamydia control activities to be described and categorized. Chlamydia control strategies should ensure that clinical guidelines to optimize chlamydia diagnosis and case management have been implemented before considering the appropriateness of screening programmes.