
    Emergent inabilities? Inverse scaling over the course of pretraining

    Does inverse scaling only occur as a function of model size, or can it also occur over the course of training? We carry out an exploratory study investigating whether the performance of language models on specific tasks can decrease (while general performance remains high) during training on the language modeling task. We find 8 tasks on which Pythia 12B (Biderman et al., 2023) shows decreased performance over the course of training. Five of these tasks (TruthfulQA-MC1, TruthfulQA-MC2, Hindsight Neglect, Memo Trap, and Pattern Match Suppression) additionally show a consistent relationship whereby larger language models show a greater decrease in performance the more they are trained, despite showing standard (positive) scaling overall. This highlights the importance of testing performance on all relevant benchmarks any time models are trained on additional data, even if their overall performance improves. Comment: Accepted to Findings of EMNLP 202

    Late Albian adaptive radiation in the calcareous nannofossil genus Eiffellithus

    Do language models make human-like predictions about the coreferents of Italian anaphoric zero pronouns?

    Some languages allow arguments to be omitted in certain contexts. Yet human language comprehenders reliably infer the intended referents of these zero pronouns, in part because they construct expectations about which referents are more likely. We ask whether neural language models also extract the same expectations. We test whether 12 contemporary language models display expectations that reflect human behavior when exposed to sentences with zero pronouns from five behavioral experiments conducted in Italian by Carminati (2005). We find that three models (XGLM 2.9B, 4.5B, and 7.5B) capture the human behavior from all the experiments, with others successfully modeling some of the results. This result suggests that human expectations about coreference can be derived from exposure to language, and also indicates features of language models that allow them to better reflect human behavior. Comment: Accepted at COLING 202

    Can Peanuts Fall in Love with Distributional Semantics?

    The context in which a sentence appears can drastically alter our expectations about upcoming words. For example, following a short story involving an anthropomorphic peanut, experimental participants are more likely to expect the sentence 'the peanut was in love' than 'the peanut was salted', as indexed by N400 amplitude (Nieuwland & van Berkum, 2006). This rapid and dynamic updating of comprehenders' expectations about the kind of events that a peanut may take part in based on context has been explained using the construct of Situation Models: updated mental representations of key elements of an event under discussion, in this case, the peanut protagonist. However, recent work showing that N400 amplitude can be predicted based on distributional information alone raises the question of whether situation models are in fact necessary for the kinds of contextual effects observed in previous work. To investigate this question, we attempt to model the results of Nieuwland and van Berkum (2006) using six computational language models and three sets of word vectors, none of which have explicit situation models or semantic grounding. We find that the effect found by Nieuwland and van Berkum (2006) can be fully modeled by two language models and two sets of word vectors, with others showing a reduced effect. Thus, at least some processing effects normally explained through situation models may not in fact require explicit situation models.

    Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models

    Abstract grammatical knowledge - of parts of speech and grammatical patterns - is key to the capacity for linguistic generalization in humans. But how abstract is grammatical knowledge in large language models? In the human literature, compelling evidence for grammatical abstraction comes from structural priming. A sentence that shares the same grammatical structure as a preceding sentence is processed and produced more readily. Because confounds exist when using stimuli in a single language, evidence of abstraction is even more compelling from crosslingual structural priming, where use of a syntactic structure in one language primes an analogous structure in another language. We measure crosslingual structural priming in large language models, comparing model behavior to human experimental results from eight crosslingual experiments covering six languages, and four monolingual structural priming experiments in three non-English languages. We find evidence for abstract monolingual and crosslingual grammatical representations in the models that function similarly to those found in humans. These results demonstrate that grammatical representations in multilingual language models are not only similar across languages, but they can causally influence text produced in different languages. Comment: Accepted at EMNLP 202

    Price Discovery and the Accuracy of Consolidated Data Feeds in the U.S. Equity Markets

    Both the scientific community and the popular press have paid much attention to the speed of the Securities Information Processor, the data feed consolidating all trades and quotes across the US stock market. Rather than the speed of the Securities Information Processor, or SIP, we focus here on its accuracy. Relying on Trade and Quote data, we provide various measures of SIP latency relative to high-speed data feeds between exchanges, known as direct feeds. We use first differences to highlight not only the divergence between the direct feeds and the SIP, but also the fundamental inaccuracy of the SIP. We find that 60 percent or more of trades are reported out of sequence for stocks with high trade volume, thereby skewing simple measures such as returns. While not yet definitive, this analysis supports our preliminary conclusion that the underlying infrastructure of the SIP is currently unable to keep pace with the trading activity in today's stock market. Comment: 18 pages, 20 figures, 2 tables

    Accelerating slip rates on the Puente Hills blind thrust fault system beneath metropolitan Los Angeles, California, USA

    Slip rates represent the average displacement across a fault over time and are essential to estimating earthquake recurrence for probabilistic seismic hazard assessments. We demonstrate that the slip rate on the western segment of the Puente Hills blind thrust fault system, which is beneath downtown Los Angeles, California (USA), has accelerated from ∼0.22 mm/yr in the late Pleistocene to ∼1.33 mm/yr in the Holocene. Our analysis is based on syntectonic strata derived from the Los Angeles River, which has continuously buried a fold scarp above the blind thrust. Slip on the fault beneath our field site began during the late-middle Pleistocene and progressively increased into the Holocene. This increase in rate implies that the magnitudes and/or the frequency of earthquakes on this fault segment have increased over time. This challenges the characteristic earthquake model and presents an evolving and potentially increasing seismic hazard to metropolitan Los Angeles.

    Are genetic risk factors for psychosis also associated with dimension-specific psychotic experiences in adolescence?

    Psychosis has been hypothesised to be a continuously distributed quantitative phenotype, with disorders such as schizophrenia and bipolar disorder representing its extreme manifestations. Evidence suggests that common genetic variants play an important role in liability to both schizophrenia and bipolar disorder. Here we tested the hypothesis that these common variants would also influence psychotic experiences measured dimensionally in adolescents in the general population. Our aim was to test whether schizophrenia and bipolar disorder polygenic risk scores (PRS), as well as specific single nucleotide polymorphisms (SNPs) previously identified as risk variants for schizophrenia, were associated with adolescent dimension-specific psychotic experiences. Self-reported Paranoia, Hallucinations, Cognitive Disorganisation, Grandiosity, Anhedonia, and Parent-rated Negative Symptoms, as measured by the Specific Psychotic Experiences Questionnaire (SPEQ), were assessed in a community sample of 2,152 16-year-olds. Polygenic risk scores were calculated using estimates of the log of odds ratios from the Psychiatric Genomics Consortium GWAS stage-1 mega-analysis of schizophrenia and bipolar disorder. The polygenic risk analyses yielded no significant associations between schizophrenia and bipolar disorder PRS and the SPEQ measures. The analyses on the 28 individual SNPs previously associated with schizophrenia found that two SNPs in TCF4 returned a significant association with the SPEQ Paranoia dimension, rs17512836 (p-value=2.57×10⁻⁴) and rs9960767 (p-value=6.23×10⁻⁴). Replication in an independent sample of 16-year-olds (N=3,427) assessed using the Psychotic-Like Symptoms Questionnaire (PLIKS-Q), a composite measure of multiple positive psychotic experiences, failed to yield significant results. Future research with PRS derived from larger samples, as well as larger adolescent validation samples, would improve the predictive power to test these hypotheses further. The challenges of relating adult clinical diagnostic constructs such as schizophrenia to adolescent psychotic experiences at a genetic level are discussed.

    A Comparison of Rectal Diazepam Gel and Placebo for Acute Repetitive Seizures

    Background: Acute repetitive seizures are readily recognizable episodes involving increased seizure frequency. Urgent treatment is often required. Rectal diazepam gel is a promising therapy. Methods: We conducted a randomized, double-blind, parallel-group, placebo-controlled study of home-based treatment for acute repetitive seizures. Patients were randomly assigned to receive either rectal diazepam gel, at doses ranging from 0.2 to 0.5 mg per kilogram of body weight on the basis of age, or placebo. Children received one dose at the onset of acute repetitive seizures and a second dose four hours later. Adults received three doses: one dose at onset, and two more doses 4 and 12 hours after onset. Treatment was administered by a care giver, such as a parent, who had received special training. The number of seizures after the first dose was counted for 12 hours in children and for 24 hours in adults. Results: Of 125 study patients (64 assigned to diazepam and 61 to placebo) with a history of acute repetitive seizures, 91 (47 children and 44 adults) were treated for an exacerbation of seizures during the study period. Diazepam treatment was superior to placebo with regard to the outcome variables related to efficacy: reduced seizure frequency (P<0.001) and improved global assessment of treatment outcome by the care giver (frequency and severity of seizures and drug toxicity) (P<0.001). Post hoc analysis showed diazepam to be superior to placebo in reducing seizure frequency in both children (P<0.001) and adults (P=0.02), but only in children was it superior with regard to improvement in global outcome (P<0.001). The time to the first recurrence of seizures after initial treatment was longer for the patients receiving diazepam (P<0.001). Thirty-five patients reported at least one adverse effect of treatment; somnolence was the most frequent. Respiratory depression was not reported. Conclusions: Rectal diazepam gel, administered at home by trained care givers, is an effective and well-tolerated treatment for acute repetitive seizures. (N Engl J Med 1998;338:1869-75.)