48 research outputs found

    In-Context Learning Creates Task Vectors

    In-context learning (ICL) in Large Language Models (LLMs) has emerged as a powerful new learning paradigm. However, its underlying mechanism is still not well understood. In particular, it is challenging to map it to the "standard" machine learning framework, where one uses a training set $S$ to find a best-fitting function $f(x)$ in some hypothesis class. Here we make progress on this problem by showing that the functions learned by ICL often have a very simple structure: they correspond to the transformer LLM whose only inputs are the query $x$ and a single "task vector" calculated from the training set. Thus, ICL can be seen as compressing $S$ into a single task vector $\boldsymbol{\theta}(S)$ and then using this task vector to modulate the transformer to produce the output. We support the above claim via comprehensive experiments across a range of models and tasks. Comment: Accepted at Findings of EMNLP 202
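
The decomposition described in this abstract can be written compactly. This is only a sketch: the symbols $\mathcal{A}$ (a map from the demonstrations to a task vector) and $T$ (the transformer applied to the query) are notation assumed here for illustration and do not appear in the listing.

```latex
\begin{align}
  \boldsymbol{\theta}(S) &= \mathcal{A}(S), \\
  \hat{y} &= T\bigl(x;\, \boldsymbol{\theta}(S)\bigr).
\end{align}
```

Here $S$ is the set of in-context demonstrations and $x$ is the query. The claim is that the prediction $\hat{y}$ depends on $S$ only through the single vector $\boldsymbol{\theta}(S)$.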

    Evaluating the Ripple Effects of Knowledge Editing in Language Models

    Modern language models capture a large body of factual knowledge. However, some facts can be incorrectly induced or become obsolete over time, resulting in factually incorrect generations. This has led to the development of various editing methods that allow updating facts encoded by the model. Evaluation of these methods has primarily focused on testing whether an individual fact has been successfully injected, and whether similar predictions for other subjects have not changed. Here we argue that such evaluation is limited, since injecting one fact (e.g. "Jack Depp is the son of Johnny Depp") introduces a "ripple effect" in the form of additional facts that the model needs to update (e.g. "Jack Depp is the sibling of Lily-Rose Depp"). To address this issue, we propose a novel set of evaluation criteria that consider the implications of an edit on related facts. Using these criteria, we then construct RippleEdits, a diagnostic benchmark of 5K factual edits, capturing a variety of types of ripple effects. We evaluate prominent editing methods on RippleEdits, showing that current methods fail to introduce consistent changes in the model's knowledge. In addition, we find that a simple in-context editing baseline obtains the best scores on our benchmark, suggesting a promising research direction for model editing.

    The Effects of External Jugular Compression Applied during Head Impact Exposure on Longitudinal Changes in Brain Neuroanatomical and Neurophysiological Biomarkers: A Preliminary Investigation

    Objectives: Utilize a prospective in vivo clinical trial to evaluate the potential for mild neck compression applied during head impact exposure to reduce anatomical and physiological biomarkers of brain injury. Methods: This project utilized a prospective randomized controlled trial to evaluate effects of mild jugular vein (neck) compression (collar) relative to controls (no collar) during a competitive hockey season (males; 16.3 ± 1.2 years). The collar was designed to mildly compress the jugular vein bilaterally with the goal of increasing intracranial blood volume to reduce the risk of brain slosh injury during head impact exposure. Helmet sensors were used to collect daily impact data in excess of 20 g (games and practices), and the primary outcome measures, which included changes in white matter (WM) microstructure, were assessed by diffusion tensor imaging (DTI). Specifically, four DTI measures were used in the study: fractional anisotropy, mean diffusivity (MD), axial diffusivity, and radial diffusivity (RD). These metrics were analyzed using the Tract-Based Spatial Statistics (TBSS) approach, a voxel-based analysis. In addition, electroencephalography-derived event-related potentials were used to assess changes in brain network activation (BNA) between study groups. Results: For athletes not wearing the collar, DTI measures corresponding to a disruption of WM microstructure, including MD and RD, increased significantly from pre-season to mid-season (p < 0.05). In addition to these anatomical findings, electrophysiological network analysis of the degree of congruence in the network electrophysiological activation pattern demonstrated concomitant changes in brain network dynamics in the non-collar group only (p < 0.05). Similar to the DTI findings, the increased change in BNA score in the non-collar relative to the collar group was statistically significant (p < 0.01). Changes in DTI outcomes were also directly correlated with altered brain network dynamics (r = 0.76; p < 0.05) as measured by BNA. Conclusion: Group differences in the longitudinal changes in both neuroanatomical and electrophysiological measures, as well as the correlation between the measures, provide initial evidence indicating that mild jugular vein compression may have reduced alterations in the WM response to head impacts during a competitive hockey season. The data indicate sport-related alterations in WM microstructure were ameliorated by application of jugular compression during head impact exposure. These results may lead to a novel line of research inquiry to evaluate the effects of protecting the brain from sports-related head impacts via optimized intracranial fluid dynamics.
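
The four DTI measures named in this abstract are standard scalar summaries of the eigenvalues of the diffusion tensor. As a minimal sketch using the textbook definitions (not the study's actual processing pipeline), they can be computed as below. The function name `dti_scalars` and the example eigenvalues are hypothetical.

```python
import math


def dti_scalars(l1, l2, l3):
    """Compute FA, MD, AD and RD from diffusion tensor eigenvalues.

    Assumes the eigenvalues are sorted so that l1 >= l2 >= l3.
    """
    md = (l1 + l2 + l3) / 3.0   # mean diffusivity: average eigenvalue
    ad = l1                     # axial diffusivity: largest eigenvalue
    rd = (l2 + l3) / 2.0        # radial diffusivity: mean of the two smaller ones
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    # Fractional anisotropy: 0 for isotropic diffusion, 1 for purely axial diffusion.
    fa = math.sqrt(1.5 * num / den) if den > 0 else 0.0
    return {"FA": fa, "MD": md, "AD": ad, "RD": rd}


if __name__ == "__main__":
    # Hypothetical eigenvalues in mm^2/s, chosen only for illustration.
    print(dti_scalars(1.7e-3, 0.3e-3, 0.2e-3))
```

Under these definitions, a rise in MD and RD means that diffusion has grown overall and in particular perpendicular to the principal fibre direction, which is why the abstract reads such increases as a disruption of WM microstructure.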

    The Relative Contribution of High-Gamma Linguistic Processing Stages of Word Production, and Motor Imagery of Articulation in Class Separability of Covert Speech Tasks in EEG Data

    Word production begins with high-Gamma automatic linguistic processing functions followed by speech motor planning and articulation. Phonetic properties are processed in both linguistic and motor stages of word production. Four phonetically dissimilar phonemic structures “BA”, “FO”, “LE”, and “RY” were chosen as covert speech tasks. Ten neurologically healthy volunteers aged 21–33 participated in this experiment. Participants were asked to covertly speak a phonemic structure when they heard an auditory cue. EEG was recorded with 64 electrodes at 2048 samples/s. Initially, one-second trials were used, which contained linguistic and motor imagery activities. The four-class true positive rate was calculated. In the next stage, 312 ms trials were used to exclude covert articulation from analysis. By eliminating the covert articulation stage, the four-class grand average classification accuracy dropped from 96.4% to 94.5%. The most valuable features emerge after auditory cue recognition (~100 ms post onset), and within the 70–128 Hz frequency range. The most significant identified brain regions were the Prefrontal Cortex (linked to stimulus-driven executive control), Wernicke’s area (linked to phonological code retrieval), the right IFG, and Broca’s area (linked to syllabification). Alpha and Beta band oscillations associated with motor imagery do not contain enough information to fully reflect the complexity of speech movements. Over 90% of the most class-dependent features were in the 30–128 Hz range, even during the covert articulation stage. As a result, compared to linguistic functions, the contribution of motor imagery of articulation in class separability of covert speech tasks from EEG data is negligible.

    Brain Network Activation Analysis Utilizing Spatiotemporal Features for Event Related Potentials Classification

    The purpose of this study was to introduce an improved tool for automated classification of event-related potentials (ERPs) using spatiotemporally parcellated events incorporated into a functional brain network activation (BNA) analysis. The auditory oddball ERP paradigm was selected to demonstrate and evaluate the improved tool. Methods: The ERPs of each subject were decomposed into major dynamic spatiotemporal events. Then, a set of spatiotemporal events representing the group was generated by aligning and clustering the spatiotemporal events of all individual subjects. The temporal relationship between the common group events generated a network, which is the spatiotemporal reference BNA model. Scores were derived by comparing each subject’s spatiotemporal events to the reference BNA model and were then entered into a support vector machine classifier to classify subjects into relevant subgroups. The reliability of the BNA scores (test-retest repeatability using intraclass correlation) and their utility as a classification tool were examined in the context of Target-Novel classification. Results: BNA intraclass correlation values of repeatability ranged between 0.51 and 0.82 for the known ERP components N100, P200 and P300. Classification accuracy was high when the trained model was validated on the same subjects for different visits (AUCs 0.93 and 0.95). The classification accuracy remained high for a test group recorded at a different clinical center with a different recording system (AUCs 0.81 and 0.85 for 2 visits). Conclusion: The improved spatiotemporal BNA analysis demonstrates high classification accuracy. The BNA analysis method holds promise as a tool for diagnosis, follow-up and drug development associated with different neurological conditions.
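
The AUC figures quoted in this abstract summarize how well classifier scores separate two classes. As a minimal sketch of how an AUC can be computed from scores, the rank-based (Mann-Whitney) formulation is shown below. The function name `roc_auc` and the example data are hypothetical, and this is not the study's SVM pipeline.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical labels (1 = one class, 0 = the other) and classifier scores.
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.8, 0.3, 0.1]
    print(roc_auc(labels, scores))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. On that scale, the reported values of 0.81 to 0.95 are well above chance.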

    Simulation of cognitive disturbances by a dynamic threshold semantic neural network
