48 research outputs found

In-Context Learning Creates Task Vectors

Author: Geva Mor
Globerson Amir
Hendel Roee
Publication date: 24/10/2023

In-context learning (ICL) in Large Language Models (LLMs) has emerged as a powerful new learning paradigm. However, its underlying mechanism is still not well understood. In particular, it is challenging to map it to the "standard" machine learning framework, where one uses a training set $S$ to find a best-fitting function $f(x)$ in some hypothesis class. Here we make progress on this problem by showing that the functions learned by ICL often have a very simple structure: they correspond to the transformer LLM whose only inputs are the query $x$ and a single "task vector" calculated from the training set. Thus, ICL can be seen as compressing $S$ into a single task vector $\boldsymbol{\theta}(S)$ and then using this task vector to modulate the transformer to produce the output. We support the above claim via comprehensive experiments across a range of models and tasks.

Comment: Accepted at Findings of EMNLP 202

arXiv.org e-Print Archive

Evaluating the Ripple Effects of Knowledge Editing in Language Models

Author: Biran Eden
Cohen Roi
Geva Mor
Globerson Amir
Yoran Ori
Publication date: 24/07/2023

Modern language models capture a large body of factual knowledge. However, some facts can be incorrectly induced or become obsolete over time, resulting in factually incorrect generations. This has led to the development of various editing methods that allow updating facts encoded by the model. Evaluation of these methods has primarily focused on testing whether an individual fact has been successfully injected, and whether similar predictions for other subjects have not changed. Here we argue that such evaluation is limited, since injecting one fact (e.g., "Jack Depp is the son of Johnny Depp") introduces a "ripple effect" in the form of additional facts that the model needs to update (e.g., "Jack Depp is the sibling of Lily-Rose Depp"). To address this issue, we propose a novel set of evaluation criteria that consider the implications of an edit on related facts. Using these criteria, we then construct RippleEdits, a diagnostic benchmark of 5K factual edits, capturing a variety of types of ripple effects. We evaluate prominent editing methods on RippleEdits, showing that current methods fail to introduce consistent changes in the model's knowledge. In addition, we find that a simple in-context editing baseline obtains the best scores on our benchmark, suggesting a promising research direction for model editing.

arXiv.org e-Print Archive

Automated Non-Invasive Identification and Localization of Focal Epileptic Activity by Exploiting Information Derived from Surface EEG Recordings

Author: Alon Friedman
Amir Geva
Dan Kerem
Mayer Aladjem
Merav Ben-Asher
Publication venue: 'IntechOpen'
Publication date: 29/02/2012

IntechOpen

The Effects of External Jugular Compression Applied during Head Impact Exposure on Longitudinal Changes in Brain Neuroanatomical and Neurophysiological Biomarkers: A Preliminary Investigation

Author: Adams Janet
Altaye Mekibib
Barber Foss Kim D.
Clark Joseph F.
Dicesare Chris
Geva Amir
Gubanich Paul J.
Khoury Jane C.
Kiefer Adam W.
Krueger Darcy
Leach James
Meehan William P.
Mihalik Jason P.
Myer Gregory D.
Reches Amit
Smith David
Thomas Staci
Weiss Michal
Yuan Weihong
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2016

Objectives: Utilize a prospective in vivo clinical trial to evaluate the potential for mild neck compression applied during head impact exposure to reduce anatomical and physiological biomarkers of brain injury. Methods: This project utilized a prospective randomized controlled trial to evaluate effects of mild jugular vein (neck) compression (collar) relative to controls (no collar) during a competitive hockey season (males; 16.3 ± 1.2 years). The collar was designed to mildly compress the jugular vein bilaterally with the goal to increase intracranial blood volume to reduce risk of brain slosh injury during head impact exposure. Helmet sensors were used to collect daily impact data in excess of 20 g (games and practices) and the primary outcome measures, which included changes in white matter (WM) microstructure, were assessed by diffusion tensor imaging (DTI). Specifically, four DTI measures: fractional anisotropy, mean diffusivity (MD), axial diffusivity, and radial diffusivity (RD) were used in the study. These metrics were analyzed using the tract-based Spatial Statistics (TBSS) approach – a voxel-based analysis. In addition, electroencephalography-derived event-related potentials were used to assess changes in brain network activation (BNA) between study groups. Results: For athletes not wearing the collar, DTI measures corresponding to a disruption of WM microstructure, including MD and RD, increased significantly from pre-season to mid-season (p 0.05). In addition to these anatomical findings, electrophysiological network analysis of the degree of congruence in the network electrophysiological activation pattern demonstrated concomitant changes in brain network dynamics in the non-collar group only (p < 0.05). Similar to the DTI findings, the increased change in BNA score in the non-collar relative to the collar group was statistically significant (p < 0.01). 
Changes in DTI outcomes were also directly correlated with altered brain network dynamics (r = 0.76; p < 0.05) as measured by BNA. Conclusion: Group differences in the longitudinal changes in both neuroanatomical and electrophysiological measures, as well as the correlation between the measures, provide initial evidence indicating that mild jugular vein compression may have reduced alterations in the WM response to head impacts during a competitive hockey season. The data indicate sport-related alterations in WM microstructure were ameliorated by application of jugular compression during head impact exposure. These results may lead to a novel line of research inquiry to evaluate the effects of protecting the brain from sports-related head impacts via optimized intracranial fluid dynamics

Crossref

Harvard University - DASH

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

Carolina Digital Repository

The Relative Contribution of High-Gamma Linguistic Processing Stages of Word Production, and Motor Imagery of Articulation in Class Separability of Covert Speech Tasks in EEG Data

Author: A Cummingsn
A Delorme
A Flinker
A Llorens
A Vuckovic
Amir Jahangiri
BW McMenamin
BW McMenamina
C Porcaro
C Vidaurre
C Xun
CL Asplund
DL Davies
E Kaan
EC Leuthardt
Eduard Kraft
EM Whitham
F Darvas
F Pulvermoller
Francisco Sepulveda
GD Schott
II Goncharova
J Numminena
J Onton
JD Greenlee
JS Brumberg
K Sutha
K Watkins
LY Ganushchak
M Fukuda
M Leinenger
MA Pitts
NO Schiller
NS Dhanjal
O Creutzfeldt
P Indefrey
PK McGuire
Q Shie
RC Martin
RQ Quiroga
S Blanco
S Chakrabarti
S Geva
S Qian
SD Muthukumaraswamy
SM Varghese
SN Baker
U Maulik
V Jurcak
V Kumar
VL Towle
X Chi
X Pei
X Tian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/02/2019

Word production begins with high-Gamma automatic linguistic processing functions followed by speech motor planning and articulation. Phonetic properties are processed in both linguistic and motor stages of word production. Four phonetically dissimilar phonemic structures “BA”, “FO”, “LE”, and “RY” were chosen as covert speech tasks. Ten neurologically healthy volunteers with the age range of 21–33 participated in this experiment. Participants were asked to covertly speak a phonemic structure when they heard an auditory cue. EEG was recorded with 64 electrodes at 2048 samples/s. Initially, one-second trials were used, which contained linguistic and motor imagery activities. The four-class true positive rate was calculated. In the next stage, 312 ms trials were used to exclude covert articulation from analysis. By eliminating the covert articulation stage, the four-class grand average classification accuracy dropped from 96.4% to 94.5%. The most valuable features emerge after Auditory cue recognition (~100 ms post onset), and within the 70–128 Hz frequency range. The most significant identified brain regions were the Prefrontal Cortex (linked to stimulus driven executive control), Wernicke’s area (linked to Phonological code retrieval), the right IFG, and Broca’s area (linked to syllabification). Alpha and Beta band oscillations associated with motor imagery do not contain enough information to fully reflect the complexity of speech movements. Over 90% of the most class-dependent features were in the 30-128 Hz range, even during the covert articulation stage. As a result, compared to linguistic functions, the contribution of motor imagery of articulation in class separability of covert speech tasks from EEG data is negligible

University of Essex Research Repository

Crossref

Brain Network Activation Analysis Utilizing Spatiotemporal Features for Event Related Potentials Classification

Author: Amir Geva
Amit Reches
Yaki Stern
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2016

The purpose of this study was to introduce an improved tool for automated classification of event-related potentials (ERPs) using spatiotemporally parcellated events incorporated into a functional brain network activation (BNA) analysis. The auditory oddball ERP paradigm was selected to demonstrate and evaluate the improved tool. Methods: The ERPs of each subject were decomposed into major dynamic spatiotemporal events. Then, a set of spatiotemporal events representing the group was generated by aligning and clustering the spatiotemporal events of all individual subjects. The temporal relationship between the common group events generated a network, which is the spatiotemporal reference BNA model. Scores were derived by comparing each subject’s spatiotemporal events to the reference BNA model and were then entered into a support vector machine classifier to classify subjects into relevant subgroups. The reliability of the BNA scores (test-retest repeatability using intraclass correlation) and their utility as a classification tool were examined in the context of Target-Novel classification. Results: BNA intraclass correlation values of repeatability ranged between 0.51 and 0.82 for the known ERP components N100, P200 and P300. Classification accuracy was high when the trained data were validated on the same subjects for different visits (AUCs 0.93 and 0.95). The classification accuracy remained high for a test group recorded at a different clinical center with a different recording system (AUCs 0.81, 0.85 for 2 visits). Conclusion: The improved spatiotemporal BNA analysis demonstrates high classification accuracy. The BNA analysis method holds promise as a tool for diagnosis, follow-up and drug development associated with different neurological conditions

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

Simulation of cognitive disturbances by a dynamic threshold semantic neural network

Author: Amir B. Geva
Avi Peled
Publication venue: 'Cambridge University Press (CUP)'

Crossref

Errors in a nonlinear graphic-semantic mapping task resulting from lesions in Boltzmann machine: Is it relevant to dyslexia?

Author: Amir B. Geva
Lior Shtram
Shai Policker
Publication venue: 'Cambridge University Press (CUP)'

Crossref