ScanDL: A Diffusion Model for Generating Synthetic Scanpaths on Texts
Eye movements in reading play a crucial role in psycholinguistic research
studying the cognitive mechanisms underlying human language processing. More
recently, the tight coupling between eye movements and cognition has also been
leveraged for language-related machine learning tasks such as the
interpretability, enhancement, and pre-training of language models, as well as
the inference of reader- and text-specific properties. However, the scarcity of eye
movement data and its unavailability at application time pose a major
challenge for this line of research. Initially, this problem was tackled by
resorting to cognitive models for synthesizing eye movement data. However, for
the sole purpose of generating human-like scanpaths, purely data-driven
machine-learning-based methods have proven to be more suitable. Following
recent advances in adapting diffusion processes to discrete data, we propose
ScanDL, a novel discrete sequence-to-sequence diffusion model that generates
synthetic scanpaths on texts. By leveraging pre-trained word representations
and jointly embedding both the stimulus text and the fixation sequence, our
model captures multi-modal interactions between the two inputs. We evaluate
ScanDL within- and across-dataset and demonstrate that it significantly
outperforms state-of-the-art scanpath generation methods. Finally, we provide
an extensive psycholinguistic analysis that underlines the model's ability to
exhibit human-like reading behavior. Our implementation is made available at
https://github.com/DiLi-Lab/ScanDL. (Comment: EMNLP 2023)
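The abstract above describes a discrete diffusion process over fixation sequences. The following toy sketch illustrates the general idea behind such models (forward masking noise plus a denoising step), not ScanDL's actual architecture; the `MASK` token, the noising schedule, and the stand-in denoiser are all hypothetical, and a trained model would condition the denoising step on the embedded stimulus text.

```python
# Toy sketch of absorbing-state discrete diffusion over a scanpath.
# Illustrative only; not the ScanDL architecture.
import random

MASK = -1  # absorbing "noise" token

def noise(scanpath, t, T, rng):
    """Forward process: mask each fixation independently with prob t/T."""
    return [MASK if rng.random() < t / T else w for w in scanpath]

def denoise_step(noised, n_words, rng):
    """Stand-in denoiser: fill masks uniformly at random.
    A trained model would predict these conditioned on the stimulus text."""
    return [rng.randrange(n_words) if w == MASK else w for w in noised]

rng = random.Random(0)
scanpath = [0, 1, 1, 3, 2, 4]  # fixated word indices on a 5-word sentence
noised = noise(scanpath, t=8, T=10, rng=rng)
recon = denoise_step(noised, n_words=5, rng=rng)
```

Repeating the denoising step from high to low `t` with a learned predictor is what turns this corruption scheme into a generative model of scanpaths.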
Using Gaze for Behavioural Biometrics
A principled approach to the analysis of eye movements for behavioural biometrics is laid down. The approach is grounded in foraging theory, which provides a sound basis for capturing the uniqueness of individual eye movement behaviour. We propose a composite Ornstein-Uhlenbeck process for quantifying the exploration/exploitation signature characterising foraging eye behaviour. The relevant parameters of the composite model, inferred from eye-tracking data via Bayesian analysis, are shown to yield a suitable feature set for biometric identification; the latter is eventually accomplished via a classical classification technique. A proof of concept of the method is provided by measuring its identification performance on a publicly available dataset. Data and code for reproducing the analyses are made available. Overall, we argue that the approach offers a fresh view on both the analysis of eye-tracking data and prospective applications in this field
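To make the modelling idea concrete, here is a minimal sketch (not the paper's composite model or its Bayesian inference): a one-dimensional Ornstein-Uhlenbeck gaze coordinate is simulated by Euler-Maruyama, and its parameters are recovered with a simple least-squares AR(1) fit. In a biometric pipeline, the recovered parameters (mean-reversion rate, mean, diffusion) would form per-observer features; all numbers here are illustrative.

```python
# OU process: dx = theta * (mu - x) dt + sigma dW
import numpy as np

def simulate_ou(theta, mu, sigma, dt=0.01, n=5000, seed=0):
    """Euler-Maruyama simulation of a 1-D Ornstein-Uhlenbeck process."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = mu
    for i in range(1, n):
        x[i] = (x[i - 1] + theta * (mu - x[i - 1]) * dt
                + sigma * np.sqrt(dt) * rng.standard_normal())
    return x

def fit_ou(x, dt=0.01):
    """AR(1) fit x[t+1] = a*x[t] + b + eps, mapped back to OU parameters."""
    a, b = np.polyfit(x[:-1], x[1:], 1)   # slope, intercept
    theta = (1 - a) / dt
    mu = b / (1 - a)
    resid = x[1:] - (a * x[:-1] + b)
    sigma = resid.std() / np.sqrt(dt)
    return theta, mu, sigma

x = simulate_ou(theta=2.0, mu=0.5, sigma=0.3)
theta_hat, mu_hat, sigma_hat = fit_ou(x)  # biometric feature candidates
```

The paper's Bayesian analysis would replace the AR(1) fit with full posterior inference over the composite model's parameters, but the feature-extraction logic is the same in spirit.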
Detecting expert’s eye using a multiple-kernel Relevance Vector Machine
Decoding mental states from the pattern of neural activity or overt behavior is an intensely pursued goal. Here we applied machine learning to detect expertise from the oculomotor behavior of novice and expert billiard players during free viewing of a filmed billiard match with no specific task, and in a dynamic trajectory prediction task involving ad-hoc, occluded billiard shots. We adopted a grounded framework for feature space fusion and a Bayesian sparse classifier, namely, a Relevance Vector Machine. By testing different combinations of simple oculomotor features (gaze shift amplitude and direction, and fixation duration), we could classify on an individual basis which group - novice or expert - the observers belonged to, with an accuracy of 82% and 87% for the match and the shots, respectively. These results provide evidence that, at least in the particular domain of billiard sport, a signature of expertise is hidden in very basic aspects of oculomotor behavior, and that expertise can be detected at the individual level both with ad-hoc testing conditions and under naturalistic conditions - and suitable data mining. Our procedure paves the way for the development of a test for the "expert's eye", and promotes the use of eye movements as an additional signal source in Brain-Computer Interface (BCI) systems
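The oculomotor features named in the abstract are simple to compute from a fixation sequence. The sketch below derives gaze-shift amplitude, shift direction, and fixation duration from hypothetical (x, y, duration) fixations; the Relevance Vector Machine classifier itself is not reproduced here, and any sparse Bayesian classifier could consume the resulting summary features.

```python
# Extract basic oculomotor features from a fixation sequence.
# Fixation data are hypothetical; units are degrees of visual angle and ms.
import math

fixations = [(0.0, 0.0, 220), (3.0, 4.0, 180), (3.0, 1.0, 250)]  # (x, y, dur)

amplitudes, directions = [], []
for (x0, y0, _), (x1, y1, _) in zip(fixations, fixations[1:]):
    dx, dy = x1 - x0, y1 - y0
    amplitudes.append(math.hypot(dx, dy))                # shift amplitude (deg)
    directions.append(math.degrees(math.atan2(dy, dx)))  # shift direction (deg)
durations = [d for _, _, d in fixations]                 # fixation durations (ms)

# Per-observer feature vector: summary statistics of these distributions,
# ready for a (sparse Bayesian) classifier.
feature_vector = [sum(amplitudes) / len(amplitudes),
                  sum(durations) / len(durations)]
```

Histogram-based descriptors of the three distributions, fused across the feature spaces as in the study's multiple-kernel setup, would be the natural next step.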
Pre-Trained Language Models Augmented with Synthetic Scanpaths for Natural Language Understanding
Human gaze data offer cognitive information that reflects natural language
comprehension. Indeed, augmenting language models with human scanpaths has
proven beneficial for a range of NLP tasks, including language understanding.
However, the applicability of this approach is hampered because the abundance
of text corpora is contrasted by a scarcity of gaze data. Although models for
the generation of human-like scanpaths during reading have been developed, the
potential of synthetic gaze data across NLP tasks remains largely unexplored.
We develop a model that integrates synthetic scanpath generation with a
scanpath-augmented language model, eliminating the need for human gaze data.
Since the model's error gradient can be propagated throughout all parts of the
model, the scanpath generator can be fine-tuned to downstream tasks. We find
that the proposed model not only outperforms the underlying language model, but
achieves a performance that is comparable to a language model augmented with
real human gaze data. Our code is publicly available. (Comment: Pre-print for EMNLP 2023)
The Role of Eye Gaze in Security and Privacy Applications: Survey and Future HCI Research Directions
For the past 20 years, researchers have investigated the use of eye tracking in security applications. We present a holistic view on gaze-based security applications. In particular, we canvassed the literature and classify the utility of gaze in security applications into a) authentication, b) privacy protection, and c) gaze monitoring during security-critical tasks. This allows us to chart several research directions, most importantly 1) conducting field studies of implicit and explicit gaze-based authentication due to recent advances in eye tracking, 2) research on gaze-based privacy protection and gaze monitoring in security-critical tasks, which are under-investigated yet very promising areas, and 3) understanding the privacy implications of pervasive eye tracking. We discuss the most promising opportunities and most pressing challenges of eye tracking for security that will shape research in gaze-based security applications for the next decade
Deep Eyedentification: Biometric Identification using Micro-Movements of the Eye
We study involuntary micro-movements of the eye for biometric identification.
While prior studies extract lower-frequency macro-movements from the output of
video-based eye-tracking systems and engineer explicit features of these
macro-movements, we develop a deep convolutional architecture that processes
the raw eye-tracking signal. Compared to prior work, the network attains a
lower error rate by one order of magnitude and is faster by two orders of
magnitude: it identifies users accurately within seconds
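The abstract contrasts the network's raw-signal input with prior work's engineered macro-movement features. As a point of reference, the sketch below shows the classical preprocessing such a deep model bypasses: point-to-point velocity computation and an I-VT-style threshold separating saccadic (macro) samples from fixational (micro) ones. The threshold and the simulated signal are illustrative, not from the paper.

```python
# Classical velocity-threshold (I-VT style) preprocessing of a gaze signal.
# Simulated data: slow fixational drift plus one 5-degree saccade.
import numpy as np

def velocity(x, y, hz):
    """Point-to-point angular velocity (deg/s) from gaze coordinates (deg)."""
    return np.hypot(np.diff(x), np.diff(y)) * hz

rng = np.random.default_rng(1)
hz = 1000.0                                  # sampling rate (Hz)
x = np.cumsum(rng.normal(0, 0.001, 2000))    # fixational drift, degrees
y = np.cumsum(rng.normal(0, 0.001, 2000))
x[1000:] += 5.0                              # a single 5-degree gaze shift

v = velocity(x, y, hz)
is_saccade = v > 50.0                        # illustrative I-VT threshold
```

A convolutional network operating on the raw `(x, y)` stream, as in the paper, can instead learn to exploit the sub-threshold micro-movement structure that this kind of thresholding discards.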
Scanpath modeling and classification with Hidden Markov Models
How people look at visual information reveals fundamental information about them: their interests and their states of mind. Previous studies showed that the scanpath, i.e., the sequence of eye movements made by an observer exploring a visual stimulus, can be used to infer observer-related (e.g., task at hand) and stimuli-related (e.g., image semantic category) information. However, eye movements are complex signals and many of these studies rely on limited gaze descriptors and bespoke datasets. Here, we provide a turnkey method for scanpath modeling and classification. This method relies on variational hidden Markov models (HMMs) and discriminant analysis (DA). HMMs encapsulate the dynamic and individualistic dimensions of gaze behavior, allowing DA to capture systematic patterns diagnostic of a given class of observers and/or stimuli. We test our approach on two very different datasets. Firstly, we use fixations recorded while viewing 800 static natural scene images, and infer an observer-related characteristic: the task at hand. We achieve an average of 55.9% correct classification rate (chance = 33%). We show that correct classification rates positively correlate with the number of salient regions present in the stimuli. Secondly, we use eye positions recorded while viewing 15 conversational videos, and infer a stimulus-related characteristic: the presence or absence of the original soundtrack. We achieve an average of 81.2% correct classification rate (chance = 50%). HMMs allow us to integrate bottom-up, top-down, and oculomotor influences into a single model of gaze behavior. This synergistic approach between behavior and machine learning will open new avenues for simple quantification of gazing behavior. We release SMAC with HMM, a Matlab toolbox freely available to the community under an open-source license agreement.
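The core classification idea, scoring a scanpath under class-specific HMMs, can be sketched compactly. The example below implements the scaled forward algorithm over a discretized scanpath (a sequence of region-of-interest indices) and picks the higher-likelihood of two hand-set HMMs; the paper's variational HMM fitting and discriminant analysis are not reproduced, and all parameters here are illustrative.

```python
# Classify a discretized scanpath by HMM log-likelihood (forward algorithm).
import numpy as np

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    alpha = pi * B[:, obs[0]]
    c = alpha.sum()
    loglik = np.log(c)
    alpha = alpha / c
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        c = alpha.sum()
        loglik += np.log(c)
        alpha = alpha / c
    return loglik

pi = np.array([0.5, 0.5])
B = np.array([[0.9, 0.1],          # near-identity emissions: state ~ ROI
              [0.1, 0.9]])
A_sticky = np.array([[0.9, 0.1],   # "focused" class: rarely switches ROI
                     [0.1, 0.9]])
A_switchy = np.array([[0.2, 0.8],  # "scanning" class: alternates ROIs
                      [0.8, 0.2]])

scanpath = [0, 0, 0, 0, 1, 1, 1, 1]  # fixation ROI sequence for one trial
scores = {"focused": forward_loglik(scanpath, pi, A_sticky, B),
          "scanning": forward_loglik(scanpath, pi, A_switchy, B)}
predicted = max(scores, key=scores.get)
```

In the full pipeline, one HMM is fit per observer or per class, and the vector of per-model log-likelihoods (or the HMM parameters themselves) feeds the discriminant analysis stage.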