1,680 research outputs found

Embodied & Situated Language Processing

Author: Coventry Kenny
Engelhardt Paul
Taylor Lawrence
Publication date: 01/08/2012

Toward Improving the Evaluation of Visual Attention Models: a Crowdsourcing Approach

Author: Gori Marco
Melacci Stefano
Zanca Dario
Publication date: 01/01/2020

Human visual attention is a complex phenomenon. A computational model of this phenomenon must take into account where people look, in order to evaluate which locations are salient (spatial distribution of the fixations); when they look at those locations, to understand the temporal development of the exploration (temporal order of the fixations); and how they move from one location to another with respect to the dynamics of the scene and the mechanics of the eyes (dynamics). State-of-the-art models focus on learning saliency maps from human data, a process that only takes into account the spatial component of the phenomenon and ignores its temporal and dynamical counterparts. In this work we focus on the evaluation methodology of models of human visual attention. We underline the limits of the current metrics for saliency prediction and scanpath similarity, and we introduce a statistical measure for the evaluation of the dynamics of the simulated eye movements. While deep learning models achieve astonishing performance in saliency prediction, our analysis shows their limitations in capturing the dynamics of the process. We find that unsupervised gravitational models, despite their simplicity, outperform all competitors. Finally, exploiting a crowd-sourcing platform, we present a study aimed at evaluating how plausible the scanpaths generated with the unsupervised gravitational models appear to naive and expert human observers.

arXiv.org e-Print Archive

Archivio della Ricerca - Università degli Studi di Siena

Walking across Wikipedia: a scale-free network model of semantic memory retrieval.

Author: Kello Christopher T
Thompson Graham W
Publication venue: eScholarship, University of California
Publication date: 01/01/2014

Semantic knowledge has been investigated using both online and offline methods. One common online method is category recall, in which members of a semantic category like "animals" are retrieved in a given period of time. The order, timing, and number of retrievals are used as assays of semantic memory processes. One common offline method is corpus analysis, in which the structure of semantic knowledge is extracted from texts using co-occurrence or encyclopedic methods. Online measures of semantic processing, as well as offline measures of semantic structure, have yielded data resembling inverse power law distributions. The aim of the present study is to investigate whether these patterns in data might be related. A semantic network model of animal knowledge is formulated on the basis of Wikipedia pages and their overlap in word probability distributions. The network is scale-free, in that node degree is related to node frequency as an inverse power law. A random walk over this network is shown to simulate a number of results from a category recall experiment, including power law-like distributions of inter-response intervals. Results are discussed in terms of theories of semantic structure and processing.
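
As a rough illustration of the random-walk idea in the abstract above, the following Python sketch walks a tiny hand-made graph and treats the first visit to each node as a "retrieval". The graph, the helper name `random_walk_recall`, and the uniform choice among neighbours are all assumptions made for illustration; the study builds its network from Wikipedia pages, and this is not the authors' implementation.

```python
import random

def random_walk_recall(graph, start, n_steps, seed=0):
    """Uniform random walk; first visits count as retrievals (hypothetical helper)."""
    rng = random.Random(seed)
    node = start
    retrieved = [start]      # nodes in order of first visit
    retrieval_steps = [0]    # step index at which each node was first visited
    seen = {start}
    for step in range(1, n_steps + 1):
        node = rng.choice(graph[node])   # move to a uniformly chosen neighbour
        if node not in seen:
            seen.add(node)
            retrieved.append(node)
            retrieval_steps.append(step)
    # Inter-response intervals: steps elapsed between successive new retrievals.
    intervals = [b - a for a, b in zip(retrieval_steps, retrieval_steps[1:])]
    return retrieved, intervals

# Toy adjacency list (made-up links, not data from the study).
TOY_GRAPH = {
    "dog": ["cat", "wolf", "horse"],
    "cat": ["dog", "lion"],
    "wolf": ["dog"],
    "horse": ["dog", "zebra"],
    "lion": ["cat", "zebra"],
    "zebra": ["horse", "lion"],
}

retrieved, intervals = random_walk_recall(TOY_GRAPH, "dog", n_steps=200)
print(retrieved)
print(intervals)
```

On a graph this small the intervals say nothing about power laws; the sketch only shows how retrieval order and inter-response intervals can be read off a walk.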

Directory of Open Access Journals

PubMed Central

Frontiers - Publisher Connector

eScholarship - University of California

Time Course and Hazard Function: A Distributional Analysis of Fixation Duration in Reading

Author: Feng Gary
Publication venue: University of Bern
Publication date: 22/12/2009

Reading processes affect not only the mean of fixation duration but also its distribution function. This paper introduces a set of hypotheses that link the timing and strength of a reading process to the hazard function of a fixation duration distribution. Analyses based on large corpora of reading eye movements show a surprisingly robust hazard function across languages, age, individual differences, and a number of processing variables. The data suggest that eye movements are generated stochastically based on a stereotyped time course that is independent of reading variables. High-level reading processes, however, modulate eye movement programming by increasing or decreasing the momentary saccade rate during a narrow time window. Implications for theories and analyses of reading eye movements are discussed.
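
For readers unfamiliar with the term, a hazard function gives, for each moment, the chance that a fixation ends then, given that it has lasted that long. The Python sketch below computes a simple binned (discrete-time) estimate. The helper name `empirical_hazard`, the bin width, and the sample durations are assumptions for illustration, not the estimator or data used in the paper.

```python
def empirical_hazard(durations, bin_width):
    """Binned hazard: share of fixations ending in each bin among those
    still ongoing at the start of that bin (hypothetical helper)."""
    if not durations:
        return []
    n_bins = int(max(durations) // bin_width) + 1
    counts = [0] * n_bins
    for d in durations:
        counts[int(d // bin_width)] += 1   # bin in which this fixation ends
    hazard = []
    at_risk = len(durations)               # fixations not yet terminated
    for c in counts:
        hazard.append(c / at_risk if at_risk else 0.0)
        at_risk -= c
    return hazard

# Made-up fixation durations in milliseconds.
sample = [100, 150, 220, 230, 310]
hazard = empirical_hazard(sample, bin_width=100)
print(hazard)
```

In the last non-empty bin the estimate is always 1.0, because every remaining fixation ends there; with real corpora the tail bins are therefore noisy and are usually interpreted with care.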

Journal of Eye Movement Research

BOP Serials

Using Gaze for Behavioural Biometrics

Author: Alessandro D’Amelio
Giuseppe Boccignone
Sabrina Patania
Sathya Bursic
Vittorio Cuculo
Publication date: 01/01/2023

A principled approach to the analysis of eye movements for behavioural biometrics is laid down. The approach is grounded in foraging theory, which provides a sound basis to capture the uniqueness of individual eye movement behaviour. We propose a composite Ornstein-Uhlenbeck process for quantifying the exploration/exploitation signature characterising the foraging eye behaviour. The relevant parameters of the composite model, inferred from eye-tracking data via Bayesian analysis, are shown to yield a suitable feature set for biometric identification; the latter is eventually accomplished via a classical classification technique. A proof of concept of the method is provided by measuring its identification performance on a publicly available dataset. Data and code for reproducing the analyses are made available. Overall, we argue that the approach offers a fresh view on both the analysis of eye-tracking data and prospective applications in this field.
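
As background for the abstract above, an Ornstein-Uhlenbeck process is a mean-reverting random process, commonly written as dx = theta * (mu - x) dt + sigma dW. The Python sketch below simulates a single one-dimensional process with the Euler-Maruyama scheme. It is not the composite model proposed in the paper; the function name `simulate_ou` and all parameter values are assumptions for illustration.

```python
import math
import random

def simulate_ou(x0, mu, theta, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama simulation of dx = theta*(mu - x)*dt + sigma*dW
    (hypothetical helper, single 1-D process only)."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over dt
        xs.append(x + theta * (mu - x) * dt + sigma * dw)
    return xs

# Illustrative values: start away from the mean and let the path relax.
path = simulate_ou(x0=10.0, mu=0.0, theta=1.0, sigma=0.5, dt=0.1, n_steps=100)
print(len(path), path[-1])

# With sigma = 0 the noise vanishes and the path decays towards mu.
noiseless = simulate_ou(x0=10.0, mu=0.0, theta=1.0, sigma=0.0, dt=0.1, n_steps=100)
print(noiseless[-1])
```

Here theta sets how quickly the path is pulled back to mu and sigma sets the noise level; parameters of this kind are what the abstract describes inferring from eye-tracking data.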

Directory of Open Access Journals

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Gravitational Models Explain Shifts on Human Visual Attention

Author: Gori Marco
Melacci Stefano
Rufa Alessandra
Zanca Dario
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2020

Visual attention refers to the human brain's ability to select relevant sensory information for preferential processing, improving performance in visual and cognitive tasks. It proceeds in two phases: one in which visual feature maps are acquired and processed in parallel, and another in which the information from these maps is merged in order to select a single location to be attended for further and more complex computations and reasoning. Its computational description is challenging, especially if the temporal dynamics of the process are taken into account. Numerous methods to estimate saliency have been proposed in the last three decades. They achieve almost perfect performance in estimating saliency at the pixel level, but the way they generate shifts in visual attention fully depends on winner-take-all (WTA) circuitry. WTA is implemented by the biological hardware in order to select a location with maximum saliency, towards which to direct overt attention. In this paper we propose a gravitational model (GRAV) to describe the attentional shifts. Every single feature acts as an attractor and the shifts are the result of the joint effects of the attractors. In the current framework, the assumption of a single, centralized saliency map is no longer necessary, though still plausible. Quantitative results on two large image datasets show that this model predicts shifts more accurately than winner-take-all.
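
To make the contrast in the abstract above concrete, the Python sketch below compares a winner-take-all choice (the single most salient location) with the joint pull of all salient locations acting as attractors. It is a toy: the softened inverse-square force, the helper names `wta_location` and `net_attraction`, and the tiny map are assumptions for illustration, not the GRAV equations from the paper.

```python
def wta_location(saliency):
    """Winner-take-all: (row, col) of the first maximum of the map."""
    best, best_pos = None, None
    for r, row in enumerate(saliency):
        for c, value in enumerate(row):
            if best is None or value > best:
                best, best_pos = value, (r, c)
    return best_pos

def net_attraction(focus, saliency, eps=1e-9):
    """Sum of pulls on the focus, each pixel acting as a mass.
    Softened inverse-square force (an assumption of this sketch)."""
    fr, fc = focus
    pull_r = pull_c = 0.0
    for r, row in enumerate(saliency):
        for c, mass in enumerate(row):
            dr, dc = r - fr, c - fc
            dist2 = dr * dr + dc * dc
            if dist2 == 0 or mass == 0:
                continue                     # no pull from the focus pixel itself
            scale = mass / (dist2 + eps) ** 1.5
            pull_r += scale * dr
            pull_c += scale * dc
    return pull_r, pull_c

# Two equally salient pixels on either side of the current focus.
toy_map = [[1.0, 0.0, 1.0]]
print(wta_location(toy_map))            # picks one of the two maxima
print(net_attraction((0, 1), toy_map))  # the two pulls cancel
```

With two equal attractors, winner-take-all commits to one location, while the summed pull is zero and the focus stays balanced between them. The example is only meant to show that the two rules can behave differently on the same map.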

arXiv.org e-Print Archive

Archivio della Ricerca - Università degli Studi di Siena

INRIA a CCSD electronic archive server

Visual attention and perception in scene understanding for social robotics

Author: HE HONGSHENG
Publication date: 31/05/2012

Ph.D., Doctor of Philosophy

ScholarBank@NUS

An integrated theory of language production and comprehension

Author: Chang F.
Kidd E.
Rowland C.
Publication venue: Cambridge University Press (CUP)
Publication date: 01/01/2013

Currently, production and comprehension are regarded as quite distinct in accounts of language processing. In rejecting this dichotomy, we instead assert that producing and understanding are interwoven, and that this interweaving is what enables people to predict themselves and each other. We start by noting that production and comprehension are forms of action and action perception. We then consider the evidence for interweaving in action, action perception, and joint action, and explain such evidence in terms of prediction. Specifically, we assume that actors construct forward models of their actions before they execute those actions, and that perceivers of others' actions covertly imitate those actions, then construct forward models of those actions. We use these accounts of action, action perception, and joint action to develop accounts of production, comprehension, and interactive language. Importantly, they incorporate well-defined levels of linguistic representation (such as semantics, syntax, and phonology). We show (a) how speakers and comprehenders use covert imitation and forward modeling to make predictions at these levels of representation, (b) how they interweave production and comprehension processes, and (c) how they use these predictions to monitor the upcoming utterances. We show how these accounts explain a range of behavioral and neuroscientific data on language processing and discuss some of the implications of our proposal.

CiteSeerX

Edinburgh Research Explorer

Enlighten

MPG.PuRe