Search CORE

3,239 research outputs found

Holistic gaze strategy to categorize facial expression of varying intensities

Author: Guo Kun
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 03/08/2012
Field of study

Using faces representing exaggerated emotional expressions, recent behaviour and eye-tracking studies have suggested a dominant role of individual facial features in transmitting diagnostic cues for decoding facial expressions. Considering that in everyday life we frequently view low-intensity expressive faces in which local facial cues are more ambiguous, we probably need to combine expressive cues from more than one facial feature to reliably decode naturalistic facial affects. In this study we applied a morphing technique to systematically vary intensities of six basic facial expressions of emotion, and employed a self-paced expression categorization task to measure participants’ categorization performance and associated gaze patterns. The analysis of pooled data from all expressions showed that increasing expression intensity would improve categorization accuracy, shorten reaction time and reduce number of fixations directed at faces. The proportion of fixations and viewing time directed at internal facial features (eyes, nose and mouth region), however, was not affected by varying levels of intensity. Further comparison between individual facial expressions revealed that although proportional gaze allocation at individual facial features was quantitatively modulated by the viewed expressions, the overall gaze distribution in face viewing was qualitatively similar across different facial expressions and different intensities. It seems that we adopt a holistic viewing strategy to extract expressive cues from all internal facial features in processing of naturalistic facial expressions

University of Lincoln Institutional Repository

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Object Detection Through Exploration With A Foveated Visual Field

Author: A Borji
A Lewis
A Torralba
B Alexe
BR Beutter
BW Tatler
C Bradley
C Morvan
CA Curcio
CA Curcio
CA Curcio
CH Lampert
CJ Ludwig
DG Lowe
DM Dacey
DM Levi
Emre Akbas
GJ Zelinsky
GL Malcolm
H Larochelle
H Strasburger
H Yamamoto
I Kokkinos
J Elder
J Freeman
J Hosang
J Najemnik
J Najemnik
J Rovamo
JH Elder
JM Findlay
JM Findlay
K Koehler
L Itti
L Itti
L Zhaoping
L Zhaoping
LW Renninger
MB Neider
MF Land
Miguel P. Eckstein
MJ Choi
MP Eckstein
MP Eckstein
MP Eckstein
MP Eckstein
MP Eckstein
ND Bruce
NJ Butko
NJ Marshall
P Azzopardi
P Kontschieder
P Verghese
P Viola
PF Felzenszwalb
R Rosenholtz
S Ren
S Zhang
SC Mack
T Malisiewicz
T Wertheim
TJ Preston
W Zhang
Wolfgang Einhäuser
X Chen
Z Li
ZP Li
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/10/2017
Field of study

We present a foveated object detector (FOD) as a biologically-inspired alternative to the sliding window (SW) approach which is the dominant method of search in computer vision object detection. Similar to the human visual system, the FOD has higher resolution at the fovea and lower resolution at the visual periphery. Consequently, more computational resources are allocated at the fovea and relatively fewer at the periphery. The FOD processes the entire scene, uses retino-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image and integrates observations across multiple fixations. Our approach combines modern object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We assessed various eye movement strategies on the PASCAL VOC 2007 dataset and show that the FOD performs on par with the SW detector while bringing significant computational cost savings.Comment: An extended version of this manuscript was published in PLOS Computational Biology (October 2017) at https://doi.org/10.1371/journal.pcbi.100574

arXiv.org e-Print Archive

CiteSeerX

Crossref

Directory of Open Access Journals

OpenMETU (Middle East Technical University)

Objects predict fixations better than early saliency

Author: Einhäuser Wolfgang
Perona Pietro
Spain Merrielle
Publication venue: 'Association for Research in Vision and Ophthalmology (ARVO)'
Publication date: 20/11/2008
Field of study

Humans move their eyes while looking at scenes and pictures. Eye movements correlate with shifts in attention and are thought to be a consequence of optimal resource allocation for high-level tasks such as visual recognition. Models of attention, such as “saliency maps,” are often built on the assumption that “early” features (color, contrast, orientation, motion, and so forth) drive attention directly. We explore an alternative hypothesis: Observers attend to “interesting” objects. To test this hypothesis, we measure the eye position of human observers while they inspect photographs of common natural scenes. Our observers perform different tasks: artistic evaluation, analysis of content, and search. Immediately after each presentation, our observers are asked to name objects they saw. Weighted with recall frequency, these objects predict fixations in individual images better than early saliency, irrespective of task. Also, saliency combined with object positions predicts which objects are frequently named. This suggests that early saliency has only an indirect effect on attention, acting through recognized objects. Consequently, rather than treating attention as mere preprocessing step for object recognition, models of both need to be integrated

Caltech Authors

Skim reading: an adaptive strategy for reading on the web

Author: Drieghe Denis
Fitzsimmons Gemma
Weal Mark
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

It has been suggested that readers spend a great deal of time skim reading on the Web and that if readers skim read they reduce their comprehension of what they have read. There have been a number of studies exploring skim reading, but relatively little exists on the skim reading of hypertext and Webpages. In the experiment documented here, we utilised eye tracking methodology to explore how readers skim read hypertext and how hyperlinks affect reading behaviour. The results show that the readers read faster when they were skim reading and comprehension was reduced. However, the presence of hyperlinks seemed to assist the readers in picking out important information when skim reading. We suggest that readers engage in an adaptive information foraging strategy where they attempt to minimise comprehension loss while maintaining a high reading speed. Readers use hyperlinks as markers to suggest important information and use them to read through the text in an efficient and effective way. This suggests that skim reading may not be as damaging to comprehension when reading hypertext, but it does mean that the words we choose to hyperlink become very important to comprehension for those skim reading text on the Web

Southampton (e-Prints Soton)

Crossref

Human Attention in Image Captioning: Dataset and Analysis

Author: Borji Ali
He Sen
Pugeault Nicolas
Tavakoli Hamed R.
Publication venue
Publication date: 07/08/2019
Field of study

In this work, we present a novel dataset consisting of eye movements and verbal descriptions recorded synchronously over images. Using this data, we study the differences in human attention during free-viewing and image captioning tasks. We look into the relationship between human attention and language constructs during perception and sentence articulation. We also analyse attention deployment mechanisms in the top-down soft attention approach that is argued to mimic human attention in captioning tasks, and investigate whether visual saliency can help image captioning. Our study reveals that (1) human attention behaviour differs in free-viewing and image description tasks. Humans tend to fixate on a greater variety of regions under the latter task, (2) there is a strong relationship between described objects and attended objects (

97\%

of the described objects are being attended), (3) a convolutional neural network as feature encoder accounts for human-attended regions during image captioning to a great extent (around

78\%

), (4) soft-attention mechanism differs from human attention, both spatially and temporally, and there is low correlation between caption scores and attention consistency scores. These indicate a large gap between humans and machines in regards to top-down attention, and (5) by integrating the soft attention model with image saliency, we can significantly improve the model's performance on Flickr30k and MSCOCO benchmarks. The dataset can be found at: https://github.com/SenHe/Human-Attention-in-Image-Captioning.Comment: To appear at ICCV 201

arXiv.org e-Print Archive

Crossref

Open Research Exeter

Enlighten

Some , And Possibly All, Scalar Inferences Are Not Delayed: Evidence For Immediate Pragmatic Enrichment

Author: Agresti
Allopenna
Bott
Breheny
Calvert
Carston
Chambers
Chierchia
Chierchia
Cooper
Dahan
Dahan
Daniel J. Grodner
Gallistel
Geurts
Grice
Grodner
Hirschberg
Horn
Horn
Huang
Jaeger
Kanan
Kass
Kathleen M. Carbary
Levinson
Magnuson
Matin
Michael K. Tanenhaus
Murphy
Natalie M. Klein
Noveck
Noveck
Pickering
Pollack
Postal
Rips
Russell
Sadock
Salverda
Salverda
Sauerland
Soames
Sperber
Swinney
Swinney
Tanenhaus
Tanenhaus
Publication venue: 'Transformative Works and Cultures'
Publication date: 01/07/2010
Field of study

Scalar inferences are commonly generated when a speaker uses a weaker expression rather than a stronger alternative, e.g., John ate some of the apples implies that he did not eat them all. This article describes a visual-world study investigating how and when perceivers compute these inferences. Participants followed spoken instructions containing the scalar quantifier some directing them to interact with one of several referential targets (e.g., Click on the girl who has some of the balloons). Participants fixated on the target compatible with the implicated meaning of some and avoided a competitor compatible with the literal meaning prior to a disambiguating noun. Further, convergence on the target was as fast for some as for the non-scalar quantifiers none and all. These findings indicate that the scalar inference is computed immediately and is not delayed relative to the literal interpretation of some. It is argued that previous demonstrations that scalar inferences increase processing time are not necessarily due to delays in generating the inference itself, but rather arise because integrating the interpretation of the inference with relevant information in the context may require additional time. With sufficient contextual support, processing delays disappear

Crossref

Works

PubMed Central