On the Distribution of Salient Objects in Web Images and its Influence on Salient Object Detection
It has become apparent that a Gaussian center bias can serve as an important prior for visual saliency detection, which has been demonstrated for predicting human eye fixations and salient object detection. Tseng et al. have shown that the photographer's tendency to place interesting objects in the center is a likely cause of the center bias of eye fixations. We investigate the influence of the photographer's center bias on salient object detection, extending our previous work. We show that the centroid locations of salient objects in photographs of Achanta and Liu's data set in fact correlate strongly with a Gaussian model. This is an important insight, because it provides an empirical motivation and justification for the integration of such a center bias in salient object detection algorithms and helps to understand why Gaussian models are so effective. To assess the influence of the center bias on salient object detection, we integrate an explicit Gaussian center bias model into two state-of-the-art salient object detection algorithms. This way, first, we quantify the influence of the Gaussian center bias on pixel- and segment-based salient object detection. Second, we improve the performance in terms of F1 score, Fβ score, area under the recall-precision curve, area under the receiver operating characteristic curve, and hit rate on the well-known data set by Achanta and Liu. Third, by debiasing Cheng et al.'s region contrast model, we demonstrate by example that implicit center biases are partially responsible for the outstanding performance of state-of-the-art algorithms. Last but not least, as a result of debiasing Cheng et al.'s algorithm, we introduce a non-biased salient object detection method, which is of interest for applications in which the image data is not likely to have a photographer's center bias (e.g., image data of surveillance cameras or autonomous robots).
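For illustration, here is a minimal sketch of how an explicit Gaussian center bias of the kind described above can be combined with an existing saliency map. The multiplicative combination, the sigma_frac parameter, and the function names are assumptions made for this sketch, not the formulation used in the paper:

```python
import numpy as np

def gaussian_center_bias(height, width, sigma_frac=0.25):
    # Isotropic Gaussian prior centered on the image, peak value 1.
    # sigma_frac sets the standard deviation relative to image size;
    # 0.25 is an illustrative choice, not a fitted value from the paper.
    ys = np.linspace(-0.5, 0.5, height)[:, None]
    xs = np.linspace(-0.5, 0.5, width)[None, :]
    return np.exp(-(xs**2 + ys**2) / (2.0 * sigma_frac**2))

def apply_center_bias(saliency_map, strength=1.0):
    # Modulate a saliency map with the Gaussian prior; strength=0.0
    # leaves the map unbiased, which is useful for debiasing comparisons.
    bias = gaussian_center_bias(*saliency_map.shape)
    biased = saliency_map * ((1.0 - strength) + strength * bias)
    return biased / (biased.max() + 1e-12)  # renormalize to [0, 1]
```

Lowering the strength parameter interpolates between the biased and unbiased map, which mirrors the kind of comparison needed to quantify how much of an algorithm's performance is attributable to the bias.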
Pupil dilation as an index of preferred mutual gaze duration
Most animals look at each other to signal threat or interest. In humans, this social interaction is usually punctuated with brief periods of mutual eye contact. Deviations from this pattern of gazing behaviour generally make us feel uncomfortable and are a defining characteristic of clinical conditions such as autism or schizophrenia, yet it is unclear what constitutes normal eye contact. Here, we measured, across a wide range of ages, cultures and personality types, the period of direct gaze that feels comfortable and examined whether autonomic factors linked to arousal were indicative of people's preferred amount of eye contact. Surprisingly, we find that the preferred period of gaze duration does not depend on fundamental characteristics such as gender, personality traits or attractiveness. However, we do find that subtle pupillary changes, indicative of physiological arousal, correlate with the amount of eye contact people find comfortable. Specifically, people preferring longer durations of eye contact display faster increases in pupil size when viewing another person than those preferring shorter durations. These results reveal that a person's preferred duration of eye contact is signalled by physiological indices (pupil dilation) beyond volitional control that may play a modulatory role in gaze behaviour.
Overt Attention and Context Factors: The Impact of Repeated Presentations, Image Type, and Individual Motivation
The present study investigated the dynamics of the attentional focus during the observation of different categories of complex scenes, while simultaneously considering individuals' memory and motivational state. We repeatedly presented four types of complex visual scenes in a pseudo-randomized order and recorded eye movements. Subjects were divided into groups according to their motivational disposition in terms of action orientation and their individual ratings of scene interest.
Perceptual benefit of objecthood
Object-based attention facilitates the processing of the features that form an object. Two hypotheses are conceivable for how object-based attention is deployed to an object's features: first, the object is attended by selecting its features; alternatively, a configuration of features as such is attended by selecting the object representation they form. Only under the latter alternative is the perception of a feature configuration as an entity ("objecthood") a necessary condition for object-based attention. Disentangling the two alternatives requires the comparison of identical feature configurations that induce the perception of an object in one condition ("bound") and do not do so in another condition ("unbound"). We used an ambiguous diamond stimulus, whose percept spontaneously switches between bound and unbound while the stimulus itself remains unchanged. We tested discrimination on the boundary of the diamond as well as detection of probes inside and outside the diamond. We found discrimination performance to be increased when features were perceptually bound into an object. Furthermore, detection performance was higher within and lower outside the bound object as compared to the unbound configuration. Consequently, the facilitation of processing by object-based attention requires objecthood, that is, a unified internal representation of an "object", not a mere collection of features.
Integration of Eye-tracking Methods in Visual Comfort Assessments
Discomfort glare, among the different aspects of visual discomfort, is a phenomenon that is little understood and hard to quantify. As this phenomenon depends on the building occupant's view direction and on the relative position of the glare source, a deeper knowledge of one's visual behavior within a space could provide pertinent insights into better understanding glare. To address this need, we set up an experiment to investigate how the distribution of view directions depends on a selected range of brightness and contrast distributions in a standard office scenario. The participants were asked to perform a series of tasks including reading, thinking, filling in a questionnaire, and waiting. Their view direction was monitored by recording their eye movements using eye-tracking methods. Preliminary results show that different facade configurations have different effects on eye movement patterns, with a strong dependency on the performed task. This pilot study will serve as a first step toward integrating eye-tracking methods into visual comfort assessments and lead to a better understanding of the impact of discomfort glare on visual behavior.
Influence of Low-Level Stimulus Features, Task Dependent Factors, and Spatial Biases on Overt Visual Attention
Visual attention is thought to be driven by the interplay between low-level visual features and the task-dependent information content of local image regions, as well as by spatial viewing biases. Though dependent on experimental paradigms and model assumptions, this idea has given rise to varying claims that either bottom-up or top-down mechanisms dominate visual attention. To contribute toward a resolution of this discussion, here we quantify the influence of these factors and their relative importance in a set of classification tasks. Our stimuli consist of individual image patches (bubbles). For each bubble we derive three measures: a measure of salience based on low-level stimulus features, a measure of salience based on the task-dependent information content derived from our subjects' classification responses, and a measure of salience based on spatial viewing biases. Furthermore, we measure the empirical salience of each bubble based on our subjects' measured eye gazes, thus characterizing the overt visual attention each bubble receives. A multivariate linear model relates the three salience measures to overt visual attention. It reveals that all three salience measures contribute significantly. The effect of spatial viewing biases is highest and rather constant across tasks. The contribution of task-dependent information is a close runner-up; specifically, it scores highly in a standardized task of judging facial expressions. The contribution of low-level features is, on average, somewhat lower. However, in a prototypical search task without an available template, it makes a strong contribution on par with the two other measures. Finally, the contributions of the three factors are only slightly redundant, and the semi-partial correlation coefficients are only slightly lower than the full correlation coefficients. These data provide evidence that all three measures make significant and independent contributions and that none can be neglected in a model of human overt visual attention.
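As a rough illustration of the kind of analysis described above, the following sketch regresses empirical salience on the three salience measures and computes a semi-partial correlation for one predictor. The use of ordinary least squares and all variable names are assumptions of this sketch, not the study's exact procedure:

```python
import numpy as np

def fit_salience_model(low_level, task_info, spatial_bias, empirical):
    # Ordinary least squares: empirical salience as a linear
    # combination of the three salience measures (plus intercept).
    X = np.column_stack([np.ones(len(empirical)), low_level, task_info, spatial_bias])
    beta, *_ = np.linalg.lstsq(X, empirical, rcond=None)
    return beta  # [intercept, b_lowlevel, b_task, b_bias]

def semi_partial_corr(x, others, y):
    # Correlation of y with the part of x not explained by the other
    # predictors: regress x on the others, then correlate the residual with y.
    A = np.column_stack([np.ones(len(x)), *others])
    coef, *_ = np.linalg.lstsq(A, x, rcond=None)
    residual = x - A @ coef
    return np.corrcoef(residual, y)[0, 1]
```

Comparing semi_partial_corr(low_level, [task_info, spatial_bias], empirical) against the plain correlation of low_level with empirical salience mirrors the redundancy check reported in the abstract.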
Precisely timed oculomotor and parietal EEG activity in perceptual switching
Blinks and saccades cause transient interruptions of visual input. To investigate how such effects influence our perceptual state, we analyzed the time courses of blink and saccade rates in relation to perceptual switching in the Necker cube. Both time courses showed peaks at different moments along the switching process. A peak in blinking rate appeared 1,000 ms prior to the switching responses. Blinks occurring around this peak were associated with subsequent switching to the preferred interpretation of the Necker cube. Saccade rates showed a peak 150 ms prior to the switching response. The direction of saccades around this peak was predictive of the perceived orientation of the Necker cube afterwards. Peak blinks were followed, and peak saccades were preceded, by transient parietal theta band activity indicating the changing of the perceptual interpretation. Precisely timed blinks, therefore, can initiate perceptual switching, and precisely timed saccades can facilitate an ongoing change of interpretation.
Longer fixation duration while viewing face images
The spatio-temporal properties of saccadic eye movements can be influenced by cognitive demands and the characteristics of the observed scene. Probably due to its crucial role in social communication, it has been argued that face perception may involve different cognitive processes than non-face object or scene perception. In this study, we investigated whether and how face and natural scene images influence the patterns of visuomotor activity. We recorded monkeys' saccadic eye movements as they freely viewed monkey face and natural scene images. The face and natural scene images attracted a similar number of fixations, but viewing of faces was accompanied by longer fixations compared with natural scenes. These longer fixations were dependent on the context of facial features: the duration of fixations directed at facial contours decreased when the face images were scrambled, and increased at the later stage of normal face viewing. The results suggest that face and natural scene images can generate different patterns of visuomotor activity. The extra fixation duration on faces may be correlated with the detailed analysis of facial features.
Modelling search for people in 900 scenes: A combined source model of eye guidance
How predictable are human eye movements during search in real-world scenes? We recorded 14 observers' eye movements as they performed a search task (person detection) in 912 outdoor scenes. Observers were highly consistent in the regions fixated during search, even when the target was absent from the scene. These eye movements were used to evaluate computational models of search guidance from three sources: saliency, target features, and scene context. Each of these models independently outperformed a cross-image control in predicting human fixations. Models that combined sources of guidance ultimately predicted 94% of human agreement, with the scene context component providing the most explanatory power. None of the models, however, could reach the precision and fidelity of an attentional map defined by human fixations. This work puts forth a benchmark for computational models of search in real-world scenes. Further improvements in modelling should capture mechanisms underlying the selectivity of observers' fixations during search.
Funding: National Eye Institute (Integrative Training Program in Vision, grant T32 EY013935); Massachusetts Institute of Technology (Singleton Graduate Research Fellowship); National Science Foundation (U.S.) (Graduate Research Fellowship); National Science Foundation (U.S.) (CAREER Award 0546262); National Science Foundation (U.S.) (NSF contract 0705677); National Science Foundation (U.S.) (CAREER Award 0747120).
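A minimal sketch of the general idea of combining several guidance sources into a single priority map follows. The weighted-product combination rule, the min-max normalization, and the weights are illustrative assumptions rather than the authors' fitted model:

```python
import numpy as np

def combined_guidance(saliency, target_features, scene_context,
                      weights=(1.0, 1.0, 1.0)):
    # Combine three guidance maps into one priority map.
    # A weighted product is one common choice; the paper's exact
    # combination rule and weights are not reproduced here.
    combined = np.ones_like(saliency, dtype=float)
    for m, w in zip((saliency, target_features, scene_context), weights):
        m = (m - m.min()) / (m.max() - m.min() + 1e-12)  # normalize to [0, 1]
        combined *= m ** w  # the exponent acts as that source's weight
    return combined / (combined.max() + 1e-12)
```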
Object Detection Through Exploration With A Foveated Visual Field
We present a foveated object detector (FOD) as a biologically inspired alternative to the sliding window (SW) approach, which is the dominant method of search in computer vision object detection. Similar to the human visual system, the FOD has higher resolution at the fovea and lower resolution in the visual periphery. Consequently, more computational resources are allocated at the fovea and relatively fewer at the periphery. The FOD processes the entire scene, uses retino-specific object detection classifiers to guide eye movements, aligns its fovea with regions of interest in the input image, and integrates observations across multiple fixations. Our approach combines modern object detectors from computer vision with a recent model of peripheral pooling regions found at the V1 layer of the human visual system. We assessed various eye movement strategies on the PASCAL VOC 2007 dataset and show that the FOD performs on par with the SW detector while bringing significant computational cost savings.
Comment: An extended version of this manuscript was published in PLOS Computational Biology (October 2017) at https://doi.org/10.1371/journal.pcbi.100574
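The fixation loop the abstract describes can be sketched as follows. Here detect_at_fixation is a hypothetical placeholder for a detector whose resolution falls off with distance from the current fixation, and the additive evidence integration and argmax fixation selection are assumptions of this sketch, not the FOD's actual strategy:

```python
import numpy as np

def foveated_search(image, detect_at_fixation, n_fixations=5):
    # Sketch of a foveated detection loop. `detect_at_fixation` is a
    # hypothetical callable returning a map of detection scores for the
    # image as seen from a given fixation (high resolution near the
    # fixation, pooled/low resolution in the periphery).
    h, w = image.shape[:2]
    fixation = (h // 2, w // 2)          # start at the image center
    evidence = np.zeros((h, w))
    for _ in range(n_fixations):
        scores = detect_at_fixation(image, fixation)
        evidence += scores               # integrate across fixations
        # next fixation: most promising location under current evidence
        fixation = np.unravel_index(np.argmax(evidence), evidence.shape)
    return evidence
```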