    Discovering salient objects from videos using spatiotemporal salient region detection

    Detecting salient objects in images and videos has many useful applications in computer vision. In this paper, a novel spatiotemporal salient region detection approach is proposed. The approach computes spatiotemporal saliency by estimating spatial and temporal saliencies separately. The spatial saliency of an image is computed by estimating a color contrast cue and a color distribution cue; the estimation of these cues exploits patch-level and region-level image abstractions in a unified way. The two cues are fused into an initial spatial saliency map, which is further refined to emphasize object saliencies uniformly and to suppress saliencies arising from background noise. The final spatial saliency map is computed by integrating the refined saliency map with a center prior map. The temporal saliency is computed from local and global temporal saliency estimates based on patch-level optical flow abstractions; the local and global estimates are fused to produce the temporal saliency map. Finally, the spatial and temporal saliencies are integrated to generate a spatiotemporal saliency map. The proposed temporal and spatiotemporal salient region detection approaches are extensively evaluated on challenging salient object detection video datasets. The experimental results show that the proposed approaches outperform several state-of-the-art saliency detection approaches. To accommodate different needs with respect to the speed/accuracy trade-off, faster variants of the spatial, temporal and spatiotemporal salient region detection approaches are also presented in this paper.
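
    To make the fusion pipeline concrete, the following Python/OpenCV sketch follows the structure described above. It is not the authors' implementation: the patch- and region-level abstractions are replaced with simple per-pixel proxies (colour contrast against the mean Lab colour, optical-flow magnitude relative to the median motion), and the fusion weight alpha is an illustrative assumption.

    import cv2
    import numpy as np

    def spatial_saliency(frame_bgr):
        # Colour-contrast proxy: distance of each pixel from the mean Lab colour.
        lab = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2Lab).astype(np.float32)
        contrast = np.linalg.norm(lab - lab.mean(axis=(0, 1)), axis=2)
        # Centre prior: Gaussian weighting toward the image centre.
        h, w = contrast.shape
        ys, xs = np.mgrid[0:h, 0:w]
        center = np.exp(-((ys - h / 2) ** 2 / (2 * (h / 3) ** 2)
                          + (xs - w / 2) ** 2 / (2 * (w / 3) ** 2)))
        return cv2.normalize(contrast * center, None, 0.0, 1.0, cv2.NORM_MINMAX)

    def temporal_saliency(prev_gray, next_gray):
        # Dense Farneback flow magnitude stands in for the patch-level
        # optical-flow abstraction.
        flow = cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        mag = np.linalg.norm(flow, axis=2)
        # 'Global' component: deviation from the median (camera) motion.
        return cv2.normalize(np.abs(mag - np.median(mag)), None, 0.0, 1.0,
                             cv2.NORM_MINMAX)

    def spatiotemporal_saliency(prev_bgr, next_bgr, alpha=0.5):
        # Weighted fusion of the two maps; alpha=0.5 is an assumed parameter.
        s = spatial_saliency(next_bgr)
        t = temporal_saliency(cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY),
                              cv2.cvtColor(next_bgr, cv2.COLOR_BGR2GRAY))
        return alpha * s + (1.0 - alpha) * t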

    How does image noise affect actual and predicted human gaze allocation in assessing image quality?

    A central research question in natural vision is how fixations are allocated to extract informative cues for scene perception. With high-quality images, psychological and computational studies have made significant progress in understanding and predicting human gaze allocation during scene exploration. However, it is unclear whether these findings generalise to degraded naturalistic visual inputs. In this eye-tracking and computational study, we systematically distorted both man-made and natural scenes with a Gaussian low-pass filter, a circular averaging filter and additive Gaussian white noise, and monitored participants’ gaze behaviour while they assessed perceived image quality. Compared with the original high-quality images, distorted images attracted fewer fixations but longer fixation durations, shorter saccade distances and a stronger central fixation bias. This impact of image noise manipulation on gaze distribution was determined mainly by noise intensity rather than noise type, and was more pronounced for natural scenes than for man-made scenes. We further compared four high-performing visual attention models in predicting human gaze allocation in degraded scenes, and found that model performance lacked human-like sensitivity to noise type and intensity, and was considerably worse than human performance measured as inter-observer variance. Furthermore, the central fixation bias is a major predictor of human gaze allocation, and becomes more prominent with increased noise intensity. Our results indicate a crucial role of external noise intensity in determining scene-viewing gaze behaviour, which should be considered in the development of realistic human-vision-inspired attention models.
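
    For readers who want to reproduce this kind of stimulus degradation, the following Python/OpenCV sketch implements the three named distortion types. The kernel sizes and noise level are illustrative placeholders, not the parameter values used in the study.

    import cv2
    import numpy as np

    def gaussian_lowpass(img, sigma=3.0):
        # Gaussian low-pass filter; kernel size is derived from sigma.
        return cv2.GaussianBlur(img, (0, 0), sigma)

    def circular_average(img, radius=5):
        # Circular (disk) averaging filter, akin to MATLAB's fspecial('disk').
        d = 2 * radius + 1
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (d, d)).astype(np.float32)
        kernel /= kernel.sum()
        return cv2.filter2D(img, -1, kernel)

    def additive_gaussian_noise(img, sigma=20.0):
        # Additive Gaussian white noise, clipped back to the valid 8-bit range.
        noise = np.random.normal(0.0, sigma, img.shape)
        return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)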

    What has been missed for predicting human attention in viewing driving clips?

    Recent research progress on human visual attention allocation in scene perception, and on its simulation, is based mainly on studies with static images. However, natural vision requires us to extract visual information that constantly changes due to egocentric movements or the dynamics of the world. It is unclear to what extent spatio-temporal regularity, an inherent regularity of dynamic vision, affects human gaze distribution and saliency computation in visual attention models. In this free-viewing eye-tracking study we manipulated the spatio-temporal regularity of traffic videos by presenting them as a normal video sequence, a reversed video sequence, a normal frame sequence, and a randomised frame sequence. The recorded human gaze allocation was then used as the ‘ground truth’ to examine the predictive ability of a number of state-of-the-art visual attention models. The analysis revealed high inter-observer agreement across individual human observers, but all the tested attention models performed significantly worse than humans. The models’ inferior predictive power was evident in gaze predictions that were indistinguishable across stimulus presentation sequences, and in a weak central fixation bias. Our findings suggest that a realistic visual attention model for the processing of dynamic scenes should incorporate human visual sensitivity to spatio-temporal regularity and the central fixation bias.
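
    A minimal Python sketch of how the four presentation conditions could be constructed, assuming the clip has already been decoded into a list of frames; the clip lengths, frame rates and playback machinery of the study are not reproduced here.

    import random

    def make_conditions(frames, seed=0):
        # Build the four stimulus orderings used to vary spatio-temporal
        # regularity. The 'video' and 'frame' conditions share the same frame
        # order and differ only in how they are played back to participants.
        shuffled = list(frames)
        random.Random(seed).shuffle(shuffled)
        return {
            "normal_video": list(frames),          # intact order, continuous playback
            "reversed_video": list(frames)[::-1],  # time-reversed playback
            "normal_frames": list(frames),         # intact order, frame-by-frame stills
            "random_frames": shuffled,             # randomised frame order
        }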
