33 research outputs found

    What has been missed for predicting human attention in viewing driving clips?

    Get PDF
    Recent research progress on the topic of human visual attention allocation in scene perception and its simulation is based mainly on studies with static images. However, natural vision requires us to extract visual information that constantly changes due to egocentric movements or the dynamics of the world. It is unclear to what extent spatio-temporal regularity, an inherent regularity in dynamic vision, affects human gaze distribution and saliency computation in visual attention models. In this free-viewing eye-tracking study we manipulated the spatio-temporal regularity of traffic videos by presenting them in normal video sequence, reversed video sequence, normal frame sequence, and randomised frame sequence. The recorded human gaze allocation was then used as the ‘ground truth’ to examine the predictive ability of a number of state-of-the-art visual attention models. The analysis revealed high inter-observer agreement across individual human observers, but all the tested attention models performed significantly worse than humans. The inferior predictability of the models was evident from gaze predictions that were indistinguishable across stimulus presentation sequences, and from a weak central fixation bias. Our findings suggest that a realistic visual attention model for the processing of dynamic scenes should incorporate human visual sensitivity to spatio-temporal regularity and a central fixation bias.
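    The evaluation above compares model-predicted saliency against recorded gaze. Below is a minimal sketch of how such a comparison is typically scored, assuming fixations are given as pixel coordinates and using a Gaussian centre prior as the central-fixation-bias baseline; all names and values here are illustrative, not the study's own code:

        import numpy as np

        def nss(sal_map, fixations):
            """Normalized Scanpath Saliency: mean of the z-scored saliency
            map sampled at the fixated pixels."""
            s = (sal_map - sal_map.mean()) / (sal_map.std() + 1e-8)
            rows, cols = fixations[:, 0], fixations[:, 1]
            return s[rows, cols].mean()

        def center_prior(h, w, sigma_frac=0.25):
            """Isotropic Gaussian centred on the frame: a common
            central-fixation-bias baseline."""
            ys, xs = np.mgrid[0:h, 0:w]
            cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
            sigma = sigma_frac * min(h, w)
            return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

        # Hypothetical fixations as (row, col) coordinates from an eye-tracker.
        fix = np.array([[120, 310], [130, 298], [260, 400]])
        baseline = center_prior(480, 640)
        print("center-prior NSS:", nss(baseline, fix))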

    FastSal: a Computationally Efficient Network for Visual Saliency Prediction

    Get PDF
    This paper focuses on the problem of visual saliency prediction, predicting regions of an image that tend to attract human visual attention, under a constrained computational budget. We modify and test various recent efficient convolutional neural network architectures like EfficientNet and MobileNetV2 and compare them with existing state-of-the-art saliency models such as SalGAN and DeepGaze II, both in terms of standard accuracy metrics like AUC and NSS, and in terms of computational complexity and model size. We find that MobileNetV2 makes an excellent backbone for a visual saliency model and can be effective even without a complex decoder. We also show that knowledge transfer from a more computationally expensive model like DeepGaze II can be achieved via pseudo-labelling an unlabelled dataset, and that this approach gives results on par with many state-of-the-art algorithms at a fraction of the computational cost and model size. Source code is available at https://github.com/feiyanhu/FastSal
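    As a rough illustration of the backbone idea, here is a minimal PyTorch sketch of a MobileNetV2 feature extractor with a deliberately simple decoder; the class name, head design, and training targets (e.g. DeepGaze II predictions used as pseudo-labels on unlabelled images) are assumptions for illustration, not FastSal's actual architecture:

        import torch
        import torch.nn as nn
        from torchvision.models import mobilenet_v2

        class TinySaliencyNet(nn.Module):
            """MobileNetV2 features plus a simple decoder: a 1x1 convolution
            to one channel, then bilinear upsampling to input resolution."""
            def __init__(self):
                super().__init__()
                self.backbone = mobilenet_v2(weights="IMAGENET1K_V1").features
                self.head = nn.Conv2d(1280, 1, kernel_size=1)

            def forward(self, x):
                feats = self.backbone(x)          # (N, 1280, H/32, W/32)
                logits = self.head(feats)         # (N, 1, H/32, W/32)
                up = nn.functional.interpolate(
                    logits, size=x.shape[-2:], mode="bilinear",
                    align_corners=False)
                return torch.sigmoid(up)          # saliency map in [0, 1]

        model = TinySaliencyNet().eval()
        with torch.no_grad():
            sal = model(torch.randn(1, 3, 256, 256))
        print(sal.shape)  # torch.Size([1, 1, 256, 256])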

    Intelligent and Energy-Efficient Data Prioritization in Green Smart Cities: Current Challenges and Future Directions

    Full text link
    The excessive use of digital devices such as cameras and smartphones in smart cities has produced huge data repositories that require automatic tools for efficient browsing, searching, and management. Data prioritization (DP) is a technique that produces a condensed form of the original data by analyzing its contents. Current DP studies are either concerned with data collected through stable capturing devices or focused on prioritization of data of a certain type, such as surveillance, sports, or industry. This necessitates DP tools that intelligently and cost-effectively prioritize a large variety of data for detecting abnormal events and hence manage them effectively, thereby making current smart cities greener. In this article, we first carry out an in-depth investigation of two decades of approaches and trends in DP for data of different natures, genres, and domains in green smart cities. Next, we propose an energy-efficient DP framework based on the intelligent integration of the Internet of Things, artificial intelligence, and big data analytics. Experimental evaluation on real-world surveillance data verifies the energy efficiency and applicability of this framework in green smart cities. Finally, this article highlights the key challenges of DP, its future requirements, and propositions for its integration into green smart cities.
    This work was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (no. 2016R-1A2B4011712).
    Muhammad, K.; Lloret, J.; Baik, S.W. (2019). Intelligent and Energy-Efficient Data Prioritization in Green Smart Cities: Current Challenges and Future Directions. IEEE Communications Magazine, 57(2), 60-65. https://doi.org/10.1109/MCOM.2018.1800371
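    As a loose illustration of content-based prioritization, the following sketch scores video frames by inter-frame change and keeps only the most eventful fraction; the scoring rule and names are hypothetical stand-ins for the learned abnormality models an actual DP pipeline would use:

        import cv2
        import numpy as np

        def prioritize_frames(video_path, keep_ratio=0.1):
            """Score each frame by how much it differs from the previous one
            (grayscale mean absolute difference) and keep the top fraction,
            reducing the volume of data stored or transmitted."""
            cap = cv2.VideoCapture(video_path)
            scores, frames, prev = [], [], None
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                scores.append(0.0 if prev is None
                              else float(np.mean(cv2.absdiff(gray, prev))))
                frames.append(frame)
                prev = gray
            cap.release()
            k = max(1, int(keep_ratio * len(frames)))
            top = sorted(np.argsort(scores)[-k:])   # k most "eventful" frames
            return [frames[i] for i in top]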

    The Alleviation of Perceptual Blindness During Driving in Urban Areas Guided by Saccades Recommendation

    Get PDF
    In advanced industrial applications, computational visual attention models (CVAMs) can predict visual attention very similarly to actual human attention allocation. This has been used as a very important component of technology in advanced driver assistance systems (ADAS). Given that the biological inspiration of driving-related CVAMs can be obtained from skilled drivers in complex driving conditions, in which the driver’s attention is constantly directed at various salient and informative visual stimuli by alternating the eye fixations via saccades to drive safely, this paper proposes a saccade recommendation strategy to enhance driving safety in urban road environments, particularly when the driver’s vision is impaired by visual crowding. The altered and directed saccades are collected and optimized by extracting four innate features from human dynamic vision. A neural network is designed to classify preferable saccades to reduce perceptual blindness due to visual crowding in urban scenes. A state-of-the-art CVAM is first adopted to localize the predicted eye fixation locations (EFLs) in driving video clips. In addition, human subjects’ gaze at the recommended EFLs is measured via an eye-tracker. The time delays between the predicted EFLs and drivers’ EFLs are analyzed under different driving conditions, followed by the time delays between the predicted EFLs and the driver’s hand control. The visually safe margin is then measured by relating the driving speed to the total delay. Experimental results demonstrate that the recommended saccades can effectively reduce perceptual blindness, helping to further improve road driving safety.
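    The visually safe margin reduces to simple arithmetic: the distance travelled during the total delay between a recommended EFL and the driver's response. A sketch with assumed delay values, not the paper's measured ones:

        def visually_safe_margin(speed_kmh, gaze_delay_s, control_delay_s):
            """Distance travelled while the driver's gaze and hand control
            catch up with a recommended eye fixation location (EFL)."""
            speed_ms = speed_kmh / 3.6
            total_delay = gaze_delay_s + control_delay_s
            return speed_ms * total_delay

        # Hypothetical delays; the paper measures these per driving condition.
        print(visually_safe_margin(50, 0.4, 0.7))  # ~15.3 m at 50 km/h, 1.1 s delay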

    Remote Sensing Scene Classification Based on Convolutional Neural Networks Pre-Trained Using Attention-Guided Sparse Filters

    Get PDF
    Semantic-level land-use scene classification is a challenging problem, in which deep learning methods, e.g., convolutional neural networks (CNNs), have shown remarkable capacity. However, a lack of sufficient labeled images has proved a hindrance to increasing the land-use scene classification accuracy of CNNs. Aiming at this problem, this paper proposes a CNN pre-training method guided by a human visual attention mechanism. Specifically, a computational visual attention model is used to automatically extract salient regions in unlabeled images. Then, sparse filters are adopted to learn features from these salient regions, with the learnt parameters used to initialize the convolutional layers of the CNN. Finally, the CNN is further fine-tuned on labeled images. Experiments are performed on the UCMerced and AID datasets, which show that when combined with a demonstrative CNN, our method can achieve 2.24% higher accuracy than a plain CNN and can obtain an overall accuracy of 92.43% when combined with AlexNet. The results indicate that the proposed method can effectively improve CNN performance using easy-to-access unlabeled images and thus will enhance the performance of land-use scene classification, especially when a large-scale labeled dataset is unavailable.
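    A minimal sketch of the sparse-filtering step, assuming flattened patches sampled from the detected salient regions; the patch size, filter count, and optimiser are illustrative choices, not the paper's exact configuration:

        import torch

        def sparse_filtering_loss(w, patches):
            """Sparse filtering objective (Ngiam et al., 2011): soft-absolute
            features, normalised per feature and then per example; minimise
            the summed L1 activity."""
            f = patches @ w                                # (n_examples, n_features)
            f = torch.sqrt(f ** 2 + 1e-8)                  # soft absolute value
            f = f / (f.norm(dim=0, keepdim=True) + 1e-8)   # normalise each feature
            f = f / (f.norm(dim=1, keepdim=True) + 1e-8)   # normalise each example
            return f.sum()

        # Hypothetical setup: 10k flattened 8x8 salient patches -> 64 filters
        # whose weights would initialize the CNN's first convolutional layer.
        patches = torch.randn(10000, 64)
        w = torch.randn(64, 64, requires_grad=True)
        opt = torch.optim.LBFGS([w], max_iter=50)

        def closure():
            opt.zero_grad()
            loss = sparse_filtering_loss(w, patches)
            loss.backward()
            return loss

        opt.step(closure)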

    Image synthesis based on a model of human vision

    Get PDF
    Modern computer graphics systems are able to construct renderings of such high quality that viewers are deceived into regarding the images as coming from a photographic source. Large amounts of computing resources are expended in this rendering process, using complex mathematical models of lighting and shading. However, psychophysical experiments have revealed that viewers attend to only certain informative regions within a presented image. Furthermore, it has been shown that these visually important regions contain low-level visual feature differences that attract the attention of the viewer. This thesis presents a new approach to image synthesis that exploits these experimental findings by modulating the spatial quality of image regions by their visual importance. Efficiency gains are therefore reaped without sacrificing much of the perceived quality of the image. Two tasks must be undertaken to achieve this goal: firstly, the design of an appropriate region-based model of visual importance, and secondly, the modification of progressive rendering techniques to effect an importance-based rendering approach. A rule-based fuzzy logic model is presented that computes, using spatial feature differences, the relative visual importance of regions in an image. This model improves upon previous work by incorporating threshold effects induced by global feature difference distributions and by using texture concentration measures. A modified approach to progressive ray-tracing is also presented. This new approach uses the visual importance model to guide the progressive refinement of an image. In addition, this concept of visual importance has been incorporated into supersampling, texture mapping and computer animation techniques. Experimental results are presented, illustrating the efficiency gains reaped from using this method of progressive rendering. This visual importance-based rendering approach is expected to have applications in the entertainment industry, where image fidelity may be sacrificed for efficiency purposes, as long as the overall visual impression of the scene is maintained. Different aspects of the approach should find many other applications in image compression, image retrieval, progressive data transmission and active robotic vision.
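    The core idea, spending more rendering effort where visual importance is higher, can be sketched as a simple budget allocation; the region scores below are hypothetical outputs of the fuzzy-logic importance model, and the rounding keeps the allocation approximate:

        import numpy as np

        def allocate_samples(importance, total_samples, floor=1):
            """Distribute a progressive-rendering sample budget across image
            regions in proportion to their visual importance, with a minimum
            number of samples per region."""
            p = np.asarray(importance, dtype=float)
            p = p / p.sum()
            return np.maximum(floor, np.round(p * total_samples)).astype(int)

        # Hypothetical importance scores for 5 segmented regions.
        print(allocate_samples([0.9, 0.1, 0.5, 0.05, 0.2], total_samples=10000))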

    Performance Evaluation of Object Proposal Generators for Salient Object Detection

    Get PDF
    The detection and segmentation of objects appearing in a natural scene, often referred to as object detection, has gained a lot of interest in the computer vision field. Although most existing object detectors aim to detect all the objects in a given scene, it is important to evaluate whether these methods can detect the salient objects in the scene when the number of proposals that can be generated is constrained by timing or computational limits during execution. Salient objects are objects that tend to be fixated more often by human subjects. Their detection is important in applications such as image collection browsing, image display on small devices, and perceptual compression. This thesis proposes a novel evaluation framework that analyzes the performance of popular existing object proposal generators in detecting the most salient objects. This work also shows that, by incorporating saliency constraints, the number of generated object proposals, and thus the computational cost, can be decreased significantly for a target true positive detection rate (TPR). As part of the proposed framework, salient ground-truth masks, which denote only the locations of salient objects, are generated from the original ground-truth masks of a given object detection dataset. This is obtained by first computing a saliency map for the input image and then using it to assign a saliency score to each object in the image; objects whose saliency scores are sufficiently high are labeled salient. The detection rates of existing object proposal generators are analyzed with respect to both the original ground-truth masks and the generated salient ground-truth masks. As part of this work, a salient object detection database with salient ground-truth masks was constructed from the PASCAL VOC 2007 dataset. Not only does this dataset aid in analyzing the performance of existing object detectors for salient object detection, but it also helps in the development of new object detection methods and the evaluation of their performance in terms of successful detection of salient objects.
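    A minimal sketch of the salient ground-truth construction step, assuming a normalised saliency map and per-object boolean masks; the threshold value is illustrative:

        import numpy as np

        def salient_ground_truth(saliency_map, object_masks, threshold=0.5):
            """Keep only objects whose mean saliency inside their ground-truth
            mask is sufficiently high; returns the salient-object masks."""
            keep = []
            for mask in object_masks:   # boolean array, same shape as the map
                if saliency_map[mask].mean() >= threshold:
                    keep.append(mask)
            return keep

        # Hypothetical inputs: a saliency map in [0, 1] and two object masks.
        sal = np.random.rand(240, 320)
        m1 = np.zeros((240, 320), bool); m1[50:100, 60:120] = True
        m2 = np.zeros((240, 320), bool); m2[150:200, 200:260] = True
        print(len(salient_ground_truth(sal, [m1, m2])))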

    Discovering salient objects from videos using spatiotemporal salient region detection

    Get PDF
    Detecting salient objects from images and videos has many useful applications in computer vision. In this paper, a novel spatiotemporal salient region detection approach is proposed. The proposed approach computes spatiotemporal saliency by estimating spatial and temporal saliencies separately. The spatial saliency of an image is computed by estimating the color contrast cue and the color distribution cue. The estimations of these cues exploit patch-level and region-level image abstractions in a unified way. The aforementioned cues are fused to compute an initial spatial saliency map, which is further refined to emphasize saliencies of objects uniformly and to suppress saliencies of background noise. The final spatial saliency map is computed by integrating the refined saliency map with a center prior map. The temporal saliency is computed based on local and global temporal saliency estimations using patch-level optical flow abstractions. Both local and global temporal saliencies are fused to compute the temporal saliency. Finally, spatial and temporal saliencies are integrated to generate a spatiotemporal saliency map. The proposed temporal and spatiotemporal salient region detection approaches are extensively experimented on challenging salient object detection video datasets. The experimental results show that the proposed approaches outperform several state-of-the-art saliency detection approaches. To accommodate different needs with respect to the speed/accuracy tradeoff, faster variants of the spatial, temporal and spatiotemporal salient region detection approaches are also presented in this paper.
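    A much-simplified sketch of the fusion idea, using per-pixel colour contrast as the spatial term and Farneback optical-flow magnitude as the temporal term; the equal fusion weights and cue choices are assumptions, not the paper's full patch/region formulation:

        import cv2
        import numpy as np

        def spatiotemporal_saliency(prev_bgr, curr_bgr):
            """Spatial term: per-pixel colour contrast against the mean frame
            colour (Lab space). Temporal term: dense optical-flow magnitude.
            Both are normalised to [0, 1] and fused with equal weights."""
            lab = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
            spatial = np.linalg.norm(lab - lab.reshape(-1, 3).mean(axis=0), axis=2)

            g0 = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
            g1 = cv2.cvtColor(curr_bgr, cv2.COLOR_BGR2GRAY)
            flow = cv2.calcOpticalFlowFarneback(g0, g1, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            temporal = np.linalg.norm(flow, axis=2)

            norm = lambda m: (m - m.min()) / (m.max() - m.min() + 1e-8)
            return 0.5 * norm(spatial) + 0.5 * norm(temporal)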

    Psychophysical assessment of perceived interest in natural images: The ROI-D database

    Full text link
    We introduce a novel region-of-interest (ROI) database for natural image content, the ROI-D database. The database consists of ROI maps created from manual selections obtained in a psychophysical experiment with 20 participants. The presented stimuli were 42 photographic images taken from 3 publicly available image quality databases. In addition to the ROI selections, dominance ratings were recorded that provide further insight into the interest of the selected ROI in relation to the background. In this paper, the experiment is described, the resulting ROI database is analysed, and possible applications of the database are discussed. The ROI-D database is made freely available to the image processing research community.
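    An ROI map of this kind can be built by averaging the participants' binary selections; a minimal sketch with hypothetical selections:

        import numpy as np

        def roi_map(selections):
            """Aggregate binary ROI selections from multiple participants into
            a normalised map: each pixel's value is the fraction of observers
            who included it in their region of interest."""
            stack = np.stack([s.astype(float) for s in selections])
            return stack.mean(axis=0)

        # Hypothetical selections from 3 of the 20 participants for one image.
        a = np.zeros((120, 160)); a[30:70, 40:100] = 1
        b = np.zeros((120, 160)); b[35:75, 50:110] = 1
        c = np.zeros((120, 160)); c[10:40, 20:60] = 1
        print(roi_map([a, b, c]).max())  # 1.0 where all three selections overlap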