Light Field Saliency Detection with Deep Convolutional Networks
Light field imaging presents an attractive alternative to RGB imaging because
of the recording of the direction of the incoming light. The detection of
salient regions in a light field image benefits from the additional modeling of
angular patterns. For RGB imaging, methods using CNNs have achieved excellent
results on a range of tasks, including saliency detection. However, it is not
trivial to use CNN-based methods for saliency detection on light field images
because these methods are not specifically designed for processing light field
inputs. In addition, current light field datasets are not sufficiently large to
train CNNs. To overcome these issues, we present a new Lytro Illum dataset,
which contains 640 light fields and their corresponding ground-truth saliency
maps. Compared to current light field saliency datasets [1], [2], our new
dataset is larger and of higher quality, and contains more variation and more
types of light field inputs. This makes our dataset suitable for training deeper
networks and benchmarking. Furthermore, we propose a novel end-to-end CNN-based
framework for light field saliency detection. Specifically, we propose three
novel MAC (Model Angular Changes) blocks to process light field micro-lens
images. We systematically study the impact of different architecture variants
and compare light field saliency with regular 2D saliency. Our extensive
comparisons indicate that our novel network significantly outperforms
state-of-the-art methods on the proposed dataset and generalizes well to other
existing datasets.
Comment: 14 pages, 14 figures
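The abstract above describes MAC blocks that filter micro-lens images so that each output position summarises the angular pattern within one micro-lens. A minimal single-channel sketch of that idea, assuming a kernel whose size and stride both equal the angular resolution (the function name and arguments are illustrative, not the paper's API):

```python
import numpy as np

def mac_block(micro_lens_img, kernel):
    """Angular filtering sketch: the kernel covers exactly one micro-lens
    (angular) patch, and the stride equals the angular resolution, so each
    output pixel aggregates the angular pattern at one spatial position."""
    ang = kernel.shape[0]                # angular resolution, e.g. 7x7 per micro-lens
    H, W = micro_lens_img.shape
    out = np.zeros((H // ang, W // ang))
    for i in range(0, H - ang + 1, ang):
        for j in range(0, W - ang + 1, ang):
            patch = micro_lens_img[i:i + ang, j:j + ang]
            out[i // ang, j // ang] = np.sum(patch * kernel)
    return out
```

In a real network this would be a learned convolution over multi-channel features; the loop form only makes the kernel-size-equals-stride structure explicit.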
Light Field Salient Object Detection: A Review and Benchmark
Salient object detection (SOD) is a long-standing research topic in computer
vision and has drawn an increasing amount of research interest in the past
decade. This paper provides the first comprehensive review and benchmark for
light field SOD, which has long been lacking in the saliency community.
Firstly, we introduce preliminary knowledge on light fields, including theory
and data forms, and then review existing studies on light field SOD, covering
ten traditional models, seven deep learning-based models, one comparative
study, and one brief review. Existing datasets for light field SOD are also
summarized with detailed information and statistical analyses. Secondly, we
benchmark nine representative light field SOD models together with several
cutting-edge RGB-D SOD models on four widely used light field datasets, from
which we derive insightful discussions and analyses, including a comparison
between light field SOD and RGB-D SOD models. Besides, due to the inconsistency
of datasets in their current forms, we further generate complete data and
supplement focal stacks, depth maps and multi-view images for the inconsistent
datasets, making them consistent and unified. Our supplemental data makes a
universal benchmark possible. Lastly, because light field SOD is a special
problem owing to its diverse data representations and high dependency on
acquisition hardware, which make it differ greatly from other saliency
detection tasks, we offer nine insights into the challenges and future
directions, and outline several open issues. We hope our review and
benchmarking could help advance research in this field. All the materials
including collected models, datasets, benchmarking results, and supplemented
light field datasets will be publicly available on our project site
https://github.com/kerenfu/LFSOD-Survey
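Benchmarks like the one above typically score saliency maps against ground truth with standard metrics such as mean absolute error and the F-measure. A small sketch of the two conventional formulas (the specific metric set used by this benchmark is not stated here; these are the common definitions in the SOD literature):

```python
import numpy as np

def mae(pred, gt):
    """Mean absolute error between a predicted saliency map and the
    ground truth, both scaled to [0, 1]; lower is better."""
    return float(np.mean(np.abs(pred.astype(np.float64) - gt.astype(np.float64))))

def f_measure(pred, gt, thresh=0.5, beta2=0.3):
    """F-measure at a fixed binarisation threshold, with the conventional
    beta^2 = 0.3 weighting that favours precision over recall."""
    binary = pred >= thresh
    tp = np.logical_and(binary, gt > 0).sum()
    precision = tp / max(binary.sum(), 1)
    recall = tp / max((gt > 0).sum(), 1)
    denom = beta2 * precision + recall
    return (1 + beta2) * precision * recall / denom if denom > 0 else 0.0
```

In practice the F-measure is often swept over all thresholds and the maximum reported.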
Low Light Image Enhancement and Saliency Object Detection
University of Technology Sydney, Faculty of Engineering and Information Technology.
Low-light images are a class of images with great potential; research on them focuses on images and videos captured at dusk or in near darkness. They can be widely used in night-time safety monitoring, license plate recognition, night scene photography, special target recognition at dusk, and other emergency events that occur in low-light scenes. Once the imagery is enhanced and combined with other tasks in computer vision and pattern recognition, many applications follow, such as saliency detection and object detection under low illumination, and anomaly detection in crowded places in low-light environments. For the enhancement of low-light scenes, traditional methods often produce over-exposure and halo artifacts; deep learning can address these specific shortcomings. For low-light image enhancement, a series of qualitative and quantitative experimental comparisons conducted on a benchmark dataset demonstrates the superiority of our approach, which overcomes white and colour distortion. At present, most research on visual saliency has concentrated on visible light, and there are few studies on night scenes. Owing to insufficient lighting in night scenes, and their relatively low contrast and signal-to-noise ratios, the effectiveness of the available visual features is greatly reduced. Moreover, without sufficient depth information, many features and cues are lost in the original images. The detection of salient targets in night scenes is therefore difficult, and it is a focus of current research in computer vision.
Existing methods produce vague results when applied directly, so we adopt a new "enhance first, detect second" mechanism that first enhances the low-light images to improve contrast and visibility, and then combines them with saliency detection methods that use depth information. Furthermore, we investigate feature aggregation schemes for deep RGB-D salient object detection and propose novel feature aggregation methods. Meanwhile, for monocular vision, where depth information is hard to acquire, we propose a novel RGB-D image saliency detection method that leverages depth cues to enhance saliency detection performance without actually using depth data. The model not only outperforms state-of-the-art RGB saliency models, but also achieves comparable or even better results than state-of-the-art RGB-D saliency models.
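The "enhance first, detect second" mechanism described above can be sketched as a two-stage pipeline. Here gamma correction stands in for the learned enhancer and a toy contrast rule stands in for the saliency network; both are illustrative placeholders, not the thesis's actual models:

```python
import numpy as np

def enhance(img, gamma=0.5):
    """Gamma correction as a stand-in for a learned low-light enhancer:
    gamma < 1 brightens dark regions while keeping values in [0, 1]."""
    return np.clip(img, 0.0, 1.0) ** gamma

def detect_saliency(img):
    """Toy contrast-based saliency: absolute deviation from the mean
    intensity (a placeholder for an RGB-D saliency network)."""
    return np.abs(img - img.mean())

def enhance_then_detect(low_light_img):
    # stage 1: improve contrast and visibility; stage 2: detect saliency
    return detect_saliency(enhance(low_light_img))
```

The point of the structure is only the ordering: detection operates on the enhanced image, never on the raw low-light input.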
Saliency difference based objective evaluation method for a superimposed screen of the HUD with various backgrounds
The head-up display (HUD) is an emerging device which can project information
on a transparent screen. The HUD has been used in airplanes and vehicles, and
it is usually placed in front of the operator's view. In the case of the
vehicle, the driver can see not only various information on the HUD but also
the backgrounds (driving environment) through the HUD. However, the projected
information on the HUD may interfere with the colors in the background because
the HUD is transparent. For example, a red message on the HUD will be less
noticeable when there is an overlap between it and the red brake light from the
front vehicle. As a first step toward solving this issue, it is important to
evaluate the mutual interference between the information on the HUD and the
background. Therefore, this paper proposes a saliency-based method to evaluate
this mutual interference, in which the HUD region cut from a saliency map of a
measured image is compared with the HUD image.
Comment: 10 pages, 5 figures, 1 table, accepted by IFAC-HMS 201
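The comparison step described above can be sketched as cropping the HUD region out of the measured scene's saliency map and measuring how far it deviates from the saliency of the HUD image alone. All names below are illustrative, not from the paper:

```python
import numpy as np

def hud_interference(scene_saliency, hud_saliency, hud_mask):
    """Compare the HUD region of the measured scene's saliency map against
    the saliency of the HUD image alone; a larger mean difference suggests
    the background is interfering with the projected information."""
    region = scene_saliency[hud_mask]      # HUD part cut from the scene's saliency map
    reference = hud_saliency[hud_mask]     # saliency of the HUD content by itself
    return float(np.mean(np.abs(region - reference)))
```

The boolean mask localises the transparent screen within the measured image, so the score ignores the rest of the driving scene.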
Objects predict fixations better than early saliency
Humans move their eyes while looking at scenes and pictures. Eye movements correlate with shifts in attention and are thought to be a consequence of optimal resource allocation for high-level tasks such as visual recognition. Models of attention, such as “saliency maps,” are often built on the assumption that “early” features (color, contrast, orientation, motion, and so forth) drive attention directly. We explore an alternative hypothesis: observers attend to “interesting” objects. To test this hypothesis, we measure the eye position of human observers while they inspect photographs of common natural scenes. Our observers perform different tasks: artistic evaluation, analysis of content, and search. Immediately after each presentation, our observers are asked to name the objects they saw. Weighted by recall frequency, these objects predict fixations in individual images better than early saliency, irrespective of task. Also, saliency combined with object positions predicts which objects are frequently named. This suggests that early saliency has only an indirect effect on attention, acting through recognized objects. Consequently, rather than treating attention as a mere preprocessing step for object recognition, models of both need to be integrated.
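The recall-weighted object prediction described above can be sketched as a weighted sum of object masks, normalised into a fixation-probability map. This is a hypothetical helper illustrating the idea, not the paper's actual model:

```python
import numpy as np

def object_fixation_map(shape, named_objects):
    """Build a fixation-prediction map from object masks weighted by how
    often observers named each object (recall frequency); the result is
    normalised to sum to 1 so it reads as a probability map."""
    pred = np.zeros(shape, dtype=np.float64)
    for mask, recall_freq in named_objects:
        pred += recall_freq * mask.astype(np.float64)
    total = pred.sum()
    return pred / total if total > 0 else pred
```

Such a map can then be scored against measured fixations with the same metrics used for early-saliency maps, which is how the two hypotheses are compared.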