1,289 research outputs found
Background prior-based salient object detection via deep reconstruction residual
Detection of salient objects from images is gaining increasing research interest in recent years as it can substantially facilitate a wide range of content-based multimedia applications. Based on the assumption that foreground salient regions are distinctive within a certain context, most conventional approaches rely on a number of hand designed features and their distinctiveness measured using local or global contrast. Although these approaches have shown effective in dealing with simple images, their limited capability may cause difficulties when dealing with more complicated images. This paper proposes a novel framework for saliency detection by first modeling the background and then separating salient objects from the background. We develop stacked denoising autoencoders with deep learning architectures to model the background where latent patterns are explored and more powerful representations of data are learnt in an unsupervised and bottom up manner. Afterwards, we formulate the separation of salient objects from the background as a problem of measuring reconstruction residuals of deep autoencoders. Comprehensive evaluations on three benchmark datasets and comparisons with 9 state-of-the-art algorithms demonstrate the superiority of the proposed work
Contextual cropping and scaling of TV productions
This is the author's accepted manuscript. The final publication is available at Springer via http://dx.doi.org/10.1007/s11042-011-0804-3. Copyright @ Springer Science+Business Media, LLC 2011.In this paper, an application is presented which automatically adapts SDTV (Standard Definition Television) sports productions to smaller displays through intelligent cropping and scaling. It crops regions of interest of sports productions based on a smart combination of production metadata and systematic video analysis methods. This approach allows a context-based composition of cropped images. It provides a differentiation between the original SD version of the production and the processed one adapted to the requirements for mobile TV. The system has been comprehensively evaluated by comparing the outcome of the proposed method with manually and statically cropped versions, as well as with non-cropped versions. Envisaged is the integration of the tool in post-production and live workflows
PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection
Contexts play an important role in the saliency detection task. However,
given a context region, not all contextual information is helpful for the final
task. In this paper, we propose a novel pixel-wise contextual attention
network, i.e., the PiCANet, to learn to selectively attend to informative
context locations for each pixel. Specifically, for each pixel, it can generate
an attention map in which each attention weight corresponds to the contextual
relevance at each context location. An attended contextual feature can then be
constructed by selectively aggregating the contextual information. We formulate
the proposed PiCANet in both global and local forms to attend to global and
local contexts, respectively. Both models are fully differentiable and can be
embedded into CNNs for joint training. We also incorporate the proposed models
with the U-Net architecture to detect salient objects. Extensive experiments
show that the proposed PiCANets can consistently improve saliency detection
performance. The global and local PiCANets facilitate learning global contrast
and homogeneousness, respectively. As a result, our saliency model can detect
salient objects more accurately and uniformly, thus performing favorably
against the state-of-the-art methods
Bottom-up visual attention model for still image: a preliminary study
The philosophy of human visual attention is scientifically explained in the field of cognitive psychology and neuroscience then computationally modeled in the field of computer science and engineering. Visual attention models have been applied in computer vision systems such as object detection, object recognition, image segmentation, image and video compression, action recognition, visual tracking, and so on. This work studies bottom-up visual attention, namely human fixation prediction and salient object detection models. The preliminary study briefly covers from the biological perspective of visual attention, including visual pathway, the theory of visual attention, to the computational model of bottom-up visual attention that generates saliency map. The study compares some models at each stage and observes whether the stage is inspired by biological architecture, concept, or behavior of human visual attention. From the study, the use of low-level features, center-surround mechanism, sparse representation, and higher-level guidance with intrinsic cues dominate the bottom-up visual attention approaches. The study also highlights the correlation between bottom-up visual attention and curiosity
- …