RGBT Salient Object Detection: A Large-scale Dataset and Benchmark
Salient object detection in complex scenes and environments is a challenging research topic. Most works focus on RGB-based salient object detection, which limits performance in real-life applications under adverse conditions such as dark environments and complex backgrounds. Exploiting both RGB and thermal infrared images has recently become a new research direction for detecting salient objects in complex scenes, as the thermal infrared spectrum provides complementary information and has been applied to many computer vision tasks. However, current research on RGBT salient object detection is limited by the lack of a large-scale dataset and a comprehensive benchmark. This work contributes such an RGBT image dataset, named VT5000, comprising 5000 spatially aligned RGBT image pairs with ground-truth annotations. VT5000 covers 11 challenges collected in different scenes and environments for exploring the robustness of algorithms. With this dataset, we propose a powerful baseline approach that extracts multi-level features within each modality and aggregates the features of all modalities with an attention mechanism for accurate RGBT salient object detection. Extensive experiments show that the proposed baseline approach outperforms state-of-the-art methods on the VT5000 dataset and two other public datasets. In addition, we carry out a comprehensive analysis of different RGBT salient object detection algorithms on the VT5000 dataset, draw several valuable conclusions, and point out potential research directions for RGBT salient object detection.
Comment: 12 pages, 10 figures
https://github.com/lz118/RGBT-Salient-Object-Detectio
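As a rough illustration of the attention-based aggregation described above, the fusion step can be thought of as weighting each modality's features with a softmax attention map. This is a generic numpy sketch, not the paper's learned network; the (M, H, W, C) layout and the mean-response "score" are assumptions for the example.

```python
import numpy as np

# Generic sketch of attention-weighted fusion across modalities; the
# mean-response "score" below stands in for a learned attention module.
def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(feats):
    """feats: list of (H, W, C) feature maps, one per modality.
    Weights each spatial location by a softmax over modality responses,
    so the fused map is a convex combination of the modality features."""
    stack = np.stack(feats)                      # (M, H, W, C)
    scores = stack.mean(axis=-1, keepdims=True)  # (M, H, W, 1) crude per-location response
    weights = softmax(scores, axis=0)            # attention over the M modalities
    return (weights * stack).sum(axis=0)         # (H, W, C) fused features

rgb = np.random.rand(8, 8, 4)
thermal = np.random.rand(8, 8, 4)
fused = attention_fuse([rgb, thermal])
```

Because the softmax weights sum to one across modalities, the fused features stay within the range spanned by the inputs at every location.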
A prior regularized multi-layer graph ranking model for image saliency computation
Bottom-up saliency detection has been widely studied for many applications, such as image retrieval, object recognition, and image compression. Saliency detection via manifold ranking (MR) can efficiently identify the most salient and important area in an image. One limitation of the MR model is that it fails to consider prior information in its ranking process. To overcome this limitation, we propose a prior regularized multi-layer graph ranking model (RegMR), which uses a prior computed from boundary connectivity. Based on a multi-layer graph, we employ the foreground possibility in the first stage and the background possibility in the second stage. We compare our model with fifteen state-of-the-art methods. Experiments on four public databases show that our model outperforms all other methods in terms of PR curves, F-measure, and other metrics
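The manifold-ranking backbone referred to above solves a closed-form graph ranking problem. A minimal numpy sketch on a toy graph is shown below; RegMR's multi-layer graph, superpixel features, and boundary-connectivity prior are omitted, so this only illustrates the core ranking equation.

```python
import numpy as np

# Toy manifold ranking on a 4-node chain graph. Scores solve
#   f* = (I - alpha * S)^{-1} y,
# where S = D^{-1/2} W D^{-1/2} is the normalized affinity matrix
# and y marks the query/prior nodes.
def manifold_rank(W, y, alpha=0.99):
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    S = D_inv_sqrt @ W @ D_inv_sqrt          # symmetric normalization
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, y)

# chain 0-1-2-3 with node 0 as the query; a smaller alpha keeps this
# tiny example well conditioned
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
y = np.array([1.0, 0.0, 0.0, 0.0])
f = manifold_rank(W, y, alpha=0.5)
# scores decay with graph distance from the query node
```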
A new cognitive temporal-spatial visual attention model for video saliency detection
Human vision has the natural cognitive ability to focus on salient objects or areas when watching static or dynamic scenes. While research on image saliency has historically been popular, the challenging area of video saliency has been gaining increasing interest recently, as autonomous and cognitive vision techniques have continued to develop. In this talk, a new cognitive temporal-spatial visual attention model is presented for video saliency detection. It extends the popular graph-based visual saliency (GBVS) model, which adopts a "bottom-up" visual attention mechanism. The new model can detect a salient motion map, which can be combined with the static feature maps in the GBVS model. Our proposed model is inspired, firstly, by the observation that independent components of optical flows are recognized for motion understanding in human brains, in the light of which we employ robust independent component analysis (robust ICA) to separate salient foreground optical flows from the relatively static background. A second key feature of our proposed model is that the motion saliency map is calculated from the foreground optical flow vector field and mean shift segmentation. Finally, the salient motion map is normalized and then fused with the static maps through a linear combination. Preliminary experiments demonstrate that the spatio-temporal saliency map detected by the new cognitive visual attention model highlights salient foreground moving objects effectively, even in a complex outdoor scene with a dynamic background or bad weather. The proposed model could be further exploited for autonomous robotic applications.
Acknowledgements: This research is supported by The Royal Society of Edinburgh (RSE) and The National Natural Science Foundation of China (NNSFC) under the RSE-NNSFC joint project (2012-2014) [grant number 61211130309] with Anhui University, China, and the "Sino-UK Higher Education Research Partnership for PhD Studies" joint project (2013-2015) funded by the British Council China and The China Scholarship Council (CSC). Amir Hussain and Erfu Yang are also funded, in part, by the UK Engineering and Physical Sciences Research Council (EPSRC) [grant number EP/I009310/1] and the RSE-NNSFC joint project (2012-2014) [grant number 61211130210] with Beihang University, China
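The final fusion step described in the talk abstract (normalize the motion map, then linearly combine it with the static maps) can be sketched as follows. The maps and the weight `w_motion` here are illustrative placeholders, not values from the work itself.

```python
import numpy as np

# Sketch of the fusion step: min-max normalize each map to [0, 1],
# then take a linear combination. w_motion is a placeholder weight.
def normalize01(m):
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def fuse_saliency(static_map, motion_map, w_motion=0.5):
    """Linear combination of normalized static and motion saliency maps."""
    return (1 - w_motion) * normalize01(static_map) + w_motion * normalize01(motion_map)

static = np.array([[0.0, 1.0], [2.0, 3.0]])
motion = np.array([[3.0, 2.0], [1.0, 0.0]])
fused = fuse_saliency(static, motion)
```

Normalizing before mixing keeps the two cues on a comparable scale, so the combination weight directly controls their relative influence.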
A Biologically Inspired Vision-Based Approach for Detecting Multiple Moving Objects in Complex Outdoor Scenes
In the human brain, independent components of optical flows from the medial superior temporal area are speculated to underlie motion cognition. Inspired by this hypothesis, a novel approach combining independent component analysis (ICA) with principal component analysis (PCA) is proposed in this paper for detecting multiple moving objects in complex scenes, a major real-time challenge since bad weather or a dynamic background can seriously degrade motion detection results. In the proposed approach, taking advantage of ICA's capability of separating statistically independent features from signals, the ICA algorithm is first employed to analyze the optical flows of consecutive visual image frames. As a result, the optical flows of background and foreground can be approximately separated. Since many disturbances remain in the foreground optical flows in a complex scene, PCA is then applied to the optical flows of the foreground components so that the major optical flows corresponding to multiple moving objects are enhanced effectively, while the motions resulting from the changing background and small disturbances are relatively suppressed. Comparative experiments with existing popular motion detection methods on challenging image sequences demonstrate that our proposed biologically inspired vision-based approach can extract multiple moving objects effectively in a complex scene
Multi-object extraction in complex scenes using independent component analysis and principal component analysis : a novel hybrid approach
It is always a big challenge to extract moving objects in complex video scenes because bad weather or dynamic backgrounds can seriously influence the results of motion detection. In this research, a new hybrid approach combining independent component analysis (ICA) with principal component analysis (PCA) is proposed for extracting multiple moving objects in complex scenes. First, a fast ICA algorithm is used to analyze the optical flows of video frames, so that the optical flows of background and foreground can be approximately separated. Next, PCA is applied to the optical flows of the foreground components so that the major optical flows corresponding to the target objects can be extracted accurately, while the motions resulting from changing backgrounds are removed simultaneously. Preliminary experimental results demonstrate that the proposed hybrid ICA- and PCA-based approach can extract multiple objects effectively in a complex scene. Acknowledgements: This research is supported by The Royal Society of Edinburgh (RSE) and The National Natural Science Foundation of China (NNSFC) under the RSE-NNSFC joint project (2012-2015) [grant number 61211130309] with Anhui University, China, and the "Sino-UK Higher Education Research Partnership for PhD Studies" joint project (2013-2015) funded by the British Council China and The China Scholarship Council (CSC). Amir Hussain and Erfu Yang are also funded by the RSE-NNSFC joint project (2012-2015) [grant number 61211130210] with Beihang University, China
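The ICA-then-PCA pipeline used in the two entries above can be sketched on toy one-dimensional "flow signals": the whitening step below is the PCA stage, and the fixed-point loop is a bare-bones FastICA with a tanh contrast. This is an illustration only; real systems operate on dense optical-flow fields (e.g. with scikit-learn's FastICA), and the mixing matrix and signals here are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def whiten(X):
    """PCA whitening: rows of X are signals; output has identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    d, E = np.linalg.eigh(cov)
    return E @ np.diag(1.0 / np.sqrt(d)) @ E.T @ Xc

def fastica2(X, n_iter=200):
    """Recover 2 independent components by deflation (tanh contrast)."""
    Z = whiten(X)
    W = []
    for _ in range(2):
        w = rng.normal(size=Z.shape[0])
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            g = np.tanh(Z.T @ w)
            w_new = Z @ g / Z.shape[1] - (1 - g ** 2).mean() * w
            for v in W:                  # keep orthogonal to found components
                w_new -= (w_new @ v) * v
            w = w_new / np.linalg.norm(w_new)
        W.append(w)
    return np.array(W) @ Z               # estimated sources

# mix a smooth "background" flow with an abrupt "foreground" flow
t = np.linspace(0, 1, 500)
background = np.sin(2 * np.pi * 2 * t)
foreground = np.sign(np.sin(2 * np.pi * 7 * t))
X = np.array([[1.0, 0.6], [0.4, 1.0]]) @ np.vstack([background, foreground])
S_est = fastica2(X)
```

On this toy mixture the two recovered components are mutually uncorrelated, and one of them closely tracks the abrupt "foreground" signal, mirroring the background/foreground separation the abstracts describe.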
Minimum Barrier Distance-Based Object Descriptor for Visual Tracking
In most visual tracking tasks, the target is tracked with a bounding box given in the first frame. Background information inside the bounding box is inevitably complex and redundant, and it affects tracking performance. To alleviate the influence of background, we propose a robust object descriptor for visual tracking in this paper. First, we decompose the bounding box into non-overlapping patches and extract color and gradient histogram features for each patch. Second, we adopt the minimum barrier distance (MBD) to calculate patch weights. Specifically, we treat the boundary patches as background seeds and compute the MBD from each patch to the seed set as that patch's weight, since the MBD-based weight represents the difference between each patch and the background more effectively. Finally, we apply the weights to the extracted features to obtain the descriptor of each patch, and then incorporate our MBD-based descriptor into the structured support vector machine algorithm for tracking. Experiments on two benchmark datasets demonstrate the effectiveness of the proposed approach
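The minimum barrier distance used for the patch weights measures a path by the spread of values along it (max minus min). A Dijkstra-style sketch on a pixel grid is shown below; this greedy version is an illustration/approximation (fast raster-scan variants are used in practice), and the grayscale image, seed set, and 4-connectivity are assumptions for the example.

```python
import heapq
import numpy as np

def mbd_transform(img, seeds):
    """Minimum barrier distance from a set of seed pixels on a 2-D grid.
    A path's barrier is max(values on path) - min(values on path); we
    greedily expand the lowest-barrier path first."""
    H, W = img.shape
    dist = np.full((H, W), np.inf)
    heap = []
    for r, c in seeds:                       # seeds start with zero barrier
        dist[r, c] = 0.0
        heapq.heappush(heap, (0.0, img[r, c], img[r, c], r, c))
    while heap:
        d, hi, lo, r, c = heapq.heappop(heap)
        if d > dist[r, c]:
            continue                         # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < H and 0 <= nc < W:
                nhi = max(hi, img[nr, nc])   # extend the path's value range
                nlo = min(lo, img[nr, nc])
                nd = nhi - nlo               # barrier of the extended path
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    heapq.heappush(heap, (nd, nhi, nlo, nr, nc))
    return dist

img = np.array([[0.0, 0.0],
                [0.0, 5.0]])
d = mbd_transform(img, seeds=[(0, 0)])       # boundary seed at top-left
```

Patches similar in value to the background seeds get near-zero barriers (low weight), while patches whose values force a large spread on every path to the seeds get large barriers (high weight), which is the intuition behind using MBD as a patch weight.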
Glycosylated fish gelatin emulsion: Rheological, tribological properties and its application as model coffee creamers
In this study, the emulsion stability of fish gelatin (FG) modified with gum Arabic (GA) and octenyl succinic anhydride gum Arabic (OSA-GA), with and without glycosylation, was investigated. The results showed that glycosylated FG had the smallest viscosity and particle size, by delaying flocculation of the emulsion droplets during storage, but a higher zeta potential than the un-glycosylated samples. OSA-GA was better than GA for modifying fish gelatin to form a stable emulsion. Tribological data indicated that glycosylation could stabilize the emulsion and preserve its lubrication properties. The lightness and lubrication of white coffee increased with increasing emulsion content. Moreover, in comparison with A2 milk, all emulsion samples greatly increased the lubrication of white coffee, and the FG-GA conjugate emulsion had the highest lubrication. This study provides insight into the potential of glycosylation-modified FG with improved emulsifying properties for application in coffee as a new coffee whitener in place of milk
Multimodal salient object detection via adversarial learning with collaborative generator
Multimodal salient object detection (MSOD), which utilizes multimodal information (e.g., an RGB image together with a thermal infrared or depth image) to detect common salient objects, has received much attention recently. Different modalities reflect different appearance properties of salient objects, some of which could help improve the precision and/or recall of MSOD. To improve both precision and recall by fully exploiting multimodal data, in this work we propose an effective adversarial learning framework based on a novel collaborative generator for accurate multimodal salient object detection. In particular, the collaborative generator consists of three generators (generator1, generator2 and generator3), which aim at decreasing the false positives and false negatives of the generated saliency maps and improving the F-measure of the final saliency maps, respectively. Generator1 and generator2 contain two encoder-decoder networks for the multimodal inputs, and we propose a new co-attention model to perform adaptive interactions between different modalities. Furthermore, we apply generator3 to integrate feature maps from generator1 and generator2 in a complementary way. By adversarially training the collaborative generator and the discriminator, both precision and recall of the predicted maps are boosted with the complementary benefits of multimodal data. Extensive experiments on three RGBT datasets and six RGBD datasets show that our method performs quite well against state-of-the-art MSOD methods
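Co-attention between two modalities is commonly implemented via a cross-modal affinity matrix. The numpy sketch below illustrates that general idea only; the paper's co-attention model is a learned network module, and the random features and residual fusion here are placeholders.

```python
import numpy as np

# Illustrative cross-modal co-attention via an affinity matrix; a fixed
# computation standing in for a learned co-attention module.
def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(F1, F2):
    """F1, F2: (N, C) flattened spatial features from two modalities."""
    A = F1 @ F2.T                        # (N, N) cross-modal affinity
    F1_att = softmax(A, axis=1) @ F2     # each F1 position attends over F2
    F2_att = softmax(A.T, axis=1) @ F1   # each F2 position attends over F1
    return F1 + F1_att, F2 + F2_att      # residual enhancement

rgb_feat = np.random.rand(16, 8)
thermal_feat = np.random.rand(16, 8)
rgb_out, thermal_out = co_attention(rgb_feat, thermal_feat)
```

The affinity matrix lets each spatial position in one modality gather evidence from all positions in the other, which is the adaptive cross-modal interaction the abstract describes.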