
    RGBT Salient Object Detection: A Large-scale Dataset and Benchmark

    Salient object detection in complex scenes and environments is a challenging research topic. Most work focuses on RGB-based salient object detection, which limits performance in real-life applications under adverse conditions such as dark environments and complex backgrounds. Exploiting both RGB and thermal infrared images has recently become a new research direction for detecting salient objects in complex scenes, since thermal infrared imaging provides complementary information and has been applied to many computer vision tasks. However, research on RGBT salient object detection has been limited by the lack of a large-scale dataset and a comprehensive benchmark. This work contributes such a dataset, named VT5000, which includes 5000 spatially aligned RGBT image pairs with ground-truth annotations. VT5000 covers 11 challenges collected in different scenes and environments for probing the robustness of algorithms. With this dataset, we propose a powerful baseline approach that extracts multi-level features within each modality and aggregates the features of all modalities with an attention mechanism for accurate RGBT salient object detection. Extensive experiments show that the proposed baseline outperforms state-of-the-art methods on VT5000 and two other public datasets. In addition, we carry out a comprehensive analysis of different RGBT salient object detection algorithms on VT5000, draw several valuable conclusions, and point out potential research directions.
    Comment: 12 pages, 10 figures. https://github.com/lz118/RGBT-Salient-Object-Detectio
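    The abstract describes aggregating per-modality, multi-level features with an attention mechanism. As a rough illustration of that general idea only (not the authors' actual VT5000 baseline network; the module and its sizes are assumptions), here is a minimal PyTorch sketch of channel-attention-weighted fusion of RGB and thermal feature maps:

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse RGB and thermal feature maps with learned channel attention.

    A minimal sketch of attention-based multimodal aggregation; the
    actual VT5000 baseline architecture differs.
    """
    def __init__(self, channels: int):
        super().__init__()
        # Predict one attention weight per channel from the concatenated
        # global context of both modalities.
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),            # global context: B x 2C x 1 x 1
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, rgb_feat: torch.Tensor, thermal_feat: torch.Tensor):
        ctx = torch.cat([rgb_feat, thermal_feat], dim=1)
        w = self.attn(ctx)                      # B x C x 1 x 1, in (0, 1)
        # Convex combination: attention decides, per channel, how much
        # each modality contributes at this feature level.
        return w * rgb_feat + (1.0 - w) * thermal_feat

# Usage: fuse one level of backbone features from each modality.
fusion = AttentionFusion(channels=256)
fused = fusion(torch.randn(2, 256, 32, 32), torch.randn(2, 256, 32, 32))
print(fused.shape)  # torch.Size([2, 256, 32, 32])
```

    In a multi-level design, one such fusion block would sit at each backbone stage, with the fused maps then merged top-down into the final saliency prediction.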

    A new cognitive temporal-spatial visual attention model for video saliency detection

    Human vision has the natural cognitive ability to focus on salient objects or areas when watching static or dynamic scenes. While research on image saliency has historically been popular, the more challenging area of video saliency has recently gained increasing interest as autonomous and cognitive vision techniques have continued to develop. In this talk, a new cognitive temporal-spatial visual attention model is presented for video saliency detection. It extends the popular graph-based visual saliency (GBVS) model, which adopts a 'bottom-up' visual attention mechanism. The new model detects a salient motion map that can be combined with the other static feature maps in the GBVS model. Our proposed model is inspired, firstly, by the observation that independent components of optical flows are recognized for motion understanding in human brains; in light of this, we employ robust independent component analysis (robust ICA) to separate salient foreground optical flows from the relatively static background. A second key feature of the proposed model is that the motion saliency map is calculated from the foreground optical flow vector field and mean shift segmentation. Finally, the salient motion map is normalized and then fused with the static maps through a linear combination. Preliminary experiments demonstrate that the spatio-temporal saliency map detected by the new cognitive visual attention model highlights salient foreground moving objects effectively, even in a complex outdoor scene with a dynamic background or bad weather. The proposed model could be further exploited for autonomous robotic applications. Acknowledgements: This research is supported by The Royal Society of Edinburgh (RSE) and The National Natural Science Foundation of China (NNSFC) under the RSE-NNSFC joint project (2012-2014) [grant number 61211130309] with Anhui University, China, and the "Sino-UK Higher Education Research Partnership for PhD Studies" joint project (2013-2015) funded by the British Council China and The China Scholarship Council (CSC). Amir Hussain and Erfu Yang are also funded, in part, by the UK Engineering and Physical Sciences Research Council (EPSRC) [grant number EP/I009310/1] and the RSE-NNSFC joint project (2012-2014) [grant number 61211130210] with Beihang University, China.
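    To make the pipeline's shape concrete (dense optical flow, ICA-based separation of a foreground flow component, normalization, and linear fusion with a static map), the sketch below uses OpenCV's Farneback flow and scikit-learn's FastICA as a stand-in for the robust ICA of the talk; plain flow magnitude replaces the mean-shift segmentation step, and all parameter choices are assumptions:

```python
import cv2
import numpy as np
from sklearn.decomposition import FastICA

def temporal_spatial_saliency(prev_gray, curr_gray, static_map, alpha=0.5):
    """Crude temporal-spatial saliency: ICA-separated flow + static map.

    FastICA stands in for robust ICA, and flow magnitude replaces the
    mean-shift segmentation of the described model.
    """
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w, _ = flow.shape
    # Each pixel's (dx, dy) is one observation; ICA seeks statistically
    # independent components, ideally splitting foreground motion from
    # the relatively static background.
    comps = FastICA(n_components=2, random_state=0).fit_transform(
        flow.reshape(-1, 2))
    # Heuristic: take the sparser (more peaked) component as foreground.
    peakiness = np.abs(comps).max(axis=0) / (np.abs(comps).mean(axis=0) + 1e-9)
    fg = comps[:, np.argmax(peakiness)]
    motion_map = np.abs(fg).reshape(h, w)
    motion_map /= motion_map.max() + 1e-9        # normalize to [0, 1]
    # Linear fusion with a static saliency map (e.g., from GBVS).
    return alpha * motion_map + (1 - alpha) * static_map
```

    In the full model, the static map would come from the GBVS feature channels; here it is simply any [0, 1] map of the same size.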

    A Biologically Inspired Vision-Based Approach for Detecting Multiple Moving Objects in Complex Outdoor Scenes

    In the human brain, independent components of optical flows from the medial superior temporal area are speculated to underlie motion cognition. Inspired by this hypothesis, a novel approach combining independent component analysis (ICA) with principal component analysis (PCA) is proposed in this paper for detecting multiple moving objects in complex scenes, a major real-time challenge since bad weather or a dynamic background can seriously degrade motion detection. In the proposed approach, taking advantage of ICA's capability to separate statistically independent features from signals, the ICA algorithm is first employed to analyze the optical flows of consecutive image frames, so that the optical flows of background and foreground can be approximately separated. Since many disturbances remain in the foreground optical flows of a complex scene, PCA is then applied to the foreground components so that the major optical flows corresponding to the multiple moving objects are enhanced effectively while motions caused by the changing background and small disturbances are relatively suppressed. Comparative experiments with existing popular motion detection methods on challenging image sequences demonstrate that the proposed biologically inspired vision-based approach can extract multiple moving objects effectively in a complex scene.
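    The ICA separation stage looks much like the sketch given for the previous abstract; the PCA stage can be illustrated as projecting the foreground flow vectors onto their leading principal component(s), so that dominant object motion is kept and minor-axis disturbances are discarded. This is a schematic reconstruction under those assumptions, not the paper's code:

```python
import numpy as np
from sklearn.decomposition import PCA

def enhance_foreground_flow(fg_flow, n_keep=1):
    """Keep only the dominant motion directions in a foreground flow field.

    fg_flow: (H, W, 2) optical-flow field already separated by ICA.
    Reconstructing from the first principal component(s) enhances the
    major motions of the moving objects while motions from the changing
    background and small disturbances are relatively suppressed.
    """
    h, w, _ = fg_flow.shape
    vecs = fg_flow.reshape(-1, 2)
    pca = PCA(n_components=n_keep)
    # Project onto, then reconstruct from, the leading component(s):
    # variance off the dominant motion axis is discarded.
    enhanced = pca.inverse_transform(pca.fit_transform(vecs))
    return enhanced.reshape(h, w, 2)

# Usage: the magnitude of the enhanced field highlights the moving objects.
fg = np.random.randn(240, 320, 2).astype(np.float32)
mag = np.linalg.norm(enhance_foreground_flow(fg), axis=2)
```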

    Multi-object extraction in complex scenes using independent component analysis and principal component analysis : a novel hybrid approach

    It is always a big challenge to extract moving objects from complex video scenes because bad weather or dynamic backgrounds can seriously degrade motion detection. In this research, a new hybrid approach combining independent component analysis (ICA) with principal component analysis (PCA) is proposed for extracting multiple moving objects in complex scenes. First, a fast ICA algorithm is used to analyze the optical flows of video frames, so that the optical flows of background and foreground can be approximately separated. Next, PCA is applied to the foreground optical flow components so that the major optical flows corresponding to the target objects can be extracted accurately while motions resulting from changing backgrounds are removed simultaneously. Preliminary experimental results demonstrate that the proposed hybrid ICA- and PCA-based approach can extract multiple objects effectively in a complex scene. Acknowledgements: This research is supported by The Royal Society of Edinburgh (RSE) and The National Natural Science Foundation of China (NNSFC) under the RSE-NNSFC joint project (2012-2015) [grant number 61211130309] with Anhui University, China, and the "Sino-UK Higher Education Research Partnership for PhD Studies" joint project (2013-2015) funded by the British Council China and The China Scholarship Council (CSC). Amir Hussain and Erfu Yang are also funded by the RSE-NNSFC joint project (2012-2015) [grant number 61211130210] with Beihang University, China.

    Minimum Barrier Distance-Based Object Descriptor for Visual Tracking

    In most visual tracking tasks, the target is tracked via a bounding box given in the first frame. Background information within the bounding box is inevitably complex and redundant, and it degrades tracking performance. To alleviate the influence of the background, we propose a robust object descriptor for visual tracking in this paper. First, we decompose the bounding box into non-overlapping patches and extract color and gradient histogram features for each patch. Second, we adopt the minimum barrier distance (MBD) to calculate patch weights: we treat the boundary patches as background seeds and compute the MBD from each patch to the seed set as that patch's weight, since the MBD-based weight represents the difference between a patch and the background more effectively. Finally, we apply the weights to the extracted features to obtain the descriptor of each patch and then incorporate our MBD-based descriptor into the structured support vector machine algorithm for tracking. Experiments on two benchmark datasets demonstrate the effectiveness of the proposed approach.
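    The MBD weighting step can be sketched concretely. The barrier of a path is max(values) minus min(values) along it, and the MBD of a patch is the minimal barrier over all paths to any background seed. Below is a common Dijkstra-style approximation over a grid of scalar patch features; it is illustrative only (the paper's descriptor operates on color and gradient histograms, and exact MBD computation differs slightly from this relaxation):

```python
import heapq
import numpy as np

def mbd_patch_weights(patch_vals):
    """Dijkstra-style approximation of the minimum barrier distance (MBD)
    from boundary patches (background seeds) to every patch.

    patch_vals: (H, W) array of scalar patch features (e.g., mean
    intensity). A higher weight means the patch differs more from the
    background; boundary seeds get weight 0. Schematic sketch only.
    """
    h, w = patch_vals.shape
    dist = np.full((h, w), np.inf)
    heap = []
    # Boundary patches are background seeds with zero barrier.
    for i in range(h):
        for j in range(w):
            if i in (0, h - 1) or j in (0, w - 1):
                dist[i, j] = 0.0
                heapq.heappush(heap, (0.0, patch_vals[i, j],
                                      patch_vals[i, j], i, j))
    while heap:
        d, hi_val, lo_val, i, j = heapq.heappop(heap)
        if d > dist[i, j]:
            continue                           # stale heap entry
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w:
                nhi = max(hi_val, patch_vals[ni, nj])
                nlo = min(lo_val, patch_vals[ni, nj])
                nd = nhi - nlo                 # barrier of extended path
                if nd < dist[ni, nj]:
                    dist[ni, nj] = nd
                    heapq.heappush(heap, (nd, nhi, nlo, ni, nj))
    return dist  # per-patch weights to apply to the extracted features
```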

    Glycosylated fish gelatin emulsion: Rheological, tribological properties and its application as model coffee creamers

    In this study, the emulsion stability of fish gelatin (FG) modified with gum Arabic (GA) and octenyl succinic anhydride gum Arabic (OSA-GA), with and without glycosylation, was investigated. The results showed that glycosylated FG had the lowest viscosity and smallest particle size, as glycosylation delayed flocculation of the emulsion droplets during storage, and a higher zeta potential than the un-glycosylated samples. OSA-GA was better than GA for modifying fish gelatin to form a stable emulsion. Tribological data indicated that glycosylation could stabilize the emulsion and preserve its lubrication properties. The lightness and lubrication of white coffee increased with increasing emulsion content. Moreover, compared with A2 milk, all emulsion samples greatly increased the lubrication of white coffee, and the FG-GA conjugate emulsion provided the highest lubrication. This study provides insight into the potential of glycosylation-modified FG with improved emulsifying properties for application in coffee as a new coffee whitener in place of milk.

    Multimodal salient object detection via adversarial learning with collaborative generator

    Multimodal salient object detection (MSOD), which utilizes multimodal information (e.g., an RGB image plus a thermal infrared or depth image) to detect common salient objects, has received much attention recently. Different modalities reflect different appearance properties of salient objects, some of which can improve the precision and/or recall of MSOD. To improve both precision and recall by fully exploiting multimodal data, we propose an effective adversarial learning framework based on a novel collaborative generator for accurate multimodal salient object detection. In particular, the collaborative generator consists of three generators (generator1, generator2 and generator3), which aim to reduce the false positives and false negatives of the generated saliency maps and to improve the F-measure of the final saliency maps, respectively. Generator1 and generator2 each contain two encoder-decoder networks for the multimodal inputs, and we propose a new co-attention model to perform adaptive interactions between the different modalities. Furthermore, generator3 integrates the feature maps from generator1 and generator2 in a complementary way. By adversarially training the collaborative generator against a discriminator, both precision and recall of the predicted maps are boosted with the complementary benefits of multimodal data. Extensive experiments on three RGBT datasets and six RGBD datasets show that our method performs favorably against state-of-the-art MSOD methods.
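    As a structural sketch of the collaborative-generator idea only, the PyTorch skeleton below collapses each generator's per-modality encoder-decoder pair and the co-attention model into a single toy network, and stacks the two modalities as input channels; every module here is a hypothetical simplification of the framework the abstract describes:

```python
import torch
import torch.nn as nn

def enc_dec(in_ch):
    # Toy encoder-decoder; the paper uses far deeper multimodal networks
    # with co-attention between the two modality branches.
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid())

class CollaborativeGenerator(nn.Module):
    """Three cooperating generators, as sketched from the abstract:
    g1 and g2 each predict a saliency map from the stacked multimodal
    input (aiming to reduce false positives / false negatives), and g3
    fuses their outputs into the final map. Hypothetical skeleton."""
    def __init__(self, in_ch=4):                # e.g., RGB + thermal = 4 ch
        super().__init__()
        self.g1 = enc_dec(in_ch)
        self.g2 = enc_dec(in_ch)
        self.g3 = enc_dec(2)                    # fuses the two maps

    def forward(self, x):
        s1, s2 = self.g1(x), self.g2(x)
        return self.g3(torch.cat([s1, s2], dim=1))

# Adversarial training would pit this generator against a discriminator
# that judges (input, saliency map) pairs, e.g., with a BCE GAN loss.
gen = CollaborativeGenerator()
out = gen(torch.randn(2, 4, 64, 64))            # B x 1 x 64 x 64
```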