465 research outputs found

    Saliency-guided Adaptive Seeding for Supervoxel Segmentation

    We propose a new saliency-guided method for generating supervoxels in 3D space. Rather than using an evenly distributed spatial seeding procedure, our method uses visual saliency to guide the process of supervoxel generation. This results in densely distributed, small, and precise supervoxels in salient regions, which often contain objects, and larger supervoxels in less salient regions, which often correspond to background. Our approach substantially improves the quality of the resulting supervoxel segmentation in terms of boundary recall and under-segmentation error on publicly available benchmarks. Comment: 6 pages, accepted to IROS201

    An Iterative Co-Saliency Framework for RGBD Images

    As a newly emerging and significant topic in the computer vision community, co-saliency detection aims at discovering the common salient objects in multiple related images. Existing methods often generate the co-saliency map through a direct forward pipeline based on designed cues or initialization, but lack a refinement-cycle scheme. Moreover, they mainly focus on RGB images and ignore the depth information available in RGBD images. In this paper, we propose an iterative RGBD co-saliency framework, which utilizes existing single-image saliency maps as the initialization and generates the final RGBD co-saliency map by using a refinement-cycle model. Three schemes are employed in the proposed RGBD co-saliency framework: the addition scheme, the deletion scheme, and the iteration scheme. The addition scheme is used to highlight the salient regions based on intra-image depth propagation and saliency propagation, while the deletion scheme filters the salient regions and removes the non-common salient regions based on an inter-image constraint. The iteration scheme is proposed to obtain a more homogeneous and consistent co-saliency map. Furthermore, a novel descriptor, named depth shape prior, is proposed in the addition scheme to introduce depth information and enhance the identification of co-salient objects. The proposed method can effectively exploit any existing 2D saliency model to work well in RGBD co-saliency scenarios. Experiments on two RGBD co-saliency datasets demonstrate the effectiveness of our proposed framework. Comment: 13 pages, 13 figures, Accepted by IEEE Transactions on Cybernetics 2017. Project URL: https://rmcong.github.io/proj_RGBD_cosal_tcyb.htm

    Hierarchical Salient Object Detection for Assisted Grasping

    Visual scene decomposition into semantic entities is one of the major challenges when creating a reliable object grasping system. Recently, we introduced a bottom-up hierarchical clustering approach which is able to segment objects and parts in a scene. In this paper, we introduce a transform from such a segmentation into a corresponding, hierarchical saliency function. In comprehensive experiments we demonstrate its ability to detect salient objects in a scene. Furthermore, this hierarchical saliency defines a most salient corresponding region (scale) for every point in an image. Based on this, an easy-to-use pick and place manipulation system was developed and tested exemplarily. Comment: Accepted for ICRA 201

    Light Field Salient Object Detection: A Review and Benchmark

    Salient object detection (SOD) is a long-standing research topic in computer vision and has drawn an increasing amount of research interest in the past decade. This paper provides the first comprehensive review and benchmark for light field SOD, which has long been lacking in the saliency community. Firstly, we introduce preliminary knowledge on light fields, including theory and data forms, and then review existing studies on light field SOD, covering ten traditional models, seven deep learning-based models, one comparative study, and one brief review. Existing datasets for light field SOD are also summarized with detailed information and statistical analyses. Secondly, we benchmark nine representative light field SOD models together with several cutting-edge RGB-D SOD models on four widely used light field datasets, from which insightful discussions and analyses, including a comparison between light field SOD and RGB-D SOD models, are drawn. Besides, due to the inconsistency of datasets in their current forms, we further generate complete data and supplement focal stacks, depth maps and multi-view images for the inconsistent datasets, making them consistent and unified. Our supplemental data makes a universal benchmark possible. Lastly, because light field SOD is quite a special problem, owing to its diverse data representations and high dependency on acquisition hardware, and differs greatly from other saliency detection tasks, we provide nine hints about the challenges and future directions, and outline several open issues. We hope our review and benchmarking could help advance research in this field. All the materials including collected models, datasets, benchmarking results, and supplemented light field datasets will be publicly available on our project site https://github.com/kerenfu/LFSOD-Survey

    Exploration Strategies for Incremental Learning of Object-Based Visual Saliency

    Searching for objects in an indoor environment can be drastically improved if a task-specific visual saliency is available. We describe a method to learn such an object-based visual saliency in an intrinsically motivated way using an environment exploration mechanism. We first define saliency in a geometrical manner and use this definition to discover salient elements given an attentive but costly observation of the environment. These elements are used to train a fast classifier that predicts salient objects given large-scale visual features. To make learning better and faster, we use intrinsic motivation to drive our observation selection, based on uncertainty and novelty detection. Our approach has been tested on RGB-D images, is real-time, and outperforms several state-of-the-art methods in the case of indoor object detection.

    Image and Video-Based Autism Spectrum Disorder Detection via Deep Learning

    People with Autism Spectrum Disorder (ASD) show atypical attention to social stimuli and aberrant gaze when viewing images of the physical world. However, it is unknown how they perceive the world from a first-person perspective. In this study, we used machine learning to classify photos taken in three different categories (people, indoors, and outdoors) as either having been taken by individuals with ASD or by peers without ASD. Our classifier effectively discriminated photos from all three categories but was particularly successful at classifying photos of people, with >80% accuracy. Importantly, the visualization of our model revealed critical features that led to successful discrimination and showed that our model adopted a strategy similar to that of ASD experts. Furthermore, for the first time, we showed that photos taken by individuals with ASD contained fewer salient objects, especially in the central visual field. Notably, our model outperformed the classification of these photos by ASD experts. Together, we demonstrate an effective and novel method that is capable of discerning photos taken by individuals with ASD and revealing aberrant visual attention in ASD from a unique first-person perspective. Our method may in turn provide an objective measure for evaluations of individuals with ASD. People with ASD also show atypical behavior when performing the same actions as peers without ASD. However, it is challenging to efficiently extract this feature from spatial and temporal information. In this study, we applied a Graph Convolutional Network (GCN) to 2D skeleton sequences to classify videos recording the same actions (brushing teeth and washing face) as coming either from individuals with ASD or from peers without ASD. Furthermore, we adopted an adaptive graph mechanism that allows the model to learn a kernel flexibly and exclusively for each layer, which means the model can learn more useful and robust features. Our classifier reaches 80% accuracy. Our method may play an important role in the evaluations of individuals with ASD.