3 research outputs found

    Context-Constrained Accurate Contour Extraction for Occlusion Edge Detection

    Occlusion edge detection requires both accurate contour localization and contextual constraints on the contour. Existing CNN-based pipelines do not use adaptive methods to filter the noise introduced by low-level features. To address this dilemma, we propose a novel Context-constrained accurate Contour Extraction Network (CCENet). Spatial details are retained and contour-sensitive context is augmented through two separate extraction blocks. An elaborately designed fusion module then integrates the two kinds of features, playing a complementary role that restores details and removes clutter. Finally, the weight response of an attention mechanism is used to enhance occluded contours and suppress noise. The proposed CCENet significantly surpasses state-of-the-art methods on the PIOD and BSDS ownership datasets for both object edge detection and occlusion orientation detection.
    Comment: To appear in ICME 2019
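    As an illustration of the kind of feature fusion the abstract describes, below is a minimal PyTorch-style sketch of a module that merges a detail branch with a context branch and re-weights the result with an attention response. The layer shapes, channel count, and the exact form of the attention are assumptions for illustration, not the authors' architecture.

```python
# Hedged sketch (not the authors' code): merge a detail branch and a context
# branch, then re-weight the fused features with a learned attention response
# so contour-like activations are enhanced and noise is suppressed.
import torch
import torch.nn as nn

class AttentiveFusion(nn.Module):
    def __init__(self, channels: int = 64):
        super().__init__()
        # 1x1 conv merges the concatenated detail/context features
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)
        # per-pixel attention map (weight response) in [0, 1]
        self.attend = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        self.head = nn.Conv2d(channels, 1, kernel_size=1)  # occlusion-edge logits

    def forward(self, detail_feat, context_feat):
        fused = self.merge(torch.cat([detail_feat, context_feat], dim=1))
        weight = self.attend(fused)
        fused = fused + fused * weight   # emphasize responses the attention keeps
        return self.head(fused)

# usage: logits = AttentiveFusion(64)(detail, context)  # two (N, 64, H, W) maps
```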

    Occlusion-shared and Feature-separated Network for Occlusion Relationship Reasoning

    Occlusion relationship reasoning demands a closed contour to delineate each object and an orientation at each contour pixel to describe the order relationship between objects. Current CNN-based methods neglect two critical issues of the task: (1) the simultaneous existence of relevance and distinction between the two elements, i.e., the occlusion edge and the occlusion orientation; and (2) inadequate exploration of the orientation features. For these reasons, we propose the Occlusion-shared and Feature-separated Network (OFNet). On one hand, considering the relevance between edge and orientation, two sub-networks are designed to share the occlusion cue. On the other hand, the whole network is split into two paths to learn the high-level semantic features separately. Moreover, a contextual feature for orientation prediction is extracted, which represents the bilateral cue of the foreground and background regions. The bilateral cue is then fused with the occlusion cue to precisely locate the object regions. Finally, a stripe convolution is designed to further aggregate features from the scenes surrounding the occlusion edge. The proposed OFNet remarkably advances the state-of-the-art approaches on the PIOD and BSDS ownership datasets. The source code is available at https://github.com/buptlr/OFNet.
    Comment: Accepted by ICCV 2019. Code and pretrained model are available at https://github.com/buptlr/OFNet
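    The stripe convolution mentioned in the abstract can be pictured as a pair of 1×k and k×1 convolutions that aggregate context along the two image axes around the occlusion edge. The sketch below is an assumed illustration of that idea (kernel length, channel count, and how the two responses are combined are guesses), not the released OFNet implementation.

```python
# Hedged sketch: strip-shaped receptive fields that gather context along
# horizontal and vertical stripes, then combine the two directions.
import torch
import torch.nn as nn

class StripeConv(nn.Module):
    def __init__(self, channels: int = 64, k: int = 7):
        super().__init__()
        # horizontal strip: 1 x k receptive field
        self.horiz = nn.Conv2d(channels, channels, (1, k), padding=(0, k // 2))
        # vertical strip: k x 1 receptive field
        self.vert = nn.Conv2d(channels, channels, (k, 1), padding=(k // 2, 0))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # each pixel aggregates features from both stripe directions
        return self.relu(self.horiz(x) + self.vert(x))

# usage: out = StripeConv(64, k=7)(features)  # features: (N, 64, H, W)
```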

    Fusion with Diffusion for Robust Visual Tracking

    A weighted graph is the underlying structure of many algorithms, such as semi-supervised learning and spectral clustering. If the edge weights are determined by a single similarity measure, it is hard, if not impossible, to capture all relevant aspects of similarity. In particular, for visual object matching it is beneficial to integrate different similarity measures that focus on different visual representations. In this paper, a novel approach to integrating multiple similarity measures is proposed. First, pairs of similarity measures are combined with a diffusion process on their tensor product graph (TPG). The diffused similarity of each pair of objects thus becomes a function of the joint diffusion of the two original similarities, which in turn depends on the neighborhood structure of the TPG. We call this process Fusion with Diffusion (FD). A higher-order graph like the TPG usually implies a significant increase in time complexity, but this is not the case in the proposed approach: a key feature is that diffusion on the TPG has the same time complexity as the diffusion process on each of the original graphs, and it is not necessary to explicitly construct the TPG. Finally, all diffused pairs of similarity measures are combined as a weighted sum. We demonstrate the advantages of the proposed approach on the task of visual tracking, where different aspects of the appearance similarity between the target object in frame t − 1 and target object candidates in frame t are integrated. The obtained method is tested on several challenging video sequences, and the experimental results show that it outperforms state-of-the-art tracking methods.
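    One way to run diffusion on a tensor product graph without ever forming the n²×n² matrix is to iterate directly on n×n similarity matrices, using the identity vec(A S Bᵀ) = (B ⊗ A) vec(S). The sketch below illustrates that idea; the specific update rule, row normalization, damping factor, and iteration count are assumptions, not the paper's exact formulation.

```python
# Hedged sketch of joint (tensor-product-graph) diffusion over two similarity
# matrices A and B, never constructing the n^2 x n^2 Kronecker product.
import numpy as np

def tpg_diffusion(A, B, iters=20, alpha=0.9):
    """Jointly diffuse two n x n similarity matrices A and B.

    The n x n iterate S stands in for the vectorized n^2-dimensional state on
    the TPG: vec(S) <- alpha * (B kron A) vec(S) + vec(I) is computed as
    S <- alpha * A @ S @ B.T + I, costing O(n^3) per step -- the same order
    as running a diffusion on either original graph.
    """
    n = A.shape[0]
    # row-normalize the similarities (assumed preprocessing; together with
    # alpha < 1 this keeps the iteration bounded)
    A = A / (A.sum(axis=1, keepdims=True) + 1e-12)
    B = B / (B.sum(axis=1, keepdims=True) + 1e-12)
    S = np.eye(n)
    for _ in range(iters):
        S = alpha * (A @ S @ B.T) + np.eye(n)
    return S

# Fusion with Diffusion, as described above: diffuse every pair of similarity
# measures this way, then combine the diffused matrices as a weighted sum.
```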