126 research outputs found

    Camouflaged Object Detection with Feature Grafting and Distractor Aware

    The task of Camouflaged Object Detection (COD) is to accurately segment camouflaged objects that blend into their environment, which is more challenging than ordinary detection because the textures of the target and the background are visually indistinguishable. In this paper, we propose a novel Feature Grafting and Distractor Aware network (FDNet) to handle the COD task. Specifically, we use a CNN and a Transformer to encode multi-scale images in parallel. To better exploit the advantages of the two encoders, we design a cross-attention-based Feature Grafting Module that grafts features extracted by the Transformer branch onto the CNN branch, after which the features are aggregated in a Feature Fusion Module. A Distractor Aware Module explicitly models the two possible kinds of distractors in the COD task to refine the coarse camouflage map. We also propose the largest artificial camouflaged object dataset, named ACOD2K, which contains 2000 images with annotations. We conducted extensive experiments on four widely used benchmark datasets and on ACOD2K. The results show that our method significantly outperforms other state-of-the-art methods. The code and ACOD2K will be available at https://github.com/syxvision/FDNet. (Comment: ICME2023 paper)
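
The grafting idea above can be pictured as standard cross-attention in which the CNN branch supplies the queries and the Transformer branch supplies the keys and values. Below is a minimal PyTorch sketch of such a module; the class name, dimensions, and residual/normalization choices are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of cross-attention feature grafting (illustrative, not the
# authors' code): CNN features act as queries, Transformer features as keys
# and values, so global context is "grafted" onto the CNN branch.
import torch
import torch.nn as nn

class FeatureGraftingSketch(nn.Module):
    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cnn_feat: torch.Tensor, trans_feat: torch.Tensor) -> torch.Tensor:
        # cnn_feat, trans_feat: (B, C, H, W) maps with matching channel count
        b, c, h, w = cnn_feat.shape
        q = cnn_feat.flatten(2).transpose(1, 2)     # (B, H*W, C) queries
        kv = trans_feat.flatten(2).transpose(1, 2)  # (B, H*W, C) keys/values
        grafted, _ = self.attn(q, kv, kv)
        out = self.norm(q + grafted)                # residual keeps CNN detail
        return out.transpose(1, 2).reshape(b, c, h, w)

# Usage: graft = FeatureGraftingSketch(dim=256); fused = graft(cnn_map, vit_map)
```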

    De-emphasis of distracting image regions using texture power maps

    We present a post-processing technique that selectively reduces the salience of distracting regions in an image. Computational models of attention predict that texture variation influences bottom-up attention mechanisms. Our method reduces the spatial variation of texture using power maps: high-order features describing local frequency content in an image. Modifying the power maps results in effective regional de-emphasis. We validate our results quantitatively via a human-subject search experiment and qualitatively with eye-tracking data. (Singapore-MIT Alliance (SMA))
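
As a rough illustration of the idea, a power map can be approximated as locally pooled high-frequency energy, and de-emphasis as blending marked regions toward a texture-suppressed version of the image. The sketch below is an assumed simplification (the paper's power maps are higher-order frequency features); the function names and the user-supplied distractor mask are hypothetical.

```python
# Assumed simplification of texture power maps: local high-frequency energy,
# pooled over a neighbourhood; de-emphasis blends masked regions toward a
# texture-suppressed image in proportion to that energy.
import numpy as np
from scipy import ndimage

def texture_power_map(img: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Squared band-pass (here: high-pass) response, spatially pooled."""
    highpass = img - ndimage.gaussian_filter(img, sigma)
    return ndimage.gaussian_filter(highpass ** 2, 4 * sigma)

def de_emphasize(img: np.ndarray, mask: np.ndarray, strength: float = 0.8) -> np.ndarray:
    """Reduce texture salience inside `mask` (values in [0, 1], hypothetical input)."""
    power = texture_power_map(img)
    flat = ndimage.gaussian_filter(img, 3.0)          # texture-suppressed version
    weight = strength * mask * power / (power.max() + 1e-8)
    return (1.0 - weight) * img + weight * flat
```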

    3D Target Detection and Spectral Classification for Single-photon LiDAR Data

    3D single-photon LiDAR imaging plays an important role in many applications. However, full deployment of this modality will require the analysis of low signal-to-noise-ratio target returns and very high volumes of data. This is particularly evident when imaging through obscurants or in high ambient background light. This paper proposes a multiscale approach for detecting 3D surfaces from the photon timing histogram, permitting a significant reduction in data volume. The resulting surfaces are background-free and can be used to infer depth and reflectivity information about the target. We demonstrate this by proposing a hierarchical Bayesian model for 3D reconstruction and spectral classification of multispectral single-photon LiDAR data. The reconstruction method promotes spatial correlation between point-cloud estimates and uses a coordinate gradient descent algorithm for parameter estimation. Results on simulated and real data show the benefits of the proposed target detection and reconstruction approaches when compared with state-of-the-art processing algorithms.
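
For intuition, the core detection step on a single pixel's photon timing histogram can be sketched as matched filtering against the instrument response followed by a background-aware threshold; depth then follows from two-way time of flight. This is a deliberately simplified stand-in for the paper's multiscale Bayesian method, and all names and thresholds below are assumptions.

```python
# Simplified per-pixel surface detection from a photon timing histogram:
# matched filter with the instrument response function (IRF), then accept the
# peak only if it exceeds a Poisson-style background threshold.
import numpy as np

C = 2.998e8  # speed of light, m/s

def detect_surface(hist: np.ndarray, irf: np.ndarray, bin_width_s: float,
                   k: float = 5.0):
    corr = np.correlate(hist - hist.mean(), irf - irf.mean(), mode="same")
    peak = int(np.argmax(corr))
    bg = np.median(hist)                            # crude background level
    if hist[peak] < bg + k * np.sqrt(bg + 1e-9):    # not significantly above noise
        return None                                 # report "no surface" (background-free)
    return peak * bin_width_s * C / 2.0             # depth in metres (two-way ToF)
```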

    Processing boundary and region features for perception

    A fundamental task for any visual system is the accurate detection of objects from background information, for example, defining fruit from foliage or a predator in a forest. This is commonly referred to as figure-ground segregation, which occurs when the visual system locates differences in visual features across an image, such as colour or texture. Combinations of feature contrast define an object from its surrounds, though the exact nature of that combination is still debated. Two processes are likely to contribute to object conspicuity: the pooling of features within an object's bounds relative to those in the background ('region' contrast), and the detection of feature contrast at the boundary itself ('boundary' contrast). Investigations of the relative contributions of these two processes to perception have produced sometimes contradictory findings, some of which can be explained by the methodology adopted in those studies. For example, results from several studies adopting search-based methodologies have advocated nonlinear interaction of the boundary and region processes, whereas results from more subjective methods have indicated a linear combination. This thesis aims to compare search and subjective methodologies to determine how visual features (region and boundary) interact, highlight limitations of these metrics, and then unpack the contributions of boundary and region processes in greater detail.

    The first and second experiments investigated the relative contributions of boundary strength, regional orientation, and regional spatial frequency to object conspicuity. This was achieved via a comparison of search and subjective methodologies, which, as mentioned, have previously produced conflicting results in this domain. The results indicated a relatively strong contribution of boundary features compared with region-based features, and replicated the apparent incongruence between findings from search-based and subjective metrics: results from the search task suggested nonlinear interaction, while those from the subjective task suggested linear interaction. A unifying model that reconciles these seemingly contradictory findings (and those in the literature) is then presented, which considers the effect of metric sensitivity and performance ceilings in the paradigms employed.

    In light of the findings from the first and second experiments, which suggest a stronger contribution of boundary information to object conspicuity, the third and fourth experiments investigated boundary features in more detail. Anecdotal reports from observers in the earlier experiments suggested that the conspicuity of boundaries is modulated by information in the background, regardless of boundary structure. As such, the relative contributions of boundary-background contrast and boundary composition were investigated using a novel stimulus generation technique that enables their effective isolation. A novel metric for boundary composition that correlates well with perception is also outlined. Results from those experiments suggested a significant contribution of both sources of boundary information, though they indicate a critical role for boundary-background contrast.

    The final experiment explored the contribution of region-based information to object conspicuity in more detail, specifically how higher-order image structure, such as the components of complex texture, contributes to conspicuity. A state-of-the-art texture synthesis model, which reproduces textures via mechanisms that mimic processes in the human visual system, is evaluated with respect to its perceptual applicability. Previous evaluations of this synthesis model are extended via a novel approach that enables the isolation of the model's parameters (which simulate physiological mechanisms) for independent examination. An alternative metric for the efficacy of the model is also presented.
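
For concreteness, the linear versus nonlinear combination rules under debate can be written as a weighted sum versus a Minkowski-style pooling of the two contrasts, with a ceiling term standing in for the performance ceilings the unifying model invokes. This toy sketch is a hypothetical formulation, not the thesis's model.

```python
# Toy formulation of the two combination rules under debate (hypothetical,
# not the thesis's model): a weighted sum versus Minkowski-style pooling of
# boundary and region contrast, with an optional performance ceiling.
def linear_conspicuity(boundary: float, region: float,
                       wb: float = 0.7, wr: float = 0.3) -> float:
    """Linear rule: cues add independently."""
    return wb * boundary + wr * region

def nonlinear_conspicuity(boundary: float, region: float, p: float = 3.0) -> float:
    """Nonlinear rule: as p grows, the stronger cue dominates."""
    return (boundary ** p + region ** p) ** (1.0 / p)

def observed(score: float, ceiling: float = 1.0) -> float:
    """A hard ceiling can mask the difference between the two rules."""
    return min(score, ceiling)
```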

    Frequency Perception Network for Camouflaged Object Detection

    Camouflaged object detection (COD) aims to accurately detect objects hidden in their surrounding environment. However, existing COD methods mainly locate camouflaged objects in the RGB domain, and their potential has not been fully exploited in many challenging scenarios. Considering that the features of a camouflaged object and its background are more discriminative in the frequency domain, we propose a novel learnable and separable frequency perception mechanism driven by the semantic hierarchy in the frequency domain. Our network adopts a two-stage design comprising a frequency-guided coarse localization stage and a detail-preserving fine localization stage. Using the multi-level features extracted by the backbone, we design a flexible frequency perception module based on octave convolution for coarse positioning. We then design a correction fusion module that integrates the high-level features step by step through prior-guided correction and cross-layer feature channel association, and finally combines them with the shallow features to achieve detailed correction of the camouflaged objects. Compared with existing models, our method achieves competitive performance on three popular benchmark datasets, both qualitatively and quantitatively. (Comment: Accepted by ACM MM 2023)
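
Since the frequency perception module is described as building on octave convolution, a minimal sketch of that operator (Chen et al., 2019) may help: channels are split into a full-resolution high-frequency path and a half-resolution low-frequency path, with cross-path exchange. This is the generic operator only, with illustrative names, not the paper's module.

```python
# Minimal octave convolution sketch: high- and low-frequency channel groups at
# different resolutions, with cross-path information exchange.
import torch
import torch.nn as nn
import torch.nn.functional as F

class OctaveConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, alpha: float = 0.5):
        super().__init__()
        lo_in, lo_out = int(alpha * in_ch), int(alpha * out_ch)
        hi_in, hi_out = in_ch - lo_in, out_ch - lo_out
        self.h2h = nn.Conv2d(hi_in, hi_out, 3, padding=1)  # high -> high
        self.h2l = nn.Conv2d(hi_in, lo_out, 3, padding=1)  # high -> low
        self.l2h = nn.Conv2d(lo_in, hi_out, 3, padding=1)  # low -> high
        self.l2l = nn.Conv2d(lo_in, lo_out, 3, padding=1)  # low -> low

    def forward(self, x_h: torch.Tensor, x_l: torch.Tensor):
        # x_h: (B, hi_in, H, W); x_l: (B, lo_in, H/2, W/2)
        y_h = self.h2h(x_h) + F.interpolate(self.l2h(x_l), scale_factor=2, mode="nearest")
        y_l = self.l2l(x_l) + self.h2l(F.avg_pool2d(x_h, 2))
        return y_h, y_l
```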