
    Self-supervised Blur Detection from Synthetically Blurred Scenes

    Blur detection aims at segmenting the blurred areas of a given image. Recent deep learning-based methods approach this problem by learning an end-to-end mapping between the blurred input and a binary mask representing the localization of its blurred areas. Nevertheless, the effectiveness of such deep models is limited by the scarcity of datasets annotated in terms of blur segmentation, as blur annotation is labour intensive. In this work, we bypass the need for such annotated datasets for end-to-end learning and instead rely on object proposals and a model for blur generation to produce a dataset of synthetically blurred images. This allows us to perform self-supervised learning over the generated pairs of images and ground-truth blur masks using CNNs, defining a framework that can be employed in purely self-supervised, weakly supervised or semi-supervised configurations. Interestingly, experimental results of such setups over the largest blur segmentation datasets available show that this approach achieves state-of-the-art results in blur segmentation, even without ever observing any real blurred image.

    This research was partially funded by the Basque Government’s Industry Department under the ELKARTEK program’s project ONKOIKER under agreement KK2018/00090. We thank the Spanish project TIN2016-79717-R and acknowledge the Generalitat de Catalunya CERCA Program.
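    As a hedged illustration of the data-generation idea above, the following sketch builds a (synthetically blurred image, ground-truth blur mask) training pair. The random rectangle stands in for an object proposal and a single Gaussian kernel for the paper's blur generation model; all names and parameters are illustrative, not the authors' implementation.

```python
import cv2
import numpy as np

def make_blur_pair(image: np.ndarray, ksize: int = 15, sigma: float = 5.0):
    """Blur a random region of `image`; return (blurred image, binary blur mask)."""
    h, w = image.shape[:2]
    mask = np.zeros((h, w), dtype=np.uint8)
    # Stand-in for an object proposal: a random box covering part of the image.
    x0, y0 = np.random.randint(0, w // 2), np.random.randint(0, h // 2)
    x1, y1 = np.random.randint(w // 2, w), np.random.randint(h // 2, h)
    mask[y0:y1, x0:x1] = 1
    # Simplified blur model: Gaussian-blur the whole image, then composite it
    # into the original only where the mask is set.
    blurred_full = cv2.GaussianBlur(image, (ksize, ksize), sigma)
    blurred = np.where(mask[..., None] == 1, blurred_full, image)
    return blurred, mask  # self-supervised (input, target) pair
```

    A segmentation CNN can then be trained on such pairs with an ordinary per-pixel loss, without any manually annotated blur masks.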

    Explicit Visual Prompting for Universal Foreground Segmentations

    Foreground segmentation is a fundamental problem in computer vision that includes salient object detection, forgery detection, defocus blur detection, shadow detection, and camouflaged object detection. Previous works have typically relied on domain-specific solutions to address accuracy and robustness issues in those applications. In this paper, we present a unified framework for a number of foreground segmentation tasks without any task-specific designs. We take inspiration from the widely used pre-training and then prompt tuning protocols in NLP and propose a new visual prompting model, named Explicit Visual Prompting (EVP). Unlike previous visual prompting, which is typically a dataset-level implicit embedding, our key insight is to make the tunable parameters focus on the explicit visual content of each individual image, i.e., the features from frozen patch embeddings and high-frequency components. Our method freezes a pre-trained model and then learns task-specific knowledge using a few extra parameters. Despite introducing only a small number of tunable parameters, EVP achieves superior performance to full fine-tuning and other parameter-efficient fine-tuning methods. Experiments on fourteen datasets across five tasks show that the proposed method outperforms task-specific methods while being considerably simpler. The proposed method also demonstrates scalability across different architectures, pre-trained weights, and tasks. The code is available at: https://github.com/NiFangBaAGe/Explicit-Visual-Prompt

    Comment: arXiv admin note: substantial text overlap with arXiv:2303.1088
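    The high-frequency components mentioned in the abstract can be illustrated with a small sketch: one common way to obtain them is to mask out the low frequencies in the Fourier domain and invert the transform. The FFT-masking approach and the mask ratio below are assumptions of this sketch, not values taken from the paper.

```python
import torch

def high_freq_components(x: torch.Tensor, ratio: float = 0.25) -> torch.Tensor:
    """x: (B, C, H, W) image batch; suppress low frequencies, return the rest."""
    _, _, H, W = x.shape
    freq = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
    cy, cx = H // 2, W // 2
    ry, rx = int(H * ratio / 2), int(W * ratio / 2)
    freq[..., cy - ry:cy + ry, cx - rx:cx + rx] = 0  # zero the low-frequency band
    return torch.fft.ifft2(torch.fft.ifftshift(freq, dim=(-2, -1))).real
```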

    Nonlinear kernel based feature maps for blur-sensitive unsharp masking of JPEG images

    In this paper, a method for estimating the blurred regions of an image is first proposed, resorting to a mixture of linear and nonlinear convolutional kernels. The resulting blur map is then used to enhance images such that the enhancement strength is an inverse function of the amount of measured blur. The blur map can also be used for tasks such as attention-based object classification, low-light image enhancement, and more. A CNN architecture with nonlinear upsampling layers is trained on a standard blur detection benchmark dataset with the help of blur target maps. Furthermore, we propose to use the same architecture to build maps of areas affected by the typical JPEG artifacts, ringing and blockiness. Together, the blur map and the artifact map allow building an activation map for the enhancement of a (possibly JPEG-compressed) image. Extensive experiments on standard test images verify the quality of the maps obtained with the algorithm and their effectiveness in locally controlling the enhancement for superior perceptual quality. Last but not least, the computation time for generating these maps is much lower than that of other comparable algorithms.
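    To make "enhancement strength as an inverse function of measured blur" concrete, here is a minimal sketch of blur-adaptive unsharp masking. The per-pixel gain law `base_gain * (1 - blur_map)` is an illustrative choice, and `blur_map` (values in [0, 1], higher meaning more blurred) stands in for the CNN-predicted map; neither is taken from the paper.

```python
import cv2
import numpy as np

def adaptive_unsharp(image: np.ndarray, blur_map: np.ndarray,
                     base_gain: float = 1.5, sigma: float = 2.0) -> np.ndarray:
    """image: HxWx3 uint8; blur_map: HxW float in [0, 1]."""
    img = image.astype(np.float32)
    low = cv2.GaussianBlur(img, (0, 0), sigma)   # kernel size derived from sigma
    detail = img - low                           # high-frequency detail layer
    gain = base_gain * (1.0 - blur_map)          # weaker boost where blur is high
    out = img + gain[..., None] * detail         # per-pixel modulated sharpening
    return np.clip(out, 0, 255).astype(np.uint8)
```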

    Defocus Image Deblurring Network with Defocus Map Estimation as Auxiliary Task

    Unlike object motion blur, defocus blur is caused by the limited depth of field of cameras. The defocus amount can be characterized by the parameter of the point spread function and thus forms a defocus map. In this paper, we propose a new network architecture called Defocus Image Deblurring Auxiliary Learning Net (DID-ANet), which is specifically designed for single-image defocus deblurring and uses defocus map estimation as an auxiliary task to improve the deblurring result. To facilitate the training of the network, we build a novel large-scale dataset for single-image defocus deblurring, which contains defocus images, defocus maps and all-sharp images. To the best of our knowledge, this is the first large-scale defocus deblurring dataset for training deep networks. Moreover, the experimental results demonstrate that the proposed DID-ANet outperforms state-of-the-art methods for both defocus image deblurring and defocus map estimation, both quantitatively and qualitatively. The dataset, code, and model are available on GitHub: https://github.com/xytmhy/DID-ANet-Defocus-Deblurring
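    The auxiliary-task setup described above amounts to a shared encoder feeding two heads, one predicting the sharp image and one the defocus map, trained with a weighted sum of losses. The toy layer sizes, head design, and weight `alpha` below are assumptions for illustration, not the DID-ANet architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AuxDeblurNet(nn.Module):
    """Shared encoder with a deblurring head and an auxiliary defocus-map head."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.deblur_head = nn.Conv2d(ch, 3, 3, padding=1)   # predicts sharp image
        self.defocus_head = nn.Conv2d(ch, 1, 3, padding=1)  # predicts defocus map

    def forward(self, x):
        feats = self.encoder(x)
        return self.deblur_head(feats), self.defocus_head(feats)

def joint_loss(pred_sharp, pred_map, gt_sharp, gt_map, alpha: float = 0.1):
    # Main deblurring loss plus a weighted auxiliary defocus-map loss.
    return F.l1_loss(pred_sharp, gt_sharp) + alpha * F.l1_loss(pred_map, gt_map)
```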

    Intriguing Findings of Frequency Selection for Image Deblurring

    Blur is naturally analyzed in the frequency domain, by estimating the latent sharp image and the blur kernel given a blurry image. Recent progress on image deblurring instead designs end-to-end architectures that learn the difference between blurry and sharp image pairs at the pixel level, which inevitably overlooks the importance of blur kernels. This paper reveals an intriguing phenomenon: simply applying a ReLU operation in the frequency domain of a blurry image, followed by an inverse Fourier transform, i.e., frequency selection, provides faithful information about the blur pattern (e.g., the blur direction and blur level, implicitly revealing the kernel pattern). Based on this observation, we leverage kernel-level information for image deblurring networks by inserting a Fourier transform, a ReLU operation, and an inverse Fourier transform into the standard ResBlock. A 1x1 convolution is further added to let the network modulate flexible thresholds for frequency selection. We term the newly built block the Res FFT-ReLU Block; it takes advantage of both kernel-level and pixel-level features by learning frequency-spatial dual-domain representations. Extensive experiments are conducted to provide a thorough analysis of the insights behind the method. Moreover, after plugging the proposed block into NAFNet, we achieve 33.85 dB PSNR on the GoPro dataset. Our method noticeably improves backbone architectures without introducing many parameters, while maintaining low computational complexity. Code is available at https://github.com/DeepMed-Lab/DeepRFT-AAAI2023

    Comment: AAAI 2023
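    Since the abstract spells out the block's ingredients (Fourier transform, ReLU, inverse Fourier transform, and a 1x1 convolution inside a ResBlock), a compact PyTorch sketch is possible. Treating the real and imaginary FFT components as stacked channels is an assumption of this sketch; consult the linked repository for the authors' exact design.

```python
import torch
import torch.nn as nn

class ResFFTReLUBlock(nn.Module):
    """Residual block with a spatial branch and an FFT-ReLU frequency branch."""
    def __init__(self, ch: int):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))
        # 1x1 convs around the ReLU let the network learn flexible
        # frequency-selection thresholds, as described in the abstract.
        self.freq = nn.Sequential(
            nn.Conv2d(2 * ch, 2 * ch, 1), nn.ReLU(),
            nn.Conv2d(2 * ch, 2 * ch, 1))

    def forward(self, x):
        _, _, H, W = x.shape
        f = torch.fft.rfft2(x, norm="ortho")
        z = torch.cat([f.real, f.imag], dim=1)      # complex -> stacked channels
        z = self.freq(z)
        re, im = z.chunk(2, dim=1)
        freq_out = torch.fft.irfft2(torch.complex(re, im), s=(H, W), norm="ortho")
        return x + self.spatial(x) + freq_out       # residual fusion of branches
```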