172 research outputs found

    Fast Deep Matting for Portrait Animation on Mobile Phone

    Image matting plays an important role in image and video editing. However, the formulation of image matting is inherently ill-posed. Traditional methods usually rely on user interaction, in the form of trimaps and strokes, to solve the matting problem, and cannot run on a mobile phone in real time. In this paper, we propose a real-time automatic deep matting approach for mobile devices. By leveraging densely connected blocks and dilated convolution, a light fully convolutional network is designed to predict a coarse binary mask for portrait images. A feathering block, which is edge-preserving and matting-adaptive, is further developed to learn the guided filter and transform the binary mask into an alpha matte. Finally, an automatic portrait animation system based on fast deep matting is built on mobile devices; it needs no interaction and achieves real-time matting at 15 fps. The experiments show that the proposed approach achieves results comparable to state-of-the-art matting solvers. Comment: ACM Multimedia Conference (MM) 2017 camera-ready
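
To make the "binary mask to alpha matte" step more concrete, here is a minimal sketch of the classic (non-learned) guided filter on a 1D signal. It is not the paper's feathering block, which learns its filter; the function names, the window radius `r`, the regularizer `eps`, and the toy data are all illustrative assumptions.

```python
def box_mean(x, r):
    """Mean over a window of radius r, clipped at the borders."""
    n = len(x)
    out = []
    for i in range(n):
        lo, hi = max(0, i - r), min(n, i + r + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def guided_filter_1d(guide, mask, r=2, eps=1e-3):
    """Classic guided filter: fit mask ~ a * guide + b in each local window."""
    mean_i = box_mean(guide, r)
    mean_p = box_mean(mask, r)
    corr_ii = box_mean([g * g for g in guide], r)
    corr_ip = box_mean([g * p for g, p in zip(guide, mask)], r)
    # Per-window linear coefficients: a = cov(I, p) / (var(I) + eps).
    a = [(cip - mi * mp) / (cii - mi * mi + eps)
         for cip, cii, mi, mp in zip(corr_ip, corr_ii, mean_i, mean_p)]
    b = [mp - ai * mi for mp, ai, mi in zip(mean_p, a, mean_i)]
    # Average the coefficients of all windows covering each pixel.
    mean_a = box_mean(a, r)
    mean_b = box_mean(b, r)
    return [ma * g + mb for ma, mb, g in zip(mean_a, mean_b, guide)]


# Hypothetical data: a soft intensity edge and a hard binary mask.
guide = [0.1, 0.1, 0.2, 0.5, 0.8, 0.9, 0.9, 0.9]
mask = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
alpha = guided_filter_1d(guide, mask)
```

In this toy case the hard 0/1 step in `mask` becomes a fractional transition in `alpha` that follows the soft edge in `guide`. The output is not clamped to [0, 1]; a practical pipeline would likely clip it.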

    Edge-enhancing Filters with Negative Weights

    In [DOI:10.1109/ICMEW.2014.6890711], graph-based denoising is performed by projecting the noisy image onto a lower-dimensional Krylov subspace of the graph Laplacian, constructed using nonnegative weights determined by distances between image data corresponding to image pixels. We extend the construction of the graph Laplacian to the case where some graph weights can be negative. Removing the positivity constraint provides a more accurate inference of the graph model behind the data, and thus can improve the quality of filters for graph-based signal processing, e.g., denoising, compared to the standard construction, without affecting the costs. Comment: 5 pages; 6 figures. Accepted to IEEE GlobalSIP 2015 conference
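
As a rough illustration of the construction this abstract describes, the sketch below builds a graph Laplacian L = D - W for a path graph over a 1D signal and lets one edge weight be negative. The weight values, the function names, and the single explicit smoothing step are illustrative assumptions; the paper itself filters by projecting onto a Krylov subspace of the Laplacian, which is not reproduced here.

```python
def path_laplacian(weights):
    """Laplacian L = D - W of a path graph; weights[i] links nodes i and i+1.

    Weights may be negative (assumption mirroring the abstract's extension).
    """
    n = len(weights) + 1
    L = [[0.0] * n for _ in range(n)]
    for i, w in enumerate(weights):
        L[i][i] += w
        L[i + 1][i + 1] += w
        L[i][i + 1] -= w
        L[i + 1][i] -= w
    return L


def smooth_step(L, x, step=0.25):
    """One explicit diffusion step x - step * L x (illustrative filter only)."""
    return [xi - step * sum(lij * xj for lij, xj in zip(row, x))
            for row, xi in zip(L, x)]


# Hypothetical noisy signal with an edge between nodes 2 and 3.
signal = [1.0, 1.2, 0.9, 5.1, 4.8, 5.0]
nonneg = path_laplacian([1.0, 1.0, 0.0, 1.0, 1.0])   # link across the edge cut
signed = path_laplacian([1.0, 1.0, -0.2, 1.0, 1.0])  # link across the edge negative

smoothed_nonneg = smooth_step(nonneg, signal)
smoothed_signed = smooth_step(signed, signal)
```

With the nonnegative weights the step across the edge shrinks slightly, because each side is smoothed toward its neighbors; with the negative weight the two sides are pushed apart, which is the edge-enhancing behavior the title refers to. Rows of the Laplacian still sum to zero in both cases, so constant signals pass through unchanged.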

    Object Discovery via Cohesion Measurement

    Color and intensity are two important components of an image. Usually, groups of image pixels that are similar in color or intensity form an informative representation of an object. They are therefore particularly suitable for computer vision tasks such as saliency detection and object proposal generation. However, image pixels that share a similar real-world color may appear quite different, since colors are often distorted by intensity. In this paper, we reinvestigate the affinity matrices originally used in image segmentation methods based on spectral clustering. A new affinity matrix, which is robust to color distortions, is formulated for object discovery. Moreover, a Cohesion Measurement (CM) for object regions is derived from the formulated affinity matrix. Based on the new Cohesion Measurement, a novel object discovery method is proposed to discover objects latent in an image by utilizing the eigenvectors of the affinity matrix. We then apply the proposed method to both saliency detection and object proposal generation. Experimental results on several evaluation benchmarks demonstrate that the proposed CM-based method achieves promising performance on these two tasks. Comment: 14 pages, 14 figures

    3D Depth Reconstruction and Depth Refinement from a Focal Stack

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, February 2021. 신영길. Three-dimensional (3D) depth recovery from two-dimensional images is a fundamental and challenging objective in computer vision, and is one of the most important prerequisites for many applications such as 3D measurement, robot location and navigation, and self-driving. Depth-from-focus (DFF) is one of the important methods for reconstructing 3D depth using focus information. Reconstructing 3D depth in texture-less regions is a typical issue associated with conventional DFF. Furthermore, it is difficult for conventional DFF reconstruction techniques to preserve depth edges and fine details while maintaining spatial consistency. In this dissertation, we address these problems and propose a DFF depth recovery framework that is robust over texture-less regions and can reconstruct a depth image with clear edges and fine details. The depth recovery framework proposed in this dissertation is composed of two processes: depth reconstruction and depth refinement. To recover an accurate 3D depth, we first formulate the depth reconstruction as a maximum a posteriori (MAP) estimation problem that includes a matting Laplacian prior. The nonlocal principle is adopted during the construction of the matting Laplacian matrix to preserve depth edges and fine details. Additionally, a depth-variance-based confidence measure, combined with the reliability of the focus measure, is proposed to maintain spatial smoothness, such that smooth depth regions in the initial depth have high confidence values and the reconstructed depth is derived more from the initial depth. As the nonlocal principle breaks spatial consistency, the reconstructed depth image is spatially inconsistent. Meanwhile, it suffers from texture-copy artifacts. 
To smooth the noise and suppress the texture-copy artifacts introduced in the reconstructed depth image, we propose a closed-form edge-preserving depth refinement algorithm that formulates the depth refinement as a MAP estimation problem using Markov random fields (MRFs). By incorporating pre-estimated depth edges and mutual structure information into the energy function, together with a specially designed smoothness weight, the proposed refinement method can effectively suppress noise and texture-copy artifacts while preserving depth edges. Additionally, by constructing an undirected weighted graph that represents the energy function, a closed-form solution is obtained using the Laplacian matrix corresponding to the graph. The proposed framework presents a novel method of 3D depth recovery from a focal stack. The proposed algorithm shows superior depth recovery over texture-less regions owing to the effective variance-based confidence computation and the matting Laplacian prior. Additionally, the proposed reconstruction method can obtain a depth image with clear edges and fine details owing to the adoption of the nonlocal principle in the construction of the matting Laplacian matrix. The proposed closed-form depth refinement approach is able to remove noise while preserving object structure by using common edges. Additionally, it is able to effectively suppress texture-copy artifacts by utilizing mutual structure information. The proposed depth refinement provides a general idea for edge-preserving image smoothing, especially for depth-related refinement such as stereo vision. Both quantitative and qualitative experimental results show the superiority of the proposed method in terms of robustness in texture-less regions, accuracy, and ability to preserve object structure while maintaining spatial smoothness.
    Contents:
    Chapter 1 Introduction: 1.1 Overview; 1.2 Motivation; 1.3 Contribution; 1.4 Organization
    Chapter 2 Related Works: 2.1 Overview; 2.2 Principle of depth-from-focus (2.2.1 Focus measure operators); 2.3 Depth-from-focus reconstruction; 2.4 Edge-preserving image denoising
    Chapter 3 Depth-from-Focus Reconstruction using Nonlocal Matting Laplacian Prior: 3.1 Overview; 3.2 Image matting and matting Laplacian; 3.3 Depth-from-focus; 3.4 Depth reconstruction (3.4.1 Problem statement; 3.4.2 Likelihood model; 3.4.3 Nonlocal matting Laplacian prior model); 3.5 Experimental results (3.5.1 Overview; 3.5.2 Data configuration; 3.5.3 Reconstruction results; 3.5.4 Comparison between reconstruction using local and nonlocal matting Laplacian; 3.5.5 Spatial consistency analysis; 3.5.6 Parameter setting and analysis); 3.6 Summary
    Chapter 4 Closed-form MRF-based Depth Refinement: 4.1 Overview; 4.2 Problem statement; 4.3 Closed-form solution; 4.4 Edge preservation; 4.5 Texture-copy artifacts suppression; 4.6 Experimental results; 4.7 Summary
    Chapter 5 Evaluation: 5.1 Overview; 5.2 Evaluation metrics; 5.3 Evaluation on synthetic datasets; 5.4 Evaluation on real scene datasets; 5.5 Limitations; 5.6 Computational performances
    Chapter 6 Conclusion
    Bibliography
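
The closed-form, Laplacian-based refinement mentioned in this abstract can be illustrated on a toy 1D depth profile. The sketch below assumes a generic quadratic energy, sum_i c_i (d_i - d0_i)^2 + lam * sum_i w_i (d_i - d_{i+1})^2, whose minimizer solves the linear system (C + lam * L) d = C d0. The energy form, the names `refine_depth`, `conf`, `weights`, and `lam`, and all values are assumptions for illustration; the dissertation's actual energy, confidence measure, and smoothness weights are not given in the abstract.

```python
def refine_depth(d0, conf, weights, lam=1.0):
    """Solve (C + lam * L) d = C d0 on a 1D chain with the Thomas algorithm.

    d0:      initial (noisy) depth values
    conf:    per-pixel data confidence c_i (assumed positive)
    weights: smoothness weight w_i between pixels i and i+1
    """
    n = len(d0)
    diag = list(conf)
    off = [-lam * w for w in weights]  # symmetric sub- and super-diagonal
    for i, w in enumerate(weights):
        diag[i] += lam * w
        diag[i + 1] += lam * w
    rhs = [c * v for c, v in zip(conf, d0)]
    # Forward elimination of the tridiagonal system.
    for i in range(1, n):
        m = off[i - 1] / diag[i - 1]
        diag[i] -= m * off[i - 1]
        rhs[i] -= m * rhs[i - 1]
    # Back substitution.
    d = [0.0] * n
    d[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        d[i] = (rhs[i] - off[i] * d[i + 1]) / diag[i]
    return d


# Hypothetical noisy depth with a true depth edge between pixels 3 and 4.
d0 = [1.0, 1.3, 0.8, 1.1, 3.0, 3.2, 2.9, 3.1]
conf = [1.0] * len(d0)
# A small weight at the (assumed pre-estimated) depth edge keeps it sharp.
weights = [1.0, 1.0, 1.0, 0.01, 1.0, 1.0, 1.0]
lam = 2.0
refined = refine_depth(d0, conf, weights, lam)
```

Lowering the smoothness weight where a depth edge is expected is one simple way to get edge-preserving behavior from a quadratic energy: the two flat regions are smoothed independently while the jump between them is largely retained.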

    CAP-VSTNet: Content Affinity Preserved Versatile Style Transfer

    Content affinity loss, including feature and pixel affinity, is a main problem that leads to artifacts in photorealistic and video style transfer. This paper proposes a new framework named CAP-VSTNet, which consists of a new reversible residual network and an unbiased linear transform module, for versatile style transfer. The reversible residual network not only preserves content affinity but also avoids introducing redundant information, unlike traditional reversible networks, and hence facilitates better stylization. Empowered by a Matting Laplacian training loss, which addresses the pixel affinity loss caused by the linear transform, the proposed framework is applicable and effective for versatile style transfer. Extensive experiments show that CAP-VSTNet produces better qualitative and quantitative results than state-of-the-art methods. Comment: CVPR 202

    Context-Aware Image Matting for Simultaneous Foreground and Alpha Estimation

    Natural image matting is an important problem in computer vision and graphics. It is an ill-posed problem when only an input image is available without any external information. While recent deep learning approaches have shown promising results, they only estimate the alpha matte. This paper presents a context-aware natural image matting method for simultaneous foreground and alpha matte estimation. Our method employs two encoder networks to extract essential information for matting. In particular, we use a matting encoder to learn local features and a context encoder to obtain more global context information. We concatenate the outputs from these two encoders and feed them into decoder networks to simultaneously estimate the foreground and alpha matte. To train the whole deep neural network, we employ both the standard Laplacian loss and the feature loss: the former helps to achieve high numerical performance, while the latter leads to more perceptually plausible results. We also report several data augmentation strategies that greatly improve the network's generalization performance. Our qualitative and quantitative experiments show that our method enables high-quality matting for a single natural image. Our inference code and models have been made publicly available at https://github.com/hqqxyy/Context-Aware-Matting. Comment: This is the camera-ready version of the ICCV 2019 paper

    Segmentation by transduction

    This paper addresses the problem of segmenting an image into regions consistent with user-supplied seeds (e.g., a sparse set of broad brush strokes). We view this task as statistical transductive inference, in which some pixels are already associated with given zones and the remaining ones need to be classified. Our method relies on the Laplacian graph regularizer, a powerful manifold learning tool that is based on the estimation of variants of the Laplace-Beltrami operator and is closely related to diffusion processes. Segmentation is modeled as the task of finding matting coefficients for unclassified pixels given known matting coefficients for seed pixels. The proposed algorithm essentially relies on a high margin assumption in the space of pixel characteristics. It is simple, fast, and accurate, as demonstrated by qualitative results on natural images and a quantitative comparison with state-of-the-art methods on the Microsoft GrabCut segmentation database.
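
The idea of inferring coefficients for unclassified pixels from seed pixels through a graph Laplacian can be sketched with generic harmonic label propagation on a 1D pixel chain. This is a textbook stand-in, not the paper's algorithm: the Gaussian intensity weights, the `sigma` value, the Gauss-Seidel solver, and the names below are all assumptions made for illustration.

```python
import math


def propagate_labels(intensity, seeds, sigma=0.1, iters=500):
    """Harmonic extension of seed labels over a chain of pixels.

    Each unlabeled pixel is repeatedly set to the weighted mean of its
    neighbors, which is a Gauss-Seidel solve of L_uu a_u = -L_us a_s.
    """
    n = len(intensity)
    # Neighboring pixels with similar intensity get weights close to 1.
    w = [math.exp(-((intensity[i] - intensity[i + 1]) / sigma) ** 2)
         for i in range(n - 1)]
    alpha = [seeds.get(i, 0.5) for i in range(n)]
    for _ in range(iters):
        for i in range(n):
            if i in seeds:
                continue  # seed values stay fixed
            num = 0.0
            den = 0.0
            if i > 0:
                num += w[i - 1] * alpha[i - 1]
                den += w[i - 1]
            if i < n - 1:
                num += w[i] * alpha[i + 1]
                den += w[i]
            alpha[i] = num / den
    return alpha


# Hypothetical scanline: dark region, then bright region, one seed in each.
intensity = [0.10, 0.12, 0.11, 0.13, 0.80, 0.82, 0.79, 0.81]
seeds = {0: 0.0, 7: 1.0}  # background seed at pixel 0, foreground at pixel 7
alpha = propagate_labels(intensity, seeds)
```

Because the weight across the intensity jump is nearly zero, the seed labels spread freely within each region but barely cross the boundary, so the dark pixels end up near 0 and the bright pixels near 1.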