
    Pixelated Semantic Colorization

    While many image colorization algorithms have recently shown the capability of producing plausible color versions from gray-scale photographs, they still suffer from limited semantic understanding. To address this shortcoming, we propose to exploit pixelated object semantics to guide image colorization. The rationale is that human beings perceive and distinguish colors based on the semantic categories of objects. Starting from an autoregressive model, we generate image color distributions, from which diverse colored results are sampled. We propose two ways to incorporate object semantics into the colorization model: through a pixelated semantic embedding and a pixelated semantic generator. Specifically, the proposed convolutional neural network includes two branches. One branch learns what the object is, while the other branch learns the object colors. The network jointly optimizes a color embedding loss, a semantic segmentation loss, and a color generation loss in an end-to-end fashion. Experiments on PASCAL VOC2012 and COCO-stuff reveal that our network, when trained with semantic segmentation labels, produces more realistic and finer results than the colorization state of the art.
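    The two-branch, joint-loss design described in this abstract lends itself to a compact sketch. Below is a minimal, illustrative PyTorch-style sketch of a shared encoder feeding a semantic head and a color-distribution head trained jointly; the module sizes, class counts, and loss weights are assumptions for illustration, not the authors' implementation, and the color-embedding term is omitted.

```python
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchColorizer(nn.Module):
    """Illustrative two-branch CNN: one branch predicts pixelwise semantics,
    the other a per-pixel distribution over quantized ab color bins."""
    def __init__(self, num_classes=21, num_color_bins=313):
        super().__init__()
        self.backbone = nn.Sequential(                      # shared grayscale encoder
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.semantic_head = nn.Conv2d(64, num_classes, 1)   # learns what the object is
        self.color_head = nn.Conv2d(64, num_color_bins, 1)   # learns the object colors

    def forward(self, gray):
        feats = self.backbone(gray)
        return self.semantic_head(feats), self.color_head(feats)

def joint_loss(sem_logits, color_logits, sem_labels, color_labels,
               w_sem=1.0, w_col=1.0):
    # Joint objective in the spirit of the paper: a semantic segmentation loss
    # plus a color generation loss (the color-embedding term is left out here).
    return (w_sem * F.cross_entropy(sem_logits, sem_labels)
            + w_col * F.cross_entropy(color_logits, color_labels))
```

    During training, both heads would be supervised together on images carrying segmentation labels and quantized color targets, which is what lets the semantic branch inform the color branch.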

    Multi-task Self-Supervised Visual Learning

    We investigate methods for combining multiple self-supervised tasks--i.e., supervised tasks where data can be collected without manual labeling--in order to train a single visual representation. First, we provide an apples-to-apples comparison of four different self-supervised tasks using the very deep ResNet-101 architecture. We then combine tasks to jointly train a network. We also explore lasso regularization to encourage the network to factorize the information in its representation, and methods for "harmonizing" network inputs in order to learn a more unified representation. We evaluate all methods on ImageNet classification, PASCAL VOC detection, and NYU depth prediction. Our results show that deeper networks work better, and that combining tasks--even via a naive multi-head architecture--always improves performance. Our best joint network nearly matches the PASCAL performance of a model pre-trained on ImageNet classification, and matches the ImageNet network on NYU depth prediction. Comment: Published at ICCV 2017.
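    As a rough illustration of the naive multi-head idea combined with a lasso term, the sketch below shares one trunk across several self-supervised heads and applies an L1 penalty to per-task feature gates; the task names, dimensions, and gating scheme are assumptions and only one possible way to realize such a penalty, not the paper's exact mechanism.

```python
import torch
import torch.nn as nn

class MultiHeadSelfSupervised(nn.Module):
    """Illustrative naive multi-head setup: one shared trunk and one small head
    per self-supervised task. Task names and dimensions are assumptions."""
    def __init__(self, feat_dim=256, task_dims=None):
        super().__init__()
        task_dims = task_dims or {"rotation": 4, "colorization": 313, "exemplar": 128}
        self.trunk = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=7, stride=4, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.heads = nn.ModuleDict({t: nn.Linear(feat_dim, d) for t, d in task_dims.items()})
        # One gate vector per task over the shared features; an L1 ("lasso") penalty
        # on the gates pushes each task to rely on a sparse slice of the representation.
        self.task_gates = nn.ParameterDict(
            {t: nn.Parameter(torch.ones(feat_dim)) for t in task_dims})

    def forward(self, x, task):
        feats = self.trunk(x).flatten(1)                     # (batch, feat_dim)
        return self.heads[task](feats * self.task_gates[task])

    def lasso_penalty(self):
        return sum(gate.abs().sum() for gate in self.task_gates.values())

# Usage sketch: total loss = per-task loss + a small weight on the lasso penalty, e.g.
# loss = F.cross_entropy(model(images, "rotation"), labels) + 1e-4 * model.lasso_penalty()
```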

    ๋ณ€ํ˜•๋œ FusionNet์„ ์ด์šฉํ•œ ํšŒ์ƒ‰์กฐ ์ด๋ฏธ์ง€์˜ ์ž์—ฐ์Šค๋Ÿฌ์šด ์ฑ„์ƒ‰

    ํ•™์œ„๋…ผ๋ฌธ (์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ์ž์—ฐ๊ณผํ•™๋Œ€ํ•™ ํ˜‘๋™๊ณผ์ • ๊ณ„์‚ฐ๊ณผํ•™์ „๊ณต, 2021. 2. ๊ฐ•๋ช…์ฃผ.In this paper, we propose a grayscale image colorizing technique. The colorization task can be divided into three main ways, the Scribble-based method, Exemplar-based method and Fully automatic method. Our proposed method is included in the third one. We use a deep learning model that is widely used in the colorization eld recently. We propose Encoder-Docoder model using Convolutional Neural Networks. In particular, we modify the FusionNet with good performance to suit this purpose. Also, in order to get better results, we do not use MSE loss function. Instead, we use the loss function suitable for the colorizing purpose. We use a subset of the ImageNet dataset as the training, validation and test dataset. We take some existing methods from Fully automatic Deep Learning method and compared them with our models. Our algorithm is evaluated using a quantitative metric called PSNR (Peak Signal-to-Noise Ratio). In addition, in order to evaluate the results qualitatively, our model was applied to the test dataset and compared with various other models. Our model has better performance both quantitatively and qualitatively than other models. Finally, we apply our model to old black and white photographs.๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ํšŒ์ƒ‰์กฐ ์ด๋ฏธ์ง€๋“ค์— ๋Œ€ํ•œ ์ฑ„์ƒ‰ ๊ธฐ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ์ฑ„์ƒ‰ ์ž‘์—…์€ ํฌ๊ฒŒ Scribble ๊ธฐ๋ฐ˜ ๋ฐฉ๋ฒ•, Exemplar ๊ธฐ๋ฐ˜ ๋ฐฉ๋ฒ•, ์™„์ „ ์ž๋™ ๋ฐฉ๋ฒ•์˜ ์„ธ ๊ฐ€์ง€๋กœ ๋‚˜๋ˆŒ ์ˆ˜ ์žˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์„ธ ๋ฒˆ์งธ ๋ฐฉ๋ฒ•์„ ์‚ฌ์šฉํ–ˆ๋‹ค. ์ตœ๊ทผ์— ์ฑ„์ƒ‰ ๋ถ„์•ผ์—์„œ ๋„๋ฆฌ ์‚ฌ์šฉ๋˜๋Š” ๋”ฅ ๋Ÿฌ๋‹ ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•œ๋‹ค. Convolutional Neural Networks๋ฅผ ์ด์šฉํ•œ Encoder-Docoder ๋ชจ๋ธ์„ ์ œ์•ˆํ•œ๋‹ค. ํŠนํžˆ ๊ธฐ์กด์— image segmetation ๋ถ„์•ผ์—์„œ ์ข‹์€ ์„ฑ๋Šฅ์„ ๋ณด์ด๋Š” FusionNet์„ ์ž๋™ ์ฑ„์ƒ‰ ๋ชฉ์ ์— ๋งž๊ฒŒ ๋‹ค์–‘ํ•œ ๋ฐฉ๋ฒ•์œผ๋กœ ์ˆ˜์ •ํ–ˆ๋‹ค. ๋˜ํ•œ ๋” ๋‚˜์€ ๊ฒฐ๊ณผ๋ฅผ ์–ป๊ธฐ ์œ„ํ•ด MSE ์†์‹ค ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์ง€ ์•Š์•˜๋‹ค. ๋Œ€์‹ , ์šฐ๋ฆฌ๋Š” ์ž๋™ ์ฑ„์ƒ‰ ๋ชฉ์ ์— ์ ํ•ฉํ•œ ์†์‹ค ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์˜€๋‹ค. ImageNet ๋ฐ์ดํ„ฐ์…‹์˜ ๋ถ€๋ถ„ ์ง‘ํ•ฉ์„ ํ›ˆ๋ จ, ๊ฒ€์ฆ ๋ฐ ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์…‹์œผ๋กœ ์‚ฌ์šฉํ–ˆ๋‹ค. ์šฐ๋ฆฌ๋Š” ์™„์ „ ์ž๋™ ๋”ฅ ๋Ÿฌ๋‹ ๋ฐฉ๋ฒ•์—์„œ ๊ธฐ์กด ๋ฐฉ๋ฒ•์„ ๊ฐ€์ ธ์™€ ์šฐ๋ฆฌ์˜ ๋ชจ๋ธ๊ณผ ๋น„๊ตํ–ˆ๋‹ค. ์šฐ๋ฆฌ์˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์€ PSNR (Peak Signal-to-Noise Ratio)์ด๋ผ๋Š” ์ •๋Ÿ‰์  ์ง€ํ‘œ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ‰๊ฐ€๋˜์—ˆ๋‹ค. ๋˜ํ•œ ๊ฒฐ๊ณผ๋ฅผ ์ •์„ฑ์ ์œผ๋กœ ํ‰๊ฐ€ํ•˜๊ธฐ ์œ„ํ•ด ํ…Œ์ŠคํŠธ ๋ฐ์ดํ„ฐ์…‹์— ๋ชจ๋ธ์„ ์ ์šฉํ•˜์—ฌ ๋‹ค์–‘ํ•œ ๋ชจ๋ธ๊ณผ ๋น„๊ตํ–ˆ๋‹ค. ๊ทธ ๊ฒฐ๊ณผ ๋‹ค๋ฅธ ๋ชจ๋ธ์— ๋น„ํ•ด ์ •์„ฑ์ ์œผ๋กœ๋„, ์ •๋Ÿ‰์ ์œผ๋กœ๋„ ์ข‹์€ ์„ฑ๋Šฅ์„ ๋ณด์˜€๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ ์˜ค๋ž˜๋œ ํ‘๋ฐฑ ์‚ฌ์ง„๊ณผ ๊ฐ™์€ ๋‹ค์–‘ํ•œ ์œ ํ˜•์˜ ์ด๋ฏธ์ง€์— ์ ์šฉํ•œ ๊ฒฐ๊ณผ๋ฅผ ์ œ์‹œํ–ˆ๋‹ค.Abstract i 1 Introduction 1 2 Related Works 4 2.1 Scribble-based method . . . . . . . . . . . . . . . . . . . . . . 4 2.2 Exemplar-based method . . . . . . . . . . . . . . . . . . . . . 5 2.3 Fully automatic method . . . . . . . . . . . . . . . . . . . . . 6 3 Proposed Method 8 3.1 Method Overview . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.2 Loss Function . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 3.3 Architecture detail . . . . . . . . . . . . . . . . . . . . . . . . 10 3.3.1 Encoder . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.3.2 Decoder . . . . . . . . . . . . . . . . . . . . . . . . . . 12 3.3.3 Bridge . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 4 Experiments 14 4.1 CIE Lab Color Space . . . . . 
. . . . . . . . . . . . . . . . . . 15 4.2 Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 4.3 Qualitative Evaluation . . . . . . . . . . . . . . . . . . . . . . 17 4.4 Quantitative Evaluation . . . . . . . . . . . . . . . . . . . . . 18 4.5 Legacy Old image Colorization . . . . . . . . . . . . . . . . . . 20 5 Conclusion 23 The bibliography 24 Abstract (in Korean) 28Maste
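    The thesis evaluates colorization quantitatively with PSNR; a minimal sketch of that metric for 8-bit images follows (the function name and interface are illustrative).

```python
import numpy as np

def psnr(reference, prediction, max_value=255.0):
    """Peak Signal-to-Noise Ratio between a ground-truth color image and its
    colorized prediction (arrays of identical shape, values in [0, max_value])."""
    mse = np.mean((reference.astype(np.float64) - prediction.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images: no noise
    return 10.0 * np.log10(max_value ** 2 / mse)
```

    A colorized result and its ground-truth image are compared element-wise; a higher PSNR indicates a closer reconstruction.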

    ChromaGAN: Adversarial Picture Colorization with Semantic Class Distribution

    The colorization of grayscale images is an ill-posed problem with multiple correct solutions. In this paper, we propose an adversarial learning colorization approach coupled with semantic information. A generative network infers the chromaticity of a given grayscale image conditioned on semantic cues. This network is framed in an adversarial model that learns to colorize by incorporating perceptual and semantic understanding of color and class distributions. The model is trained via a fully self-supervised strategy. Qualitative and quantitative results show the capacity of the proposed method to colorize images in a realistic way, achieving state-of-the-art results. Comment: 8 pages + references.
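    A hedged sketch of the kind of composite generator objective described here (chromaticity reconstruction plus a semantic class-distribution term and an adversarial term) is given below; tensor names, weights, and the exact loss forms are placeholders rather than the published ChromaGAN formulation.

```python
import torch.nn.functional as F

def generator_objective(pred_ab, true_ab, pred_class_logits, target_class_probs,
                        critic_scores, w_class=0.003, w_adv=0.1):
    """Illustrative composite generator loss: chromaticity reconstruction, a KL term
    pulling the predicted class distribution toward a target distribution (e.g. from
    a pretrained classifier), and an adversarial term from a critic. Weights and
    loss forms are placeholders, not the published values."""
    loss_color = F.mse_loss(pred_ab, true_ab)
    loss_class = F.kl_div(F.log_softmax(pred_class_logits, dim=1),
                          target_class_probs, reduction="batchmean")
    loss_adv = -critic_scores.mean()      # generator pushes the critic's score up
    return loss_color + w_class * loss_class + w_adv * loss_adv
```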

    Semantic-Sparse Colorization Network for Deep Exemplar-based Colorization

    Exemplar-based colorization approaches rely on a reference image to provide plausible colors for a target gray-scale image. The key difficulty of exemplar-based colorization is establishing an accurate correspondence between these two images. Previous approaches have attempted to construct such a correspondence but face two obstacles. First, using luminance channels to compute the correspondence is inaccurate. Second, the dense correspondence they build introduces incorrect matches and increases the computational burden. To address these two problems, we propose the Semantic-Sparse Colorization Network (SSCN), which transfers both the global image style and detailed semantic-related colors to the gray-scale image in a coarse-to-fine manner. Our network balances global and local colors while alleviating the ambiguous matching problem. Experiments show that our method outperforms existing methods in both quantitative and qualitative evaluation and achieves state-of-the-art performance. Comment: Accepted by ECCV 2022; 14 pages, 10 figures.
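    The sparse, semantics-driven matching idea can be sketched as follows; the feature shapes, top-k selection, and soft weighting are illustrative assumptions rather than SSCN's actual architecture.

```python
import torch.nn.functional as F

def sparse_color_transfer(target_feats, ref_feats, ref_colors, top_k=4):
    """Illustrative sparse semantic matching: rather than a dense correlation over
    every reference position, keep only the top-k most similar reference features
    per target position and softly average their colors.
    target_feats: (Nt, C), ref_feats: (Nr, C), ref_colors: (Nr, 2) ab values."""
    t = F.normalize(target_feats, dim=1)        # cosine similarity via unit vectors
    r = F.normalize(ref_feats, dim=1)
    sim = t @ r.t()                             # (Nt, Nr) similarity matrix
    scores, idx = sim.topk(top_k, dim=1)        # sparse candidate matches per position
    weights = F.softmax(scores, dim=1)          # soft weights over the k candidates
    matched = ref_colors[idx]                   # (Nt, k, 2) candidate colors
    return (weights.unsqueeze(-1) * matched).sum(dim=1)   # (Nt, 2) transferred colors
```

    Restricting each target position to a handful of reference candidates is what keeps the correspondence sparse, making it cheaper and less prone to ambiguous matches than a dense correlation volume.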