Text-Guided Neural Image Inpainting

Author: Diederik
He Kaiming
Li Bowen
Lin Tsung-Yi
Liu Guilin
Liu Hongyu
Miyato Takeru
Reed Scott E.
Saxe Andrew M.
Song Yuhang
van den Oord Aäron
Yan Zhaoyi
Yang Chao
Yeh Raymond A.
Yu Jiahui
Zheng Chuanxia
Publication venue: Association for Computing Machinery (ACM)
Publication date: 22/03/2021
The image inpainting task requires filling a corrupted image with content coherent with its context. This research field has made promising progress through neural image inpainting methods. Nevertheless, guessing the missing content from the context pixels alone remains a critical challenge. The goal of this paper is to fill in the semantic information of corrupted images according to a provided descriptive text. Unlike existing text-guided image generation works, the inpainting model must compare the semantic content of the given text with the remaining part of the image, then determine the semantic content that should fill the missing part. To fulfill this task, we propose a novel inpainting model named Text-Guided Dual Attention Inpainting Network (TDANet). Firstly, a dual multimodal attention mechanism is designed to extract explicit semantic information about the corrupted regions by comparing the descriptive text and the complementary image areas through reciprocal attention. Secondly, an image-text matching loss is applied to maximize the semantic similarity between the generated image and the text. Experiments are conducted on two open datasets. Results show that the proposed TDANet model reaches a new state of the art on both quantitative and qualitative measures. Result analysis suggests that the generated images are consistent with the guidance text, enabling the generation of varied results by providing different descriptions. Code is available at https://github.com/idealwhite/TDANet
Comment: ACM MM'2020 (Oral). 9 pages, 4 tables, 7 figures
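The reciprocal attention mentioned in the abstract can be illustrated with a generic scaled dot-product cross-attention computed in both directions. This is only a minimal sketch, not the TDANet implementation: the function name `reciprocal_attention`, the feature shapes, and the use of plain dot-product scores are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def reciprocal_attention(words, regions):
    """Hypothetical two-way attention between text and image features.

    words:   (T, d) word embeddings of the descriptive text (assumed shape)
    regions: (R, d) features of the uncorrupted image regions (assumed shape)
    """
    d = words.shape[1]
    scores = regions @ words.T / np.sqrt(d)      # (R, T) similarity scores
    region_to_text = softmax(scores, axis=1)     # each region attends over words
    text_to_region = softmax(scores.T, axis=1)   # each word attends over regions
    text_context = region_to_text @ words        # (R, d) text summary per region
    image_context = text_to_region @ regions     # (T, d) image summary per word
    return text_context, image_context, region_to_text, text_to_region

rng = np.random.default_rng(0)
words = rng.normal(size=(5, 8))     # 5 words, 8-dim features (toy data)
regions = rng.normal(size=(4, 8))   # 4 image regions, 8-dim features (toy data)
text_ctx, image_ctx, r2t, t2r = reciprocal_attention(words, regions)
```

Words that receive little attention from any visible region are, intuitively, candidates for describing the missing area; how the paper actually uses the attention maps is not stated in the abstract.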

arXiv.org e-Print Archive

Crossref

Geometry-Aware Face Completion and Editing

Author: Cao Jie
He Ran
Hu Yibo
Song Linsen
Song Linxiao
Publication date: 13/02/2019
Face completion is a challenging generation task because it requires generating visually pleasing new pixels that are semantically consistent with the unmasked face region. This paper proposes a geometry-aware Face Completion and Editing NETwork (FCENet) by systematically studying facial geometry from the unmasked region. Firstly, a facial geometry estimator is learned to estimate facial landmark heatmaps and parsing maps from the unmasked face image. Then, an encoder-decoder generator completes the face image and disentangles its mask areas, conditioned on both the masked face image and the estimated facial geometry images. In addition, since a low-rank property exists in manually labeled masks, a low-rank regularization term is imposed on the disentangled masks, forcing the completion network to handle occlusion areas of various shapes and sizes. Furthermore, the network can generate diverse results from the same masked input by modifying the estimated facial geometry, which provides a flexible means of editing the completed face appearance. Extensive experimental results demonstrate, qualitatively and quantitatively, that the network is able to generate visually pleasing face completion results and to edit face attributes as well.
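The abstract does not say which low-rank regularizer is used. One common convex surrogate for matrix rank is the nuclear norm (the sum of singular values), and the sketch below uses it purely as an assumed stand-in to show why a blocky, hand-drawn-style mask scores lower than a scattered one. The helper name `nuclear_norm` and the toy masks are hypothetical.

```python
import numpy as np

def nuclear_norm(mask):
    # Sum of singular values: an assumed stand-in for the paper's low-rank term.
    return np.linalg.svd(mask, compute_uv=False).sum()

# A rectangular occlusion mask is rank 1 (outer product of two indicator vectors).
rect_mask = np.zeros((8, 8))
rect_mask[2:6, 3:7] = 1.0

# A diagonal mask is full rank, so its nuclear norm is larger
# even though it covers fewer pixels.
diag_mask = np.eye(8)

rect_penalty = nuclear_norm(rect_mask)
diag_penalty = nuclear_norm(diag_mask)
```

Penalizing such a term during training would push predicted masks toward simple, contiguous shapes; the weighting and exact form used in FCENet are not given here.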

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications