3 research outputs found

    HINT: High-quality INpainting Transformer with Mask-Aware Encoding and Enhanced Attention

    Existing image inpainting methods leverage convolution-based downsampling approaches to reduce spatial dimensions. This may result in information loss from corrupted images where the available information is inherently sparse, especially when the missing regions are large. Recent advances in self-attention mechanisms within transformers have led to significant improvements in many computer vision tasks, including inpainting. However, limited by computational costs, existing methods cannot fully exploit the long-range modelling capabilities of such models. In this paper, we propose an end-to-end High-quality INpainting Transformer, abbreviated as HINT, which consists of a novel mask-aware pixel-shuffle downsampling module (MPD) to preserve the visible information extracted from the corrupted image while maintaining the integrity of the information available for high-level inferences made within the model. Moreover, we propose a Spatially-activated Channel Attention Layer (SCAL), an efficient self-attention mechanism interpreting spatial awareness to model the corrupted image at multiple scales. To further enhance the effectiveness of SCAL, motivated by recent advances in speech recognition, we introduce a sandwich structure that places feed-forward networks before and after the SCAL module. We demonstrate the superior performance of HINT compared to contemporary state-of-the-art models on four datasets: CelebA, CelebA-HQ, Places2, and Dunhuang.
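
The idea of downsampling without discarding pixels can be illustrated with a generic pixel-unshuffle (space-to-depth) operation. The sketch below is not the paper's MPD module: the function names `pixel_unshuffle` and `mask_aware_downsample`, the convention that a mask value of 1 marks a visible pixel, and the choice to concatenate the rearranged mask with the rearranged image are all assumptions made for illustration.

```python
import numpy as np


def pixel_unshuffle(x, r):
    """Space-to-depth: (C, H, W) -> (C*r*r, H//r, W//r).

    Every input pixel is moved into a channel; none is averaged or dropped.
    """
    c, h, w = x.shape
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    x = x.reshape(c, h // r, r, w // r, r)
    x = x.transpose(0, 2, 4, 1, 3)  # (C, r, r, H//r, W//r)
    return x.reshape(c * r * r, h // r, w // r)


def mask_aware_downsample(image, mask, r=2):
    """Hypothetical mask-aware variant (an assumption, not the published MPD).

    image: (C, H, W) array; mask: (1, H, W) array with 1 = visible pixel.
    The mask is rearranged the same way as the image and concatenated,
    so each downsampled position still knows which of its pixels were valid.
    """
    visible = image * mask  # zero out the missing region
    return np.concatenate(
        [pixel_unshuffle(visible, r), pixel_unshuffle(mask, r)], axis=0
    )


if __name__ == "__main__":
    image = np.arange(48, dtype=np.float64).reshape(3, 4, 4)
    mask = np.ones((1, 4, 4))
    mask[:, :2, :2] = 0.0  # a missing 2x2 block in the top-left corner
    out = mask_aware_downsample(image, mask, r=2)
    print(out.shape)
```

With a downsampling factor of 2, a 3-channel image and its 1-channel mask become 16 channels at half the spatial resolution, whereas a strided convolution or pooling layer would mix visible and missing pixels together.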

    Intelligent enhancement of ancient Chinese murals based on multi-scale parallel structure

    Ancient mural artwork preserves the historical background and cultural customs of its time through intricate details and bright colors. However, natural weathering and man-made damage degrade the color, texture, and content of these works and reduce their quality. To identify and enhance murals with large areas of color damage, we propose a multi-scale parallel GAN and parallel U-Net structure, which can extract features from multiple scales or images to adapt to the changing scale of the target and provide a more diverse set of features. By learning more general features, this structure can reduce the risk of overfitting the training data. Results on an ancient mural dataset, measured by indicators such as PSNR, show that the method yields some performance improvement.
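
For reference, PSNR (peak signal-to-noise ratio) compares a restored image with a reference image through their mean squared error. The sketch below uses the standard definition; the function name `psnr` and the default peak value of 255 (8-bit images) are assumptions, and the abstract does not state how the authors computed the metric.

```python
import numpy as np


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    ref = np.asarray(reference, dtype=np.float64)
    res = np.asarray(restored, dtype=np.float64)
    mse = np.mean((ref - res) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    reference = np.zeros((2, 2), dtype=np.uint8)
    restored = np.full((2, 2), 255, dtype=np.uint8)
    print(psnr(reference, restored))
```

Higher values indicate a restoration closer to the reference, and identical images give an infinite PSNR.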