Eliminating Gradient Conflict in Reference-based Line-Art Colorization
Reference-based line-art colorization is a challenging task in computer
vision. The color, texture, and shading are rendered based on an abstract
sketch, which heavily relies on the precise long-range dependency modeling
between the sketch and reference. Popular techniques to bridge the cross-modal
information and model the long-range dependency employ the attention mechanism.
However, in the context of reference-based line-art colorization, several
techniques intensify the existing training difficulty of attention, for
instance, self-supervised training protocols and GAN-based losses. To understand
the instability in training, we examine the gradient flow of attention and
observe gradient conflict among attention branches. This phenomenon motivates
us to alleviate the gradient issue by preserving the dominant gradient branch
while removing the conflicting ones. We propose a novel attention mechanism using
this training strategy, Stop-Gradient Attention (SGA), outperforming the
attention baseline by a large margin with better training stability. Compared
with state-of-the-art modules in line-art colorization, our approach
demonstrates significant improvements in Fréchet Inception Distance (FID, up
to 27.21%) and structural similarity index measure (SSIM, up to 25.67%) on
several benchmarks. The code of SGA is available at
https://github.com/kunkun0w0/SGA .
Comment: Accepted by ECCV202
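The notion of gradient conflict among attention branches can be illustrated with a small sketch. It assumes, as one common convention and not necessarily the measure used in the paper, that two branches conflict when the cosine similarity of their gradients is negative. The function names and the toy gradient values are hypothetical.

```python
import math


def cosine_similarity(g1, g2):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    norm1 = math.sqrt(sum(a * a for a in g1))
    norm2 = math.sqrt(sum(b * b for b in g2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0  # a zero gradient is treated as neither aligned nor conflicting
    return dot / (norm1 * norm2)


def in_conflict(g1, g2):
    """Assumed criterion: gradients that point in opposing directions conflict."""
    return cosine_similarity(g1, g2) < 0.0


# Toy gradients from two hypothetical attention branches.
grad_branch_a = [1.0, 2.0, -1.0]
grad_branch_b = [-1.0, -1.5, 0.5]
print(in_conflict(grad_branch_a, grad_branch_b))
```

Under this criterion, a stop-gradient strategy would keep the update from the dominant branch and block the update from a branch flagged as conflicting.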
Deep Geometrized Cartoon Line Inbetweening
We aim to address a significant but understudied problem in the anime
industry, namely the inbetweening of cartoon line drawings. Inbetweening
involves generating intermediate frames between two black-and-white line
drawings and is a time-consuming and expensive process that can benefit from
automation. However, existing frame interpolation methods that rely on matching
and warping whole raster images are unsuitable for line inbetweening and often
produce blurring artifacts that damage the intricate line structures. To
preserve the precision and detail of the line drawings, we propose a new
approach, AnimeInbet, which geometrizes raster line drawings into graphs of
endpoints and reframes the inbetweening task as a graph fusion problem with
vertex repositioning. Our method can effectively capture the sparsity and
unique structure of line drawings while preserving the details during
inbetweening. This is made possible via our novel modules, i.e., vertex
geometric embedding, a vertex correspondence Transformer, an effective
mechanism for vertex repositioning and a visibility predictor. To train our
method, we introduce MixamoLine240, a new dataset of line drawings with ground
truth vectorization and matching labels. Our experiments demonstrate that
AnimeInbet synthesizes high-quality, clean, and complete intermediate line
drawings, outperforming existing methods quantitatively and qualitatively,
especially in cases with large motions. Data and code are available at
https://github.com/lisiyao21/AnimeInbet.
Comment: ICCV 202
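The idea of treating a line drawing as a graph and producing an intermediate frame by repositioning vertices can be sketched as follows. This is a minimal illustration under stated assumptions, not the AnimeInbet implementation: the vertex correspondences are given by hand rather than predicted by a Transformer, repositioning is plain linear interpolation, and unmatched vertices are simply dropped in place of a learned visibility predictor. The function name `inbetween` and its parameters are hypothetical.

```python
def inbetween(verts0, verts1, match, edges0, t=0.5):
    """Move each matched vertex of frame 0 a fraction t toward its partner in frame 1.

    verts0, verts1: lists of (x, y) vertex positions for the two key frames
    match: dict mapping a vertex index in frame 0 to its index in frame 1
    edges0: list of (i, j) index pairs describing line segments in frame 0
    Unmatched vertices are dropped, together with the edges that touch them.
    """
    new_index = {}
    verts = []
    for i, (x0, y0) in enumerate(verts0):
        if i not in match:
            continue  # simplification: an unmatched vertex is treated as invisible
        x1, y1 = verts1[match[i]]
        new_index[i] = len(verts)
        verts.append(((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1))
    edges = [
        (new_index[i], new_index[j])
        for i, j in edges0
        if i in new_index and j in new_index
    ]
    return verts, edges


# Toy example: a two-segment polyline whose last vertex has no match.
verts0 = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)]
verts1 = [(0.0, 2.0), (2.0, 2.0)]
mid_verts, mid_edges = inbetween(verts0, verts1, {0: 0, 1: 1}, [(0, 1), (1, 2)])
print(mid_verts, mid_edges)
```

Because the output is again a set of vertices and edges, the intermediate frame stays a clean vector drawing and avoids the blurring that raster warping can introduce.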
Deep line art video colorization with a few references
Coloring line art images based on the colors of reference images is an important stage in animation production, which is time-consuming and tedious. In this paper, we propose a deep architecture to automatically color line art videos with the same color style as the given reference images. Our framework consists of a color transform network and a temporal refinement network based on 3U-net. The color transform network takes the target line art images as well as the line art and color images of the reference images as input, and generates corresponding target color images. To cope with the large differences between each target line art image and the reference color images, we propose a distance attention layer that utilizes non-local similarity matching to determine the region correspondences between the target image and the reference images and transforms the local color information from the references to the target. To ensure global color style consistency, we further incorporate Adaptive Instance Normalization (AdaIN) with the transformation parameters obtained from a multiple-layer AdaIN that describes the global color style of the references, extracted by an embedder network. The temporal refinement network learns spatiotemporal features through 3D convolutions to ensure the temporal color consistency of the results. Our model can achieve even better coloring results by fine-tuning the parameters with only a small number of samples when dealing with an animation of a new style. To evaluate our method, we build a line art coloring dataset
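Adaptive Instance Normalization, as it is usually formulated, re-scales content features so that their mean and standard deviation match those of the style features: AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y). The sketch below applies this to a single feature channel held in a plain list. It takes the style statistics directly from a style channel, which is an assumption made for illustration; in the abstract above, the transformation parameters come from an embedder network. The function name `adain` and the `eps` value are hypothetical.

```python
import statistics


def adain(content, style, eps=1e-5):
    """Align the mean and standard deviation of one content channel to a style channel."""
    mu_c = statistics.fmean(content)
    mu_s = statistics.fmean(style)
    sd_c = statistics.pstdev(content)
    sd_s = statistics.pstdev(style)
    # eps guards against division by zero when the content channel is constant.
    return [sd_s * (x - mu_c) / (sd_c + eps) + mu_s for x in content]


content_channel = [0.0, 1.0, 2.0, 3.0]
style_channel = [10.0, 20.0, 30.0, 40.0]
stylized = adain(content_channel, style_channel)
print(stylized)
```

The stylized channel keeps the relative ordering of the content values while taking on the global statistics of the style channel.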
Artificial Intelligence in the Creative Industries: A Review
This paper reviews the current state of the art in Artificial Intelligence
(AI) technologies and applications in the context of the creative industries. A
brief background of AI, and specifically Machine Learning (ML) algorithms, is
provided, including Convolutional Neural Networks (CNNs), Generative Adversarial
Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement
Learning (DRL). We categorise creative applications into five groups related to
how AI technologies are used: i) content creation, ii) information analysis,
iii) content enhancement and post production workflows, iv) information
extraction and enhancement, and v) data compression. We critically examine the
successes and limitations of this rapidly advancing technology in each of these
areas. We further differentiate between the use of AI as a creative tool and
its potential as a creator in its own right. We foresee that, in the near
future, machine learning-based AI will be adopted widely as a tool or
collaborative assistant for creativity. In contrast, we observe that the
successes of machine learning in domains with fewer constraints, where AI is
the 'creator', remain modest. The potential of AI (or its developers) to win
awards for its original creations in competition with human creatives is also
limited, based on contemporary technologies. We therefore conclude that, in the
context of creative industries, maximum benefit from AI will be derived where
its focus is human-centric -- where it is designed to augment, rather than
replace, human creativity.
DECISION SUPPORT SYSTEM USING WEIGHTING SIMILARITY MODEL FOR CONSTRUCTING GROUND-TRUTH DATA SET
This research aims to build a ground-truth dataset for the entity-matching
process used to detect duplicate records in a bibliographic database. The
contribution of this research is the resulting dataset, which can be used as a
reference for measuring and evaluating entity-matching models implemented in
bibliographic databases. This aim was achieved by developing a decision support
system in which experts in the field of bibliographic databases act as decision
makers to construct the ground-truth dataset. The model used in this decision
support system weights similarity by comparing each attribute of the record
pairs in the dataset. An expert who understands all characteristics of the
research database can use the graphical user interface to evaluate and identify
the record pairs that meet the conditions, such as duplication of records. This
research produces a ground-truth dataset using the decision support system
approach.
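A weighted attribute-by-attribute comparison of two bibliographic records can be sketched as follows. This is a minimal illustration and not the model from the paper: the attribute names, the weights, the threshold, and the use of `difflib.SequenceMatcher` as the per-attribute string similarity are all assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical attribute weights; the paper's actual weights are not given here.
WEIGHTS = {"title": 0.6, "authors": 0.3, "year": 0.1}


def attribute_similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_similarity(rec1, rec2, weights=WEIGHTS):
    """Weighted average of per-attribute similarities for a pair of records."""
    total = sum(weights.values())
    score = sum(
        w * attribute_similarity(str(rec1.get(k, "")), str(rec2.get(k, "")))
        for k, w in weights.items()
    )
    return score / total


def is_candidate_duplicate(rec1, rec2, threshold=0.85):
    """Flag a pair for expert review; the threshold is an assumed value."""
    return record_similarity(rec1, rec2) >= threshold


rec_a = {"title": "Deep Geometrized Cartoon Line Inbetweening", "authors": "Li, S.", "year": "2023"}
rec_b = {"title": "Deep geometrized cartoon line inbetweening", "authors": "Li, S.", "year": "2023"}
rec_c = {"title": "Artificial Intelligence in the Creative Industries", "authors": "Anantrasirichai, N.", "year": "2021"}
print(record_similarity(rec_a, rec_b), record_similarity(rec_a, rec_c))
```

In a decision support setting, pairs flagged by such a score would be shown to the expert, who makes the final decision on whether they are duplicates.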