Few shot font generation via transferring similarity guided global style and quantization local style
Automatic few-shot font generation (AFFG), aiming at generating new fonts
with only a few glyph references, reduces the labor cost of manually designing
fonts. However, the traditional AFFG paradigm of style-content disentanglement
cannot capture the diverse local details of different fonts, so many
component-based approaches have been proposed to tackle this problem. These
approaches usually require special pre-defined glyph components, e.g., strokes
and radicals, which is infeasible for AFFG across different languages. In this
paper, we present a novel font generation approach
by aggregating styles from character similarity-guided global features and
stylized component-level representations. We calculate similarity scores
between the target character and the reference samples by measuring the
distance along corresponding channels of the content features, and assign
these scores as weights for aggregating the global style features. To better capture the local
styles, a cross-attention-based style transfer module is adopted to transfer
the styles of reference glyphs to the components, where the components are
self-learned discrete latent codes through vector quantization without manual
definition. With these designs, our AFFG method can obtain a complete set of
component-level style representations while also controlling the global glyph
characteristics. The experimental results demonstrate the effectiveness and
generalization of the proposed method on different linguistic scripts, and
show its superiority over other state-of-the-art methods. The
source code can be found at https://github.com/awei669/VQ-Font.
Comment: Accepted by ICCV 2023
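A minimal sketch of the similarity-guided global style aggregation described in this abstract, assuming PyTorch; the function name, tensor shapes, and the choice of Euclidean distance with a softmax are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def aggregate_global_style(content_feat, ref_content_feats, ref_style_feats):
    """Weight reference style features by content similarity.

    content_feat:      (C,)   content feature of the target character
    ref_content_feats: (N, C) content features of the N reference glyphs
    ref_style_feats:   (N, D) global style features of the references
    """
    # Similarity along corresponding channels: smaller distance -> higher score
    scores = -torch.cdist(content_feat[None], ref_content_feats).squeeze(0)  # (N,)
    weights = F.softmax(scores, dim=0)                                       # (N,)
    # Similarity-weighted aggregation of the reference global styles
    return (weights[:, None] * ref_style_feats).sum(dim=0)                   # (D,)
```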
Multi-Content GAN for Few-Shot Font Style Transfer
In this work, we focus on the challenge of taking partial observations of
highly-stylized text and generalizing the observations to generate unobserved
glyphs in the ornamented typeface. To generate a set of multi-content images
following a consistent style from very few examples, we propose an end-to-end
stacked conditional GAN model considering content along channels and style
along network layers. Our proposed network transfers the style of given glyphs
to the contents of unseen ones, capturing highly stylized fonts found in the
real world, such as those on movie posters or infographics. We seek to
transfer both the typographic stylization (e.g., serifs and ears) and the
textual stylization (e.g., color gradients and effects). We base our
experiments on our collected dataset of 10,000 fonts with different styles and
demonstrate effective generalization from a very small number of observed
glyphs.
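The "content along channels" idea lends itself to a compact illustration: every glyph of the alphabet occupies its own input and output channel, so a single convolutional network can complete the unobserved ones. The following PyTorch sketch is a loose rendering of that design under stated assumptions; the real model is a stacked conditional GAN with adversarial training and a second ornamentation network, both omitted here.

```python
import torch.nn as nn

NUM_GLYPHS = 26  # e.g., A-Z; unobserved glyphs enter as blank (zero) channels

class GlyphGenerator(nn.Module):
    """Toy 'content along channels' completion network."""
    def __init__(self, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_GLYPHS, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, NUM_GLYPHS, 3, padding=1), nn.Tanh(),
        )

    def forward(self, observed):  # observed: (B, 26, H, W)
        # Each glyph has its own channel, so one forward pass predicts all 26
        return self.net(observed)
```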
TET-GAN: Text Effects Transfer via Stylization and Destylization
Text effects transfer automatically makes text dramatically more impressive.
However, previous style transfer methods either model general styles, and thus
cannot handle the highly structured text effects that follow the glyph, or
require manually designed, subtle matching criteria for text effects. In this
paper, we focus on the use of the powerful representation
abilities of deep neural features for text effects transfer. For this purpose,
we propose a novel Texture Effects Transfer GAN (TET-GAN), which consists of a
stylization subnetwork and a destylization subnetwork. The key idea is to train
our network to accomplish both the objective of style transfer and style
removal, so that it can learn to disentangle and recombine the content and
style features of text effects images. To support the training of our network,
we propose a new text effects dataset with as many as 64 professionally
designed styles on 837 characters. We show that the disentangled feature
representations enable us to transfer or remove all these styles on arbitrary
glyphs using one network. Furthermore, the flexible network design empowers
TET-GAN to efficiently extend to a new text style via one-shot learning where
only one example is required. We demonstrate the superiority of the proposed
method in generating high-quality stylized text over the state-of-the-art
methods.
Comment: Accepted by AAAI 2019. Code and dataset will be available at
http://www.icst.pku.edu.cn/struct/Projects/TETGAN.htm
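As a rough illustration of the stylization/destylization idea, here is a hedged PyTorch sketch of the dual reconstruction objective, assuming L1 losses and a pair of networks; the paper's full objective also includes adversarial terms, which are omitted.

```python
import torch.nn as nn

l1 = nn.L1Loss()

def dual_objective_step(stylize, destylize, content_img, styled_img):
    """One training step over a (plain glyph, text-effects glyph) pair.

    stylize:   network mapping (plain glyph, style example) -> styled glyph
    destylize: network mapping styled glyph -> plain glyph
    """
    # Stylization objective: reproduce the text effects on the plain glyph
    loss_sty = l1(stylize(content_img, styled_img), styled_img)
    # Destylization objective: strip the effects back to the plain glyph
    loss_desty = l1(destylize(styled_img), content_img)
    # Optimizing both directions encourages disentangled content/style codes
    return loss_sty + loss_desty
```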
Font Representation Learning via Paired-glyph Matching
Fonts can convey profound meanings of words in various forms of glyphs.
Without typography knowledge, manually selecting an appropriate font or
designing a new font is a tedious and painful task. To allow users to explore
vast font styles and create new font styles, font retrieval and font style
transfer methods have been proposed. These tasks increase the need for learning
high-quality font representations. Therefore, we propose a novel font
representation learning scheme to embed font styles into a latent space. To
make the representation of a font discriminative from those of other fonts, we
propose a paired-glyph matching-based font representation learning model that
attracts the representations of glyphs in the same font to one another, while
pushing away those of other fonts. Through evaluations on font retrieval with query glyphs
on new fonts, we show our font representation learning scheme achieves better
generalization performance than the existing font representation learning
techniques. Finally on the downstream font style transfer and generation tasks,
we confirm the benefits of transfer learning with the proposed method. The
source code is available at https://github.com/junhocho/paired-glyph-matching.
Comment: Accepted to BMVC 2022
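The attract/repel objective can be sketched as an InfoNCE-style contrastive loss, assuming PyTorch; the temperature value and the two-glyphs-per-font batch construction are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def paired_glyph_loss(z_a, z_b, temperature=0.1):
    """z_a, z_b: (B, D) embeddings of two different glyphs per font.

    Row i of z_a and row i of z_b come from the same font (positive pair);
    every other row in the batch serves as a negative.
    """
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature           # (B, B) similarity matrix
    targets = torch.arange(z_a.size(0), device=z_a.device)
    # Each glyph must match the other glyph from its own font
    return F.cross_entropy(logits, targets)
```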
A Multi-Implicit Neural Representation for Fonts
Fonts are ubiquitous across documents and come in a variety of styles. They
are either represented in a native vector format or rasterized into
fixed-resolution images. In the first case, the non-standard representation
prevents benefiting from the latest network architectures for neural
representations; in the latter case, the rasterized representation, when
encoded via networks, loses data fidelity, as font-specific discontinuities
like edges and corners are difficult to represent with neural networks. Based on the
observation that complex fonts can be represented by a superposition of a set
of simpler occupancy functions, we introduce \textit{multi-implicits} to
represent fonts as a permutation-invariant set of learned implicit functions,
without losing features (e.g., edges and corners). However, while
multi-implicits locally preserve font features, obtaining supervision in the
form of ground-truth multi-channel signals is a problem in itself. Instead, we
show how to train such a representation with only local supervision, while
the proposed neural architecture directly finds globally consistent
multi-implicits for font families. We extensively evaluate the proposed
representation for various tasks including reconstruction, interpolation, and
synthesis to demonstrate clear advantages over existing alternatives.
Additionally, the representation naturally enables glyph completion, wherein a
single characteristic font is used to synthesize a whole font family in the
target style.
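A speculative PyTorch sketch of the multi-implicit idea: each branch is a small MLP occupancy function, and a permutation-invariant max combines them, so sharp corners can arise where branches intersect. The branch count, layer sizes, and the max combination rule are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class MultiImplicit(nn.Module):
    def __init__(self, num_implicits=4, hidden=64):
        super().__init__()
        # A set of simple occupancy functions, one small MLP per branch
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Linear(2, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1),
            )
            for _ in range(num_implicits)
        )

    def forward(self, xy):  # xy: (N, 2) query points in glyph space
        occ = torch.cat([b(xy) for b in self.branches], dim=1)  # (N, K)
        # Max over the set is permutation-invariant and keeps sharp corners
        return occ.max(dim=1).values                            # (N,)
```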
DiffUTE: Universal Text Editing Diffusion Model
Diffusion-model-based, language-guided image editing has recently achieved
great success. However, existing state-of-the-art diffusion models struggle to
render correct text and text styles during generation. To tackle this problem,
we propose a universal self-supervised text editing diffusion model (DiffUTE),
which aims to replace or modify words in a source image while maintaining its
realistic appearance. Specifically, we build
our model on a diffusion model and carefully modify the network structure to
enable the model to draw multilingual characters with the help of glyph and
position information. Moreover, we design a self-supervised learning framework
to leverage large amounts of web data to improve the representation ability of
the model. Experimental results show that our method achieves impressive
performance and enables controllable editing of in-the-wild images with high
fidelity. Our code will be available at
\url{https://github.com/chenhaoxing/DiffUTE}.
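A speculative sketch of a glyph- and position-conditioned denoising step in the spirit of the abstract, assuming PyTorch and a conditional UNet that accepts (sample, timestep, encoder_hidden_states); the wrapper class, glyph encoder, and channel-concatenation of the position mask are assumptions, not DiffUTE's actual architecture.

```python
import torch
import torch.nn as nn

class GlyphConditionedDenoiser(nn.Module):
    def __init__(self, unet, glyph_encoder):
        super().__init__()
        self.unet = unet                    # any conditional noise-prediction UNet
        self.glyph_encoder = glyph_encoder  # encodes rendered target glyphs

    def forward(self, noisy_latent, t, glyph_img, position_mask):
        # Glyph tokens steer the text content being drawn
        cond = self.glyph_encoder(glyph_img)                  # (B, L, D)
        # The mask marks where the edit happens; assumes the UNet was built
        # with the extra input channel
        x = torch.cat([noisy_latent, position_mask], dim=1)
        return self.unet(x, t, encoder_hidden_states=cond)    # predicted noise
```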