StrokeGAN: Reducing Mode Collapse in Chinese Font Generation via Stroke Encoding
The generation of stylish Chinese fonts is an important problem involved in
many applications. Most existing generation methods are based on deep
generative models, particularly generative adversarial network (GAN) based
models. However, these deep generative models may suffer from the mode
collapse issue, which significantly degrades the diversity and quality of
generated results. In this paper, we introduce a one-bit stroke encoding to
capture the key mode information of Chinese characters and then incorporate it
into CycleGAN, a popular deep generative model for Chinese font generation.
The resulting method, called StrokeGAN, is mainly motivated by the observation
that the stroke encoding carries a substantial amount of the mode information
of Chinese characters. To reconstruct the one-bit stroke encoding of the
generated characters, we impose a stroke-encoding reconstruction loss on the
discriminator. Equipped with the one-bit stroke encoding and this
reconstruction loss, the mode collapse issue of CycleGAN is significantly
alleviated, with improved preservation of strokes and diversity of generated
characters. The effectiveness of StrokeGAN is demonstrated by a series of
generation tasks over nine datasets with different fonts. The numerical
results show that StrokeGAN generally outperforms the state-of-the-art methods
in terms of content and recognition accuracies as well as stroke error, and
also generates more realistic characters.
Comment: 10 pages; our code and data are available at:
https://github.com/JinshanZeng/StrokeGA
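The stroke-encoding reconstruction loss described above can be sketched as a binary cross-entropy between the discriminator's predicted stroke encoding and the character's ground-truth one-bit encoding. A minimal numpy sketch, assuming the discriminator emits one logit per stroke type and that there are 32 stroke types (the dimension and function names are illustrative, not taken from the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stroke_reconstruction_loss(stroke_logits, stroke_bits):
    """Binary cross-entropy between the discriminator's predicted stroke
    encoding (one logit per stroke type) and the ground-truth one-bit
    encoding, averaged over strokes and batch."""
    p = sigmoid(stroke_logits)
    eps = 1e-12  # numerical guard against log(0)
    return -np.mean(stroke_bits * np.log(p + eps)
                    + (1.0 - stroke_bits) * np.log(1.0 - p + eps))

# Toy batch: 4 characters, 32 stroke types (an assumed dimension).
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(4, 32)).astype(float)
good_logits = np.where(bits == 1, 8.0, -8.0)  # near-perfect reconstruction
bad_logits = -good_logits                     # inverted reconstruction

assert stroke_reconstruction_loss(good_logits, bits) < 0.01
assert stroke_reconstruction_loss(bad_logits, bits) > 1.0
```

In training, this term would be added to the usual adversarial and cycle-consistency losses, penalizing generated characters whose strokes cannot be recovered by the discriminator.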
LUT-NN: Empower Efficient Neural Network Inference with Centroid Learning and Table Lookup
On-device Deep Neural Network (DNN) inference consumes significant computing
resources and development effort. To alleviate that, we propose LUT-NN, the
first system to empower inference by table lookup, reducing inference cost.
LUT-NN learns the typical features, named centroids, for each operator, and
precomputes the results for these centroids, saving them in lookup tables.
During inference, the results for the centroids closest to the inputs can be
read directly from the tables as the approximated outputs, without
computation. LUT-NN integrates two major novel techniques: (1) differentiable
centroid learning through backpropagation, which adapts three levels of
approximation to minimize the accuracy impact of centroids; (2) table lookup
inference execution, which comprehensively considers different levels of
parallelism, memory access reduction, and dedicated hardware units for optimal
performance. LUT-NN is evaluated on multiple real tasks, covering image and
speech recognition and natural language processing. Compared to related work,
LUT-NN improves accuracy by 66% to 92%, reaching a level similar to the
original models. LUT-NN reduces cost along all dimensions, including FLOPs
(up to 16x), model size (up to 7x), latency (up to 6.8x), memory (up to 6.5x),
and power (up to 41.7%).
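The centroid-plus-lookup idea above can be illustrated on a toy linear layer: split the input into slices, match each slice to its nearest learned centroid, and read a precomputed partial result from a table instead of performing the multiplication. A minimal numpy sketch, assuming random centroids (LUT-NN learns them by backpropagation) and illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy linear layer y = x @ W that we approximate with centroid lookup.
# All sizes here are illustrative, not taken from the paper.
d_in, d_out, n_sub, n_centroids = 16, 8, 4, 16
sub = d_in // n_sub                  # length of each input slice
W = rng.normal(size=(d_in, d_out))

# "Centroids": representative sub-vectors for each input slice. LUT-NN
# learns these differentiably; here they are random stand-ins.
centroids = rng.normal(size=(n_sub, n_centroids, sub))

# Precompute the lookup tables: the partial output of every centroid
# against its slice of W, so inference needs no multiplications.
tables = np.stack([centroids[s] @ W[s * sub:(s + 1) * sub]
                   for s in range(n_sub)])   # shape (n_sub, n_centroids, d_out)

def lut_forward(x):
    """Approximate x @ W by summing precomputed per-slice table entries."""
    y = np.zeros(d_out)
    for s in range(n_sub):
        piece = x[s * sub:(s + 1) * sub]
        # Nearest centroid for this slice (squared Euclidean distance).
        k = np.argmin(np.sum((centroids[s] - piece) ** 2, axis=1))
        y += tables[s, k]            # a table read replaces the dot product
    return y

# If the input is stitched together from actual centroids, the lookup is exact.
x = np.concatenate([centroids[s, 3] for s in range(n_sub)])
assert np.allclose(lut_forward(x), x @ W)
```

For inputs that only approximate the centroids, the output is an approximation, which is why the centroid-learning step matters for accuracy.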
- …