Learning to Generate Posters of Scientific Papers
Researchers often summarize their work in the form of posters. Posters
provide a coherent and efficient way to convey core ideas from scientific
papers. Generating a good scientific poster, however, is a complex and
time-consuming cognitive task, since such posters need to be readable,
informative, and visually aesthetic. In this paper, we study for the first
time the challenging problem of learning to generate posters from scientific
papers. To this end, we propose a data-driven framework that utilizes
graphical models.
Specifically, given content to display, the key elements of a good poster,
including panel layout and attributes of each panel, are learned and inferred
from data. Then, given the inferred layout and attributes, the composition of
graphical elements within each panel is synthesized. To learn and validate our
model, we
collect and make public a Poster-Paper dataset, which consists of scientific
papers and corresponding posters with exhaustively labelled panels and
attributes. Qualitative and quantitative results indicate the effectiveness of
our approach.
Comment: in Proceedings of the 30th AAAI Conference on Artificial Intelligence (AAAI'16), Phoenix, AZ, 2016
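To make the two-stage pipeline concrete (infer panel layout and attributes from content, then compose each panel), here is a minimal Python sketch. It is not the authors' graphical model: panel sizes are derived from section length alone as a stand-in for the learned attributes, and all names (Panel, infer_weights, split_layout) are hypothetical.

from dataclasses import dataclass

@dataclass
class Panel:
    name: str
    weight: float        # stand-in for a learned panel-size attribute
    box: tuple = None    # (x, y, w, h) in normalized poster coordinates

def infer_weights(sections):
    # Stage 1 stand-in: map section content length to a panel-size attribute.
    total = sum(len(text) for text in sections.values())
    return [Panel(name, len(text) / total) for name, text in sections.items()]

def split_layout(panels, box=(0.0, 0.0, 1.0, 1.0), vertical=True):
    # Stage 2 stand-in: recursively split the canvas in proportion to weights.
    if len(panels) == 1:
        panels[0].box = box
        return panels
    mid = len(panels) // 2
    left, right = panels[:mid], panels[mid:]
    ratio = sum(p.weight for p in left) / sum(p.weight for p in panels)
    x, y, w, h = box
    if vertical:  # split left/right, alternating direction at each level
        first, second = (x, y, w * ratio, h), (x + w * ratio, y, w * (1 - ratio), h)
    else:         # split top/bottom
        first, second = (x, y, w, h * ratio), (x, y + h * ratio, w, h * (1 - ratio))
    return (split_layout(left, first, not vertical)
            + split_layout(right, second, not vertical))

sections = {"Intro": "x" * 400, "Method": "x" * 1200, "Results": "x" * 800}
for p in split_layout(infer_weights(sections)):
    print(p.name, tuple(round(v, 2) for v in p.box))

Running the sketch prints a normalized bounding box per panel; a real system would feed such boxes to the within-panel composition stage.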
Modeling Caricature Expressions by 3D Blendshape and Dynamic Texture
The problem of deforming an artist-drawn caricature according to a given
normal face expression is of interest in applications such as social media,
animation, and entertainment. This paper presents a solution to the problem,
with an emphasis on enhancing the ability to create desired expressions while
preserving the identity-exaggeration style of the caricature, which is
challenging due to the complicated nature of caricatures. The key to our
solution is a novel method for modeling caricature expressions, which extends
the traditional 3DMM representation to the caricature domain. The method
consists of
shape modelling and texture generation for caricatures. A geometric
optimization is developed to create identity-preserving blendshapes for
reconstructing accurate and stable geometric shapes, and a conditional
generative adversarial network (cGAN) is designed to generate dynamic textures
under target expressions. The combination of the shape and texture components
allows the non-trivial expressions of a caricature to be effectively defined
by this extension of the popular 3DMM representation, so a caricature can be
flexibly deformed into arbitrary expressions with visually good results in
both the shape and color spaces. The experiments demonstrate the effectiveness
of the proposed method.
Comment: Accepted by the 28th ACM International Conference on Multimedia (ACM MM 2020)
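For context, the sketch below shows the standard linear delta-blendshape combination that the shape component builds on, in numpy with toy arrays. The paper's contribution is how the identity-preserving caricature blendshapes are constructed; this only illustrates how a set of blendshapes, once available, deforms a mesh.

import numpy as np

rng = np.random.default_rng(0)
n_vertices, n_expressions = 5000, 46  # toy mesh and basis sizes

neutral = rng.standard_normal((n_vertices, 3))                     # B_0
blendshapes = rng.standard_normal((n_expressions, n_vertices, 3))  # B_1..B_n

def deform(neutral, blendshapes, weights):
    # V = B_0 + sum_i w_i * (B_i - B_0): the usual delta-blendshape form.
    deltas = blendshapes - neutral          # (n_expressions, n_vertices, 3)
    return neutral + np.tensordot(weights, deltas, axes=1)

weights = np.zeros(n_expressions)
weights[3] = 0.8                            # e.g. 80% activation of expression 3
deformed = deform(neutral, blendshapes, weights)
print(deformed.shape)                       # (5000, 3)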
Taming Reversible Halftoning via Predictive Luminance
Traditional halftoning usually drops colors when dithering images with binary
dots, which makes it difficult to recover the original color information. We
propose a novel halftoning technique that converts a color image into a binary
halftone with full restorability to its original version. Our base halftoning
technique consists of two convolutional neural networks (CNNs) that produce
the reversible halftone patterns, and a noise incentive block (NIB) that
mitigates the flatness degradation issue of CNNs. Furthermore, to tackle the
conflict between blue-noise quality and restoration accuracy in our base
method, we propose a predictor-embedded approach that offloads predictable
information from the network, which in our case is the luminance information
recoverable from the halftone pattern. This approach gives the network more
flexibility to produce halftones with better blue-noise quality without
compromising the restoration quality. Detailed studies on the multiple-stage
training method and the loss weightings have been conducted. We compare the
predictor-embedded method with the base method in terms of halftone spectrum
analysis, halftone accuracy, restoration accuracy, and data embedding studies.
Our entropy evaluation shows that the predictor-embedded halftone carries less
encoded information than that of the base method. The experiments show that
the predictor-embedded method gains more flexibility to improve the blue-noise
quality of halftones while maintaining comparable restoration quality with a
higher tolerance for disturbances.
Comment: to be published in IEEE Transactions on Visualization and Computer Graphics
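The data flow described above (halftoning network, luminance predictor, restoration network) can be sketched roughly as follows in PyTorch, with tiny stand-in CNNs. This is an illustration of the pipeline's structure under stated assumptions, not the paper's architecture: the noise incentive block, multi-stage training, and loss weightings are omitted, and the module names are hypothetical.

import torch
import torch.nn as nn

def tiny_cnn(c_in, c_out):
    # Three conv layers stand in for the paper's networks.
    return nn.Sequential(
        nn.Conv2d(c_in, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, c_out, 3, padding=1))

halftoner = tiny_cnn(3, 1)  # color image -> halftone logits
predictor = tiny_cnn(1, 1)  # halftone -> predicted luminance (offloaded part)
restorer  = tiny_cnn(2, 3)  # halftone + predicted luminance -> restored color

rgb = torch.rand(1, 3, 64, 64)
logits = halftoner(rgb)

# Straight-through estimator: hard binary dots on the forward pass,
# soft sigmoid gradients on the backward pass.
soft = torch.sigmoid(logits)
halftone = (soft > 0.5).float() + soft - soft.detach()

luma_hat = predictor(halftone)  # predictable information, not stored in the dots
restored = restorer(torch.cat([halftone, luma_hat], dim=1))

# Toy losses: restore the color image and predict its true luminance.
luma = (rgb * torch.tensor([0.299, 0.587, 0.114]).view(1, 3, 1, 1)).sum(1, keepdim=True)
loss = nn.functional.mse_loss(restored, rgb) + nn.functional.mse_loss(luma_hat, luma)
loss.backward()
print(halftone.detach().unique(), restored.shape)  # binary dots, (1, 3, 64, 64)

The key point mirrored here is the offloading: the restorer receives the predicted luminance as an extra input, so the halftone pattern itself no longer has to encode it.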
Analysis and Construction of Engaging Facial Forms and Expressions: Interdisciplinary Approaches from Art, Anatomy, Engineering, Cultural Studies, and Psychology
The topic of this dissertation is the anatomical, psychological, and cultural examination of the human face in order to construct an effective anatomy-driven 3D virtual face customization and action model. To gain a broad perspective on all aspects of the face, theories and methodologies from the fields of art, engineering, anatomy, psychology, and cultural studies have been analyzed and implemented. The computer-generated facial customization and action model was designed based on the collected data. Using this customization system, a culturally specific attractive face in Korean popular culture, “kot-mi-nam (flower-like beautiful guy),” was modeled and analyzed as a case study. The “kot-mi-nam” phenomenon is surveyed in its textual, visual, and contextual aspects, which reveals the gender and sexuality fluidity of its masculinity. The analysis and the actual development of the model organically co-construct each other in an interwoven process.

Chapter 1 introduces anatomical studies of the human face, psychological theories of face recognition and facial attractiveness, and state-of-the-art face construction projects in various fields. Chapters 2 and 3 present the Bezier curve-based 3D facial customization (BCFC) and the Multi-layered Facial Action Model (MFAM), both based on the analysis of human anatomy, to achieve cost-effective yet realistic facial animation without using 3D scanned data. In the experiments, results for facial customization by gender, race, fat, and age showed that BCFC achieved performance improvements of 25.20% over the existing program Facegen and 44.12% over Facial Studio. The experimental results also demonstrated the realistic quality and effectiveness of MFAM compared with the blendshape technique, improving 2.87% and 0.03% of the facial area per second for happiness and anger expressions, respectively.

In Chapter 4, according to the analysis based on BCFC, the 3D face of an average kot-mi-nam is shown to be close to gender-neutral (male: 50.38%, female: 49.62%) and Caucasian (66.40-66.42%). Culturally specific images can be misinterpreted in different cultures due to their different languages, histories, and contexts. This research demonstrates that facial images can be affected by the cultural tastes of their makers and can also be interpreted differently by viewers in different cultures.
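As a small illustration of the curve machinery a Bezier-based customization system like BCFC builds on, the sketch below evaluates a cubic Bezier curve in numpy; the jawline control points are hypothetical, and the dissertation's actual control-point layout is not reproduced.

import numpy as np

def cubic_bezier(p0, p1, p2, p3, n=50):
    # B(t) = (1-t)^3 p0 + 3(1-t)^2 t p1 + 3(1-t) t^2 p2 + t^3 p3
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

# Hypothetical 2D jawline profile; moving the inner control points reshapes
# the jaw, the kind of handle a customization slider (e.g. fat, age) would drive.
jaw = cubic_bezier(np.array([0.0, 1.0]), np.array([0.2, 0.1]),
                   np.array([0.8, 0.1]), np.array([1.0, 1.0]))
print(jaw.shape)  # (50, 2)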