TediGAN: Text-Guided Diverse Face Image Generation and Manipulation
In this work, we propose TediGAN, a novel framework for multi-modal image
generation and manipulation with textual descriptions. The proposed method
consists of three components: StyleGAN inversion module, visual-linguistic
similarity learning, and instance-level optimization. The inversion module maps
real images to the latent space of a well-trained StyleGAN. The
visual-linguistic similarity module learns text-image matching by mapping
images and text into a common embedding space. The instance-level optimization
preserves identity during manipulation. Our model can produce diverse and
high-quality images at an unprecedented resolution of 1024x1024. Using a control
mechanism based on style-mixing, our TediGAN inherently supports image
synthesis with multi-modal inputs, such as sketches or semantic labels, with or
without instance guidance. To facilitate text-guided multi-modal synthesis, we
propose the Multi-Modal CelebA-HQ, a large-scale dataset consisting of real
face images and corresponding semantic segmentation maps, sketches, and textual
descriptions. Extensive experiments on the introduced dataset demonstrate the
superior performance of our proposed method. Code and data are available at
https://github.com/weihaox/TediGAN.

Comment: CVPR 2021. Code: https://github.com/weihaox/TediGAN Data:
https://github.com/weihaox/Multi-Modal-CelebA-HQ Video:
https://youtu.be/L8Na2f5viA
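The visual-linguistic similarity learning described above can be illustrated with a minimal sketch: project image and text features into a shared embedding space and score matched pairs by cosine similarity. This is not the paper's actual architecture; the feature dimensions, projection matrices, and contrastive-style loss below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-extracted features (stand-ins for real encoders):
# image features of dim 512, text features of dim 256, batch of 4 pairs.
img_feat = rng.normal(size=(4, 512))
txt_feat = rng.normal(size=(4, 256))

# Learnable linear projections into a shared 128-d embedding space.
W_img = rng.normal(size=(512, 128)) * 0.02
W_txt = rng.normal(size=(256, 128)) * 0.02

def embed(x, W):
    """Project features and L2-normalize so dot products equal cosine similarity."""
    z = x @ W
    return z / np.linalg.norm(z, axis=1, keepdims=True)

z_img = embed(img_feat, W_img)
z_txt = embed(txt_feat, W_txt)

# Pairwise cosine-similarity matrix; the diagonal holds matched pairs.
sim = z_img @ z_txt.T

# A contrastive-style objective pushes diagonal entries up and
# off-diagonal entries down (row-wise softmax cross-entropy).
logits = sim * 10.0  # temperature scaling
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
loss = -np.log(np.diag(probs)).mean()
print(float(loss))
```

Training the projections to minimize such a loss is what makes the shared space usable for text-guided retrieval and manipulation.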
PassGAN: A Deep Learning Approach for Password Guessing
State-of-the-art password guessing tools, such as HashCat and John the
Ripper, enable users to check billions of passwords per second against password
hashes. In addition to performing straightforward dictionary attacks, these
tools can expand password dictionaries using password generation rules, such as
concatenation of words (e.g., "password123456") and leet speak (e.g.,
"password" becomes "p4s5w0rd"). Although these rules work well in practice,
expanding them to model further passwords is a laborious task that requires
specialized expertise. To address this issue, in this paper we introduce
PassGAN, a novel approach that replaces human-generated password rules with
theory-grounded machine learning algorithms. Instead of relying on manual
password analysis, PassGAN uses a Generative Adversarial Network (GAN) to
autonomously learn the distribution of real passwords from actual password
leaks, and to generate high-quality password guesses. Our experiments show that
this approach is very promising. When we evaluated PassGAN on two large
password datasets, we were able to surpass rule-based and state-of-the-art
machine learning password guessing tools. However, in contrast with the other
tools, PassGAN achieved this result without any a-priori knowledge on passwords
or common password structures. Additionally, when we combined the output of
PassGAN with the output of HashCat, we were able to match 51%-73% more
passwords than with HashCat alone. This is remarkable, because it shows that
PassGAN can autonomously extract a considerable number of password properties
that current state-of-the-art rules do not encode.

Comment: This is an extended version of the paper which appeared in NeurIPS
2018 Workshop on Security in Machine Learning (SecML'18), see
https://github.com/secml2018/secml2018.github.io/raw/master/PASSGAN_SECML2018.pd
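The core idea of mapping noise vectors to candidate passwords can be sketched without the full adversarial training loop. Below, a single linear layer plays the role of a trained generator; the charset, length, and weights are illustrative assumptions, and a real PassGAN learns its generator adversarially against a discriminator scoring real versus generated passwords.

```python
import numpy as np

rng = np.random.default_rng(42)

CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"
MAX_LEN = 10
NOISE_DIM = 32

# Hypothetical generator weights; in a trained GAN these encode the
# distribution of real leaked passwords.
W = rng.normal(size=(NOISE_DIM, MAX_LEN * len(CHARSET))) * 0.1

def generate_password(rng):
    """Map a noise vector to per-position character distributions and sample."""
    z = rng.normal(size=NOISE_DIM)
    logits = (z @ W).reshape(MAX_LEN, len(CHARSET))
    probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
    return "".join(CHARSET[rng.choice(len(CHARSET), p=p)] for p in probs)

guesses = [generate_password(rng) for _ in range(5)]
print(guesses)
```

Because sampling is cheap, millions of such guesses can be generated and combined with rule-based tool output, which is how the paper reports matching more passwords than HashCat alone.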
Photo Enhancement On Mobile Devices Using Deep Neural Networks
In recent years, the renewed use of Artificial Neural Networks has led to great
improvements in the field of Artificial Intelligence, owing to the wide range of
applications that deep learning models have across many research fields and to
the growing capacity of information processing systems. This thesis studies
which deep neural network models are most suitable for photo enhancement, that
is, generating images with certain desired characteristics.
Model selection was carried out by comparing both supervised models
(Convolutional Neural Networks) and unsupervised models (Generative Adversarial
Networks). Generative Adversarial Networks were shown to have great potential,
producing results that compete with the state of the art. The chosen model is a
Generative Adversarial model that outperforms the rest in terms of a combination
of enhancement quality and processing time. Moreover, since the model is
compatible with mobile devices, it was integrated and evaluated on a BQ
smartphone to prove its viability on mobile devices.

(Double Degree in Computer Engineering and Business Administration)
Cali-Sketch: Stroke Calibration and Completion for High-Quality Face Image Generation from Poorly-Drawn Sketches
The image generation task has received increasing attention because of its wide
application in security and entertainment. Sketch-based face generation makes
image generation more engaging and of higher quality thanks to supervised
interaction. However, when a sketch poorly aligned with the true face is given
as input,
existing supervised image-to-image translation methods often cannot generate
acceptable photo-realistic face images. To address this problem, in this paper
we propose Cali-Sketch, a poorly-drawn-sketch to photo-realistic-image
generation method. Cali-Sketch explicitly models stroke calibration and image
generation using two constituent networks: a Stroke Calibration Network (SCN),
which calibrates strokes of facial features and enriches facial details while
preserving the original intent features; and an Image Synthesis Network (ISN),
which translates the calibrated and enriched sketches to photo-realistic face
images. In this way, we manage to decouple a difficult cross-domain translation
problem into two easier steps. Extensive experiments verify that the face
photos generated by Cali-Sketch are both photo-realistic and faithful to the
input sketches, compared with state-of-the-art methods.

Comment: 10 pages, 12 figures
Artificial Intelligence in the Creative Industries: A Review
This paper reviews the current state of the art in Artificial Intelligence
(AI) technologies and applications in the context of the creative industries. A
brief background of AI, and specifically Machine Learning (ML) algorithms, is
provided including Convolutional Neural Network (CNNs), Generative Adversarial
Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement
Learning (DRL). We categorise creative applications into five groups related to
how AI technologies are used: i) content creation, ii) information analysis,
iii) content enhancement and post production workflows, iv) information
extraction and enhancement, and v) data compression. We critically examine the
successes and limitations of this rapidly advancing technology in each of these
areas. We further differentiate between the use of AI as a creative tool and
its potential as a creator in its own right. We foresee that, in the near
future, machine learning-based AI will be adopted widely as a tool or
collaborative assistant for creativity. In contrast, we observe that the
successes of machine learning in domains with fewer constraints, where AI is
the `creator', remain modest. The potential of AI (or its developers) to win
awards for its original creations in competition with human creatives is also
limited, based on contemporary technologies. We therefore conclude that, in the
context of creative industries, maximum benefit from AI will be derived where
its focus is human centric -- where it is designed to augment, rather than
replace, human creativity.