Sketch-a-Net that Beats Humans
We propose a multi-scale multi-channel deep neural network framework that,
for the first time, yields sketch recognition performance surpassing that of
humans. Our superior performance is a result of explicitly embedding the unique
characteristics of sketches in our model: (i) a network architecture designed
for sketch rather than natural photo statistics, (ii) a multi-channel
generalisation that encodes sequential ordering in the sketching process, and
(iii) a multi-scale network ensemble with joint Bayesian fusion that accounts
for the different levels of abstraction exhibited in free-hand sketches. We
show that state-of-the-art deep networks specifically engineered for photos of
natural objects fail to perform well on sketch recognition, regardless of
whether they are trained on photos or sketches. Our network, on the other hand,
not only delivers the best performance on the largest human sketch dataset to
date, but is also small in size, making efficient training possible using just CPUs.
Comment: Accepted to BMVC 2015 (oral)
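The multi-channel idea above (encoding sequential stroke ordering in the input) can be illustrated with a minimal sketch: render the drawing into several channels, where later channels contain only progressively earlier subsets of the strokes. The exact channel scheme here is an illustrative assumption, not the paper's precise construction.

```python
import numpy as np

def stroke_order_channels(strokes, size=64, n_channels=3):
    """Rasterize a sketch into channels that encode stroke order.

    Channel 0 sees all strokes; each later channel sees a smaller
    prefix of the stroke sequence, so the network can observe which
    parts were drawn first (a hypothetical variant of the paper's
    multi-channel input).

    strokes: list of (N_i, 2) integer arrays of (x, y) pixel coords.
    Returns a (n_channels, size, size) float32 array.
    """
    n = len(strokes)
    img = np.zeros((n_channels, size, size), dtype=np.float32)
    for c in range(n_channels):
        # channel c keeps the first ceil(n * (n_channels - c) / n_channels) strokes
        keep = int(np.ceil(n * (n_channels - c) / n_channels))
        for stroke in strokes[:keep]:
            for x, y in stroke:
                img[c, y, x] = 1.0
    return img
```

A CNN consuming this tensor would then see both the finished drawing and its temporal structure, without any change to the convolutional machinery itself.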
SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis
Synthesizing realistic images from human drawn sketches is a challenging
problem in computer graphics and vision. Existing approaches either need exact
edge maps, or rely on retrieval of existing photographs. In this work, we
propose a novel Generative Adversarial Network (GAN) approach that synthesizes
plausible images from 50 categories including motorcycles, horses and couches.
We demonstrate a data augmentation technique for sketches which is fully
automatic, and we show that the augmented data is helpful to our task. We
introduce a new network building block suitable for both the generator and
discriminator which improves the information flow by injecting the input image
at multiple scales. Compared to state-of-the-art image translation methods, our
approach generates more realistic images and achieves significantly higher
Inception Scores.
Comment: Accepted to CVPR 201
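The building block described above injects the input image at multiple scales of both generator and discriminator. A minimal numpy sketch of that idea, with pooling choice and function names as illustrative assumptions rather than the paper's exact block:

```python
import numpy as np

def inject_input(features, sketch):
    """Concatenate a downsampled copy of the input sketch onto an
    intermediate feature map, mimicking (in spirit) multi-scale
    input injection.

    features: (C, H, W) feature map at some intermediate scale.
    sketch:   (1, H0, W0) input image, with H0, W0 integer
              multiples of H, W.
    Returns a (C + 1, H, W) array.
    """
    c, h, w = features.shape
    _, h0, w0 = sketch.shape
    fy, fx = h0 // h, w0 // w
    # average-pool the sketch down to the feature map's resolution
    pooled = sketch.reshape(1, h, fy, w, fx).mean(axis=(2, 4))
    return np.concatenate([features, pooled], axis=0)
```

Re-injecting the raw input at every scale gives later layers a shortcut to the original sketch, which is one way such a block can improve information flow.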
Cali-Sketch: Stroke Calibration and Completion for High-Quality Face Image Generation from Poorly-Drawn Sketches
The image generation task has received increasing attention because of its wide
applications in security and entertainment. Sketch-based face generation makes
the interaction more engaging and yields better image quality through
supervision. However, when a sketch that is poorly aligned with the true face
is given as input, existing supervised image-to-image translation methods often
fail to generate acceptable photo-realistic face images. To address this problem, in this paper
we propose Cali-Sketch, a poorly-drawn-sketch to photo-realistic-image
generation method. Cali-Sketch explicitly models stroke calibration and image
generation using two constituent networks: a Stroke Calibration Network (SCN),
which calibrates strokes of facial features and enriches facial details while
preserving the original intent features; and an Image Synthesis Network (ISN),
which translates the calibrated and enriched sketches to photo-realistic face
images. In this way, we manage to decouple a difficult cross-domain translation
problem into two easier steps. Extensive experiments verify that the face
photos generated by Cali-Sketch are both photo-realistic and faithful to the
input sketches, compared with state-of-the-art methods.
Comment: 10 pages, 12 figures
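The decoupling described above reduces to a simple two-stage composition: calibrate first, synthesize second. The sketch below is a hypothetical illustration of that data flow only; the `scn` and `isn` stand-ins below are trivial placeholders for the trained networks, not the paper's models.

```python
import numpy as np

def cali_sketch_pipeline(sketch, scn, isn):
    """Two-stage pipeline: a stroke calibration stage refines the raw
    sketch, then an image synthesis stage maps the calibrated sketch
    to a photo-realistic image."""
    calibrated = scn(sketch)
    return isn(calibrated)

# illustrative stand-ins for the two trained networks
scn = lambda s: s.clip(0.0, 1.0)        # "calibrate" stroke intensities
isn = lambda s: np.stack([s, s, s])     # map 1-channel sketch to 3-channel image
```

Splitting the problem this way lets each stage be trained on an easier sub-task (sketch-to-sketch, then sketch-to-photo) instead of one hard cross-domain mapping.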
Stroke-based sketched symbol reconstruction and segmentation
Hand-drawn objects usually consist of multiple semantically meaningful parts.
For example, a stick figure consists of a head, a torso, and pairs of legs and
arms. Efficient and accurate identification of these subparts promises to
significantly improve algorithms for stylization, deformation, morphing and
animation of 2D drawings. In this paper, we propose a neural network model that
segments symbols into stroke-level components. Our segmentation framework has
two main elements: a fixed feature extractor and a Multilayer Perceptron (MLP)
network that identifies a component based on the feature. As the feature
extractor we utilize an encoder of a stroke-rnn, which is our newly proposed
generative Variational Auto-Encoder (VAE) model that reconstructs symbols on a
stroke by stroke basis. Experiments show that a single encoder can be reused
for segmenting multiple categories of sketched symbols with a negligible effect
on segmentation accuracy. Our segmentation scores surpass existing
methodologies on a small, currently available state-of-the-art dataset.
Moreover, extensive evaluations on our newly annotated large dataset
demonstrate that our framework obtains significantly better accuracies than
baseline models. We release the dataset to the community.
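The framework above has two parts: a fixed feature extractor applied per stroke, and an MLP head that labels each stroke's feature with a component class. A minimal sketch of that data flow, where the hand-crafted `frozen_encoder` and the random (untrained) MLP weights are illustrative stand-ins for the paper's pretrained stroke-rnn encoder and trained head:

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_encoder(stroke):
    """Stand-in for the pretrained stroke encoder: maps an (N, 2)
    stroke to a fixed-length feature (here a simple hand-crafted
    summary; the paper uses a learned VAE encoder)."""
    xy = np.asarray(stroke, dtype=np.float64)
    return np.concatenate([xy.mean(0), xy.std(0), [len(xy)]])

class MLP:
    """Minimal one-hidden-layer MLP head that assigns a component
    class to a stroke feature (weights are random, i.e. untrained,
    purely to show the data flow)."""
    def __init__(self, d_in, d_hidden, n_classes):
        self.w1 = rng.normal(size=(d_in, d_hidden))
        self.w2 = rng.normal(size=(d_hidden, n_classes))

    def predict(self, feat):
        h = np.maximum(feat @ self.w1, 0.0)   # ReLU hidden layer
        return int(np.argmax(h @ self.w2))    # most likely component

def segment(strokes, mlp):
    """Segment a symbol by classifying each stroke independently."""
    return [mlp.predict(frozen_encoder(s)) for s in strokes]
```

Because only the small MLP head is category-specific, the same frozen encoder can serve many symbol categories, which mirrors the reuse result reported in the abstract.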