SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis
Synthesizing realistic images from human-drawn sketches is a challenging
problem in computer graphics and vision. Existing approaches either need exact
edge maps, or rely on retrieval of existing photographs. In this work, we
propose a novel Generative Adversarial Network (GAN) approach that synthesizes
plausible images from 50 categories including motorcycles, horses and couches.
We demonstrate a data augmentation technique for sketches which is fully
automatic, and we show that the augmented data is helpful to our task. We
introduce a new network building block suitable for both the generator and
discriminator which improves the information flow by injecting the input image
at multiple scales. Compared to state-of-the-art image translation methods, our
approach generates more realistic images and achieves significantly higher
Inception Scores.
Comment: Accepted to CVPR 2018.
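The building block described above feeds the input image into every scale of the network. A minimal NumPy sketch of that injection idea, assuming simple 2x average pooling for resizing and channel-wise concatenation (function names and shapes are illustrative, not the paper's exact block):

```python
import numpy as np

def avg_pool2x2(img):
    """Downsample an HxWxC image by 2x via average pooling."""
    h, w, c = img.shape
    return img.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def inject_input(features, image):
    """Resize the input image to the feature map's spatial size
    (by repeated 2x average pooling) and concatenate it along the
    channel axis, so this scale of the network sees the input directly."""
    scaled = image
    while scaled.shape[0] > features.shape[0]:
        scaled = avg_pool2x2(scaled)
    return np.concatenate([features, scaled], axis=-1)

image = np.random.rand(64, 64, 3)   # input sketch, 3 channels
feat = np.random.rand(16, 16, 32)   # an intermediate feature map
out = inject_input(feat, image)
print(out.shape)  # (16, 16, 35)
```

Repeating this at each resolution is what gives the generator and discriminator a shortcut to the input at multiple scales.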
Network Sketching: Exploiting Binary Structure in Deep CNNs
Convolutional neural networks (CNNs) with deep architectures have
substantially advanced the state-of-the-art in computer vision tasks. However,
deep networks are typically resource-intensive and thus difficult to deploy
on mobile devices. Recently, CNNs with binary weights have shown compelling
efficiency, but the accuracy of such models is usually unsatisfactory in
practice. In this paper, we introduce network sketching as a novel technique
for pursuing binary-weight CNNs, targeting more faithful inference and a
better accuracy-efficiency trade-off in practical applications. Our
basic idea is to exploit binary structure directly in pre-trained filter banks
and produce binary-weight models via tensor expansion. The whole process can be
treated as a coarse-to-fine model approximation, akin to the pencil drawing
steps of outlining and shading. To further speed up the generated models, namely
the sketches, we also propose an associative implementation of binary tensor
convolutions. Experimental results demonstrate that a proper sketch of AlexNet
(or ResNet) outperforms the existing binary-weight models by large margins on
the ImageNet large-scale classification task, while requiring only slightly
more memory for the network parameters.
Comment: To appear in CVPR 2017.
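The coarse-to-fine approximation described above can be sketched as a greedy expansion of a pre-trained filter into scaled binary tensors, w ≈ sum_j a_j * B_j with B_j in {-1, +1}: each step binarizes the current residual and fits a least-squares scale. This is an illustrative simplification, not the paper's exact algorithm:

```python
import numpy as np

def sketch_filter(w, m=3):
    """Greedily approximate a real-valued filter tensor w with a sum of
    m scaled binary tensors. Each step fits the current residual, so the
    expansion is coarse-to-fine, like outlining and then shading."""
    residual = w.copy()
    terms = []
    for _ in range(m):
        b = np.sign(residual)
        b[b == 0] = 1.0                # break ties toward +1
        a = np.abs(residual).mean()    # least-squares scale for B = sign(R)
        terms.append((a, b))
        residual = residual - a * b
    return terms

rng = np.random.default_rng(0)
w = rng.standard_normal((3, 3))
approx = sum(a * b for a, b in sketch_filter(w, m=4))
err = np.linalg.norm(w - approx) / np.linalg.norm(w)
print(err)  # relative error; shrinks as m grows
```

At inference time each binary term costs only sign flips and additions, which is what makes such sketches attractive on constrained hardware.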
Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation
While representation learning aims to derive interpretable features for
describing visual data, representation disentanglement goes further, producing
features in which particular image attributes can be identified and manipulated.
However, one cannot easily address this task without observing ground truth
annotation for the training data. To address this problem, we propose a novel
deep learning model of Cross-Domain Representation Disentangler (CDRD). By
observing fully annotated source-domain data and unlabeled target-domain data
of interest, our model bridges the information across data domains and
transfers the attribute information accordingly. Thus, cross-domain feature
disentanglement and adaptation can be performed jointly. In the
experiments, we provide qualitative results to verify our disentanglement
capability. Moreover, we further confirm that our model can be applied for
solving classification tasks of unsupervised domain adaptation, and performs
favorably against state-of-the-art image disentanglement and translation
methods.
Comment: CVPR 2018 Spotlight.
Adding New Tasks to a Single Network with Weight Transformations using Binary Masks
Visual recognition algorithms are required today to exhibit adaptive
abilities. Given a deep model trained on a specific, given task, it would be
highly desirable to be able to adapt incrementally to new tasks, preserving
scalability as the number of new tasks increases, while at the same time
avoiding catastrophic forgetting issues. Recent work has shown that masking the
internal weights of a given original conv-net through learned binary variables
is a promising strategy. We build on this intuition and consider more
elaborate affine transformations of the convolutional weights that
incorporate learned binary masks. We show that with our generalization it is
possible to achieve significantly higher levels of adaptation to new tasks,
enabling the approach to compete with fine-tuning strategies while requiring
slightly more than 1 bit per network parameter per additional task. Experiments
on two popular benchmarks showcase the power of our approach, which achieves a
new state of the art on the Visual Decathlon Challenge.
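One plausible form of the affine generalization described above: instead of pure masking (w0 * mask), the task-specific filter mixes the frozen backbone weights with the learned binary mask through per-task scalars. The function name and the exact parameterization below are assumptions for illustration:

```python
import numpy as np

def adapt_weights(w0, mask, k0, k1, k2):
    """Task-specific weights from a frozen backbone filter w0 via an
    affine transformation involving a learned binary mask (entries in
    {0, 1}) and per-task scalars k0, k1, k2. Only the mask (~1 bit per
    weight) and the scalars need to be stored per task."""
    return k0 * w0 + k1 * (w0 * mask) + k2 * mask

rng = np.random.default_rng(0)
w0 = rng.standard_normal((3, 3))
mask = (rng.random((3, 3)) > 0.5).astype(float)

# plain binary masking is the special case k0=0, k1=1, k2=0
masked = adapt_weights(w0, mask, 0.0, 1.0, 0.0)
print(np.allclose(masked, w0 * mask))  # True
```

Because pure masking is recovered as a special case, the affine variant can only broaden the set of reachable task-specific filters, which is consistent with the higher adaptation levels reported.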
Quantized Compressive K-Means
The recent framework of compressive statistical learning aims at designing
tractable learning algorithms that use only a heavily compressed
representation, or sketch, of massive datasets. Compressive K-Means (CKM) is such
a method: it estimates the centroids of data clusters from pooled, non-linear,
random signatures of the learning examples. While this approach significantly
reduces computational time on very large datasets, its digital implementation
wastes acquisition resources because the learning examples are compressed only
after the sensing stage. The present work generalizes the sketching procedure
initially defined in Compressive K-Means to a large class of periodic
nonlinearities including hardware-friendly implementations that compressively
acquire entire datasets. This idea is exemplified in a Quantized Compressive
K-Means procedure, a variant of CKM that leverages 1-bit universal quantization
(i.e. retaining the least significant bit of a standard uniform quantizer) as
the periodic sketch nonlinearity. Trading in this resource-efficient signature
(standard in most acquisition schemes) has almost no impact on clustering
performance, as illustrated by numerical experiments.
End-to-End Photo-Sketch Generation via Fully Convolutional Representation Learning
Sketch-based face recognition is an interesting task in vision and multimedia
research, yet it is quite challenging due to the great difference between face
photos and sketches. In this paper, we propose a novel approach for
photo-sketch generation, aiming to automatically transform face photos into
detail-preserving personal sketches. Unlike the traditional models synthesizing
sketches based on a dictionary of exemplars, we develop a fully convolutional
network to learn the end-to-end photo-sketch mapping. Our approach takes whole
face photos as inputs and directly generates the corresponding sketch images
with efficient inference and learning; the architecture is built solely from
convolutional kernels of very small size. To preserve person identity during
the photo-sketch transformation, we define our optimization
objective in the form of joint generative-discriminative minimization. In
particular, a discriminative regularization term is incorporated into the
photo-sketch generation, enhancing the discriminability of the generated person
sketches against other individuals. Extensive experiments on several standard
benchmarks suggest that our approach outperforms other state-of-the-art methods
in both photo-sketch generation and face sketch verification.
Comment: 8 pages, 6 figures. Proceedings of the ACM International Conference
on Multimedia Retrieval (ICMR), 2015.
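The joint generative-discriminative objective described above can be sketched as a reconstruction term plus a regularizer that pushes the generated sketch away from other identities. The loss form, weights, and names below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def joint_loss(pred, target, others, lam=0.1):
    """Toy joint objective: a generative term pulls the generated
    sketch toward the ground-truth sketch of the same person, while a
    discriminative regularizer (weighted by lam) pushes it away from
    sketches of other identities, enhancing discriminability."""
    gen = np.mean((pred - target) ** 2)
    disc = np.mean([np.mean((pred - o) ** 2) for o in others])
    return gen - lam * disc

rng = np.random.default_rng(0)
target = rng.random((32, 32))                    # same-person sketch
others = [rng.random((32, 32)) for _ in range(3)]  # other identities
noisy = rng.random((32, 32))                     # a poor generation

# a generation matching the target scores lower than a random one
print(joint_loss(target, target, others) < joint_loss(noisy, target, others))
```

Minimizing such an objective trades off faithfulness to the subject against confusability with other people, which is why it helps the verification task as well as generation quality.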