Adversarial sketch-photo transformation for enhanced face recognition accuracy: a systematic analysis and evaluation
This research presents a strategy for enhancing the accuracy of face sketch identification through adversarial sketch-photo transformation. The approach uses a generative adversarial network (GAN) to learn to convert sketches into photographs, which can subsequently be used to improve face sketch identification accuracy. The proposed method is evaluated against state-of-the-art face sketch recognition and synthesis techniques, such as SketchyGAN, similarity-preserving GAN (SPGAN), and super-resolution GAN (SRGAN). Possible application domains for the proposed adversarial sketch-photo transformation include law enforcement, where reliable face sketch recognition is essential for identifying suspects. The approach can also be generalized to other contexts, such as creating artistic photographs from drawings or converting pictures between modalities. The proposed method outperforms the state-of-the-art face sketch recognition and synthesis techniques, confirming the usefulness of adversarial learning in this setting. Our method is highly efficient for photo-sketch synthesis, achieving a structural similarity index (SSIM) of 0.65 on The Chinese University of Hong Kong dataset and 0.70 on the custom-generated dataset.
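The abstract reports SSIM scores for synthesized photos. As a point of reference, a minimal single-window SSIM (the standard formula applied globally rather than with the usual sliding Gaussian window, so the numbers will differ from published evaluations) can be sketched as:

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Simplified single-window SSIM between two grayscale images.

    Published evaluations (as in the abstract) use a sliding window;
    this global variant only illustrates the formula itself.
    """
    c1 = (0.01 * data_range) ** 2          # stabilizing constants
    c2 = (0.03 * data_range) ** 2
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
```

Identical images score 1.0; any distortion lowers the score toward 0.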
Human-Inspired Facial Sketch Synthesis with Dynamic Adaptation
Facial sketch synthesis (FSS) aims to generate a vivid sketch portrait from a
given facial photo. Existing FSS methods merely rely on 2D representations of
facial semantic or appearance. However, professional human artists usually use
outlines or shadings to convey 3D geometry. Thus facial 3D geometry (e.g. depth
map) is extremely important for FSS. Besides, different artists may use diverse
drawing techniques and create multiple styles of sketches; but the style is
globally consistent in a sketch. Inspired by such observations, in this paper,
we propose a novel Human-Inspired Dynamic Adaptation (HIDA) method. Specifically,
we propose to dynamically modulate neuron activations based on a joint
consideration of both facial 3D geometry and 2D appearance, as well as globally
consistent style control. Besides, we use deformable convolutions at
coarse-scales to align deep features, for generating abstract and distinct
outlines. Experiments show that HIDA can generate high-quality sketches in
multiple styles, and significantly outperforms previous methods, over a large
range of challenging faces. Besides, HIDA allows precise style control of the
synthesized sketch, and generalizes well to natural scenes and other artistic
styles. Our code and results have been released online at:
https://github.com/AiArt-HDU/HIDA.
Comment: To appear at ICCV'23.
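The core idea of dynamically modulating activations on a joint condition of 3D geometry, 2D appearance, and a globally consistent style code can be illustrated with a toy layer. This is a hypothetical sketch of the idea, not the paper's exact operator; all names and shapes here are assumptions:

```python
import numpy as np

def dynamic_modulation(feat, depth_feat, style, W_scale, W_shift):
    """Toy HIDA-style dynamic adaptation: per-channel scale/shift for
    `feat` predicted from a geometry descriptor and a style code that is
    shared globally over the whole sketch (illustrative only).

    feat:       (C, H, W) activations to modulate
    depth_feat: (D,) pooled 3D-geometry descriptor
    style:      (S,) global style code, constant across the image
    W_scale, W_shift: (C, D + S) linear heads predicting the modulation
    """
    cond = np.concatenate([depth_feat, style])   # joint 3D + style condition
    scale = 1.0 + np.tanh(W_scale @ cond)        # per-channel gain around 1
    shift = W_shift @ cond                       # per-channel bias
    return feat * scale[:, None, None] + shift[:, None, None]
```

With a zero condition the layer reduces to the identity, so the modulation only acts where geometry or style information is present.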
Semi-supervised Cycle-GAN for face photo-sketch translation in the wild
The performance of face photo-sketch translation has improved a lot thanks to
deep neural networks. GAN based methods trained on paired images can produce
high-quality results under laboratory settings. Such paired datasets are,
however, often very small and lack diversity. Meanwhile, Cycle-GANs trained
with unpaired photo-sketch datasets suffer from the "steganography"
phenomenon, which makes them ineffective on face photos in the wild. In this
paper, we introduce a semi-supervised approach with a noise-injection strategy,
named Semi-Cycle-GAN (SCG), to tackle these problems. For the first problem, we
propose a "pseudo sketch feature" representation for each input photo,
composed from a small reference set of photo-sketch pairs, and use the
resulting "pseudo pairs" to supervise a photo-to-sketch generator. The
outputs of the photo-to-sketch generator can in turn help to train a
sketch-to-photo generator in a self-supervised manner. This allows us to
train both generators using a small reference set of photo-sketch pairs
together with a large face photo dataset (without ground-truth sketches). For
the second problem, we show that the simple noise-injection strategy works well
to alleviate the "steganography" effect in SCG and helps to produce more
reasonable sketch-to-photo results with less overfitting than fully supervised
approaches. Experiments show that SCG achieves competitive performance on
public benchmarks and superior results on photos in the wild.
Comment: 11 pages, 11 figures, 5 tables (+ 7-page appendix).
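The "pseudo sketch feature" idea, as described, composes a sketch-like target for an unlabeled photo from a small reference set of photo-sketch pairs. A hedged toy version: match each photo patch to its nearest reference photo patch and copy the corresponding sketch patch (the real SCG matches deep features, not raw pixels; patch matching on pixels is used here only to make the idea concrete):

```python
import numpy as np

def pseudo_sketch(photo, ref_photos, ref_sketches, patch=4):
    """Toy pseudo-sketch construction via nearest-neighbour patch matching
    against a small reference set of photo-sketch pairs (illustrative
    pixel-space stand-in for SCG's feature-space matching)."""
    h, w = photo.shape
    out = np.zeros_like(photo, dtype=np.float64)
    # Flatten all reference photo patches once, keeping paired sketch patches.
    rp, rs = [], []
    for p, s in zip(ref_photos, ref_sketches):
        for i in range(0, h - patch + 1, patch):
            for j in range(0, w - patch + 1, patch):
                rp.append(p[i:i + patch, j:j + patch].ravel())
                rs.append(s[i:i + patch, j:j + patch])
    rp = np.stack(rp)
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            q = photo[i:i + patch, j:j + patch].ravel()
            k = np.argmin(((rp - q) ** 2).sum(axis=1))   # nearest ref patch
            out[i:i + patch, j:j + patch] = rs[k]        # paired sketch patch
    return out
```

The composed output then serves as a weak supervision target for the photo-to-sketch generator, which is what lets a small paired set supervise a large unpaired photo collection.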
Domain Generalization in Vision: A Survey
Generalization to out-of-distribution (OOD) data is a capability natural to
humans yet challenging for machines to reproduce. This is because most learning
algorithms strongly rely on the i.i.d.~assumption on source/target data, which
is often violated in practice due to domain shift. Domain generalization (DG)
aims to achieve OOD generalization by using only source data for model
learning. Since first introduced in 2011, research in DG has made great
progress. In particular, intensive research in this topic has led to a broad
spectrum of methodologies, e.g., those based on domain alignment,
meta-learning, data augmentation, or ensemble learning, just to name a few; and
has covered various vision applications such as object recognition,
segmentation, action recognition, and person re-identification. In this paper,
for the first time a comprehensive literature review is provided to summarize
the developments in DG for computer vision over the past decade. Specifically,
we first cover the background by formally defining DG and relating it to other
research fields like domain adaptation and transfer learning. Second, we
conduct a thorough review into existing methods and present a categorization
based on their methodologies and motivations. Finally, we conclude this survey
with insights and discussions on future research directions.
Comment: v4 includes the word "vision" in the title; improves the
organization and clarity in Sections 2-3; adds future directions; and more.
DREAM: Domain-free Reverse Engineering Attributes of Black-box Model
Deep learning models are usually black boxes when deployed on machine
learning platforms. Prior works have shown that the attributes (e.g., the
number of convolutional layers) of a target black-box neural network can be
exposed through a sequence of queries. There is a crucial limitation: these
works assume the dataset used for training the target model to be known
beforehand and leverage this dataset for model attribute attack. However, it is
difficult to access the training dataset of the target black-box model in
reality. Therefore, whether the attributes of a target black-box model can
still be revealed in this case is doubtful. In this paper, we investigate a new
problem of Domain-agnostic Reverse Engineering the Attributes of a black-box
target Model, called DREAM, without requiring the availability of the target
model's training dataset, and put forward a general and principled framework by
casting this problem as an out-of-distribution (OOD) generalization problem. In
this way, we can learn a domain-agnostic model to inversely infer the
attributes of a target black-box model with unknown training data. This makes
our method one of the few that can be gracefully applied to an arbitrary domain
for model attribute reverse engineering with strong generalization ability.
Extensive experimental studies are conducted and the results validate the
superiority of our proposed method over the baselines.
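The query-based attack setting described above can be sketched very simply: probe the black box with a fixed set of inputs, treat the concatenated outputs as a "signature," and infer attributes from that signature. The nearest-centroid meta-classifier below is a hypothetical, simplified stand-in for DREAM's learned, domain-agnostic inference; all names here are assumptions:

```python
import numpy as np

def attribute_signature(black_box, queries):
    """Concatenate the black-box model's outputs on a fixed query set into
    one feature vector, as in query-based attribute reverse engineering."""
    return np.concatenate([np.atleast_1d(black_box(q)) for q in queries])

def predict_attribute(signature, centroids):
    """Nearest-centroid guess of a model attribute (e.g. depth) from its
    query signature; `centroids` maps attribute value -> mean signature."""
    labels = list(centroids)
    dists = [np.linalg.norm(signature - centroids[k]) for k in labels]
    return labels[int(np.argmin(dists))]
```

The point DREAM makes is that the mapping from signatures to attributes should generalize across data domains, which this toy classifier does not attempt; it only shows the query-signature mechanics.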
The Future Role of Strategic Landpower
Recent Russian aggression in Ukraine has reenergized military strategists and senior leaders to evaluate the role of strategic Landpower. American leadership in the European theater has mobilized allies and partners to reconsider force postures for responding to possible aggression against NATO members. Although Russian revisionist activity remains a threat in Europe, the challenges in the Pacific for strategic Landpower must also be considered. At the same time, the homeland, the Arctic, climate change, and the results of new and emerging technology also challenge the application of strategic Landpower. This publication serves as part of an enduring effort to evaluate strategic Landpower’s role, authorities, and resources for accomplishing the national strategic goals the Joint Force may face in the next conflict. This study considers multinational partners, allies, and senior leaders who can contribute to overcoming these enduring challenges. The insights derived from this study, which can be applied to both the European and Indo-Pacific theaters, should help leaders to consider these challenges, which may last a generation. Deterrence demands credible strategic response options integrated across warfighting functions. This valuable edition will continue the dialogue about addressing these issues as well as other emerging ones.
Multiscale Mesh Deformation Component Analysis with Attention-based Autoencoders
Deformation component analysis is a fundamental problem in geometry
processing and shape understanding. Existing approaches mainly extract
deformation components in local regions at a similar scale while deformations
of real-world objects are usually distributed in a multi-scale manner. In this
paper, we propose a novel method to extract multiscale deformation components
automatically with a stacked attention-based autoencoder. The attention
mechanism is designed to learn to softly weight multi-scale deformation
components in active deformation regions, and the stacked attention-based
autoencoder is learned to represent the deformation components at different
scales. Quantitative and qualitative evaluations show that our method
outperforms state-of-the-art methods. Furthermore, with the multiscale
deformation components extracted by our method, the user can edit shapes in a
coarse-to-fine fashion, which facilitates effective modeling of new shapes.
Comment: 15 pages.
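The underlying representation, a shape offset by a softly weighted sum of deformation components, can be made concrete with a toy example. In the paper the weights come from a stacked attention-based autoencoder; here they come from given logits, purely for illustration:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def apply_components(base, components, logits):
    """Toy attention-weighted deformation: a mesh of N vertices (N, 3) is
    offset by a soft combination of K displacement fields (K, N, 3).
    The attention weights are softmax(logits) over the K components."""
    w = softmax(logits)                       # soft weights over components
    return base + np.einsum('k,knd->nd', w, components)
```

Editing a shape coarse-to-fine then amounts to adjusting the weights of large-scale components first and fine-scale components afterwards.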
Doctor of Philosophy dissertation
Machine learning is the science of building predictive models from data that automatically improve based on past experience. To learn these models, traditional learning algorithms require labeled data. They also require that the entire dataset fit in the memory of a single machine. Labeled data are available or can be acquired for small and moderately sized datasets, but curating large datasets can be prohibitively expensive. Similarly, massive datasets are usually too large to fit into the memory of a single machine. An alternative is to distribute the dataset over multiple machines. Distributed learning, however, poses new challenges, as most existing machine learning techniques are inherently sequential. Additionally, these distributed approaches have to be designed keeping in mind various resource limitations of real-world settings, prime among them being inter-machine communication. With the advent of big datasets, machine learning algorithms face new challenges. Their design is no longer limited to minimizing some loss function but additionally needs to consider other resources that are critical when learning at scale. In this thesis, we explore different models and measures for learning with limited resources under a budget. What budgetary constraints are posed by modern datasets? Can we reuse or combine existing machine learning paradigms to address these challenges at scale? How do the cost metrics change when we shift to distributed models for learning? These are some of the questions investigated in this thesis. The answers to these questions hold the key to addressing some of the challenges faced when learning on massive datasets. In the first part of this thesis, we present three different budgeted scenarios that deal with scarcity of labeled data and limited computational resources. The goal is to leverage transfer of information from related domains to learn under budgetary constraints.
Our proposed techniques comprise semi-supervised transfer, online transfer, and active transfer. In the second part of this thesis, we study distributed learning with limited communication. We present initial sampling-based results, as well as propose communication protocols for learning distributed linear classifiers.
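A standard low-communication baseline for distributed linear classifiers, which the thesis's protocols would be measured against, is one-shot parameter averaging: each machine trains locally and sends a single weight vector to a coordinator. A minimal sketch (the local learner here is plain logistic-regression gradient descent, chosen for simplicity; this is the classic baseline, not the thesis's specific protocols):

```python
import numpy as np

def local_train(X, y, lr=0.1, epochs=50):
    """Logistic-regression weights trained locally on one machine's shard
    (plain gradient descent, standing in for any local learner)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
        w -= lr * X.T @ (p - y) / len(y)     # gradient step
    return w

def one_shot_average(shards):
    """One round of communication: every machine sends its weight vector
    once, and the coordinator simply averages them."""
    return np.mean([local_train(X, y) for X, y in shards], axis=0)
```

The communication cost is one d-dimensional vector per machine, independent of the number of training examples, which is exactly the kind of budget the second part of the thesis studies.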