
    Adversarial sketch-photo transformation for enhanced face recognition accuracy: a systematic analysis and evaluation

    This research presents a strategy for enhancing the precision of face sketch identification through adversarial sketch-photo transformation. The approach uses a generative adversarial network (GAN) to learn to convert sketches into photographs, which can subsequently be used to improve face sketch identification. The suggested method is evaluated against state-of-the-art face sketch recognition and synthesis techniques, such as sketchy GAN, similarity-preserving GAN (SPGAN), and super-resolution GAN (SRGAN). Possible domains of use for the proposed adversarial sketch-photo transformation approach include law enforcement, where reliable face sketch recognition is essential for identifying suspects. The suggested approach can be generalized to other contexts, such as creating creative photographs from drawings or converting pictures between modalities. The suggested method outperforms state-of-the-art face sketch recognition and synthesis techniques, confirming the usefulness of adversarial learning in this context. Our method is highly efficient for photo-sketch synthesis, with a structural similarity index (SSIM) of 0.65 on The Chinese University of Hong Kong dataset and 0.70 on the custom-generated dataset.

    Human-Inspired Facial Sketch Synthesis with Dynamic Adaptation

    Facial sketch synthesis (FSS) aims to generate a vivid sketch portrait from a given facial photo. Existing FSS methods merely rely on 2D representations of facial semantics or appearance. However, professional human artists usually use outlines or shadings to convey 3D geometry. Thus, facial 3D geometry (e.g., a depth map) is extremely important for FSS. Besides, different artists may use diverse drawing techniques and create multiple styles of sketches, but the style is globally consistent within a sketch. Inspired by these observations, in this paper we propose a novel Human-Inspired Dynamic Adaptation (HIDA) method. Specifically, we propose to dynamically modulate neuron activations based on a joint consideration of both facial 3D geometry and 2D appearance, as well as globally consistent style control. In addition, we use deformable convolutions at coarse scales to align deep features, in order to generate abstract and distinct outlines. Experiments show that HIDA can generate high-quality sketches in multiple styles and significantly outperforms previous methods over a large range of challenging faces. Moreover, HIDA allows precise style control of the synthesized sketch and generalizes well to natural scenes and other artistic styles. Our code and results have been released online at: https://github.com/AiArt-HDU/HIDA. Comment: To appear on ICCV'2

    Semi-supervised Cycle-GAN for face photo-sketch translation in the wild

    The performance of face photo-sketch translation has improved a lot thanks to deep neural networks. GAN-based methods trained on paired images can produce high-quality results under laboratory settings. Such paired datasets are, however, often very small and lack diversity. Meanwhile, Cycle-GANs trained with unpaired photo-sketch datasets suffer from the steganography phenomenon, which makes them ineffective on face photos in the wild. In this paper, we introduce a semi-supervised approach with a noise-injection strategy, named Semi-Cycle-GAN (SCG), to tackle these problems. For the first problem, we propose a pseudo sketch feature representation for each input photo, composed from a small reference set of photo-sketch pairs, and use the resulting pseudo pairs to supervise a photo-to-sketch generator $G_{p2s}$. The outputs of $G_{p2s}$ can in turn help to train a sketch-to-photo generator $G_{s2p}$ in a self-supervised manner. This allows us to train $G_{p2s}$ and $G_{s2p}$ using a small reference set of photo-sketch pairs together with a large face photo dataset (without ground-truth sketches). For the second problem, we show that the simple noise-injection strategy works well to alleviate the steganography effect in SCG and helps to produce more reasonable sketch-to-photo results with less overfitting than fully supervised approaches. Experiments show that SCG achieves competitive performance on public benchmarks and superior results on photos in the wild. Comment: 11 pages, 11 figures, 5 tables (+ 7 page appendix

    Domain Generalization in Vision: A Survey

    Generalization to out-of-distribution (OOD) data is a capability natural to humans yet challenging for machines to reproduce. This is because most learning algorithms strongly rely on the i.i.d. assumption on source/target data, which is often violated in practice due to domain shift. Domain generalization (DG) aims to achieve OOD generalization by using only source data for model learning. Since it was first introduced in 2011, research in DG has made great progress. In particular, intensive research on this topic has led to a broad spectrum of methodologies, e.g., those based on domain alignment, meta-learning, data augmentation, or ensemble learning, to name just a few, and has covered various vision applications such as object recognition, segmentation, action recognition, and person re-identification. In this paper, for the first time a comprehensive literature review is provided to summarize the developments in DG for computer vision over the past decade. Specifically, we first cover the background by formally defining DG and relating it to other research fields like domain adaptation and transfer learning. Second, we conduct a thorough review of existing methods and present a categorization based on their methodologies and motivations. Finally, we conclude this survey with insights and discussions on future research directions. Comment: v4: includes the word "vision" in the title; improves the organization and clarity in Section 2-3; adds future directions; and mor

    DREAM: Domain-free Reverse Engineering Attributes of Black-box Model

    Deep learning models are usually black boxes when deployed on machine learning platforms. Prior works have shown that the attributes (e.g., the number of convolutional layers) of a target black-box neural network can be exposed through a sequence of queries. There is a crucial limitation: these works assume that the dataset used for training the target model is known beforehand and leverage this dataset for the model attribute attack. However, it is difficult to access the training dataset of the target black-box model in reality. Therefore, whether the attributes of a target black-box model can still be revealed in this case is doubtful. In this paper, we investigate a new problem of Domain-agnostic Reverse Engineering the Attributes of a black-box target Model, called DREAM, without requiring the availability of the target model's training dataset, and put forward a general and principled framework by casting this problem as an out-of-distribution (OOD) generalization problem. In this way, we can learn a domain-agnostic model to inversely infer the attributes of a target black-box model with unknown training data. This allows our method to be applied gracefully to an arbitrary domain for model attribute reverse engineering, with strong generalization ability. Extensive experimental studies are conducted and the results validate the superiority of our proposed method over the baselines.

    The Future Role of Strategic Landpower

    Recent Russian aggression in Ukraine has reenergized military strategists and senior leaders to evaluate the role of strategic Landpower. American leadership in the European theater has mobilized allies and partners to reconsider force postures for responding to possible aggression against NATO members. Although Russian revisionist activity remains a threat in Europe, the challenges in the Pacific for strategic Landpower must also be considered. At the same time, the homeland, the Arctic, climate change, and the results of new and emerging technology also challenge the application of strategic Landpower. This publication serves as part of an enduring effort to evaluate strategic Landpower’s role, authorities, and resources for accomplishing the national strategic goals the Joint Force may face in the next conflict. This study considers multinational partners, allies, and senior leaders that can contribute to overcoming these enduring challenges. The insights derived from this study, which can be applied to both the European and Indo-Pacific theaters, should help leaders to consider these challenges, which may last a generation. Deterrence demands credible strategic response options integrated across warfighting functions. This valuable edition will continue the dialogue about addressing these issues as well as other emerging ones.

    Multiscale Mesh Deformation Component Analysis with Attention-based Autoencoders

    Deformation component analysis is a fundamental problem in geometry processing and shape understanding. Existing approaches mainly extract deformation components in local regions at a similar scale, while deformations of real-world objects are usually distributed in a multi-scale manner. In this paper, we propose a novel method to extract multiscale deformation components automatically with a stacked attention-based autoencoder. The attention mechanism is designed to learn to softly weight multi-scale deformation components in active deformation regions, and the stacked attention-based autoencoder is learned to represent the deformation components at different scales. Quantitative and qualitative evaluations show that our method outperforms state-of-the-art methods. Furthermore, with the multiscale deformation components extracted by our method, the user can edit shapes in a coarse-to-fine fashion, which facilitates effective modeling of new shapes. Comment: 15 page

    Doctor of Philosophy

    Machine learning is the science of building predictive models from data that automatically improve based on past experience. To learn these models, traditional learning algorithms require labeled data. They also require that the entire dataset fits in the memory of a single machine. Labeled data are available or can be acquired for small and moderately sized datasets, but curating large datasets can be prohibitively expensive. Similarly, massive datasets are usually too large to fit into the memory of a single machine. An alternative is to distribute the dataset over multiple machines. Distributed learning, however, poses new challenges, as most existing machine learning techniques are inherently sequential. Additionally, these distributed approaches have to be designed keeping in mind various resource limitations of real-world settings, prime among them being intermachine communication. With the advent of big datasets, machine learning algorithms are facing new challenges. Their design is no longer limited to minimizing some loss function but, additionally, needs to consider other resources that are critical when learning at scale. In this thesis, we explore different models and measures for learning with limited resources that have a budget. What budgetary constraints are posed by modern datasets? Can we reuse or combine existing machine learning paradigms to address these challenges at scale? How do the cost metrics change when we shift to distributed models for learning? These are some of the questions that have been investigated in this thesis. The answers to these questions hold the key to addressing some of the challenges faced when learning on massive datasets. In the first part of this thesis, we present three different budgeted scenarios that deal with scarcity of labeled data and limited computational resources. The goal is to leverage transfer information from related domains to learn under budgetary constraints. Our proposed techniques comprise semisupervised transfer, online transfer, and active transfer. In the second part of this thesis, we study distributed learning with limited communication. We present initial sampling-based results and propose communication protocols for learning distributed linear classifiers.