177 research outputs found
Revisiting Discriminative vs. Generative Classifiers: Theory and Implications
A large-scale deep model pre-trained on massive labeled or unlabeled data
transfers well to downstream tasks. Linear evaluation freezes parameters in the
pre-trained model and trains a linear classifier separately, which is efficient
and attractive for transfer. However, little work has investigated the
classifier in linear evaluation except for the default logistic regression.
Inspired by the statistical efficiency of naive Bayes, the paper revisits the
classical topic on discriminative vs. generative classifiers. Theoretically,
the paper considers the surrogate loss instead of the zero-one loss in analyses
and generalizes the classical results from binary cases to multiclass ones. We
show that, under mild assumptions, multiclass naive Bayes requires
samples to approach its asymptotic error while the corresponding multiclass
logistic regression requires samples, where is the feature
dimension. To establish it, we present a multiclass -consistency
bound framework and an explicit bound for logistic loss, which are of
independent interests. Simulation results on a mixture of Gaussian validate our
theoretical findings. Experiments on various pre-trained deep vision models
show that naive Bayes consistently converges faster as the number of data
increases. Besides, naive Bayes shows promise in few-shot cases and we observe
the "two regimes" phenomenon in pre-trained supervised models. Our code is
available at https://github.com/ML-GSAI/Revisiting-Dis-vs-Gen-Classifiers.Comment: Accepted by ICML 2023, 58 page
An Analysis on Adversarial Machine Learning: Methods and Applications
Deep learning has witnessed astonishing advancement in the last decade and revolutionized many fields ranging from computer vision to natural language processing. A prominent field of research that enabled such achievements is adversarial learning, investigating the behavior and functionality of a learning model in presence of an adversary. Adversarial learning consists of two major trends. The first trend analyzes the susceptibility of machine learning models to manipulation in the decision-making process and aims to improve the robustness to such manipulations. The second trend exploits adversarial games between components of the model to enhance the learning process. This dissertation aims to provide an analysis on these two sides of adversarial learning and harness their potential for improving the robustness and generalization of deep models.
In the first part of the dissertation, we study the adversarial susceptibility of deep learning models. We provide an empirical analysis on the extent of vulnerability by proposing two adversarial attacks that explore the geometric and frequency-domain characteristics of inputs to manipulate deep decisions. Afterward, we formalize the susceptibility of deep networks using the first-order approximation of the predictions and extend the theory to the ensemble classification scheme. Inspired by theoretical findings, we formalize a reliable and practical defense against adversarial examples to robustify ensembles. We extend this part by investigating the shortcomings of \gls{at} and highlight that the popular momentum stochastic gradient descent, developed essentially for natural training, is not proper for optimization in adversarial training since it is not designed to be robust against the chaotic behavior of gradients in this setup. Motivated by these observations, we develop an optimization method that is more suitable for adversarial training. In the second part of the dissertation, we harness adversarial learning to enhance the generalization and performance of deep networks in discriminative and generative tasks. We develop several models for biometric identification including fingerprint distortion rectification and latent fingerprint reconstruction. In particular, we develop a ridge reconstruction model based on generative adversarial networks that estimates the missing ridge information in latent fingerprints. We introduce a novel modification that enables the generator network to preserve the ID information during the reconstruction process. To address the scarcity of data, {\it e.g.}, in latent fingerprint analysis, we develop a supervised augmentation technique that combines input examples based on their salient regions. Our findings advocate that adversarial learning improves the performance and reliability of deep networks in a wide range of applications
Fine-Grained Image Analysis with Deep Learning: A Survey
Fine-grained image analysis (FGIA) is a longstanding and fundamental problem
in computer vision and pattern recognition, and underpins a diverse set of
real-world applications. The task of FGIA targets analyzing visual objects from
subordinate categories, e.g., species of birds or models of cars. The small
inter-class and large intra-class variation inherent to fine-grained image
analysis makes it a challenging problem. Capitalizing on advances in deep
learning, in recent years we have witnessed remarkable progress in deep
learning powered FGIA. In this paper we present a systematic survey of these
advances, where we attempt to re-define and broaden the field of FGIA by
consolidating two fundamental fine-grained research areas -- fine-grained image
recognition and fine-grained image retrieval. In addition, we also review other
key issues of FGIA, such as publicly available benchmark datasets and related
domain-specific applications. We conclude by highlighting several research
directions and open problems which need further exploration from the community.Comment: Accepted by IEEE TPAM
- …