Domain Adaptation for Unconstrained Face Verification and Identification
Face recognition has received consistent attention in the computer vision community for over three decades. Although recent advances in deep convolutional neural networks (DCNNs) have pushed face recognition algorithms past human performance in most controlled settings, unconstrained face recognition performance is still far from satisfactory. This is mainly because the domain shift between training and test data is substantial when faces are captured under extreme pose, blur, or other covariate variations. In this dissertation, we study the effects of covariates and present approaches for mitigating the domain mismatch to improve the performance of unconstrained face verification and identification.
To study how covariates affect the performance of deep neural networks on the large-scale unconstrained face verification problem, we implement five state-of-the-art DCNNs and evaluate them on three challenging covariate datasets. In total, seven covariates are considered: pose (yaw and roll), age, facial hair, gender, indoor/outdoor setting, occlusion (nose, mouth, and forehead visibility), and skin tone. Some of the results confirm and extend the findings of previous studies, while others are new findings that were rarely reported before or did not show consistent trends. In addition, we demonstrate that, with the assistance of gender information, the quality of a pre-curated noisy large-scale face dataset can be further improved.
Based on the results of this study, we propose four domain adaptation methods to alleviate the effects of covariates. First, since we find that pose is a key factor in performance degradation, we propose a metric learning method to alleviate the effects of pose on face verification performance. We learn a joint model for the face and pose verification tasks and explicitly discourage information sharing between the identity and pose metrics. Specifically, we enforce an orthogonal regularization constraint on the learned projection matrices for the two tasks, making the identity metric for face verification more robust to pose. Extensive experiments on three challenging unconstrained face datasets show promising results compared to state-of-the-art methods.
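The orthogonality constraint described above can be illustrated with a minimal sketch. The function and variable names below are ours, not the dissertation's: the penalty is the squared Frobenius norm of W_id^T W_pose, which is zero exactly when the identity and pose projection matrices span orthogonal subspaces.

```python
def matmul_t(a, b):
    """Compute a^T @ b for matrices given as lists of rows."""
    rows, inner, cols = len(a[0]), len(a), len(b[0])
    return [[sum(a[k][i] * b[k][j] for k in range(inner))
             for j in range(cols)] for i in range(rows)]

def orth_penalty(w_id, w_pose):
    """|| W_id^T W_pose ||_F^2 -- zero when the column spaces are orthogonal."""
    m = matmul_t(w_id, w_pose)
    return sum(x * x for row in m for x in row)

# Two 3x1 projections: e1 for identity, e2 for pose -> penalty is 0.
w_id = [[1.0], [0.0], [0.0]]
w_pose = [[0.0], [1.0], [0.0]]
print(orth_penalty(w_id, w_pose))  # 0.0
```

In a training loop this penalty would be added to the joint verification loss, so that gradient descent drives the two projections apart while each still fits its own task.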
Second, to tackle the negative effects caused by image blur, we propose two approaches. The first is an incremental dictionary learning method that mitigates the distribution difference between sharp training data and blurred test data: blurred faces called supportive samples are selected to build more discriminative classification models and to act as a bridge connecting the two domains. The second is an unsupervised face deblurring approach based on disentangled representations, where the content and blur features in a blurred image are split using content and blur encoders. An adversarial loss on the deblurred results encourages visually realistic faces. Extensive experiments on two challenging face datasets show promising results.
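The abstract does not spell out the selection rule for supportive samples; one plausible reading, sketched below purely as our illustration, is to keep the blurred samples that a classifier trained on sharp faces still labels with high confidence, since those samples plausibly sit near both domains.

```python
import math

def confidence(score):
    """Map a raw classifier score to (0, 1) via logistic squashing."""
    return 1.0 / (1.0 + math.exp(-score))

def select_supportive(samples, classify, threshold=0.8):
    """Keep blurred samples the sharp-domain classifier is confident about.

    These "supportive samples" (our reading of the term) could then be added
    incrementally to the dictionary to bridge the sharp and blurred domains.
    """
    return [x for x in samples if confidence(classify(x)) >= threshold]

# Toy 1-D "classifier": the score is just the feature value.
blurred = [-2.0, 0.1, 1.8, 3.0]
print(select_supportive(blurred, classify=lambda x: x))  # [1.8, 3.0]
```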
Finally, apart from the effects of pose and blur, face verification performance also suffers from the generic domain mismatch between source and target faces. To tackle this problem, we propose a template adaptation method for template-based face verification. A template-specific metric is trained to adaptively learn the discriminative information between test templates and a negative training set, whose subjects are disjoint from those in the test templates. Extensive experiments on two challenging face verification datasets yield promising results compared to other competitive methods.
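A heavily simplified sketch of the template-adaptation idea follows. We stand in for the learned template-specific metric with a mean-difference linear discriminant against the negative set; all names and the scoring rule are our assumptions, not the dissertation's formulation.

```python
def mean(vectors):
    """Element-wise mean of a list of equal-length feature vectors."""
    n, d = len(vectors), len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(d)]

def adapt(template, negatives):
    """Template-specific direction: template mean minus negative-set mean."""
    mu_t, mu_n = mean(template), mean(negatives)
    return [t - n for t, n in zip(mu_t, mu_n)]

def score(w, probe_template):
    """Score a probe template by its mean projection on the adapted direction."""
    mu_p = mean(probe_template)
    return sum(wi * pi for wi, pi in zip(w, mu_p))

gallery = [[1.0, 0.0], [0.9, 0.1]]        # features of one test template
negatives = [[-1.0, 0.0], [-0.8, -0.2]]   # subjects disjoint from the test set
w = adapt(gallery, negatives)
print(score(w, [[0.95, 0.05]]) > score(w, [[-0.9, 0.0]]))  # True
```

The key property this toy preserves is that the metric is fit per test template at verification time, using only the template itself and the fixed negative set.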
Modify Training Directions in Function Space to Reduce Generalization Error
We present a theoretical analysis of a modified natural gradient descent method
in neural network function space, based on eigendecompositions of the neural
tangent kernel and the Fisher information matrix. We first derive an analytical
expression for the function learned by this modified natural gradient under
Gaussian-distribution and infinite-width assumptions, and from it we explicitly
derive the generalization error of the learned network function using tools
from eigendecomposition and statistical theory. By decomposing the total
generalization error into contributions from different eigenspaces of the
kernel in function space, we propose a criterion for balancing the error
stemming from the training set against the discrepancy between the training
distribution and the true data distribution. Through this approach, we
establish that modifying the training direction of the neural network in
function space reduces the total generalization error. Furthermore, we
demonstrate that this theoretical framework can explain many existing
generalization-enhancing methods. The theoretical results are also illustrated
by numerical examples on synthetic data.
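A schematic form of the decomposition described above, in our own notation under a standard Mercer decomposition of the NTK; the paper's exact expressions may differ:

```latex
% Eigendecomposition of the NTK under the data distribution:
%   K(x, x') = \sum_i \lambda_i \, \phi_i(x) \, \phi_i(x')
% Expanding the residual f - f^* in this eigenbasis splits the total
% generalization error into per-eigenspace contributions,
\mathcal{E}
  = \mathbb{E}_x\!\left[\bigl(f(x) - f^*(x)\bigr)^2\right]
  = \sum_i \varepsilon_i ,
% and each \varepsilon_i is further attributed to a training-set term and a
% train/true distribution-discrepancy term, which the proposed criterion
% balances:
\varepsilon_i \;=\; \varepsilon_i^{\mathrm{train}} \;+\; \varepsilon_i^{\mathrm{disc}} .
```

Modifying the training direction then amounts to reweighting which eigenspaces the gradient step emphasizes, trading the two error terms against each other.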
Improving Large-scale Deep Biasing with Phoneme Features and Text-only Data in Streaming Transducer
Deep biasing for the Transducer can improve the recognition performance of
rare words or contextual entities, which is essential in practical
applications, especially for streaming Automatic Speech Recognition (ASR).
However, deep biasing with large-scale rare words remains challenging, as the
performance drops significantly when more distractors exist and there are words
with similar grapheme sequences in the bias list. In this paper, we combine the
phoneme and textual information of rare words in Transducers to distinguish
words with similar pronunciation or spelling. Moreover, the introduction of
training with text-only data containing more rare words benefits large-scale
deep biasing. The experiments on the LibriSpeech corpus demonstrate that the
proposed method achieves state-of-the-art performance on rare word error rate
for different scales and levels of bias lists.

Comment: Submitted to ASRU 202
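A toy illustration, ours rather than the paper's model, of why combining grapheme and phoneme evidence helps large-scale biasing: two rare words can be close in spelling yet far apart in pronunciation, so a blended distance separates bias-list entries that graphemes alone confuse. The phoneme strings below are hand-listed assumptions in a CMU-style notation.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance over two sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1,
                           prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def combined_distance(g1, p1, g2, p2, alpha=0.5):
    """Blend grapheme and phoneme distances for bias-list disambiguation."""
    return alpha * edit_distance(g1, g2) + (1 - alpha) * edit_distance(p1, p2)

# Similar grapheme sequences (2 edits apart) but different phoneme strings,
# so the combined distance keeps the two entries apart.
g1, p1 = "kaiser", ["K", "AY", "Z", "ER"]
g2, p2 = "kasier", ["K", "AH", "S", "IY", "ER"]
print(combined_distance(g1, p1, g2, p2))  # 2.5
```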
Tackling Non-Stationarity in Reinforcement Learning via Causal-Origin Representation
In real-world scenarios, the application of reinforcement learning is
significantly challenged by complex non-stationarity. Most existing methods
attempt to model changes in the environment explicitly, often requiring
impractical prior knowledge. In this paper, we propose a new perspective,
positing that non-stationarity can propagate and accumulate through complex
causal relationships during state transitions, thereby compounding its
sophistication and affecting policy learning. We believe that this challenge
can be more effectively addressed by tracing the causal origin of
non-stationarity. To this end, we introduce the Causal-Origin REPresentation
(COREP) algorithm. COREP employs a guided updating mechanism to learn a stable
graph representation for states, termed the causal-origin representation. By
leveraging this representation, the learned policy exhibits impressive
resilience to non-stationarity. We supplement our approach with a theoretical
analysis grounded in a causal interpretation of non-stationary reinforcement
learning, supporting the validity of the causal-origin representation.
Experimental results further demonstrate the superior performance of COREP
over existing methods in tackling non-stationarity.
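The abstract does not detail the guided updating mechanism; a common pattern it evokes, sketched below purely as our assumption, is a slowly updated stable copy of the learned representation that tracks the online one through an exponential moving average, damping non-stationary fluctuations.

```python
def guided_update(stable, online, tau=0.1):
    """Move the stable representation a small step toward the online one.

    This target-network-style EMA is our stand-in for COREP's guided update;
    the actual algorithm operates on a learned graph representation.
    """
    return [(1 - tau) * s + tau * o for s, o in zip(stable, online)]

stable = [0.0, 0.0]
for online in ([1.0, 0.0], [1.0, 2.0], [0.0, 2.0]):  # drifting observations
    stable = guided_update(stable, online)
print(stable)  # drifts slowly relative to the online values
```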
Legal Origins, Religion and Health Outcomes: A Cross-Country Comparison of Organ Donation Laws
This paper investigates what drives countries to legislate presumed consent - making citizens organ donors by default unless they opt out - instead of explicit consent. Results reveal the following. First, civil law predicts presumed consent, which uncovers a mechanism by which an institution that long pre-dates transplantation medicine affects current health outcomes. This is in line with previous research finding that civil law regimes tend to be more comfortable with a centralized and activist government than common law ones. Second, Catholicism predicts presumed consent. This is consistent with previous research showing that Catholicism generally relies on more hierarchical structures and is less likely to encourage social responsibility among its members. Last, higher pro-social behavior decreases the likelihood of presumed consent. This could be explained by policy-makers trying not to discourage donations where pro-social behavior is high by making donation look like a requirement rather than an altruistic act. The implications of the findings are discussed, with a particular focus on policy switches in organ donation.