6 research outputs found

    Local quality-based matching of faces for watchlist screening applications

    Get PDF
    Video surveillance systems are often exploited by safety organizations for enhanced security and situational awareness. A key application in video surveillance is watchlist screening where target individuals are enrolled to a still-to-video Face Recognition (FR) system using single still images captured a priori under controlled conditions. Watchlist Screening is a very challenging application. Indeed, the latter must provide accurate decisions and timely recognition using limited number of reference faces for the system’s enrolment. This issue is often called the "Single Sample Per Person" (SSPP) problem. Added to that, uncontrolled factors such as variations in illumination pose and occlusion is unpreventable in real case video surveillance which causes the degradation of the FR system’s performance. Another major problem in such applications is the camera interoperability. This means that there is a huge gap between the camera used for taking the still images and the camera used for taking the video surveillance footage in terms of quality and resolution. This issue hinders the classification process then decreases the system‘s performance. Controlled and uniform lighting is indispensable for having good facial captures that contributes in the recognition performance of the system. However, in reality, facial captures are poor in illumination factor and are severely affecting the system’s performance. This is why it is important to implement a FR system which is invariant to illumination changes. The first part of this Thesis consists in investigating different illumination normalization (IN) techniques that are applied at the pre-processing level of the still-to-video FR. Afterwards IN techniques are compared to each other in order to pinpoint the most suitable technique for illumination invariance. In addition, patch-based methods for template matching extracts facial features from different regions which offers more discriminative information and deals with occlusion issues. Thus, local matching is applied for the still-to-video FR system. For that, a profound examination is needed on the manner of applying these IN techniques. Two different approaches were conducted: the global approach which consists in performing IN on the image then performs local matching and the local approach which consists in primarily dividing the images into non overlapping patches then perform on individually on each patch each IN technique. The results obtained after executing these experiments have shown that the Tan and Triggs (TT) and Multi ScaleWeberfaces are likely to offer better illumination invariance for the still-to-video FR system. In addition to that, these outperforming IN techniques applied locally on each patch have shown to improve the performance of the FR compared to the global approach. The performance of a FR system is good when the training data and the operation data are from the same distribution. Unfortunately, in still-to-video FR systems this is not satisfied. The training data are still, high quality, high resolution and frontal images. However, the testing data are video frames, low quality, low resolution and varying head pose images. Thus, the former and the latter do not have the same distribution. To address this domain shift, the second part of this Thesis consists in presenting a new technique of dynamic regional weighting exploiting unsupervised domain adaptation and contextual information based on quality. The main contribution consists in assigning dynamic weights that is specific to a camera domain.This study replaces the static and predefined manner of assigning weights. In order to assess the impact of applying local weights dynamically, results are compared to a baseline (no weights) and static weighting technique. This context based approach has proven to increase the system’s performance compared to the static weighting that is dependent on the dataset and the baseline technique which consists of having no weights. These experiments are conducted and validated using the ChokePoint Dataset. As for the performance of the still-to-video FR system, it is evaluated using performance measures, Receiver operating characteristic (ROC) curve and Precision-Recall (PR) curve analysis

    Improved Contextual Recognition In Automatic Speech Recognition Systems By Semantic Lattice Rescoring

    Full text link
    Automatic Speech Recognition (ASR) has witnessed a profound research interest. Recent breakthroughs have given ASR systems different prospects such as faithfully transcribing spoken language, which is a pivotal advancement in building conversational agents. However, there is still an imminent challenge of accurately discerning context-dependent words and phrases. In this work, we propose a novel approach for enhancing contextual recognition within ASR systems via semantic lattice processing leveraging the power of deep learning models in accurately delivering spot-on transcriptions across a wide variety of vocabularies and speaking styles. Our solution consists of using Hidden Markov Models and Gaussian Mixture Models (HMM-GMM) along with Deep Neural Networks (DNN) models integrating both language and acoustic modeling for better accuracy. We infused our network with the use of a transformer-based model to properly rescore the word lattice achieving remarkable capabilities with a palpable reduction in Word Error Rate (WER). We demonstrate the effectiveness of our proposed framework on the LibriSpeech dataset with empirical analyses

    CES-KD: Curriculum-based Expert Selection for Guided Knowledge Distillation

    Full text link
    Knowledge distillation (KD) is an effective tool for compressing deep classification models for edge devices. However, the performance of KD is affected by the large capacity gap between the teacher and student networks. Recent methods have resorted to a multiple teacher assistant (TA) setting for KD, which sequentially decreases the size of the teacher model to relatively bridge the size gap between these models. This paper proposes a new technique called Curriculum Expert Selection for Knowledge Distillation (CES-KD) to efficiently enhance the learning of a compact student under the capacity gap problem. This technique is built upon the hypothesis that a student network should be guided gradually using stratified teaching curriculum as it learns easy (hard) data samples better and faster from a lower (higher) capacity teacher network. Specifically, our method is a gradual TA-based KD technique that selects a single teacher per input image based on a curriculum driven by the difficulty in classifying the image. In this work, we empirically verify our hypothesis and rigorously experiment with CIFAR-10, CIFAR-100, CINIC-10, and ImageNet datasets and show improved accuracy on VGG-like models, ResNets, and WideResNets architectures.Comment: ICPR202

    BD-KD: Balancing the Divergences for Online Knowledge Distillation

    Full text link
    Knowledge distillation (KD) has gained a lot of attention in the field of model compression for edge devices thanks to its effectiveness in compressing large powerful networks into smaller lower-capacity models. Online distillation, in which both the teacher and the student are learning collaboratively, has also gained much interest due to its ability to improve on the performance of the networks involved. The Kullback-Leibler (KL) divergence ensures the proper knowledge transfer between the teacher and student. However, most online KD techniques present some bottlenecks under the network capacity gap. By cooperatively and simultaneously training, the models the KL distance becomes incapable of properly minimizing the teacher's and student's distributions. Alongside accuracy, critical edge device applications are in need of well-calibrated compact networks. Confidence calibration provides a sensible way of getting trustworthy predictions. We propose BD-KD: Balancing of Divergences for online Knowledge Distillation. We show that adaptively balancing between the reverse and forward divergences shifts the focus of the training strategy to the compact student network without limiting the teacher network's learning process. We demonstrate that, by performing this balancing design at the level of the student distillation loss, we improve upon both performance accuracy and calibration of the compact student network. We conducted extensive experiments using a variety of network architectures and show improvements on multiple datasets including CIFAR-10, CIFAR-100, Tiny-ImageNet, and ImageNet. We illustrate the effectiveness of our approach through comprehensive comparisons and ablations with current state-of-the-art online and offline KD techniques

    Analytic Characterization of the Herpes Simplex Virus Type 2 Epidemic in the United States, 1950-2050.

    Get PDF
    BACKGROUND: We analytically characterized the past, present, and future levels and trends of the national herpes simplex virus type 2 (HSV-2) epidemic in the United States. METHODS: A population-level mathematical model was constructed to describe HSV-2 transmission dynamics and was fitted to the data series of the National Health and Nutrition Examination Survey. RESULTS: Over 1950-2050, antibody prevalence (seroprevalence) increased rapidly from 1960, peaking at 19.9% in 1983 in those aged 15-49 years, before reversing course to decline to 13.2% by 2020 and 8.5% by 2050. Incidence rate peaked in 1971 at 11.9 per 1000 person-years, before declining by 59% by 2020 and 70% by 2050. Annual number of new infections peaked at 1 033 000 in 1978, before declining to 667 000 by 2020 and 600 000 by 2050. Women were disproportionately affected, averaging 75% higher seroprevalence, 95% higher incidence rate, and 71% higher annual number of infections. In 2020, 78% of infections were acquired by those 15-34 years of age. CONCLUSIONS: The epidemic has undergone a major transition over a century, with the greatest impact in those 15-34 years of age. In addition to 47 million prevalent infections in 2020, high incidence will persist over the next 3 decades, adding >600 000 new infections every year

    Contextual weighting of patches for local matching in still-to-video face recognition

    No full text
    Abstract Still-to-video face recognition (FR) systems for watchlist screening seek to recognize individuals of interest given faces captured over a network of video surveillance cameras. Screening faces against a watchlist is a challenging application because only a limited number of reference stills is available per individual during enrollment, and the appearance of face captures in videos changes from camera to camera, due to variations in illumination, pose, blur, scale, expression and occlusion. In order to improve the robustness of FR systems, several local matching techniques have been proposed that rely on static or dynamic weighting of patches. However, these approaches are not suitable for watchlist screening applications where the capturing conditions vary significantly over different camera fields of view (FoV). In this paper, a new dynamic weighting technique is proposed for weighting facial patches based on video data collected a priori from the specific operational domain (camera FoV) and on image quality assessment. Results obtained on videos from the Chokepoint dataset indicate that the proposed approach can significantly outperform the reference local matching methods because patch weights tend to grow for discriminant facial regions
    corecore