781 research outputs found

    Audio-coupled video content understanding of unconstrained video sequences

    Unconstrained video understanding is a difficult task. The main aim of this thesis is to recognise the nature of objects, activities and environment in a given video clip using both audio and video information. Traditionally, audio and video information have not been applied together to solve such complex tasks; for the first time, we propose, develop, implement and test a new framework of multi-modal (audio and video) data analysis for context understanding and labelling of unconstrained videos. The framework relies on feature selection techniques and introduces a novel algorithm (PCFS) that is faster than the well-established SFFS algorithm. We use the framework to study the benefits of combining audio and video information in a number of different problems. We begin by developing two independent content recognition modules. The first is based on image sequence analysis alone, and uses a range of colour, shape, texture and statistical features from image regions with a trained classifier to recognise the identity of the objects, activities and environment present. The second module uses audio information only, and recognises activities and environment. Both approaches are preceded by detailed pre-processing to ensure that the correct video segments, containing both audio and video content, are present, and that the developed system is robust to changes in camera movement, illumination, random object behaviour etc. For both audio and video analysis, we use a hierarchical approach of multi-stage classification so that difficult classification tasks can be decomposed into simpler and smaller tasks. When combining both modalities, we compare fusion techniques at different levels of integration and propose a novel algorithm that combines the advantages of both feature-level and decision-level fusion. The analysis is evaluated on a large amount of test data comprising unconstrained videos collected for this work. Finally, we propose a decision correction algorithm, which shows that further steps towards effectively combining multi-modal classification information with semantic knowledge generate the best results.

    Threshold-optimized decision-level fusion and its application to biometrics

    Fusion is a popular practice to increase the reliability of biometric verification. In this paper, we propose an optimal fusion scheme at decision level by the AND or OR rule, based on optimizing matching score thresholds. The proposed fusion scheme will always give an improvement in the Neyman–Pearson sense over the component classifiers that are fused. The theory of threshold-optimized decision-level fusion is presented, and the applications are discussed. Fusion experiments are done on the FRGC database, which contains 2D texture data and 3D shape data. The proposed decision fusion improves the system performance in a way comparable to or better than conventional score-level fusion. It is noteworthy that, in practice, threshold-optimized decision-level fusion by the OR rule is especially useful in the presence of outliers.
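    The idea of fusing two matchers by an AND or OR rule with jointly chosen thresholds can be illustrated with a minimal sketch. It is not the paper's optimisation procedure: it assumes a brute-force grid search over observed scores, and the function names (`fuse`, `rates`, `optimise_thresholds`) and the `max_far` parameter are hypothetical names introduced here for illustration only.

```python
from itertools import product


def fuse(score_a, score_b, t_a, t_b, rule="OR"):
    """Accept/reject decision from two matching scores and per-matcher thresholds."""
    a, b = score_a >= t_a, score_b >= t_b
    return (a or b) if rule == "OR" else (a and b)


def rates(genuine, impostor, t_a, t_b, rule):
    """Return (false accept rate, true accept rate) on labelled score pairs."""
    far = sum(fuse(a, b, t_a, t_b, rule) for a, b in impostor) / len(impostor)
    tar = sum(fuse(a, b, t_a, t_b, rule) for a, b in genuine) / len(genuine)
    return far, tar


def optimise_thresholds(genuine, impostor, max_far, rule="OR"):
    """Assumed brute-force search: try every pair of observed scores as
    thresholds and keep the pair with the highest true accept rate whose
    false accept rate does not exceed max_far (a Neyman-Pearson style
    constraint). Returns (tar, far, t_a, t_b), or None if nothing qualifies."""
    pairs = genuine + impostor
    # An infinite threshold lets a matcher reject everything.
    cands_a = sorted({a for a, _ in pairs}) + [float("inf")]
    cands_b = sorted({b for _, b in pairs}) + [float("inf")]
    best = None
    for t_a, t_b in product(cands_a, cands_b):
        far, tar = rates(genuine, impostor, t_a, t_b, rule)
        if far <= max_far and (best is None or tar > best[0]):
            best = (tar, far, t_a, t_b)
    return best
```

    Calling `optimise_thresholds(genuine, impostor, max_far=0.01, rule="OR")` on lists of `(score_a, score_b)` pairs returns the best operating point found on that data. Because the thresholds are tuned on the same scores they are evaluated on, a held-out set would be needed for an honest performance estimate.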

    Face comparison in forensics: A deep dive into deep learning and likelihood ratios

    This thesis explores the transformative potential of deep learning techniques in the field of forensic face recognition. It aims to address the pivotal question of how deep learning can advance this traditionally manual field, focusing on three key areas: forensic face comparison, face image quality assessment, and likelihood ratio estimation. Using a comparative analysis of open-source automated systems and forensic experts, the study finds that automated systems excel in identifying non-matches in low-quality images, but lag behind experts in high-quality settings. The thesis also investigates the role of calibration methods in estimating likelihood ratios, revealing that quality score-based and feature-based calibrations are more effective than naive methods. To enhance face image quality assessment, a multi-task explainable quality network is proposed that not only gauges image quality, but also identifies contributing factors. Additionally, a novel images-to-video recognition method is introduced to improve the estimation of likelihood ratios in surveillance settings. The study employs multiple datasets and software systems for its evaluations, aiming for a comprehensive analysis that can serve as a cornerstone for future research in forensic face recognition

    Cluster-Based Supervised Classification


    Automatic human face detection in color images

    Automatic human face detection in digital images has been an active area of research over the past decade. Among its numerous applications, face detection plays a key role in face recognition systems for biometric personal identification, face tracking for intelligent human computer interfaces (HCI), and face segmentation for object-based video coding. Despite significant progress in the field in recent years, detecting human faces in unconstrained and complex images remains a challenging problem in computer vision. An automatic system with a capability similar to that of the human vision system in detecting faces is still a far-reaching goal. This thesis focuses on the problem of detecting human faces in color images. Although many early face detection algorithms were designed to work on gray-scale images, strong evidence exists to suggest that face detection can be done more efficiently by taking into account the color characteristics of the human face. In this thesis, we present a complete and systematic face detection algorithm that combines the strengths of both analytic and holistic approaches to face detection. The algorithm is developed to detect quasi-frontal faces in complex color images. This face class, which represents typical detection scenarios in most practical applications of face detection, covers a wide range of face poses, including all in-plane rotations and some out-of-plane rotations. The algorithm is organized into a number of cascading stages, including skin region segmentation, face candidate selection, and face verification. In each of these stages, various visual cues are utilized to narrow the search space for faces. We present a comprehensive analysis of skin detection using color pixel classification, and of the effects of factors such as the color space and the color classification algorithm on segmentation performance. We also propose a novel and efficient face candidate selection technique that is based on color-based eye region detection and a geometric face model. This candidate selection technique eliminates the computation-intensive step of window scanning often employed in holistic face detection, and simplifies the task of detecting rotated faces. Besides various heuristic techniques for face candidate verification, we develop face/nonface classifiers based on the naive Bayesian model, and investigate three feature extraction schemes, namely intensity, projection on face subspace, and edge-based. Techniques for improving face/nonface classification are also proposed, including bootstrapping, classifier combination and the use of contextual information. On a test set of face and nonface patterns, the combination of three Bayesian classifiers has a correct detection rate of 98.6% at a false positive rate of 10%. Extensive testing results have shown that the proposed face detector achieves good performance in terms of both detection rate and alignment between the detected faces and the true faces. On a test set of 200 images containing 231 faces taken from the ECU face detection database, the proposed face detector has a correct detection rate of 90.04% and makes 10 false detections. We have found that the proposed face detector is more robust in detecting in-plane rotated faces than existing face detectors.

    Making sense of pervasive signals: a machine learning approach

    This study focused on the challenges arising from noisy and complex pervasive data. We proposed new Bayesian nonparametric models to infer co-patterns from multi-channel data collected from pervasive devices. By making sense of pervasive data, the study contributes to the development of Machine Learning and Data Mining in the Big Data era.