
    Structural models for face detection

    Abstract: Despite the progress of the last two decades, state-of-the-art face detectors still struggle with images in the wild because of their large appearance variations. Instead of treating appearance variations as black boxes and leaving them to statistical learning algorithms, we propose a structural face model that represents them explicitly. Our hierarchical part-based structural face model provides part subtypes to describe appearance variations of each local part, and part deformation to capture the variations between different poses and expressions. During detection, the input candidate is first fitted by the structural model to infer the part locations and part subtypes, and the confidence score is then computed from the fitted configuration to reduce the influence of structural variation. Besides the face model, we exploit the co-occurrence of face and body to further boost face detection performance. We present a method for training phrase based body detectors, and propose a structural context model that jointly uses the results of the face detector and various body detectors. Experiments on the challenging FDDB benchmark show that our method achieves state-of-the-art performance compared with other commercial and academic systems.
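
    The fit-then-score step described in this abstract can be illustrated with a toy example. The sketch below is not the authors' model: it assumes a generic part-based formulation in which each part independently picks the subtype and location that maximise an appearance score minus a quadratic deformation penalty, and the candidate's confidence is the sum of the fitted part scores. All names (`fit_part`, `score_candidate`, the part names and the numbers) are hypothetical.

```python
def fit_part(appearance, anchor, deform_weight):
    """Fit one part: choose the (subtype, location) with the best penalised score.

    appearance    -- dict: subtype -> dict: (x, y) -> appearance score (assumed given)
    anchor        -- expected (x, y) of the part relative to the face root
    deform_weight -- weight of the quadratic displacement penalty (assumption)
    """
    best = None
    for subtype, score_map in appearance.items():
        for (x, y), score in score_map.items():
            dx, dy = x - anchor[0], y - anchor[1]
            total = score - deform_weight * (dx * dx + dy * dy)
            if best is None or total > best[0]:
                best = (total, subtype, (x, y))
    return best


def score_candidate(parts):
    """Fit every part, then sum the fitted scores into one confidence value."""
    fitted = {name: fit_part(**spec) for name, spec in parts.items()}
    confidence = sum(result[0] for result in fitted.values())
    return confidence, fitted


# Toy candidate window with two parts and two subtypes each (made-up scores).
parts = {
    "left_eye": {
        "appearance": {"open": {(10, 12): 2.0, (11, 12): 2.5},
                       "closed": {(10, 12): 1.0}},
        "anchor": (10, 12),
        "deform_weight": 0.1,
    },
    "mouth": {
        "appearance": {"neutral": {(16, 30): 1.5},
                       "smile": {(16, 31): 3.0}},
        "anchor": (16, 30),
        "deform_weight": 0.5,
    },
}

confidence, fitted = score_candidate(parts)
print(confidence, fitted)
```

    Because the score is computed after fitting, a part that moves or changes subtype lowers the confidence only by its deformation penalty rather than by a full appearance mismatch.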

    Automatic CNN channel selection and effective detection on face and rotated aerial objects

    Balancing accuracy and computational cost is a challenging task in computer vision. This is especially true for convolutional neural networks (CNNs), which require far more processing power than traditional learning algorithms. This thesis aims to develop new CNN structures and loss functions to tackle the unbalanced accuracy-efficiency issue in image classification and object detection, two fundamental yet challenging tasks of computer vision. For a CNN-based object detector, the main computational cost is caused by the feature extractor (backbone), which was originally applied to image classification.
    Optimising the structure of a CNN applied to image classification will bring benefits when it is applied to object detection. Although the outputs of detectors may vary across detection tasks, the challenges and the design principles among detectors are similar. Therefore, this thesis starts with face detection (i.e. a single-object detection task), which is a significant branch of object detection and has been widely used in real life. After that, object detection on aerial images, a more challenging detection task, is investigated.
    Specifically, the objectives of this thesis are: 1. Optimising the CNN structures for image classification; 2. Developing a face detector which enables a trade-off between computational cost and accuracy; and 3. Proposing an object detector for aerial images, which suppresses the background noise without damaging the inference efficiency.
    For the first target, this thesis aims to automatically optimise the topology of CNNs to generate the structure of fixed-length models, in which unnecessary convolutional kernels are removed. Experimental results have demonstrated that the optimised model can achieve comparable accuracy to the state-of-the-art models, across a broad range of datasets, whilst significantly reducing the number of parameters.
    To tackle the unbalanced accuracy-efficiency challenge in face detection, a novel context-enhanced approach is proposed which improves the performance of the face detector in terms of both loss function and structure. For loss function optimisation, a hierarchical loss, referred to as 'triple loss' in this thesis, is introduced to optimise the feature pyramid network (FPN) based face detector. For structural optimisation, this thesis proposes a context-sensitive structure to increase the capacity of the network prediction. Experimental results indicate that the proposed method achieves a good balance between the accuracy and computational cost of face detection.
    To suppress the background noise in aerial image object detection, this thesis presents a two-stage detector named 'SAFDet'. To be more specific, a rotation anchor-free branch (RAFB) is proposed to regress the precise rectangle boundary. As the RAFB is anchor free, the computational cost is negligible during training. Meanwhile, a centre prediction module (CPM) is introduced to enhance the capabilities of target localisation and noise suppression from the background. As the CPM is only deployed during training, it does not increase the computational cost of inference. Experimental results indicate that the proposed method achieves a good balance between accuracy and computational cost, and that it effectively suppresses the background noise at the same time.
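
    The idea of removing unnecessary convolutional kernels can be made concrete with a small sketch. The abstract does not say which selection criterion the thesis uses, so the example below substitutes a common stand-in, ranking filters by the L1 norm of their weights and keeping a fixed fraction. The function names, the `keep_ratio` parameter and the weights are hypothetical and chosen only for illustration.

```python
def l1_norm(kernel):
    """Sum of absolute weights of one filter, given as an arbitrarily nested list."""
    if isinstance(kernel, (int, float)):
        return abs(kernel)
    return sum(l1_norm(k) for k in kernel)


def select_filters(filters, keep_ratio):
    """Return the sorted indices of the filters with the largest L1 norms.

    This magnitude criterion is an assumption, not the thesis's selection rule.
    """
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]),
                    reverse=True)
    return sorted(ranked[:n_keep])


# Four toy 2x2 filters of a single layer (made-up weights).
filters = [
    [[0.5, -0.5], [0.1, 0.0]],    # L1 norm 1.1
    [[0.01, 0.0], [0.0, -0.02]],  # L1 norm 0.03
    [[1.0, 1.0], [-1.0, 0.2]],    # L1 norm 3.2
    [[0.0, 0.0], [0.0, 0.0]],     # L1 norm 0.0
]

kept = select_filters(filters, keep_ratio=0.5)
pruned_layer = [filters[i] for i in kept]
print(kept)
```

    In a real network, the input channels of the following layer would also have to be sliced to match the kept filters; that bookkeeping is omitted here.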

    Quantifying DeepFake Detection Accuracy for a Variety of Natural Settings

    Deep fakes are videos generated from a starting video of a person where that person's face has been swapped for someone else's. In this report, we describe our work to develop general, deep learning-based models to classify Deep Fake content. Our first experiments involved simple Convolutional Neural Network (CNN)-based models where we varied how individual frames from the source video were passed to the CNN. These simple models tended to give low accuracy, below 60%, when discriminating fake from non-fake videos. We then developed three more sophisticated models: one based on choosing test frames, one based on video Optical Flow, and one that uses Generative Adversarial Networks (GANs) to determine structural differences in images. This last technique, which we call MRI-GAN, is new to the literature. We tested our models using the Deep Fake Detection Challenge dataset and found that our plain frames-based model achieves 90% test accuracy, our MRI model achieves 79% test accuracy, and our Optical Flow-based model achieves 69% test accuracy.
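
    A frames-based classifier produces one score per frame, so some rule is needed to turn those scores into a video-level decision and a test accuracy. The abstract does not state the rule used in the report; the sketch below assumes the simplest option, averaging per-frame fake probabilities and thresholding the mean. The function names, the threshold and the scores are hypothetical.

```python
def classify_video(frame_probs, threshold=0.5):
    """Average per-frame fake probabilities and threshold the mean (assumed rule)."""
    if not frame_probs:
        raise ValueError("need at least one frame score")
    mean_prob = sum(frame_probs) / len(frame_probs)
    label = "fake" if mean_prob >= threshold else "real"
    return label, mean_prob


def accuracy(predictions, labels):
    """Fraction of videos whose predicted label matches the ground truth."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up per-frame scores for three videos, with made-up ground-truth labels.
videos = {
    "a": [0.9, 0.8, 0.7],
    "b": [0.2, 0.1, 0.4],
    "c": [0.6, 0.3, 0.2],
}
truth = {"a": "fake", "b": "real", "c": "fake"}

names = sorted(videos)
predicted = [classify_video(videos[n])[0] for n in names]
test_accuracy = accuracy(predicted, [truth[n] for n in names])
print(predicted, test_accuracy)
```

    Video "c" shows the weakness of plain averaging: a single suspicious frame is outvoted by the others, which is one reason frame selection can matter.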

    Smart FRP Composite Sandwich Bridge Decks in Cold Regions

    INE/AUTC 12.0