223 research outputs found

    Face Detection and Verification using Local Binary Patterns

    Get PDF
    This thesis proposes a robust Automatic Face Verification (AFV) system using Local Binary Patterns (LBP). AFV is mainly composed of two modules: Face Detection (FD) and Face Verification (FV). The purpose of FD is to determine whether there are any face in an image, while FV involves confirming or denying the identity claimed by a person. The contributions of this thesis are the following: 1) a real-time multiview FD system which is robust to illumination and partial occlusion, 2) a FV system based on the adaptation of LBP features, 3) an extensive study of the performance evaluation of FD algorithms and in particular the effect of FD errors on FV performance. The first part of the thesis addresses the problem of frontal FD. We introduce the system of Viola and Jones which is the first real-time frontal face detector. One of its limitations is the sensitivity to local lighting variations and partial occlusion of the face. In order to cope with these limitations, we propose to use LBP features. Special emphasis is given to the scanning process and to the merging of overlapped detections, because both have a significant impact on the performance. We then extend our frontal FD module to multiview FD. In the second part, we present a novel generative approach for FV, based on an LBP description of the face. The main advantages compared to previous approaches are a very fast and simple training procedure and robustness to bad lighting conditions. In the third part, we address the problem of estimating the quality of FD. We first show the influence of FD errors on the FV task and then empirically demonstrate the limitations of current detection measures when applied to this task. In order to properly evaluate the performance of a face detection module, we propose to embed the FV into the performance measuring process. We show empirically that the proposed methodology better matches the final FV performance

    Human Body Posture Recognition Approaches: A Review

    Get PDF
    Human body posture recognition has become the focus of many researchers in recent years. Recognition of body posture is used in various applications, including surveillance, security, and health monitoring. However, these systems that determine the body’s posture through video clips, images, or data from sensors have many challenges when used in the real world. This paper provides an important review of how most essential ‎ hardware technologies are ‎used in posture recognition systems‎. These systems capture and collect datasets through ‎accelerometer sensors or computer vision. In addition, this paper presents a comparison ‎study with state-of-the-art in terms of accuracy. We also present the advantages and ‎limitations of each system and suggest promising future ideas that can increase the ‎efficiency of the existing posture recognition system. Finally, the most common datasets ‎applied in these systems are described in detail. It aims to be a resource to help choose one of the methods in recognizing the posture of the human body and the techniques that suit each method. It analyzes more than 80 papers between 2015 and 202

    Compressive Sequential Learning for Action Similarity Labeling

    Get PDF
    Human action recognition in videos has been extensively studied in recent years due to its wide range of applications. Instead of classifying video sequences into a number of action categories, in this paper, we focus on a particular problem of action similarity labeling (ASLAN), which aims at verifying whether a pair of videos contain the same type of action or not. To address this challenge, a novel approach called compressive sequential learning (CSL) is proposed by leveraging the compressive sensing theory and sequential learning. We first project data points to a low-dimensional space by effectively exploring an important property in compressive sensing: the restricted isometry property. In particular, a very sparse measurement matrix is adopted to reduce the dimensionality efficiently. We then learn an ensemble classifier for measuring similarities between pairwise videos by iteratively minimizing its empirical risk with the AdaBoost strategy on the training set. Unlike conventional AdaBoost, the weak learner for each iteration is not explicitly defined and its parameters are learned through greedy optimization. Furthermore, an alternative of CSL named compressive sequential encoding is developed as an encoding technique and followed by a linear classifier to address the similarity-labeling problem. Our method has been systematically evaluated on four action data sets: ASLAN, KTH, HMDB51, and Hollywood2, and the results show the effectiveness and superiority of our method for ASLAN

    Deep learning for remote sensing image classification:A survey

    Get PDF
    Remote sensing (RS) image classification plays an important role in the earth observation technology using RS data, having been widely exploited in both military and civil fields. However, due to the characteristics of RS data such as high dimensionality and relatively small amounts of labeled samples available, performing RS image classification faces great scientific and practical challenges. In recent years, as new deep learning (DL) techniques emerge, approaches to RS image classification with DL have achieved significant breakthroughs, offering novel opportunities for the research and development of RS image classification. In this paper, a brief overview of typical DL models is presented first. This is followed by a systematic review of pixel?wise and scene?wise RS image classification approaches that are based on the use of DL. A comparative analysis regarding the performances of typical DL?based RS methods is also provided. Finally, the challenges and potential directions for further research are discussedpublishersversionPeer reviewe

    Human Pose Estimation from Monocular Images : a Comprehensive Survey

    Get PDF
    Human pose estimation refers to the estimation of the location of body parts and how they are connected in an image. Human pose estimation from monocular images has wide applications (e.g., image indexing). Several surveys on human pose estimation can be found in the literature, but they focus on a certain category; for example, model-based approaches or human motion analysis, etc. As far as we know, an overall review of this problem domain has yet to be provided. Furthermore, recent advancements based on deep learning have brought novel algorithms for this problem. In this paper, a comprehensive survey of human pose estimation from monocular images is carried out including milestone works and recent advancements. Based on one standard pipeline for the solution of computer vision problems, this survey splits the problema into several modules: feature extraction and description, human body models, and modelin methods. Problem modeling methods are approached based on two means of categorization in this survey. One way to categorize includes top-down and bottom-up methods, and another way includes generative and discriminative methods. Considering the fact that one direct application of human pose estimation is to provide initialization for automatic video surveillance, there are additional sections for motion-related methods in all modules: motion features, motion models, and motion-based methods. Finally, the paper also collects 26 publicly available data sets for validation and provides error measurement methods that are frequently used

    A Deep Learning Approach for Multi-View Engagement Estimation of Children in a Child-Robot Joint Attention Task

    Get PDF
    International audienceIn this work we tackle the problem of child engagement estimation while children freely interact with a robot in a friendly, room-like environment. We propose a deep-based multi-view solution that takes advantage of recent developments in human pose detection. We extract the child's pose from different RGB-D cameras placed regularly in the room, fuse the results and feed them to a deep neural network trained for classifying engagement levels. The deep network contains a recurrent layer, in order to exploit the rich temporal information contained in the pose data. The resulting method outperforms a number of baseline classifiers, and provides a promising tool for better automatic understanding of a child's attitude, interest and attention while cooperating with a robot. The goal is to integrate this model in next generation social robots as an attention monitoring tool during various Child Robot Interaction (CRI) tasks both for Typically Developed (TD) children and children affected by autism (ASD)
    • 

    corecore