227 research outputs found

    Modification of the AdaBoost-based Detector for Partially Occluded Faces

    Full text link
    While face detection seems a solved problem under general conditions, most state-of-the-art systems degrade rapidly when faces are partially occluded by other objects. This paper presents a solution to detect partially occluded faces by reasonably modifying the AdaBoost-based face detector. Our basic idea is that the weak classifiers in the AdaBoost-based face detector, each corresponding to a Haar-like feature, are inherently a patch-based model. Therefore, one can divide the whole face region into multiple patches, and map those weak classifiers to the patches. The weak classifiers belonging to each patch are re-formed to be a new classifier to determine if it is a valid face patch—without occlusion. Finally, we combine all of the valid face patches by assigning the patches with different weights to make the final decision whether the input subwindow is a face. The experimental results show that the proposed method is promising for the detection of occluded faces. 1

    Mathematical modeling for partial object detection.

    Get PDF
    From a computer vision point of view, the image is a scene consisting of objects of interest and a background represented by everything else in the image. The relations and interactions among these objects are the key factors for scene understanding. In this dissertation, a mathematical model is designed for the detection of partially occluded faces captured in unconstrained real life conditions. The proposed model novelty comes from explicitly considering certain objects that are common to occlude faces and embedding them in the face model. This enables the detection of faces in difficult settings and provides more information to subsequent analysis in addition to the bounding box of the face. In the proposed Selective Part Models (SPM), the face is modelled as a collection of parts that can be selected from the visible regular facial parts and some of the occluding objects which commonly interact with faces such as sunglasses, caps, hands, shoulders, and other faces. With the face detection being the first step in the face recognition pipeline, the proposed model does not only detect partially occluded faces efficiently but it also suggests the occluded parts to be excluded from the subsequent recognition step. The model was tested on several recent face detection databases and benchmarks and achieved state of the art performance. In addition, detailed analysis for the performance with respect to different types of occlusion were provided. Moreover, a new database was collected for evaluating face detectors focusing on the partial occlusion problem. This dissertation highlights the importance of explicitly handling the partial occlusion problem in face detection and shows its efficiency in enhancing both the face detection performance and the subsequent recognition performance of partially occluded faces. The broader impact of the proposed detector exceeds the common security applications by using it for human robot interaction. The humanoid robot Nao is used to help in teaching children with autism and the proposed detector is used to achieve natural interaction between the robot and the children by detecting their faces which can be used for recognition or more interestingly for adaptive interaction by analyzing their expressions

    Multi-view Face Detection Using Deep Convolutional Neural Networks

    Full text link
    In this paper we consider the problem of multi-view face detection. While there has been significant research on this problem, current state-of-the-art approaches for this task require annotation of facial landmarks, e.g. TSM [25], or annotation of face poses [28, 22]. They also require training dozens of models to fully capture faces in all orientations, e.g. 22 models in HeadHunter method [22]. In this paper we propose Deep Dense Face Detector (DDFD), a method that does not require pose/landmark annotation and is able to detect faces in a wide range of orientations using a single model based on deep convolutional neural networks. The proposed method has minimal complexity; unlike other recent deep learning object detection methods [9], it does not require additional components such as segmentation, bounding-box regression, or SVM classifiers. Furthermore, we analyzed scores of the proposed face detector for faces in different orientations and found that 1) the proposed method is able to detect faces from different angles and can handle occlusion to some extent, 2) there seems to be a correlation between dis- tribution of positive examples in the training set and scores of the proposed face detector. The latter suggests that the proposed methods performance can be further improved by using better sampling strategies and more sophisticated data augmentation techniques. Evaluations on popular face detection benchmark datasets show that our single-model face detector algorithm has similar or better performance compared to the previous methods, which are more complex and require annotations of either different poses or facial landmarks.Comment: in International Conference on Multimedia Retrieval 2015 (ICMR

    A survey of face recognition techniques under occlusion

    Get PDF
    The limited capacity to recognize faces under occlusions is a long-standing problem that presents a unique challenge for face recognition systems and even for humans. The problem regarding occlusion is less covered by research when compared to other challenges such as pose variation, different expressions, etc. Nevertheless, occluded face recognition is imperative to exploit the full potential of face recognition for real-world applications. In this paper, we restrict the scope to occluded face recognition. First, we explore what the occlusion problem is and what inherent difficulties can arise. As a part of this review, we introduce face detection under occlusion, a preliminary step in face recognition. Second, we present how existing face recognition methods cope with the occlusion problem and classify them into three categories, which are 1) occlusion robust feature extraction approaches, 2) occlusion aware face recognition approaches, and 3) occlusion recovery based face recognition approaches. Furthermore, we analyze the motivations, innovations, pros and cons, and the performance of representative approaches for comparison. Finally, future challenges and method trends of occluded face recognition are thoroughly discussed

    Face Occlusion Detection Using Deep Convolutional Neural Networks

    Get PDF
    With the rise of crimes associated with Automated Teller Machines (ATMs), security reinforcement by surveillance techniques has been a hot topic on the security agenda. As a result, cameras are frequently installed with ATMs, so as to capture the facial images of users. The main objective is to support follow-up criminal investigations in the event of an incident. However, in the case of miss-use, the user’s face is often occluded. Therefore, face occlusion detection has become very important to prevent crimes connected with ATM usage. Traditional approaches to solving the problem typically comprise a succession of steps: localization, segmentation, feature extraction and recognition. This paper proposes an end-to-end facial occlusion detection framework, which is robust and effective by combining region proposal algorithm and Convolutional Neural Networks (CNN). The framework utilizes a coarse-to-fine strategy, which consists of two CNNs. The first CNN detects the head element within an upper body image while the second distinguishes which facial part is occluded from the head image. In comparison with previous approaches, the usage of CNN is optimal from a system point of view as the design is based on the end-to-end principle and the model operates directly on image pixels. For evaluation purposes, a face occlusion database consisting of over 50[Formula: see text]000 images, with annotated facial parts, was used. Experimental results revealed that the proposed framework is very effective. Using the bespoke face occlusion dataset, Aleix and Robert (AR) face dataset and the Labeled Face in the Wild (LFW) database, we achieved over 85.61%, 97.58% and 100% accuracies for head detection when the Intersection over Union-section (IoU) is larger than 0.5, and 94.55%, 98.58% and 95.41% accuracies for occlusion discrimination, respectively. </jats:p

    Efficient Pedestrian Detection in Urban Traffic Scenes

    Get PDF
    Pedestrians are important participants in urban traffic environments, and thus act as an interesting category of objects for autonomous cars. Automatic pedestrian detection is an essential task for protecting pedestrians from collision. In this thesis, we investigate and develop novel approaches by interpreting spatial and temporal characteristics of pedestrians, in three different aspects: shape, cognition and motion. The special up-right human body shape, especially the geometry of the head and shoulder area, is the most discriminative characteristic for pedestrians from other object categories. Inspired by the success of Haar-like features for detecting human faces, which also exhibit a uniform shape structure, we propose to design particular Haar-like features for pedestrians. Tailored to a pre-defined statistical pedestrian shape model, Haar-like templates with multiple modalities are designed to describe local difference of the shape structure. Cognition theories aim to explain how human visual systems process input visual signals in an accurate and fast way. By emulating the center-surround mechanism in human visual systems, we design multi-channel, multi-direction and multi-scale contrast features, and boost them to respond to the appearance of pedestrians. In this way, our detector is considered as a top-down saliency system. In the last part of this thesis, we exploit the temporal characteristics for moving pedestrians and then employ motion information for feature design, as well as for regions of interest (ROIs) selection. Motion segmentation on optical flow fields enables us to select those blobs most probably containing moving pedestrians; a combination of Histogram of Oriented Gradients (HOG) and motion self difference features further enables robust detection. We test our three approaches on image and video data captured in urban traffic scenes, which are rather challenging due to dynamic and complex backgrounds. The achieved results demonstrate that our approaches reach and surpass state-of-the-art performance, and can also be employed for other applications, such as indoor robotics or public surveillance. In this thesis, we investigate and develop novel approaches by interpreting spatial and temporal characteristics of pedestrians, in three different aspects: shape, cognition and motion. The special up-right human body shape, especially the geometry of the head and shoulder area, is the most discriminative characteristic for pedestrians from other object categories. Inspired by the success of Haar-like features for detecting human faces, which also exhibit a uniform shape structure, we propose to design particular Haar-like features for pedestrians. Tailored to a pre-defined statistical pedestrian shape model, Haar-like templates with multiple modalities are designed to describe local difference of the shape structure. Cognition theories aim to explain how human visual systems process input visual signals in an accurate and fast way. By emulating the center-surround mechanism in human visual systems, we design multi-channel, multi-direction and multi-scale contrast features, and boost them to respond to the appearance of pedestrians. In this way, our detector is considered as a top-down saliency system. In the last part of this thesis, we exploit the temporal characteristics for moving pedestrians and then employ motion information for feature design, as well as for regions of interest (ROIs) selection. Motion segmentation on optical flow fields enables us to select those blobs most probably containing moving pedestrians; a combination of Histogram of Oriented Gradients (HOG) and motion self difference features further enables robust detection. We test our three approaches on image and video data captured in urban traffic scenes, which are rather challenging due to dynamic and complex backgrounds. The achieved results demonstrate that our approaches reach and surpass state-of-the-art performance, and can also be employed for other applications, such as indoor robotics or public surveillance

    Data driven analysis of faces from images

    Get PDF
    This thesis proposes three new data-driven approaches to detect, analyze, or modify faces in images. All presented contributions are inspired by the use of prior knowledge and they derive information about facial appearances from pre-collected databases of images or 3D face models. First, we contribute an approach that extends a widely-used monocular face detector by an additional classifier that evaluates disparity maps of a passive stereo camera. The algorithm runs in real-time and significantly reduces the number of false positives compared to the monocular approach. Next, with a many-core implementation of the detector, we train view-dependent face detectors based on tailored views which guarantee that the statistical variability is fully covered. These detectors are superior to the state of the art on a challenging dataset and can be trained in an automated procedure. Finally, we contribute a model describing the relation of facial appearance and makeup. The approach extracts makeup from before/after images of faces and allows to modify faces in images. Applications such as machine-suggested makeup can improve perceived attractiveness as shown in a perceptual study. In summary, the presented methods help improve the outcome of face detection algorithms, ease and automate their training procedures and the modification of faces in images. Moreover, their data-driven nature enables new and powerful applications arising from the use of prior knowledge and statistical analyses.In der vorliegenden Arbeit werden drei neue, datengetriebene Methoden vorgestellt, die Gesichter in Abbildungen detektieren, analysieren oder modifizieren. Alle Algorithmen extrahieren dabei Vorwissen über Gesichter und deren Erscheinungsformen aus zuvor erstellten Gesichts- Datenbanken, in 2-D oder 3-D. Zunächst wird ein weit verbreiteter monokularer Gesichtsdetektions- Algorithmus um einen zweiten Klassifikator erweitert. In Echtzeit wertet dieser stereoskopische Tiefenkarten aus und führt so zu nachweislich weniger falsch detektierten Gesichtern. Anschließend wird der Basis-Algorithmus durch Parallelisierung verbessert und mit synthetisch generierten Bilddaten trainiert. Diese garantieren die volle Nutzung des verfügbaren Varianzspektrums. So erzeugte Detektoren übertreffen bisher präsentierte Detektoren auf einem schwierigen Datensatz und können automatisch erzeugt werden. Abschließend wird ein Datenmodell für Gesichts-Make-up vorgestellt. Dieses extrahiert Make-up aus Vorher/Nachher-Fotos und kann Gesichter in Abbildungen modifizieren. In einer Studie wird gezeigt, dass vom Computer empfohlenes Make-up die wahrgenommene Attraktivität von Gesichtern steigert. Zusammengefasst verbessern die gezeigten Methoden die Ergebnisse von Gesichtsdetektoren, erleichtern und automatisieren ihre Trainingsprozedur sowie die automatische Veränderung von Gesichtern in Abbildungen. Durch Extraktion von Vorwissen und statistische Datenanalyse entstehen zudem neuartige Anwendungsfelder

    Face recognition in the wild.

    Get PDF
    Research in face recognition deals with problems related to Age, Pose, Illumination and Expression (A-PIE), and seeks approaches that are invariant to these factors. Video images add a temporal aspect to the image acquisition process. Another degree of complexity, above and beyond A-PIE recognition, occurs when multiple pieces of information are known about people, which may be distorted, partially occluded, or disguised, and when the imaging conditions are totally unorthodox! A-PIE recognition in these circumstances becomes really “wild” and therefore, Face Recognition in the Wild has emerged as a field of research in the past few years. Its main purpose is to challenge constrained approaches of automatic face recognition, emulating some of the virtues of the Human Visual System (HVS) which is very tolerant to age, occlusion and distortions in the imaging process. HVS also integrates information about individuals and adds contexts together to recognize people within an activity or behavior. Machine vision has a very long road to emulate HVS, but face recognition in the wild, using the computer, is a road to perform face recognition in that path. In this thesis, Face Recognition in the Wild is defined as unconstrained face recognition under A-PIE+; the (+) connotes any alterations to the design scenario of the face recognition system. This thesis evaluates the Biometric Optical Surveillance System (BOSS) developed at the CVIP Lab, using low resolution imaging sensors. Specifically, the thesis tests the BOSS using cell phone cameras, and examines the potential of facial biometrics on smart portable devices like iPhone, iPads, and Tablets. For quantitative evaluation, the thesis focused on a specific testing scenario of BOSS software using iPhone 4 cell phones and a laptop. Testing was carried out indoor, at the CVIP Lab, using 21 subjects at distances of 5, 10 and 15 feet, with three poses, two expressions and two illumination levels. The three steps (detection, representation and matching) of the BOSS system were tested in this imaging scenario. False positives in facial detection increased with distances and with pose angles above ± 15°. The overall identification rate (face detection at confidence levels above 80%) also degraded with distances, pose, and expressions. The indoor lighting added challenges also, by inducing shadows which affected the image quality and the overall performance of the system. While this limited number of subjects and somewhat constrained imaging environment does not fully support a “wild” imaging scenario, it did provide a deep insight on the issues with automatic face recognition. The recognition rate curves demonstrate the limits of low-resolution cameras for face recognition at a distance (FRAD), yet it also provides a plausible defense for possible A-PIE face recognition on portable devices
    • …
    corecore