1,719 research outputs found

    Deep Multi-task Multi-label CNN for Effective Facial Attribute Classification

    Facial Attribute Classification (FAC) has attracted increasing attention in computer vision and pattern recognition. However, state-of-the-art FAC methods perform face detection/alignment and FAC independently, so the inherent dependencies between these tasks are not fully exploited. In addition, most methods predict all facial attributes using the same CNN architecture, which ignores the different learning complexities of facial attributes. To address these problems, we propose a novel deep multi-task multi-label CNN, termed DMM-CNN, for effective FAC. Specifically, DMM-CNN jointly optimizes two closely related tasks (i.e., facial landmark detection and FAC) to improve the performance of FAC by taking advantage of multi-task learning. To deal with the diverse learning complexities of facial attributes, we divide the attributes into two groups: objective attributes and subjective attributes. Two different network architectures are designed to extract features for the two groups of attributes, and a novel dynamic weighting scheme is proposed to automatically assign the loss weight to each facial attribute during training. Furthermore, an adaptive thresholding strategy is developed to alleviate the problem of class imbalance in multi-label learning. Experimental results on the challenging CelebA and LFWA datasets show the superiority of the proposed DMM-CNN method compared with several state-of-the-art FAC methods.

    Real-time acquisition of multi-view face images to support robust face recognition using a wireless camera network

    Recent terror attacks, intrusion attempts and criminal activities have necessitated a transition to modern biometric systems that are capable of identifying suspects in real time. But real-time biometrics is challenging given the computationally intensive nature of video processing and the potential occlusions and variations in pose of a subject in an unconstrained environment. The objective of this dissertation is to utilize the robustness and parallel computational abilities of a distributed camera network for fast and robust face recognition. To support face recognition using a camera network, a collaborative middleware service is designed that enables the rapid extraction of multi-view face images of multiple subjects moving through a region. This service exploits the epipolar geometry between cameras to speed up multi-view face detection. By quickly detecting face images within the network, labeling the pose of each face image, filtering them based on their suitability for recognition and transmitting only the resulting images to a base station for recognition, both the required network bandwidth and the centralized processing overhead are reduced. The performance of the face image acquisition system is evaluated using an embedded camera network deployed in indoor environments that mimic walkways in public places. The relevance of the acquired images for recognition is evaluated by using commercial software to match the acquired probe images. The experimental results demonstrate a significant improvement in face recognition performance over traditional systems, as well as an increase in the multi-view face detection rate over purely image-processing-based approaches.
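
As a rough illustration of how epipolar geometry can narrow the search for a face in a second camera, the sketch below maps a face centre detected in one view to its epipolar line in another view and measures how far a candidate detection lies from that line. The fundamental matrix `F_RECTIFIED`, the function names, and the example pixel values are assumptions made for illustration; they are not taken from the dissertation, whose actual middleware is not described here.

```python
import math

def epipolar_line(F, point):
    """Return (a, b, c) of the line a*u + b*v + c = 0 in view 2 for a pixel in view 1."""
    u, v = point
    x = (u, v, 1.0)  # homogeneous pixel coordinates
    # l' = F x, written out without external libraries
    return tuple(sum(F[r][k] * x[k] for k in range(3)) for r in range(3))

def distance_to_line(line, point):
    """Perpendicular pixel distance from a point in view 2 to an epipolar line."""
    a, b, c = line
    u, v = point
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("degenerate epipolar line")
    return abs(a * u + b * v + c) / norm

# Hypothetical fundamental matrix for an ideal rectified camera pair
# (pure horizontal shift), for which epipolar lines are image rows.
F_RECTIFIED = [[0.0, 0.0, 0.0],
               [0.0, 0.0, -1.0],
               [0.0, 1.0, 0.0]]

line = epipolar_line(F_RECTIFIED, (10, 20))
offset = distance_to_line(line, (50, 23))
```

A detector in the second camera could then restrict its scan to a band of a few pixels around `line` and discard candidates whose `offset` exceeds that band, which is one plausible way such a constraint reduces detection work.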

    Mathematical modeling for partial object detection.

    From a computer vision point of view, an image is a scene consisting of objects of interest and a background represented by everything else in the image. The relations and interactions among these objects are the key factors for scene understanding. In this dissertation, a mathematical model is designed for the detection of partially occluded faces captured in unconstrained real-life conditions. The novelty of the proposed model comes from explicitly considering certain objects that commonly occlude faces and embedding them in the face model. This enables the detection of faces in difficult settings and provides more information to subsequent analysis in addition to the bounding box of the face. In the proposed Selective Part Models (SPM), the face is modelled as a collection of parts that can be selected from the visible regular facial parts and from occluding objects that commonly interact with faces, such as sunglasses, caps, hands, shoulders, and other faces. Since face detection is the first step in the face recognition pipeline, the proposed model not only detects partially occluded faces efficiently but also indicates the occluded parts to be excluded from the subsequent recognition step. The model was tested on several recent face detection databases and benchmarks and achieved state-of-the-art performance. In addition, a detailed analysis of the performance with respect to different types of occlusion was provided. Moreover, a new database was collected for evaluating face detectors with a focus on the partial occlusion problem. This dissertation highlights the importance of explicitly handling the partial occlusion problem in face detection and shows its effectiveness in enhancing both the face detection performance and the subsequent recognition performance for partially occluded faces. The broader impact of the proposed detector extends beyond common security applications to human-robot interaction. The humanoid robot Nao is used to help in teaching children with autism, and the proposed detector is used to achieve natural interaction between the robot and the children by detecting their faces, which can be used for recognition or, more interestingly, for adaptive interaction by analyzing their expressions.

    Video Face Swapping

    Face swapping is the task of replacing one or more faces in a target image with a face from a source image; the conditions of the source image (lighting and pose) must be transformed to match those of the target image. Code for Image Face Swapping (IFS) was refactored and used to perform face swapping in videos. The basic logic behind Video Face Swapping (VFS) is the same as for IFS, since a video is just a sequence of images (frames) stitched together to imitate movement. To achieve VFS, the face(s) in an input image are detected, and their facial landmark key points are calculated and assigned to their corresponding (X, Y) coordinates. The faces are then aligned using Procrustes analysis. Next, a mask is created for each image to determine which parts of the source and target images are shown in the output. The shape of the source image is then warped onto the shape of the target image and, for the output to look as natural as possible, color correction is performed. Finally, the two masks are blended to generate a new output image showing the face swap. The results were analysed, obstacles in the VFS code were identified, and the code was optimized.
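
The Procrustes alignment step mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes the ordinary 2D similarity form (scale, rotation and translation, no reflection) fitted by least squares; the abstract does not say which variant the thesis code uses, and the function name `procrustes_align` and the sample landmarks are hypothetical. Points are treated as complex numbers, so a single complex factor encodes both scale and rotation.

```python
def procrustes_align(src, dst):
    """Fit the least-squares similarity transform mapping src landmarks onto dst.

    src, dst: equal-length sequences of (x, y) landmark pairs.
    Returns (a, transform), where the complex number `a` holds scale and
    rotation, and transform(point) maps a source point into the target frame.
    """
    if len(src) != len(dst) or len(src) < 2:
        raise ValueError("need two landmark lists of equal length >= 2")
    zs = [complex(x, y) for x, y in src]
    zd = [complex(x, y) for x, y in dst]
    mean_s = sum(zs) / len(zs)
    mean_d = sum(zd) / len(zd)
    cs = [z - mean_s for z in zs]  # centred source landmarks
    cd = [z - mean_d for z in zd]  # centred target landmarks
    denom = sum(abs(z) ** 2 for z in cs)
    if denom == 0:
        raise ValueError("source landmarks are all identical")
    # Minimiser of sum |a*cs_i - cd_i|^2 over complex a
    a = sum(z.conjugate() * w for z, w in zip(cs, cd)) / denom

    def transform(point):
        z = a * (complex(*point) - mean_s) + mean_d
        return (z.real, z.imag)

    return a, transform

# Hypothetical landmarks: dst is src rotated by 90 degrees, scaled by 2,
# and shifted by (3, 4).
src_landmarks = [(0, 0), (1, 0), (0, 1)]
dst_landmarks = [(3, 4), (3, 6), (1, 4)]
a, to_target = procrustes_align(src_landmarks, dst_landmarks)
```

Here `abs(a)` is the recovered scale and the argument of `a` is the rotation angle; in a face-swap pipeline the fitted transform would be applied to the source face before masking and blending.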