
    Fast Landmark Localization with 3D Component Reconstruction and CNN for Cross-Pose Recognition

    Two approaches are proposed for cross-pose face recognition: one is based on the 3D reconstruction of facial components, and the other on a deep Convolutional Neural Network (CNN). Unlike most 3D approaches, which consider holistic faces, the proposed approach considers 3D facial components. It segments a 2D gallery face into components, reconstructs the 3D surface for each component, and recognizes a probe face by component features. The segmentation is based on landmarks located by a hierarchical algorithm that combines the Faster R-CNN for face detection with the Reduced Tree Structured Model for landmark localization. The core of the CNN-based approach is a revised VGG network. We study performance with different settings of the training set, including synthesized data from 3D reconstruction, real-life data from an in-the-wild database, and both types of data combined. We investigate the performance of the network when it is employed as a classifier and when it is designed as a feature extractor. The two recognition approaches and the fast landmark localization are evaluated in extensive experiments and compared to state-of-the-art methods to demonstrate their efficacy.
    Comment: 14 pages, 12 figures, 4 tables
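
    As a rough sketch of the component segmentation step described above, the following Python snippet crops per-component regions from located landmarks. The 68-point landmark convention, the component-to-index mapping, and the component_boxes helper are assumptions for illustration; the paper's Reduced Tree Structured Model may use its own landmark scheme.

        import numpy as np

        # Component-to-landmark-index mapping, assuming the common 68-point
        # annotation scheme (an assumption; not necessarily the paper's).
        COMPONENTS = {
            "left_eye":  range(36, 42),
            "right_eye": range(42, 48),
            "nose":      range(27, 36),
            "mouth":     range(48, 68),
        }

        def component_boxes(landmarks, margin=0.15):
            """Return a padded bounding box (x0, y0, x1, y1) per component.

            landmarks: (68, 2) array of (x, y) points from the localizer.
            margin: fractional padding added around each component.
            """
            boxes = {}
            for name, idx in COMPONENTS.items():
                pts = landmarks[list(idx)]
                x0, y0 = pts.min(axis=0)
                x1, y1 = pts.max(axis=0)
                pad_x, pad_y = margin * (x1 - x0), margin * (y1 - y0)
                boxes[name] = (x0 - pad_x, y0 - pad_y, x1 + pad_x, y1 + pad_y)
            return boxes

        # Each crop would then feed per-component 3D surface reconstruction.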

    Video Face Swapping

    Face swapping is the task of replacing one or more faces in a target image with a face from a source image; the conditions of the source image (lighting and pose) must be transformed to match those of the target image. Code for Image Face Swapping (IFS) was refactored and used to perform face swapping in videos. The basic logic behind Video Face Swapping (VFS) is the same as for IFS, since a video is just a sequence of images (frames) stitched together to imitate movement. To achieve VFS, the face(s) in an input image are detected and their facial landmark key points are computed as (X, Y) coordinates; the faces are then aligned using a Procrustes analysis. Next, a mask is created for each image to determine which parts of the source and target images should be shown in the output, the shape of the source image is warped onto the shape of the target image, and color correction is performed so that the output looks as natural as possible. Finally, the two masks are blended to generate a new output image showing the face swap. The results were analysed, obstacles in the VFS code were identified, and the code was optimized.
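
    As an illustration of the alignment step, here is a minimal NumPy sketch of Procrustes analysis: it finds the least-squares similarity transform (scale, rotation, translation) mapping source landmarks onto target landmarks. This is a generic formulation, not the refactored IFS/VFS code itself, and returning a 2x3 matrix for use with cv2.warpAffine is an assumption about the surrounding pipeline.

        import numpy as np

        def procrustes_align(src, dst):
            """Similarity transform minimizing ||s * R @ src + t - dst||.

            src, dst: (N, 2) arrays of corresponding (x, y) landmarks.
            Returns a 2x3 affine matrix (e.g. for cv2.warpAffine).
            """
            mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
            src_c, dst_c = src - mu_s, dst - mu_d
            cov = dst_c.T @ src_c / len(src)       # 2x2 cross-covariance
            U, S, Vt = np.linalg.svd(cov)
            d = np.sign(np.linalg.det(U) * np.linalg.det(Vt))
            signs = np.array([1.0, d])             # guard against reflection
            R = (U * signs) @ Vt                   # optimal rotation
            var_s = (src_c ** 2).sum() / len(src)  # source variance
            scale = (S * signs).sum() / var_s      # optimal isotropic scale
            t = mu_d - scale * R @ mu_s            # translation
            return np.hstack([scale * R, t[:, None]])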

    Deep Learning Detection in the Visible and Radio Spectrums

    Deep learning models with convolutional neural networks are being used to solve some of the most difficult problems in computing today. Complicating factors in the use and development of deep learning models include the lack of large volumes of data, the lack of problem-specific samples, and the lack of variation in the samples that are available. The cost of collecting this data and of computing the models for the task of detection remains prohibitive for all but the most well-funded organizations. This thesis approaches deep learning from a cost-reduction and hybrid perspective, incorporating transfer learning, training augmentation, synthetic data generation, morphological computations, and statistical and thresholding model fusion, in the task of detection in two domains: visible-spectrum detection of target spacecraft, and radio-spectrum detection of radio frequency interference in 2D astronomical time-frequency data. The effects of training augmentation on object detection performance are studied in the visible spectrum, as is the effect of image degradation on detection performance. Supplementing training with degraded images significantly improves the detection results, and in scenarios with low degrees of degradation the baseline results are exceeded. Morphological operations on degraded data show promise for reducing computational requirements in some detection tasks. The proposed Mask R-CNN model is able to detect and localize properly on spacecraft images degraded by high levels of pixel loss. Deep learning models such as U-Net have been leveraged for the task of radio frequency interference labeling (flagging). Variations on the U-Net architecture, such as layer size and composition, continue to be explored; however, the examination of deep learning models combined with statistical tests and thresholding techniques for radio frequency interference mitigation is in its infancy. For the radio-spectrum domain, a U-Net model combined with various statistical tests and the SumThreshold technique in an output-fusion model is tested against a baseline of SumThreshold alone for the detection of radio frequency interference. This thesis also contributes an improved dataset for spacecraft detection and a simple technique for generating synthetic channelized voltage data to simulate radio astronomy spectra in a 2D time-frequency plot.
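
    As a loose sketch of the output-fusion idea, the snippet below takes the union of a thresholded U-Net probability map, a precomputed SumThreshold mask, and a simple robust statistical test. The union rule, the probability threshold, and the MAD-based test are assumptions for illustration; the thesis's actual fusion model and statistical tests are not specified here.

        import numpy as np

        def mad_flags(spec, k=5.0):
            """Flag pixels more than k robust sigmas above the median,
            using the median absolute deviation as a std-dev estimate."""
            med = np.median(spec)
            sigma = 1.4826 * np.median(np.abs(spec - med))
            return spec > med + k * sigma

        def fuse_flags(unet_prob, st_mask, spec, prob_thresh=0.5):
            """Union-style fusion of three RFI detectors.

            unet_prob: (T, F) per-pixel RFI probabilities from the U-Net.
            st_mask:   (T, F) boolean flags from a SumThreshold pass.
            spec:      (T, F) time-frequency power spectrogram.
            A pixel is flagged when any of the three detectors fires.
            """
            return (unet_prob >= prob_thresh) | st_mask | mad_flags(spec)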

    Reference face graph for face recognition

    Face recognition has been studied extensively; however, real-world face recognition remains a challenging task. The demand for unconstrained practical face recognition is rising with the explosion of online multimedia, such as social networks and video surveillance footage, where face analysis is of significant importance. In this paper, we approach face recognition in the context of graph theory. We recognize an unknown face using an external reference face graph (RFG): an RFG is generated, and recognition of a given face is achieved by comparing it to the faces in the constructed RFG. Centrality measures are utilized to identify distinctive faces in the reference face graph. The proposed RFG-based face recognition algorithm is robust to changes in pose and is also alignment-free. RFG recognition is used in conjunction with DCT locality-sensitive hashing for efficient retrieval to ensure scalability. Experiments conducted on several publicly available databases show that the proposed approach outperforms state-of-the-art methods without any preprocessing requirements such as face alignment. Owing to the richness of the reference set construction, the proposed method can also handle illumination and expression variations.
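
    A minimal sketch of the centrality idea follows, assuming cosine similarity between face descriptors and weighted degree centrality: reference faces become nodes in a similarity-weighted graph, and the most central nodes are taken as the distinctive faces. The paper's actual descriptors and centrality measure may differ.

        import numpy as np
        import networkx as nx

        def build_rfg(features):
            """Reference face graph with cosine-similarity edge weights.

            features: (N, D) array, one descriptor per reference face.
            """
            unit = features / np.linalg.norm(features, axis=1, keepdims=True)
            sim = unit @ unit.T
            g = nx.Graph()
            g.add_nodes_from(range(len(features)))
            for i in range(len(features)):
                for j in range(i + 1, len(features)):
                    g.add_edge(i, j, weight=float(sim[i, j]))
            return g

        def rank_by_centrality(g):
            """Order faces by weighted degree (node strength), one
            plausible centrality choice among those the paper leaves open."""
            strength = dict(g.degree(weight="weight"))
            return sorted(strength, key=strength.get, reverse=True)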