406 research outputs found

Face Centered Image Analysis Using Saliency and Deep Learning Based Techniques

Author: Guo Rui
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/08/2016

Image analysis starts with the purpose of configuring vision machines that can perceive like humans, intelligently inferring general principles and sensing the surrounding situation from imagery. This dissertation studies face centered image analysis as the core problem in high level computer vision research and addresses it by tackling three challenging subjects: Is there anything interesting in the image? If there is, what is it? If a person is present, who is he/she, what expression is he/she showing, and can we know his/her age? Answering these questions leads to saliency-based object detection, deep learning structured object categorization and recognition, human facial landmark detection and multitask biometrics. To implement object detection, a three-level saliency detection method based on the self-similarity technique (SMAP) is first proposed. The first level of SMAP uses statistical methods to generate proto-background patches, followed by a second level that computes local contrast based on image self-similarity characteristics. Finally, a spatial color distribution constraint is applied to produce the saliency detection. The outcome of the algorithm is a full resolution image with highlighted salient objects and well-defined edges. In object recognition, the Adaptive Deconvolution Network (ADN) is implemented to categorize the objects extracted by saliency detection. To improve system performance, an L1/2 norm regularized ADN is proposed and tested in different applications. The results demonstrate the efficiency and significance of the new structure. To fully understand the facial biometrics related activity contained in the image, low rank matrix decomposition is introduced to help locate the landmark points on face images. The natural extension of this work is beneficial in human facial expression recognition and facial feature parsing research.
To facilitate the understanding of the detected facial image, automatic facial image analysis becomes essential. We present a novel deeply learnt tree-structured face representation to uniformly model the human face with different semantic meanings. We show that the proposed feature yields a unified representation in multi-task facial biometrics and that the multi-task learning framework is applicable to many other computer vision tasks.

University of Tennessee, Knoxville: Trace

Side information in robust principal component analysis: algorithms and applications

Author: Xue Niannan
Publication venue: Computing, Imperial College London
Publication date: 01/10/2020

Dimensionality reduction and noise removal are fundamental machine learning tasks that are vital to artificial intelligence applications. Principal component analysis has long been utilised in computer vision to achieve these goals. Recently, it has been made robust to outliers in the form of robust principal component analysis. Both convex and non-convex programs have been developed to solve this new formulation, some with exact convergence guarantees. Its effectiveness can be seen in image and video applications ranging from image denoising and alignment to background separation and face recognition. However, robust principal component analysis is by no means perfect. This dissertation identifies its limitations, explores various promising options for improvement and validates the proposed algorithms on both synthetic and real-world datasets. Common algorithms approximate the NP-hard formulation of robust principal component analysis with convex envelopes. Though exact recovery can be guaranteed under certain assumptions, the relaxation margin is too big to be squandered. In this work, we propose to apply gradient descent on the Burer-Monteiro bilinear matrix factorisation to squeeze this margin given available subspaces. This non-convex approach improves upon conventional convex approaches in terms of both accuracy and speed. In addition, there is often accompanying side information when an observation is made. The ability to assimilate such auxiliary sources of data can improve the recovery process. In this work, we investigate in depth such possibilities for incorporating side information in restoring the true underlying low-rank component from gross sparse noise. Lastly, tensors, also known as multi-dimensional arrays, represent real-world data more naturally than matrices. It is thus advantageous to adapt robust principal component analysis to tensors.
Since there is no exact equivalence between tensor rank and matrix rank, we employ the notions of Tucker rank and CP rank as our optimisation objectives. Overall, this dissertation carefully defines the problems encountered in real-world computer vision challenges, extensively and impartially evaluates the state-of-the-art approaches, proposes novel solutions and provides sufficient validation on both simulated data and popular real-world datasets for various mainstream computer vision tasks.
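For orientation, the formulations this abstract refers to are commonly written as below. The symbols X (observation), L (low-rank part), S (sparse part), λ (trade-off weight) and the factors U, V with target rank r are notation assumed here, not taken from the dissertation. The side-information and tensor variants it proposes are not shown.

```latex
% NP-hard formulation: exact rank and sparsity
\min_{L,\,S}\ \operatorname{rank}(L) + \lambda \lVert S \rVert_0
\quad \text{s.t.}\quad X = L + S

% Convex relaxation (principal component pursuit):
% nuclear norm replaces rank, \ell_1 norm replaces \ell_0
\min_{L,\,S}\ \lVert L \rVert_* + \lambda \lVert S \rVert_1
\quad \text{s.t.}\quad X = L + S

% Burer-Monteiro bilinear factorisation of the low-rank part
L = U V^{\top}, \qquad U \in \mathbb{R}^{m \times r},\ V \in \mathbb{R}^{n \times r},
\qquad
\lVert L \rVert_* = \min_{L = U V^{\top}} \tfrac{1}{2}\bigl(\lVert U \rVert_F^2 + \lVert V \rVert_F^2\bigr)
```

The last identity is what lets gradient-based methods work directly on U and V instead of computing a singular value decomposition of L at every step.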

Spiral - Imperial College Digital Repository

Facial Inpainting Methods for Robust Face Recognition

Author: PANAGAKIS VASILEIOS-MARIOS
ΠΑΝΑΓΑΚΗΣ ΒΑΣΙΛΕΙΟΣ-ΜΑΡΙΟΣ
Publication date: 01/01/2021

The human face is probably the most characteristic identifier in every aspect of a person's life. In modern times, the development of cameras and digital electronics has led to the non-stop generation and collection of face images, enabling applications in numerous fields such as education, health, gaming, security, and criminal and forensic investigation. It is obvious that progress in these fields can facilitate people's daily lives and help them live in more secure societies. For applications of this kind to function properly, though, faces must be captured with high clarity and sharpness. This requirement is far from easy to satisfy in real world conditions. Occlusions such as eyeglasses, sunglasses, face masks, scarves, hands and more cause serious corruption of face images and weaken the identification performance of face-related applications. Although some algorithms can handle face recognition with occlusion, they still suffer performance degradation that depends on the extent of the occlusion. Therefore, the removal of occlusions from face images is a very important, yet challenging task. The difficulty lies in the fact that a reconstruction method has to find a way to restore the occluded face parts to a non-occluded form, aiming at the generation of a clean face. As we know, human faces have similar shapes and appearances in general. However, the feature details may differ substantially among people depending on their race, gender and age. These details further raise the difficulty of the face restoration procedure.
The objective of this thesis is the restoration of occluded face images to a non-occluded form, in order to facilitate their identification. To achieve that, we investigate a number of inpainting models and evaluate them on a face recognition task. The models are based on two principal face inpainting methodologies. The first, a supervised method known as Generative Landmark Guided Face Inpainting (or LaFIn), exploits some of the most innovative and state-of-the-art tools in the machine learning field, deep neural networks. LaFIn's architecture benefits from the integration of facial landmarks and accomplishes the desired face restoration. The second, an unsupervised method known as Principal Component Pursuit using Side Information, Features and Missing Values (or PCPSFM), is a variation of the famous Robust Principal Component Analysis (RPCA) method. PCPSFM utilizes domain dependent prior knowledge and manages to recover a low-rank matrix L0 containing the inpainted face. At the same time, it isolates the occlusions in a separate, sparse matrix S0.
To evaluate the proposed methods, we worked on a portion of the popular CelebA dataset, which contains face representations of numerous celebrities. For the purpose of our experiments, we created occlusions of different sizes and shapes, in order to test the models under multiple scenarios. For the evaluation, three different models were employed to determine the dominant inpainting method, based on the percentage of successful matches between the inpainted faces and the clean faces of all the celebrity identities in the dataset.
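The split into a low-rank matrix L0 and a sparse matrix S0 mentioned in this abstract can be illustrated with plain RPCA. The sketch below is not PCPSFM: it uses no side information, features or missing-value handling, and is only the standard principal component pursuit iteration (singular value thresholding plus soft thresholding) with commonly used default parameters. The function names `shrink`, `svt` and `rpca_pcp` and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def shrink(X, tau):
    """Soft thresholding: move every entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def svt(X, tau):
    """Singular value thresholding: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt


def rpca_pcp(M, lam=None, mu=None, tol=1e-7, max_iter=1000):
    """Split M into low-rank L and sparse S with M ~= L + S.

    Assumes M is a nonzero 2-D float array. lam and mu default to
    commonly used heuristic values; they are not tuned.
    """
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    mu = m * n / (4.0 * np.abs(M).sum()) if mu is None else mu
    S = np.zeros_like(M)
    Y = np.zeros_like(M)  # Lagrange multiplier for the constraint M = L + S
    norm_M = np.linalg.norm(M)
    for _ in range(max_iter):
        L = svt(M - S + Y / mu, 1.0 / mu)     # low-rank update
        S = shrink(M - L + Y / mu, lam / mu)  # sparse update
        R = M - L - S                         # constraint residual
        Y = Y + mu * R
        if np.linalg.norm(R) <= tol * norm_M:
            break
    return L, S


if __name__ == "__main__":
    # Synthetic stand-in for "faces + occlusions": rank-2 data, 5% corrupted.
    rng = np.random.default_rng(0)
    clean = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 40))
    occl = np.zeros((40, 40))
    mask = rng.random((40, 40)) < 0.05
    occl[mask] = rng.uniform(-5, 5, mask.sum())
    L0, S0 = rpca_pcp(clean + occl)
    err = np.linalg.norm(L0 - clean) / np.linalg.norm(clean)
    print("relative error of recovered low-rank part:", err)
```

In a face setting, each column of M would typically hold one vectorised image, so that L collects the shared face structure and S the occluding pixels; that layout is an assumption here, not a detail given in the abstract.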

Pergamos : Unified Institutional Repository / Digital Library Platform of the National and Kapodistrian University of Athens