653 research outputs found

    A review on automated facial nerve function assessment from visual face capture

    Facial Paralysis Grading Based on Dynamic and Static Features

    Peripheral facial nerve palsy, also known as facial paralysis (FP), is a common clinical condition whose assessment relies on subjective judgment and scoring against an FP scale. Some automatic FP grading methods exist, but most consider only static or only dynamic features, which results in low grading accuracy. This thesis proposes an automatic FP assessment method that uses both static and dynamic features. The method first preprocesses the collected facial expression videos of the subjects: rough video clipping, video stabilization, keyframe extraction, geometric image normalization and gray-scale normalization. It then selects the neutral (no expression) state and the maximum expression state as keyframes to build the research data set. This preprocessing reduces errors, noise and redundancy in the original data. Static and dynamic feature extraction both start from 68 facial landmarks located with the Ensemble of Regression Trees algorithm. Regions of interest in the image are defined from the landmark points. Optical flow is extracted for parts of the face with the Horn-Schunck method, and the dynamic features are computed from the optical flow difference between the left and right parts. A 32-dimensional static feature vector is fed into a support vector machine for classification, and a 60-dimensional dynamic feature vector is fed into a long short-term memory (LSTM) network for classification. Finally, the dynamic and static classification results are weighted and combined to obtain the FP rating of each subject. Videos of 30 subjects, from which 1419 keyframes were extracted, are used to test the algorithm. The accuracy, precision, recall and F1 score of the best classifier reach 93.33%, 94.29%, 91.33% and 91.87%, respectively.
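
The abstract above describes a dynamic feature based on the optical flow difference between the left and right parts of the face. A minimal sketch of one such measure follows, assuming the flow vectors for each side have already been computed (for example by a Horn-Schunck implementation). The function names, the normalisation and the sample vectors are illustrative assumptions, not the thesis's actual feature definition.

```python
import math

def mean_flow_magnitude(flow):
    """Average magnitude of a list of (u, v) optical-flow vectors."""
    if not flow:
        raise ValueError("flow region is empty")
    return sum(math.hypot(u, v) for u, v in flow) / len(flow)

def flow_asymmetry(left_flow, right_flow):
    """Normalised left/right difference in mean flow magnitude, in [0, 1].

    0 means both sides moved equally; values near 1 mean one side barely moved.
    This normalisation is an assumption made for illustration.
    """
    left = mean_flow_magnitude(left_flow)
    right = mean_flow_magnitude(right_flow)
    total = left + right
    if total == 0.0:
        return 0.0  # no motion on either side: treat as symmetric
    return abs(left - right) / total

# Hypothetical flow vectors for matching regions on each side of the face.
healthy_side = [(3.0, 4.0), (6.0, 8.0)]  # magnitudes 5 and 10
weak_side = [(0.6, 0.8), (1.2, 1.6)]     # magnitudes 1 and 2
score = flow_asymmetry(healthy_side, weak_side)
```

A larger score indicates a stronger left/right imbalance in motion. A real pipeline would compute one such value per facial region and per frame pair before passing the sequence to a classifier.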

    Severity scoring approach using modified optical flow method and lesion identification for facial nerve paralysis assessment

    The facial nerve controls facial movement and expression. A patient with facial nerve paralysis will therefore experience impaired social interaction, psychological distress and low self-esteem. At first presentation, it is crucial to determine the severity of the paralysis and to rule out stroke or other serious causes by recognising the type of lesion, so that the patient is not mistreated. Clinically, the facial nerve is assessed subjectively: the clinician observes voluntary facial movement and assigns a score based on their own deductions. However, the results are not consistent among different examiners evaluating the same patients, which is highly undesirable for both diagnosis and treatment. Acknowledging the importance of this assessment, this research developed a facial nerve assessment that classifies both the severity level of facial nerve function (regional assessment) and the type of facial lesion, Upper Motor Neuron (UMN) or Lower Motor Neuron (LMN) (lesion assessment). For regional assessment, two optical flow techniques, Kanade-Lucas-Tomasi (KLT) and Horn-Schunck (HS), were used to determine the local and global motion information of facial features. The original KLT has a problem, however: its Eigen features cannot distinguish normal subjects from patients. The KLT method was therefore modified by introducing polygonal measurements, with landmarks placed on each facial region. Similarly, for the HS method, evaluation over multiple frames was proposed instead of the single-frame evaluation of the original HS method, to avoid the differences between frames becoming too small. The features of these modified methods, Modified Local Sparse (MLS) and Modified Global Dense (MGD), were combined into the Combined Modified Local-Global (CMLG) features to capture both local (certain region) and global (entire image) flow. Each feature set served as input to a k-NN classifier to assess its performance in determining the severity level of paralysis. For the lesion assessment, the Gabor filter method was used to extract forehead wrinkle features. The Gabor features were then combined with the CMLG features, restricted to the forehead region, to evaluate both the wrinkle and the motion information. The rationale is that a patient with an LMN lesion cannot move the forehead symmetrically when raising the eyebrows and cannot wrinkle the forehead, because of the damaged frontalis muscle. A patient with a UMN lesion, by contrast, presents like a normal subject: the forehead is spared and can be lifted symmetrically. In regional assessment, CMLG showed the best performance in distinguishing patients from normal subjects, with an accuracy of 92.26% compared with 88.38% for MLS and 90.32% for MGD. From these results, several assessment tools were developed in this study, namely an individual score, a total score and a paralysis score chart, which were correlated with the House-Brackmann score and validated by a medical professional with an accuracy of 91.30%. In lesion assessment, the combined Gabor and CMLG features on the forehead region distinguished UMN from LMN lesions with an accuracy of 89.03%, compared with 78.07% for Gabor alone. In conclusion, the proposed facial nerve assessment approach, consisting of both regional assessment and lesion assessment, is capable of determining the level of facial paralysis severity and recognising whether the facial lesion is a UMN or LMN lesion.
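
The abstract above feeds combined local and global flow features into a k-NN classifier. A minimal sketch of that idea follows, assuming features are plain numeric lists, distance is Euclidean and the vote is a simple majority. The feature values, labels and the choice k=3 are hypothetical and do not come from the study.

```python
import math
from collections import Counter

def combine_features(local_feats, global_feats):
    """Concatenate local (sparse) and global (dense) flow features."""
    return list(local_feats) + list(global_feats)

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors nearest to the query."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training samples")
    ranked = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training set: 2 local + 2 global values per subject.
train_x = [
    combine_features([0.1, 0.1], [0.2, 0.1]),
    combine_features([0.2, 0.1], [0.1, 0.2]),
    combine_features([0.1, 0.2], [0.2, 0.2]),
    combine_features([0.8, 0.9], [0.7, 0.9]),
    combine_features([0.9, 0.8], [0.9, 0.7]),
    combine_features([0.7, 0.7], [0.8, 0.8]),
]
train_y = ["normal", "normal", "normal", "patient", "patient", "patient"]

query = combine_features([0.75, 0.8], [0.85, 0.8])
prediction = knn_predict(train_x, train_y, query, k=3)
```

Concatenation is only one way to combine the two feature sets. It keeps the example short, but it assumes both sets are on comparable scales.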

    Local binary patterns for 1-D signal processing

    Local Binary Patterns (LBP) have been used in 2-D image processing for applications such as texture segmentation and feature detection. This paper presents a new 1-dimensional (1-D) LBP signal processing method. Speech systems such as hearing aids require fast and computationally inexpensive signal processing. The practical use of LBP-based speech processing is demonstrated on two signal processing problems: (i) signal segmentation and (ii) voice activity detection (VAD). Both applications use the underlying features extracted from the 1-D LBP. The proposed VAD algorithm demonstrates the simplicity of 1-D LBP processing, with low computational complexity. It is also shown that distinct LBP features are obtained that identify the voiced and unvoiced components of a speech signal.
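
As an illustration of the 1-D LBP idea in the abstract above, the sketch below thresholds the P neighbours of each sample against the centre sample and packs the results into a P-bit code. The neighbour ordering, the bit weights and the `>=` threshold are assumptions based on the common LBP formulation. The paper's exact definition may differ.

```python
from collections import Counter

def lbp_1d(signal, p=8):
    """Return one LBP code per sample that has p/2 neighbours on each side.

    Each neighbour contributes a 1 bit when it is >= the centre sample.
    Bit 0 is the leftmost neighbour and bit p-1 is the rightmost.
    """
    if p <= 0 or p % 2:
        raise ValueError("p must be a positive even number")
    samples = list(signal)
    half = p // 2
    codes = []
    for i in range(half, len(samples) - half):
        centre = samples[i]
        neighbours = samples[i - half:i] + samples[i + 1:i + half + 1]
        code = 0
        for bit, value in enumerate(neighbours):
            if value >= centre:
                code |= 1 << bit
        codes.append(code)
    return codes

def lbp_histogram(signal, p=8):
    """Count how often each LBP code occurs in the signal."""
    return Counter(lbp_1d(signal, p))

# Small worked example with p=2 (one neighbour on each side).
codes = lbp_1d([1, 3, 2, 5, 4], p=2)
```

Here the local peaks at 3 and 5 give code 0 and the local valley at 2 gives code 3, so `codes` is `[0, 3, 0]`. A histogram of codes over a short frame is the kind of inexpensive feature that could feed a segmentation or voice activity decision.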

    Deep human face analysis and modelling

    Human face appearance and motion play a significant role in creating the complex social environments of human civilisation. Humans can analyse faces and draw conclusions such as the identity of an individual, their emotional state, or the presence of disease. This capacity is not universal, however: medical conditions such as prosopagnosia and autism can directly affect an individual's facial analysis capabilities, and other facial analysis tasks require specific traits and training to perform well. This has led to research into facial analysis systems within the computer vision and machine learning fields over the previous decades, with the aim of automating many facial analysis tasks to a level similar to or surpassing that of humans. Breakthroughs have been made in certain tasks, and with the emergence of deep learning methods in recent years, new state-of-the-art results have been achieved in many computer vision and machine learning tasks. This thesis investigates the use of deep learning based methods for facial analysis systems. Following a review of the literature, specific facial analysis tasks, methods and challenges are identified, and these form the basis for the research findings presented. The research focuses on face detection and facial symmetry analysis, specifically for the medical condition facial palsy. First, an initial approach to face detection and symmetry analysis is proposed using a unified multi-task Faster R-CNN framework; this method shows good accuracy on the test data sets for both tasks but also demonstrates limitations from which the remaining chapters take their inspiration. Next, the Integrated Deep Model is proposed for face detection and landmark localisation, with a specific focus on reducing false positive face detections, which is crucial for accurate facial feature extraction in the medical applications studied within this thesis. Evaluation on the Face Detection Dataset and Benchmark and the Annotated Faces in-the-Wild benchmark data sets shows a significant increase of over 50% in precision against other state-of-the-art face detection methods, while retaining a high level of recall. Facial symmetry analysis and facial palsy grading are the focus of the final chapters, where both geometry-based symmetry features and 3D CNNs are applied. Evaluation shows that both methods are valid for grading facial palsy. The 3D CNNs are the most accurate, with an F1 score of 0.88, and can also recognise mouth motion in people with and without facial palsy, with an F1 score of 0.82.
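
The abstract above mentions geometry-based symmetry features without defining them. One simple possibility is sketched below, assuming 2-D landmarks on an upright, normalised face with a vertical midline: reflect each right-side landmark across the midline and measure its distance to the matching left-side landmark. The landmark coordinates, the pairing and the midline position are hypothetical, and this is not claimed to be the feature set used in the thesis.

```python
import math

def asymmetry_score(landmarks, pairs, midline_x):
    """Mean distance between left landmarks and mirrored right landmarks.

    landmarks: list of (x, y) points.
    pairs: list of (left_index, right_index) for anatomically matching points.
    midline_x: x coordinate of the assumed vertical facial midline.
    """
    if not pairs:
        raise ValueError("at least one landmark pair is required")
    total = 0.0
    for left_idx, right_idx in pairs:
        lx, ly = landmarks[left_idx]
        rx, ry = landmarks[right_idx]
        mirrored_rx = 2.0 * midline_x - rx  # reflect across the midline
        total += math.hypot(lx - mirrored_rx, ly - ry)
    return total / len(pairs)

# Hypothetical landmarks: eye corners are level, one mouth corner droops by 6 px.
landmarks = [(40.0, 50.0), (60.0, 50.0), (42.0, 80.0), (58.0, 86.0)]
pairs = [(0, 1), (2, 3)]
score = asymmetry_score(landmarks, pairs, midline_x=50.0)
```

A perfectly symmetric face gives a score of 0. In this example only the mouth pair contributes, so the mean over the two pairs is 3.0 pixels.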

    Models and Analysis of Vocal Emissions for Biomedical Applications

    The International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications (MAVEBA) came into being in 1999 from the strongly felt need to share know-how, objectives and results between areas that until then had seemed quite distinct, such as bioengineering, medicine and singing. MAVEBA deals with all aspects of the study of the human voice, with applications ranging from the neonate to the adult and elderly. Over the years the initial topics have grown and spread into other areas of research such as occupational voice disorders, neurology, rehabilitation, and image and video analysis. MAVEBA takes place every two years, always in Firenze, Italy. This edition celebrates twenty years of uninterrupted and successful research in the field of voice analysis.

    3D facial model analysis for clinical medicine

    Ph.D. (Doctor of Philosophy)