2 research outputs found

    Information theoretic thresholding techniques based on particle swarm optimization.

    In this dissertation, we discuss multi-level image thresholding techniques based on information-theoretic entropies. To exploit the correlation between neighboring pixels of an image and obtain better segmentation results, we propose several multi-level thresholding models that use the Gray-Level & Local-Average (GLLA) histogram and the Gray-Level & Local-Variance (GLLV) histogram. First, we discuss an RGB color image thresholding model based on the GLLA histogram and the Tsallis-Havrda-Charvát entropy. We validate the multi-level thresholding criterion function by mathematical induction. For each component image, we assign every thresholded class its mean value, which yields three independently segmented component images. We then obtain the segmented color image by combining the three segmented component images. Second, we use the GLLV histogram to propose three novel entropic multi-level thresholding models based on the Shannon entropy, the Rényi entropy and the Tsallis-Havrda-Charvát entropy, respectively. We then apply these models to the three components of an RGB color image to complete the RGB color image segmentation. An entropic thresholding model essentially searches for the optimal threshold values by maximizing or minimizing a criterion function. We apply the particle swarm optimization (PSO) algorithm to search for the optimal threshold values in all the models. We conduct extensive experiments on the Berkeley Segmentation Dataset and Benchmark (BSDS300) and calculate the averages of four performance indices (Probabilistic Rand Index (PRI), Global Consistency Error (GCE), Variation of Information (VOI) and Boundary Displacement Error (BDE)) to show the effectiveness and reasonableness of the proposed models.
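The idea of searching for thresholds that maximize an entropic criterion with PSO can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's models: it uses a plain 1-D gray-level histogram with a Shannon-entropy criterion (the sum of class entropies) rather than the 2-D GLLA or GLLV histograms, and the function names, swarm size, and the PSO coefficients 0.7 and 1.5 are all hypothetical choices.

```python
import numpy as np

def shannon_criterion(hist, thresholds):
    """Sum of the Shannon entropies of the classes cut out by the thresholds.

    Class k covers gray levels [edges[k], edges[k+1]). Assumption: 1-D histogram.
    """
    p = hist / hist.sum()
    edges = [0] + sorted(int(t) for t in thresholds) + [len(p)]
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = p[lo:hi].sum()
        if w <= 0:
            return -np.inf  # empty class: reject this candidate
        q = p[lo:hi] / w    # class-conditional distribution
        q = q[q > 0]
        total += -(q * np.log(q)).sum()
    return total

def pso_thresholds(hist, k, n_particles=30, n_iter=100, seed=0):
    """Basic global-best PSO over k real-valued thresholds (rounded for scoring)."""
    rng = np.random.default_rng(seed)
    levels = len(hist)
    pos = rng.uniform(1, levels - 1, size=(n_particles, k))
    vel = np.zeros_like(pos)

    def fitness(x):
        return shannon_criterion(hist, np.round(x))

    pbest = pos.copy()
    pbest_val = np.array([fitness(x) for x in pos])
    g = pbest[np.argmax(pbest_val)].copy()
    g_val = pbest_val.max()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Inertia + cognitive + social terms (hypothetical coefficients).
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (g - pos)
        pos = np.clip(pos + vel, 1, levels - 1)
        vals = np.array([fitness(x) for x in pos])
        better = vals > pbest_val
        pbest[better] = pos[better]
        pbest_val[better] = vals[better]
        if pbest_val.max() > g_val:
            g_val = pbest_val.max()
            g = pbest[np.argmax(pbest_val)].copy()
    return sorted(int(t) for t in np.round(g)), g_val
```

Calling `pso_thresholds(hist, k=3)` on a 256-bin histogram returns three sorted integer thresholds and the criterion value reached. For a color image, one would presumably run it once per R, G and B component, as the abstract describes.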

    Facial expression recognition and intensity estimation.

    Doctoral Degree. University of KwaZulu-Natal, Durban. Facial expression is one of the most profound non-verbal channels through which a person's emotional state is inferred, from the deformation or movement of face components when facial muscles are activated. Facial Expression Recognition (FER) is an active research field in Computer Vision (CV) and Human-Computer Interaction (HCI). Its applications include, but are not limited to, robotics, gaming, medicine, education, security and marketing. Facial expressions carry a wealth of information, and categorising that information into primary emotion states only limits FER performance. This thesis investigates an approach that simultaneously predicts the emotional state of facial expression images and the corresponding degree of intensity. The task also extends to resolving the ambiguous nature of FER and annotation inconsistencies with a label distribution learning method that considers correlation among data. We first propose a multi-label approach for FER and its intensity estimation using advanced machine learning techniques. To our knowledge, this approach has not previously been considered for emotion and intensity estimation in the field. The approach uses problem transformation to present FER as a multi-label task, such that every facial expression image has unique emotion information alongside the corresponding degree of intensity at which the emotion is displayed. A Convolutional Neural Network (CNN) with a sigmoid function at the final layer is the classifier for the model. The model, termed ML-CNN (Multilabel Convolutional Neural Network), successfully achieves concurrent emotion prediction and intensity estimation. ML-CNN prediction is challenged by overfitting and by intraclass and interclass variations. We employ a pretrained Visual Geometry Group-16 (VGG-16) network to resolve the overfitting challenge, and an aggregation of island loss and binary cross-entropy loss to minimise the effect of intraclass and interclass variations. The enhanced ML-CNN model shows promising results and better performance than other standard multi-label algorithms. Finally, we address data annotation inconsistency and ambiguity in FER data using Isomap manifold learning with Graph Convolutional Networks (GCN). The GCN uses the distance along the Isomap manifold as the edge weight, which appropriately models the similarity between adjacent nodes for emotion prediction. The proposed method produces promising results in comparison with state-of-the-art methods. The author's list of publications is on page xi of this thesis.
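The problem transformation described above, where each image carries one emotion label and one intensity label scored through a sigmoid output layer, can be sketched in a few lines. This is an illustrative sketch only: the label sets `EMOTIONS` and `INTENSITIES`, the helper names, and the per-group argmax decoding are assumptions, since the abstract does not list the thesis's label vocabulary or decoding rule, and the CNN that would produce the logits is omitted.

```python
import numpy as np

# Hypothetical label vocabularies; the thesis's actual sets are not given here.
EMOTIONS = ["anger", "happiness", "sadness", "surprise"]
INTENSITIES = ["low", "normal", "high"]

def encode(emotion, intensity):
    """Multi-hot target: one active emotion bit plus one active intensity bit."""
    y = np.zeros(len(EMOTIONS) + len(INTENSITIES))
    y[EMOTIONS.index(emotion)] = 1.0
    y[len(EMOTIONS) + INTENSITIES.index(intensity)] = 1.0
    return y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(logits, target, eps=1e-12):
    """Binary cross-entropy averaged over the independent sigmoid outputs."""
    p = np.clip(sigmoid(logits), eps, 1 - eps)
    return float(-(target * np.log(p) + (1 - target) * np.log(1 - p)).mean())

def decode(logits):
    """Assumed decoding rule: most probable label within each group."""
    p = sigmoid(logits)
    e = int(np.argmax(p[:len(EMOTIONS)]))
    i = int(np.argmax(p[len(EMOTIONS):]))
    return EMOTIONS[e], INTENSITIES[i]
```

Because each output unit has its own sigmoid, the emotion and intensity bits are scored independently, which is what lets a single network head predict both at once. The island loss term mentioned in the abstract is not shown here.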