
    RankTrace: relative and unbounded affect annotation

    How should annotation data be processed so that it can best characterize the ground truth of affect? This paper attempts to address this critical question by testing various methods of processing annotation data on their ability to capture phasic elements of skin conductance. Towards this goal, the paper introduces a new affect annotation tool, RankTrace, which allows affect to be annotated in a continuous yet unbounded fashion. RankTrace is tested on first-person annotation lines (traces) of tension elicited by a horror video game. The key findings of the paper suggest that the relative processing of traces via their mean gradient yields the best and most robust predictors of phasic manifestations of skin conductance.
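    The mean-gradient processing described above can be sketched as follows; the function name and sample values are illustrative, not from the paper. The idea is that summarizing a trace by the mean of its first differences keeps only the direction and rate of change, so the unbounded absolute level of the annotation drops out:

    ```python
    import statistics

    def mean_gradient(trace):
        """Relative processing of an unbounded annotation trace:
        summarize it by the mean of its first differences, so only
        the rate of change matters, not the absolute level."""
        if len(trace) < 2:
            raise ValueError("trace needs at least two samples")
        diffs = [b - a for a, b in zip(trace, trace[1:])]
        return statistics.mean(diffs)

    # A rising tension trace has the same mean gradient regardless of offset.
    rising = [0.0, 0.5, 1.5, 3.0]
    shifted = [t + 100.0 for t in rising]  # same shape, different level
    print(mean_gradient(rising))   # 1.0
    print(mean_gradient(shifted))  # 1.0
    ```

    Because the statistic is invariant to the trace's offset, two annotators using very different regions of an unbounded scale still produce comparable predictors.
    
    
    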

    Deep recurrent neural networks with attention mechanisms for respiratory anomaly classification.

    In recent years, a variety of deep learning techniques and methods have been adopted to provide AI solutions within the medical field, one specific area being audio-based classification of medical datasets. This research aims to create a novel deep learning architecture for this purpose, with a variety of layer structures implemented for audio classification. Specifically, bidirectional Long Short-Term Memory (BiLSTM) and Gated Recurrent Unit (GRU) networks, in conjunction with an attention mechanism, are implemented for chronic and non-chronic lung disease and COVID-19 diagnosis. We employ two audio datasets, the Respiratory Sound and Coswara datasets, to evaluate the proposed model architectures for lung disease classification. The Respiratory Sound Database contains audio data on lung conditions such as Chronic Obstructive Pulmonary Disease (COPD) and asthma, while the Coswara dataset contains coughing audio samples associated with COVID-19. After a comprehensive evaluation and experimentation process, the proposed attention BiLSTM network (A-BiLSTM) emerges as the most performant architecture, achieving accuracy rates of 96.2% and 96.8% on the Respiratory Sound and Coswara datasets, respectively. Our research indicates that the BiLSTM and attention mechanism were effective in improving audio classification performance across various lung condition diagnoses.
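    The attention mechanism layered on top of the recurrent network can be sketched in miniature: score each recurrent hidden state, softmax the scores, and return the weighted sum as a pooled representation. This is a generic soft-attention pooling sketch, not the paper's exact A-BiLSTM layer; the all-ones query vector stands in for a learned parameter:

    ```python
    import math

    def attention_pool(hidden_states):
        """Soft attention over a sequence of hidden-state vectors:
        score each step, softmax the scores, return the weighted sum.
        The score is a dot product with a query vector fixed to ones
        purely for illustration (in a real model it is learned)."""
        dim = len(hidden_states[0])
        query = [1.0] * dim
        scores = [sum(h[i] * query[i] for i in range(dim))
                  for h in hidden_states]
        m = max(scores)                      # stabilised softmax
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        pooled = [sum(w * h[i] for w, h in zip(weights, hidden_states))
                  for i in range(dim)]
        return weights, pooled

    weights, pooled = attention_pool([[0.1, 0.2], [0.4, 0.3], [0.9, 0.8]])
    print(weights)  # the highest-scoring time step dominates the pooling
    ```

    In the full model the pooled vector would feed a dense classification head; the attention weights also make the model's focus over the audio sequence inspectable.
    
    
    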

    A Multi-Population FA for Automatic Facial Emotion Recognition

    Automatic facial emotion recognition systems are popular in various domains such as health care, surveillance and human-robot interaction. In this paper, we present a novel multi-population firefly algorithm (FA) for automatic facial emotion recognition. The overall system is equipped with horizontal vertical neighborhood local binary patterns (hvnLBP) for feature extraction, a novel multi-population FA for feature selection, and diverse classifiers for emotion recognition. First, we extract features using hvnLBP, which are robust to illumination changes, scaling and rotation variations. Then, a novel FA variant is proposed to select the most important and emotion-specific features. These selected features are used as input to the classifiers to recognize seven basic emotions. The proposed system is evaluated on multiple facial expression datasets and compared with other state-of-the-art models.
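    The general firefly-algorithm scheme for feature selection can be sketched as a toy binary search: each firefly is a 0/1 feature mask, dimmer fireflies copy bits from brighter ones (playing the role of attractiveness-driven movement), and random bit flips act as the randomisation term. This is a minimal single-population sketch under those assumptions, not the paper's multi-population variant, and the fitness function below is a toy stand-in for a classifier-based wrapper:

    ```python
    import random

    def firefly_feature_selection(fitness, n_features, n_fireflies=8,
                                  n_iters=30, seed=0):
        """Toy binary firefly search: masks move toward the brightest
        mask bit-by-bit, with occasional random flips; moves are kept
        only if they do not reduce fitness."""
        rng = random.Random(seed)
        swarm = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_fireflies)]
        best = max(swarm, key=fitness)
        for _ in range(n_iters):
            brighter = max(swarm, key=fitness)
            for i, fly in enumerate(swarm):
                new = [b if rng.random() < 0.5 else f      # attraction step
                       for f, b in zip(fly, brighter)]
                new = [1 - bit if rng.random() < 0.1 else bit  # random flip
                       for bit in new]
                if fitness(new) >= fitness(fly):
                    swarm[i] = new
            best = max(best, max(swarm, key=fitness), key=fitness)
        return best

    # Toy fitness: reward masks close to selecting the first three features.
    target = [1, 1, 1, 0, 0, 0]
    fit = lambda mask: -sum(abs(a - b) for a, b in zip(mask, target))
    print(firefly_feature_selection(fit, 6))
    ```

    In the paper's wrapper setting, `fitness` would instead train and score a classifier on the masked feature subset, which is where emotion-specific features are rewarded.
    
    
    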

    Failure Mode Identification of Elastomer for Well Completion Systems using Mask R-CNN


    Mask R-CNN Transfer Learning Variants for Multi-Organ Medical Image Segmentation

    Medical abdomen image segmentation is a challenging task owing to the difficulty of discerning the characteristics of a tumour against other organs. As an effective image segmenter, Mask R-CNN has been employed in many medical imaging applications, e.g. for segmenting the nucleus from the cytoplasm for leukaemia diagnosis, and for skin lesion segmentation. Motivated by such existing studies, this research takes advantage of the strengths of Mask R-CNN in leveraging pre-trained CNN architectures such as ResNet, and proposes three variants of Mask R-CNN for multi-organ medical image segmentation. The three variants are built successively, each with a set of configurations modified from the one preceding: (1) traditional transfer learning with customized loss functions that place comparatively more weight on segmentation performance, (2) transfer learning with deepened re-trained layers, instead of only the last two or three layers as in traditional transfer learning, and (3) fine-tuning of Mask R-CNN with expanded Region of Interest (RoI) pooling sizes. Evaluated on the Beyond-the-Cranial-Vault (BTCV) abdominal dataset, a well-established benchmark for multi-organ medical image segmentation, the three proposed variants obtain promising performances. In particular, the empirical results indicate the effectiveness of the adapted loss functions, the deepened transfer learning process, and the expansion of the RoI pooling sizes. These variations account for the efficiency of the proposed transfer learning schemes for multi-organ image segmentation tasks.
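    Variant (1), re-weighting the loss toward segmentation, reduces to combining Mask R-CNN's three task losses with a larger coefficient on the mask term. The weights and function name below are illustrative assumptions, not values from the paper:

    ```python
    def weighted_mask_rcnn_loss(cls_loss, box_loss, mask_loss,
                                w_cls=1.0, w_box=1.0, w_mask=2.0):
        """Combine the classification, box-regression and mask losses
        of Mask R-CNN, with extra weight on the mask (segmentation)
        term. The 2x mask weight is a hypothetical example."""
        return w_cls * cls_loss + w_box * box_loss + w_mask * mask_loss

    # With the default weights, a mask loss of 0.4 contributes 0.8.
    print(weighted_mask_rcnn_loss(0.5, 0.3, 0.4))  # 1.6
    ```

    During training, the combined scalar would be the quantity backpropagated, so gradients from the mask head dominate parameter updates relative to the detection heads.
    
    
    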

    A Deep Learning Based Wearable Healthcare IoT Device for AI-Enabled Hearing Assistance Automation

    With the recent boom of artificial intelligence (AI), particularly deep learning techniques, digital healthcare is one of the prevalent areas that could benefit from AI-enabled functionality. This research presents a novel AI-enabled Internet of Things (IoT) device, built on the ESP-8266 platform, capable of assisting those who suffer from hearing impairment or deafness to communicate with others in conversation. In the proposed solution, a server application leverages Google's online speech recognition service to convert received conversations into text, which is then shown on a micro-display attached to glasses, enabling deaf users to follow the conversation content and converse as normal with the general population. Furthermore, in order to raise alerts in traffic or other dangerous scenarios, an 'urban-emergency' classifier is developed using a deep learning model, Inception-v4, with transfer learning to detect and recognize alerting or alarming sounds, such as a car horn or a fire alarm, with texts generated to alert the prospective user. The training of Inception-v4 was carried out on a consumer desktop PC and then implemented into the AI-based IoT application. The empirical results indicate that the developed prototype system achieves an accuracy rate of 92% for sound recognition and classification with real-time performance.
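    The routing step of such a pipeline, turning a classifier prediction into display text, can be sketched as below. The label names, alert strings and confidence threshold are hypothetical, not taken from the paper:

    ```python
    # Hypothetical mapping from urban-emergency labels to alert messages.
    ALERTS = {
        "car_horn": "Warning: car horn nearby",
        "fire_alarm": "Warning: fire alarm sounding",
    }

    def route_event(label, confidence, threshold=0.8):
        """Turn a sound-classifier prediction into display text:
        urban-emergency labels above the confidence threshold become
        alert messages; anything else yields no alert."""
        if label in ALERTS and confidence >= threshold:
            return ALERTS[label]
        return None

    print(route_event("fire_alarm", 0.95))  # alert shown on the display
    print(route_event("fire_alarm", 0.40))  # below threshold -> None
    ```

    Thresholding on confidence keeps spurious low-certainty detections from interrupting the conversation text on the micro-display.
    
    
    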

    Intelligent facial emotion recognition using moth-firefly optimization

    In this research, we propose a facial expression recognition system with a variant of the evolutionary firefly algorithm for feature optimization. First, a modified Local Binary Pattern descriptor is proposed to produce an initial discriminative face representation. A variant of the firefly algorithm is then proposed to perform feature optimization. The proposed evolutionary firefly algorithm exploits the spiral search behaviour of moths and the attractiveness search actions of fireflies to mitigate the premature convergence of the Levy-flight firefly algorithm (LFA) and the moth-flame optimization (MFO) algorithm. Specifically, it employs the logarithmic spiral search capability of the moths to increase local exploitation of the fireflies, whereas, in comparison with the flames in MFO, the fireflies not only represent the best solutions identified by the moths but also act as search agents guided by the attractiveness function to increase global exploration. Simulated Annealing embedded with Levy flights is also used to increase exploitation of the most promising solution. Diverse single and ensemble classifiers are implemented for the recognition of seven expressions. Evaluated with frontal-view images extracted from CK+, JAFFE and MMI, as well as 45-degree multi-view and 90-degree side-view images from BU-3DFE and MMI, respectively, our system achieves superior performance and outperforms other state-of-the-art feature optimization methods and related facial expression recognition models by a significant margin.
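    The logarithmic spiral step that the hybrid borrows from moth-flame optimization can be sketched as follows, using the standard MFO update D * e^(b*t) * cos(2*pi*t) + F, where D is the distance to the brighter solution F. This is a generic single-step sketch of that update, not the full moth-firefly algorithm:

    ```python
    import math
    import random

    def spiral_move(position, flame, b=1.0, rng=random.random):
        """One logarithmic spiral step toward a brighter solution
        (the 'flame'): per coordinate, the standard MFO update
        D * e^(b*t) * cos(2*pi*t) + flame, with t drawn from [-1, 1]."""
        t = rng() * 2 - 1
        return [abs(f - x) * math.exp(b * t) * math.cos(2 * math.pi * t) + f
                for x, f in zip(position, flame)]

    random.seed(1)
    print(spiral_move([0.0, 0.0], [1.0, 1.0]))  # spirals around (1, 1)
    ```

    As `t` varies, the step alternates between orbiting and closing in on the flame, which is the local-exploitation behaviour grafted onto the fireflies.
    
    
    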

    Intelligent affect regression for bodily expressions using hybrid particle swarm optimization and adaptive ensembles

    This research focuses on continuous dimensional affect recognition from bodily expressions using feature optimization and adaptive regression. Both static posture and dynamic motion bodily features are extracted. A hybrid particle swarm optimization (PSO) algorithm is proposed for feature selection, which overcomes the premature convergence and local optimum traps encountered by conventional PSO. It integrates diverse jump-out mechanisms, namely the genetic algorithm (GA) and mutation techniques based on Gaussian, Cauchy and Levy distributions, to strike a balance between convergence speed and swarm diversity, and is hence called GM-PSO. The proposed PSO variant employs a subswarm concept and a cooperative strategy to enable the mutation mechanisms of each subswarm, i.e. the GA and the probability distributions, to work collaboratively to enhance the exploration and exploitation capability of the swarm leader, sustain population diversity, and guide the search toward an ultimate global optimum. An adaptive ensemble regression model is subsequently proposed to robustly map subjects' emotional states onto a continuous arousal-valence affective space using the identified optimized feature subsets. This regression model also adapts well to newly arrived bodily expression patterns for data stream regression. Empirical findings indicate that the proposed hybrid PSO algorithm significantly outperforms other state-of-the-art PSO variants, conventional PSO and the classic GA in terms of reaching the global optimum and selecting discriminative features.
The system achieves the best performance for the regression of arousal and valence when the ensemble regression model is applied, in terms of both mean squared error (arousal: 0.054, valence: 0.08) and Pearson correlation coefficient (arousal: 0.97, valence: 0.91), and outperforms other state-of-the-art PSO-based optimization combined with ensemble regression, as well as related bodily expression perception research, by a significant margin.
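    The jump-out idea, mutating the swarm leader so it can escape a local optimum, can be sketched with a single-swarm PSO step plus a Gaussian mutation of the best particle. This is a toy sketch under that assumption; the full GM-PSO additionally uses GA crossover and Cauchy/Levy mutations across cooperating subswarms:

    ```python
    import random

    def pso_step(positions, velocities, pbests, gbest, fitness,
                 w=0.7, c1=1.5, c2=1.5, sigma=0.1, rng=random.Random(0)):
        """One standard PSO iteration, then a Gaussian mutation of the
        swarm leader kept only if it improves fitness (the jump-out)."""
        for i in range(len(positions)):
            velocities[i] = [w * vj
                             + c1 * rng.random() * (pb - xj)
                             + c2 * rng.random() * (gb - xj)
                             for vj, xj, pb, gb in zip(velocities[i],
                                                       positions[i],
                                                       pbests[i], gbest)]
            positions[i] = [xj + vj
                            for xj, vj in zip(positions[i], velocities[i])]
            if fitness(positions[i]) < fitness(pbests[i]):
                pbests[i] = list(positions[i])
        leader = min(pbests, key=fitness)
        mutant = [g + rng.gauss(0, sigma) for g in leader]  # Gaussian jump-out
        if fitness(mutant) < fitness(leader):
            leader = mutant
        return min(gbest, leader, key=fitness)

    # Minimise the sphere function with 10 particles in 2-D.
    sphere = lambda x: sum(xi * xi for xi in x)
    init = random.Random(42)
    positions = [[init.uniform(-5, 5) for _ in range(2)] for _ in range(10)]
    velocities = [[0.0, 0.0] for _ in range(10)]
    pbests = [list(p) for p in positions]
    gbest = min(pbests, key=sphere)
    start = sphere(gbest)
    for _ in range(50):
        gbest = pso_step(positions, velocities, pbests, gbest, sphere)
    print(start, "->", sphere(gbest))
    ```

    Because the returned best is always the minimum of the old best and the (possibly mutated) leader, the global best is monotone non-increasing; the Gaussian perturbation gives the leader a chance to step out of a shallow basin without discarding it on failure.
    
    
    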