4 research outputs found

    Face Detection Based on Skin Color Segmentation Using Fuzzy Entropy

    Face detection is the first step of any automated face recognition system. One of the most popular approaches to detecting faces in color images is skin color segmentation, which in many cases needs a suitable color space representation to interpret the image information. In this paper, we propose a fuzzy system for detecting skin in color images, in which each color tone is treated as a fuzzy set. The Red, Green, and Blue (RGB), the Hue, Saturation, and Value (HSV), and the YCbCr (where Y is the luminance and Cb and Cr are the chroma components) color systems are used to develop our fuzzy design. A fuzzy three-partition entropy approach is used to calculate all of the parameters needed for the fuzzy systems, and a face detection method is then developed to validate the segmentation results. The experimental results show a correct skin detection rate between 94% and 96% for our fuzzy segmentation methods, with a false positive rate of about 0.5% in all cases. Furthermore, the average correct face detection rate is above 93%, and even with heterogeneous backgrounds and different lighting conditions, it achieves almost 88% correct detections. Thus, our method leads to accurate face detection results with low false positive and false negative rates. This work has been supported by the Ministerio de Economía y Competitividad (Spain), Project TIN2013-40982-R. Project co-financed with FEDER funds.
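
The idea of treating a color tone as a fuzzy set can be illustrated with a minimal Python sketch. It is not the paper's method: the trapezoidal membership shape, the `CB_PARAMS` and `CR_PARAMS` breakpoints, and the function names are all assumptions chosen for illustration, whereas the paper derives its parameters with a fuzzy three-partition entropy procedure that is not reproduced here. Only the RGB to YCbCr conversion (full-range BT.601, as used in JPEG) is standard.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range BT.601 (JPEG/JFIF) conversion for 8-bit RGB values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical breakpoints, NOT taken from the paper; a real system would
# estimate them from data (the paper uses fuzzy three-partition entropy).
CB_PARAMS = (70.0, 80.0, 125.0, 135.0)
CR_PARAMS = (125.0, 135.0, 170.0, 180.0)


def skin_membership(r, g, b):
    """Degree in [0, 1] to which a pixel belongs to the assumed 'skin' fuzzy set."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    # Fuzzy AND (minimum) of the two chroma memberships; luminance is ignored.
    return min(trapezoid(cb, *CB_PARAMS), trapezoid(cr, *CR_PARAMS))


def is_skin(r, g, b, threshold=0.5):
    """Defuzzify by thresholding the membership degree (threshold is an assumption)."""
    return skin_membership(r, g, b) >= threshold
```

Calling `is_skin(224, 172, 105)` on a typical skin-like tone returns `True` under these assumed breakpoints, while a saturated blue pixel such as `(0, 0, 255)` gets membership zero. Using the chroma components alone is a common design choice because it reduces sensitivity to brightness.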

    Modern automatic recognition technologies for visual communication tools

    Communication encompasses a wide range of activities involving the reception and transmission of information. The communication process consists of verbal, paraverbal, and nonverbal components, which carry the informational content of the transmitted message and its emotional coloring, respectively. A comprehensive analysis of all components of communication makes it possible to assess not only the content but also the situational context of what is said, and to identify additional factors relating to the speaker's mental and somatic state. There are several ways of conveying a verbal message, among them spoken and sign language. Speech and paralinguistic components of communication can be carried in different data channels, such as audio or video channels. This review considers systems for analyzing video data, because the audio channel cannot convey a number of paralinguistic components of communication that add information to the transmitted message. It analyzes existing databases of static and dynamic images and systems being developed to recognize the verbal component in spoken and sign language, as well as systems that assess the paraverbal and nonverbal components of communication. The difficulties faced by developers of such databases and systems are identified. Promising directions for development are also formulated, including those related to a comprehensive analysis of all components of communication aimed at the fullest possible assessment of the transmitted message. This work was supported by State Program 47 GP "Scientific and Technological Development of the Russian Federation" (2019-2030), topic 0134-2019-0006.

    Development and Evaluation of Facial Gesture Recognition and Head Tracking for Assistive Technologies

    Globally, the World Health Organisation estimates that there are about 1 billion people suffering from disabilities, and the UK has about 10 million people suffering from neurological disabilities in particular. In extreme cases, individuals with disabilities such as Motor Neuron Disease (MND), Cerebral Palsy (CP) and Multiple Sclerosis (MS) may only be able to perform limited head movement, move their eyes or make facial gestures. The aim of this research is to investigate low-cost and reliable assistive devices using automatic gesture recognition systems that will enable the most severely disabled users to access electronic assistive technologies and communication devices, thus enabling them to communicate with friends and relatives. The research presented in this thesis is concerned with the detection of head movements, eye movements, and facial gestures through the analysis of video and depth images. The proposed system, using web cameras or an RGB-D sensor coupled with computer vision and pattern recognition techniques, will have to be able to detect the movement of the user and calibrate it to facilitate communication. The system will also let the user choose the sensor to be used, i.e. the web camera or the RGB-D sensor, and the interaction or switching mechanism to use, i.e. eye blink or eyebrow movement. The system's ability to let users select options according to their needs would make it easier for them, as they would not have to relearn how to operate the system as their condition changes. This research aims to explore in particular the use of depth data for head movement based assistive devices and the usability of different gesture modalities as switching mechanisms. The proposed framework consists of a facial feature detection module, a head tracking module and a gesture recognition module.
Techniques such as Haar-Cascade and skin detection were used to detect facial features such as the face, eyes and nose. The depth data from the RGB-D sensor was used to segment the area nearest to the sensor. Both the head tracking module and the gesture recognition module rely on the facial feature module, as it provides data such as the location of the facial features. The head tracking module uses the facial feature data to calculate the centroid of the face, the distance to the sensor, and the location of the eyes and the nose in order to detect head motion and translate it into pointer movement. The gesture detection module uses features such as the location of the eyes and the location and size of the pupil, and calculates the interocular distance, to detect a blink or eyebrow movement that performs a click action. The research resulted in the creation of four assistive devices based on the combination of the sensors (Web Camera and RGB-D sensor) and facial gestures (Blink and Eyebrows movement): Webcam-Blink, Webcam-Eyebrows, Kinect-Blink and Kinect-Eyebrows. Another outcome of this research has been the creation of an evaluation framework based on Fitts' Law, with a modified multi-directional task including a central location, and a dataset consisting of both colour images and depth data of people performing head movements in different directions and performing gestures such as eye blinks, eyebrow movements and mouth movements. The devices have been tested with healthy participants. From the observed data, it was found that both Kinect-based devices have lower Movement Time and higher Index of Performance and Effective Throughput than the web camera-based devices, thus showing that the introduction of the depth data has had a positive impact on the head tracking algorithm.
The usability assessment survey suggests that there is a significant difference in the eye fatigue experienced by the participants; the blink gesture was less tiring to the eye than the eyebrow movement gesture. Also, the analysis of the gestures showed that the Index of Difficulty has a large effect on the error rates of the gesture detection, and that the smaller the Index of Difficulty, the higher the error rate.
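
For readers unfamiliar with the Fitts' Law measures named in this abstract, the following Python sketch shows how they are commonly computed. The abstract does not say which formulation the thesis uses, so the Shannon form of the Index of Difficulty and the effective-width convention (4.133 times the standard deviation of endpoint errors) are assumptions, as are the function names.

```python
import math
import statistics


def index_of_difficulty(distance, width):
    """Shannon formulation, in bits: ID = log2(D / W + 1)."""
    return math.log2(distance / width + 1)


def index_of_performance(distance, width, movement_time):
    """Bits per second for one condition: IP = ID / MT."""
    return index_of_difficulty(distance, width) / movement_time


def effective_throughput(distance, endpoint_errors, movement_times):
    """Throughput using the effective width We = 4.133 * SD of endpoint errors.

    endpoint_errors: signed selection offsets from the target centre,
    in the same units as distance (assumed measurement convention).
    """
    effective_width = 4.133 * statistics.stdev(endpoint_errors)
    effective_id = math.log2(distance / effective_width + 1)
    return effective_id / statistics.mean(movement_times)
```

For example, a target 300 pixels away and 100 pixels wide gives `index_of_difficulty(300, 100) == 2.0` bits; if that target is reached in 2 seconds, `index_of_performance(300, 100, 2.0)` is 1.0 bit per second. A lower Movement Time at the same Index of Difficulty therefore yields a higher Index of Performance, which matches the comparison between the Kinect-based and web camera-based devices reported in the abstract.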