154 research outputs found

    Segmentation d'Images solaires en Extrême Ultraviolet par une Approche Classification floue Multispectrale

    The study of the variability of the solar corona and the tracking of characteristic regions on its surface (active regions, coronal holes) are of major importance in astrophysics and for the development of space weather. In this context, we propose a multispectral segmentation algorithm for images of the Sun acquired in the extreme ultraviolet, based on a spatially constrained fuzzy classification algorithm. The use of fuzzy logic makes it possible to account for the imprecision and uncertainty inherent in the definition of the different regions of interest in the image. The method is applied to images taken by the EIT telescope on the SoHO satellite from January 1997 to May 2005, thus covering almost an entire solar cycle. The results, in terms of geometric and radiometric characterisation of active regions and coronal holes, agree with other independent observations. The method also reveals periods in the time series studied that are related to known solar physics phenomena.
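    The spatially constrained classifier itself is not detailed in the abstract, so the following is only a minimal sketch of plain fuzzy c-means clustering applied to multispectral pixel vectors; the spatial constraint, the EIT preprocessing and all names below are illustrative assumptions.

```python
# Minimal fuzzy c-means sketch for multispectral pixels (no spatial constraint).
import numpy as np

def fuzzy_c_means(pixels, n_clusters=3, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """pixels: (n_pixels, n_bands) array, one row per pixel across the EUV bands."""
    rng = np.random.default_rng(seed)
    u = rng.random((pixels.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)                      # memberships sum to 1 per pixel
    for _ in range(n_iter):
        um = u ** m
        centers = um.T @ pixels / um.sum(axis=0)[:, None]  # membership-weighted cluster centres
        dist = np.linalg.norm(pixels[:, None, :] - centers[None], axis=2) + 1e-12
        u_new = 1.0 / dist ** (2.0 / (m - 1.0))            # standard FCM membership update
        u_new /= u_new.sum(axis=1, keepdims=True)
        if np.abs(u_new - u).max() < tol:
            u = u_new
            break
        u = u_new
    return u, centers

# Toy usage: three classes (e.g. quiet Sun, active regions, coronal holes) on a 2-band image.
img = np.random.rand(128, 128, 2)                          # stand-in for co-registered EIT bands
memberships, centers = fuzzy_c_means(img.reshape(-1, 2), n_clusters=3)
labels = memberships.argmax(axis=1).reshape(128, 128)
```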

    Statistical facial feature extraction and lip segmentation

    Facial features such as lip corners, eye corners and the nose tip are critical points in a human face. Robust extraction of such facial feature locations is an important problem with a wide range of applications, including audio-visual speech recognition, human-computer interaction, emotion recognition, fatigue detection and gesture recognition. In this thesis, we develop a probabilistic method for facial feature extraction. This technique is able to automatically learn the location and texture information of facial features from a training set. Facial feature locations are extracted from face regions using joint distributions of locations and textures represented with mixtures of Gaussians. This formulation results in a maximum likelihood (ML) optimization problem which can be solved using either a gradient ascent or a Newton-type algorithm. Extracted lip corner locations are then used to initialize a lip segmentation algorithm to extract the lip contours. We develop a level-set based method that utilizes adaptive color distributions and shape priors for lip segmentation. More precisely, an implicit curve representation which learns the color information of lip and non-lip points from a training set is employed. The model can adapt itself to the image of interest using a coarse elliptical region. The extracted lip contour provides detailed information about the lip shape. Both methods are tested using different databases for facial feature extraction and lip segmentation. It is shown that the proposed methods achieve better results compared to conventional methods. Our facial feature extraction method outperforms active appearance models in terms of pixel errors, while our lip segmentation method outperforms region-based level-set curve evolutions in terms of precision and recall results.
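    As a rough illustration of the joint location/texture model described above, the sketch below fits a Gaussian mixture to (x, y, texture) training vectors and scores candidate positions in a new face region; a brute-force argmax stands in for the gradient ascent / Newton optimisation used in the thesis, and all data shapes and names are assumptions.

```python
# Illustrative joint location/texture model with a Gaussian mixture.
import numpy as np
from sklearn.mixture import GaussianMixture

def train_feature_model(locations, textures, n_components=3):
    """locations: (n, 2) normalised (x, y); textures: (n, d) patch descriptors."""
    X = np.hstack([locations, textures])
    return GaussianMixture(n_components=n_components, covariance_type='full').fit(X)

def locate_feature(gmm, candidate_xy, candidate_textures):
    """Return the candidate position with the highest joint log-likelihood."""
    X = np.hstack([candidate_xy, candidate_textures])
    scores = gmm.score_samples(X)
    return candidate_xy[np.argmax(scores)]

# Toy usage with random stand-in data for one facial feature (e.g. a lip corner).
rng = np.random.default_rng(0)
gmm = train_feature_model(rng.random((200, 2)), rng.random((200, 8)))
best_xy = locate_feature(gmm, rng.random((50, 2)), rng.random((50, 8)))
```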

    Adaptive threshold optimisation for colour-based lip segmentation in automatic lip-reading systems

    A thesis submitted to the Faculty of Engineering and the Built Environment, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy, Johannesburg, September 2016. Having survived the ordeal of a laryngectomy, the patient must come to terms with the resulting loss of speech. With recent advances in portable computing power, automatic lip-reading (ALR) may become a viable approach to voice restoration. This thesis addresses the image processing aspect of ALR and focuses on three contributions to colour-based lip segmentation. The first contribution concerns the colour transform used to enhance the contrast between the lips and skin. This thesis presents the most comprehensive study to date, measuring the overlap between lip and skin histograms for 33 different colour transforms. The hue component of HSV obtains the lowest overlap of 6.15%, and results show that selecting the correct transform can increase the segmentation accuracy by up to three times. The second contribution is the development of a new lip segmentation algorithm that utilises the best colour transforms from the comparative study. The algorithm is tested on 895 images and achieves a percentage overlap (OL) of 92.23% and a segmentation error (SE) of 7.39%. The third contribution focuses on the impact of the histogram threshold on the segmentation accuracy and introduces a novel technique called Adaptive Threshold Optimisation (ATO) to select a better threshold value. The first stage of ATO incorporates SVR to train the lip shape model. ATO then uses feedback of shape information to validate and optimise the threshold. After applying ATO, the SE decreases from 7.65% to 6.50%, corresponding to an absolute improvement of 1.15 pp or a relative improvement of 15.1%. While this thesis concerns lip segmentation in particular, ATO is a threshold selection technique that can be used in various segmentation applications.
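    A hedged sketch of the histogram-overlap measure behind the colour-transform comparison: normalised lip and skin histograms are built on a single component (the hue channel of HSV, the best transform reported above) and their shared area is returned. The exact overlap definition and the masks used in the thesis may differ; the function below is illustrative only.

```python
# Histogram overlap between lip and skin pixels on the hue channel.
import numpy as np
import cv2

def hue_histogram_overlap(image_bgr, lip_mask, skin_mask, bins=64):
    """lip_mask / skin_mask: arrays of the same height/width as the image, nonzero on the region."""
    hue = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)[:, :, 0]     # OpenCV hue range is 0..179
    h_lip, _ = np.histogram(hue[lip_mask > 0], bins=bins, range=(0, 180), density=True)
    h_skin, _ = np.histogram(hue[skin_mask > 0], bins=bins, range=(0, 180), density=True)
    bin_width = 180.0 / bins
    return np.minimum(h_lip, h_skin).sum() * bin_width            # shared area, in [0, 1]

# A lower overlap means better lip/skin separability, which is what makes a
# subsequent histogram threshold effective.
```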

    Face Detection And Lip Localization

    Integration of audio and video signals for automatic speech recognition has become an important field of study. Audio-Visual Speech Recognition (AVSR) systems are known to achieve higher accuracy than audio-only or visual-only systems. This research focused on the visual front end and centered on lip segmentation. Previous experiments on lip feature extraction have mainly been performed in constrained environments with controlled background noise. In this thesis we focus our attention on a database collected in the environment of a moving car, which hampered the quality of the imagery. We first introduce the concept of illumination compensation, where we try to reduce the dependence on lighting in over- or under-exposed images. As a precursor to lip segmentation, we focus on a robust face detection technique which reaches an accuracy of 95%. We detail and compare three different face detection techniques and found a successful way of concatenating them in order to increase the overall accuracy. One of the detection techniques used was the object detection algorithm proposed by Viola and Jones. We experimented with different color spaces using the Viola-Jones algorithm and reached interesting conclusions. Following face detection, we implement a lip localization algorithm based on the vertical gradients of hybrid equations of color. Despite the challenging background and image quality, a success rate of 88% was achieved for lip segmentation.
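    The sketch below shows only the generic Viola-Jones stage using OpenCV's pretrained frontal-face cascade, followed by a crude lower-face crop as the lip search region; the thesis's combination of three detectors, its illumination compensation scheme and the vertical-gradient hybrid-colour lip localisation are not reproduced here, and histogram equalisation is just a placeholder for the compensation step.

```python
# Viola-Jones face detection plus a simple lower-face crop for lip localisation.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

def detect_lip_regions(frame_bgr):
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.equalizeHist(gray)                    # crude stand-in for illumination compensation
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    regions = []
    for (x, y, w, h) in faces:
        # the lips lie roughly in the lower third of the detected face box
        regions.append(frame_bgr[y + 2 * h // 3: y + h, x: x + w])
    return regions
```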

    Lip region feature extraction analysis by means of stochastic variability modeling

    In this thesis, different techniques for characterising the lip region, used to model lip dynamics, were analysed. To carry out the analysis, a video-sequence database of the pronunciation of the Spanish alphabet was built and used to train a visual speech recognition system with several feature extraction methodologies. The aim of the experiment is to evaluate the ability of each feature set to model lip movement accurately. Appearance-based, shape-based and spatiotemporal feature extraction methodologies were tested. The reported results identify the spatiotemporal features as the best descriptors, among those evaluated, of visual speech dynamics.
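    The exact descriptors evaluated in the thesis are not given in the abstract, so the following is only an assumed example of one simple spatiotemporal feature for a lip ROI sequence: per-frame appearance concatenated with frame-to-frame differences.

```python
# Simple spatiotemporal descriptor: downsampled appearance plus temporal differences.
import numpy as np
import cv2

def spatiotemporal_features(roi_sequence, size=(16, 16)):
    """roi_sequence: list of 2-D grayscale lip ROIs from consecutive frames."""
    frames = np.stack([cv2.resize(f, size).astype(float) for f in roi_sequence])
    appearance = frames.reshape(len(frames), -1)          # per-frame intensities
    motion = np.abs(np.diff(appearance, axis=0))          # frame-to-frame change
    return np.hstack([appearance[1:], motion])            # one descriptor per transition
```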

    SVM based ASM for facial landmarks location

    Finding a new position for each landmark is a crucial step in the active shape model (ASM). Mahalanobis distance minimization is typically used for this search, provided there are enough training data that the grey-level profiles for each landmark follow a multivariate Gaussian distribution. However, this condition cannot be satisfied in most cases. In this paper, a new method, support vector machine (SVM) based ASM (SVMBASM), is proposed. It approaches the search task as a small-sample-size classification problem and uses an SVM classifier to solve it. Moreover, considering the imbalanced dataset, which contains more negative instances (incorrect candidates for the new position) than positive instances (correct candidates for the new position), a multi-class classification framework is adopted. Performance evaluation on the SJTU face database shows that the proposed SVMBASM outperforms the original ASM in terms of both the average error and the average frequency of convergence. © 2008 IEEE
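    A minimal sketch of the core idea, assuming a binary labelling of candidates: an SVM trained on grey-level profiles replaces the Mahalanobis profile search, and the best-scoring candidate becomes the new landmark position. The paper itself uses a multi-class framework to handle the class imbalance; class weighting is used here as a simple stand-in.

```python
# SVM-based candidate selection for one ASM landmark.
import numpy as np
from sklearn.svm import SVC

def train_candidate_classifier(profiles, labels):
    """profiles: (n, d) grey-level profiles; labels: 1 = correct position, 0 = incorrect."""
    clf = SVC(kernel='rbf', class_weight='balanced', probability=True)
    return clf.fit(profiles, labels)

def best_candidate(clf, candidate_profiles, candidate_points):
    """Pick the candidate point the SVM scores as most likely to be the landmark."""
    scores = clf.predict_proba(candidate_profiles)[:, 1]
    return candidate_points[np.argmax(scores)]
```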

    Computational techniques for colour space reduction in digital images: a review of the state of the art

    Digital images represented in RGB models store large amounts of information. However, processing these images requires devices with special characteristics. One strategy to overcome this drawback is to reduce the colour space of the image without losing its essential characteristics. Different techniques and algorithms based on computational intelligence, and more specifically on neural networks and fuzzy logic, allow the colour space of a digital image to be reduced. In this article we analyse the state of the art of the different algorithms and techniques from areas of computational intelligence for colour space reduction.
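    As a point of reference for the techniques surveyed (which are based on neural networks and fuzzy logic), the sketch below reduces the colour space by plain k-means quantisation, mapping every pixel to its nearest cluster centre; it is a baseline, not one of the reviewed methods.

```python
# Baseline colour-space reduction: k-means quantisation of RGB pixels.
import numpy as np
from sklearn.cluster import KMeans

def reduce_colour_space(image_rgb, n_colours=16, seed=0):
    h, w, _ = image_rgb.shape
    pixels = image_rgb.reshape(-1, 3).astype(float)
    km = KMeans(n_clusters=n_colours, n_init=10, random_state=seed).fit(pixels)
    palette = np.uint8(np.rint(km.cluster_centers_))      # n_colours representative colours
    return palette[km.labels_].reshape(h, w, 3)           # image rendered with the reduced palette
```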

    Investigating spoken emotion : the interplay of language and facial expression

    This thesis aims to investigate how spoken expressions of emotion are influenced by the characteristics of spoken language and by facial emotion expression. The first three chapters examine how the production and perception of emotions differ between Cantonese (a tone language) and English (a non-tone language). The rationale for this contrast was that the acoustic property of fundamental frequency (F0) may be used differently in the production and perception of spoken expressions in tone languages, as F0 may be preserved as a linguistic resource for the production of lexical tones. To test this idea, I first developed the Cantonese Audio-visual Emotional Speech (CAVES) database, which was then used as stimuli in all the studies presented in this thesis (Chapter 1). An emotion perception study was then conducted to examine how three groups of participants (Australian English, Malaysian Malay and Hong Kong Cantonese speakers) identified spoken expressions of emotion produced in either English or Cantonese (Chapter 2). As one of the aims of this study was to disambiguate the effects of language from those of culture, the participants were selected on the basis that they either shared similarities in language type (non-tone languages: Malay and English) or culture (collectivist cultures: Cantonese and Malay). The results showed greater similarity in emotion perception between those who spoke a similar type of language than between those who shared a similar culture. This suggests that some intergroup differences in emotion perception may be attributable to cross-language differences. Following up on these findings, an acoustic analysis study (Chapter 3) showed that, compared to English spoken expressions of emotion, Cantonese expressions had fewer F0-related cues (lower median F0 and a flatter F0 contour), and the use of F0 cues also differed. Taken together, these results show that language characteristics (in F0 usage) interact with the production and perception of spoken expressions of emotion. The expression of disgust was used to investigate how facial expressions of emotion affect speech articulation. The rationale for selecting disgust was that the facial expression of disgust involves changes to the mouth region, such as closure and retraction of the lips, and these changes are likely to have an impact on speech articulation. To test this idea, an automatic lip segmentation and measurement algorithm was developed to quantify the configuration of the lips from images (Chapter 5). Comparing neutral to disgust expressive speech, the results showed that disgust expressive speech is produced with a significantly smaller vertical mouth opening, a greater horizontal mouth opening and lower first and second formant frequencies (F1 and F2). Overall, this thesis provides insight into how aspects of expressive speech may be shaped by language-specific (language type) and universal (facial emotion expression) factors.
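    The measurement step of Chapter 5 is only described at a high level, so the sketch below simply derives vertical and horizontal mouth opening from a binary lip mask produced by some segmentation stage; the thesis's actual segmentation and measurement algorithm is not reproduced.

```python
# Vertical and horizontal mouth opening from a binary lip mask.
import numpy as np

def mouth_opening(lip_mask):
    """lip_mask: 2-D boolean array, True on lip (or inner-mouth) pixels; must be non-empty."""
    rows = np.any(lip_mask, axis=1).nonzero()[0]
    cols = np.any(lip_mask, axis=0).nonzero()[0]
    vertical = rows.max() - rows.min() + 1                 # mouth opening height, in pixels
    horizontal = cols.max() - cols.min() + 1               # mouth opening width, in pixels
    return vertical, horizontal                            # compare neutral vs. disgust utterances
```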