646 research outputs found

    Adaptive threshold optimisation for colour-based lip segmentation in automatic lip-reading systems

    A thesis submitted to the Faculty of Engineering and the Built Environment, University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for the degree of Doctor of Philosophy. Johannesburg, September 2016. Having survived the ordeal of a laryngectomy, the patient must come to terms with the resulting loss of speech. With recent advances in portable computing power, automatic lip-reading (ALR) may become a viable approach to voice restoration. This thesis addresses the image processing aspect of ALR, and focuses on three contributions to colour-based lip segmentation. The first contribution concerns the colour transform used to enhance the contrast between the lips and skin. This thesis presents the most comprehensive study to date by measuring the overlap between lip and skin histograms for 33 different colour transforms. The hue component of HSV obtains the lowest overlap of 6.15%, and results show that selecting the correct transform can increase the segmentation accuracy by up to three times. The second contribution is the development of a new lip segmentation algorithm that utilises the best colour transforms from the comparative study. The algorithm is tested on 895 images and achieves a percentage overlap (OL) of 92.23% and a segmentation error (SE) of 7.39%. The third contribution focuses on the impact of the histogram threshold on the segmentation accuracy, and introduces a novel technique called Adaptive Threshold Optimisation (ATO) to select a better threshold value. The first stage of ATO incorporates -SVR to train the lip shape model. ATO then uses feedback of shape information to validate and optimise the threshold. After applying ATO, the SE decreases from 7.65% to 6.50%, corresponding to an absolute improvement of 1.15 pp or a relative improvement of 15.1%. While this thesis concerns lip segmentation in particular, ATO is a threshold selection technique that can be used in various segmentation applications.
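The abstract above ranks colour transforms by how much the lip and skin histograms overlap. The thesis's exact overlap measure is not given here, so the following is only a minimal sketch that assumes one common definition, histogram intersection of the two normalised histograms. The function name `histogram_overlap`, the bin count and the value range are illustrative assumptions, not taken from the thesis.

```python
import numpy as np


def histogram_overlap(lip_values, skin_values, bins=64, value_range=(0.0, 1.0)):
    """Histogram intersection of two samples, in [0, 1].

    Assumed definition (not confirmed by the source): each sample is binned
    over the same range, normalised to sum to 1, and the overlap is the sum
    of the bin-wise minima. 0 means fully separable, 1 means identical.
    """
    lip_hist, _ = np.histogram(lip_values, bins=bins, range=value_range)
    skin_hist, _ = np.histogram(skin_values, bins=bins, range=value_range)

    # Normalise counts to probabilities so that sample sizes do not matter.
    lip_p = lip_hist / lip_hist.sum()
    skin_p = skin_hist / skin_hist.sum()

    return float(np.minimum(lip_p, skin_p).sum())
```

Under this assumed definition, a transform whose lip and skin pixel values fall into disjoint bins scores 0, so lower values indicate better contrast between the two classes.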

    A Study of Brain Networks Associated with Swallowing Using Graph-Theoretical Approaches

    Functional connectivity between brain regions during swallowing tasks is still not well understood. Understanding these complex interactions is of great interest from both a scientific and a clinical perspective. In this study, functional magnetic resonance imaging (fMRI) was used to study brain functional networks during voluntary saliva swallowing in twenty-two healthy adult subjects (all females, 23.1±1.52 years of age). To construct these functional connections, we computed mean partial correlation matrices over ninety brain regions for each participant. Two regions were determined to be functionally connected if their correlation was above a certain threshold. These correlation matrices were then analyzed using graph-theoretical approaches. In particular, we considered several network measures for the whole brain and for swallowing-related brain regions. The results showed that significant pairwise functional connections were mostly either local and intra-hemispheric or symmetrically inter-hemispheric. Furthermore, we showed that all human brain functional networks, although varying to some degree, had typical small-world properties as compared to regular networks and random networks. These properties allow information transfer within the network at a relatively high efficiency. Swallowing-related brain regions also had higher values for some of the network measures than the values calculated for the whole brain. The current results warrant further investigation of graph-theoretical approaches as a potential tool for understanding the neural basis of dysphagia. © 2013 Luan et al.
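The thresholding step described above (two regions are connected if their correlation exceeds a threshold) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the threshold value is left to the caller, and the mean clustering coefficient is shown as one example of a network measure relevant to small-world analysis, without any claim that it is the measure the study used.

```python
import numpy as np


def threshold_to_adjacency(corr, threshold):
    """Binary undirected graph: edge (i, j) exists if corr[i, j] > threshold."""
    adj = (np.asarray(corr) > threshold).astype(int)
    np.fill_diagonal(adj, 0)  # no self-connections
    return adj


def mean_clustering(adj):
    """Average local clustering coefficient of an undirected binary graph."""
    degree = adj.sum(axis=1)
    # diag(A^3) counts closed walks of length 3; each triangle at a node
    # is counted twice (once per direction).
    triangles = np.diag(adj @ adj @ adj) / 2.0
    possible = degree * (degree - 1) / 2.0
    # Nodes with fewer than two neighbours get a coefficient of 0.
    local = np.divide(triangles, possible,
                      out=np.zeros(len(adj), dtype=float),
                      where=possible > 0)
    return float(local.mean())
```

A small-world network is usually characterised by clustering close to that of a regular lattice combined with short path lengths close to those of a random graph, so a clustering measure like this one would be compared against both reference networks.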

    An edge-driven 3D region growing approach for upper airways morphology and volume evaluation in patients with Pierre Robin sequence

    In this paper, a semi-automatic approach for segmentation of the upper airways is proposed. The implemented approach uses an edge-driven 3D region-growing algorithm to segment ROIs and a 3D volume-rendering technique to reconstruct the 3D model of the upper airways. This method can be used to integrate information inside a medical decision support system, making it possible to enhance medical evaluation. The effectiveness of the proposed segmentation approach was evaluated using the Jaccard (92.1733%) and Dice (94.6441%) similarity indices and the specificity (96.8895%) and sensitivity (97.6682%) rates. The proposed method reduced the average computation time by a factor of 16 with respect to manual segmentation.
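The Jaccard and Dice indices reported above compare a predicted segmentation mask with a reference mask. A minimal sketch of the standard definitions follows. The function names are illustrative, and returning 1.0 when both masks are empty is a convention assumed here rather than something stated in the paper.

```python
import numpy as np


def jaccard_index(pred, ref):
    """|A ∩ B| / |A ∪ B| for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    union = np.logical_or(pred, ref).sum()
    if union == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(np.logical_and(pred, ref).sum() / union)


def dice_index(pred, ref):
    """2 |A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    total = pred.sum() + ref.sum()
    if total == 0:
        return 1.0  # assumed convention, as above
    return float(2.0 * np.logical_and(pred, ref).sum() / total)
```

For a single pair of masks the two indices are related by D = 2J / (1 + J). Values averaged over many cases, as reported in an abstract, need not satisfy this relation exactly.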

    OBIA System for Identifying Mesoscale Oceanic Structures in SeaWiFS and MODIS-Aqua Images

    The ocean covers over 70% of the surface of our planet and plays a key role in the global climate. Most ocean circulation is mesoscale (scales of 50–500 km and 10–100 days), and the energy in mesoscale circulation is at least one order of magnitude greater than that in general circulation; therefore, the study of mesoscale oceanic structures (MOS) is crucial to ocean dynamics, making it especially useful for analyzing global changes. The detection of MOS, such as upwellings or eddies, from satellite images is significant for marine environmental studies and coastal resource management. In this paper, we present an object-based image analysis (OBIA) system which segments and classifies regions contained in Sea-viewing Wide Field-of-view Sensor (SeaWiFS) and Moderate Resolution Imaging Spectroradiometer (MODIS)-Aqua sensor satellite images into MOS. After color clustering and hierarchical data format (HDF) file processing, the OBIA system segments images and extracts image descriptors, producing primary regions. Then, it merges regions, recalculating image descriptors for MOS identification and definition. First, regions are labeled by a human expert, who identifies MOS: upwellings, eddies, cool, and warm eddies. Labeled regions are then classified by learning algorithms (i.e., decision tree, Bayesian network, artificial neural network, genetic algorithm, and nearest neighbor algorithm) from selected features. Finally, the OBIA system enables images to be queried from the user interface and retrieved by means of fuzzy descriptors and oceanic structures. We tested our system with images from the Canary Islands and the North West African coast.

    Advances in automated tongue diagnosis techniques

    This paper reviews the recent advances in a significant constituent of traditional oriental medicinal technology, called tongue diagnosis. Tongue diagnosis can be an effective, noninvasive method to perform an auxiliary diagnosis any time, anywhere, which can support the global need in the primary healthcare system. This work explores the literature to evaluate the work done on the various aspects of computerized tongue diagnosis, namely preprocessing, tongue detection, segmentation, feature extraction, and tongue analysis, especially in traditional Chinese medicine (TCM). In spite of the huge volume of work done on automatic tongue diagnosis (ATD), there is a lack of adequate surveys, especially ones that combine it with current diagnosis trends. This paper studies the merits, capabilities, and associated research gaps in current works on ATD systems. After exploring the algorithms used in tongue diagnosis, the current trend and global requirements in the health domain motivate us to propose a conceptual framework for an automated tongue diagnostic system on a mobile-enabled platform. This framework will be able to connect tongue diagnosis with the future point-of-care health system.

    Brain Tumor Detection and Segmentation in Multisequence MRI

    This work deals with brain tumor detection and segmentation in multisequence MR images, with particular focus on high- and low-grade gliomas. Three methods are proposed for this purpose. The first method detects the presence of brain tumor structures in axial and coronal slices. It is based on multi-resolution symmetry analysis and was tested on T1, T2, T1C and FLAIR images. The second method extracts the whole brain tumor region, including tumor core and edema, in FLAIR and T2 images, and works on both 2D and 3D images. It also uses symmetry analysis, followed by automatic determination of the intensity threshold from the most asymmetric parts. The third method is based on local structure prediction and is able to segment the whole tumor region as well as the tumor core and active tumor. This method takes advantage of the fact that most medical images feature a high similarity in intensities of nearby pixels and a strong correlation of intensity profiles across different image modalities. One way of dealing with, and even exploiting, this correlation is the use of local image patches. In the same way, there is a high correlation between nearby labels in image annotation, a feature that has been used in the "local structure prediction" of local label patches. A convolutional neural network is chosen as the learning algorithm, as it is known to be suited to dealing with correlation between features. All three methods were evaluated on a public data set of 254 multisequence MR volumes and reached results comparable to state-of-the-art methods in much shorter computing time (on the order of seconds running on a CPU), providing means, for example, to do online updates when aiming at an interactive segmentation.

    The application of manifold based visual speech units for visual speech recognition

    This dissertation presents a new learning-based representation that is referred to as a Visual Speech Unit (VSU) for visual speech recognition (VSR). The automated recognition of human speech using only features from the visual domain has become a significant research topic that plays an essential role in the development of many multimedia systems such as audio-visual speech recognition (AVSR), mobile phone applications, human-computer interaction (HCI) and sign language recognition. The inclusion of the lip visual information is opportune since it can improve the overall accuracy of audio or hand recognition algorithms, especially when such systems are operated in environments characterized by a high level of acoustic noise. The main contribution of this thesis is the development of this new representation. The main components of the developed VSR system are applied to: (a) segment the mouth region of interest, (b) extract the visual features from the real-time input video image and (c) identify the visual speech units. The major difficulty associated with VSR systems resides in the identification of the smallest elements contained in the image sequences that represent the lip movements in the visual domain. The Visual Speech Unit concept as proposed represents an extension of the standard viseme model that is currently applied for VSR. The VSU model augments the standard viseme approach by including in this new representation not only the data associated with the articulation of the visemes but also the transitory information between consecutive visemes. A large section of this thesis is dedicated to analysing the performance of the new visual speech unit model compared with that attained for standard (MPEG-4) viseme models. Two experimental results indicate that: 1. The developed VSR system achieved 80-90% correct recognition when applied to the identification of 60 classes of VSUs, while the recognition rate for the standard set of MPEG-4 visemes was only 62-72%. 2. When 15 words are identified using VSUs and visemes as the visual speech element, the accuracy rate for word recognition based on VSUs is 7%-12% higher than the accuracy rate based on visemes.

    Comparison of Different Methods for Tissue Segmentation in Histopathological Whole-Slide Images

    Tissue segmentation is an important prerequisite for efficient and accurate diagnostics in digital pathology. However, it is well known that whole-slide scanners can fail to detect all tissue regions, for example due to the tissue type or weak staining, because their tissue detection algorithms are not robust enough. In this paper, we introduce two different convolutional neural network architectures for whole-slide image segmentation to accurately identify the tissue sections. We also compare the algorithms to a published traditional method. We collected 54 whole-slide images with differing stains and tissue types from three laboratories to validate our algorithms. We show that while the two methods do not differ significantly, they outperform their traditional counterpart (Jaccard index of 0.937 and 0.929 vs. 0.870, p < 0.01). Comment: Accepted for poster presentation at the IEEE International Symposium on Biomedical Imaging (ISBI) 201