Adaptive threshold optimisation for colour-based lip segmentation in automatic lip-reading systems
A thesis submitted to the Faculty of Engineering and the Built Environment,
University of the Witwatersrand, Johannesburg, in fulfilment of the requirements for
the degree of Doctor of Philosophy.
Johannesburg, September 2016

Having survived the ordeal of a laryngectomy, the patient must come to terms with
the resulting loss of speech. With recent advances in portable computing power,
automatic lip-reading (ALR) may become a viable approach to voice restoration. This
thesis addresses the image processing aspect of ALR, and makes three contributions
to colour-based lip segmentation.
The first contribution concerns the choice of colour transform used to enhance the
contrast between the lips and skin. This thesis presents the most comprehensive
study to date, measuring the overlap between lip and skin histograms for 33 different
colour transforms. The hue component of HSV obtains the lowest overlap, at 6.15%,
and the results show that selecting the correct transform can increase segmentation
accuracy by up to three times.
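The overlap measure underlying such a comparison can be sketched as the shared area between two normalised histograms. The following is a minimal illustration; the binning and the min-based overlap definition are assumptions, not taken from the thesis:

```python
import numpy as np

def histogram_overlap(lip_values, skin_values, bins=256, value_range=(0.0, 1.0)):
    """Overlap between two normalised histograms: the sum of per-bin minima.
    0.0 means perfectly separable classes, 1.0 means identical distributions."""
    lip_hist, _ = np.histogram(lip_values, bins=bins, range=value_range)
    skin_hist, _ = np.histogram(skin_values, bins=bins, range=value_range)
    lip_p = lip_hist / lip_hist.sum()
    skin_p = skin_hist / skin_hist.sum()
    return float(np.minimum(lip_p, skin_p).sum())
```

A transform that drives this overlap toward zero makes a simple histogram threshold far more effective, which is why the choice of colour transform matters so much.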
The second contribution is the development of a new lip segmentation algorithm
that utilises the best colour transforms from the comparative study. The algorithm
is tested on 895 images and achieves a percentage overlap (OL) of 92.23% and a
segmentation error (SE) of 7.39%.
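OL and SE are standard figures of merit in the lip-segmentation literature. A hedged sketch of one common pair of definitions follows; these definitions are an assumption here, and the thesis may define the measures differently:

```python
import numpy as np

def ol_se(pred, truth):
    """Percentage overlap (OL) and segmentation error (SE) for binary lip masks,
    using one common pair of definitions from the lip-segmentation literature
    (an assumption, not quoted from the thesis)."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    inter = np.sum(pred & truth)
    ol = 2.0 * inter / (pred.sum() + truth.sum()) * 100.0
    ole = np.sum(pred & ~truth)   # background pixels labelled as lip
    ile = np.sum(~pred & truth)   # lip pixels labelled as background
    se = (ole + ile) / (2.0 * truth.sum()) * 100.0
    return float(ol), float(se)
```

Under these definitions a perfect segmentation gives OL = 100% and SE = 0%.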
The third contribution focuses on the impact of the histogram threshold on the
segmentation accuracy, and introduces a novel technique called Adaptive Threshold
Optimisation (ATO) to select a better threshold value. The first stage of ATO
incorporates support vector regression (SVR) to train the lip shape model. ATO then
uses feedback of shape information to validate and optimise the threshold. After
applying ATO, the SE decreases from 7.65% to 6.50%, corresponding to an absolute
improvement of 1.15 pp, or a relative improvement of 15.1%. While this thesis
concerns lip segmentation in particular, ATO is a threshold selection technique
that can be applied in a variety of segmentation applications.
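The feedback loop described above, scoring candidate thresholds against a shape model and keeping the best, can be caricatured as follows. The shape score here is a toy stand-in for the trained regression model, and the "low hue means lip" assumption is illustrative only:

```python
import numpy as np

def shape_plausibility(mask):
    """Toy stand-in for a learned lip-shape model: prefers regions whose
    bounding box is roughly three times wider than it is tall."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return 0.0
    width = xs.max() - xs.min() + 1
    height = ys.max() - ys.min() + 1
    aspect = width / height
    return 1.0 / (1.0 + abs(aspect - 3.0))

def adaptive_threshold(hue_image, candidates):
    """Pick the histogram threshold whose segmentation best matches the
    shape model (the essence of a shape-feedback threshold search)."""
    best_t, best_score = candidates[0], -1.0
    for t in candidates:
        mask = hue_image < t          # low hue assumed to indicate lip pixels
        score = shape_plausibility(mask)
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

The point of the feedback is that a threshold producing an implausible shape is rejected even if it looks reasonable in the histogram alone.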
A Study of Brain Networks Associated with Swallowing Using Graph-Theoretical Approaches
Functional connectivity between brain regions during swallowing tasks is still not well understood. Understanding these complex interactions is of great interest from both a scientific and a clinical perspective. In this study, functional magnetic resonance imaging (fMRI) was utilized to study brain functional networks during voluntary saliva swallowing in twenty-two healthy adult subjects (all females, 23.1±1.52 years of age). To construct these functional connections, we computed mean partial correlation matrices over ninety brain regions for each participant. Two regions were determined to be functionally connected if their correlation was above a certain threshold. These correlation matrices were then analyzed using graph-theoretical approaches. In particular, we considered several network measures for the whole brain and for swallowing-related brain regions. The results showed that significant pairwise functional connections were mostly either local and intra-hemispheric or symmetrically inter-hemispheric. Furthermore, we showed that all of the human brain functional networks, although varying to some degree, had typical small-world properties when compared to regular and random networks. These properties allow information transfer within the network at relatively high efficiency. Swallowing-related brain regions also had higher values for some of the network measures in comparison to when these measures were calculated for the whole brain. The current results warrant further investigation of graph-theoretical approaches as a potential tool for understanding the neural basis of dysphagia. © 2013 Luan et al.
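The graph-theoretical measures mentioned, clustering coefficient and characteristic path length, which together characterise small-world behaviour, can be computed directly from a thresholded correlation matrix. A self-contained sketch follows; the threshold value and the particular metric formulations are illustrative, not taken from the paper:

```python
import numpy as np
from collections import deque

def threshold_graph(corr, tau):
    """Binarise a correlation matrix into an undirected adjacency matrix."""
    adj = (np.abs(corr) > tau).astype(int)
    np.fill_diagonal(adj, 0)
    return adj

def avg_clustering(adj):
    """Mean local clustering coefficient: for each node, the fraction of its
    neighbour pairs that are themselves connected."""
    n = adj.shape[0]
    coeffs = []
    for i in range(n):
        nbrs = np.nonzero(adj[i])[0]
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = adj[np.ix_(nbrs, nbrs)].sum() / 2
        coeffs.append(2.0 * links / (k * (k - 1)))
    return float(np.mean(coeffs))

def char_path_length(adj):
    """Average shortest-path length over connected node pairs (BFS per node)."""
    n = adj.shape[0]
    total, pairs = 0, 0
    for s in range(n):
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in np.nonzero(adj[u])[0]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(int(v))
        for t, d in dist.items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else float("inf")
```

A small-world network shows clustering far above that of a degree-matched random graph while keeping a comparably short characteristic path length.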
An edge-driven 3D region growing approach for upper airways morphology and volume evaluation in patients with Pierre Robin sequence
In this paper, a semi-automatic approach for segmentation of the upper airways is
proposed. The implemented approach uses an edge-driven 3D region-growing algorithm to segment the regions of interest and a 3D volume-rendering technique to reconstruct the 3D model of the upper airways. This method can be used to integrate information into a medical decision support system, making it possible to enhance medical evaluation. The effectiveness of the proposed segmentation approach was evaluated using the Jaccard (92.1733%) and Dice (94.6441%) similarity indices and the specificity (96.8895%) and sensitivity (97.6682%) rates.
The proposed method reduced the average computation time by a factor of 16 with respect to manual segmentation.
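The four evaluation figures quoted, Jaccard, Dice, sensitivity and specificity, all derive from the confusion counts of two binary masks. A minimal sketch:

```python
import numpy as np

def overlap_metrics(pred, truth):
    """Jaccard, Dice, sensitivity and specificity for binary segmentation masks,
    computed from the four confusion counts."""
    pred = pred.astype(bool)
    truth = truth.astype(bool)
    tp = np.sum(pred & truth)     # true positives
    fp = np.sum(pred & ~truth)    # false positives
    fn = np.sum(~pred & truth)    # false negatives
    tn = np.sum(~pred & ~truth)   # true negatives
    return {
        "jaccard": tp / (tp + fp + fn),
        "dice": 2 * tp / (2 * tp + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }
```

Note that Dice is always at least as large as Jaccard on the same masks, which is consistent with the 94.64% vs. 92.17% figures reported above.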
OBIA System for Identifying Mesoscale Oceanic Structures in SeaWiFS and MODIS-Aqua Images
The ocean covers over 70% of the surface of our planet and plays a key role in the global climate. Most ocean circulation is mesoscale (scales of 50–500 km and 10–100 days), and the energy in mesoscale circulation is at least one order of magnitude greater than that of the general circulation; therefore, the study of mesoscale oceanic structures (MOS) is crucial to ocean dynamics, making it especially useful for analyzing global changes. The detection of MOS, such as upwellings or eddies, from satellite images is significant for marine environmental studies and coastal resource management. In this paper, we present an object-based image analysis (OBIA) system which segments and classifies regions contained in Sea-viewing Wide Field-of-view Sensor (SeaWiFS) and Moderate Resolution Imaging Spectroradiometer (MODIS)-Aqua satellite images into MOS. After color clustering and hierarchical data format (HDF) file processing, the OBIA system segments images and extracts image descriptors, producing primary regions. Then, it merges regions, recalculating image descriptors for MOS identification and definition. First, regions are labeled by a human expert, who identifies MOS: upwellings, cool eddies, and warm eddies. Labeled regions are then classified by learning algorithms (i.e., decision tree, Bayesian network, artificial neural network, genetic algorithm, and nearest neighbor algorithm) using selected features. Finally, the OBIA system enables images to be queried from the user interface and retrieved by means of fuzzy descriptors and oceanic structures. We tested our system with images from the Canary Islands and the North West African coast.
Advances in automated tongue diagnosis techniques
This paper reviews the recent advances in a significant constituent of traditional oriental medicine, called tongue diagnosis. Tongue diagnosis can be an effective, noninvasive method to perform an auxiliary diagnosis anytime, anywhere, which can support the global need in the primary healthcare system. This work explores the literature to evaluate the work done on the various aspects of computerized tongue diagnosis, namely preprocessing, tongue detection, segmentation, feature extraction and tongue analysis, especially in traditional Chinese medicine (TCM). In spite of the huge volume of work done on automatic tongue diagnosis (ATD), there is a lack of an adequate survey, especially one that connects it with current diagnostic trends. This paper studies the merits, capabilities, and associated research gaps of current work on ATD systems. After exploring the algorithms used in tongue diagnosis, the current trends and global requirements in the health domain motivate us to propose a conceptual framework for an automated tongue diagnostic system on a mobile-enabled platform. This framework will be able to connect tongue diagnosis with the future point-of-care health system.
Brain Tumor Detection and Segmentation in Multisequence MRI
This work deals with brain tumor detection and segmentation in multisequence MR images with particular focus on high- and low-grade gliomas.
Three methods are proposed for this purpose. The first method deals with the detection of the presence of brain tumor structures in axial and coronal slices. This method is based on multi-resolution symmetry analysis and was tested on T1, T2, T1C and FLAIR images. The second method deals with extraction of the whole brain tumor region, including the tumor core and edema, in FLAIR and T2 images, and is able to extract the whole brain tumor region from both 2D and 3D data. It also uses the symmetry analysis approach, which is followed by automatic determination of the intensity threshold from the most asymmetric parts. The third method is based on local structure prediction and is able to segment the whole tumor region as well as the tumor core and the active tumor. This method takes advantage of the fact that most medical images feature a high similarity in intensities of nearby pixels and a strong correlation of intensity profiles across different image modalities. One way of dealing with, and even exploiting, this correlation is the use of local image patches. In the same way, there is a high correlation between nearby labels in image annotation, a feature that has been used in the "local structure prediction" of local label patches. A convolutional neural network is chosen as the learning algorithm, as it is known to be suited for dealing with correlation between features. All three methods were evaluated on a public data set of 254 multisequence MR volumes and reached results comparable to state-of-the-art methods in much shorter computing time (on the order of seconds running on a CPU), providing the means, for example, to do online updates when aiming at an interactive segmentation.
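The symmetry-analysis step, comparing a slice against its left-right mirror and deriving an intensity threshold from the most asymmetric parts, can be sketched as follows. The quantile cut-off and the choice of the minimum candidate intensity are assumed details, not the thesis's actual procedure:

```python
import numpy as np

def asymmetry_map(slice_2d):
    """Absolute difference between a slice and its mirror about the
    vertical mid-line (a crude mid-sagittal symmetry check)."""
    return np.abs(slice_2d - slice_2d[:, ::-1])

def asymmetric_threshold(slice_2d, top_fraction=0.05):
    """Derive an intensity threshold from the most asymmetric pixels:
    keep the top `top_fraction` of the asymmetry map and return the
    lowest intensity found there."""
    asym = asymmetry_map(slice_2d)
    cutoff = np.quantile(asym, 1.0 - top_fraction)
    candidates = slice_2d[asym >= cutoff]
    return float(candidates.min())
```

The intuition is that a healthy brain is roughly left-right symmetric, so strongly asymmetric regions are likely to contain pathology, and their intensities are informative for thresholding.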
The application of manifold based visual speech units for visual speech recognition
This dissertation presents a new learning-based representation that is referred to as a Visual
Speech Unit for visual speech recognition (VSR). The automated recognition of human speech using only features from the visual domain has become a significant research topic that plays an essential role in the development of many multimedia systems such as audio-visual speech recognition (AVSR), mobile phone applications, human-computer interaction (HCI) and sign language recognition. The inclusion of lip visual information is opportune, since it can improve the overall accuracy of audio or hand recognition algorithms, especially when such systems operate in environments characterized by a high level of acoustic noise.
The main contribution of the work presented in this thesis lies in the development
of a new learning-based representation that is referred to as the Visual Speech
Unit for Visual Speech Recognition (VSR). The main components of the developed Visual Speech Recognition system are applied to: (a) segment the mouth region of
interest, (b) extract the visual features from the real-time input video stream and (c) identify the visual speech units. The major difficulty associated with VSR systems resides in the identification of the smallest elements contained in the image sequences that represent the lip movements in the visual domain.
The Visual Speech Unit concept as proposed represents an extension of the standard viseme model that is currently applied for VSR. The VSU model augments the standard viseme approach by including in this new representation not only the data associated with the articulation of the visemes but also the transitory information between consecutive
visemes. A large section of this thesis has been dedicated to analysis the performance of the new visual speech unit model when compared with that attained for standard (MPEG-
4) viseme models. Two experimental results indicate that:
1. The developed VSR system achieved 80-90% correct recognition when applied to the identification of 60 classes of VSUs, while the recognition rate for the standard set of MPEG-4 visemes was only 62-72%.
2. When 15 words were identified using VSUs and visemes as the visual speech elements, the accuracy rate for word recognition based on VSUs was 7%-12% higher than that based on visemes.
Comparison of Different Methods for Tissue Segmentation in Histopathological Whole-Slide Images
Tissue segmentation is an important pre-requisite for efficient and accurate
diagnostics in digital pathology. However, it is well known that whole-slide
scanners can fail in detecting all tissue regions, for example due to the
tissue type, or due to weak staining because their tissue detection algorithms
are not robust enough. In this paper, we introduce two different convolutional
neural network architectures for whole slide image segmentation to accurately
identify the tissue sections. We also compare the algorithms to a published
traditional method. We collected 54 whole slide images with differing stains
and tissue types from three laboratories to validate our algorithms. We show
that while the two methods do not differ significantly they outperform their
traditional counterpart (Jaccard index of 0.937 and 0.929 vs. 0.870, p < 0.01).
Comment: Accepted for poster presentation at the IEEE International Symposium
on Biomedical Imaging (ISBI) 201