89 research outputs found

    Supervised and unsupervised segmentation of textured images by efficient multi-level pattern classification

    This thesis proposes new, efficient methodologies for supervised and unsupervised image segmentation based on texture information. For the supervised case, it proposes a pixel classification technique based on a multi-level strategy that iteratively refines the resulting segmentation. This strategy uses pattern recognition methods based on prototypes (determined by clustering algorithms) and support vector machines. To obtain the best performance, an algorithm for automatic parameter selection and methods to reduce the computational cost of the segmentation process are also included. For the unsupervised case, the previous methodology is adapted by means of an initial pattern discovery stage, which transforms the original unsupervised problem into a supervised one. Several sets of experiments on a wide variety of images are carried out to validate the developed techniques.
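The prototype-based classification mentioned in this abstract can be illustrated with a minimal nearest-prototype sketch. This is not the thesis's method: the prototypes are assumed to have been found already by some clustering step, and the class names, feature vectors and the `nearest_prototype` helper are hypothetical.

```python
import math

def nearest_prototype(feature, prototypes):
    """Return the label whose closest prototype is nearest to `feature`.

    `prototypes` maps a class label to a list of prototype vectors
    (assumed here to come from a prior clustering step).
    """
    best_label, best_dist = None, math.inf
    for label, protos in prototypes.items():
        for proto in protos:
            d = math.dist(feature, proto)  # Euclidean distance
            if d < best_dist:
                best_label, best_dist = label, d
    return best_label

# Hypothetical 2-D texture features per pixel and per-class prototypes.
prototypes = {
    "grass": [(0.2, 0.8), (0.3, 0.7)],
    "brick": [(0.9, 0.1)],
}
pixels = [(0.25, 0.75), (0.8, 0.2)]
labels = [nearest_prototype(p, prototypes) for p in pixels]
```

A multi-level scheme would presumably repeat such a classification at successively finer resolutions, but that refinement loop is left out of this sketch.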

    Statistical modelling for facial expression dynamics

    PhD thesis. Facial expressions are one of the most powerful and fastest means of relaying emotions between humans. The ability to capture, understand and mimic those emotions and their underlying dynamics in a synthetic counterpart is a challenging task because of the complexity of human emotions, the different ways of conveying them, non-linearities caused by facial feature and head motion, and the ever-critical eye of the viewer. This thesis sets out to address some of the limitations of existing techniques by investigating three components of an expression modelling and parameterisation framework: (1) feature and expression manifold representation, (2) pose estimation, and (3) expression dynamics modelling and parameterisation for the purpose of driving a synthetic head avatar. First, we introduce a hierarchical representation based on the Point Distribution Model (PDM). Holistic representations imply that non-linearities caused by the motion of facial features, and intra-feature correlations, are implicitly embedded and hence have to be accounted for in the resulting expression space. Such representations also require large training datasets to account for all possible variations. To address those shortcomings, and to provide a basis for learning more subtle, localised variations, our representation consists of a tree-like structure in which a holistic root component is decomposed into leaves containing the jaw outline, each of the eyes and eyebrows, and the mouth. Each of the hierarchical components is modelled according to its intrinsic functionality, rather than the final, holistic expression label. Secondly, we introduce a statistical approach for capturing an underlying low-dimensional expression manifold by utilising components of the previously defined hierarchical representation. 
As Principal Component Analysis (PCA) based approaches cannot reliably capture variations caused by large facial feature changes because of their linear nature, the underlying dynamics manifold for each of the hierarchical components is modelled using a Hierarchical Latent Variable Model (HLVM) approach. Whilst retaining PCA properties, such a model introduces a probability density model which can deal with missing or incomplete data and allows discovery of internal within-cluster structures. All of the model parameters and the underlying density model are automatically estimated during the training stage. We investigate the usefulness of such a model on larger and unseen datasets. Thirdly, we extend the HLVM concept to pose estimation to address the non-linear shape deformations and the definition of the plausible pose space caused by large head motion. Since our head rarely stays still, and its movements are intrinsically connected with the way we perceive and understand expressions, pose information is an integral part of their dynamics. The proposed approach integrates into our existing hierarchical representation model. It is learned using a sparse and discretely sampled training dataset, and generalises to a larger and continuous view-sphere. Finally, we introduce a framework that models and extracts expression dynamics. In existing frameworks, explicit definition of expression intensity and pose information is often overlooked, although it is usually implicitly embedded in the underlying representation. We investigate modelling of the expression dynamics based on static information only, and focus on its sufficiency for the task at hand. We compare a rule-based method that utilises the existing latent structure and provides a fusion of different components with holistic and Bayesian Network (BN) approaches. An Active Appearance Model (AAM) based tracker is used to extract relevant information from input sequences. 
Such information is subsequently used to define the parametric structure of the underlying expression dynamics. We demonstrate that such information can be utilised to animate a synthetic head avatar.

    IMAGE UNDERSTANDING OF MOLAR PREGNANCY BASED ON ANOMALIES DETECTION

    Cancer occurs when normal cells grow and multiply without normal control. As the cells multiply, they form an area of abnormal cells, known as a tumour. Many tumours exhibit abnormal chromosomal segregation at cell division. These anomalies play an important role in detecting molar pregnancy cancer. Molar pregnancy, also known as hydatidiform mole, can be categorised into partial (PHM) and complete (CHM) mole, persistent gestational trophoblastic and choriocarcinoma. Hydatidiform moles are most commonly found in women under the age of 17 or over the age of 35. Hydatidiform moles can be detected by morphological and histopathological examination. Even experienced pathologists cannot easily distinguish between complete and partial hydatidiform moles. However, this distinction is important in order to recommend the appropriate treatment method. Therefore, research into molar pregnancy image analysis and understanding is critical. The hypothesis of this research project is that an anomaly detection approach to analysing molar pregnancy images can improve image analysis and the classification of normal, PHM and CHM villi. The primary aim of this research project is to develop a novel method, based on anomaly detection, to identify and classify anomalous villi in stained molar pregnancy images. The novel method is developed to simulate expert pathologists' approach to the diagnosis of anomalous villi. The knowledge and heuristics elicited from two expert pathologists are combined with the morphological domain knowledge of molar pregnancy to develop a heuristic multi-neural network architecture designed to classify the villi into their appropriate anomalous types. This study confirmed that a single feature cannot give enough discriminative power for villi classification. 
Whereas expert pathologists consider size and shape before textural features, this thesis demonstrated that the textural features have a higher discriminative power than size and shape. The first heuristic-based multi-neural network, which was based on 15 elicited features, achieved an improved average accuracy of 81.2%, compared to the traditional multi-layer perceptron (80.5%); however, the recall of the CHM villi class was still low (64.3%). Two further textural features, which were elicited and added to the second heuristic-based multi-neural network, improved the average accuracy from 81.2% to 86.1% and the recall of the CHM villi class from 64.3% to 73.5%. The precision of the multi-neural network II also increased from 82.7% to 89.5% for the normal villi class, from 81.3% to 84.7% for the PHM villi class and from 80.8% to 86% for the CHM villi class. To help pathologists visualise the results of the segmentation, a software tool, the Hydatidiform Mole Analysis Tool (HYMAT), was developed, compiling the morphological and pathological data for each villus analysis.
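The accuracy, precision and recall figures quoted in this abstract are standard per-class metrics. As a reminder of how they are derived, here is a minimal sketch that computes them from a confusion matrix. The counts are made up for illustration and are not the thesis's data; `per_class_metrics` is a hypothetical helper.

```python
def per_class_metrics(confusion):
    """Compute accuracy plus per-class precision and recall.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    metrics = {"accuracy": correct / total, "precision": [], "recall": []}
    for k in range(n):
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        hit = confusion[k][k]
        metrics["precision"].append(hit / predicted_k if predicted_k else 0.0)
        metrics["recall"].append(hit / actual_k if actual_k else 0.0)
    return metrics

# Hypothetical counts; rows and columns are ordered normal, PHM, CHM.
example = [
    [50, 5, 0],
    [4, 40, 6],
    [1, 9, 35],
]
result = per_class_metrics(example)
```

In this made-up example the CHM recall is the lowest of the three, which mirrors the kind of per-class weakness the abstract reports even when average accuracy looks acceptable.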

    Balance-guaranteed optimized tree with reject option for live fish recognition

    This thesis investigates the computer vision application of live fish recognition, which is needed in scenarios where manual annotation is too expensive because there are too many underwater videos. Such a system can assist ecological surveillance research, e.g. computing fish population statistics in the open sea. Some pre-processing procedures are employed to improve the recognition accuracy, and then 69 types of features are extracted. These features are a combination of colour, shape and texture properties in different parts of the fish such as tail/head/top/bottom, as well as the whole fish. Then, we present a novel Balance-Guaranteed Optimized Tree with Reject option (BGOTR) for live fish recognition. It improves the normal hierarchical method by arranging more accurate classifications at a higher level and keeping the hierarchical tree balanced. BGOTR is automatically constructed based on inter-class similarities. We apply a Gaussian Mixture Model (GMM) and Bayes rule as a reject option after the hierarchical classification to evaluate the posterior probability of being a certain species and so filter out less confident decisions. This novel classification-rejection method cleans up decisions and rejects unknown classes. After constructing the tree architecture, a novel trajectory voting method is used to eliminate accumulated errors during hierarchical classification and, therefore, achieves better performance. The proposed BGOTR-based hierarchical classification method is applied to recognize the 15 major species among 24150 manually labelled fish images and to detect new species in an unrestricted natural environment recorded by underwater cameras in the south Taiwan sea. It achieves significant improvements compared to the state-of-the-art techniques. Furthermore, the order in which feature selection and multi-class SVM construction are performed is investigated. 
We propose that an Individual Feature Selection (IFS) procedure can be applied directly to the binary One-versus-One SVMs before assembling the full multiclass SVM. The IFS method selects a different subset of features for each One-versus-One SVM inside the multiclass classifier so that each vote is optimized to discriminate the two specific classes. The proposed IFS method is tested on four different datasets, comparing performance and time cost. Experimental results demonstrate significant improvements compared to the normal Multiclass Feature Selection (MFS) method on all datasets.
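The reject option described in this abstract combines a GMM likelihood with Bayes rule. The following is a minimal sketch of that general idea, not the BGOTR implementation: it uses one-dimensional features, hand-set mixture parameters, and a hypothetical threshold of 0.8, all of which are assumptions for illustration.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def gmm_likelihood(x, components):
    """Likelihood of x under a mixture given as (weight, mean, std) triples."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)

def posterior(x, class_models, priors):
    """Bayes rule: P(class | x) for every modelled class."""
    joint = {c: priors[c] * gmm_likelihood(x, comps)
             for c, comps in class_models.items()}
    evidence = sum(joint.values())
    if evidence == 0.0:  # x is far from every model; nothing is plausible
        return {c: 0.0 for c in joint}
    return {c: v / evidence for c, v in joint.items()}

def accept_or_reject(x, predicted, class_models, priors, threshold=0.8):
    """Keep the classifier's decision only if its posterior is high enough."""
    p = posterior(x, class_models, priors)[predicted]
    return (predicted if p >= threshold else None), p

# Hypothetical per-species mixtures over a single feature.
models = {
    "A": [(1.0, 0.0, 1.0)],
    "B": [(0.5, 4.0, 1.0), (0.5, 6.0, 1.0)],
}
priors = {"A": 0.5, "B": 0.5}

confident = accept_or_reject(0.0, "A", models, priors)   # well inside class A
ambiguous = accept_or_reject(2.2, "A", models, priors)   # between A and B
```

Note that a posterior over known classes always sums to one, so a practical system that must also reject unknown species would likely need an additional check on the raw likelihood; that is outside this sketch.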

    Machine Learning in Image Analysis and Pattern Recognition

    This book charts the progress in applying machine learning, including deep learning, to a broad range of image analysis and pattern recognition problems and applications. It assembles original research articles that make unique contributions to the theory, methodology and applications of machine learning in image analysis and pattern recognition.

    Depth data improves non-melanoma skin lesion segmentation and diagnosis

    Examining surface shape appearance by touching and observing a lesion from different points of view is part of the clinical process for skin lesion diagnosis. Motivated by this, we hypothesise that surface shape embodies important information that serves to represent lesion identity and status. A new sensor, the Dense Stereo Imaging System (DSIS), allows us to capture 1:1 aligned 3D surface data and 2D colour images simultaneously. This thesis investigates whether the extra surface shape appearance information, represented by features derived from the captured 3D data, benefits skin lesion analysis, particularly on the tasks of segmentation and classification. In order to validate the contribution of 3D data to lesion identification, we compare the segmentations resulting from various combinations of image cues (e.g., colour, depth and texture) embedded in a region-based level set segmentation method. The experiments indicate that depth is complementary to colour. Adding the 3D information reduces the error rate from 7.8% to 6.6%. For the purpose of evaluating the segmentation results, we propose a novel ground truth estimation approach that incorporates a prior pattern analysis of a set of manual segmentations. The experiments on both synthetic and real data show that this method performs favourably compared to the state-of-the-art approach STAPLE [1] on ground truth estimation. Finally, we explore the usefulness of 3D information for non-melanoma lesion diagnosis through tests on both human and computer-based classifications of five lesion types. The results provide evidence for the benefit of the additional 3D information, i.e., adding the 3D-based features gives a significantly improved classification rate of 80.7% compared to only using colour features (75.3%). The three main contributions of the thesis are improved methods for lesion segmentation, non-melanoma lesion classification and lesion boundary ground-truth estimation.

    Rich probabilistic models for semantic labeling

    The aim of this monograph is to explore the methods and applications of semantic labeling. Our contributions to this rapidly developing topic concern certain aspects of modeling and inference in probabilistic models and their applications in the interdisciplinary fields of computer vision, medical image processing and remote sensing.

    On Improving Generalization of CNN-Based Image Classification with Delineation Maps Using the CORF Push-Pull Inhibition Operator

    Deployed image classification pipelines typically depend on images captured in real-world environments. This means that images might be affected by different sources of perturbation (e.g. sensor noise in low-light environments). The main challenge arises from the fact that image quality directly impacts the reliability and consistency of classification tasks. This challenge has, hence, attracted wide interest within the computer vision communities. We propose a transformation step that attempts to enhance the generalization ability of CNN models in the presence of unseen noise in the test set. Concretely, the delineation maps of given images are determined using the CORF push-pull inhibition operator. Such an operation transforms an input image into a space that is more robust to noise before it is processed by a CNN. We evaluated our approach on the Fashion MNIST data set with an AlexNet model. The proposed CORF-augmented pipeline achieved results on noise-free images comparable to those of a conventional AlexNet classification model without CORF delineation maps, but it consistently achieved significantly superior performance on test images perturbed with different levels of Gaussian and uniform noise.
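The evaluation described in this abstract perturbs test images with Gaussian and uniform noise. A minimal sketch of such a perturbation step is shown below; it does not implement the CORF operator or the paper's protocol. Pixel values in [0, 1], the clipping behaviour, the noise levels and both helper names are assumptions for illustration.

```python
import random

def _clip(value):
    """Keep a pixel intensity inside the assumed [0, 1] range."""
    return min(1.0, max(0.0, value))

def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    return [[_clip(p + rng.gauss(0.0, sigma)) for p in row] for row in image]

def add_uniform_noise(image, amplitude, rng):
    """Add noise drawn uniformly from [-amplitude, +amplitude]."""
    return [[_clip(p + rng.uniform(-amplitude, amplitude)) for p in row]
            for row in image]

rng = random.Random(0)                      # fixed seed for repeatability
clean = [[0.0, 0.5, 1.0], [0.25, 0.75, 0.5]]  # tiny stand-in for a test image
noisy_gauss = add_gaussian_noise(clean, 0.1, rng)
noisy_unif = add_uniform_noise(clean, 0.2, rng)
```

A robustness study would typically apply each noise level to the whole test set and compare classifier accuracy with and without the preprocessing step under evaluation.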