87 research outputs found

    Methods for fast and reliable clustering

    Classification of Primary versus Metastatic Pancreatic Tumor Cells Using Multiple Biomarkers and Whole Slide Imaging

    Pancreatic cancer is a challenging cancer with a high mortality rate and a 5-year survival rate between 2% and 9%. Biomarkers play a crucial role in cancer prognosis, diagnosis, and predicting the possible responses to a specific therapy. The discovery and development of various types of biomarkers have been studied intensively in the hope of determining the best treatment approaches, better management, and possibly a cure for this deadly cancer. However, metastasis, responsible for about 90% of the deaths from cancer, is still poorly understood. The few studies that have investigated the expression of a particular biomarker or a panel of biomarkers in the primary and secondary (metastatic) tumor demonstrate that the expression of different biomarkers in the primary and secondary tumor sites is not necessarily the same, even though the primary and metastatic tumor cells originate from the same organ. In this project, we aim to design a classifier to distinguish between primary and secondary tumor cells based on their uptake of different biomarkers, using immunofluorescence whole slide imaging. For this purpose, we first register consecutive images of the same slide so that we can locate the multiple biomarkers that belong to a cell; we then design our classifier based on vectors that record the presence or absence of each antibody in a tumor cell together with the amount of that antibody. Advisor: Khalid Sayoo
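
The per-cell vectors described in this abstract could be assembled roughly as follows. This is only a minimal sketch: the marker names in `PANEL`, the `threshold` cut-off, and the function `cell_feature_vector` are hypothetical and do not come from the project itself.

```python
from typing import Dict, List

# Hypothetical biomarker panel; the abstract does not name the antibodies used.
PANEL = ["marker_a", "marker_b", "marker_c"]


def cell_feature_vector(intensities: Dict[str, float],
                        threshold: float = 0.1) -> List[float]:
    """Build [present_1, amount_1, present_2, amount_2, ...] for one cell.

    `intensities` maps a biomarker name to its measured signal in the cell
    after the consecutive slide images have been registered. A marker that is
    missing from the dict is treated as having zero signal. `threshold` is an
    assumed cut-off separating "present" from "absent".
    """
    features: List[float] = []
    for marker in PANEL:
        amount = intensities.get(marker, 0.0)
        features.append(1.0 if amount >= threshold else 0.0)  # presence flag
        features.append(amount)                               # measured amount
    return features


vector = cell_feature_vector({"marker_a": 0.8, "marker_c": 0.05})
print(vector)
```

A vector of this shape can be passed to any standard classifier; the abstract does not say which classifier the project uses.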

    Self-organising maps : statistical analysis, treatment and applications.

    This thesis presents some substantial theoretical analyses and optimal treatments of Kohonen's self-organising map (SOM) algorithm, and explores the practical application potential of the algorithm for vector quantisation, pattern classification, and image processing. It consists of two major parts. In the first part, the SOM algorithm is investigated and analysed from a statistical viewpoint. The proof of its universal convergence for any dimensionality is obtained using a novel and extended form of the Central Limit Theorem. Its feature space is shown to be an approximate multivariate Gaussian process, which will eventually converge and form a mapping that minimises the mean-square distortion between the feature and input spaces. The diminishing effect of the initial states and the implicit effects of the learning rate and neighbourhood function on its convergence and ordering are analysed and discussed. Distinct and meaningful definitions, and associated measures, of its ordering are presented in relation to the map's fault-tolerance. The SOM algorithm is further enhanced by incorporating a proposed constraint, or Bayesian modification, in order to achieve optimal vector quantisation or pattern classification. The second part of this thesis addresses the task of unsupervised texture-image segmentation by means of SOM networks and model-based descriptions. A brief review of texture analysis in terms of definitions, perceptions, and approaches is given. Markov random field (MRF) model-based approaches are discussed in detail. Arising from this, a hierarchical self-organised segmentation structure, which consists of a local MRF parameter estimator, a SOM network, and a simple voting layer, is proposed and is shown, by theoretical analysis and practical experiment, to achieve a maximum likelihood or maximum a posteriori segmentation. A fast, simple, but efficient boundary relaxation algorithm is proposed as a post-processor to further refine the resulting segmentation. The class number validation problem in a fully unsupervised segmentation is approached by a classical, simple, and on-line minimum mean-square-error method. Experimental results indicate that this method is very efficient for texture segmentation problems. The thesis concludes with some suggestions for further work on SOM neural networks.
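
For readers unfamiliar with the algorithm this thesis analyses, the sketch below shows the textbook Kohonen update on a 1-D map, together with the mean-square distortion mentioned in the abstract. It is not the thesis's enhanced (Bayesian) variant. The linear decay schedules, the Gaussian neighbourhood, and the names `train_som` and `quantisation_error` are assumptions made for illustration.

```python
import math
import random
from typing import List


def train_som(data: List[List[float]], n_units: int = 5, epochs: int = 50,
              lr0: float = 0.5, sigma0: float = 2.0,
              seed: int = 0) -> List[List[float]]:
    """Train a 1-D self-organising map with the classic Kohonen update."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise the codebook with random values in [0, 1).
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.1)  # shrinking neighbourhood
            # Best-matching unit: the codebook vector closest to x.
            bmu = min(range(n_units),
                      key=lambda i: sum((w - v) ** 2
                                        for w, v in zip(weights[i], x)))
            for i in range(n_units):
                # Gaussian neighbourhood on the map's index lattice.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                weights[i] = [w + lr * h * (v - w)
                              for w, v in zip(weights[i], x)]
            step += 1
    return weights


def quantisation_error(data: List[List[float]],
                       weights: List[List[float]]) -> float:
    """Mean squared distance from each sample to its nearest unit."""
    return sum(min(sum((w - v) ** 2 for w, v in zip(unit, x))
                   for unit in weights) for x in data) / len(data)


data = [[0.1], [0.2], [0.8], [0.9]]
codebook = train_som(data)
print(quantisation_error(data, codebook))
```

Both the learning rate and the neighbourhood width shrink over time here, which are the two factors whose effect on convergence and ordering the abstract says the thesis analyses.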

    A perceptual learning model to discover the hierarchical latent structure of image collections

    Biology has been an unparalleled source of inspiration for the work of researchers in several scientific and engineering fields, including computer vision. The starting point of this thesis is the neurophysiological properties of the human early visual system, in particular, the cortical mechanism that mediates learning by exploiting information about stimuli repetition. Repetition has long been considered a fundamental correlate of skill acquisition and memory formation in biological as well as computational learning models. However, recent studies have shown that biological neural networks have different ways of exploiting repetition in forming memory maps. The thesis focuses on a perceptual learning mechanism called repetition suppression, which exploits the temporal distribution of neural activations to drive an efficient neural allocation for a set of stimuli. It explores the neurophysiological hypothesis that repetition suppression serves as an unsupervised perceptual learning mechanism that can drive efficient memory formation by reducing the overall size of stimuli representation while strengthening the responses of the most selective neurons. This interpretation of repetition differs from its traditional role in computational learning models, where it is used mainly to induce convergence and reach training stability, without using this information to provide focus for the neural representations of the data. The first part of the thesis introduces a novel computational model with repetition suppression, which forms an unsupervised competitive system termed CoRe, for Competitive Repetition-suppression learning. The model is applied to general problems in the fields of computational intelligence and machine learning. Particular emphasis is placed on validating the model as an effective tool for the unsupervised exploration of bio-medical data. In particular, it is shown that the repetition suppression mechanism efficiently addresses the issues of automatically estimating the number of clusters within the data, as well as filtering noise and irrelevant input components in high-dimensional data, e.g. gene expression levels from DNA microarrays. The CoRe model produces relevance estimates for each covariate, which are useful, for instance, to discover the best discriminating bio-markers. The description of the model includes a theoretical analysis using Huber’s robust statistics to show that the model is robust to outliers and noise in the data. The convergence properties of the model are also studied. It is shown that, besides its biological underpinning, the CoRe model has useful properties in terms of asymptotic behavior. By exploiting a kernel-based formulation for the CoRe learning error, a theoretically sound motivation is provided for the model’s ability to avoid local minima of its loss function. To do this, a necessary and sufficient condition for global error minimization in vector quantization is generalized by extending it to distance metrics in generic Hilbert spaces. This leads to the derivation of a family of kernel-based algorithms that address the local minima issue of unsupervised vector quantization in a principled way. The experimental results show that the algorithm can achieve a consistent performance gain compared with state-of-the-art learning vector quantizers, while retaining a lower computational complexity (linear with respect to the dataset size). Bridging the gap between the low-level representation of the visual content and the underlying high-level semantics is a major research issue of current interest. The second part of the thesis focuses on this problem by introducing a hierarchical and multi-resolution approach to visual content understanding. On a spatial level, CoRe learning is used to pool together the local visual patches by organizing them into perceptually meaningful intermediate structures. On the semantic level, it provides an extension of the probabilistic Latent Semantic Analysis (pLSA) model that allows discovery and organization of the visual topics into a hierarchy of aspects. The proposed hierarchical pLSA model is shown to effectively address the unsupervised discovery of relevant visual classes from pictorial collections, at the same time learning to segment the image regions containing the discovered classes. Furthermore, by drawing on a recent pLSA-based image annotation system, the hierarchical pLSA model is extended to process and represent multi-modal collections comprising textual and visual data. The results of the experimental evaluation show that the proposed model learns to attach textual labels (available only at the level of the whole image) to the discovered image regions, while increasing the precision/recall performance with respect to a flat pLSA annotation model.

    Data fusion by using machine learning and computational intelligence techniques for medical image analysis and classification

    Data fusion is the process of integrating information from multiple sources to produce specific, comprehensive, unified data about an entity. Data fusion is categorized as low level, feature level and decision level. This research focuses on investigating and developing feature- and decision-level data fusion for automated image analysis and classification. The common procedure for solving these problems can be described as: 1) process the image for region-of-interest detection, 2) extract features from the region of interest and 3) create a learning model based on the feature data. Image processing was performed using edge detection, a histogram threshold and a color drop algorithm to determine the region of interest. The extracted features were low-level features, including textual, color and symmetrical features. For image analysis and classification, feature- and decision-level data fusion techniques are investigated for model learning by using and integrating computational intelligence and machine learning techniques. These techniques include artificial neural networks, evolutionary algorithms, particle swarm optimization, decision trees, clustering algorithms, fuzzy logic inference, and voting algorithms. This work presents both the investigation and development of data fusion techniques for the application areas of dermoscopy skin lesion discrimination, content-based image retrieval, and graphic image type classification. --Abstract, page v
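
Decision-level fusion with a voting algorithm, one of the techniques this abstract lists, can be sketched as a plain majority vote. The three rule-based classifiers and the labels `class_a` and `class_b` below are hypothetical stand-ins rather than the models developed in the thesis.

```python
from collections import Counter
from typing import Callable, List, Sequence

Classifier = Callable[[Sequence[float]], str]


def fuse_by_majority_vote(classifiers: List[Classifier],
                          features: Sequence[float]) -> str:
    """Decision-level fusion: each classifier votes, the most common label wins.

    Ties are resolved in favour of the label that was voted for first,
    because Counter.most_common keeps first-seen order among equal counts.
    """
    votes = [clf(features) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical stand-in classifiers, each thresholding one feature.
def rule_0(f: Sequence[float]) -> str:
    return "class_a" if f[0] > 0.5 else "class_b"


def rule_1(f: Sequence[float]) -> str:
    return "class_a" if f[1] > 0.5 else "class_b"


def rule_2(f: Sequence[float]) -> str:
    return "class_a" if f[2] > 0.5 else "class_b"


label = fuse_by_majority_vote([rule_0, rule_1, rule_2], [0.7, 0.2, 0.9])
print(label)  # prints "class_a": two of the three rules vote for it
```

Feature-level fusion, by contrast, would combine the extracted features into a single vector before any one model is trained.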

    Automatic facial recognition based on facial feature analysis

    Evaluation of pointer click relevance feedback in PicSOM : deliverable D1.2 of FP7 project nº 216529 PinView

    This report presents the results of a series of experiments in which knowledge of the most relevant part of each image is given as additional information to a content-based image retrieval system. The most relevant parts have been identified by search-task-dependent pointer clicks on the images. As such, they provide a rudimentary form of explicit enriched relevance feedback and to some extent mimic genuine implicit eye movement measurements, which are essential ingredients of the PinView project.

    Contributions to unsupervised and supervised learning with applications in digital image processing

    311 p. : il. [EN] This Thesis covers a broad period of research activities with a common thread: learning processes and their application to image processing. The two main categories of learning algorithms, supervised and unsupervised, have been touched on across these years. The main body of initial works was devoted to unsupervised learning neural architectures, especially the Self Organizing Map. Our aim was to study its convergence properties from empirical and analytical viewpoints. From the digital image processing point of view, we have focused on two basic problems: Color Quantization and filter design. Both problems have been addressed from the context of Vector Quantization performed by Competitive Neural Networks. Processing of non-stationary data is an interesting paradigm that has not been explored with Competitive Neural Networks. We have stated the problem of Non-stationary Clustering and related Adaptive Vector Quantization in the context of image sequence processing, where we naturally have a Frame Based Adaptive Vector Quantization. This approach deals with the problem as a sequence of stationary almost-independent Clustering problems. We have also developed some new computational algorithms for Vector Quantization design. The works on supervised learning have been sparsely distributed in time and direction. First, we worked on the use of the Self Organizing Map for the independent modeling of skin and non-skin color distributions for color-based face localization. Second, we have collaborated in the realization of a supervised learning system for tissue segmentation in Magnetic Resonance Imaging data. Third, we have worked on the development, implementation and experimentation with High Order Boltzmann Machines, which are a very different learning architecture. Finally, we have been working on the application of Sparse Bayesian Learning to a new kind of classification systems based on Dendritic Computing. This last research line is an open research track at the time of writing this Thesis.