
    Metric Learning in Histopathological Image Classification: Opening the Black Box

    The application of machine learning techniques to histopathology images enables advances in the field, providing valuable tools that can speed up and facilitate the diagnosis process. The classification of these images is a relevant aid for physicians who have to process a large number of images in long and repetitive tasks. This work proposes the adoption of metric learning that, beyond the task of classifying images, can provide additional information able to support the decisions of the classification system. In particular, triplet networks have been employed to create a representation in the embedding space that gathers together images of the same class while tending to separate images with different labels. The obtained representation shows a clear separation of the classes and makes it possible to evaluate the similarity and dissimilarity among input images according to distance criteria. The model has been tested on the BreakHis dataset, a widely used reference dataset of breast cancer images with eight pathology labels and four magnification levels. Our proposed classification model achieves relevant performance at the patient level, with the advantage of providing interpretable information for the obtained results, a feature missing from all the recent methodologies proposed for the same purpose.
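A minimal sketch of the triplet-based metric learning described above, written in PyTorch; the backbone architecture, embedding size, and margin are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class EmbeddingNet(nn.Module):
    """Small CNN mapping an image patch to a point in the embedding space."""
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, dim)

    def forward(self, x):
        z = self.features(x).flatten(1)
        # L2-normalised embeddings so distances are directly comparable
        return nn.functional.normalize(self.head(z), dim=1)

net = EmbeddingNet()
triplet_loss = nn.TripletMarginLoss(margin=1.0)  # pulls anchor/positive together, pushes negative apart

# Dummy anchor / positive (same class) / negative (different class) batches.
anchor, positive, negative = (torch.randn(8, 3, 224, 224) for _ in range(3))
loss = triplet_loss(net(anchor), net(positive), net(negative))
loss.backward()

# At inference time, similarity between two patches reduces to the distance
# between their embeddings, which is the source of the interpretability
# claimed in the abstract.
```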

    Data Reduction Algorithms in Machine Learning and Data Science

    Raw data are usually required to be pre-processed for better representation or discrimination of classes. This pre-processing can be done by data reduction, i.e., either reduction in dimensionality or numerosity (cardinality). Dimensionality reduction can be used for feature extraction or data visualization. Numerosity reduction is useful for ranking data points or finding the most and least important data points. This thesis proposes several algorithms for data reduction, known as dimensionality and numerosity reduction, in machine learning and data science. Dimensionality reduction encompasses feature extraction and feature selection methods, while numerosity reduction includes prototype selection and prototype generation approaches. This thesis focuses on feature extraction and prototype selection for data reduction. Dimensionality reduction methods can be divided into three categories, i.e., spectral, probabilistic, and neural network-based methods. The spectral methods have a geometrical point of view and are mostly reduced to the generalized eigenvalue problem. Probabilistic and network-based methods have stochastic and information-theoretic foundations, respectively. Numerosity reduction methods can be divided into methods based on variance, geometry, and isolation.
For dimensionality reduction, under the spectral category, I propose weighted Fisher discriminant analysis, Roweis discriminant analysis, and image quality aware embedding. I also propose quantile-quantile embedding as a probabilistic method where the distribution of the embedding is chosen by the user. Backprojection, Fisher losses, and dynamic triplet sampling using Bayesian updating are other proposed methods in the neural network-based category. Backprojection is for training shallow networks with a projection-based perspective in manifold learning. Two Fisher losses are proposed for training Siamese triplet networks for increasing and decreasing the inter- and intra-class variances, respectively. Two dynamic triplet mining methods, which are based on Bayesian updating to draw triplet samples stochastically, are proposed. For numerosity reduction, principal sample analysis and instance ranking by matrix decomposition are the proposed variance-based methods; these methods rank instances using inter-/intra-class variances and matrix factorization, respectively. Curvature anomaly detection, in which the points are assumed to be the vertices of a polyhedron, and isolation Mondrian forest are the proposed methods based on geometry and isolation, respectively.
To assess the proposed data reduction tools, I apply them to applications in medical image analysis, image processing, and computer vision. Data reduction, used as a pre-processing tool, has different applications because it provides various ways of feature extraction and prototype selection that can be applied to different types of data. Dimensionality reduction extracts informative features, and prototype selection selects the most informative data instances. For example, for medical image analysis, I use Fisher losses and dynamic triplet sampling for embedding histopathology image patches and demonstrating how different the tumorous cancer tissue types are from the normal ones. I also propose offline/online triplet mining using extreme distances for this embedding.
In image processing and computer vision applications, I propose Roweisfaces and Roweisposes for face recognition and 3D action recognition, respectively, using my proposed Roweis discriminant analysis method. I also introduce the concepts of anomaly landscape and anomaly path using the proposed curvature anomaly detection and use them to denoise images and video frames. I report extensive experiments on different datasets to show the effectiveness of the proposed algorithms. Through these experiments, I demonstrate that the proposed methods are useful for extracting informative features and instances for better accuracy, representation, prediction, class separation, data reduction, and embedding. I show that the proposed dimensionality reduction methods can extract informative features for better separation of classes. An example is obtaining an embedding space for separating cancer histopathology patches from the normal patches, which helps hospitals diagnose cancers more easily in an automated way. I also show that the proposed numerosity reduction methods are useful for ranking data instances based on their importance and reducing data volumes without a significant drop in the performance of machine learning and data science algorithms.
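The abstract notes that the spectral methods largely reduce to a generalized eigenvalue problem. Below is a minimal sketch of that reduction for classical Fisher discriminant analysis; the weighted and Roweis variants proposed in the thesis modify the scatter matrices, and the regularisation term and toy data here are assumptions for illustration only.

```python
import numpy as np
from scipy.linalg import eigh

def fisher_directions(X, y, n_components=2, reg=1e-6):
    """Projection directions maximising between- over within-class scatter."""
    classes = np.unique(y)
    mean = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))  # within-class scatter
    Sb = np.zeros((d, d))  # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    # Generalized eigenvalue problem: Sb w = lambda * Sw w
    vals, vecs = eigh(Sb, Sw + reg * np.eye(d))
    order = np.argsort(vals)[::-1]
    return vecs[:, order[:n_components]]

# Example: project 3-class toy data onto the two most discriminative directions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 1.0, size=(50, 5)) for m in (0.0, 3.0, 6.0)])
y = np.repeat([0, 1, 2], 50)
X_low = X @ fisher_directions(X, y)
```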

    Histopathological image analysis: a review

    Over the past decade, dramatic increases in computational power and improvement in image analysis algorithms have allowed the development of powerful computer-assisted analytical approaches to radiological data. With the recent advent of whole slide digital scanners, tissue histopathology slides can now be digitized and stored in digital image form. Consequently, digitized tissue histopathology has now become amenable to the application of computerized image analysis and machine learning techniques. Analogous to the role of computer-assisted diagnosis (CAD) algorithms in medical imaging to complement the opinion of a radiologist, CAD algorithms have begun to be developed for disease detection, diagnosis, and prognosis prediction to complement the opinion of the pathologist. In this paper, we review the recent state-of-the-art CAD technology for digitized histopathology. This paper also briefly describes the development and application of novel image analysis technology for a few specific histopathology-related problems being pursued in the United States and Europe.

    Embedding MRI information into MRSI data source extraction improves brain tumour delineation in animal models

    Glioblastoma is the most frequent malignant intra-cranial tumour. Magnetic resonance imaging is the modality of choice in diagnosis, aggressiveness assessment, and follow-up. However, there are examples where it lacks diagnostic accuracy. Magnetic resonance spectroscopy enables the identification of molecules present in the tissue, providing a precise metabolomic signature. Previous research shows that combining imaging and spectroscopy information results in more accurate outcomes and superior diagnostic value. This study proposes a method to combine them, which builds upon a previous methodology whose main objective is to guide the extraction of sources. To this aim, prior knowledge about class-specific information is integrated into the methodology by setting the metric of a latent variable space where Non-negative Matrix Factorisation is performed. The former methodology, which only used spectroscopy and involved combining spectra from different subjects, was adapted to use selected areas of interest that arise from segmenting the T2-weighted image. Results showed that embedding imaging information into the source extraction (the proposed semi-supervised analysis) improved the quality of the tumour delineation compared to that obtained without this information (unsupervised analysis). Both approaches were applied to pre-clinical data, involving thirteen brain tumour-bearing mice, and tested against histopathological data. On results from twenty-eight images, the proposed Semi-Supervised Source Extraction (SSSE) method greatly outperformed the unsupervised one, as well as an alternative semi-supervised approach from the literature, with the differences being statistically significant. SSSE has proven successful in delineating the tumour, while bringing benefits such as 1) not constricting the metabolomic-based prediction to the image-segmented area, 2) the ability to deal with signal-to-noise issues, 3) the opportunity to answer specific questions by allowing researchers/radiologists to define areas of interest that guide the source extraction, 4) the creation of an intra-subject model, avoiding contamination from inter-subject overlaps, and 5) the extraction of meaningful, good-quality sources that add interpretability, supporting validation and a better understanding of each case.
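A minimal sketch of the unsupervised building block described above, i.e. plain non-negative matrix factorisation applied to spectra; the semi-supervised SSSE method additionally shapes the latent space with MRI-segmented areas of interest, which is not reproduced here, and the toy data and component count are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

# V: non-negative matrix of spectra, one voxel per row (toy data here).
rng = np.random.default_rng(0)
V = np.abs(rng.normal(size=(200, 512)))  # 200 voxels x 512 spectral points

model = NMF(n_components=3, init="nndsvda", max_iter=500, random_state=0)
H_mix = model.fit_transform(V)    # per-voxel mixing coefficients (abundance maps)
sources = model.components_       # extracted spectral "sources"

# A voxel can be assigned to the source with the largest mixing coefficient,
# yielding a nosological map that can be compared against histopathology.
labels = H_mix.argmax(axis=1)
```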

    Biochemical histology analysis of tissue samples by Desorption Electrospray Ionization (DESI) mass spectrometry imaging

    For over 100 years, the histopathological analysis of cytology, biopsy or resection specimens has been the final step in the process of diagnosing multiple diseases, including cancer. In recent years, standard clinical care has become increasingly complex, and as a result, diagnostic pathology workup is also more complex and extensive. Moreover, despite being considered a gold standard in making a diagnosis, histopathological investigations can be time-consuming. Additionally, the examination of stained slides is subject to intra-observer error. Therefore, it is evident that additional techniques are required to complement making a diagnosis. Desorption electrospray ionisation mass spectrometric imaging (DESI-MSI) is an emerging mass spectrometry technique with great potential in tissue analysis, especially in histological settings. DESI-MSI enables visualising the spatial distribution of lipid species across tissue sections, allowing a direct correlation of the metabolomic information with the morphological features. However, this technique has always relied on frozen sections, which are not often used in routine histopathology settings. Moreover, some embedding media, e.g. OCT, a common choice in diagnostic laboratories, have proven to be poorly suited to MSI. The main aim of this study was to make DESI-MSI more compatible with standard pathology procedures. Therefore, the first step was to assess OCT's impact on the quality of DESI-MSI data. The acquired data suggested that this embedding medium could be used for histopathological and mass spectrometric analyses. There were no clear polymeric signals causing differences in the negative mode data, but some reduction in intensities might be attributable to polymer-induced ion suppression. In positive mode data, the interferences due to OCT were more overt but could be negated by removing the regular peaks of the various polymeric distributions. As formalin-fixed, paraffin-embedded (FFPE) samples are the gold standard in histopathology laboratories worldwide, the next step was to optimise the pre-DESI-MSI protocol to allow the analysis of specimens that have been processed that way. A new protocol was adapted and successfully tested on FFPE mouse and human tissue samples for tissue classification. Additionally, DESI-MSI was used to analyse fresh-frozen and FFPE colorectal samples. Accuracies of 88.5% for normal samples and 91.7% for tumours were achieved when a batch of 38 fresh-frozen samples was analysed. A tissue microarray (TMA) consisting of 54 cores was further used to test the application of DESI-MSI to FFPE samples. Sections 10 μm thick were analysed in negative and positive modes, and tissue-prediction accuracies of over 80% and 92% were achieved, respectively. Equally good results were obtained for TMA sections that were 5 μm thick. This last observation was crucial in the light of making DESI-MSI as histology-friendly as possible, as 10 μm tissue sections are not routinely prepared in histopathology laboratories. Lastly, a new statistical approach based on ion colocalisation features was applied to DESI-MSI data acquired for cirrhotic liver diseases. It allowed the top ion correlations to be identified, and their distributions within the analysed tissue sections were visualised. It is possible that, using this approach, some biochemical interactions that distinguish the three classes of cirrhotic liver diseases (metabolic, hepatitis and cholangiopathy) could be captured.
The colocalisation patterns can potentially be used for data-driven hypothesis generation, suggesting possible local molecular mechanisms characterising the samples of interest.
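As an illustration of the colocalisation idea in the last paragraph, the sketch below scores ion pairs by the Pearson correlation of their spatial intensity profiles; the exact colocalisation features used in the thesis are not specified here, so this is an assumed, simplified measure on toy data.

```python
import numpy as np

def colocalisation(ion_a, ion_b):
    """Pearson correlation between two ion images of identical shape."""
    return np.corrcoef(ion_a.ravel(), ion_b.ravel())[0, 1]

# intensities: pixels x ions matrix from a DESI-MSI acquisition (toy data here).
rng = np.random.default_rng(0)
intensities = rng.random((64 * 64, 100))

# Rank ion pairs by colocalisation to surface candidate molecular interactions.
corr = np.corrcoef(intensities.T)           # ions x ions correlation matrix
iu = np.triu_indices_from(corr, k=1)        # unique ion pairs (upper triangle)
top = np.argsort(corr[iu])[::-1][:10]       # indices of the 10 strongest pairs
top_pairs = list(zip(iu[0][top], iu[1][top], corr[iu][top]))
```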