    Interpretable Fully Convolutional Classification of Intrapapillary Capillary Loops for Real-Time Detection of Early Squamous Neoplasia

    This work concentrates on the interpretability of classification results produced by a fully convolutional neural network. Motivated by the classification of oesophageal tissue for real-time detection of early squamous neoplasia, the most frequent kind of oesophageal cancer in Asia, we present a new dataset and a novel deep learning method that, by means of deep supervision and a newly introduced concept, the embedded Class Activation Map (eCAM), treats the interpretability of results as a design constraint of a convolutional network. We present a new approach to visualising attention that aims to give insight into the areas of oesophageal tissue that lead a network to conclude that an image belongs to a particular class, and we compare these areas with the visual features that clinicians use to produce a clinical diagnosis. In comparison to a baseline method that does not feature deep supervision but provides attention by grafting Class Activation Maps, we improve the F1-score from 87.3% to 92.7% and provide more detailed attention maps.
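
    The baseline mentioned above obtains attention from Class Activation Maps. As a rough illustration only, the sketch below computes a standard CAM as a weighted sum of the last convolutional feature maps, using the classifier weights of one class. It does not reproduce the eCAM or the deep supervision of the proposed method; the function names and the toy 2x2 feature maps are assumptions made for the example.

```python
from typing import List

Matrix = List[List[float]]


def class_activation_map(feature_maps: List[Matrix],
                         class_weights: List[float]) -> Matrix:
    """Standard CAM: cam[y][x] = sum over k of w_k * f_k[y][x]."""
    if len(feature_maps) != len(class_weights):
        raise ValueError("need exactly one weight per feature map")
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(class_weights, feature_maps):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def normalise(cam: Matrix) -> Matrix:
    """Rescale the map to [0, 1] so it can be shown as a heat map."""
    flat = [value for row in cam for value in row]
    lowest, highest = min(flat), max(flat)
    span = highest - lowest
    if span == 0:
        return [[0.0 for _ in row] for row in cam]
    return [[(value - lowest) / span for value in row] for row in cam]


# Hypothetical toy input: two 2x2 feature maps and the weights that
# connect them to a single class in the final linear layer.
feature_maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]
class_weights = [0.5, 1.5]

cam = class_activation_map(feature_maps, class_weights)
heat = normalise(cam)
```

    In practice the normalised map would be upsampled to the input resolution and overlaid on the image; that step is omitted here.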

    Joint and individual analysis of breast cancer histologic images and genomic covariates

    A key challenge in modern data analysis is understanding connections between complex and differing modalities of data. For example, two of the main approaches to the study of breast cancer are histopathology (analyzing visual characteristics of tumors) and genetics. While histopathology is the gold standard for diagnostics and there have been many recent breakthroughs in genetics, there is little overlap between these two fields. We aim to bridge this gap by developing methods based on Angle-based Joint and Individual Variation Explained (AJIVE) to directly explore similarities and differences between these two modalities. Our approach exploits Convolutional Neural Networks (CNNs) as a powerful, automatic method for image feature extraction to address some of the challenges presented by statistical analysis of histopathology image data. CNNs raise issues of interpretability that we address by developing novel methods to explore visual modes of variation captured by statistical algorithms (e.g. PCA or AJIVE) applied to CNN features. Our results provide many interpretable connections and contrasts between histopathology and genetics.
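
    One plausible way to inspect a mode of variation in CNN features is to project each image's feature vector onto a principal direction and then look at the images with the lowest and highest scores. The sketch below does this with plain PCA (first component only, via power iteration), not AJIVE, and it is an assumption about how such an exploration could be set up rather than the authors' procedure. The feature values are made up for illustration.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def center(rows: List[Vector]) -> List[Vector]:
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def first_principal_direction(centered: List[Vector],
                              iterations: int = 100) -> Vector:
    """Power iteration on X^T X, computed without forming the matrix.

    Assumes the uniform start vector is not orthogonal to the leading
    direction, which holds for the toy data below.
    """
    d = len(centered[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        scores = [dot(row, v) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered))
             for j in range(d)]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


# Hypothetical CNN features: one row per image, three feature dimensions.
features = [
    [2.0, 0.1, 1.0],
    [4.0, 0.0, 2.1],
    [6.0, 0.2, 2.9],
    [8.0, 0.1, 4.0],
]

centered = center(features)
direction = first_principal_direction(centered)
scores = [dot(row, direction) for row in centered]

# Image indices ordered from the lowest to the highest score; the two
# ends of this list are the images one would display side by side.
ranking = sorted(range(len(scores)), key=lambda i: scores[i])
```

    The sign of a principal direction is arbitrary, so which end counts as "low" or "high" carries no meaning by itself; only the contrast between the two ends does.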

    Representation learning for histopathology image analysis

    Automatic methods for image representation and analysis have been successfully applied to several medical imaging problems, leading to the emergence of novel research areas such as digital pathology and bioimage informatics. The main challenge for these methods is to deal with the high visual variability of the biological structures present in the images, which widens the semantic gap between their visual appearance and their high-level meaning. In histopathology images in particular, the visual variability is also related to the noise added by acquisition stages such as magnification, sectioning and staining, among others. Many efforts have focused on the careful selection of image representations to capture such variability. This approach requires expert knowledge as well as hand-engineered design to build good feature detectors that represent the relevant visual information. Current approaches in classical computer vision tasks have replaced such design by including the image representation as a new learning stage, called representation learning. This paradigm has surpassed previous state-of-the-art results in many pattern recognition tasks such as speech recognition, object detection, and image scene classification. The aim of this research was to explore and define a learning-based histopathology image representation strategy with interpretative capabilities. The main contribution was a novel approach to learning the image representation for cancer detection. The proposed approach learns the representation directly from a basal cell carcinoma image collection in an unsupervised way and was extended to extract more complex features from low-level representations. Additionally, this research proposed the digital staining module, a complementary interpretability stage that supports diagnosis through visual identification of discriminant and semantic features.
Experimental results showed an F-score of 92%, improving on the state-of-the-art representation by 7%. This research concluded that representation learning improves the generalization of feature detectors as well as performance on the basal cell carcinoma detection task. As additional contributions, a bag-of-features image representation was extended and evaluated for Alzheimer's disease detection, obtaining 95% in terms of equal error classification rate. Also, a novel perspective for learning morphometric measures in cervical cells based on bag of features was presented and evaluated, obtaining promising results in predicting nucleus and cytoplasm areas.
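
The bag-of-features representation mentioned above can be illustrated with a minimal sketch: each local descriptor of an image is assigned to its nearest codeword, and the image is then described by the normalised histogram of codeword counts. The codebook here is hand-picked for the example (in practice it is usually learned, for instance with k-means), and all names and values are hypothetical rather than taken from the thesis.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(descriptor: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codeword closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def bag_of_features(descriptors: Sequence[Sequence[float]],
                    codebook: Sequence[Sequence[float]]) -> List[float]:
    """Normalised histogram of codeword occurrences for one image."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Hypothetical codebook with three visual words in a 2-D descriptor space.
codebook = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]

# Hypothetical local descriptors extracted from patches of one image.
descriptors = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [4.8, 5.1]]

histogram = bag_of_features(descriptors, codebook)
```

The resulting fixed-length histogram is what a downstream classifier or regressor would consume, regardless of how many patches the image produced.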

    Computer-Aided Diagnosis for Melanoma using Ontology and Deep Learning Approaches

    The emergence of deep-learning algorithms provides great potential to enhance the prediction performance of computer-aided diagnosis support systems. Recent research indicates that well-trained algorithms can reach the accuracy of experienced senior clinicians in dermatology. However, the lack of interpretability and transparency hinders the algorithms’ utility in real life. Physicians and patients require a certain level of interpretability to accept and trust the results. Another limitation of AI algorithms is that they do not consider other information related to disease diagnosis, such as typical dermoscopic features and diagnostic guidelines. Clinical guidelines for skin disease diagnosis are designed around dermoscopic features. However, a structured and standard representation of the relevant knowledge in the skin disease domain is lacking. To address these challenges, this dissertation builds an ontology capable of formally representing the knowledge of dermoscopic features and develops an explainable deep learning model able to diagnose skin diseases and identify dermoscopic features. Additionally, the trained model can be applied to large-scale unlabeled datasets to automate the feature generation process. Computer-vision-aided feature extraction algorithms are combined with the deep learning model to improve the overall classification accuracy and reduce manual annotation effort.

    Data-driven Representation Learning from Histopathology Image Databases to Support Digital Pathology Analysis

    Cancer research is a major public health priority worldwide due to the high incidence, diversity and mortality of the disease. Despite great advances in this area during recent decades, the high incidence and the lack of specialists have shown that one of the major challenges is to achieve early diagnosis. Improved early diagnosis, especially in developing countries, plays a crucial role in timely treatment and patient survival. Recent advances in scanner technology for the digitization of pathology slides and the growth of global initiatives to build databases for cancer research have enabled the emergence of digital pathology as a new approach to support pathology workflows. This has led to the development of many computational methods for automatic histopathology image analysis, which in turn has raised new computational challenges: the high visual variability of histopathology slides, the difficulty of assessing the effectiveness of methods (given the lack of annotated data from different pathologists and institutions), and the need for interpretable, efficient and feasible methods for practical use. Meanwhile, machine learning techniques have focused on exploiting large databases to automatically extract and induce information and knowledge, in the form of patterns and rules, that connect low-level content with its high-level meaning. Several approaches, now known as representation learning, have emerged in contrast to traditional schemes based on handcrafted features for data representation. The objective of this thesis is the exploration, development and validation of precise, interpretable and efficient machine learning methods for automatic representation learning from histopathology image databases to support diagnosis tasks for different types of cancer.
The validation of the proposed methods during the thesis development corroborated their capability in several histopathology image analysis tasks across different types of cancer. These methods achieve good results in terms of accuracy, robustness, reproducibility, interpretability and feasibility, suggesting their potential practical application in translational and personalized medicine.