311 research outputs found

    Identificación automática de marcadores patológicos en imágenes de histopatología

    Abstract. Inter- and intra-subject variability is a common problem in several tasks associated with the examination of histopathological samples. This variability can hinder the evaluation of cancerous diseases. Automatic image analysis techniques and computer-aided diagnostic tools in pathology aim to reduce its impact by offering quantitative measurements and estimations, which allow a more accurate evaluation and classification of diseases in virtual slide images. The main problem addressed in this thesis is evaluating how the automated identification of pathological markers correlates with cancer malignancy and aggressiveness. To that end, a set of classifier models is trained to detect known pathological patterns. The classifiers are then used to quantify the presence of the pathological markers. Finally, the resulting measurements are correlated with the risk of cancer recurrence. Results show that the automated detectors are able to quantify patterns that differ across several cancer risk groups. (Doctorate)
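    The abstract does not say which correlation measure was used. As a minimal sketch, assuming a rank-based (Spearman) correlation between a per-slide marker measurement and an ordinal risk group, the final step could look like the following. The data values and the names `marker_fraction` and `risk_group` are hypothetical and serve only as illustration.

```python
import numpy as np

def rankdata_average(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    sorted_vals = values[order]
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(values) and sorted_vals[j + 1] == sorted_vals[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = rankdata_average(x), rankdata_average(y)
    return float(np.corrcoef(rx, ry)[0, 1])

# Hypothetical data: fraction of tissue flagged by a marker detector per slide,
# and an ordinal recurrence-risk group (0 = low, 1 = intermediate, 2 = high).
marker_fraction = [0.05, 0.10, 0.08, 0.30, 0.25, 0.60, 0.55]
risk_group = [0, 0, 0, 1, 1, 2, 2]

rho = spearman_rho(marker_fraction, risk_group)
print(f"Spearman rho = {rho:.3f}")
```

    A value of rho close to 1 would indicate that slides with more detected marker tend to fall in higher risk groups. It says nothing about causation or clinical utility.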

    Representación de imágenes de histopatología utilizada en tareas de análisis automático: estado del arte

    This paper reviews the state of the art in histopathology image representation for automatic image analysis tasks. Automatic analysis of histopathology images is important for building computer-assisted diagnosis tools, automatic image enhancement systems and virtual microscopy systems, among other applications. Histopathology images contain a rich mix of visual patterns with particularities that make them difficult to analyze. First, the paper discusses these particularities, the acquisition process and the challenges of automatic analysis. Second, it gives an overview of recent works and methods for visual content representation in different automatic image analysis tasks. Third, it surveys applications of image representation methods in several medical domains and tasks. Finally, the paper concludes with current trends in the automatic analysis of histopathology images, such as digital pathology.

    Data-driven Representation Learning from Histopathology Image Databases to Support Digital Pathology Analysis

    Cancer research is a major public health priority worldwide because of the disease's high incidence, diversity and mortality. Despite great advances in recent decades, the high incidence and the lack of specialists have shown that one of the major challenges is achieving early diagnosis. Improved early diagnosis, especially in developing countries, plays a crucial role in timely treatment and patient survival. Recent advances in scanner technology for digitizing pathology slides, together with the growth of global initiatives to build databases for cancer research, have enabled the emergence of digital pathology as a new approach to support pathology workflows. This has led to the development of many computational methods for automatic histopathology image analysis, which in turn has raised new computational challenges: the high visual variability of histopathology slides, the difficulty of assessing the effectiveness of methods (given the lack of annotated data from different pathologists and institutions), and the need for interpretable, efficient and feasible methods for practical use. Meanwhile, machine learning techniques have focused on exploiting large databases to automatically extract and induce information and knowledge, in the form of patterns and rules, that connect low-level content with its high-level meaning. Several approaches, now known as representation learning, have emerged as alternatives to traditional schemes based on handcrafted features for data representation. The objective of this thesis is the exploration, development and validation of precise, interpretable and efficient machine learning methods for automatic representation learning from histopathology image databases to support diagnosis tasks for different types of cancer. Validation of the proposed methods during the thesis corroborated their capability in several histopathology image analysis tasks across different types of cancer. The methods achieve good results in terms of accuracy, robustness, reproducibility, interpretability and feasibility, suggesting their potential practical application in translational and personalized medicine. (Doctorate)

    Representation learning for histopathology image analysis

    Abstract. Automatic methods for image representation and analysis have been successfully applied to several medical imaging problems, leading to the emergence of novel research areas such as digital pathology and bioimage informatics. The main challenge for these methods is dealing with the high visual variability of the biological structures present in the images, which widens the semantic gap between their visual appearance and their high-level meaning. In histopathology images in particular, visual variability is also related to the noise added by acquisition stages such as magnification, sectioning and staining. Many efforts have focused on carefully selecting image representations that capture this variability. That approach requires expert knowledge as well as hand-engineered design to build good feature detectors that represent the relevant visual information. Current approaches in classical computer vision tasks have replaced such design by making the image representation a new learning stage, called representation learning. This paradigm has outperformed the previous state of the art in many pattern recognition tasks, such as speech recognition, object detection and image scene classification. The aim of this research was to explore and define a learning-based histopathology image representation strategy with interpretative capabilities. The main contribution was a novel approach to learning the image representation for cancer detection. The proposed approach learns the representation directly from a basal cell carcinoma image collection in an unsupervised way and was extended to extract more complex features from low-level representations. Additionally, this research proposed the digital staining module, a complementary interpretability stage that supports diagnosis through visual identification of discriminant and semantic features. Experimental results showed a performance of 92% in F-score, improving on the state-of-the-art representation by 7%. This research concluded that representation learning improves the generalization of feature detectors as well as performance on the basal cell carcinoma detection task. As additional contributions, a bag-of-features image representation was extended and evaluated for Alzheimer's disease detection, obtaining 95% in terms of equal error classification rate. Also, a novel perspective for learning morphometric measures in cervical cells based on bag of features was presented and evaluated, obtaining promising results in predicting nucleus and cytoplasm areas. (Master's)

    Deep Learning-Based Prediction of Molecular Tumor Biomarkers from H&E: A Practical Review

    Molecular and genomic properties are critical in selecting cancer treatments to target individual tumors, particularly for immunotherapy. However, the methods to assess such properties are expensive, time-consuming, and often not routinely performed. Applying machine learning to H&E images can provide a more cost-effective screening method. Dozens of studies over the last few years have demonstrated that a variety of molecular biomarkers can be predicted from H&E alone using the advancements of deep learning: molecular alterations, genomic subtypes, protein biomarkers, and even the presence of viruses. This article reviews the diverse applications across cancer types and the methodology to train and validate these models on whole slide images. From bottom-up to pathologist-driven to hybrid approaches, the leading trends include a variety of weakly supervised deep learning-based approaches, as well as mechanisms for training strongly supervised models in select situations. While results of these algorithms look promising, some challenges still persist, including small training sets, rigorous validation, and model explainability. Biomarker prediction models may yield a screening method to determine when to run molecular tests or an alternative when molecular tests are not possible. They also create new opportunities in quantifying intratumoral heterogeneity and predicting patient outcomes.
    Comment: 20 pages, 2 figures
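    The abstract mentions weakly supervised approaches on whole slide images without detailing any one of them. As a minimal sketch of the general idea, assuming only a slide-level label is available and a tile-level model has already produced one probability per tile, the tile scores can be aggregated into a slide-level score. Top-k mean pooling is used here purely as an illustrative assumption. The function name `slide_score` and all numbers are hypothetical.

```python
import numpy as np

def slide_score(tile_probs, top_k=3):
    """Aggregate tile-level probabilities into one slide-level score by
    averaging the top_k highest tile probabilities (top-k mean pooling)."""
    probs = np.asarray(tile_probs, dtype=float)
    if probs.size == 0:
        raise ValueError("a slide needs at least one tile")
    k = min(top_k, probs.size)
    # Sort descending and keep the k most suspicious tiles.
    top = np.sort(probs)[::-1][:k]
    return float(top.mean())

# Hypothetical tile probabilities from an upstream tile classifier.
tiles_biomarker_positive = [0.05, 0.10, 0.92, 0.88, 0.07, 0.95]
tiles_biomarker_negative = [0.05, 0.10, 0.12, 0.08, 0.07, 0.03]

pos = slide_score(tiles_biomarker_positive)
neg = slide_score(tiles_biomarker_negative)
print(f"positive slide: {pos:.3f}, negative slide: {neg:.3f}")
```

    Averaging only the highest-scoring tiles reflects the assumption that a biomarker signal may be confined to a small part of the slide. Other aggregation schemes, such as attention-based pooling, follow the same tile-to-slide pattern.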

    Anotación Automática de Imágenes Médicas Usando la Representación de Bolsa de Características

    Abstract. The automatic annotation of medical images has become a necessary process for managing, searching and exploring growing medical image databases for diagnostic support and image analysis in biomedical research. Automatic annotation consists of assigning high-level concepts to images from their low-level visual features. This requires an image representation that characterizes the visual content and a learning model trained with examples of annotated images. This work explores the Bag of Features (BOF) representation for histology images and Kernel Methods (KM) as machine learning models for automatic annotation. Additionally, a methodology for image collection analysis was explored in order to find visual patterns and their relationships with semantic concepts, using Mutual Information Analysis, Feature Selection with Max-Relevance and Min-Redundancy (mRMR) and Biclustering Analysis. The proposed methodology was evaluated on two image databases: the first contains images annotated with the four fundamental tissues, and the second contains images of a type of skin cancer known as basal cell carcinoma. The image analysis results show that it is possible to find implicit patterns in image collections from the BOF representation by selecting the relevant visual words in the collection and associating them with semantic concepts, whereas biclustering analysis found groups of similar images that share visual words associated with the type of stain or with concepts. Automatic annotation was evaluated with different configurations of the BOF approach. The best results have a Precision of 91% and a Recall of 88% on the histology images, and a Precision of 59% and a Recall of 23% on the histopathology images. The BOF configuration with the best results on both datasets used DCT-based visual words with a dictionary of size 1,000 and a Gaussian kernel. (Master's)
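    The configuration described above (DCT-based visual words, a learned dictionary, a Gaussian kernel) can be sketched as follows. This is a toy illustration under stated assumptions, not the thesis implementation: the 8x8 non-overlapping patches, the plain k-means dictionary learning, the dictionary size of 5 (the abstract reports 1,000), the `gamma` value and the random stand-in images are all hypothetical choices.

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-II transform matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def dct_patch_descriptors(image, patch=8):
    """Cut a grayscale image into non-overlapping patches and describe each
    one by its flattened 2-D DCT coefficients."""
    d = dct2_matrix(patch)
    h, w = image.shape
    descs = []
    for r in range(0, h - patch + 1, patch):
        for c in range(0, w - patch + 1, patch):
            block = image[r:r + patch, c:c + patch]
            descs.append((d @ block @ d.T).ravel())
    return np.array(descs)

def kmeans(x, k, iters=20, seed=0):
    """Plain k-means; the returned centers act as the visual dictionary."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return centers

def bof_histogram(descs, centers):
    """Assign each descriptor to its nearest visual word and return the
    normalized word-frequency histogram."""
    dists = ((descs[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    return hist / hist.sum()

def gaussian_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two histograms."""
    return float(np.exp(-gamma * np.sum((a - b) ** 2)))

# Hypothetical stand-in data: four random 32x32 "images".
rng = np.random.default_rng(0)
images = [rng.random((32, 32)) for _ in range(4)]

all_descs = np.vstack([dct_patch_descriptors(img) for img in images])
codebook = kmeans(all_descs, k=5)  # toy dictionary; the abstract reports 1,000 words
hists = [bof_histogram(dct_patch_descriptors(img), codebook) for img in images]
gram = np.array([[gaussian_kernel(p, q) for q in hists] for p in hists])
print(gram.shape)
```

    The resulting Gram matrix is the kind of input a kernel classifier would consume to predict annotations from annotated training images.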

    Computer aided diagnosis algorithms for digital microscopy

    Automatic analysis and information extraction from an image is still a highly challenging research problem in computer vision, which attempts to describe image content with computational and mathematical techniques. Moreover, the information extracted from the image should be meaningful and as discriminatory as possible, since it will be used to categorize the content according to the problem under analysis. In the medical imaging domain this issue is felt even more keenly, because many important decisions that affect patient care depend on the usefulness of the information extracted from the image. Handling medical images is more complicated not only because of the importance of the problem, but also because a fair amount of prior medical knowledge is needed to represent as data the visual information to which pathologists refer. Today, medical decisions that impact patient care rely on the results of laboratory tests to a greater extent than ever before, owing to the marked expansion in the number and complexity of the tests offered. These developments promise to improve patient care, but the more the number and complexity of the tests increase, the greater the possibility of misapplying and misinterpreting the tests themselves, leading to inappropriate diagnoses and therapies. Moreover, as the number of tests increases, so does the amount of data to be analysed, forcing pathologists to devote much of their time to analysing the tests rather than to patient care and the prescription of the right therapy, especially considering that most of the tests performed are just check-up tests and most of the analysed samples come from healthy patients. A quantitative evaluation of medical images is therefore essential to overcome uncertainty and subjectivity, and also to greatly reduce the amount of data and the time needed for the analysis. In the last few years, many computer-assisted diagnosis systems have been developed that attempt to mimic pathologists by extracting features from the images. Image analysis involves complex algorithms to identify and characterize cells or tissues using image pattern recognition technology. This thesis addresses the main problems associated with digital microscopy analysis in histology and haematology diagnosis by developing algorithms that extract useful information from different digital images and are able to distinguish different biological structures in the images themselves. The proposed methods aim to improve the accuracy of the analysis and to reduce its duration when used as the only means of diagnosis. They can also be used as intermediate tools to reduce the number of samples that the pathologist must analyse directly, or as double-check systems to verify the results of the automated facilities used today.

    Machine Learning Approaches to Predict Recurrence of Aggressive Tumors

    Cancer recurrence is the major cause of cancer mortality. Despite tremendous research efforts, there is a dearth of biomarkers that reliably predict the risk of cancer recurrence. Currently available biomarkers and tools in the clinic have limited usefulness for accurately identifying patients with a higher risk of recurrence. Consequently, cancer patients suffer from either under- or overtreatment. Recent advances in machine learning and image analysis have facilitated the development of techniques that translate digital images of tumors into a rich source of new data. Leveraging these computational advances, my work addresses the unmet need to find risk-predictive biomarkers for Triple Negative Breast Cancer (TNBC), Ductal Carcinoma in situ (DCIS), and Pancreatic Neuroendocrine Tumors (PanNETs). I have developed unique, clinically facile models that determine the risk of recurrence, whether local, invasive, or metastatic, in these tumors. All models employ hematoxylin and eosin (H&E) stained digitized images of patient tumor samples as the primary source of data. The TNBC (n=322) models identified unique signatures from a panel of 133 protein biomarkers, relevant to breast cancer, to predict the site of metastasis (brain, lung, liver, or bone) for TNBC patients. Even our least significant model (bone metastasis) offered greater prognostic value than clinicopathological variables (hazard ratio [HR] of 5.123 vs. 1.397, p<0.05). A second model predicted 10-year recurrence risk in women with DCIS treated with breast-conserving surgery by identifying prognostically relevant features of tumor architecture from digitized H&E slides (n=344), using a novel two-step classification approach. In the validation cohort, our DCIS model provided a significantly higher HR (6.39) than any clinicopathological marker (p<0.05). The third model is a deep learning-based, multi-label (annotation followed by metastasis association) whole slide image analysis pipeline (n=90) that identified a PanNET high-risk group with an over 8x higher risk of metastasis (versus the low-risk group, p<0.05), regardless of confounding clinical variables. These machine learning-based models may guide treatment decisions and demonstrate proof of principle that computational pathology has tremendous clinical utility.