329 research outputs found
Automatic lymphocyte detection on gastric cancer IHC images using deep learning
Tumor-infiltrating lymphocytes (TILs) have received
considerable attention in recent years, as evidence
suggests they are related to cancer prognosis. Distribution
and localization of these and other types of immune cells
are of special interest for pathologists, and frequently involve
manual examination of immunohistochemistry (IHC) images.
We present a model based on deep convolutional neural
networks for automatic lymphocyte detection on IHC images
of gastric cancer. The dataset created as part of this work is
publicly available for future research.
Machine learning methods for histopathological image analysis
Abundant accumulation of digital histopathological images has led to the
increased demand for their analysis, such as computer-aided diagnosis using
machine learning techniques. However, digital pathological images and related
tasks have some issues to be considered. In this mini-review, we introduce the
application of digital pathological image analysis using machine learning
algorithms, address some problems specific to such analysis, and propose
possible solutions.
Comment: 23 pages, 4 figures
A Colour Wheel to Rule them All: Analysing Colour & Geometry in Medical Microscopy
Personalized medicine is a rapidly growing field in healthcare that aims to customize
medical treatments and preventive measures based on each patient’s unique characteristics,
such as their genes, environment, and lifestyle factors. This approach
acknowledges that people with the same medical condition may respond differently
to therapies and seeks to optimize patient outcomes while minimizing the risk
of adverse effects.
To achieve these goals, personalized medicine relies on advanced technologies,
such as genomics, proteomics, metabolomics, and medical imaging. Digital
histopathology, a crucial aspect of medical imaging, provides clinicians with valuable
insights into tissue structure and function at the cellular and molecular levels. By
analyzing small tissue samples obtained through minimally invasive techniques, such
as biopsy or aspirate, doctors can gather extensive data to evaluate potential diagnoses
and clinical decisions. However, digital analysis of histology images presents
unique challenges, including the loss of 3D information and stain variability, which
is further complicated by sample variability. Limited access to data exacerbates
these challenges, making it difficult to develop accurate computational models for
research and clinical use in digital histology.
Deep learning (DL) algorithms have shown significant potential for improving the
accuracy of Computer-Aided Diagnosis (CAD) and personalized treatment models,
particularly in medical microscopy. However, factors such as limited generalizability,
lack of interpretability, and bias sometimes hinder their clinical impact. Furthermore,
the inherent variability of histology images complicates the development of robust DL
methods. Thus, this thesis focuses on developing new tools to address these issues.
Our essential objective is to create transparent, accessible, and efficient methods
based on classical principles from various disciplines, including histology, medical
imaging, mathematics, and art, to tackle microscopy image registration and colour
analysis successfully. These methods can contribute significantly to the advancement
of personalized medicine, particularly in studying the tumour microenvironment
for diagnosis and therapy research.
First, we introduce a novel automatic method for colour analysis and non-rigid
histology registration, enabling the study of heterogeneity morphology in tumour
biopsies. This method achieves accurate tissue cut registration, drastically reducing
landmark distance and achieving excellent border overlap.
Second, we introduce ABANICCO, a novel colour analysis method that combines
geometric analysis, colour theory, fuzzy colour spaces, and multi-label systems
for automatically classifying pixels into a set of conventional colour categories.
ABANICCO outperforms benchmark methods in accuracy and simplicity. It is
computationally straightforward, making it useful in scenarios involving changing
objects, limited data, unclear boundaries, or when users lack prior knowledge of
the image or colour theory. Moreover, results can be modified to match each
particular task.
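As a rough illustration of what classifying pixels into conventional colour categories can look like, the sketch below bins a pixel by its hue angle on a colour wheel. It is not the ABANICCO algorithm: the hue boundaries, the thresholds `sat_min` and `val_min`, and the function name `classify_pixel` are all assumptions made for this example, and the abstract's fuzzy colour spaces and multi-label output are not modelled.

```python
import colorsys

# Hypothetical hue boundaries in degrees, chosen for illustration only.
# Each entry is (upper bound, category name); hue wraps around at 360.
HUE_BINS = [
    (15, "red"), (45, "orange"), (75, "yellow"), (165, "green"),
    (195, "cyan"), (255, "blue"), (285, "purple"), (345, "pink"),
    (360, "red"),
]


def classify_pixel(r, g, b, sat_min=0.15, val_min=0.15):
    """Assign an 8-bit RGB pixel to one named colour category."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    if v < val_min:
        return "black"  # too dark for the hue to be meaningful
    if s < sat_min:
        return "white" if v > 0.85 else "grey"  # achromatic pixels
    hue_deg = h * 360
    for upper, name in HUE_BINS:
        if hue_deg < upper:
            return name
    return "red"  # hue of exactly 360 degrees wraps back to red
```

Hard bin edges like these are the simplest possible choice. A method built on fuzzy colour spaces would instead let neighbouring categories overlap near their boundaries.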
Third, we apply the acquired knowledge to create a novel pipeline of rigid
histology registration and ABANICCO colour analysis for the in-depth study of
triple-negative breast cancer biopsies. The resulting heterogeneity map and tumour
score provide valuable insights into the composition and behaviour of the tumour,
informing clinical decision-making and guiding treatment strategies.
Finally, we consolidate the developed ideas into an efficient pipeline for tissue
reconstruction and multi-modality data integration on Tuberculosis infection data.
This enables accurate element distribution analysis to better understand interactions
between bacteria, host cells, and the immune system during the course of infection.
The methods proposed in this thesis represent a transparent approach to computational
pathology, addressing the needs of medical microscopy registration and
colour analysis while bridging the gap between clinical practice and computational
research. Moreover, our contributions can help develop and train better, more
robust DL methods.
In an era in which personalized medicine is revolutionizing healthcare, it is
increasingly important to tailor treatments and preventive measures to each
patient's genetic makeup, environment, and lifestyle. By employing advanced
technologies such as genomics, proteomics, metabolomics, and medical imaging,
personalized medicine strives to streamline treatment in order to improve outcomes
and reduce side effects.
Medical microscopy, a crucial aspect of personalized medicine, allows clinicians
to collect and analyze large amounts of data from small tissue samples. This is
especially relevant in oncology, where cancer therapies can be optimized according
to the specific tissue appearance of each tumour. Computational pathology, a
subfield of computer vision, seeks to create algorithms for the digital analysis of
biopsies. However, before a computer can analyze medical microscopy images,
several steps must be followed to obtain images of the samples.
The first stage consists of collecting and preparing a tissue sample from the
patient. So that it can be easily observed under the microscope, the sample is cut
into ultra-thin sections. However, this delicate procedure is not without
difficulties. The fragile tissue can become distorted, torn, or perforated,
compromising the overall integrity of the sample.
Once the tissue is properly prepared, it is usually treated with stains of
characteristic colours. These stains highlight different types of cells and tissues
with specific colours, making it easier for medical professionals to identify
particular features. However, this improvement in visualization comes at a high
cost. The stains can sometimes hinder computational analysis of the images by
mixing improperly, bleeding into the background, or altering the contrast between
the different elements.
The last step of the process consists of digitizing the sample. High-resolution
images of the tissue are taken at different magnifications, enabling computer
analysis. This stage also has its obstacles. Factors such as incorrect camera
calibration or inadequate lighting conditions can distort or blur the images. In
addition, the whole-slide images obtained are of considerable size, further
complicating the analysis.
In general, although the preparation, staining, and digitization of medical
microscopy samples are fundamental to digital analysis, each of these steps can
introduce additional challenges that must be addressed to ensure accurate
analysis. Moreover, converting a full tissue volume into a few stained sections
drastically reduces the available 3D information and introduces great uncertainty.
Deep learning (DL) solutions are very promising in the field of personalized
medicine, but their clinical impact is sometimes hampered by factors such as
limited generalizability, overfitting, opacity, and lack of interpretability, in
addition to ethical concerns and, in some cases, private incentives. Furthermore,
the variability of histological images complicates the development of robust DL
methods. To overcome these challenges, this thesis presents a series of highly
robust and interpretable methods based on classical principles from histology,
medical imaging, mathematics, and art, for aligning microscopy sections and
analyzing their colours.
Our first contribution is ABANICCO, an innovative colour analysis method that
provides objective, unsupervised colour segmentation and allows subsequent
refinement through easy-to-use tools. The accuracy and efficiency of ABANICCO
have been shown to be superior to those of existing colour classification and
segmentation methods, and it even excels at detecting and segmenting whole
objects. ABANICCO can be applied to microscopy images to detect stained areas for
biopsy quantification, a crucial aspect of cancer research.
The second contribution is an automatic, unsupervised tissue segmentation method
that identifies and removes background and artefacts from microscopy images,
thereby improving the performance of more sophisticated image analysis
techniques. This method is robust across diverse images, stains, and acquisition
protocols, and requires no training.
The third contribution is the development of novel methods for registering
histopathological images effectively, striking the right balance between accurate
registration and preservation of local morphology, depending on the intended
application.
As a fourth contribution, the three methods above are combined to create
efficient procedures for the complete integration of volumetric data, producing
highly interpretable visualizations of all the information present in consecutive
tissue biopsy sections. This data integration can have a major impact on the
diagnosis and treatment of various diseases, particularly breast cancer, by
enabling early detection, accurate clinical testing, effective treatment
selection, and improved communication and engagement with patients.
Finally, we apply our findings to multimodal data integration and tissue
reconstruction for the accurate analysis of the distribution of chemical elements
in tuberculosis, shedding light on the complex interactions between bacteria, host
cells, and the immune system during tuberculous infection. This method also
addresses problems such as acquisition damage, typical of many imaging modalities.
In summary, this thesis shows the application of classical computer vision
methods to medical microscopy registration and colour analysis to address the
unique challenges of this field, emphasizing effective and easy visualization of
complex data. We aim to keep refining our work with extensive technical validation
and better data analysis. The methods presented in this thesis are characterized
by their clarity, accessibility, effective data visualization, objectivity, and
transparency. These characteristics make them well suited to building robust
bridges between artificial intelligence researchers and clinicians, thereby
advancing computational pathology in medical practice and research.
Doctoral Programme in Biomedical Science and Technology, Universidad Carlos III de Madrid
President: María Jesús Ledesma Carbayo. Secretary: Gonzalo Ricardo Ríos Muñoz. Member: Estíbaliz Gómez de Marisca
The impact of pre- and post-image processing techniques on deep learning frameworks: A comprehensive review for digital pathology image analysis.
Recently, deep learning frameworks have rapidly become the main methodology for analyzing medical images. Due to their powerful learning ability and advantages in dealing with complex patterns, deep learning algorithms are ideal for image analysis challenges, particularly in the field of digital pathology. The variety of image analysis tasks in the context of deep learning includes classification (e.g., healthy vs. cancerous tissue), detection (e.g., lymphocyte and mitosis counting), and segmentation (e.g., nuclei and gland segmentation). The majority of recent machine learning methods in digital pathology have a pre- and/or post-processing stage which is integrated with a deep neural network. These stages, based on traditional image processing methods, are employed to make the subsequent classification, detection, or segmentation problem easier to solve. Several studies have shown how the integration of pre- and post-processing methods within a deep learning pipeline can further increase the model's performance when compared to the network by itself. The aim of this review is to provide an overview of the types of methods that are used within deep learning frameworks either to optimally prepare the input (pre-processing) or to improve the results of the network output (post-processing), focusing on digital pathology image analysis. Many of the techniques presented here, especially the post-processing methods, are not limited to digital pathology but can be extended to almost any image analysis field.
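To make the idea of a post-processing stage concrete, here is a minimal sketch of one common clean-up step: deleting small connected components from a binary segmentation mask, on the assumption that tiny isolated blobs are noise. The abstract does not say which techniques the review covers, so this choice, the function name `remove_small_objects`, and the `min_size` parameter are assumptions made for illustration.

```python
from collections import deque


def remove_small_objects(mask, min_size):
    """Return a copy of a binary mask (list of lists of 0/1) in which every
    4-connected foreground component with fewer than min_size pixels is cleared."""
    rows = len(mask)
    cols = len(mask[0]) if mask else 0
    out = [row[:] for row in mask]
    seen = [[False] * cols for _ in range(rows)]

    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collects one connected component.
            component = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Components below the size threshold are treated as noise.
            if len(component) < min_size:
                for y, x in component:
                    out[y][x] = 0
    return out
```

With `min_size=2`, a single stray foreground pixel is removed while a three-pixel region is kept. In practice the threshold would be tuned to the expected object size at the working magnification.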
A quantitative study of the tumor-immune microenvironment in colorectal cancer based on image analysis of immunohistochemistry-stained slides
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Medicine, Department of Medicine, 2021. 2. 강경훈.
Purpose: Despite the well-known prognostic value of the tumor–immune microenvironment (TIME) in colorectal cancers (CRCs), objective and readily applicable methods for quantifying tumor-infiltrating lymphocytes (TILs) and the tumor–stroma ratio (TSR) are not yet available.
Experimental Design: We established an analytic pipeline, based on open-source software, for quantifying TILs and the TSR from whole-slide images obtained after CD3 and CD8 immunohistochemical staining. Using random forest classifiers, the method separately quantified intraepithelial TILs (iTIL) and stromal TILs (sTIL). We applied this method to discovery and validation cohorts of 578 and 283 stage III or high-risk stage II CRC patients, respectively, who were subjected to curative surgical resection and oxaliplatin-based adjuvant chemotherapy.
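The quantities named above can be sketched as simple ratios once every image tile has a class label and every detected lymphocyte has a location. The snippet assumes that a classifier has already labelled tiles as tumor, stroma, or other. The function name `time_summary`, the tile-based units, and the definition of the stroma fraction as stroma / (tumor + stroma) are assumptions made for this example, not the thesis's definitions.

```python
from collections import Counter


def time_summary(tile_labels, lymphocytes):
    """Summarize tissue composition and lymphocyte densities.

    tile_labels: dict mapping (row, col) -> 'tumor', 'stroma', or 'other'
    lymphocytes: list of (row, col) tile coordinates, one per detected cell
    """
    area = Counter(tile_labels.values())
    # Count each lymphocyte under the class of the tile it falls in.
    cells = Counter(tile_labels.get(pos, "other") for pos in lymphocytes)
    tumor, stroma = area["tumor"], area["stroma"]
    return {
        # Share of the tumor-plus-stroma area occupied by stroma.
        "stroma_fraction": stroma / (tumor + stroma) if tumor + stroma else None,
        # Intraepithelial TILs: lymphocytes per tumor tile.
        "iTIL_per_tile": cells["tumor"] / tumor if tumor else None,
        # Stromal TILs: lymphocytes per stroma tile.
        "sTIL_per_tile": cells["stroma"] / stroma if stroma else None,
    }
```

Returning `None` when a compartment is absent avoids a division by zero and keeps such slides distinguishable from slides with a true density of zero.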
Results: Automatic quantification of iTILs and sTILs showed a moderate concordance with that obtained after visual inspection by pathologists. The K-means–based consensus clustering of 197 TIME parameters that showed robustness against variations in tumor area annotation caused CRCs to be grouped into five distinctive subgroups, reminiscent of those for consensus molecular subtypes (CMS1-4 and mixed/intermediate group). In accordance with the original CMS report, the CMS4-like subgroup (cluster 4) was significantly associated with a worse 5-year relapse-free survival and proved to be an independent prognostic factor. The clinicopathologic and prognostic features of the TIME subgroups were reproduced in an independent validation cohort.
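Conclusions: Machine-learning–based analysis of whole-slide histopathologic images can be useful for extracting quantitative information about the TIME. This information can classify CRCs into clinicopathologically relevant subgroups without performing molecular analyses of the tumors.
Although the tumor-immune microenvironment has long been known to be an important prognostic factor in colorectal cancer, no objective and simple method for measuring tumor-infiltrating lymphocytes (TILs) and the tumor-stroma ratio (TSR) had been reported so far. We therefore built an analytic pipeline, based on open-source software, that quantifies TILs and the TSR from whole-slide images of tumor tissue stained immunohistochemically for CD3 and CD8. Random forest, a representative machine-learning technique, was used to distinguish tumor from stroma in a given image, and 208 TIL and TSR parameters were computed per patient. We applied this method to a cohort of 578 patients who underwent colorectal cancer surgery at Seoul National University Hospital between 2005 and 2012, were diagnosed with high-risk stage II or stage III disease, and received oxaliplatin-based chemotherapy. Clustering analysis of the 197 parameters (out of 208) whose values remained stable under repeated analysis classified the 578 patients into five subtypes. We found that the clinicopathologic features of each subtype corresponded one-to-one with those of the consensus molecular subtypes (CMS), the established molecular subtypes of colorectal cancer. Among the CMS subtypes, the fourth, characterized by severe tissue fibrosis, is known to have the worst prognosis. In our results as well, the subtype with severe tissue fibrosis was significantly associated with poor 5-year relapse-free survival, and it remained an independent prognostic factor after adjustment for TNM stage and tumor differentiation. These clinicopathologic and prognostic features were reproduced in an independent cohort of 283 patients recruited at Seoul National University Bundang Hospital between 2007 and 2012. These findings confirm that machine-learning-based analysis of histopathology images is a useful way to obtain quantitative information about the tumor-immune microenvironment, and they demonstrate that such information can classify colorectal cancer patients into clinically meaningful subtypes without molecular biology experiments.
Abstract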
Table of Contents
Chapter 1. Introduction
Chapter 2. Materials and Methods
2.1. Patients and samples
2.2. Immunohistochemistry
2.3. Construction of machine learning classifiers for identifying the tumor, stroma, and lymphocytes
2.4. Validation of discrimination between iTILs and sTILs
2.5. Whole-slide quantification of tumor, stroma, and lymphocytes
2.6. Identification of tumor subtypes based on TIME parameters
2.7. Molecular analysis
2.8. Statistical analysis
Chapter 3. Results
3.1. Establishment of an analytic pipeline for quantification of TILs and TSR from whole-slide immunohistochemical images
3.2. Quantitative description for TIME of stage III and high-risk stage II CRCs indicative of curative surgical resection and oxaliplatin-based adjuvant chemotherapy
3.3. Subtyping of CRC based on quantitative features of TIME
3.4. Differential prognostic implications of five TIME clusters
Chapter 4. Discussion
Bibliography
Acknowledgement
Abstract in Korean
- …