815 research outputs found

    Self-supervised learning in non-small cell lung cancer discovers novel morphological clusters linked to patient outcome and molecular phenotypes

    Histopathological images provide the definitive source of cancer diagnosis, containing information used by pathologists to identify and subclassify malignant disease, and to guide therapeutic choices. These images contain vast amounts of information, much of which is currently unavailable to human interpretation. Supervised deep learning approaches have been powerful for classification tasks, but they are inherently limited by the cost and quality of annotations. Therefore, we developed Histomorphological Phenotype Learning, an unsupervised methodology, which requires no annotations and operates via the self-discovery of discriminatory image features in small image tiles. Tiles are grouped into morphologically similar clusters which appear to represent recurrent modes of tumor growth emerging under natural selection. These clusters have distinct features which can be identified using orthogonal methods. Applied to lung cancer tissues, we show that they align closely with patient outcomes, with histopathologically recognised tumor types and growth patterns, and with transcriptomic measures of immunophenotype
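
The grouping step described in this abstract (tiles gathered into morphologically similar clusters) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the form of the tile features, so plain k-means is used here purely as a stand-in, and the 2-D `tile_embeddings` values, the `kmeans` helper, and its first-k initialisation are all hypothetical choices made for illustration.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Group feature vectors into k clusters with plain k-means (a stand-in).

    Assumption: centroids start at the first k vectors, which keeps the
    sketch deterministic; real pipelines usually use a better initialisation.
    """
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each tile goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical 2-D tile embeddings forming two well-separated groups.
tile_embeddings = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
]
tile_labels = kmeans(tile_embeddings, k=2)
```

In this toy input the first three tiles and the last three tiles end up in separate clusters. Real tile representations would be high-dimensional vectors learned by the self-supervised model rather than hand-written pairs.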

    Spatial Positioning of Immune Hotspots Reflects the Interplay between B and T Cells in Lung Squamous Cell Carcinoma

    Beyond tertiary lymphoid structures, a significant number of immune-rich areas without germinal center-like structures are observed in non–small cell lung cancer. Here, we integrated transcriptomic data and digital pathology images to study the prognostic implications, spatial locations, and constitution of immune rich areas (immune hotspots) in a cohort of 935 patients with lung cancer from The Cancer Genome Atlas. A high intratumoral immune hotspot score, which measures the proportion of immune hotspots interfacing with tumor islands, was correlated with poor overall survival in lung squamous cell carcinoma but not in lung adenocarcinoma. Lung squamous cell carcinomas with high intratumoral immune hotspot scores were characterized by consistent upregulation of B-cell signatures. Spatial statistical analyses conducted on serial multiplex IHC slides further revealed that only 4.87% of peritumoral immune hotspots and 0.26% of intratumoral immune hotspots were tertiary lymphoid structures. Significantly lower densities of CD20+CXCR5+ and CD79b+ B cells and less diverse immune cell interactions were found in intratumoral immune hotspots compared with peritumoral immune hotspots. Furthermore, there was a negative correlation between the percentages of CD8+ T cells and T regulatory cells in intratumoral but not in peritumoral immune hotspots, with tertiary lymphoid structures excluded. These findings suggest that the intratumoral immune hotspots reflect an immunosuppressive niche compared with peritumoral immune hotspots, independent of the distribution of tertiary lymphoid structures. A balance toward increased intratumoral immune hotspots is indicative of a compromised antitumor immune response and poor outcome in lung squamous cell carcinoma
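
The abstract defines the intratumoral immune hotspot score as the proportion of immune hotspots that interface with tumor islands. A minimal sketch of that ratio follows. How hotspots are detected and how "interfacing" is decided are not given in the abstract, so the per-hotspot boolean flags, the function name `intratumoral_hotspot_score`, and the zero score for a slide without hotspots are assumptions.

```python
def intratumoral_hotspot_score(hotspot_touches_tumor):
    """Fraction of immune hotspots flagged as interfacing with a tumor island."""
    flags = list(hotspot_touches_tumor)
    if not flags:
        # Assumption: a slide with no detected hotspots scores zero.
        return 0.0
    return sum(flags) / len(flags)


# Hypothetical slide: 8 hotspots, 4 of which touch a tumor island.
example_flags = [True, False, True, True, False, False, False, True]
score = intratumoral_hotspot_score(example_flags)
```

For the hypothetical slide above, 4 of 8 hotspots are flagged, so the score is 0.5.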

    Efficient interaction with large medical imaging databases

    Every day, hospitals and medical centers around the world produce large amounts of imaging content to support clinical decisions, medical research, and education. With the current trend towards evidence-based medicine, there is an increasing need for strategies that allow pathologists to interact properly with the valuable information hosted in such imaging repositories and to extract relevant content to support decision making. Unfortunately, current systems are very limited in providing access to content and extracting information from it because of various semantic and computational challenges. This thesis presents a complete pipeline, comprising 3 building blocks, that aims to improve the way pathologists and systems interact. The first building block consists of an adaptable strategy designed to ease access to and visualization of histopathology imaging content. The second block explores the extraction of relevant information from such imaging content by exploiting low- and mid-level information obtained from the morphology and architecture of cell nuclei. The third block aims to integrate high-level information from the expert into the process of identifying relevant information in the imaging content. This final block not only attempts to deal with the semantic gap but also presents an alternative to manual annotation, a time-consuming and error-prone task. Different experiments were carried out and demonstrated that the introduced pipeline not only allows pathologists to navigate and visualize images but also to extract diagnostic and prognostic information that could potentially support clinical decisions.

    Application of digital pathology-based advanced analytics of tumour microenvironment organisation to predict prognosis and therapeutic response.

    In recent years, the application of advanced analytics, especially artificial intelligence (AI), to digital H&E images, and other histological image types, has begun to radically change how histological images are used in the clinic. Alongside the recognition that the tumour microenvironment (TME) has a profound impact on tumour phenotype, the technical development of highly multiplexed immunofluorescence platforms has enhanced the biological complexity that can be captured in the TME with high precision. AI has an increasingly powerful role in the recognition and quantitation of image features and the association of such features with clinically important outcomes, as occurs in distinct stages in conventional machine learning. Deep-learning algorithms are able to elucidate TME patterns inherent in the input data with minimum levels of human intelligence and, hence, have the potential to achieve clinically relevant predictions and discovery of important TME features. Furthermore, the diverse repertoire of deep-learning algorithms able to interrogate TME patterns extends beyond convolutional neural networks to include attention-based models, graph neural networks, and multimodal models. To date, AI models have largely been evaluated retrospectively, outside the well-established rigour of prospective clinical trials, in part because traditional clinical trial methodology may not always be suitable for the assessment of AI technology. However, to enable digital pathology-based advanced analytics to meaningfully impact clinical care, specific measures of 'added benefit' to the current standard of care and validation in a prospective setting are important. This will need to be accompanied by adequate measures of explainability and interpretability. 
Despite such challenges, the combination of expanding datasets, increased computational power, and the possibility of integration of pre-clinical experimental insights into model development means there is exciting potential for the future progress of these AI applications. © 2023 The Authors. The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland

    Topological Tumor Graphs: A Graph-Based Spatial Model to Infer Stromal Recruitment for Immunosuppression in Melanoma Histology.

    Despite the advent of immunotherapy, metastatic melanoma represents an aggressive tumor type with a poor survival outcome. The successful application of immunotherapy requires in-depth understanding of the biological basis and immunosuppressive mechanisms within the tumor microenvironment. In this study, we conducted spatially explicit analyses of the stromal-immune interface across 400 melanoma hematoxylin and eosin (H&E) specimens from The Cancer Genome Atlas. A computational pathology pipeline (CRImage) was used to classify cells in the H&E specimen into stromal, immune, or cancer cells. The estimated proportions of these cell types were validated by independent measures of tumor purity, pathologists' estimate of lymphocyte density, imputed immune cell subtypes, and pathway analyses. Spatial interactions between these cell types were computed using a graph-based algorithm (topological tumor graphs, TTG). This approach identified two stromal features, namely stromal clustering and stromal barrier, which represented the melanoma stromal microenvironment. Tumors with increased stromal clustering and barrier were associated with reduced intratumoral lymphocyte distribution and poor overall survival independent of existing prognostic factors. To explore the genomic basis of these TTG-derived stromal phenotypes, we used a deep learning approach integrating genomic (copy number) and transcriptomic data, thereby inferring a compressed representation of copy number-driven alterations in gene expression. This integrative analysis revealed that tumors with high stromal clustering and barrier had reduced expression of pathways involved in naïve CD4 signaling, MAPK, and PI3K signaling. Taken together, our findings support the immunosuppressive role of stromal cells and T-cell exclusion within the vicinity of melanoma cells. 
SIGNIFICANCE: Computational histology-based stromal phenotypes within the tumor microenvironment are significantly associated with prognosis and immune exclusion in melanoma

    Multimodal Data Fusion and Quantitative Analysis for Medical Applications

    Medical big data is not only enormous in its size, but also heterogeneous and complex in its data structure, which makes conventional systems or algorithms difficult to process. These heterogeneous medical data include imaging data (e.g., Positron Emission Tomography (PET), Computerized Tomography (CT), Magnetic Resonance Imaging (MRI)), and non-imaging data (e.g., laboratory biomarkers, electronic medical records, and hand-written doctor notes). Multimodal data fusion is an emerging vital field to address this urgent challenge, aiming to process and analyze the complex, diverse and heterogeneous multimodal data. The fusion algorithms bring great potential in medical data analysis, by 1) taking advantage of complementary information from different sources (such as functional-structural complementarity of PET/CT images) and 2) exploiting consensus information that reflects the intrinsic essence (such as the genetic essence underlying medical imaging and clinical symptoms). Thus, multimodal data fusion benefits a wide range of quantitative medical applications, including personalized patient care, more optimal medical operation plan, and preventive public health. Though there has been extensive research on computational approaches for multimodal fusion, there are three major challenges of multimodal data fusion in quantitative medical applications, which are summarized as feature-level fusion, information-level fusion and knowledge-level fusion:
    • Feature-level fusion. The first challenge is to mine multimodal biomarkers from high-dimensional small-sample multimodal medical datasets, which hinders the effective discovery of informative multimodal biomarkers. Specifically, efficient dimension reduction algorithms are required to alleviate "curse of dimensionality" problem and address the criteria for discovering interpretable, relevant, non-redundant and generalizable multimodal biomarkers.
    • Information-level fusion. The second challenge is to exploit and interpret inter-modal and intra-modal information for precise clinical decisions. Although radiomics and multi-branch deep learning have been used for implicit information fusion guided with supervision of the labels, there is a lack of methods to explicitly explore inter-modal relationships in medical applications. Unsupervised multimodal learning is able to mine inter-modal relationship as well as reduce the usage of labor-intensive data and explore potential undiscovered biomarkers; however, mining discriminative information without label supervision is an upcoming challenge. Furthermore, the interpretation of complex non-linear cross-modal associations, especially in deep multimodal learning, is another critical challenge in information-level fusion, which hinders the exploration of multimodal interaction in disease mechanism.
    • Knowledge-level fusion. The third challenge is quantitative knowledge distillation from multi-focus regions on medical imaging. Although characterizing imaging features from single lesions using either feature engineering or deep learning methods have been investigated in recent years, both methods neglect the importance of inter-region spatial relationships. Thus, a topological profiling tool for multi-focus regions is in high demand, which is yet missing in current feature engineering and deep learning methods. Furthermore, incorporating domain knowledge with distilled knowledge from multi-focus regions is another challenge in knowledge-level fusion.
    To address the three challenges in multimodal data fusion, this thesis provides a multi-level fusion framework for multimodal biomarker mining, multimodal deep learning, and knowledge distillation from multi-focus regions. Specifically, our major contributions in this thesis include:
    • To address the challenges in feature-level fusion, we propose an Integrative Multimodal Biomarker Mining framework to select interpretable, relevant, non-redundant and generalizable multimodal biomarkers from high-dimensional small-sample imaging and non-imaging data for diagnostic and prognostic applications. The feature selection criteria including representativeness, robustness, discriminability, and non-redundancy are exploited by consensus clustering, Wilcoxon filter, sequential forward selection, and correlation analysis, respectively. SHapley Additive exPlanations (SHAP) method and nomogram are employed to further enhance feature interpretability in machine learning models.
    • To address the challenges in information-level fusion, we propose an Interpretable Deep Correlational Fusion framework, based on canonical correlation analysis (CCA) for 1) cohesive multimodal fusion of medical imaging and non-imaging data, and 2) interpretation of complex non-linear cross-modal associations. Specifically, two novel loss functions are proposed to optimize the discovery of informative multimodal representations in both supervised and unsupervised deep learning, by jointly learning inter-modal consensus and intra-modal discriminative information. An interpretation module is proposed to decipher the complex non-linear cross-modal association by leveraging interpretation methods in both deep learning and multimodal consensus learning.
    • To address the challenges in knowledge-level fusion, we proposed a Dynamic Topological Analysis framework, based on persistent homology, for knowledge distillation from inter-connected multi-focus regions in medical imaging and incorporation of domain knowledge. Different from conventional feature engineering and deep learning, our DTA framework is able to explicitly quantify inter-region topological relationships, including global-level geometric structure and community-level clusters. K-simplex Community Graph is proposed to construct the dynamic community graph for representing community-level multi-scale graph structure. The constructed dynamic graph is subsequently tracked with a novel Decomposed Persistence algorithm. Domain knowledge is incorporated into the Adaptive Community Profile, summarizing the tracked multi-scale community topology with additional customizable clinically important factors
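
The abstract pairs the non-redundancy criterion with correlation analysis. One common way to realise that pairing is a greedy filter that skips any feature too strongly correlated with one already kept. The sketch below shows that idea only. The thesis's actual procedure is not described in the abstract, so the `drop_redundant` helper, the 0.9 threshold, and the feature names and values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-constant inputs)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.9):
    """Keep features in insertion order, skipping any that correlate
    (in absolute value) at or above the threshold with a kept feature."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical imaging features measured on five patients.
features = {
    "suv_max": [1.0, 2.0, 3.0, 4.0, 5.0],
    "suv_mean": [1.1, 2.1, 2.9, 4.2, 5.0],  # nearly a copy of suv_max
    "lesion_volume": [3.0, 1.0, 4.0, 1.0, 5.0],
}
selected = drop_redundant(features)
```

Here `suv_mean` is dropped because it tracks `suv_max` almost exactly, while `lesion_volume` is kept. The result depends on feature order, which is why such a filter is usually applied after features have been ranked by another criterion.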

    Spatial Positioning of Immune Hotspots Reflects the Interplay between B and T Cells in Lung Squamous Cell Carcinoma

    Beyond tertiary lymphoid structures, a significant number of immune rich areas without germinal center-like structures are observed in non-small cell lung cancer. Here, we integrated transcriptomic data and digital pathology images to study the prognostic implications, spatial locations, and constitution of immune rich areas (immune hotspots) in a cohort of 935 lung cancer patients from the TCGA. A high intratumoral immune hotspot score, which measures the proportion of immune hotspots interfacing with tumor islands, was correlated with poor overall survival in lung squamous cell carcinoma but not in lung adenocarcinoma. Lung squamous cell carcinomas with high intratumoral immune hotspot scores were characterized by consistent upregulation of B cell signatures. Spatial statistical analyses conducted on serial multiplex immunohistochemistry slides further revealed that only 4.87% of peritumoral immune hotspots and 0.26% of intratumoral immune hotspots were tertiary lymphoid structures. Significantly lower densities of CD20+CXCR5+ and CD79b+ B cells and less diverse immune cell interactions were found in intratumoral immune hotspots compared to peritumoral immune hotspots. Furthermore, there was a negative correlation between the percentages of CD8+ T cells and T regulatory cells in intratumoral but not in peritumoral immune hotspots, with tertiary lymphoid structures excluded. These findings suggest that the intratumoral immune hotspots reflect an immunosuppressive niche compared to peritumoral immune hotspots, independent of the distribution of tertiary lymphoid structures. A balance towards increased intratumoral immune hotspots is indicative of a compromised anti-tumor immune response and poor outcome in lung squamous cell carcinoma

    An imaging biomarker of tumor-infiltrating lymphocytes to risk-stratify patients with HPV-associated oropharyngeal cancer

    BACKGROUND: Human papillomavirus (HPV)-associated oropharyngeal squamous cell carcinoma (OPSCC) has excellent control rates compared to nonvirally associated OPSCC. Multiple trials are actively testing whether de-escalation of treatment intensity for these patients can maintain oncologic equipoise while reducing treatment-related toxicity. We have developed OP-TIL, a biomarker that characterizes the spatial interplay between tumor-infiltrating lymphocytes (TILs) and surrounding cells in histology images. Herein, we sought to test whether OP-TIL can segregate stage I HPV-associated OPSCC patients into low-risk and high-risk groups and aid in patient selection for de-escalation clinical trials. METHODS: Association between OP-TIL and patient outcome was explored on whole slide hematoxylin and eosin images from 439 stage I HPV-associated OPSCC patients across 6 institutional cohorts. One institutional cohort (n = 94) was used to identify the most prognostic features and train a Cox regression model to predict risk of recurrence and death. Survival analysis was used to validate the algorithm as a biomarker of recurrence or death in the remaining 5 cohorts (n = 345). All statistical tests were 2-sided. RESULTS: OP-TIL separated stage I HPV-associated OPSCC patients with 30 or less pack-year smoking history into low-risk (2-year disease-free survival [DFS] = 94.2%; 5-year DFS = 88.4%) and high-risk (2-year DFS = 82.5%; 5-year DFS = 74.2%) groups (hazard ratio = 2.56, 95% confidence interval = 1.52 to 4.32; P < .001), even after adjusting for age, smoking status, T and N classification, and treatment modality on multivariate analysis for DFS (hazard ratio = 2.27, 95% confidence interval = 1.32 to 3.94; P = .003). CONCLUSIONS: OP-TIL can identify stage I HPV-associated OPSCC patients likely to be poor candidates for treatment de-escalation. 
Following validation on previously completed multi-institutional clinical trials, OP-TIL has the potential to be a biomarker, beyond clinical stage and HPV status, that can be used clinically to optimize patient selection for de-escalation
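
The hazard ratio, confidence interval, and P value reported in this abstract can be checked against each other with a short calculation. The sketch assumes the interval is a standard 95% Wald interval on the log hazard ratio, which the abstract does not state. Under that assumption the standard error can be recovered from the interval width, and the implied two-sided P value follows from the normal distribution.

```python
import math

# Unadjusted values quoted in the abstract.
hr, ci_low, ci_high = 2.56, 1.52, 4.32

# Assumption: a 95% Wald interval spans 2 * 1.96 standard errors on the log scale.
se_log_hr = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# A Wald interval is symmetric on the log scale, so the geometric mean of the
# bounds should land close to the reported hazard ratio.
midpoint = math.sqrt(ci_low * ci_high)

# z statistic and two-sided P value from the standard normal distribution.
z = math.log(hr) / se_log_hr
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"SE(log HR) = {se_log_hr:.3f}, z = {z:.2f}, p = {p_value:.5f}")
```

Under the stated assumption, the geometric midpoint of the interval sits close to 2.56 and the implied P value falls below .001, which agrees with the reported figures.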

    AI in Medical Imaging Informatics: Current Challenges and Future Directions

    This paper reviews state-of-the-art research solutions across the spectrum of medical imaging informatics, discusses clinical translation, and provides future directions for advancing clinical practice. More specifically, it summarizes advances in medical imaging acquisition technologies for different modalities, highlighting the necessity for efficient medical data management strategies in the context of AI in big healthcare data analytics. It then provides a synopsis of contemporary and emerging algorithmic methods for disease classification and organ/tissue segmentation, focusing on AI and deep learning architectures that have already become the de facto approach. The clinical benefits of in-silico modelling advances linked with evolving 3D reconstruction and visualization applications are further documented. Concluding, integrative analytics approaches driven by associate research branches highlighted in this study promise to revolutionize imaging informatics as known today across the healthcare continuum for both radiology and digital pathology applications. The latter is projected to enable informed, more accurate diagnosis, timely prognosis, and effective treatment planning, underpinning precision medicine