
    Pan-cancer classifications of tumor histological images using deep learning

    Histopathological images are essential for the diagnosis of cancer type and selection of optimal treatment. However, the current clinical process of manual inspection of images is time-consuming and prone to intra- and inter-observer variability. Here we show that key aspects of cancer image analysis can be performed by deep convolutional neural networks (CNNs) across a wide spectrum of cancer types. In particular, we implement CNN architectures based on Google Inception v3 transfer learning to analyze 27,815 H&E slides from 23 cohorts in The Cancer Genome Atlas in studies of tumor/normal status, cancer subtype, and mutation status. For 19 solid cancer types we are able to classify tumor/normal status of whole slide images with extremely high AUCs (0.995±0.008). We are also able to classify cancer subtypes within 10 tissue types with AUC values well above random expectations (micro-average 0.87±0.1). We then perform a cross-classification analysis of tumor/normal status across tumor types. We find that classifiers trained on one type are often effective in distinguishing tumor from normal in other cancer types, with the relationships among classifiers matching known cancer tissue relationships. For the more challenging problem of mutational status, we are able to classify TP53 mutations in three cancer types with AUCs from 0.65 to 0.80 using a fully trained CNN, and with similar cross-classification accuracy across tissues. These studies demonstrate the power of CNNs not only for classifying histopathological images in diverse cancer types, but also for revealing shared biology between tumors. We have made software available at: https://github.com/javadnoorb/HistCNN
    First author draft
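
The AUC values quoted in this abstract can be read as the probability that a randomly chosen tumor slide receives a higher classifier score than a randomly chosen normal slide. The sketch below illustrates that definition only; the function name `auc` and the slide scores are hypothetical and are not taken from the paper or from the HistCNN repository.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count half).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical slide-level tumor probabilities (1 = tumor, 0 = normal).
slide_scores = [0.97, 0.88, 0.35, 0.60, 0.10]
slide_labels = [1, 1, 1, 0, 0]
slide_auc = auc(slide_scores, slide_labels)
print(f"AUC = {slide_auc:.3f}")
```

A value of 0.5 corresponds to random ranking, which is the baseline behind the phrase "well above random expectations".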

    Quantitative Assessment of Tissue Biomarkers and Construction of a Model to Predict Outcome in Breast Cancer Using Multiple Imputation

    Missing data pose one of the greatest challenges in the rigorous evaluation of biomarkers. The limited availability of specimens with complete clinical annotation and quality biomaterial often leads to underpowered studies. Tissue microarray studies, for example, may be further handicapped by the loss of data points because of unevaluable staining, core loss, or the lack of tumor in the histospot. This paper presents a novel approach to these common problems in the context of a tissue protein biomarker analysis in a cohort of patients with breast cancer. Our analysis develops techniques based on multiple imputation to address the missing value problem. We first select markers using a training cohort, identifying a small subset of protein expression levels that are most useful in predicting patient survival. The best model is obtained by including both protein markers (including COX6C, GATA3, NAT1, and ESR1) and lymph node status. The use of either lymph node status or the four protein expression levels provides similar improvements in goodness-of-fit, with both significantly better than a baseline clinical model. Using the same multiple imputation strategy, we then validate the results out-of-sample on a larger independent cohort. Our approach of integrating multiple imputation with each stage of the analysis serves as an example that may be replicated or adapted in future studies with missing values.
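
In multiple imputation, each imputed dataset yields its own estimate, and the estimates are then combined. A common way to combine them is Rubin's rules, sketched below. The abstract does not state which pooling method the authors used, so this is an illustration of the general technique rather than their procedure; the function name `pool_rubin` and the numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the squared standard error of each estimate
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = mean(estimates)        # average of the point estimates
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log hazard ratios for one marker from three imputed datasets.
pooled_est, total_var = pool_rubin([0.40, 0.50, 0.60], [0.01, 0.01, 0.01])
print(f"pooled estimate = {pooled_est:.3f}, total variance = {total_var:.4f}")
```

The between-imputation term is what carries the extra uncertainty caused by the missing values into the final standard error.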

    The Association between Optimism and Serum Antioxidants in the Midlife in the United States Study

    Objective: Psychological and physical health are often conceptualized as the absence of disease, but less research addresses positive psychological and physical functioning. For example, optimism has been linked with reduced disease risk and biological dysfunction, but very little research has examined associations with markers of healthy biological functioning. Thus, we investigated the association between two indicators of positive health: optimism and serum antioxidants. Methods: The cross-sectional association between optimism and antioxidant concentrations was examined in 982 men and women from the Midlife in the United States study. Primary measures included self-reported optimism (assessed with the revised Life Orientation Test) and serum concentrations of nine different antioxidants (carotenoids and Vitamin E). Regression analyses examined the relationship between optimism and antioxidant concentrations in models adjusted for demographics, health status, and health behaviors. Results: For every standard deviation increase in optimism, carotenoid concentrations increased by 3–13% in age-adjusted models. Controlling for demographic characteristics and health status attenuated this association. Fruit and vegetable consumption and smoking status were identified as potential pathways underlying the association between optimism and serum carotenoids. Optimism was not significantly associated with Vitamin E. Conclusions: Optimism was associated with greater carotenoid concentrations and this association was partially explained by diet and smoking status. The direction of effects cannot be conclusively determined. Effects may be bidirectional given that optimists are likely to engage in health behaviors associated with more serum antioxidants, and more serum antioxidants are likely associated with better physical health that enhances optimism.

    Metatranscriptome of human faecal microbial communities in a cohort of adult men

    The gut microbiome is intimately related to human health, but it is not yet known which functional activities are driven by specific microorganisms' ecological configurations or transcription. We report a large-scale investigation of 372 human faecal metatranscriptomes and 929 metagenomes from a subset of 308 men in the Health Professionals Follow-Up Study. We identified a metatranscriptomic 'core' universally transcribed over time and across participants, often by different microorganisms. In contrast to the housekeeping functions enriched in this core, a 'variable' metatranscriptome included specialized pathways that were differentially expressed both across participants and among microorganisms. Finally, longitudinal metagenomic profiles allowed ecological interaction network reconstruction, which remained stable over the six-month timespan, as did strain tracking within and between participants. These results provide an initial characterization of human faecal microbial ecology into core, subject-specific, microorganism-specific and temporally variable transcription, and they differentiate metagenomically versus metatranscriptomically informative aspects of the human faecal microbiome.

    Deep learning-based cross-classifications reveal conserved spatial behaviors within tumor histological images

    Histopathological images are a rich but incompletely explored data type for studying cancer. Manual inspection is time-consuming, making it challenging to use for image data mining. Here we show that convolutional neural networks (CNNs) can be systematically applied across cancer types, enabling comparisons to reveal shared spatial behaviors. We develop CNN architectures to analyze 27,815 hematoxylin and eosin scanned images from The Cancer Genome Atlas for tumor/normal, cancer subtype, and mutation classification. Our CNNs are able to classify TCGA pathologist-annotated tumor/normal status of whole slide images (WSIs) in 19 cancer types with consistently high AUCs (0.995 ± 0.008), as well as subtypes with lower but significant accuracy (AUC 0.87 ± 0.1). Remarkably, tumor/normal CNNs trained on one tissue are effective in others (AUC 0.88 ± 0.11), with classifier relationships also recapitulating known adenocarcinoma, carcinoma, and developmental biology. Moreover, classifier comparisons reveal intra-slide spatial similarities, with an average tile-level correlation of 0.45 ± 0.16 between classifier pairs. Breast cancers, bladder cancers, and uterine cancers have spatial patterns that are particularly easy to detect, suggesting these cancers can be canonical types for image analysis. Patterns for TP53 mutations can also be detected, with WSI self- and cross-tissue AUCs ranging from 0.65 to 0.80. Finally, we comparatively evaluate CNNs on 170 breast and colon cancer images with pathologist-annotated nuclei, finding that both cellular and intercellular regions contribute to CNN accuracy. These results demonstrate the power of CNNs not only for histopathological classification, but also for cross-comparisons to reveal conserved spatial behaviors across tumors.
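
The tile-level correlation between classifier pairs mentioned above can be pictured as follows: two classifiers score the same tiles of one slide, and the two score vectors are correlated. The sketch assumes a Pearson correlation, which the abstract does not specify; the function name `pearson` and the tile scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length score vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (norm_x * norm_y)


# Hypothetical tumor probabilities assigned to the same six tiles of one
# slide by two classifiers trained on different cancer types.
tiles_classifier_a = [0.91, 0.85, 0.20, 0.15, 0.70, 0.40]
tiles_classifier_b = [0.80, 0.60, 0.35, 0.30, 0.75, 0.20]
tile_corr = pearson(tiles_classifier_a, tiles_classifier_b)
print(f"tile-level correlation = {tile_corr:.2f}")
```

A high correlation means the two classifiers flag the same regions of the slide, which is the sense in which spatial behavior is shared across tissues.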

    Stability of the human faecal microbiome in a cohort of adult men

    Characterizing the stability of the gut microbiome is important to exploit it as a therapeutic target and diagnostic biomarker. We metagenomically and metatranscriptomically sequenced the faecal microbiomes of 308 participants in the Health Professionals Follow-Up Study. Participants provided four stool samples: one pair collected 24–72 h apart and a second pair ~6 months later. Within-person taxonomic and functional variation was consistently lower than between-person variation over time. In contrast, metatranscriptomic profiles were comparably variable within and between subjects due to higher within-subject longitudinal variation. Metagenomic instability accounted for ~74% of corresponding metatranscriptomic instability. The rest was probably attributable to sources such as regulation. Among the pathways that were differentially regulated, most were consistently over- or under-transcribed at each time point. Together, these results suggest that a single measurement of the faecal microbiome can provide long-term information regarding organismal composition and functional potential, but repeated or short-term measures may be necessary for dynamic features identified by metatranscriptomics.