36 research outputs found

    A Novel Statistical Method to Diagnose, Quantify and Correct Batch Effects in Genomic Studies.

    Genome projects now generate large-scale data often produced at various time points by different laboratories using multiple platforms. This increases the potential for batch effects. Current batch evaluation methods, such as principal component analysis (PCA; mostly based on visual inspection), sometimes fail to reveal all of the underlying batch effects. These methods also carry the risk of unintentionally correcting biologically interesting factors that are mistakenly attributed to batch effects. Here we propose a novel statistical method, finding batch effect (findBATCH), to evaluate batch effect based on probabilistic principal component and covariates analysis (PPCCA). The same framework also provides a new approach to batch correction, correcting batch effect (correctBATCH), which we have shown to be a better approach than traditional PCA-based correction. We demonstrate the utility of these methods using two different examples (breast and colorectal cancers) by merging gene expression data from different studies after diagnosing and correcting for batch effects and retaining the biological effects. These methods, along with conventional visual inspection-based PCA, are available as a part of an R package exploring batch effect (exploBATCH; https://github.com/syspremed/exploBATCH).
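
The conventional PCA-based check that the abstract contrasts with findBATCH can be illustrated with a small sketch. This is not the exploBATCH API and not the PPCCA method; it is a plain-Python illustration, on simulated data, of asking whether the first principal component tracks a batch label. The function names, the simulated shift of 3.0 and the sample sizes are all assumptions made for the example.

```python
import random


def first_pc_scores(rows, iters=200, seed=0):
    """Score each sample on the first principal component (power iteration)."""
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    # Centre every feature (column) at zero.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    X = [[r[j] - means[j] for j in range(p)] for r in rows]
    v = [rng.gauss(0, 1) for _ in range(p)]  # random start direction
    for _ in range(iters):
        # w = X^T (X v), which is proportional to (covariance matrix) * v
        s = [sum(xi[j] * v[j] for j in range(p)) for xi in X]
        w = [sum(X[i][j] * s[i] for i in range(n)) for j in range(p)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return [sum(xi[j] * v[j] for j in range(p)) for xi in X]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


# Simulated data (hypothetical): 20 samples x 10 genes, two batches,
# with every gene in batch 1 shifted upwards by 3.0.
rng = random.Random(42)
batch = [0] * 10 + [1] * 10
data = [[rng.gauss(0, 1) + 3.0 * b for _ in range(10)] for b in batch]

scores = first_pc_scores(data)
r = pearson(scores, [float(b) for b in batch])
print(f"|correlation of PC1 with batch| = {abs(r):.2f}")
```

A strong correlation between a leading component and the batch label is only a hint of a batch effect; as the abstract notes, this kind of inspection can miss effects or confuse them with biology, which is the gap a formal test such as findBATCH is meant to address.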

    Heterocellular gene signatures reveal luminal-A breast cancer heterogeneity and differential therapeutic responses.

    Breast cancer is a highly heterogeneous disease. Although differences between intrinsic breast cancer subtypes have been well studied, heterogeneity within each subtype, especially luminal-A cancers, requires further interrogation to personalize disease management. Here, we applied well-characterized and cancer-associated heterocellular signatures representing stem, mesenchymal, stromal, immune, and epithelial cell types to breast cancer. This analysis stratified the luminal-A breast cancer samples into five subtypes with a majority of them enriched for a subtype (stem-like) that has increased stem and stromal cell gene signatures, representing potential luminal progenitor origin. The enrichment of immune checkpoint genes and other immune cell types in two (including stem-like) of the five heterocellular subtypes of luminal-A tumors suggests their potential response to immunotherapy. These immune-enriched subtypes of luminal-A tumors (containing only estrogen receptor positive samples) showed good or intermediate prognosis along with the two other differentiated subtypes as assessed using recurrence-free and distant metastasis-free patient survival outcomes. On the other hand, a partially differentiated subtype of luminal-A breast cancer with transit-amplifying colon-crypt characteristics showed poor prognosis. Furthermore, published luminal-A subtypes associated with specific somatic copy number alterations and mutations shared similar cellular and mutational characteristics to colorectal cancer subtypes where the heterocellular signatures were derived. These heterocellular subtypes reveal transcriptome and cell-type based heterogeneity of luminal-A and other breast cancer subtypes that may be useful for additional understanding of the cancer type and potential patient stratification and personalized medicine.

    Predicting Patterns of Customer Usage, with Niftecash

    This report is the result of work carried out during the 93rd European Study Group with Industry in Limerick.

    Analytical Validation of Multiplex Biomarker Assay to Stratify Colorectal Cancer into Molecular Subtypes.

    Previously, we classified colorectal cancers (CRCs) into five CRCAssigner (CRCA) subtypes with different prognoses and potential treatment responses, later consolidated into four consensus molecular subtypes (CMS). Here we demonstrate the analytical development and validation of a custom NanoString nCounter platform-based biomarker assay (NanoCRCA) to stratify CRCs into subtypes. To reduce costs, we switched from the standard nCounter protocol to a custom modified protocol. The assay included a reduced 38-gene panel that was selected using an in-house machine-learning pipeline. We applied NanoCRCA to 413 samples from 355 CRC patients. From the fresh frozen samples (n = 237), a subset had matched microarray/RNAseq profiles (n = 47) or formalin-fixed paraffin-embedded (FFPE) samples (n = 58). We also analyzed a further 118 FFPE samples. We compared the assay results with the CMS classifier, different platforms (microarrays/RNAseq) and gene-set classifiers (38 and the original 786 genes). The standard and modified protocols showed high correlation (> 0.88) for gene expression. Technical replicates were highly correlated (> 0.96). NanoCRCA classified fresh frozen and FFPE samples into all five CRCA subtypes with consistent classification of selected matched fresh frozen/FFPE samples. We demonstrate high and significant subtype concordance across protocols (100%), gene sets (95%), platforms (87%) and with CMS subtypes (75%) when evaluated across multiple datasets. Overall, our NanoCRCA assay with further validation may facilitate prospective validation of CRC subtypes in clinical trials and beyond
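
The subtype concordance figures quoted above are, at their simplest, the share of samples that receive the same call from two sources. The abstract does not say exactly how concordance was computed, so the following is a minimal sketch under that assumption; the function name, the placeholder labels `S1` to `S5` and the sample calls are hypothetical and are not data from the study.

```python
def percent_concordance(calls_a, calls_b):
    """Percentage of samples given the same subtype label by two sources."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both call lists must cover the same samples")
    if not calls_a:
        raise ValueError("at least one sample is required")
    agree = sum(a == b for a, b in zip(calls_a, calls_b))
    return 100.0 * agree / len(calls_a)


# Hypothetical calls for ten matched samples (placeholder subtype labels).
fresh_frozen = ["S1", "S2", "S3", "S4", "S5", "S1", "S2", "S3", "S4", "S5"]
ffpe = ["S1", "S2", "S3", "S4", "S5", "S1", "S2", "S3", "S4", "S1"]

concordance = percent_concordance(fresh_frozen, ffpe)
print(f"concordance = {concordance:.1f}%")
```

Raw agreement does not correct for matches expected by chance, so a chance-adjusted statistic may be preferable when subtype frequencies are very uneven.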

    Uridine-derived ribose fuels glucose-restricted pancreatic cancer.

    Pancreatic ductal adenocarcinoma (PDA) is a lethal disease notoriously resistant to therapy. This is mediated in part by a complex tumour microenvironment, low vascularity, and metabolic aberrations. Although altered metabolism drives tumour progression, the spectrum of metabolites used as nutrients by PDA remains largely unknown. Here we identified uridine as a fuel for PDA in glucose-deprived conditions by assessing how more than 175 metabolites impacted metabolic activity in 21 pancreatic cell lines under nutrient restriction. Uridine utilization strongly correlated with the expression of uridine phosphorylase 1 (UPP1), which we demonstrate liberates uridine-derived ribose to fuel central carbon metabolism and thereby support redox balance, survival and proliferation in glucose-restricted PDA cells. In PDA, UPP1 is regulated by KRAS-MAPK signalling and is augmented by nutrient restriction. Consistently, tumours expressed high UPP1 compared with non-tumoural tissues, and UPP1 expression correlated with poor survival in cohorts of patients with PDA. Uridine is available in the tumour microenvironment, and we demonstrated that uridine-derived ribose is actively catabolized in tumours. Finally, UPP1 deletion restricted the ability of PDA cells to use uridine and blunted tumour growth in immunocompetent mouse models. Our data identify uridine utilization as an important compensatory metabolic process in nutrient-deprived PDA cells, suggesting a novel metabolic axis for PDA therapy.

    A seven-Gene Signature assay improves prognostic risk stratification of perioperative chemotherapy treated gastroesophageal cancer patients from the MAGIC trial

    BACKGROUND: Following neoadjuvant chemotherapy for operable gastroesophageal cancer, lymph node metastasis is the only validated prognostic variable; however, within lymph node groups there is still heterogeneity with risk of relapse. We hypothesized that gene profiles from neoadjuvant chemotherapy treated resection specimens from gastroesophageal cancer patients can be used to define prognostic risk groups to identify patients at risk for relapse. PATIENTS AND METHODS: The Medical Research Council Adjuvant Gastric Infusional Chemotherapy (MAGIC) trial (n = 202 with high quality RNA) samples treated with perioperative chemotherapy were profiled for a custom gastric cancer gene panel using the NanoString platform. Genes associated with overall survival (OS) were identified using penalized and standard Cox regression, followed by generation of risk scores and development of a NanoString biomarker assay to stratify patients into risk groups associated with OS. An independent dataset served as a validation cohort. RESULTS: Regression and clustering analysis of MAGIC patients defined a seven-Gene Signature and two risk groups with different OS [hazard ratio (HR) 5.1; P < 0.0001]. The median OS of high- and low-risk groups were 10.2 [95% confidence interval (CI) of 6.5 and 13.2 months] and 80.9 months (CI: 43.0 months and not assessable), respectively. Risk groups were independently prognostic of lymph node metastasis by multivariate analysis (HR 3.6 in node positive group, P = 0.02; HR 3.6 in high-risk group, P = 0.0002), and not prognostic in surgery only patients (n = 118; log rank P = 0.2). A validation cohort independently confirmed these findings. CONCLUSIONS: These results suggest that gene-based risk groups can independently predict prognosis in gastroesophageal cancer patients treated with neoadjuvant chemotherapy. This signature and associated assay may help risk stratify these patients for post-surgery chemotherapy in future perioperative chemotherapy-based clinical trials.
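
A risk score built from Cox regression is commonly the linear predictor: each gene's expression multiplied by its fitted coefficient, summed per patient. The sketch below assumes that form and assumes a median cut to form two groups; the abstract reports regression and clustering but does not give the genes, coefficients or cut-off, so the three gene names, the coefficients and the expression values here are hypothetical placeholders.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Linear-predictor risk score: sum of coefficient * expression per patient."""
    return [
        sum(coefficients[gene] * sample[gene] for gene in coefficients)
        for sample in expression
    ]


def split_by_median(scores):
    """Label scores above the median 'high' and the rest 'low' (assumed cut-off)."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"geneA": 0.8, "geneB": -0.5, "geneC": 0.3}
patients = [
    {"geneA": 2.0, "geneB": 1.0, "geneC": 0.0},
    {"geneA": 0.5, "geneB": 2.0, "geneC": 1.0},
    {"geneA": 1.0, "geneB": 0.0, "geneC": 2.0},
    {"geneA": 0.0, "geneB": 1.0, "geneC": 1.0},
]

scores = risk_scores(patients, coefficients)
groups = split_by_median(scores)
print(list(zip([round(s, 2) for s in scores], groups)))
```

In practice the coefficients would come from a fitted survival model and the grouping would need validation in an independent cohort, as the abstract describes.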

    A Machine-Learning Tool Concurrently Models Single Omics and Phenome Data for Functional Subtyping and Personalized Cancer Medicine.

    One of the major challenges in defining clinically-relevant and less heterogeneous tumor subtypes is assigning biological and/or clinical interpretations to etiological (intrinsic) subtypes. Conventional clustering/subtyping approaches often fail to define such subtypes, as they involve several discrete steps. Here we demonstrate a unique machine-learning method, phenotype mapping (PhenMap), which jointly integrates single omics data with phenotypic information using three published breast cancer datasets (n = 2045). The PhenMap framework uses a modified factor analysis method governed by the key assumption that features from different omics data types are correlated due to specific "hidden/mapping" variables (context-specific mapping variables (CMV)). These variables can be simultaneously modeled with phenotypic data as covariates to yield functional subtypes and their associated features (e.g., genes) and phenotypes. In one example, we demonstrate the identification and validation of six novel "functional" (discrete) subtypes with differential responses to a cyclin-dependent kinase (CDK)4/6 inhibitor and etoposide by jointly integrating transcriptome profiles with four different drug response data from 37 breast cancer cell lines. These robust subtypes are also present in patient breast tumors with different prognosis. In another example, we modeled patient gene expression profiles and clinical covariates together to identify continuous subtypes with clinical/biological implications. Overall, this genome-phenome machine-learning integration tool, PhenMap, identifies functional and phenotype-integrated discrete or continuous subtypes with clinical translational potential.