33 research outputs found

    Estimating Shared Copy Number Aberrations for Array CGH Data: The Linear-Median Method

    Motivation: Existing methods for estimating copy number variations in array comparative genomic hybridization (aCGH) data are limited to estimations of the gain/loss of chromosome regions for single sample analysis. We propose the linear-median method for estimating shared copy numbers in DNA sequences across multiple samples, demonstrate its operating characteristics through simulations and applications to real cancer data, and compare it to two existing methods. Results: Our proposed linear-median method has the power to estimate common changes that appear at isolated single probe positions or very short regions. Such changes are hard to detect by current methods. This new method shows a higher rate of true positives and a lower rate of false positives. The linear-median method is non-parametric and hence is more robust in estimating copy number. Additionally, the linear-median method is easily computable for practical aCGH data sets compared to other copy number estimation methods. Supplementary Information: Supporting materials are available at Cancer Informatics online

    Mir-21-Sox2 Axis Delineates Glioblastoma Subtypes with Prognostic Impact.

    Glioblastoma (GBM) is the most aggressive human brain tumor. Although several molecular subtypes of GBM are recognized, a robust molecular prognostic marker has yet to be identified. Here, we report that the stemness regulator Sox2 is a new, clinically important target of microRNA-21 (miR-21) in GBM, with implications for prognosis. Using the miR-21-Sox2 regulatory axis, approximately half of all GBM tumors present in the Cancer Genome Atlas (TCGA) and in-house patient databases can be mathematically classified into high miR-21/low Sox2 (Class A) or low miR-21/high Sox2 (Class B) subtypes. This classification reflects phenotypically and molecularly distinct characteristics and is not captured by existing classifications. Supporting the distinct nature of the subtypes, gene set enrichment analysis of the TCGA dataset predicted that Class A and Class B tumors were significantly involved in immune/inflammatory response and in chromosome organization and nervous system development, respectively. Patients with Class B tumors had longer overall survival than those with Class A tumors. Analysis of both databases indicated that the Class A/Class B classification is a better predictor of patient survival than currently used parameters. Further, manipulation of miR-21-Sox2 levels in orthotopic mouse models supported the longer survival of the Class B subtype. The miR-21-Sox2 association was also found in mouse neural stem cells and in the mouse brain at different developmental stages, suggesting a role in normal development. Therefore, this mechanism-based classification suggests the presence of two distinct populations of GBM patients with distinguishable phenotypic characteristics and clinical outcomes. SIGNIFICANCE STATEMENT: Molecular profiling-based classification of glioblastoma (GBM) into four subtypes has substantially increased our understanding of the biology of the disease and has pointed to the heterogeneous nature of GBM. However, this classification is not mechanism based and its prognostic value is limited. Here, we identify a new mechanism in GBM (the miR-21-Sox2 axis) that can classify ∼50% of patients into two subtypes with distinct molecular, radiological, and pathological characteristics. Importantly, this classification can predict patient survival better than the currently used parameters. Further, analysis of the miR-21-Sox2 relationship in mouse neural stem cells and in the mouse brain at different developmental stages indicates that miR-21 and Sox2 are predominantly expressed in mutually exclusive patterns, suggesting a role in normal neural development

    Novel algorithmic approach predicts tumor mutation load and correlates with immunotherapy clinical outcomes using a defined gene mutation set

    BACKGROUND: While clinical outcomes following immunotherapy have shown an association with tumor mutation load using whole exome sequencing (WES), its clinical applicability is currently limited by cost and bioinformatics requirements. METHODS: We developed a method to accurately derive the predicted total mutation load (PTML) within individual tumors from a small set of genes that can be used in clinical next generation sequencing (NGS) panels. PTML was derived from the actual total mutation load (ATML) of 575 distinct melanoma and lung cancer samples and validated using independent melanoma (n = 312) and lung cancer (n = 217) cohorts. The correlation of PTML status with clinical outcome, following distinct immunotherapies, was assessed using the Kaplan–Meier method. RESULTS: PTML (derived from 170 genes) was highly correlated with ATML in cutaneous melanoma and lung adenocarcinoma validation cohorts (R² = 0.73 and R² = 0.82, respectively). PTML was strongly associated with clinical outcome in melanoma following ipilimumab (anti-CTLA-4, three cohorts) and adoptive T-cell therapy (one cohort). Clinical benefit from pembrolizumab (anti-PD-1) in lung cancer was also shown to significantly correlate with PTML status (log-rank P value < 0.05 in all cohorts). CONCLUSIONS: The approach of using small NGS gene panels, already applied to guide employment of targeted therapies, may have utility in the personalized use of immunotherapy in cancer. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12916-016-0705-4) contains supplementary material, which is available to authorized users

    Genetic Association Studies of Copy-Number Variation: Should Assignment of Copy Number States Precede Testing?

    Recently, structural variation in the genome has been implicated in many complex diseases. Using genomewide single nucleotide polymorphism (SNP) arrays, researchers are able to investigate the impact not only of SNP variation, but also of copy-number variants (CNVs) on the phenotype. The most common analytic approach involves estimating, at the level of the individual genome, the underlying number of copies present at each location. Once this is completed, tests are performed to determine the association between copy number state and phenotype. An alternative approach is to carry out association testing first, between phenotype and raw intensities from the SNP array at the level of the individual marker, and then aggregate neighboring test results to identify CNVs associated with the phenotype. Here, we explore the strengths and weaknesses of these two approaches using both simulations and real data from a pharmacogenomic study of the chemotherapeutic agent gemcitabine. Our results indicate that pooled marker-level testing is capable of offering a dramatic increase in power (-fold) over CNV-level testing, particularly for small CNVs. However, CNV-level testing is superior when CNVs are large and rare; understanding these tradeoffs is an important consideration in conducting association studies of structural variation

    Bayesian Analysis of Curves Shape Variation Through Registration and Regression

    This manuscript reviews the use of Bayesian hierarchical curve registration in Biostatistics and Bioinformatics. Several models allowing for unit-specific random time scales are discussed and applied to longitudinal data arising in biomedicine, pharmacokinetics and time-course genomics. We consider representations of random functionals based on P-spline priors. Under this framework, straightforward posterior simulation strategies are outlined for inference. Beyond curve registration, we discuss joint regression modeling of both random effects and population level functional quantities. Finally, the use of mixture priors is discussed in the setting of differential expression analysis

    Supplementary material for Estimating Copy Numbers for Shared Array CGH Data: the Linear-Median Method

    Supplementary material for Estimating Copy Numbers for Shared Array CGH Data: the Linear-Median Method

    Integrative Bayesian Network Analysis of Genomic Data


    Integrative network-based Bayesian analysis of diverse genomics data

    Background: In order to better understand cancer as a complex disease with multiple genetic and epigenetic factors, it is vital to model the fundamental biological relationships among these alterations as well as their relationships with important clinical outcomes. Methods: We develop an integrative network-based Bayesian analysis (iNET) approach that allows us to jointly analyze multi-platform high-dimensional genomic data in a computationally efficient manner. The iNET approach is formulated as an objective Bayesian model selection problem for Gaussian graphical models to model joint dependencies among platform-specific features using known biological mechanisms. Using both simulated datasets and a glioblastoma (GBM) study from The Cancer Genome Atlas (TCGA), we illustrate the iNET approach via integrating three data types: microRNA, gene expression (mRNA), and patient survival time. Results: We show that the iNET approach has greater power in identifying cancer-related microRNAs than non-integrative approaches based on realistic simulated datasets. In the TCGA GBM study, we found many mRNA-microRNA pairs and microRNAs that are associated with patient survival time, with some of these associations identified in previous studies. Conclusions: The iNET discovers relationships consistent with the underlying biological mechanisms among these variables, as well as identifying important biomarkers that are potentially relevant to patient survival. In addition, we identified some microRNAs that can potentially affect patient survival which are missed by non-integrative approaches.