Matrix and tensor comparisons of genomic profiles to predict cancer survival and drug targets

Abstract

Despite recent large-scale profiling efforts, the best predictor of a glioblastoma (GBM) brain cancer patient's survival remains the patient's age at diagnosis. The best predictor of an ovarian serous cystadenocarcinoma (OV) patient's survival remains the tumor's stage, an assessment, numbered I to IV, of the spread of the cancer. To identify DNA copy-number alterations (CNAs) that might predict GBM or OV patients' survival, we comparatively modeled matched genomic profiles from The Cancer Genome Atlas (TCGA). Generalized singular value decomposition (GSVD) of patient-matched but probe-independent GBM and normal profiles uncovered a previously unknown global pattern of tumor-exclusive co-occurring CNAs that is correlated with, and possibly causally related to, GBM patients' survival and response to chemotherapy. This suggests that the GBM survival phenotype is an outcome of its global genotype. The GSVD, formulated as a framework for comparatively modeling two composite datasets, removes from the pattern variations that occur in the normal human genome (e.g., female-specific X chromosome amplification) as well as experimental variations, without a priori knowledge of these variations. The pattern is independent of age and, combined with age, makes a better predictor than age alone. The pattern suggests previously unrecognized targets for personalized GBM drug therapy, the kinase TLK2 and the methyltransferase METTL2A. A novel tensor GSVD of patient- and platform-matched OV and normal genomic profiles revealed multiple chromosome arm-wide patterns of CNAs that are correlated with OV patients' survival. These indicate several previously unrecognized subtypes of OV. The tensor GSVD is an exact simultaneous decomposition of two high-dimensional datasets arranged in higher-order tensors. The tensor GSVD generalizes the GSVD, which is limited to two second-order tensors, i.e., matrices. The chromosome arm-wide patterns of CNAs are independent of the OV tumor stage. Combined with stage, each of the patterns makes a better predictor than stage alone. We conclude that the GSVD and the novel tensor GSVD can uncover the relations, and possibly causal coordinations, between different recorded aspects of the same medical phenomenon. GSVD and tensor GSVD comparisons can be used to determine one patient's medical status in relation to other patients in a set, and to inform the patient's prognosis and possibly also treatment.
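
As an illustration of the matrix GSVD described in the abstract, the following is a minimal sketch, not the dissertation's implementation. It assumes NumPy, two matrices whose columns are matched (e.g., the same patients) while their rows may differ (e.g., different probes), at least as many rows as columns in each matrix, and full column rank. The function name `gsvd` and the random input data are hypothetical. The sketch uses one standard construction, a QR factorization of the stacked matrices followed by an SVD, to obtain D1 = U1 diag(s1) V^T and D2 = U2 diag(s2) V^T with a shared right basis V^T.

```python
import numpy as np


def gsvd(d1, d2):
    """Sketch of a GSVD for two matrices with the same number of columns.

    Returns (u1, s1, u2, s2, vt) such that
        d1 ≈ u1 @ np.diag(s1) @ vt
        d2 ≈ u2 @ np.diag(s2) @ vt
    Assumes each matrix has at least as many rows as columns, the stacked
    matrix has full column rank, and no generalized singular value is zero.
    """
    n1 = d1.shape[0]
    # Reduced QR of the stacked matrix; q has orthonormal columns.
    q, r = np.linalg.qr(np.vstack([d1, d2]))
    q1, q2 = q[:n1], q[n1:]
    # SVD of the top block: q1 = u1 @ diag(s1) @ wt.
    u1, s1, wt = np.linalg.svd(q1, full_matrices=False)
    # Because q1.T @ q1 + q2.T @ q2 = I, the columns of q2 @ wt.T are
    # mutually orthogonal with norms s2 = sqrt(1 - s1**2).
    q2w = q2 @ wt.T
    s2 = np.linalg.norm(q2w, axis=0)
    u2 = q2w / s2
    # Shared right basis vectors (rows of vt); not orthogonal in general.
    vt = wt @ r
    return u1, s1, u2, s2, vt


# Synthetic stand-ins for a tumor and a normal dataset (random numbers only).
rng = np.random.default_rng(0)
tumor = rng.standard_normal((50, 6))   # 50 hypothetical probes x 6 patients
normal = rng.standard_normal((40, 6))  # 40 hypothetical probes x 6 patients
u1, s1, u2, s2, vt = gsvd(tumor, normal)
ratio = s1 / s2  # large values mark patterns weighted toward the first dataset
```

In this sketch each row of `vt` is a pattern across the matched columns, and the ratio `s1 / s2` indicates how strongly that pattern is expressed in the first dataset relative to the second. A pattern with a large ratio is close to exclusive to the first dataset, which is the sense in which a comparison of this kind can separate tumor-exclusive variation from variation shared with the normal profiles.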
