18,325 research outputs found
Joint co-clustering: co-clustering of genomic and clinical bioimaging data
AbstractFor better understanding the genetic mechanisms underlying clinical observations, and better defining a group of potential candidates for protein-family-inhibiting therapy, it is interesting to determine the correlations between genomic, clinical data and data coming from high resolution and fluorescent microscopy. We introduce a computational method, called joint co-clustering, that can find co-clusters or groups of genes, bioimaging parameters and clinical traits that are believed to be closely related to each other based on the given empirical information. As bioimaging parameters, we quantify the expression of growth factor receptor EGFR/erb-B family in non-small cell lung carcinoma (NSCLC) through a fully-automated computer-aided analysis approach. This immunohistochemical analysis is usually performed by pathologists via visual inspection of tissue samples images. Our fully-automated techniques streamlines this error-prone and time-consuming process, thereby facilitating analysis and diagnosis. Experimental results for several real-life datasets demonstrate the high quantitative precision of our approach. The joint co-clustering method was tested with the receptor EGFR/erb-B family data on non-small cell lung carcinoma (NSCLC) tissue and identified statistically significant co-clusters of genes, receptor protein expression and clinical traits. The validation of our results with the literature suggest that the proposed method can provide biologically meaningful co-clusters of genes and traits and that it is a very promising approach to analyse large-scale biological data and to study multi-factorial genetic pathologies through their genetic alterations
INTEGRATIVE ANALYSIS OF OMICS DATA IN ADULT GLIOMA AND OTHER TCGA CANCERS TO GUIDE PRECISION MEDICINE
Transcriptomic profiling and gene expression signatures have been widely applied as effective approaches for enhancing the molecular classification, diagnosis, prognosis or prediction of therapeutic response towards personalized therapy for cancer patients. Thanks to modern genome-wide profiling technology, scientists are able to build engines leveraging massive genomic variations and integrating with clinical data to identify āat riskā individuals for the sake of prevention, diagnosis and therapeutic interventions. In my graduate work for my Ph.D. thesis, I have investigated genomic sequencing data mining to comprehensively characterise molecular classifications and aberrant genomic events associated with clinical prognosis and treatment response, through applying high-dimensional omics genomic data to promote the understanding of gene signatures and somatic molecular alterations contributing to cancer progression and clinical outcomes. Following this motivation, my dissertation has been focused on the following three topics in translational genomics.
1) Characterization of transcriptomic plasticity and its association with the tumor microenvironment in glioblastoma (GBM). I have integrated transcriptomic, genomic, protein and clinical data to increase the accuracy of GBM classification, and identify the association between the GBM mesenchymal subtype and reduced tumorpurity, accompanied with increased presence of tumor-associated microglia. Then I have tackled the sole source of microglial as intrinsic tumor bulk but not their corresponding neurosphere cells through both transcriptional and protein level analysis using a panel of sphere-forming glioma cultures and their parent GBM samples.FurthermoreI have demonstrated my hypothesis through longitudinal analysis of paired primary and recurrent GBM samples that the phenotypic alterations of GBM subtypes are not due to intrinsic proneural-to-mesenchymal transition in tumor cells, rather it is intertwined with increased level of microglia upon disease recurrence. Collectively I have elucidated the critical role of tumor microenvironment (Microglia and macrophages from central nervous system) contributing to the intra-tumor heterogeneity and accurate classification of GBM patients based on transcriptomic profiling, which will not only significantly impact on clinical perspective but also pave the way for preclinical cancer research.
2) Identification of prognostic gene signatures that stratify adult diffuse glioma patientsharboring1p/19q co-deletions. I have compared multiple statistical methods and derived a gene signature significantly associated with survival by applying a machine learning algorithm. Then I have identified inflammatory response and acetylation activity that associated with malignant progression of 1p/19q co-deleted glioma. In addition, I showed this signature translates to other types of adult diffuse glioma, suggesting its universality in the pathobiology of other subset gliomas. My efforts on integrative data analysis of this highly curated data set usingoptimizedstatistical models will reflect the pending update to WHO classification system oftumorsin the central nervous system (CNS).
3) Comprehensive characterization of somatic fusion transcripts in Pan-Cancers. I have identified a panel of novel fusion transcripts across all of TCGA cancer types through transcriptomic profiling. Then I have predicted fusion proteins with kinase activity and hub function of pathway network based on the annotation of genetically mobile domains and functional domain architectures. I have evaluated a panel of in -frame gene fusions as potential driver mutations based on network fusion centrality hypothesis. I have also characterised the emerging complexity of genetic architecture in fusion transcripts through integrating genomic structure and somatic variants and delineating the distinct genomic patterns of fusion events across different cancer types. Overall my exploration of the pathogenetic impact and clinical relevance of candidate gene fusions have provided fundamental insights into the management of a subset of cancer patients by predicting the oncogenic signalling and specific drug targets encoded by these fusion genes.
Taken together, the translational genomic research I have conducted during my Ph.D. study will shed new light on precision medicine and contribute to the cancer research community. The novel classification concept, gene signature and fusion transcripts I have identified will address several hotly debated issues in translational genomics, such as complex interactions between tumor bulks and their adjacent microenvironments, prognostic markers for clinical diagnostics and personalized therapy, distinct patterns of genomic structure alterations and oncogenic events in different cancer types, therefore facilitating our understanding of genomic alterations and moving us towards the development of precision medicine
Recommended from our members
Exploration of PET and MRI radiomic features for decoding breast cancer phenotypes and prognosis.
Radiomics is an emerging technology for imaging biomarker discovery and disease-specific personalized treatment management. This paper aims to determine the benefit of using multi-modality radiomics data from PET and MR images in the characterization breast cancer phenotype and prognosis. Eighty-four features were extracted from PET and MR images of 113 breast cancer patients. Unsupervised clustering based on PET and MRI radiomic features created three subgroups. These derived subgroups were statistically significantly associated with tumor grade (pā=ā2.0āĆā10-6), tumor overall stage (pā=ā0.037), breast cancer subtypes (pā=ā0.0085), and disease recurrence status (pā=ā0.0053). The PET-derived first-order statistics and gray level co-occurrence matrix (GLCM) textural features were discriminative of breast cancer tumor grade, which was confirmed by the results of L2-regularization logistic regression (with repeated nested cross-validation) with an estimated area under the receiver operating characteristic curve (AUC) of 0.76 (95% confidence interval (CI)ā=ā[0.62, 0.83]). The results of ElasticNet logistic regression indicated that PET and MR radiomics distinguished recurrence-free survival, with a mean AUC of 0.75 (95% CIā=ā[0.62, 0.88]) and 0.68 (95% CIā=ā[0.58, 0.81]) for 1 and 2 years, respectively. The MRI-derived GLCM inverse difference moment normalized (IDMN) and the PET-derived GLCM cluster prominence were among the key features in the predictive models for recurrence-free survival. In conclusion, radiomic features from PET and MR images could be helpful in deciphering breast cancer phenotypes and may have potential as imaging biomarkers for prediction of breast cancer recurrence-free survival
Computational purification of individual tumor gene expression profiles leads to significant improvements in prognostic prediction.
Tumor heterogeneity is a limiting factor in cancer treatment and in the discovery of biomarkers to personalize it. We describe a computational purification tool, ISOpure, to directly address the effects of variable normal tissue contamination in clinical tumor specimens. ISOpure uses a set of tumor expression profiles and a panel of healthy tissue expression profiles to generate a purified cancer profile for each tumor sample and an estimate of the proportion of RNA originating from cancerous cells. Applying ISOpure before identifying gene signatures leads to significant improvements in the prediction of prognosis and other clinical variables in lung and prostate cancer
A novel integrative risk index of papillary thyroid cancer progression combining genomic alterations and clinical factors.
Although the majority of papillary thyroid cancer (PTC) is indolent, a subset of PTC behaves aggressively despite the best available treatment. A major clinical challenge is to reliably distinguish early on between those patients who need aggressive treatment from those who do not. Using a large cohort of PTC samples obtained from The Cancer Genome Atlas (TCGA), we analyzed the association between disease progression and multiple forms of genomic data, such as transcriptome, somatic mutations, and somatic copy number alterations, and found that genes related to FOXM1 signaling pathway were significantly associated with PTC progression. Integrative genomic modeling was performed, controlling for demographic and clinical characteristics, which included patient age, gender, TNM stages, histological subtypes, and history of other malignancy, using a leave-one-out elastic net model and 10-fold cross validation. For each subject, the model from the remaining subjects was used to determine the risk index, defined as a linear combination of the clinical and genomic variables from the elastic net model, and the stability of the risk index distribution was assessed through 2,000 bootstrap resampling. We developed a novel approach to combine genomic alterations and patient-related clinical factors that delineates the subset of patients who have more aggressive disease from those whose tumors are indolent and likely will require less aggressive treatment and surveillance (p = 4.62 Ć 10-10, log-rank test). Our results suggest that risk index modeling that combines genomic alterations with current staging systems provides an opportunity for more effective anticipation of disease prognosis and therefore enhanced precision management of PTC
Unconventional machine learning of genome-wide human cancer data
Recent advances in high-throughput genomic technologies coupled with
exponential increases in computer processing and memory have allowed us to
interrogate the complex aberrant molecular underpinnings of human disease from
a genome-wide perspective. While the deluge of genomic information is expected
to increase, a bottleneck in conventional high-performance computing is rapidly
approaching. Inspired in part by recent advances in physical quantum
processors, we evaluated several unconventional machine learning (ML)
strategies on actual human tumor data. Here we show for the first time the
efficacy of multiple annealing-based ML algorithms for classification of
high-dimensional, multi-omics human cancer data from the Cancer Genome Atlas.
To assess algorithm performance, we compared these classifiers to a variety of
standard ML methods. Our results indicate the feasibility of using
annealing-based ML to provide competitive classification of human cancer types
and associated molecular subtypes and superior performance with smaller
training datasets, thus providing compelling empirical evidence for the
potential future application of unconventional computing architectures in the
biomedical sciences
Pancancer analysis of DNA methylation-driven genes using MethylMix.
Aberrant DNA methylation is an important mechanism that contributes to oncogenesis. Yet, few algorithms exist that exploit this vast dataset to identify hypo- and hypermethylated genes in cancer. We developed a novel computational algorithm called MethylMix to identify differentially methylated genes that are also predictive of transcription. We apply MethylMix to 12 individual cancer sites, and additionally combine all cancer sites in a pancancer analysis. We discover pancancer hypo- and hypermethylated genes and identify novel methylation-driven subgroups with clinical implications. MethylMix analysis on combined cancer sites reveals 10 pancancer clusters reflecting new similarities across malignantly transformed tissues
- ā¦