12,797 research outputs found

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

    EPMA position paper in cancer: current overview and future perspectives

    At present, a radical shift in cancer treatment is occurring in terms of predictive, preventive, and personalized medicine (PPPM). Individual patients will participate in more aspects of their healthcare. During the development of PPPM, many rapid, specific, and sensitive new methods for earlier detection of cancer will result in more efficient management of the patient and hence a better quality of life. Coordination of the various activities among different healthcare professionals in primary, secondary, and tertiary care requires well-defined competencies, implementation of training and educational programs, sharing of data, and harmonized guidelines. In this position paper, the current knowledge to understand cancer predisposition and risk factors, the cellular biology of cancer, predictive markers and treatment outcome, the improvement in technologies in screening and diagnosis, and provision of better drug development solutions are discussed in the context of a better implementation of personalized medicine. Recognition of the major risk factors for cancer initiation is the key for preventive strategies (EPMA J. 4(1):6, 2013). Of interest, cancer predisposing syndromes, in particular the monogenic subtypes that lead to cancer progression, are well defined, and one should focus on implementation strategies to identify individuals at risk to allow preventive measures and early screening/diagnosis. Implementation of such measures is hindered by improper use of the data, with breach of data protection as one of the risks to be heavily controlled. Population screening requires in-depth cost-benefit analysis to justify healthcare costs, and the parameters screened should provide information that allows an actionable and deliverable solution, for better healthcare provision.

    Identification of Topological Features in Renal Tumor Microenvironment Associated with Patient Survival

    Motivation: As a highly heterogeneous disease, the progression of a tumor is not only achieved by unlimited growth of the tumor cells, but also supported, stimulated, and nurtured by the microenvironment around it. However, traditional qualitative and/or semi-quantitative parameters obtained by a pathologist’s visual examination have very limited capability to capture this interaction between a tumor and its microenvironment. With the advent of digital pathology, computerized image analysis may provide a better tumor characterization and give new insights into this problem. Results: We propose a novel bioimage informatics pipeline for automatically characterizing the topological organization of different cell patterns in the tumor microenvironment. We apply this pipeline to the only publicly available large histopathology image dataset for a cohort of 190 patients with papillary renal cell carcinoma obtained from The Cancer Genome Atlas project. Experimental results show that the proposed topological features can successfully stratify early- and middle-stage patients with distinct survival, and show superior performance to traditional clinical features and cellular morphological and intensity features. The proposed features not only provide new insights into the topological organizations of cancers, but also can be integrated with genomic data in future studies to develop new integrative biomarkers.

    INTEGRATIVE ANALYSIS OF OMICS DATA IN ADULT GLIOMA AND OTHER TCGA CANCERS TO GUIDE PRECISION MEDICINE

    Transcriptomic profiling and gene expression signatures have been widely applied as effective approaches for enhancing the molecular classification, diagnosis, prognosis or prediction of therapeutic response towards personalized therapy for cancer patients. Thanks to modern genome-wide profiling technology, scientists are able to build engines leveraging massive genomic variations and integrating with clinical data to identify “at risk” individuals for the sake of prevention, diagnosis and therapeutic interventions. In my graduate work for my Ph.D. thesis, I have investigated genomic sequencing data mining to comprehensively characterise molecular classifications and aberrant genomic events associated with clinical prognosis and treatment response, through applying high-dimensional omics genomic data to promote the understanding of gene signatures and somatic molecular alterations contributing to cancer progression and clinical outcomes. Following this motivation, my dissertation has been focused on the following three topics in translational genomics. 1) Characterization of transcriptomic plasticity and its association with the tumor microenvironment in glioblastoma (GBM). I have integrated transcriptomic, genomic, protein and clinical data to increase the accuracy of GBM classification, and identify the association between the GBM mesenchymal subtype and reduced tumor purity, accompanied with increased presence of tumor-associated microglia. Then I have tackled the sole source of microglia as the intrinsic tumor bulk but not their corresponding neurosphere cells through both transcriptional and protein level analysis using a panel of sphere-forming glioma cultures and their parent GBM samples. Furthermore, I have demonstrated my hypothesis through longitudinal analysis of paired primary and recurrent GBM samples that the phenotypic alterations of GBM subtypes are not due to intrinsic proneural-to-mesenchymal transition in tumor cells; rather, they are intertwined with an increased level of microglia upon disease recurrence. Collectively, I have elucidated the critical role of the tumor microenvironment (microglia and macrophages from the central nervous system) contributing to the intra-tumor heterogeneity and accurate classification of GBM patients based on transcriptomic profiling, which will not only significantly impact the clinical perspective but also pave the way for preclinical cancer research. 2) Identification of prognostic gene signatures that stratify adult diffuse glioma patients harboring 1p/19q co-deletions. I have compared multiple statistical methods and derived a gene signature significantly associated with survival by applying a machine learning algorithm. Then I have identified inflammatory response and acetylation activity that are associated with malignant progression of 1p/19q co-deleted glioma. In addition, I showed this signature translates to other types of adult diffuse glioma, suggesting its universality in the pathobiology of other subset gliomas. My efforts on integrative data analysis of this highly curated data set using optimized statistical models will reflect the pending update to the WHO classification system of tumors in the central nervous system (CNS). 3) Comprehensive characterization of somatic fusion transcripts in Pan-Cancers. I have identified a panel of novel fusion transcripts across all of TCGA cancer types through transcriptomic profiling. Then I have predicted fusion proteins with kinase activity and hub function of pathway network based on the annotation of genetically mobile domains and functional domain architectures. I have evaluated a panel of in-frame gene fusions as potential driver mutations based on the network fusion centrality hypothesis. I have also characterised the emerging complexity of genetic architecture in fusion transcripts through integrating genomic structure and somatic variants and delineating the distinct genomic patterns of fusion events across different cancer types. Overall, my exploration of the pathogenetic impact and clinical relevance of candidate gene fusions has provided fundamental insights into the management of a subset of cancer patients by predicting the oncogenic signalling and specific drug targets encoded by these fusion genes. Taken together, the translational genomic research I have conducted during my Ph.D. study will shed new light on precision medicine and contribute to the cancer research community. The novel classification concept, gene signature and fusion transcripts I have identified will address several hotly debated issues in translational genomics, such as complex interactions between tumor bulks and their adjacent microenvironments, prognostic markers for clinical diagnostics and personalized therapy, and distinct patterns of genomic structure alterations and oncogenic events in different cancer types, therefore facilitating our understanding of genomic alterations and moving us towards the development of precision medicine.

    A novel integrative risk index of papillary thyroid cancer progression combining genomic alterations and clinical factors.

    Although the majority of papillary thyroid cancer (PTC) is indolent, a subset of PTC behaves aggressively despite the best available treatment. A major clinical challenge is to reliably distinguish early on between those patients who need aggressive treatment and those who do not. Using a large cohort of PTC samples obtained from The Cancer Genome Atlas (TCGA), we analyzed the association between disease progression and multiple forms of genomic data, such as transcriptome, somatic mutations, and somatic copy number alterations, and found that genes related to the FOXM1 signaling pathway were significantly associated with PTC progression. Integrative genomic modeling was performed, controlling for demographic and clinical characteristics, which included patient age, gender, TNM stages, histological subtypes, and history of other malignancy, using a leave-one-out elastic net model and 10-fold cross validation. For each subject, the model from the remaining subjects was used to determine the risk index, defined as a linear combination of the clinical and genomic variables from the elastic net model, and the stability of the risk index distribution was assessed through 2,000 bootstrap resamples. We developed a novel approach to combine genomic alterations and patient-related clinical factors that delineates the subset of patients who have more aggressive disease from those whose tumors are indolent and likely will require less aggressive treatment and surveillance (p = 4.62 × 10⁻¹⁰, log-rank test). Our results suggest that risk index modeling that combines genomic alterations with current staging systems provides an opportunity for more effective anticipation of disease prognosis and therefore enhanced precision management of PTC.
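
    The leave-one-out risk index described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a continuous outcome with squared-error loss (the study concerns disease progression, which would likely call for a survival model), uses synthetic data, and relies on a small NumPy coordinate-descent elastic net. All function names, the penalty values `alpha` and `l1_ratio`, and the choice to bootstrap the median are hypothetical.

```python
import numpy as np

def soft_threshold(value, threshold):
    # Shrink a scalar toward zero by `threshold` (the L1 proximal step).
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    # Coordinate descent for
    # (1/2n)||y - Xw - b||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2
    n_samples, n_features = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean          # centering handles the intercept
    w = np.zeros(n_features)
    col_scale = (Xc ** 2).sum(axis=0) / n_samples
    residual = yc.copy()
    for _ in range(n_iter):
        for j in range(n_features):
            residual += Xc[:, j] * w[j]      # remove feature j's contribution
            rho = Xc[:, j] @ residual / n_samples
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (
                col_scale[j] + alpha * (1.0 - l1_ratio))
            residual -= Xc[:, j] * w[j]      # add the updated contribution back
    intercept = y_mean - x_mean @ w
    return w, intercept

def leave_one_out_risk_index(X, y, **kwargs):
    # For each subject, fit on the remaining subjects and score the held-out one.
    n_samples = X.shape[0]
    risk = np.empty(n_samples)
    for i in range(n_samples):
        keep = np.arange(n_samples) != i
        w, b = fit_elastic_net(X[keep], y[keep], **kwargs)
        risk[i] = X[i] @ w + b               # linear combination of the variables
    return risk

def bootstrap_median_interval(risk, n_boot=2000, seed=0):
    # Resample the risk indices with replacement to gauge how stable
    # their distribution is (summarized here by the median).
    rng = np.random.default_rng(seed)
    medians = np.array([
        np.median(rng.choice(risk, size=risk.size, replace=True))
        for _ in range(n_boot)
    ])
    return np.percentile(medians, [2.5, 97.5])

# Synthetic stand-in for clinical and genomic covariates (assumption).
rng = np.random.default_rng(42)
X = rng.normal(size=(40, 6))
true_w = np.array([1.5, -2.0, 0.0, 0.0, 0.0, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=40)

risk = leave_one_out_risk_index(X, y, alpha=0.05, l1_ratio=0.5)
low, high = bootstrap_median_interval(risk, n_boot=2000)
print(risk[:5], low, high)
```

    Because each subject's index comes from a model that never saw that subject, the index is not inflated by fitting and scoring on the same data. In practice the penalty strength would be tuned, for example by the 10-fold cross validation the abstract mentions.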

    Integrative, In Silico and Comparative Analysis of Breast Cancer Secretome Highlights Invasive-Ductal-Carcinoma-Grade Progression Biomarkers

    Funding: This research was funded by NHS Grampian Endowment Fund, grant number NER11101. Acknowledgments: The authors would like to thank the NHS Grampian Breast Cancer Endowment Fund body for supporting the publication of the present manuscript and funding publication fees. Peer reviewed.

    Integrative analyses of transcriptome sequencing identify novel functional lncRNAs in esophageal squamous cell carcinoma.

    Long non-coding RNAs (lncRNAs) have a critical role in cancer initiation and progression, and thus may mediate oncogenic or tumor suppressing effects, as well as be a new class of cancer therapeutic targets. We performed high-throughput sequencing of RNA (RNA-seq) to investigate the expression level of lncRNAs and protein-coding genes in 30 esophageal samples, comprising 15 esophageal squamous cell carcinoma (ESCC) samples and their 15 paired non-tumor tissues. We further developed an integrative bioinformatics method, denoted URW-LPE, to identify key functional lncRNAs that regulate expression of downstream protein-coding genes in ESCC. A number of known onco-lncRNAs and many putative novel ones were effectively identified by URW-LPE. Importantly, we identified lncRNA625 as a novel regulator of ESCC cell proliferation, invasion and migration. ESCC patients with high lncRNA625 expression had significantly shorter survival time than those with low expression. LncRNA625 also showed specific prognostic value for patients with metastatic ESCC. Finally, we identified E1A-binding protein p300 (EP300) as a downstream executor of lncRNA625-induced transcriptional responses. These findings establish a catalog of novel cancer-associated functional lncRNAs, which will promote our understanding of lncRNA-mediated regulation in this malignancy.

    Integrative analyses identify modulators of response to neoadjuvant aromatase inhibitors in patients with early breast cancer

    Introduction: Aromatase inhibitors (AIs) are a vital component of estrogen receptor positive (ER+) breast cancer treatment. De novo and acquired resistance, however, are common. The aims of this study were to relate patterns of copy number aberrations to molecular and proliferative response to AIs, to study differences in the patterns of copy number aberrations between breast cancer samples pre- and post-AI neoadjuvant therapy, and to identify putative biomarkers for resistance to neoadjuvant AI therapy using an integrative analysis approach. Methods: Samples from 84 patients derived from two neoadjuvant AI therapy trials were subjected to copy number profiling by microarray-based comparative genomic hybridisation (aCGH, n = 84), gene expression profiling (n = 47), matched pre- and post-AI aCGH (n = 19 pairs) and Ki67-based AI-response analysis (n = 39). Results: Integrative analysis of these datasets identified a set of nine genes that, when amplified, were associated with a poor response to AIs, and were significantly overexpressed when amplified, including CHKA, LRP5 and SAPS3. Functional validation in vitro, using cell lines with and without amplification of these genes (SUM44, MDA-MB134-VI, T47D and MCF7) and a model of acquired AI-resistance (MCF7-LTED), identified CHKA as a gene that, when amplified, modulates estrogen receptor (ER)-driven proliferation, ER/estrogen response element (ERE) transactivation, expression of ER-regulated genes and phosphorylation of V-AKT murine thymoma viral oncogene homolog 1 (AKT1). Conclusions: These data provide a rationale for investigation of the role of CHKA in further models of de novo and acquired resistance to AIs, and provide proof of concept that integrative genomic analyses can identify biologically relevant modulators of AI response.

    Yes-associated protein (YAP) in pancreatic cancer: at the epicenter of a targetable signaling network associated with patient survival.

    Pancreatic ductal adenocarcinoma (PDAC) is generally a fatal disease with no efficacious treatment modalities. Elucidation of signaling mechanisms that will lead to the identification of novel targets for therapy and chemoprevention is urgently needed. Here, we review the role of Yes-associated protein (YAP) and WW-domain-containing Transcriptional co-Activator with a PDZ-binding motif (TAZ) in the development of PDAC. These oncogenic proteins are at the center of a signaling network that involves multiple upstream signals and downstream YAP-regulated genes. We also discuss the clinical significance of the YAP signaling network in PDAC using a recently published interactive open-access database (www.proteinatlas.org/pathology) that allows genome-wide exploration of the impact of individual proteins on survival outcomes. Multiple YAP/TEAD-regulated genes, including AJUBA, ANLN, AREG, ARHGAP29, AURKA, BUB1, CCND1, CDK6, CXCL5, EDN2, DKK1, FOSL1, FOXM1, HBEGF, IGFBP2, JAG1, NOTCH2, RHAMM, RRM2, SERP1, and ZWILCH, are associated with unfavorable survival of PDAC patients. Similarly, components of AP-1 that synergize with YAP (FOSL1), growth factors (TGFα, EPEG, and HBEGF), a specific integrin (ITGA2), heptahelical receptors (P2Y2R, GPR87) and an inhibitor of the Hippo pathway (MUC1), all of which stimulate YAP activity, are associated with unfavorable survival of PDAC patients. By contrast, YAP inhibitory pathways (STRAD/LKB-1/AMPK, PKA/LATS, and TSC/mTORC1) indicate a favorable prognosis. These associations emphasize that the YAP signaling network correlates with poor survival of pancreatic cancer patients. We conclude that the YAP pathway is a major determinant of clinical aggressiveness in PDAC patients and a target for therapeutic and preventive strategies in this disease.