53 research outputs found

    Computational Hybrid Systems for Identifying Prognostic Gene Markers of Lung Cancer

    Lung cancer is the deadliest cancer worldwide. Current lung cancer prognosis and treatment are based on tumor-stage population statistics and cannot reliably assess the risk of recurrence in individual patients. Biomarkers enable treatment options to be tailored to individual patients based on their tumor molecular characteristics. To date, there is no clinically applied molecular prognostic model for lung cancer. Statistical and feature selection methods identify gene candidates by ranking the association between gene expression and disease outcome, but do not account for interactions among genes. Computational network methods could model these interactions, but have not been used for gene selection because of their computational inefficiency. Moreover, the curse of dimensionality in human genome data poses additional computational challenges for these methods. We proposed two hybrid systems for identifying prognostic gene signatures for lung cancer from gene expression measured with DNA microarrays. The first hybrid system combined t-tests, Significance Analysis of Microarrays (SAM), and Relief feature selection in multiple gene-filtering layers. This combinatorial system identified a 12-gene signature with better prognostic performance than published signatures in treatment selection for stage I and II patients (log-rank P<0.04, Kaplan-Meier analyses). The 12-gene signature is a more significant prognostic factor (hazard ratio=4.19, 95% CI: [2.08, 8.46], P<0.00006) than other clinical covariates. The signature genes were found to be involved in tumorigenesis in functional pathway analyses. The second proposed system employed a novel computational network model, i.e., implication networks based on prediction logic. This network-based system utilizes gene coexpression networks and concurrent coregulation with signaling pathways for biomarker identification. The first application of the system modeled disease-mediated genome-wide coexpression networks. The entire genomic space was extensively explored, and 21 gene signatures were discovered with better prognostic performance than all published signatures in stage I patients not receiving chemotherapy (hazard ratio>1, CPE>0.5, P<0.05). These signatures could potentially be used for selecting patients for adjuvant chemotherapy. The second application of the system modeled smoking-mediated coexpression networks and identified a smoking-associated 7-gene signature. The 7-gene signature generated significant prognostication specific to lung cancer patients who smoke (log-rank P<0.05, Kaplan-Meier analyses), with implications for diagnostic screening of lung cancer risk in smokers (overall accuracy=74%, P<0.006). The coexpression patterns derived from the implication networks in both applications were successfully validated against molecular interactions reported in the literature (FDR<0.1). Our studies demonstrated that hybrid systems with multiple gene selection layers outperform traditional methods. Moreover, implication networks could efficiently model genome-scale disease-mediated coexpression networks and crosstalk with signaling pathways, leading to the identification of clinically important gene signatures.

    Comprehensive evaluation of RNA-seq quantification methods for linearity

    Figure S3. Concordance analysis between the rank of estimated quantifications and the rank of measured abundance values at the gene level (a) and isoform level (b). The fitted value on the y-axis is estimated from the model D ∼ m×A + n×B + Δ. Ranks were normalized by the number of quantifications in each plot. (PDF 5950 kb)

    On the Reproducibility of TCGA Ovarian Cancer MicroRNA Profiles

    Dysregulated microRNA (miRNA) expression is a well-established feature of human cancer. However, the role of specific miRNAs in determining cancer outcomes remains unclear. Using Level 3 expression data from the Cancer Genome Atlas (TCGA), we identified 61 miRNAs that are associated with overall survival in 469 ovarian cancers profiled by microarray (p<0.01). We also identified 12 miRNAs that are associated with survival when miRNAs were profiled in the same specimens using Next Generation Sequencing (miRNA-Seq) (p<0.01). Surprisingly, only 1 miRNA transcript is associated with ovarian cancer survival in both datasets. Our analyses indicate that this discrepancy is due to the fact that miRNA levels reported by the two platforms correlate poorly, even after correcting for potential issues inherent to signal detection algorithms. Further investigation is warranted
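The cross-platform comparison described in the abstract above can be illustrated with a rank correlation between paired measurements. The sketch below is only an illustration: the abstract does not say which correlation measure the authors used, so Spearman correlation is an assumption here, and the function names and sample values are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Hypothetical levels of one miRNA in the same five specimens,
    # measured on two platforms (values are made up for illustration).
    microarray = [8.1, 5.3, 9.7, 6.0, 7.2]
    mirna_seq = [120, 340, 95, 410, 150]
    print(spearman(microarray, mirna_seq))
```

A value near 1 would indicate that the two platforms order the specimens similarly; values near 0 correspond to the poor agreement the abstract reports.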

    Hybrid Models Identified a 12-Gene Signature for Lung Cancer Prognosis and Chemoresponse Prediction

    Lung cancer remains the leading cause of cancer-related deaths worldwide. The recurrence rate ranges from 35-50% among early-stage non-small cell lung cancer patients. To date, there is no fully validated and clinically applied prognostic gene signature for personalized treatment. From genome-wide mRNA expression profiles generated on 256 lung adenocarcinoma patients, a 12-gene signature was identified using combinatorial gene selection methods, and a risk score algorithm was developed with Naïve Bayes. The 12-gene model generates significant patient stratification in the training cohort HLM & UM (n = 256; log-rank P = 6.96e-7) and two independent validation sets, MSK (n = 104; log-rank P = 9.88e-4) and DFCI (n = 82; log-rank P = 2.57e-4), using Kaplan-Meier analyses. This gene signature also stratifies stage I and IB lung adenocarcinoma patients into two distinct survival groups (log-rank P<0.04). The 12-gene risk score is more significant (hazard ratio = 4.19, 95% CI: [2.08, 8.46]) than other commonly used clinical factors except tumor stage (III vs. I) in multivariate Cox analyses. The 12-gene model is more accurate than previously published lung cancer gene signatures on the same datasets. Furthermore, this signature accurately predicts chemoresistance/chemosensitivity to Cisplatin, Carboplatin, Paclitaxel, Etoposide, Erlotinib, and Gefitinib in NCI-60 cancer cell lines (P<0.017). The identified 12 genes exhibit curated interactions with major lung cancer signaling hallmarks in functional pathway analysis. The expression patterns of the signature genes have been confirmed in RT-PCR analyses of independent tumor samples. The results demonstrate the clinical utility of the identified gene signature in prognostic categorization. With this 12-gene risk score algorithm, early-stage patients at high risk for tumor recurrence could be identified for adjuvant chemotherapy, whereas stage I and II patients at low risk could be spared the toxic side effects of chemotherapeutic drugs.

    Meta-Analysis of the Alzheimer's Disease Human Brain Transcriptome and Functional Dissection in Mouse Models.

    We present a consensus atlas of the human brain transcriptome in Alzheimer's disease (AD), based on meta-analysis of differential gene expression in 2,114 postmortem samples. We discover 30 brain coexpression modules from seven regions as the major source of AD transcriptional perturbations. We next examine overlap with 251 brain differentially expressed gene sets from mouse models of AD and other neurodegenerative disorders. Human-mouse overlaps highlight responses to amyloid versus tau pathology and reveal age- and sex-dependent expression signatures for disease progression. Human coexpression modules enriched for neuronal and/or microglial genes broadly overlap with mouse models of AD, Huntington's disease, amyotrophic lateral sclerosis, and aging. Other human coexpression modules, including those implicated in proteostasis, are not activated in AD models but rather following other, unexpected genetic manipulations. Our results comprise a cross-species resource, highlighting transcriptional networks altered by human brain pathophysiology and identifying correspondences with mouse models for AD preclinical studies.

    Cluster Analysis of Short Sensory Profile Data Reveals Sensory-Based Subgroups in Autism Spectrum Disorder

    Autism spectrum disorder is a common, heterogeneous neurodevelopmental disorder lacking targeted treatments. Additional features include restricted, repetitive patterns of behaviors and differences in sensory processing. We hypothesized that detailed sensory features including modality specific hyper- and hypo-sensitivity could be used to identify clinically recognizable subgroups with unique underlying gene variants. Participants included 378 individuals with a clinical diagnosis of autism spectrum disorder who contributed Short Sensory Profile data assessing the frequency of sensory behaviors and whole genome sequencing results to the Autism Speaks’ MSSNG database. Sensory phenotypes in this cohort were not randomly distributed with 10 patterns describing 43% (162/378) of participants. Cross comparison of two independent cluster analyses on sensory responses identified six distinct sensory-based subgroups. We then characterized subgroups by calculating the percent of patients in each subgroup who had variants with a Combined Annotation Dependent Depletion (CADD) score of 15 or greater in each of 24,896 genes. Each subgroup exhibited a unique pattern of genes with a high frequency of variants. These results support the use of sensory features to identify autism spectrum disorder subgroups with shared genetic variants
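The subgroup characterization step in the abstract above (the percent of patients in each subgroup carrying a variant with a CADD score of 15 or greater in each gene) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the input layout (a patient-to-subgroup mapping plus `(patient, gene, cadd)` tuples), the function name, and all identifiers in the example are hypothetical. Only the threshold of 15 comes from the abstract.

```python
from collections import defaultdict

CADD_THRESHOLD = 15.0  # threshold stated in the abstract


def percent_with_high_cadd(subgroup_of, variants, threshold=CADD_THRESHOLD):
    """Return {subgroup: {gene: percent of subgroup members carrying at least
    one variant in that gene with CADD >= threshold}}.

    subgroup_of: dict mapping patient ID -> subgroup label (assumed layout)
    variants:    iterable of (patient ID, gene, CADD score) (assumed layout)
    """
    # Number of patients in each subgroup (the denominator).
    group_size = defaultdict(int)
    for group in subgroup_of.values():
        group_size[group] += 1

    # Distinct carriers per (subgroup, gene); a set avoids counting a patient
    # twice when they have several qualifying variants in the same gene.
    carriers = defaultdict(set)
    for patient, gene, cadd in variants:
        if patient in subgroup_of and cadd >= threshold:
            carriers[(subgroup_of[patient], gene)].add(patient)

    result = defaultdict(dict)
    for (group, gene), patients in carriers.items():
        result[group][gene] = 100.0 * len(patients) / group_size[group]
    return dict(result)


if __name__ == "__main__":
    # Made-up example data for illustration only.
    subgroup_of = {"p1": "A", "p2": "A", "p3": "B", "p4": "B"}
    variants = [
        ("p1", "GENE1", 22.0),
        ("p1", "GENE1", 18.0),
        ("p2", "GENE1", 9.5),
        ("p3", "GENE1", 15.0),
        ("p4", "GENE2", 30.0),
        ("p3", "GENE2", 16.1),
    ]
    print(percent_with_high_cadd(subgroup_of, variants))
```

Genes with no qualifying variant in a subgroup are simply absent from that subgroup's dictionary, which keeps the output sparse when the gene list is large.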

    MicroRNA, mRNA, and Proteomics Biomarkers and Therapeutic Targets for Improving Lung Cancer Treatment Outcomes

    The majority of lung cancer patients are diagnosed with metastatic disease. This study identified a set of 73 microRNAs (miRNAs) that classified lung cancer tumors from normal lung tissues with an overall accuracy of 96.3% in the training patient cohort (n = 109) and 91.7% in unsupervised classification and 92.3% in supervised classification in the validation set (n = 375). Based on association with patient survival (n = 1016), 10 miRNAs were identified as potential tumor suppressors (hsa-miR-144, hsa-miR-195, hsa-miR-223, hsa-miR-30a, hsa-miR-30b, hsa-miR-30d, hsa-miR-335, hsa-miR-363, hsa-miR-451, and hsa-miR-99a), and 4 were identified as potential oncogenes (hsa-miR-21, hsa-miR-31, hsa-miR-411, and hsa-miR-494) in lung cancer. Experimentally confirmed target genes were identified for the 73 diagnostic miRNAs, from which proliferation genes were selected from CRISPR-Cas9/RNA interference (RNAi) screening assays. Pansensitive and panresistant genes to 21 NCCN-recommended drugs with concordant mRNA and protein expression were identified. DGKE and WDR47 were found with significant associations with responses to both systemic therapies and radiotherapy in lung cancer. Based on our identified miRNA-regulated molecular machinery, an inhibitor of PDK1/Akt BX-912, an anthracycline antibiotic daunorubicin, and a multi-targeted protein kinase inhibitor midostaurin were discovered as potential repositioning drugs for treating lung cancer. These findings have implications for improving lung cancer diagnosis, optimizing treatment selection, and discovering new drug options for better patient outcomes

    XMRF: an R package to fit Markov Networks to high-throughput genetics data

    Background: Technological advances in medicine have led to a rapid proliferation of high-throughput “omics” data. Tools to mine this data and discover disrupted disease networks are needed as they hold the key to understanding complicated interactions between genes, mutations and aberrations, and epigenetic markers. Results: We developed an R software package, XMRF, that can be used to fit Markov Networks to various types of high-throughput genomics data. Encoding the models and estimation techniques of the recently proposed exponential family Markov Random Fields (Yang et al., 2012), our software can be used to learn genetic networks from RNA-sequencing data (counts via Poisson graphical models), mutation and copy number variation data (categorical via Ising models), and methylation data (continuous via Gaussian graphical models). Conclusions: XMRF is the only tool that allows network structure learning using the native distribution of the data instead of the standard Gaussian. Moreover, the parallelization feature of the implemented algorithms computes the large-scale biological networks efficiently. XMRF is available from CRAN and GitHub (https://github.com/zhandong/XMRF).