63 research outputs found

    Multi-SNP analysis of GWAS data identifies pathways associated with nonalcoholic fatty liver disease

    Non-alcoholic fatty liver disease (NAFLD) is a common liver disease whose histological spectrum ranges from steatosis to steatohepatitis. Nonalcoholic steatohepatitis (NASH) often leads to cirrhosis and the development of hepatocellular carcinoma. To better understand the pathogenesis of NAFLD, we performed pathway of distinction analysis (PoDA) on a genome-wide association study dataset of 250 non-Hispanic white female adult patients with NAFLD, who were enrolled in the NASH Clinical Research Network (CRN) Database Study, to investigate whether biologic process variation, measured through genomic variation of genes within these pathways, was related to the development of steatohepatitis or cirrhosis. Pathways such as Recycling of eIF2:GDP, biosynthesis of steroids, Terpenoid biosynthesis and Cholesterol biosynthesis were found to be significantly associated with NASH. SNP variants in Terpenoid synthesis, Cholesterol biosynthesis and biosynthesis of steroids were associated with lobular inflammation and cytologic ballooning, while those in Terpenoid synthesis were also associated with fibrosis and cirrhosis. These were also related to the NAFLD activity score (NAS), which is derived from the histological severity of steatosis, inflammation and ballooning degeneration. SNP variants related to Eukaryotic protein translation and Recycling of eIF2:GDP were associated with ballooning, steatohepatitis and cirrhosis. IL2 signaling events mediated by PI3K, Mitotic metaphase/anaphase transition, and Prostanoid ligand receptors were also significantly associated with cirrhosis. Taken together, the results provide evidence for additional ways, beyond the effects of single SNPs, by which genetic factors might contribute to the susceptibility to develop a particular phenotype of NAFLD and then progress to cirrhosis. Further studies are warranted to explain the potentially important genetic roles of these biological processes in NAFLD.

    Delineating Genetic Alterations for Tumor Progression in the MCF10A Series of Breast Cancer Cell Lines

    To gain insight into the role of genomic alterations in breast cancer progression, we conducted a comprehensive genetic characterization of a series of four cell lines derived from MCF10A. MCF10A is an immortalized mammary epithelial cell line (MEC); MCF10AT is a premalignant cell line generated from MCF10A by transformation with an activated HRAS gene; MCF10CA1h and MCF10CA1a, both derived from MCF10AT xenografts, form well-differentiated and poorly-differentiated malignant tumors in the xenograft models, respectively. We analyzed DNA copy number variation using the Affymetrix 500K SNP arrays with the goal of identifying gene-specific amplification and deletion events. In addition to a previously noted deletion in the CDKN2A locus, our studies identified MYC amplification in all four cell lines. Additionally, we found intragenic deletions in several genes, including LRP1B in MCF10CA1h and MCF10CA1a, FHIT and CDH13 in MCF10CA1h, and RUNX1 in MCF10CA1a. We confirmed the deletion of RUNX1 in MCF10CA1a by DNA and RNA analyses, as well as the absence of the RUNX1 protein in that cell line. Furthermore, we found that RUNX1 expression was reduced in high-grade primary breast tumors compared to low/mid-grade tumors. Mutational analysis identified an activating PIK3CA mutation, H1047R, in MCF10CA1h and MCF10CA1a, which correlates with an increase of AKT1 phosphorylation at Ser473 and Thr308. Furthermore, we showed increased expression levels for genes located in the genomic regions with copy number gain. Thus, our genetic analyses have uncovered sequential molecular events that delineate breast tumor progression. These events include CDKN2A deletion and MYC amplification in immortalization, HRAS activation in transformation, PIK3CA activation in the formation of malignant tumors, and RUNX1 deletion associated with poorly-differentiated malignant tumors.

    Inferring causal molecular networks: empirical assessment through a community-based effort

    Inferring molecular networks is a central challenge in computational biology. However, it has remained unclear whether causal, rather than merely correlational, relationships can be effectively inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results constitute the most comprehensive assessment of causal network inference in a mammalian setting carried out to date and suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess the causal validity of inferred molecular networks.

    Inferring causal molecular networks: empirical assessment through a community-based effort

    It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense.

    Systematic Genetic Analysis Identifies Cis-eQTL Target Genes Associated with Glioblastoma Patient Survival

    Prior expression quantitative trait locus (eQTL) studies have demonstrated heritable variation determining differences in gene expression. The majority of eQTL studies were based on cell lines and normal tissues. We performed cis-eQTL analysis using glioblastoma multiforme (GBM) data sets obtained from The Cancer Genome Atlas (TCGA) to systematically investigate germline variation’s contribution to tumor gene expression levels. We identified 985 significant cis-eQTL associations (FDR<0.05) mapped to 978 SNP loci and 159 unique genes. Approximately 57% of these eQTLs have been previously linked to gene expression in cell lines and normal tissues; 43% of these share cis associations known to be associated with functional annotations. About 25% of these cis-eQTL associations are also common to those identified in breast cancer in a recent study. Further investigation of the relationship between gene expression and patient clinical information identified 13 eQTL genes whose expression level significantly correlates with GBM patient survival (p<0.05). Most of these genes are also differentially expressed between tumor samples and organ-specific controls (p<0.05). Our results demonstrate a significant relationship between germline variation and gene expression levels in GBM. The identification of eQTL-based, expression-associated survival might be important to the understanding of the genetic contribution to GBM prognosis.

    Comparing the performance of selected variant callers using synthetic data and genome segmentation

    Background: High-throughput sequencing has rapidly become an essential part of precision cancer medicine, but validating results obtained from analyzing and interpreting genomic data remains a rate-limiting factor. The gold standard remains manual validation by expert panels, which has its weaknesses: high costs in both funding and time, and the necessarily selective nature of manual validation. It may be possible to develop more economical, complementary means of validation. In this study we employed four synthetic data sets (variants with known mutations spiked into specific genomic locations) of increasing complexity to assess the sensitivity, specificity, and balanced accuracy of five open-source variant callers: FreeBayes v1.0, VarDict v11.5.1, MuTect v1.1.7, MuTect2, and MuSE v1.0rc. FreeBayes, VarDict, and MuTect were run in bcbio-nextgen, and the results were integrated into a single Ensemble call set. The known mutations provided a level of “ground truth” against which we evaluated variant-caller performance. We further facilitated the comparison and evaluation by segmenting the whole genome into 10,000,000 base-pair fragments, which yielded 316 segments. Results: Differences in the numbers of true positives were small among the callers, but the numbers of false positives varied much more when the tools were used to analyze sets one through three. Both FreeBayes and VarDict produced strikingly more false positives than did the others, although VarDict, somewhat paradoxically, also produced the highest number of true positives. The Ensemble approach yielded results characterized by higher specificity and balanced accuracy and fewer false positives than did any of the five tools used alone. Sensitivity and specificity, however, declined for all five callers as the complexity of the data sets increased, but we did not uncover anything more than limited, weak correlations between caller performance and certain DNA structural features: gene density and guanine-cytosine content. Altogether, MuTect2 performed the best among the callers tested, followed by MuSE and MuTect. Conclusions: Spiking data sets with specific mutations (single-nucleotide variations (SNVs), single-nucleotide polymorphisms (SNPs), or structural variations (SVs) in this study) at known locations in the genome provides an effective and economical way to compare data analyzed by variant callers with ground truth. The method constitutes a viable alternative to the prolonged, expensive, and noncomprehensive assessment by expert panels. It should be further developed and refined, as should other comparatively “lightweight” methods of assessing accuracy. Given that the scientific community has not yet established gold standards for validating NGS-related technologies such as variant callers, developing multiple alternative means for verifying variant-caller accuracy will eventually lead to the establishment of higher-quality standards than could be achieved by prematurely limiting the range of innovative methods explored by members of the community.
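
The three metrics named in the abstract above, and the fixed-size segmentation, can be illustrated with a minimal Python sketch. It uses the standard textbook definitions; the `Confusion` class, the function names, and the example counts are hypothetical and are not taken from the study, whose own scoring pipeline is not described in this listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from comparing a caller's output with spiked-in ground truth."""
    tp: int  # true positives: spiked variants that were called
    fp: int  # false positives: calls with no matching spiked variant
    tn: int  # true negatives: non-variant sites left uncalled
    fn: int  # false negatives: spiked variants that were missed


def sensitivity(c: Confusion) -> float:
    # Fraction of real variants that were recovered (assumes tp + fn > 0).
    return c.tp / (c.tp + c.fn)


def specificity(c: Confusion) -> float:
    # Fraction of non-variant sites correctly left uncalled (assumes tn + fp > 0).
    return c.tn / (c.tn + c.fp)


def balanced_accuracy(c: Confusion) -> float:
    # Arithmetic mean of sensitivity and specificity.
    return (sensitivity(c) + specificity(c)) / 2


def segment_bounds(length: int, size: int = 10_000_000) -> list[tuple[int, int]]:
    # Half-open [start, end) windows of at most `size` bases covering one
    # sequence of the given length; the last window may be shorter.
    return [(start, min(start + size, length)) for start in range(0, length, size)]


if __name__ == "__main__":
    example = Confusion(tp=90, fp=5, tn=95, fn=10)  # made-up counts
    print(sensitivity(example), specificity(example), balanced_accuracy(example))
    print(segment_bounds(25_000_000))  # hypothetical 25 Mb sequence
```

Computing the metrics per segment, rather than once per genome, is what would allow caller performance to be compared against per-segment features such as gene density or GC content.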

    cis-eQTL target genes with significant association with patient survival.

    cis-eQTL target genes with significant association with patient survival.