91 research outputs found

    Detecting copy number status and uncovering subclonal markers in heterogeneous tumor biopsies

    Background: Genomic aberrations can be used to determine cancer diagnosis and prognosis. Clinically relevant novel aberrations can be discovered using high-throughput assays such as Single Nucleotide Polymorphism (SNP) arrays and next-generation sequencing, which typically provide aggregate signals of many cells at once. However, heterogeneity of tumor subclones dramatically complicates the task of detecting aberrations. Results: The aggregate signal of a population of subclones can be described as a linear system of equations. We employed a measure of allelic imbalance and total amount of DNA to characterize each locus by the copy number status (gain, loss or neither) of the strongest subclonal component. We designed simulated data to compare our measure to existing approaches and we analyzed SNP-arrays from 30 melanoma samples and transcriptome sequencing (RNA-Seq) from one melanoma sample. We showed that any system describing aggregate subclonal signals is underdetermined, leading to non-unique solutions for the exact copy number profile of subclones. For this reason, our illustrative measure was more robust than existing Hidden Markov Model (HMM) based tools in inferring the aberration status, as indicated by tests on simulated data. This higher robustness contributed to identifying numerous aberrations in several loci of melanoma samples. We validated the heterogeneity and aberration status within single biopsies by fluorescent in situ hybridization of four affected and transcriptionally up-regulated genes (E2F8, ETV4, EZH2 and FAM84B) in 11 melanoma cell lines. Heterogeneity was further demonstrated in the analysis of allelic imbalance changes along single exons from melanoma RNA-Seq. Conclusions: These studies demonstrate how subclonal heterogeneity, prevalent in tumor samples, is reflected in aggregate signals measured by high-throughput techniques. Our proposed approach yields high robustness in detecting copy number alterations using high-throughput technologies and has the potential to identify specific subclonal markers from next-generation sequencing data.
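
    To illustrate why an aggregate signal cannot pin down the exact copy number profile of subclones, the minimal Python sketch below mixes two different hypothetical subclone compositions at a single locus and shows that both produce the same aggregate copy number. The function names, mixing fractions, diploid baseline and tolerance are assumptions made for illustration. This is not the measure used in the paper, which also draws on allelic imbalance.

```python
import numpy as np

DIPLOID_BASELINE = 2.0  # assumed copy number of unaltered cells


def aggregate_copy_number(fractions, copy_numbers):
    """Aggregate signal at one locus: fraction-weighted sum of copy numbers."""
    fractions = np.asarray(fractions, dtype=float)
    copy_numbers = np.asarray(copy_numbers, dtype=float)
    if fractions.shape != copy_numbers.shape:
        raise ValueError("one copy number is needed per cell population")
    if not np.isclose(fractions.sum(), 1.0):
        raise ValueError("cell fractions must sum to 1")
    return float(fractions @ copy_numbers)


def coarse_status(aggregate, tolerance=0.1):
    """Coarse call (gain, loss or neither) relative to the assumed baseline."""
    if aggregate > DIPLOID_BASELINE + tolerance:
        return "gain"
    if aggregate < DIPLOID_BASELINE - tolerance:
        return "loss"
    return "neither"


# Hypothetical mixture A: 50% normal cells, 50% of a subclone with 4 copies.
mixture_a = aggregate_copy_number([0.5, 0.5], [2, 4])

# Hypothetical mixture B: 50% normal cells, two subclones with 3 and 5 copies.
mixture_b = aggregate_copy_number([0.5, 0.25, 0.25], [2, 3, 5])

# Both mixtures give the same aggregate value, so the aggregate alone cannot
# tell them apart, while the coarse status is the same for both.
print(mixture_a, mixture_b, coarse_status(mixture_a), coarse_status(mixture_b))
```

    In this toy case the exact subclonal profile is not recoverable from the single aggregate value, but the coarse gain/loss/neither call agrees for both mixtures, which is the kind of robustness the abstract argues for.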

    The Transcriptomes of Two Heritable Cell Types Illuminate the Circuit Governing Their Differentiation

    The differentiation of cells into distinct cell types, each of which is heritable for many generations, underlies many biological phenomena. White and opaque cells of the fungal pathogen Candida albicans are two such heritable cell types, each thought to be adapted to unique niches within their human host. To systematically investigate their differences, we performed strand-specific, massively-parallel sequencing of RNA from C. albicans white and opaque cells. With these data we first annotated the C. albicans transcriptome, finding hundreds of novel differentially-expressed transcripts. Using the new annotation, we compared differences in transcript abundance between the two cell types with the genomic regions bound by a master regulator of the white-opaque switch (Wor1). We found that the revised transcriptional landscape considerably alters our understanding of the circuit governing differentiation. In particular, we can now resolve the poor concordance between binding of a master regulator and the differential expression of adjacent genes, a discrepancy observed in several other studies of cell differentiation. More than one third of the Wor1-bound differentially-expressed transcripts were previously unannotated, which explains the formerly puzzling presence of Wor1 at these positions along the genome. Many of these newly identified Wor1-regulated genes are non-coding and transcribed antisense to coding transcripts. We also find that 5′ and 3′ UTRs of mRNAs in the circuit are unusually long and that 5′ UTRs often differ in length between cell-types, suggesting UTRs encode important regulatory information and that use of alternative promoters is widespread. Further analysis revealed that the revised Wor1 circuit bears several striking similarities to the Oct4 circuit that specifies the pluripotency of mammalian embryonic stem cells. 
Additional characteristics shared with the Oct4 circuit suggest a set of general hallmarks characteristic of heritable differentiation states in eukaryotes.

    A Phenotypic Profile of the Candida albicans Regulatory Network

    Candida albicans is a normal resident of the gastrointestinal tract and also the most prevalent fungal pathogen of humans. It last shared a common ancestor with the model yeast Saccharomyces cerevisiae over 300 million years ago. We describe a collection of 143 genetically matched strains of C. albicans, each of which has been deleted for a specific transcriptional regulator. This collection represents a large fraction of the non-essential transcription circuitry. A phenotypic profile for each mutant was developed using a screen of 55 growth conditions. The results identify the biological roles of many individual transcriptional regulators; for many, this work represents the first description of their functions. For example, a quarter of the strains showed altered colony formation, a phenotype reflecting transitions among yeast, pseudohyphal, and hyphal cell forms. These transitions, which have been closely linked to pathogenesis, have been extensively studied, yet our work nearly doubles the number of transcriptional regulators known to influence them. As a second example, nearly a quarter of the knockout strains affected sensitivity to commonly used antifungal drugs; although a few transcriptional regulators have previously been implicated in susceptibility to these drugs, our work indicates many additional mechanisms of sensitivity and resistance. Finally, our results inform how transcriptional networks evolve. Comparison with the existing S. cerevisiae data (supplemented by additional S. cerevisiae experiments reported here) allows the first systematic analysis of phenotypic conservation by orthologous transcriptional regulators over a large evolutionary distance. We find that, despite the many specific wiring changes documented between these species, the general phenotypes of orthologous transcriptional regulator knockouts are largely conserved. 
These observations support the idea that many wiring changes affect the detailed architecture of the circuit, but not its overall output.

    Origin of Co-Expression Patterns in E.coli and S.cerevisiae Emerging from Reverse Engineering Algorithms

    BACKGROUND: The concept of reverse engineering a gene network, i.e., of inferring a genome-wide graph of putative gene-gene interactions from compendia of high-throughput microarray data, has been extensively used in the last few years to deduce/integrate/validate various types of "physical" networks of interactions among genes or gene products. RESULTS: This paper gives a comprehensive overview of which of these networks emerge significantly when reverse engineering large collections of gene expression data for two model organisms, E. coli and S. cerevisiae, without any prior information. For the first organism the pattern of co-expression is shown to reflect in fine detail both the operonal structure of the DNA and the regulatory effects exerted by the gene products when co-participating in a protein complex. For the second organism we find that direct transcriptional control (e.g., transcription factor-binding site interactions) has little statistical significance in comparison to the other regulatory mechanisms (such as co-sharing a protein complex, co-localization on a metabolic pathway or compartment), which are however resolved at a lower level of detail than in E. coli. CONCLUSION: The gene co-expression patterns deduced from compendia of profiling experiments tend to unveil functional categories that are mainly associated with stable bindings rather than transient interactions. The inference power of this systematic analysis is substantially reduced when passing from E. coli to S. cerevisiae. This extensive analysis provides a way to describe the difference in complexity between the two organisms and discusses the critical limitations affecting this type of methodology.

    Integrative network analysis identified key genes and pathways in the progression of hepatitis C virus induced hepatocellular carcinoma

    Background: Incidence of hepatitis C virus (HCV) induced hepatocellular carcinoma (HCC) has been increasing in the United States and Europe during recent years. Although HCV-associated HCC shares many pathological characteristics with other types of HCC, its molecular mechanisms of progression remain elusive. Methods: To investigate the underlying pathology, we developed a systematic approach to identify deregulated biological networks in HCC by integrating gene expression profiles with high-throughput protein-protein interaction data. We examined five stages including normal (control) liver, cirrhotic liver, dysplasia, early HCC and advanced HCC. Results: Among the five consecutive pathological stages, we identified four networks including precancerous networks (Normal-Cirrhosis and Cirrhosis-Dysplasia) and cancerous networks (Dysplasia-Early HCC, Early-Advanced HCC). We found little overlap between precancerous and cancerous networks, in contrast to the substantial overlap within the precancerous networks and within the cancerous networks. We further found that the hub proteins interacted with HCV proteins, suggesting direct intervention in these networks by the virus. The functional annotation of each network demonstrates a high degree of consistency with current knowledge in HCC. By assembling these functions into a module map, we could depict the stepwise biological functions that are deregulated in HCV-induced hepatocarcinogenesis. Additionally, these networks enable us to identify important genes and pathways by developmental stage, such as LCK signalling pathways in cirrhosis, MMP genes and TIMP genes in dysplastic liver, and CDC2-mediated cell cycle signalling in early and advanced HCC. CDC2 (alternative symbol CDK1), a cell cycle regulatory gene, is particularly interesting due to its topological position in temporally deregulated networks. 
Conclusions: Our study uncovers a temporal spectrum of functional deregulation and prioritizes key genes and pathways in the progression of HCV-induced HCC. These findings present a wealth of information for further investigation.

    A systematic, large-scale comparison of transcription factor binding site models

    Background: The modelling of gene regulation is a major challenge in biomedical research. This process is dominated by transcription factors (TFs) and mutations in their binding sites (TFBSs) may cause the misregulation of genes, eventually leading to disease. The consequences of DNA variants on TF binding are modelled in silico using binding matrices, but it remains unclear whether these are capable of accurately representing in vivo binding. In this study, we present a systematic comparison of binding models for 82 human TFs from three freely available sources: JASPAR matrices, HT-SELEX-generated models and matrices derived from protein binding microarrays (PBMs). We determined their ability to detect experimentally verified “real” in vivo TFBSs derived from ENCODE ChIP-seq data. As negative controls we chose random downstream exonic sequences, which are unlikely to harbour TFBSs. All models were assessed by receiver operating characteristics (ROC) analysis. Results: While the area under the curve was low for most of the tested models, with only 47% reaching a score of 0.7 or higher, we noticed strong differences between the various position-specific scoring matrices, with JASPAR and HT-SELEX models showing higher success rates than PBM-derived models. In addition, we found that while TFBS sequences showed a higher degree of conservation than randomly chosen sequences, there was a high variability between individual TFBSs. Conclusions: Our results show that only a few of the matrix-based models used to predict potential TFBSs are able to reliably detect experimentally confirmed TFBSs. We compiled our findings in a freely accessible web application called ePOSSUM (http://mutationtaster.charite.de/ePOSSUM/) which uses a Bayes classifier to assess the impact of genetic alterations on TF binding in user-defined sequences. Additionally, ePOSSUM provides information on the reliability of the prediction using our test set of experimentally confirmed binding sites.
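
    As a rough illustration of the ROC assessment described in this abstract, the Python sketch below computes the area under the ROC curve from matrix scores of positive (confirmed binding) and negative (control) sequences, using the rank-based definition of AUC. The function name and all score values are hypothetical and are not taken from the study. Only the 0.7 cut-off is from the abstract.

```python
import numpy as np


def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive scores higher
    than a randomly chosen negative, with ties counted as one half.
    """
    pos = np.asarray(positive_scores, dtype=float)[:, None]
    neg = np.asarray(negative_scores, dtype=float)[None, :]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both score sets must be non-empty")
    wins = (pos > neg).sum()
    ties = (pos == neg).sum()
    return float((wins + 0.5 * ties) / (pos.size * neg.size))


# Hypothetical matrix scores for one TF model (illustrative values only).
confirmed_site_scores = [0.9, 0.8, 0.4]   # sequences with confirmed binding
control_scores = [0.7, 0.3, 0.2]          # control sequences

auc = roc_auc(confirmed_site_scores, control_scores)
reaches_threshold = auc >= 0.7  # 0.7 is the cut-off quoted in the abstract
print(f"AUC = {auc:.3f}, reaches 0.7: {reaches_threshold}")
```

    The pairwise form is quadratic in the number of sequences, which is acceptable for a small illustration. A sort-based computation would be preferable for large test sets.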

    Cross species comparison of C/EBPα and PPARγ profiles in mouse and human adipocytes reveals interdependent retention of binding sites

    Background: The transcription factors peroxisome proliferator activated receptor γ (PPARγ) and CCAAT/enhancer binding protein α (C/EBPα) are key transcriptional regulators of adipocyte differentiation and function. We and others have previously shown that binding sites of these two transcription factors show a high degree of overlap and are associated with the majority of genes upregulated during differentiation of murine 3T3-L1 adipocytes. Results: Here we have mapped all binding sites of C/EBPα and PPARγ in human SGBS adipocytes and compared these with the genome-wide profiles from mouse adipocytes to systematically investigate which biological features correlate with retention of sites in orthologous regions between mouse and human. Despite a limited interspecies retention of binding sites, several biological features make sites more likely to be retained. First, co-binding of PPARγ and C/EBPα in mouse is the most powerful predictor of retention of the corresponding binding sites in human. Second, vicinity to genes highly upregulated during adipogenesis significantly increases retention. Third, the presence of C/EBPα consensus sites correlates with retention of both factors, indicating that C/EBPα facilitates recruitment of PPARγ. Fourth, retention correlates with overall sequence conservation within the binding regions independent of C/EBPα and PPARγ sequence patterns, indicating that other transcription factors work cooperatively with these two key transcription factors. Conclusions: This study provides a comprehensive and systematic analysis of which biological features affect retention of binding sites between human and mouse. Specifically, we show that the binding of C/EBPα and PPARγ in adipocytes has evolved in a highly interdependent manner, indicating significant cooperativity between these two transcription factors.

    Gene Expression Profiling of Liver Cancer Stem Cells by RNA-Sequencing

    Background: Accumulating evidence supports that tumor growth and cancer relapse are driven by cancer stem cells. Our previous work has demonstrated the existence of CD90+ liver cancer stem cells (CSCs) in hepatocellular carcinoma (HCC). Nevertheless, the characteristics of these cells are still poorly understood. In this study, we employed the more sensitive RNA-sequencing (RNA-Seq) approach to compare the gene expression profiles of CD90+ cells sorted from tumor (CD90+ CSCs) with those from parallel non-tumorous liver tissues (CD90+ NTSCs) and to elucidate the roles of putative target genes in hepatocarcinogenesis. Methodology/Principal Findings: CD90+ cells were sorted from tumor and adjacent non-tumorous human liver tissues, respectively, using fluorescence-activated cell sorting. The amplified RNAs of CD90+ cells from 3 HCC patients were subjected to RNA-Seq analysis. A differential gene expression profile was established between CD90+ CSCs and CD90+ NTSCs, validated by quantitative real-time PCR (qRT-PCR) on the same set of amplified RNAs, and further confirmed in an independent cohort of 12 HCC patients. Five hundred genes were differentially expressed (119 up-regulated and 381 down-regulated genes) between CD90+ CSCs and CD90+ NTSCs. Gene ontology analysis indicated that the over-expressed genes in CD90+ CSCs were associated with inflammation, drug resistance and lipid metabolism. Among the differentially expressed genes, glypican-3 (GPC3), a member of the glypican family, was markedly elevated in CD90+ CSCs compared to CD90+ NTSCs. Immunohistochemistry demonstrated that GPC3 was highly expressed in forty-two human liver tumor tissues but absent in adjacent non-tumorous liver tissues. Flow cytometry indicated that GPC3 was highly expressed in liver CD90+ CSCs and mature cancer cells in liver cancer cell lines and human liver tumor tissues. Furthermore, GPC3 expression was positively correlated with the number of CD90+ CSCs in liver tumor tissues. 
Conclusions/Significance: The identified genes, such as GPC3, that are distinctly expressed in liver CD90+ CSCs may be promising gene candidates for HCC therapy without inducing damage to normal liver stem cells. © 2012 Ho et al.

    Genome and Transcriptome Analysis of the Food-Yeast Candida utilis

    The industrially important food-yeast Candida utilis is a Crabtree effect-negative yeast used to produce valuable chemicals and recombinant proteins. In the present study, we conducted whole genome sequencing and phylogenetic analysis of C. utilis, which showed that this yeast diverged long before the formation of the CUG and Saccharomyces/Kluyveromyces clades. In addition, we performed comparative genome and transcriptome analyses using next-generation sequencing, which resulted in the identification of genes important for characteristic phenotypes of C. utilis such as those involved in nitrate assimilation, in addition to the gene encoding the functional hexose transporter. We also found that an antisense transcript of the alcohol dehydrogenase gene, which in silico analysis did not predict to be a functional gene, was transcribed in the stationary phase, suggesting a novel system of repression of ethanol production. These findings should facilitate the development of more sophisticated systems for the production of useful reagents using C. utilis.

    Tinkering Evolution of Post-Transcriptional RNA Regulons: Puf3p in Fungi as an Example

    Genome-wide studies of post-transcriptional mRNA regulation in model organisms indicate a “post-transcriptional RNA regulon” model, in which a set of functionally related genes is regulated by mRNA–binding RNAs or proteins. One well-studied post-transcriptional regulon, that of Puf3p, functions in mitochondrial biogenesis in budding yeast. The evolution of the Puf3p regulon remains unclear because previous studies have shown functional divergence of Puf3p regulon targets among yeast, fruit fly, and humans. By analyzing evolutionary patterns of Puf3p and its targeted genes in forty-two sequenced fungi, we demonstrated that, although the Puf3p regulon is conserved among all of the studied fungi, the dedicated regulation of mitochondrial biogenesis by Puf3p emerged only in the Saccharomycotina clade. Moreover, the evolution of the Puf3p regulon was coupled with evolution of codon usage bias in down-regulating expression of genes that function in mitochondria in yeast species after genome duplication. Our results provide a scenario for how evolution, like a tinker, exploits pre-existing materials of a conserved post-transcriptional regulon to regulate gene expression for novel functional roles.