
    Genome-Wide Survey of MicroRNA - Transcription Factor Feed-Forward Regulatory Circuits in Human

    In this work, we describe a computational framework for the genome-wide identification and characterization of mixed transcriptional/post-transcriptional regulatory circuits in humans. We concentrated in particular on feed-forward loops (FFLs), in which a master transcription factor regulates a microRNA and, together with it, a set of joint target protein-coding genes. The circuits were assembled with a two-step procedure. We first constructed separately the transcriptional and post-transcriptional components of the human regulatory network by looking for conserved over-represented motifs in human and mouse promoters and 3'-UTRs. Then, we combined the two subnetworks looking for mixed feed-forward regulatory interactions, finding a total of 638 putative (merged) FFLs. In order to investigate their biological relevance, we filtered these circuits using three selection criteria: (I) GeneOntology enrichment among the joint targets of the FFL, (II) independent computational evidence for the regulatory interactions of the FFL, extracted from external databases, and (III) relevance of the FFL in cancer. Most of the selected FFLs seem to be involved in various aspects of organism development and differentiation. We finally discuss a few of the most interesting cases in detail.
    Comment: 51 pages, 5 figures, 4 tables. Supporting information included. Accepted for publication in Molecular BioSystems.
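The second step of the procedure described above (combining the two subnetworks to find mixed feed-forward loops) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the dictionary-of-sets representation, and the toy identifiers (`TF_A`, `miR-x`, `g1`, ...) are all hypothetical, and the motif-based network construction is assumed to have been done already.

```python
def find_mixed_ffls(tf_to_mirna, tf_to_gene, mirna_to_gene):
    """Return {(tf, mirna): sorted joint targets} for every mixed FFL.

    A mixed FFL exists when a TF regulates a microRNA and both
    regulate at least one common protein-coding gene.
    """
    ffls = {}
    for tf, mirnas in tf_to_mirna.items():
        tf_targets = tf_to_gene.get(tf, set())
        for mirna in mirnas:
            # Joint targets: genes regulated by both the TF and the miRNA.
            joint = tf_targets & mirna_to_gene.get(mirna, set())
            if joint:
                ffls[(tf, mirna)] = sorted(joint)
    return ffls


# Toy input with hypothetical names (not data from the paper).
tf_to_mirna = {"TF_A": {"miR-x"}, "TF_B": {"miR-y"}}
tf_to_gene = {"TF_A": {"g1", "g2", "g3"}, "TF_B": {"g4"}}
mirna_to_gene = {"miR-x": {"g2", "g3", "g5"}, "miR-y": {"g6"}}

ffls = find_mixed_ffls(tf_to_mirna, tf_to_gene, mirna_to_gene)
print(ffls)
```

In this toy input only `TF_A` and `miR-x` share targets, so a single circuit is reported; `TF_B` regulates `miR-y` but the two have no common target gene and therefore do not close a loop.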

    Turning gold into 'junk': transposable elements utilize central proteins of cellular networks

    The numerous discovered cases of domesticated transposable element (TE) proteins led to the recognition that TEs are a significant source of evolutionary innovation. However, much less is known about the reverse process, whether and to what degree the evolution of TEs is influenced by the genome of their hosts. We addressed this issue by searching for cases of incorporation of host genes into the sequence of TEs and examined the systems-level properties of these genes using the Saccharomyces cerevisiae and Drosophila melanogaster genomes. We identified 51 cases where the evolutionary scenario was the incorporation of a host gene fragment into a TE consensus sequence, and we show that both the yeast and fly homologues of the incorporated protein sequences have central positions in the cellular networks. An analysis of selective pressure (Ka/Ks ratio) detected significant selection in 37% of the cases. Recent research on retrovirus-host interactions shows that virus proteins preferentially target hubs of the host interaction networks enabling them to take over the host cell using only a few proteins. We propose that TEs face a similar evolutionary pressure to evolve proteins with high interacting capacities and take some of the necessary protein domains directly from their hosts.

    Genome-wide screening for DNA variants associated with reading and language traits

    This research was funded by: Max Planck Society, the University of St Andrews - Grant Number: 018696, US National Institutes of Health - Grant Number: P50 HD027802, Wellcome Trust - Grant Number: 090532/Z/09/Z, and Medical Research Council Hub Grant - Grant Number: G0900747 91070.
    Reading and language abilities are heritable traits that are likely to share some genetic influences with each other. To identify pleiotropic genetic variants affecting these traits, we first performed a genome-wide association scan (GWAS) meta-analysis using three richly characterized datasets comprising individuals with histories of reading or language problems, and their siblings. GWAS was performed in a total of 1862 participants using the first principal component computed from several quantitative measures of reading- and language-related abilities, both before and after adjustment for performance IQ. We identified novel suggestive associations at the SNPs rs59197085 and rs5995177 (uncorrected P ≈ 10⁻⁷ for each SNP), located respectively at the CCDC136/FLNC and RBFOX2 genes. Each of these SNPs then showed evidence for effects across multiple reading and language traits in univariate association testing against the individual traits. FLNC encodes a structural protein involved in cytoskeleton remodelling, while RBFOX2 is an important regulator of alternative splicing in neurons. The CCDC136/FLNC locus showed association with a comparable reading/language measure in an independent sample of 6434 participants from the general population, although involving distinct alleles of the associated SNP. Our datasets will form an important part of on-going international efforts to identify genes contributing to reading and language skills.
    Publisher PDF. Peer reviewed.

    Extensive Copy-Number Variation of Young Genes across Stickleback Populations

    MM received funding from the Max Planck innovation funds for this project. PGDF was supported by a Marie Curie European Reintegration Grant (proposal nr 270891). CE was supported by German Science Foundation grants (DFG, EI 841/4-1 and EI 841/6-1). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    Integrated Regulatory and Metabolic Networks of the Marine Diatom Phaeodactylum tricornutum Predict the Response to Rising CO2 Levels.

    Diatoms are eukaryotic microalgae that are responsible for up to 40% of the ocean's primary productivity. How diatoms respond to environmental perturbations such as elevated carbon concentrations in the atmosphere is currently poorly understood. We developed a transcriptional regulatory network based on various transcriptome sequencing expression libraries for different environmental responses to gain insight into the marine diatom's metabolic and regulatory interactions and provide a comprehensive framework of responses to increasing atmospheric carbon levels. This transcriptional regulatory network was integrated with a recently published genome-scale metabolic model of Phaeodactylum tricornutum to explore the connectivity of the regulatory network and shared metabolites. The integrated regulatory and metabolic model revealed highly connected modules within carbon and nitrogen metabolism. P. tricornutum's response to rising carbon levels was analyzed by using the recent genome-scale metabolic model with cross comparison to experimental manipulations of carbon dioxide.
    IMPORTANCE: Using a systems biology approach, we studied the response of the marine diatom Phaeodactylum tricornutum to changing atmospheric carbon concentrations on an ocean-wide scale. By integrating an available genome-scale metabolic model and a newly developed transcriptional regulatory network inferred from transcriptome sequencing expression data, we demonstrate that carbon metabolism and nitrogen metabolism are strongly connected and the genes involved are coregulated in this model diatom. These tight regulatory constraints could play a major role during the adaptation of P. tricornutum to increasing carbon levels. The transcriptional regulatory network developed can be further used to study the effects of different environmental perturbations on P. tricornutum's metabolism.

    Paleo-Balkan and Slavic Contributions to the Genetic Pool of Moldavians

    Moldova has a rich historical and cultural heritage, which may be reflected in the current genetic makeup of its population. To date, no comprehensive studies exist about the population genetic structure of modern Moldavians. To bridge this gap with respect to paternal lineages, we analyzed 37 binary and 17 multiallelic (STR) polymorphisms on the non-recombining portion of the Y chromosome in 125 Moldavian males. In addition, 53 Ukrainians from eastern Moldova and 54 Romanians from the neighboring eastern Romania were typed using the same set of markers. In Moldavians, 19 Y chromosome haplogroups were identified, the most common being I-M423 (20.8%), R-M17* (17.6%), R-M458 (12.8%), E-v13 (8.8%), R-M269* and R-M412* (both 7.2%). In Romanians, 14 haplogroups were found including I-M423 (40.7%), R-M17* (16.7%), R-M405 (7.4%), E-v13 and R-M412* (both 5.6%). In Ukrainians, 13 haplogroups were identified including R-M17 (34.0%), I-M423 (20.8%), R-M269* (9.4%), N-M178, R-M458 and R-M73 (each 5.7%). Our results show that a significant majority of the Moldavian paternal gene pool belongs to eastern/central European and Balkan/eastern Mediterranean Y lineages. Phylogenetic and AMOVA analyses based on Y-STR loci also revealed that Moldavians are close to both eastern/central European and Balkan-Carpathian populations. The data correlate well with historical accounts and the geographical location of the region and thus allow us to hypothesize that extant Moldavian paternal genetic lineages arose from extensive recent admixture between genetically autochthonous populations of the Balkan-Carpathian zone and neighboring Slavic groups.

    High resolution mapping of Twist to DNA in Drosophila embryos: Efficient functional analysis and evolutionary conservation

    Cis-regulatory modules (CRMs) function by binding sequence specific transcription factors, but the relationship between in vivo physical binding and the regulatory capacity of factor-bound DNA elements remains uncertain. We investigate this relationship for the well-studied Twist factor in Drosophila melanogaster embryos by analyzing genome-wide factor occupancy and testing the functional significance of Twist occupied regions and motifs within regions. Twist ChIP-seq data efficiently identified previously studied Twist-dependent CRMs and robustly predicted new CRM activity in transgenesis, with newly identified Twist-occupied regions supporting diverse spatiotemporal patterns (>74% positive, n = 31). Some, but not all, candidate CRMs require Twist for proper expression in the embryo. The Twist motifs most favored in genome ChIP data (in vivo) differed from those most favored by Systematic Evolution of Ligands by EXponential enrichment (SELEX) (in vitro). Furthermore, the majority of ChIP-seq signals could be parsimoniously explained by a CABVTG motif located within 50 bp of the ChIP summit and, of these, CACATG was most prevalent. Mutagenesis experiments demonstrated that different Twist E-box motif types are not fully interchangeable, suggesting that the ChIP-derived consensus (CABVTG) includes sites having distinct regulatory outputs. Further analysis of position, frequency of occurrence, and sequence conservation revealed significant enrichment and conservation of CABVTG E-box motifs near Twist ChIP-seq signal summits, preferential conservation of ±150 bp surrounding Twist occupied summits, and enrichment of GA- and CA-repeat sequences near Twist occupied summits. Our results show that high resolution in vivo occupancy data can be used to drive efficient discovery and dissection of global and local cis-regulatory logic
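To make the CABVTG consensus concrete, here is a minimal sketch of scanning a sequence for that motif near a ChIP summit. It is an illustration only, not the authors' analysis: the function name, the 0-based coordinates, and the reading of "within 50 bp" as "the whole 6-mer lies inside summit ± 50" are assumptions. The IUPAC codes themselves are standard (B = C, G or T; V = A, C or G).

```python
import re

# IUPAC expansion of CABVTG: B = C/G/T (not A), V = A/C/G (not T).
# The lookahead lets overlapping matches be reported (e.g. CACATGTG).
CABVTG = re.compile(r"(?=(CA[CGT][ACG]TG))")


def motifs_near_summit(seq, summit, window=50):
    """Return (start, motif) pairs for CABVTG matches lying entirely
    within `window` bp of `summit` (0-based position; an assumed
    convention for this sketch)."""
    lo = max(0, summit - window)
    hi = min(len(seq), summit + window + 1)
    # pos/endpos restrict the scan to seq[lo:hi] without shifting coordinates.
    return [(m.start(1), m.group(1)) for m in CABVTG.finditer(seq, lo, hi)]


# Toy sequence (not real data): one motif near the summit, one far away.
seq = "T" * 60 + "CACATG" + "T" * 60 + "CAGCTG" + "T" * 10
hits = motifs_near_summit(seq, summit=62)
print(hits)
```

Because the reverse complement of CABVTG is again CABVTG at the IUPAC level, scanning one strand is enough to find every site that matches the consensus on either strand.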

    Islands of linkage in an ocean of pervasive recombination reveals two-speed evolution of human cytomegalovirus genomes

    Human cytomegalovirus (HCMV) infects most of the population worldwide, persisting throughout the host's life in a latent state with periodic episodes of reactivation. While typically asymptomatic, HCMV can cause fatal disease among congenitally infected infants and immunocompromised patients. These clinical issues are compounded by the emergence of antiviral resistance and the absence of an effective vaccine, the development of which is likely complicated by the numerous immune evasins encoded by HCMV to counter the host's adaptive immune responses, a feature that facilitates frequent super-infections. Understanding the evolutionary dynamics of HCMV is essential for the development of effective new drugs and vaccines. By comparing viral genomes from uncultivated or low-passaged clinical samples of diverse origins, we observe evidence of frequent homologous recombination events, both recent and ancient, and no structure of HCMV genetic diversity at the whole-genome scale. Analysis of individual gene-scale loci reveals a striking dichotomy: while most of the genome is highly conserved, recombines essentially freely and has evolved under purifying selection, 21 genes display extreme diversity, structured into distinct genotypes that do not recombine with each other. Most of these hyper-variable genes encode glycoproteins involved in cell entry or escape of host immunity. Evidence that half of them have diverged through episodes of intense positive selection suggests that rapid evolution of hyper-variable loci is likely driven by interactions with host immunity. It appears that this process is enabled by recombination unlinking hyper-variable loci from strongly constrained neighboring sites. It is conceivable that viral mechanisms facilitating super-infection have evolved to promote recombination between diverged genotypes, allowing the virus to continuously diversify at key loci to escape immune detection, while maintaining a genome optimally adapted to its asymptomatic infectious lifecycle.