200 research outputs found

    Weak preservation of local neutral substitution rates across mammalian genomes

    BACKGROUND: The rate at which neutral (non-functional) bases undergo substitution is highly dependent on their location within a genome. However, it is not clear how fast these location-dependent rates change, or to what extent the substitution rate patterns are conserved between lineages. To address this question, which is critical not only for understanding the substitution process but also for evaluating phylogenetic footprinting algorithms, we examine ancestral repeats: a predominantly neutral dataset with a significantly higher genomic density than other datasets commonly used to study substitution rate variation. Using this repeat data, we measure the extent to which orthologous ancestral repeat sequences exhibit similar substitution patterns in separate mammalian lineages, allowing us to ascertain how well local substitution rates have been preserved across species. RESULTS: We calculated substitution rates for each ancestral repeat in each of three independent mammalian lineages (primate – from human/macaque alignments, rodent – from mouse/rat alignments, and laurasiatheria – from dog/cow alignments). We then measured the correlation of local substitution rates among these lineages. Overall we found the correlations between lineages to be statistically significant, but too weak to have much predictive power (r² < 5%). These correlations were found to be primarily driven by regional effects at the scale of several hundred kb or larger. A few repeat classes (e.g. 7SK, Charlie8, and MER121) also exhibited stronger conservation of rate patterns, likely due to the effect of repeat-specific purifying selection. These classes should be excluded when estimating local neutral substitution rates. CONCLUSION: Although local neutral substitution rates have some correlations among mammalian species, these correlations have little predictive power on the scale of individual repeats. This indicates that local substitution rates have changed significantly among the lineages we have studied, and are likely to have changed even more for more diverged lineages. The correlations that do persist are too weak to be responsible for many of the highly conserved elements found by phylogenetic footprinting algorithms, leading us to conclude that such elements must be conserved due to selective forces.
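As a rough illustration of the correlation measurement this abstract describes, the sketch below computes the squared Pearson correlation (r²) between per-repeat substitution rates in two lineages. The function name and the rate values are hypothetical and are not taken from the study; the abstract does not state which correlation statistic the authors used beyond reporting r².

```python
import math

def pearson_r_squared(x, y):
    """Squared Pearson correlation between two equal-length lists of rates."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations (unnormalised).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    r = cov / math.sqrt(var_x * var_y)
    return r * r

# Hypothetical substitution rates for the same six orthologous repeats
# measured in two lineages (illustrative numbers only).
primate_rates = [0.061, 0.058, 0.066, 0.070, 0.055, 0.063]
rodent_rates = [0.160, 0.172, 0.158, 0.165, 0.170, 0.161]

r2 = pearson_r_squared(primate_rates, rodent_rates)
print(f"r^2 = {r2:.3f}")
```

An r² below 0.05, as reported in the abstract, would mean that rates in one lineage explain less than 5% of the variance in rates in the other.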

    WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    BACKGROUND: This work addresses the problem of detecting conserved transcription factor binding sites, and regulatory regions in general, through the analysis of sequences from homologous genes, an approach that is becoming more and more widely used given the ever increasing amount of genomic data available. RESULTS: We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Unlike the most commonly used methods, the approach we present neither needs nor computes an alignment of the sequences investigated, nor does it resort to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation than the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. CONCLUSION: Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performance than other available methods for the same task. We also show examples of how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes.

    Regulatory conservation of protein coding and microRNA genes in vertebrates: lessons from the opossum genome

    BACKGROUND: Being the first noneutherian mammal sequenced, Monodelphis domestica (opossum) offers great potential for enhancing our understanding of the evolutionary processes that take place in mammals. This study focuses on the evolutionary relationships between conservation of noncoding sequences, cis-regulatory elements, and biologic functions of regulated genes in opossum and eight vertebrate species. RESULTS: Analysis of 145 intergenic microRNA and all protein coding genes revealed that the upstream sequences of the former are up to twice as conserved as the latter among mammals, except in the first 500 base pairs, where the conservation is similar. Comparison of promoter conservation in 513 protein coding genes and related transcription factor binding sites (TFBSs) showed that 41% of the known human TFBSs are located in the 6.7% of promoter regions that are conserved between human and opossum. Some core biologic processes exhibited significantly fewer conserved TFBSs in human-opossum comparisons, suggesting greater functional divergence. A new measure of efficiency in multigenome phylogenetic footprinting (base regulatory potential rate [BRPR]) shows that including human-opossum conservation increases specificity in finding human TFBSs. CONCLUSION: Opossum facilitates better estimation of promoter conservation and TFBS turnover among mammals. The fact that substantial TFBS numbers are located in a small proportion of the human-opossum conserved sequences emphasizes the importance of marsupial genomes for phylogenetic footprinting-based motif discovery strategies. The BRPR measure is expected to help select genome combinations for optimal performance of these algorithms. Finally, although the etiology of the microRNA upstream increased conservation remains unknown, it is expected to have strong implications for our understanding of regulation of their expression

    A Caenorhabditis motif compendium for studying transcriptional gene regulation

    BACKGROUND: Controlling gene expression is fundamental to biological complexity. The nematode Caenorhabditis elegans is an important model for studying principles of gene regulation in multi-cellular organisms. A comprehensive parts list of putative regulatory motifs was still missing for this model system. In this study, we compile a set of putative regulatory motifs by combining evidence from conservation and expression data. DESCRIPTION: We present an unbiased comparative approach to a regulatory motif compendium for Caenorhabditis species. This involves the assembly of a new nematode genome, whole genome alignments and assessment of conserved k-mer counts. Candidate motifs are selected from a set of 9,500 randomly picked genes by three different motif discovery strategies. Motif candidates have to pass a conservation enrichment filter. Motif degeneracy and length are optimized. Retained motif descriptions are evaluated by expression data using a non-parametric test, which assesses expression changes due to the presence/absence of individual motifs. Finally, we also provide condition-specific motif ensembles by conditional tree analysis. CONCLUSION: The nematode genomes align surprisingly well despite high neutral substitution rates. Our pipeline delivers motif sets by three alternative strategies. Each set contains fewer than 400 motifs, which are significantly conserved and correlated with 214 out of 270 tested gene expression conditions. This motif compendium is an entry point to comprehensive studies on nematode gene regulation. The website (http://corg.eb.tuebingen.mpg.de/CMC) has extensive query capabilities, supplements this article and supports the experimental list.

    Army Ants Trapped by Their Evolutionary History

    A phylogenetic study of army ants has completely changed our view of their evolutionary history, including the origin of the curious and fatal phenomenon known as circular mill formation.

    Identification of conserved domains in the promoter regions of nitric oxide synthase 2: implications for the species-specific transcription and evolutionary differences

    BACKGROUND: The majority of the genes involved in the inflammatory response are highly conserved in mammals. These genes are not significantly expressed under normal conditions and are mainly regulated at the transcriptional and post-transcriptional level. Transcription from the promoters of these genes is very dependent on NF-κB activation, which integrates the response to diverse extracellular stresses. However, in spite of the high conservation of the pattern of promoter regulation in κB-regulated genes, there is inter-species diversity in some genes. One example is nitric oxide synthase 2 (NOS-2), which exhibits a species-specific pattern of expression in response to infection or pro-inflammatory challenge. RESULTS: We have conducted a comparative genomic analysis of NOS-2 with different bioinformatic approaches. This analysis shows that in the NOS-2 gene promoter the position and the evolutionary divergence of some conserved regions are different in rodents and non-rodent mammals, and in particular in primates. Two not previously described distal regions in rodents that are similar to the unique upstream region responsible for the NF-κB activation of NOS-2 in humans are fragmented and translocated to different locations in the rodent promoters. The rodent sequences moreover lack the functional κB sites and IFN-γ response sites present in the homologous human, rhesus monkey and chimpanzee regions. The absence of κB binding in these regions was confirmed by electrophoretic mobility shift assays. CONCLUSION: The data presented reveal divergence between rodents and other mammals in the location and functionality of conserved regions of the NOS-2 promoter containing NF-κB and IFN-γ response elements.

    V1R promoters are well conserved and exhibit common putative regulatory motifs

    BACKGROUND: The mouse vomeronasal organ (VNO) processes chemosensory information, including pheromone signals that influence reproductive behaviors. The sensory neurons of the VNO express two types of chemosensory receptors, V1R and V2R. There are ~165 V1R genes in the mouse genome that have been classified into ~12 divergent subfamilies. Each sensory neuron of the apical compartment of the VNO transcribes only one of the repertoire of V1R genes. A model for mutually exclusive V1R transcription in these cells has been proposed in which each V1R gene might compete stochastically for a single transcriptional complex. This model predicts that the large repertoire of divergent V1R genes in the mouse genome contains common regulatory elements. In this study, we have characterized V1R promoter regions by comparative genomics and by mapping transcription start sites. RESULTS: We find that transcription is initiated from ~1 kb promoter regions that are well conserved within V1R subfamilies. While cross-subfamily homology is not evident by traditional methods, we developed a heuristic motif-searching tool, LogoAlign, and applied this tool to identify motifs shared within the promoters of all V1R genes. Our motif-searching tool exhibits rapid convergence to a relatively small number of non-redundant solutions (97% convergence). We also find that the best motifs contain significantly more information than those identified in controls, and that these motifs are more likely to be found in the immediate vicinity of transcription start sites than elsewhere in gene blocks. The best motifs occur near transcription start sites of ~90% of all V1R genes and across all of the divergent subfamilies. Therefore, these motifs are candidate binding sites for transcription factors involved in V1R co-regulation. CONCLUSION: Our analyses show that V1R subfamilies have broad and well conserved promoter regions from which transcription is initiated. Results from a new motif-finding algorithm, LogoAlign, designed for this context and more generally for searching large, hierarchical datasets, suggest the existence of common information-rich regulatory motifs that are shared across otherwise divergent V1R subfamilies.
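The abstract compares motifs by how much "information" they contain. One common way to quantify this, assumed here because the abstract does not say how LogoAlign scores motifs, is the Shannon information content used in sequence logos: 2 bits minus the entropy of each aligned column, summed over columns. The sketch below uses hypothetical sites and function names and omits small-sample corrections.

```python
import math
from collections import Counter

def column_information(column):
    """Information content (bits) of one aligned DNA column: 2 - entropy."""
    if any(base not in "ACGT" for base in column):
        raise ValueError("columns may contain only A, C, G, T")
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 2.0 - entropy

def motif_information(sites):
    """Total information content of equal-length aligned sites, in bits."""
    if not sites or len({len(s) for s in sites}) != 1:
        raise ValueError("sites must be non-empty and of equal length")
    # zip(*sites) iterates over alignment columns.
    return sum(column_information(col) for col in zip(*sites))

# Hypothetical aligned sites: a well conserved motif and a weak one.
conserved = ["TGACGT", "TGACGT", "TGACGA", "TGACGT"]
weak = ["TGACGT", "ACCTGA", "GTTAAC", "CAGGTG"]
print(f"conserved: {motif_information(conserved):.2f} bits")
print(f"weak:      {motif_information(weak):.2f} bits")
```

Under this measure a fully conserved column contributes 2 bits and a column with all four bases equally represented contributes none, so a higher total indicates a more specific motif.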

    Divergent Evolution of Human p53 Binding Sites: Cell Cycle Versus Apoptosis

    The p53 tumor suppressor is a sequence-specific pleiotropic transcription factor that coordinates cellular responses to DNA damage and stress, initiating cell-cycle arrest or triggering apoptosis. Although the human p53 binding site sequence (or response element [RE]) is well characterized, some genes have consensus-poor REs that are nevertheless both necessary and sufficient for transactivation by p53. Identification of new functional gene regulatory elements under these conditions is problematic, and evolutionary conservation is often employed. We evaluated the comparative genomics approach for assessing evolutionary conservation of putative binding sites by examining conservation of 83 experimentally validated human p53 REs against mouse, rat, rabbit, and dog genomes and detected pronounced conservation differences among p53 REs and p53-regulated pathways. Bona fide NRF2 (nuclear factor [erythroid-derived 2]-like 2 nuclear factor) and NFκB (nuclear factor of kappa light chain gene enhancer in B cells) binding sites, which direct oxidative stress and innate immunity responses, were used as controls, and both exhibited high interspecific conservation. Surprisingly, the average p53 RE was not significantly more conserved than background genomic sequence, and p53 REs in apoptosis genes as a group showed very little conservation. The common bioinformatics practice of filtering RE predictions by 80% rodent sequence identity would not only give a false positive rate of ∼19%, but miss up to 57% of true p53 REs. Examination of interspecific DNA base substitutions as a function of position in the p53 consensus sequence reveals an unexpected excess of diversity in apoptosis-regulating REs versus cell-cycle controlling REs (rodent comparisons: p < 1.0 e−12). While some p53 REs show relatively high levels of conservation, REs in many genes such as BAX, FAS, PCNA, CASP6, SIVA1, and P53AIP1 show little if any homology to rodent sequences. 
This difference suggests that among mammalian species, evolutionary conservation differs among p53 REs, with some having ancient ancestry and others of more recent origin. Overall our results reveal divergent evolutionary pressure among the binding targets of p53 and emphasize that comparative genomics methods must be used judiciously and tailored to the evolutionary history of the targeted functional regulatory regions
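To make the identity-threshold filtering discussed in this abstract concrete, the following sketch computes percent identity between aligned human and rodent sites and the fraction of known sites that a fixed cutoff would discard. The sequences, function names, and the ungapped equal-length comparison are illustrative assumptions, not the procedure or data used in the study.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned positions with identical bases (gaps never match)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)

def fraction_missed(site_pairs, threshold=80.0):
    """Fraction of validated sites falling below the identity threshold."""
    if not site_pairs:
        raise ValueError("need at least one site pair")
    missed = sum(1 for human, rodent in site_pairs
                 if percent_identity(human, rodent) < threshold)
    return missed / len(site_pairs)

# Hypothetical aligned (human, rodent) response elements.
pairs = [
    ("GAACATGTCC", "GAACATGTCC"),  # 100% identical
    ("GGGCATGTCT", "GGGCATGTTC"),  # 80% identical, passes the cutoff
    ("AGACAAGCCT", "TGTCTAGACA"),  # poorly conserved, filtered out
    ("GAACTTGCCC", "CTAGTTGAAA"),  # poorly conserved, filtered out
]
print(f"missed at 80% identity: {fraction_missed(pairs):.0%}")
```

Applied to a set of experimentally validated sites, `fraction_missed` corresponds to the kind of false-negative rate the abstract reports for an 80% rodent identity filter.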

    Transcriptional Regulation Of MicroRNA Genes And The Regulatory Networks In Which They Participate

    MicroRNAs are short, non-coding RNAs that function as post-transcriptional gene regulators. Although they have been implicated in organismal development as well as a variety of human diseases, there is still surprisingly little known about their transcriptional regulation. Understanding microRNA transcription is very important for determining their regulators as well as the specific role they may play in signaling cascades. This dissertation focused on the comparison of mammalian microRNA promoters and upstream sequences to those of known protein coding genes. It also focused on determining potential regulatory networks that microRNA genes may participate in, particularly those networks involved in the TGFβ / SMAD signaling pathway. The comparison of intergenic microRNA upstream sequences to those of protein coding genes revealed that the former are up to twice as conserved as the latter, except in the first 500 base pairs where the conservation is similar. Further investigation of the upstream sequences by RNA Polymerase II ChIP-chip revealed the transcription start site for 35 primary-microRNA transcripts. The identification of features capable of distinguishing core promoter regions from background sequences using a support vector machine approach revealed that the transcription start sites of primary-microRNA genes share the same sequence features as those of protein coding genes. These results suggest that microRNA genes are transcribed by the same mechanism as protein coding genes. This information allowed us to then identify the regulatory elements of microRNA genes in the same manner as for protein coding genes. Identification of a SMAD family transcription factor binding site upstream of the human let-7d microRNA revealed a feed-forward regulatory circuit involved in epithelial mesenchymal transition. 
This provided the first evidence of a direct link between a growth factor and the expression of a microRNA gene. The understanding of microRNA transcriptional regulation has great public health significance. The ability to understand how these post-transcriptional gene regulators function in cellular networks may provide new molecular targets for cures or therapies to a variety of human diseases