146 research outputs found

    Exclusion of repetitive DNA elements from gnathostome Hox clusters

    Despite their homology and analogous function, the Hox gene clusters of vertebrates and invertebrates are subject to different constraints on their structural organization. This is demonstrated by a drastically different distribution of repetitive DNA elements in the Hox cluster regions. While gnathostomes have a strong tendency to exclude repetitive DNA elements from the inside of their Hox clusters, no such trend can be detected in the Hox gene clusters of protostomes. Repeats “invade” the gnathostome Hox clusters from the 5′ and 3′ ends while the core of the clusters remains virtually free of repetitive DNA. This invasion appears to be correlated with relaxed constraints associated with gene loss after cluster duplications.

    SynBlast: Assisting the Analysis of Conserved Synteny Information

    Motivation: In recent years more than 20 vertebrate genomes have been sequenced, and the rate at which genomic DNA information becomes available is rapidly accelerating. Gene duplication and gene loss events inherently limit the accuracy of orthology detection based on sequence similarity alone. Fully automated methods for orthology annotation do exist but often fail to identify individual members in cases of large gene families, or to distinguish missing data from traceable gene losses. This situation can be improved in many cases by including conserved synteny information. Results: Here we present the SynBlast pipeline, which is designed to construct and evaluate local synteny information. SynBlast uses the genomic region around a focal reference gene to retrieve candidates for homologous regions from a collection of target genomes and ranks them according to the available evidence for homology. The pipeline is intended as a tool to aid high-quality manual annotation, in particular in those cases where automatic procedures fail. We demonstrate how SynBlast is applied to retrieving orthologous and paralogous clusters using the vertebrate Hox and ParaHox clusters as examples.

    Independent Hox‐cluster duplications in lampreys

    The analysis of the publicly available Hox gene sequences from the sea lamprey Petromyzon marinus provides evidence that the Hox clusters in lampreys and other vertebrate species arose from independent duplications. In particular, our analysis supports the hypothesis that the last common ancestor of agnathans and gnathostomes had only a single Hox cluster which was subsequently duplicated independently in the two lineages

    A Story of Growing Confusion: Genes and Their Regulation

    High-throughput experiments have produced convincing evidence for an extensive contribution of diverse classes of RNAs to the expression of genetic information. Instead of a simple arrangement of mostly protein-coding genes, the human transcriptome features a complex arrangement of overlapping transcripts, many of which do not code for proteins at all, while others “sample” exons from several different “genes”. The complexity of the transcriptome and the prevalence of non-coding transcripts force us to reconsider both the concept of the “gene” itself and our understanding of the mechanisms that regulate “gene expression”.

    'Genes'

    In order to describe a cell at the molecular level, a notion of a “gene” is neither necessary nor helpful. It is sufficient to consider the molecules (i.e., chromosomes, transcripts, proteins) and their interactions to describe cellular processes. The downside of the resulting high resolution is that it becomes very tedious to address features on the organismal and phenotypic levels with a language based on molecular terms. Looking for the missing link between biological disciplines dealing with different levels of biological organization, we suggest returning to the original intent behind the term “gene”. To this end, we propose to investigate whether a useful notion of “gene” can be constructed based on an underlying notion of function, and whether this can serve as the necessary link and embed the various distinct gene concepts of biological (sub)disciplines in a coherent theoretical framework. In reply to the Genon Theory recently put forward by Klaus Scherrer and Jürgen Jost in this journal, we shall discuss a general approach to assess a gene definition that should then be tested for its expressiveness and potential cross-disciplinary relevance.

    Divergence of Conserved Non-Coding Sequences: Rate Estimate and Relative Rate Tests

    In many eukaryotic genomes only a small fraction of the DNA codes for proteins, but the non-protein coding DNA harbors important genetic elements directing the development and the physiology of the organisms, like promoters, enhancers, insulators, and micro-RNA genes. The molecular evolution of these genetic elements is difficult to study because their functional significance is hard to deduce from sequence information alone. Here we propose an approach to the study of the rate of evolution of functional non-coding sequences at a macro-evolutionary scale. We identify functionally important non-coding sequences as Conserved Non-Coding Nucleotide (CNCN) sequences from the comparison of two outgroup species. The CNCN sequences so identified are then compared to their homologous sequences in a pair of ingroup species, and we monitor the degree of modification these sequences suffered in the two ingroup lineages. We propose a method to test for rate differences in the modification of CNCN sequences among the two ingroup lineages, as well as a method to estimate their rate of modification. We apply this method to the full sequences of the HoxA clusters from six gnathostome species: a shark, Heterodontus francisci; a basal ray finned fish, Polypterus senegalus; the amphibian, Xenopus tropicalis; as well as three mammalian species, human, rat and mouse. The results show that the evolutionary rate of CNCN sequences is not distinguishable among the three mammalian lineages, while the Xenopus lineage has a significantly increased rate of evolution. Furthermore the estimates of the rate parameters suggest that in the stem lineage of mammals the rate of CNCN sequence evolution was more than twice the rate observed within the placental amniotes clade, suggesting a high rate of evolution of cis-regulatory elements during the origin of amniotes and mammals. We conclude that the proposed methods can be used for testing hypotheses about the rate and pattern of evolution of putative cis-regulatory elements.
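
The idea of testing for a rate difference between two ingroup lineages can be illustrated with a minimal sketch. It is not the statistic from the abstract above, which does not spell out its test. It is a generic exact binomial test, swapped in for illustration: positions changed in only one ingroup lineage are counted, and under equal rates each such change is assumed equally likely to fall in either lineage. The function names and the assumption of gap-free, pre-aligned, equal-length sequences are hypothetical.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n trials with p = 0.5."""
    if n == 0:
        return 1.0
    probs = [comb(n, i) / 2**n for i in range(n + 1)]
    observed = probs[k]
    # Sum the probabilities of all outcomes at least as unlikely as the observed one.
    return min(1.0, sum(p for p in probs if p <= observed * (1 + 1e-12)))

def relative_rate_test(conserved, ingroup_a, ingroup_b):
    """Count sites modified in exactly one ingroup lineage and test for asymmetry.

    `conserved` stands for the sequence conserved between the outgroups
    (an assumption of this sketch); all three strings must be aligned.
    """
    only_a = only_b = 0
    for c, a, b in zip(conserved, ingroup_a, ingroup_b):
        if a != c and b == c:
            only_a += 1
        elif b != c and a == c:
            only_b += 1
    return only_a, only_b, binomial_two_sided_p(only_a, only_a + only_b)
```

Sites changed in both lineages, or in neither, carry no information about the asymmetry and are ignored in this sketch.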

    The Footprint Sorting Problem

    Phylogenetic footprints are short pieces of noncoding DNA sequence in the vicinity of a gene that are conserved between evolutionarily distant species. A seemingly simple problem is to sort footprints in their order along the genomes. It is complicated by the fact that not all footprints are collinear: they may cross each other. The problem thus becomes the identification of the crossing footprints, the sorting of the remaining collinear cliques, and finally the insertion of the noncollinear ones at “reasonable” positions. We show that solving the footprint sorting problem requires the solution of the “Minimum Weight Vertex Feedback Set Problem”, which is known to be NP-complete and APX-hard. Nevertheless, good approximations can be obtained for data sets of interest. The remaining steps of the sorting process are straightforward: computation of the transitive closure of an acyclic graph, linear extension of the resulting partial order, and finally sorting w.r.t. the linear extension. Alternatively, the footprint sorting problem can be rephrased as a combinatorial optimization problem for which approximate solutions can be obtained by means of general purpose heuristics. Footprint sortings obtained with different methods can be compared using a version of multiple sequence alignment that allows the identification of unambiguously ordered sublists. As an application we show that the rat has a slightly increased insertion/deletion rate in comparison to the mouse genome.
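
The sorting steps described above can be sketched roughly as follows. This is an illustrative simplification, not the authors' algorithm: crossing footprints are removed with a naive greedy heuristic (drop the lowest-weight footprint on each detected cycle), which gives no optimality guarantee for the feedback vertex set, and the final reinsertion of removed footprints is omitted. The function names, the input format (one ordered list of footprint labels per genome), and the `weight` mapping are assumptions.

```python
from graphlib import TopologicalSorter, CycleError

def precedence_edges(orders):
    """Collect pairs (a, b) meaning footprint a precedes b in some genome."""
    edges = set()
    for order in orders:
        for i, a in enumerate(order):
            for b in order[i + 1:]:
                edges.add((a, b))
    return edges

def linear_extension(nodes, edges):
    """Return a topological order; raises CycleError if the edges contain a cycle."""
    ts = TopologicalSorter({n: set() for n in nodes})
    for a, b in edges:
        ts.add(b, a)  # a is a predecessor of b, so a is emitted first
    return list(ts.static_order())

def sort_footprints(orders, weight):
    """Greedily drop low-weight crossing footprints, then sort the rest."""
    nodes = {f for order in orders for f in order}
    edges = precedence_edges(orders)
    removed = []
    while True:
        try:
            return linear_extension(nodes, edges), removed
        except CycleError as err:
            cycle = err.args[1]  # list of nodes forming the detected cycle
            victim = min(set(cycle), key=lambda f: (weight[f], f))
            removed.append(victim)
            nodes.discard(victim)
            edges = {(a, b) for a, b in edges if victim not in (a, b)}
```

Because every pair within a genome contributes an edge, the relation is already transitively closed per genome, so the linear extension comes directly from the topological sort.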

    Rate variations, phylogenetics, and partial orders

    The systematic assessment of rate variations across large datasets requires a systematic approach for summarizing results from individual tests. Often, this is performed by coarse-graining the phylogeny to consider rate variations at the level of subclades. In a phylogeographic setting, however, one is often more interested in other partitions of the data, and in an exploratory mode a pre-specified subdivision of the data is often undesirable. We propose here to arrange rate variation data as the partially ordered set defined by the significant test results.
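
One way to picture a partially ordered set defined by significant test results is the sketch below. It assumes each significant result is recorded as a pair `(slower, faster)` and that the results are mutually consistent; the abstract does not say how inconsistent results are handled, so this sketch simply rejects them. The function names and the pair encoding are hypothetical.

```python
def transitive_closure(pairs):
    """Extend a set of (slower, faster) pairs to its transitive closure."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def hasse_edges(pairs):
    """Return the cover relations (Hasse diagram edges) of the induced order."""
    closure = transitive_closure(pairs)
    if any(a == b for a, b in closure):
        raise ValueError("significant results are cyclic; no partial order")
    elements = {x for pair in closure for x in pair}
    # Keep (a, b) only if no element m sits strictly between a and b.
    return sorted(
        (a, b)
        for a, b in closure
        if not any((a, m) in closure and (m, b) in closure for m in elements)
    )
```

For example, with results A < B, B < C, A < C and A < D, the relation A < C is implied by the other two and does not appear as an edge of the diagram.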