2,254 research outputs found

    Mode and tempo of gene and genome evolution in plants


    Phytome: a platform for plant comparative genomics

    Phytome is an online comparative genomics resource that can be applied to functional plant genomics, molecular breeding and evolutionary studies. It contains predicted protein sequences, protein family assignments, multiple sequence alignments, phylogenies and functional annotations for proteins from a large, phylogenetically diverse set of plant taxa. Phytome serves as a glue between disparate plant gene databases, both by identifying the evolutionary relationships among orthologous and paralogous protein sequences from different species and by enabling cross-references between different versions of the same gene curated independently by different database groups. The web interface enables sophisticated queries on lineage-specific patterns of gene/protein family proliferation and loss. This rich dataset is serving as a platform for the unification of sequence-anchored comparative maps across taxonomic families of plants. The Phytome web interface can be accessed at the following URL: . Batch homology searches and bulk downloads are available upon free registration.

    MetaPIGA v2.0: maximum likelihood large phylogeny estimation using the metapopulation genetic algorithm and other stochastic heuristics

    Background: The development, in the last decade, of stochastic heuristics implemented in robust software applications has made large phylogeny inference a key step in most comparative studies involving molecular sequences. Still, the choice of phylogeny inference software is often dictated not by the raw performance of the implemented algorithm(s) but by practical issues such as ergonomics and/or the availability of specific functionalities.
    Results: Here, we present MetaPIGA v2.0, a robust implementation of several stochastic heuristics for large phylogeny inference (under maximum likelihood), including a Simulated Annealing algorithm, a classical Genetic Algorithm, and the Metapopulation Genetic Algorithm (metaGA), together with complex substitution models, discrete Gamma rate heterogeneity, and the option to partition data. MetaPIGA v2.0 also implements the Likelihood Ratio Test, the Akaike Information Criterion, and the Bayesian Information Criterion for automated selection of the substitution models that best fit the data. Heuristics and substitution models are highly customizable through manual batch files and command line processing. However, MetaPIGA v2.0 also offers an extensive graphical user interface for setting parameters, generating and running batch files, following run progress, and manipulating result trees. MetaPIGA v2.0 uses standard formats for data sets and trees, is platform independent, runs on 32- and 64-bit systems, and takes advantage of multiprocessor and multicore computers.
    Conclusions: The metaGA resolves the major problem inherent to classical Genetic Algorithms by maintaining high inter-population variation even under strong intra-population selection. Implementing the metaGA together with additional stochastic heuristics in a single software package will allow rigorous optimization of each heuristic as well as a meaningful comparison of performance among these algorithms. MetaPIGA v2.0 offers both extensive customization for the phylogeneticist and an ergonomic interface with functionalities that assist the non-specialist in the sound inference of large phylogenetic trees from nucleotide sequences. MetaPIGA v2.0 and its extensive user manual are freely available to academics at http://www.metapiga.org.
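    The information criteria named in the abstract above can be illustrated with a minimal sketch. It uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of free parameters, n the number of alignment sites, and ln L the maximized log-likelihood. The function names, the model table and all numeric values are hypothetical and chosen only for illustration; this is not MetaPIGA's code or interface.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k ln(n) - 2 ln L (lower is better)."""
    return n_params * math.log(n_sites) - 2 * log_likelihood


def best_model(fits, n_sites, criterion="aic"):
    """Return the name of the model with the lowest criterion value.

    fits maps a model name to (maximized log-likelihood, free parameters).
    """
    def score(item):
        _, (lnl, k) = item
        return aic(lnl, k) if criterion == "aic" else bic(lnl, k, n_sites)

    return min(fits.items(), key=score)[0]


# Hypothetical fits for three nucleotide substitution models.
# The log-likelihoods and parameter counts are made-up illustration values.
fits = {
    "JC69": (-5230.0, 0),
    "HKY85": (-5105.0, 4),
    "GTR": (-5101.5, 8),
}

print(best_model(fits, n_sites=1000, criterion="aic"))
print(best_model(fits, n_sites=1000, criterion="bic"))
```

    With these made-up numbers, the small likelihood gain of the richest model does not offset its four extra parameters, so both criteria prefer the intermediate model.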

    Broad Phylogenomic Sampling and the Sister Lineage of Land Plants

    The tremendous diversity of land plants all descended from a single charophyte green alga that colonized the land somewhere between 430 and 470 million years ago. Six orders of charophyte green algae, in addition to embryophytes, comprise the Streptophyta s.l. Previous studies have focused on reconstructing the phylogeny of organisms tied to this key colonization event, but wildly conflicting results have sparked a contentious debate over which lineage gave rise to land plants. The dominant view has been that ‘stoneworts,’ or Charales, are the sister lineage, but an alternative hypothesis supports the Zygnematales (often referred to as “pond scum”) as the sister lineage. In this paper, we provide a well-supported, 160-nuclear-gene phylogenomic analysis supporting the Zygnematales as the closest living relative to land plants. Our study makes two key contributions to the field: 1) the use of an unbiased method to collect a large set of orthologs from deeply diverging species and 2) the use of these data in determining the sister lineage to land plants. We anticipate that this updated phylogeny will not only hugely impact lesson plans in introductory biology courses, but also provide a solid phylogenetic tree for future green-lineage research, whether it be related to plants or green algae.

    Characterization of Microbial Communities: From Fragments of Genes to Full Genomes


    Workflows for the Large-Scale Assessment of miRNA Evolution: Birth and Death of miRNA Genes in Tunicates

    As described over 20 years ago with the discovery of RNA interference (RNAi), double-stranded RNAs play key roles in regulation and as a line of defense in animal cells. This thesis focuses on metazoan microRNAs (miRNAs). These small non-coding RNAs are distinguished from their small interfering RNA (siRNA) relatives by their tightly controlled, efficient and flexible biogenesis, together with a broader capacity to target multiple mRNAs through imperfect seed base-pairing. As potent regulators, miRNAs are involved in mRNA stability and post-transcriptional regulation, a conserved mechanism used repeatedly in evolution, not only in metazoans but also in plants and unicellular organisms. A comprehensive review of the current animal miRNA model shows that the canonical pathway dominates the extensive literature on miRNAs and has served as a scaffold for understanding the cell's regulatory landscape. The characterization of a diverse set of non-canonical pathways has expanded this view, suggesting a diverse, rich and flexible regulatory landscape for generating mature miRNAs. The production of miRNAs, derived from isolated or clustered transcripts, is an efficient and highly conserved mechanism traced back to animals with high fidelity at the family level. In evolutionary terms, expansions of miRNA families have been associated with increasing morphological and developmental complexity. In particular, the Chordata clade (the ancient cephalochordates, the highly derived and secondarily simplified tunicates, and the well-known vertebrates) represents an interesting scenario in which to study miRNA evolution. Despite clearly conserved miRNAs across these clades, tunicates display massive restructuring events, including the emergence of highly derived miRNAs. As shown in this thesis, biases toward model organisms and vertebrates exist in current animal miRNA annotations, misrepresenting more diverse groups, such as marine invertebrates. 
Current miRNA databases, such as miRBase and Rfam, classify miRNAs under different definitions and hold annotations that are not simple to link. As an alternative, this thesis proposes a method to curate and merge those annotations, making use of miRBase precursor/mature annotations and genomes together with Rfam predicted sequences. This approach generated structural models for shared miRNA families, based on the alignment of their correctly positioned mature sequences as anchors. In this process, the structural curation steps developed here flagged 33 miRNA families from Rfam as questionable. The curated Rfam and miRBase anchored structural alignments provided a rich resource for constructing predictive miRNA profiles, using the corresponding hidden Markov models (HMMs) and covariance models (CMs). Applied directly, however, those models are time-consuming to use, and the user has to run multiple iterations to achieve a genome-wide non-overlapping annotation. To resolve this, the proposed miRNAture pipeline provides an automatic and flexible solution for annotating miRNAs. It combines multiple homology approaches to generate the best candidates, validated at the sequence and structural levels. This increases the achievable sensitivity for annotating canonical miRNAs, and the evaluation against the human annotation shows that clear false positive calls are rare and that additional counterparts lie in retained introns, transcribed lncRNAs or repeat families. Further development of miRNAture could include multiple rules to distinguish non-canonical miRNA families. This thesis describes multiple homology approaches to annotate the genomic information from a non-model chordate: the colonial tunicate Didemnum vexillum. A comprehensive analysis of genome-assembly methods and gene annotation revealed high levels of genetic variance and unexpected levels of DNA degradation. 
Despite those challenges, it was possible to find candidate homeobox and skeletogenesis-related genes. The ncRNA annotation, for its part, included the expected conserved families, and an extensive search for the Rhabdomyosarcoma 2-associated transcript (RMST) lncRNA family traced it back to the divergence of deuterostomes. In addition, a complete study of the annotation thresholds suggested variations for detecting miRNAs, later implemented in the miRNAture tool. This chapter showcases the usual workflow that a comprehensive sequencing, assembly and annotation project should follow, in light of the growing body of research that relies on DNA sequencing. In the last 10 years, the remarkable increase in tunicate sequencing projects has opened access to an expanded miRNA annotation landscape. Accordingly, a comprehensive homology approach annotated the miRNA complement of 28 deuterostome genomes (including the 16 tunicates currently reported) using miRNAture. To obtain proper structural models as input, corrected miRBase structural alignments served as a scaffold for building the corresponding CMs, based on a purpose-built genetic algorithm. This automatic approach selected the set of sequences that composed the alignments, generating 2492 miRNA CMs. Despite the multiple sources and associated heterogeneity of the studied genomes, a clustering approach successfully gathered five groups of similar assemblies and highlighted low-quality assemblies. The overall reduction in families and loci in tunicates is striking: they show on average 374 miRNA loci, in comparison to other clades: Cephalochordata (2119), Vertebrata (3638), Hemichordata (1092) and Echinodermata (2737). The detection of 533 miRNA families at the divergence of tunicates reveals an expanded landscape relative to currently annotated miRNA families. Shared sets of ancestral, chordate, Olfactores and clade-specific miRNAs were uncovered using a phylogenetic conservation criterion. 
Compared to current annotations, the family repertoires were expanded in all cases. Finally, relying on the elements adjacent to annotated miRNAs, this thesis proposes additional syntenic support for clustering miRNA loci. In this way, the structural alignment of miR-1497, originally annotated in three model tunicates, was expanded with clear syntenic support in tunicates.
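    The abstract above mentions the need to reduce hits from many family models to a genome-wide non-overlapping annotation. One common way to do this, sketched below, is a greedy pass that keeps the best-scoring hit and drops any lower-scoring hit overlapping one already kept. This is an illustrative assumption, not miRNAture's documented procedure: the `Hit` record, its fields, the score values and the decision to ignore strand are all hypothetical.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """A hypothetical homology hit; coordinates are 1-based and inclusive."""
    contig: str
    start: int
    end: int
    family: str
    score: float


def overlaps(a: Hit, b: Hit) -> bool:
    """True if both hits lie on the same contig and share at least one base."""
    return a.contig == b.contig and a.start <= b.end and b.start <= a.end


def select_non_overlapping(hits: List[Hit]) -> List[Hit]:
    """Greedily keep the highest-scoring hits that do not overlap each other."""
    kept: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    # Report the surviving hits in genomic order.
    return sorted(kept, key=lambda h: (h.contig, h.start))


# Made-up hits: two candidate families compete for one locus on contig c1.
hits = [
    Hit("c1", 100, 180, "mir-1", 55.0),
    Hit("c1", 150, 230, "mir-2", 40.0),
    Hit("c1", 300, 380, "mir-3", 30.0),
    Hit("c2", 150, 230, "mir-2", 20.0),
]

for hit in select_non_overlapping(hits):
    print(hit.contig, hit.start, hit.end, hit.family)
```

    The greedy rule is simple and deterministic, but it compares scores across different models as if they were on one scale; a real pipeline would likely normalize scores or use E-values, and would treat the two strands separately.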