312 research outputs found

    Pervasive Phylogenomic Incongruence Underlies Evolutionary Relationships in Eyebrights (Euphrasia, Orobanchaceae)

    Disentangling the phylogenetic relationships of taxonomically complex plant groups is often mired by challenges associated with recent speciation, hybridization, complex mating systems, and polyploidy. Here, we perform a phylogenomic analysis of eyebrights (Euphrasia), a group renowned for taxonomic complexity, with the aim of documenting the extent of phylogenetic discordance at both deep and shallow phylogenetic scales. We generate whole-genome sequencing data and integrate this with prior genomic data to perform a comprehensive analysis of nuclear genomic, nuclear ribosomal (nrDNA), and complete plastid genomes from 57 individuals representing 36 Euphrasia species. The species tree analysis of 3,454 conserved nuclear scaffolds (46 Mb) reveals that at shallow phylogenetic scales postglacial colonization of North Western Europe occurred in multiple waves from discrete source populations, with most species not being monophyletic, and instead combining genomic variants from across clades. At a deeper phylogenetic scale, the Euphrasia phylogeny is structured by geography and ploidy, and partially by taxonomy. Comparative analyses show Southern Hemisphere tetraploids include a distinct subgenome indicative of independent polyploidy events from Northern Hemisphere taxa. In contrast to the nuclear genome analyses, the plastid genome phylogeny reveals limited geographic structure, while the nrDNA phylogeny is informative of some geographic and taxonomic affinities but more thorough phylogenetic inference is impeded by the retention of ancestral polymorphisms in the polyploids. Overall our results reveal extensive phylogenetic discordance at both deeper and shallower nodes, with broad-scale geographic structure of genomic variation but a lack of definitive taxonomic signal. This suggests that Euphrasia species either have polytopic origins or are maintained by narrow genomic regions in the face of extensive homogenizing gene flow. Moreover, these results suggest genome skimming will not be an effective extended barcode to identify species in groups such as Euphrasia, or many other postglacial species groups.

    Methods for Obtaining and Analyzing Whole Chloroplast Genome Sequences

    During the past decade there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. Currently there are 43 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next five years. Several groups of researchers including ours have been developing new techniques for gathering and analyzing entire plastid genome sequences and details of these developments are summarized in this chapter. The most important recent developments that enhance our ability to generate whole chloroplast genome sequences involve the generation of pure fractions of chloroplast genomes by whole genome amplification using rolling circle amplification, cloning genomes into Fosmid or BAC vectors, and the development of an organellar annotation program (DOGMA). In addition to providing details of these methods, we provide an overview of methods for analyzing complete plastid genome sequences for repeats and gene content, as well as approaches for using gene order and sequence data for phylogeny reconstruction. This explosive increase in the number of sequenced plastid genomes and improved computational tools will provide many insights into the evolution of these genomes and much new data for assessing relationships at deep nodes in plants and other photosynthetic organisms.

    Analysis of 81 Genes From 64 Plastid Genomes Resolves Relationships in Angiosperms and Identifies Genome-Scale Evolutionary Patterns

    Angiosperms are the largest and most successful clade of land plants with >250,000 species distributed in nearly every terrestrial habitat. Many phylogenetic studies have been based on DNA sequences of one to several genes, but, despite decades of intensive efforts, relationships among early diverging lineages and several of the major clades remain either incompletely resolved or weakly supported. We performed phylogenetic analyses of 81 plastid genes in 64 sequenced genomes, including 13 new genomes, to estimate relationships among the major angiosperm clades, and the resulting trees are used to examine the evolution of gene and intron content. Phylogenetic trees from multiple methods, including model-based approaches, provide strong support for the position of Amborella as the earliest diverging lineage of flowering plants, followed by Nymphaeales and Austrobaileyales. The plastid genome trees also provide strong support for a sister relationship between eudicots and monocots, and this group is sister to a clade that includes Chloranthales and magnoliids. Resolution of relationships among the major clades of angiosperms provides the necessary framework for addressing numerous evolutionary questions regarding the rapid diversification of angiosperms. Gene and intron content are highly conserved among the early diverging angiosperms and basal eudicots, but 62 independent gene and intron losses are limited to the more derived monocot and eudicot clades. Moreover, a lineage-specific correlation was detected between rates of nucleotide substitutions, indels, and genomic rearrangements.

    Comparison of next generation sequencing technologies for transcriptome characterization

    Background: We have developed a simulation approach to help determine the optimal mixture of sequencing methods for the most complete and cost-effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results: The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc (http://fgp.huck.psu.edu/NG_Sims/ngsim.pl), an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion: NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. In terms of sequence coverage alone, NG sequencing is a dramatic advance over capillary-based sequencing, but NG sequencing also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms.

    Application of qRT-PCR and RNA-Seq analysis for the identification of housekeeping genes useful for normalization of gene expression values during Striga hermonthica development

    Striga is a root parasitic weed that attacks many of the staple crops in Africa, India and Southeast Asia, inflicting tremendous losses in yield and for which there are few effective control measures. Studies of parasitic plant virulence and host resistance will be greatly facilitated by the recent emergence of genomic resources that include extensive transcriptome sequence datasets spanning all life stages of S. hermonthica. Functional characterization of Striga genes will require detailed analyses of gene expression patterns. Quantitative real-time PCR is a powerful tool for quantifying gene expression, but correct normalization of expression levels requires identification of control genes that have stable expression across tissues and life stages. Since no S. hermonthica housekeeping genes have been established for this purpose, we evaluated the suitability of six candidate housekeeping genes across key life stages of S. hermonthica from seed conditioning to flower initiation using qRT-PCR and high-throughput cDNA sequencing. Based on gene expression analysis by qRT-PCR and RNA-Seq across heterogeneous Striga life stages, we determined that using the combination of three genes, UBQ1, PP2A and TUB1, provides the best normalization for gene expression throughout the parasitic life cycle. The housekeeping genes characterized here provide robust standards that will facilitate powerful descriptions of parasite gene expression patterns.

    Data access for the 1,000 Plants (1KP) project

    © 2014 Matasci et al.; licensee BioMed Central Ltd. The 1,000 plants (1KP) project is an international multi-disciplinary consortium that has generated transcriptome data from over 1,000 plant species, with exemplars for all of the major lineages across the Viridiplantae (green plants) clade. Here, we describe how to access the data used in a phylogenomics analysis of the first 85 species, and how to visualize our gene and species trees. Users can develop computational pipelines to analyse these data, in conjunction with data of their own that they can upload. Computationally estimated protein-protein interactions and biochemical pathways can be visualized at another site. Finally, we comment on our future plans and how they fit within this scalable system for the dissemination, visualization, and analysis of large multi-species data sets.

    Floral gene resources from basal angiosperms for comparative genomics research

    BACKGROUND: The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. RESULTS: Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. CONCLUSION: Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways.

    A physical map for the Amborella trichopoda genome sheds light on the evolution of angiosperm genome structure

    Background: Recent phylogenetic analyses have identified Amborella trichopoda, an understory tree species endemic to the forests of New Caledonia, as sister to a clade including all other known flowering plant species. The Amborella genome is a unique reference for understanding the evolution of angiosperm genomes because it can serve as an outgroup to root comparative analyses. A physical map, BAC end sequences and sample shotgun sequences provide a first view of the 870 Mbp Amborella genome. Results: Analysis of Amborella BAC ends sequenced from each contig suggests that the density of long terminal repeat retrotransposons is negatively correlated with that of protein coding genes. Syntenic, presumably ancestral, gene blocks were identified in comparisons of the Amborella BAC contigs and the sequenced Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa genomes. Parsimony mapping of the loss of synteny corroborates previous analyses suggesting that the rate of structural change has been more rapid on lineages leading to Arabidopsis and Oryza compared with lineages leading to Populus and Vitis. The gamma paleohexaploidy event identified in the Arabidopsis, Populus and Vitis genomes is shown to have occurred after the divergence of all other known angiosperms from the lineage leading to Amborella. Conclusions: When placed in the context of a physical map, BAC end sequences representing just 5.4% of the Amborella genome have facilitated reconstruction of gene blocks that existed in the last common ancestor of all flowering plants. The Amborella genome is an invaluable reference for inferences concerning the ancestral angiosperm and subsequent genome evolution.

    Transcriptome Characterization by RNA-seq Unravels the Mechanisms of Butyrate-Induced Epigenomic Regulation in Bovine Cells

    Short-chain fatty acids (SCFAs), especially butyrate, affect cell differentiation, proliferation, and motility. Butyrate also induces cell cycle arrest and apoptosis through its inhibition of histone deacetylases (HDACs). In addition, butyrate is a potent inducer of histone hyper-acetylation in cells. Therefore, this SCFA provides an excellent in vitro model for studying the epigenomic regulation of gene expression induced by histone acetylation. In this study, we analyzed the differential in vitro expression of genes induced by butyrate in bovine epithelial cells by using deep RNA-sequencing technology (RNA-seq). Between 57,303,693 and 78,933,744 sequence reads were generated per sample. Approximately 11,408 genes were significantly impacted by butyrate, with a false discovery rate (FDR) <0.05. The predominant cellular processes affected by butyrate included cell morphological changes, cell cycle arrest, and apoptosis. Our results provided insight into the transcriptome alterations induced by butyrate, which will undoubtedly facilitate our understanding of the molecular mechanisms underlying butyrate-induced epigenomic regulation in bovine cells.