
222 research outputs found

Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels

Author: dePamphilis Claude W
Duarte Jill M
Edger Patrick P
Landherr Lena L
Leebens-Mack Jim
Ma Hong
Pires J Chris
Wall P Kerr
Publication venue: BioMed Central
Publication date: 01/01/2010

Abstract Background Although the overwhelming majority of genes found in angiosperms are members of gene families, and both gene- and genome-duplication are pervasive forces in plant genomes, some genes are sufficiently distinct from all other genes in a genome that they can be operationally defined as 'single copy'. Using the gene clustering algorithm MCL-tribe, we have identified a set of 959 single copy genes that are shared single copy genes in the genomes of Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa. To characterize these genes, we have performed a number of analyses examining GO annotations, coding sequence length, number of exons, number of domains, presence in distant lineages, such as Selaginella and Physcomitrella, and phylogenetic analysis to estimate copy number in other seed plants and to demonstrate their phylogenetic utility. We then provide examples of how these genes may be used in phylogenetic analyses to reconstruct organismal history, both by using extant coverage in EST databases for seed plants and de novo amplification via RT-PCR in the family Brassicaceae. Results There are 959 single copy nuclear genes shared in Arabidopsis, Populus, Vitis and Oryza ["APVO SSC genes"]. The majority of these genes are also present in the Selaginella and Physcomitrella genomes. Public EST sets for 197 species suggest that most of these genes are present across a diverse collection of seed plants, and appear to exist as single or very low copy genes, though exceptions are seen in recently polyploid taxa and in lineages where there is significant evidence for a shared large-scale duplication event. Genes encoding proteins localized in organelles are more commonly single copy than expected by chance, but the evolutionary forces responsible for this bias are unknown.
Regardless of the evolutionary mechanisms responsible for the large number of shared single copy genes in diverse flowering plant lineages, these genes are valuable for phylogenetic and comparative analyses. Eighteen of the APVO SSC single copy genes were amplified in the Brassicaceae using RT-PCR and directly sequenced. Alignments of these sequences provide improved resolution of Brassicaceae phylogeny compared to recent studies using plastid and ITS sequences. An analysis of sequences from 13 APVO SSC genes from 69 species of seed plants, derived mainly from public EST databases, yielded a phylogeny that was largely congruent with prior hypotheses based on multiple plastid sequences. Whereas single gene phylogenies that rely on EST sequences have limited bootstrap support as the result of limited sequence information, concatenated alignments result in phylogenetic trees with strong bootstrap support for already established relationships. Overall, these single copy nuclear genes are promising markers for phylogenetics, and contain a greater proportion of phylogenetically-informative sites than commonly used protein-coding sequences from the plastid or mitochondrial genomes. Conclusions Putatively orthologous, shared single copy nuclear genes provide a vast source of new evidence for plant phylogenetics, genome mapping, and other applications, as well as a substantial class of genes for which functional characterization is needed. Preliminary evidence indicates that many of the shared single copy nuclear genes identified in this study may be well suited as markers for addressing phylogenetic hypotheses at a variety of taxonomic levels.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central


Complete chloroplast genome sequences of Drimys, Liriodendron, and Piper: Implications for the phylogeny of magnoliids and the evolution of GC content

Author: Boore J.L.
Carlson J.
dePamphilis C.W.
Jansen R.K.
Kuehl J.V.
Leebens-Mack J.
Penaflor C.
Zhengqiu C.
Publication venue: COLLABORATION - U.Texas
Publication date: 01/06/2006

The magnoliids represent the largest basal angiosperm clade with four orders, 19 families and 8,500 species. Although several recent angiosperm molecular phylogenies have supported the monophyly of magnoliids and suggested relationships among the orders, the limited number of genes examined resulted in only weak support, and these issues remain controversial. Furthermore, considerable incongruence has resulted in phylogenies supporting three different sets of relationships among magnoliids and the two large angiosperm clades, monocots and eudicots. This is one of the most important remaining issues concerning relationships among basal angiosperms. We sequenced the chloroplast genomes of three magnoliids, Drimys (Canellales), Liriodendron (Magnoliales), and Piper (Piperales), and used these data in combination with 32 other completed angiosperm chloroplast genomes to assess phylogenetic relationships among magnoliids. The Drimys and Piper chloroplast genomes are nearly identical in size at 160,606 and 160,624 bp, respectively. The genomes include a pair of inverted repeats of 26,649 bp (Drimys) and 27,039 (Piper), separated by a small single copy region of 18,621 (Drimys) and 18,878 (Piper) and a large single copy region of 88,685 bp (Drimys) and 87,666 bp (Piper). The gene order of both taxa is nearly identical to many other unrearranged angiosperm chloroplast genomes, including Calycanthus, the other published magnoliid genome. Comparisons of angiosperm chloroplast genomes indicate that GC content is not uniformly distributed across the genome. Overall GC content ranges from 34-39%, and coding regions have a substantially higher GC content than non-coding regions (both intergenic spacers and introns). Among protein-coding genes, GC content varies by codon position with 1st codon > 2nd codon > 3rd codon, and it varies by functional group with photosynthetic genes having the highest percentage and NADH genes the lowest. 
Across the genome, GC content is highest in the inverted repeat due to the presence of rRNA genes and lowest in the small single copy region where most NADH genes are located. Phylogenetic analyses using maximum parsimony and maximum likelihood methods were performed on DNA sequences of 61 protein-coding genes. Trees from both analyses provided strong support for the monophyly of magnoliids and two strongly supported groups were identified, the Canellales/Piperales and the Laurales/Magnoliales. The phylogenies also provided moderate to strong support for the basal position of Amborella, and a sister relationship of magnoliids to a clade that includes monocots and eudicots. The complete sequences of three magnoliid chloroplast genomes provide new data from the largest basal angiosperm clade. Evolutionary comparisons of these new genome sequences, combined with other published angiosperm genomes, confirm that GC content is unevenly distributed across the genome by location, codon position, and functional group. Furthermore, phylogenetic analyses provide the strongest support so far for the hypothesis that the magnoliids are sister to a large clade that includes both monocots and eudicots.
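
The abstract above reports that GC content differs by codon position. As a rough illustration of how such a per-position figure might be computed, here is a minimal Python sketch. The function name `gc_by_codon_position` and the toy sequence are assumptions made for illustration, not taken from the paper, and the sketch assumes an in-frame coding sequence with no ambiguity codes.

```python
def gc_by_codon_position(cds: str) -> list[float]:
    """Return the GC fraction at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence; any trailing
    bases that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete final codon
    fractions = []
    for offset in range(3):
        bases = cds[offset:usable:3]  # every third base from this position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return fractions


# Toy example (hypothetical sequence): codons GCA, GCT, GGA
print(gc_by_codon_position("GCAGCTGGA"))
```

For real genomes the same calculation would be run over each annotated protein-coding gene and then summarized by functional group.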

UNT Digital Library

PhylomeDB: a database for genome-wide collections of gene phylogenies

Author: A. Bueno
Birney
Comas
Duret
Edgar
Gabaldon
Gascuel
Guindon
Huerta-Cepas
J. Dopazo
J. Huerta-Cepas
Leebens-Mack
Li
Ronquist
Sicheritz-Ponten
Smith
T. Gabaldon
Publication venue: Oxford University Press

The complete collection of evolutionary histories of all genes in a genome, also known as phylome, constitutes a valuable source of information. The reconstruction of phylomes has been previously prevented by large demands of time and computer power, but is now feasible thanks to recent developments in computers and algorithms. To provide a publicly available repository of complete phylomes that allows researchers to access and store large-scale phylogenomic analyses, we have developed PhylomeDB. PhylomeDB is a database of complete phylomes derived for different genomes within a specific taxonomic range. All phylomes in the database are built using a high-quality phylogenetic pipeline that includes evolutionary model testing and alignment trimming phases. For each genome, PhylomeDB provides the alignments, phylogenetic trees and tree-based orthology predictions for every single encoded protein. The current version of PhylomeDB includes the phylomes of Human, the yeast Saccharomyces cerevisiae and the bacterium Escherichia coli, comprising a total of 32 289 seed sequences with their corresponding alignments and 172 324 phylogenetic trees. PhylomeDB can be publicly accessed at http://phylomedb.bioinfo.cipf.e

Crossref

PubMed Central

TreeSnatcher plus: capturing phylogenetic trees from images

Author: Arndt von Haeseler
J Hughes
J Leebens-Mack
JM Wicherts
Martin J Lercher
T Laubach
Thomas Laubach
TY Zhang
W Burger
Publication venue: Springer Science and Business Media LLC

Crossref


Access to RNA-sequencing data from 1,173 plant species: The 1000 Plant transcriptomes initiative (1KP)

Author: Ayyampalayam S.
Barker M.
Barkmann T.
Bowler C.
Carpenter E.
Dorrell R.
Du W.
Gitzendanner M.
Jimenez Vieira F.
Leebens-Mack J.
Li L.
Matasci N.
Sun J.
Ullrich K.
Wickett N.
Wong G.
Wu S.
Yu J.
Publication venue: Oxford University Press (OUP)
Publication date: 01/10/2019

The 1000 Plant transcriptomes initiative (1KP) explored genetic diversity by sequencing RNA from 1,342 samples representing 1,173 species of green plants (Viridiplantae). This data release accompanies the initiative's final/capstone publication on a set of 3 analyses inferring species trees, whole genome duplications, and gene family expansions. These and previous analyses are based on de novo transcriptome assemblies and related gene predictions. Here, we assess their data and assembly qualities and explain how we detected potential contaminations. These data will be useful to plant and/or evolutionary scientists with interests in particular gene families, either across the green plant tree of life or in more focused lineages.

The University of Arizona

MPG.PuRe

phyloXML: XML for evolutionary biology and comparative genomics

Author: Christian M Zmasek
CM Zmasek
DR Maddison
E Antezana
J Felsenstein
J Leebens-Mack
JA Eisen
JC Avise
JE Stajich
Mira V Han
MW Peterson
N Cannata
N Goto
PJ Cock
Q Zhang
R Gilmour
T Bray
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract Background Evolutionary trees are central to a wide range of biological studies. In many of these studies, tree nodes and branches need to be associated (or annotated) with various attributes. For example, in studies concerned with organismal relationships, tree nodes are associated with taxonomic names, whereas tree branches have lengths and oftentimes support values. Gene trees used in comparative genomics or phylogenomics are usually annotated with taxonomic information, genome-related data, such as gene names and functional annotations, as well as events such as gene duplications, speciations, or exon shufflings, combined with information related to the evolutionary tree itself. The data standards currently used for evolutionary trees have limited capacities to incorporate such annotations of different data types. Results We developed an XML language, named phyloXML, for describing evolutionary trees, as well as various associated data items. PhyloXML provides elements for commonly used items, such as branch lengths, support values, taxonomic names, and gene names and identifiers. By using "property" elements, phyloXML can be adapted to novel and unforeseen use cases. We also developed various software tools for reading, writing, conversion, and visualization of phyloXML formatted data. Conclusion PhyloXML is an XML language defined by a complete schema in XSD that allows storing and exchanging the structures of evolutionary trees as well as associated data. More information about phyloXML itself, the XSD schema, as well as tools implementing and supporting phyloXML, is available at http://www.phyloxml.org.
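
To give a feel for the kind of annotated tree the abstract describes, the following Python sketch writes a small phyloXML-style document using only the standard library. The helper names `make_clade` and `build_example`, the gene names, and all branch lengths and support values are made up for illustration. The output has not been validated against the XSD schema, so treat it as an approximation of the format rather than a reference document.

```python
import xml.etree.ElementTree as ET

PHYLOXML_NS = "http://www.phyloxml.org"


def make_clade(parent, name=None, branch_length=None, bootstrap=None, species=None):
    """Append a <clade> to `parent` with optional annotations (hypothetical helper)."""
    clade = ET.SubElement(parent, "clade")
    if name is not None:
        ET.SubElement(clade, "name").text = name
    if branch_length is not None:
        ET.SubElement(clade, "branch_length").text = str(branch_length)
    if bootstrap is not None:
        confidence = ET.SubElement(clade, "confidence", type="bootstrap")
        confidence.text = str(bootstrap)
    if species is not None:
        taxonomy = ET.SubElement(clade, "taxonomy")
        ET.SubElement(taxonomy, "scientific_name").text = species
    return clade


def build_example():
    """Build a toy three-leaf gene tree; all values are invented."""
    root = ET.Element("phyloxml", xmlns=PHYLOXML_NS)
    phylogeny = ET.SubElement(root, "phylogeny", rooted="true")
    ET.SubElement(phylogeny, "name").text = "toy gene tree"
    top = make_clade(phylogeny)
    inner = make_clade(top, branch_length=0.06, bootstrap=89)
    make_clade(inner, name="geneA", branch_length=0.102, species="Arabidopsis thaliana")
    make_clade(inner, name="geneB", branch_length=0.23, species="Populus trichocarpa")
    make_clade(top, name="geneC", branch_length=0.4, species="Oryza sativa")
    return root


xml_text = ET.tostring(build_example(), encoding="unicode")
print(xml_text)
```

Nested `clade` elements carry the tree shape, while child elements such as `branch_length`, `confidence`, and `taxonomy` attach the annotations to the node they sit in.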

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Methods for Obtaining and Analyzing Whole Chloroplast Genome Sequences

Author: Alverson Andrew J.
Boore Jeffrey L.
Chumley Timothy W.
Cui Liying
dePamphilis Claude W.
Fourcade H. Matthew
Haberle Rosemarie C.
Herman Sallie J.
Jansen Robert K.
Kuehl Jennifer V.
Leebens-Mack James
McNeal Joel R.
Peery Riannon
Raubeson Linda A.
Wyman Stacia K.
Publication venue: ScholarWorks@CWU
Publication date: 01/01/2005

During the past decade there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. Currently there are 43 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next five years. Several groups of researchers including ours have been developing new techniques for gathering and analyzing entire plastid genome sequences and details of these developments are summarized in this chapter. The most important recent developments that enhance our ability to generate whole chloroplast genome sequences involve the generation of pure fractions of chloroplast genomes by whole genome amplification using rolling circular amplification, cloning genomes into Fosmid or BAC vectors, and the development of an organellar annotation program (DOGMA). In addition to providing details of these methods, we provide an overview of methods for analyzing complete plastid genome sequences for repeats and gene content, as well as approaches for using gene order and sequence data for phylogeny reconstruction. This explosive increase in the number of sequenced plastid genomes and improved computational tools will provide many insights into the evolution of these genomes and much new data for assessing relationships at deep nodes in plants and other photosynthetic organisms

ScholarWorks at Central Washington University

eScholarship - University of California

Characterization of the basal angiosperm Aristolochia fimbriata: a potential experimental system for genetic studies

Author: Abdelali Barakat
Barbara J Bliss
Christoph Neinhuis
Claude W dePamphilis
Hong Ma
Jim Leebens-Mack
Kathiravetpilla Arumuganathan
Lena Landherr
Norman Wickett
P Kerr Wall
Paula E Ralph
Sandra W Clifton
Saravanaraj Ayyampalayam
Siela N Maximova
Stefan Wanke
Yi Hu
Yuannian Jiao
Publication venue: Springer Nature
Publication date: 01/01/2013

BACKGROUND: Previous studies in basal angiosperms have provided insight into the diversity within the angiosperm lineage and helped to polarize analyses of flowering plant evolution. However, there is still not an experimental system for genetic studies among basal angiosperms to facilitate comparative studies and functional investigation. It would be desirable to identify a basal angiosperm experimental system that possesses many of the features found in existing plant model systems (e.g., Arabidopsis and Oryza). RESULTS: We have considered all basal angiosperm families for general characteristics important for experimental systems, including availability to the scientific community, growth habit, and membership in a large basal angiosperm group that displays a wide spectrum of phenotypic diversity. Most basal angiosperms are woody or aquatic, thus are not well-suited for large scale cultivation, and were excluded. We further investigated members of Aristolochiaceae for ease of culture, life cycle, genome size, and chromosome number. We demonstrated self-compatibility for Aristolochia elegans and A. fimbriata, and transformation with a GFP reporter construct for Saruma henryi and A. fimbriata. Furthermore, A. fimbriata was easily cultivated with a life cycle of just three months, could be regenerated in a tissue culture system, and had one of the smallest genomes among basal angiosperms. An extensive multi-tissue EST dataset was produced for A. fimbriata that includes over 3.8 million 454 sequence reads. CONCLUSIONS: Aristolochia fimbriata has numerous features that facilitate genetic studies and is suggested as a potential model system for use with a wide variety of technologies. Emerging genetic and genomic tools for A. fimbriata and closely related species can aid the investigation of floral biology, developmental genetics, biochemical pathways important in plant-insect interactions as well as human health, and various other features present in early angiosperms

Springer - Publisher Connector

PubMed Central

A universal probe set for targeted sequencing of 353 nuclear genes from any flowering plant designed using k-medoids clustering

Author: Baker William J.
Botigue Laura R.
Cowan Robyn S.
Devault Alison
Dodsworth Steven
Eiserhardt Wolf L.
Epitawalage Niroshini
Forest Felix
Johnson Matthew G.
Kim Jan T.
Leebens-Mack James H.
Leitch Ilia J.
Maurin Olivier
Pokorny Lisa
Soltis Douglas E.
Soltis Pamela S.
Wickett Norman J.
Wong Gane Ka-Shu
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2018

Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes identified by the One Thousand Plant Transcriptomes Initiative to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm group. To maximize the phylogenetic potential of the probes, while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, 5–15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order groups of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set.
Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order groups, including the entire angiosperm clade itself.
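
The abstract does not spell out how the k-medoids step was implemented, so the Python sketch below only illustrates the general idea of picking representative sequences as cluster medoids. Everything specific in it is an assumption: the function names `hamming` and `k_medoids`, the use of Hamming distance on equal-length aligned sequences, the fixed `k`, the naive initialisation, and the toy sequences. The paper additionally searches for the minimum number of representatives per locus, which this sketch does not attempt.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def k_medoids(seqs, k, max_iter=100):
    """Generic alternating k-medoids; returns {medoid index: member indices}.

    A plain textbook-style iteration, not the authors' implementation.
    """
    n = len(seqs)
    dist = [[hamming(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    medoids = list(range(k))  # naive deterministic start (assumption)
    clusters = {}
    for _ in range(max_iter):
        # Assignment step: attach each sequence to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i in range(n):
            nearest = min(medoids, key=lambda m: dist[i][m])
            clusters[nearest].append(i)
        # Update step: the new medoid minimises total distance within its cluster.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # keep the old medoid if its cluster emptied
                new_medoids.append(m)
                continue
            best = min(members, key=lambda c: sum(dist[c][j] for j in members))
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break  # converged: medoids did not change
        medoids = new_medoids
    return clusters


# Toy aligned sequences (hypothetical), forming two obvious groups.
toy_seqs = ["AAAA", "AAAT", "AATT", "GGGG", "GGGC", "GGCC"]
toy_clusters = k_medoids(toy_seqs, k=2)
print({toy_seqs[m]: [toy_seqs[i] for i in members] for m, members in toy_clusters.items()})
```

Unlike k-means, each cluster centre is an actual input sequence, which is what makes medoids usable as probe templates.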

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Shared Research Repository

ZENODO

Dryad Digital Repository (Duke University)

Electronic Archiving System

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Diposit Digital de Documents de la UAB

Digital.CSIC

University of Bedfordshire Repository

Complex scaffold remodeling in plant triterpene biosynthesis

Author: Harkess Alex
Hodgson Hannah
Jimenez Luis
Leebens-Mack James
Liu Chun-Ting
Martin Azahara Carmen
Osbourn Anne
Owen Charlotte
Pena Ricardo De La
Sattely Elizabeth
Stephenson Michael J
Publication venue: American Association for the Advancement of Science (AAAS)
Publication date: 27/01/2023

Triterpenes with complex scaffold modifications are widespread in the plant kingdom. Limonoids are an exemplary family that are responsible for the bitter taste in citrus (e.g., limonin) and the active constituents of neem oil, a widely used bioinsecticide (e.g., azadirachtin). Despite the commercial value of limonoids, a complete biosynthetic route has not been described. We report the discovery of 22 enzymes, including a pair of neofunctionalized sterol isomerases, that catalyze 12 distinct reactions in the total biosynthesis of kihadalactone A and azadirone, products that bear the signature limonoid furan. These results enable access to valuable limonoids and provide a template for discovery and reconstitution of triterpene biosynthetic pathways in plants that require multiple skeletal rearrangements and oxidations

University of East Anglia digital repository