3,874 research outputs found

    Systematic identification of gene families for use as markers for phylogenetic and phylogeny-driven ecological studies of bacteria and archaea and their major subgroups

    With genomic and metagenomic sequence data sets accumulating at an astonishing rate, there are many reasons to constrain the data analyses. One approach to such constrained analyses is to focus on select subsets of gene families that are particularly well suited for the tasks at hand. Such gene families have generally been referred to as marker genes. We are particularly interested in identifying and using such marker genes for phylogenetic and phylogeny-driven ecological studies of microbes and their communities. We therefore refer to these as PhyEco (for phylogenetic and phylogenetic ecology) markers. The dual use of these PhyEco markers means that we needed to develop and apply a set of somewhat novel criteria for identification of the best candidates for such markers. The criteria we focused on included universality across the taxa of interest, ability to be used to produce robust phylogenetic trees that reflect as much as possible the evolution of the species from which the genes come, and low variation in copy number across taxa. We describe here an automated protocol for identifying potential PhyEco markers from a set of complete genome sequences. The protocol combines rapid searching, clustering and phylogenetic tree building algorithms to generate protein families that meet the criteria listed above. We report here the identification of PhyEco markers for different taxonomic levels including 40 for all bacteria and archaea, 114 for all bacteria, and many more for some of the individual phyla of bacteria. This new list of PhyEco markers should allow much more detailed automated phylogenetic and phylogenetic ecology analyses of these groups than previously possible. Comment: 24 pages, 3 figures

    Insights into a dinoflagellate genome through expressed sequence tag analysis

    BACKGROUND: Dinoflagellates are important marine primary producers and grazers and cause toxic "red tides". These taxa are characterized by many unique features such as immense genomes, the absence of nucleosomes, and photosynthetic organelles (plastids) that have been gained and lost multiple times. We generated EST sequences from non-normalized and normalized cDNA libraries from a culture of the toxic species Alexandrium tamarense to elucidate dinoflagellate evolution. Previous analyses of these data have clarified plastid origin; here we study the gene content, annotate the ESTs, and analyze the genes that are putatively involved in DNA packaging. RESULTS: Approximately 20% of the 6,723 unique ESTs (11,171 total 3'-reads) could be annotated using Blast searches against GenBank. Several putative dinoflagellate-specific mRNAs were identified, including one novel plastid protein. Dinoflagellate genes, like those of other eukaryotes, have a high GC-content that is reflected in the amino acid codon usage. Highly represented transcripts include histone-like (HLP) and luciferin binding proteins, and several genes occur in families that encode nearly identical proteins. We also identified rare transcripts encoding a predicted protein highly similar to histone H2A.X. We speculate this histone may be retained for its role in DNA double-strand break repair. CONCLUSION: This is the most extensive collection to date of ESTs from a toxic dinoflagellate. These data will be instrumental to future research to understand the unique and complex cell biology of these organisms and for potentially identifying the genes involved in toxin production.

    Ancient properties of spider silks revealed by the complete gene sequence of the prey-wrapping silk protein (AcSp1).

    Spider silk fibers have impressive mechanical properties and are primarily composed of highly repetitive structural proteins (termed spidroins) encoded by a single gene family. Most characterized spidroin genes are incompletely known because of their extreme size (typically >9 kb) and repetitiveness, limiting understanding of the evolutionary processes that gave rise to their unusual gene architectures. The only complete spidroin genes characterized thus far form the dragline in the Western black widow, Latrodectus hesperus. Here, we describe the first complete gene sequence encoding the aciniform spidroin AcSp1, the primary component of spider prey-wrapping fibers. L. hesperus AcSp1 contains a single enormous (∼19 kb) exon. The AcSp1 repeat sequence is exceptionally conserved between two widow species (∼94% identity) and between widows and distantly related orb-weavers (∼30% identity), consistent with a history of strong purifying selection on its amino acid sequence. Furthermore, the 16 repeats (each 371-375 amino acids long) found in black widow AcSp1 are, on average, >99% identical at the nucleotide level. A combination of stabilizing selection on amino acid sequence, selection on silent sites, and intragenic recombination likely explains the extreme homogenization of AcSp1 repeats. In addition, phylogenetic analyses of spidroin paralogs support a gene duplication event occurring concomitantly with specialization of the aciniform glands and the tubuliform glands, which synthesize egg-case silk. With repeats that are dramatically different in length and amino acid composition from dragline spidroins, our L. hesperus AcSp1 expands the knowledge base for developing silk-based biomimetic technologies.

    Molecular evolution of RRM-containing proteins and glycine-rich RNA-binding proteins in plants

    *Abstract*

*Background:*
In angiosperms, RNA-binding proteins with an RNA recognition motif (RRM)-type RNA interaction domain play an important role in developmental and environmental responses. Despite their pivotal role, a comprehensive analysis of their number and diversity has only been performed in _Arabidopsis_ so far.

*Results:*
Here we present a detailed phylogenetic analysis of RRM-containing proteins in plants, the red alga _Cyanidioschyzon merolae_ and cyanobacteria. We identified two major events during the diversification of the RRM in plants, one at the emergence of green plants, and the other at the water-to-land transition. We focused on proteins that combine a single RRM with a glycine-rich stretch, known as glycine-rich RNA-binding proteins (GRPs). We found that GRPs are present in cyanobacteria; however, plant and cyanobacterial GRPs are not of monophyletic origin. We provide evidence that plant GRPs form a polyphyletic group.
 
*Conclusion:*
Our work provides insights into the origin of GRPs in plants. We determined that the RRMs from plants and cyanobacteria do not have a common origin. We could also determine that the acquisition of the glycine-rich stretch happened on at least three separate occasions during the evolution of GRPs. One event led to the emergence of cyanobacterial GRPs, while later acquisition events led to the emergence of GRPs in the green lineage. No GRPs were found in red or marine green algae. We found a subgroup of GRPs exclusive to land plants, and its appearance may be linked to challenges related to the water-to-land transition.

    A tale of three kingdoms: Members of the Phylum Nematoda independently acquired the detoxifying enzyme cyanase through horizontal gene transfer from plants and bacteria

    Horizontal gene transfer (HGT) has played an important role in the evolution of nematodes. Among candidate genes, cyanase, which is typically found only in plants, bacteria and fungi, is present in more than 35 members of the Phylum Nematoda, but absent from free-living and clade V organisms. Phylogenetic analyses showed that the cyanases of clade I organisms Trichinella spp., Trichuris spp. and Soboliphyme baturini (Subclass: Dorylaimia) represent a well-supported monophyletic clade with plant cyanases. In contrast, all cyanases found within the Subclass Chromadoria, which encompasses filarioids, ascaridoids and strongyloids, are homologous to those of bacteria. Western blots exhibited typical multimeric forms of the native molecule in protein extracts of Trichinella spiralis muscle larvae, and immunohistochemical staining localized the protein to the worm hypodermis and underlying muscle. Recombinant Trichinella cyanase was bioactive, and gene transcription profiles support functional activity in vivo. Results suggest that: (1) independent HGT in parasitic nematodes originated from different Kingdoms; (2) cyanase acquired an active role in the biology of extant Trichinella; (3) acquisition occurred more than 400 million years ago (MYA), prior to the divergence of the Trichinellida and Dioctophymatida; and (4) early, free-living ancestors of the genus Trichinella had an association with terrestrial plants.

    Evolutionary and Functional Relationships in the Truncated Hemoglobin Family

    Predicting function from sequence is an important goal in current biological research, and although broad functional assignment is possible when a protein is assigned to a family, predicting functional specificity with accuracy is not straightforward. If function is provided by key structural properties and the relevant properties can be computed using the sequence as the starting point, it should in principle be possible to predict function in detail. The truncated hemoglobin family presents an interesting benchmark study due to their ubiquity, sequence diversity in the context of a conserved fold and the number of characterized members. Their functions are tightly related to O2 affinity and reactivity, as determined by the association and dissociation rate constants, both of which can be predicted and analyzed using in silico tools. In the present work we have applied a strategy, which combines homology modeling with molecular-based energy calculations, to predict and analyze function of all known truncated hemoglobins in an evolutionary context. Our results show that truncated hemoglobins present conserved family features, but that their structure is flexible enough to allow the switch from high to low affinity in a few evolutionary steps. Most proteins display moderate to high oxygen affinities and multiple ligand migration paths, which, besides some minor trends, show heterogeneous distributions throughout the phylogenetic tree, again suggesting fast functional adaptation. Our data not only deepen our comprehension of the structural basis governing ligand affinity, but also highlight some interesting functional evolutionary trends.
    Affiliation: Bustamante, Juan Pablo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química, Física de los Materiales, Medioambiente y Energía. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química, Física de los Materiales, Medioambiente y Energía; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Química Inorgánica, Analítica y Química Física; Argentina
    Affiliation: Radusky, Leandro Gabriel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química Biológica de la Facultad de Ciencias Exactas y Naturales; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Química Biológica; Argentina
    Affiliation: Boechi, Leonardo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Cálculo; Argentina
    Affiliation: Estrin, Dario Ariel. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Química, Física de los Materiales, Medioambiente y Energía. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Química, Física de los Materiales, Medioambiente y Energía; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Química Inorgánica, Analítica y Química Física; Argentina
    Affiliation: Ten Have, Arjen. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Mar del Plata. Instituto de Investigaciones Biológicas. Universidad Nacional de Mar del Plata. Facultad de Ciencias Exactas y Naturales. Instituto de Investigaciones Biológicas; Argentina
    Affiliation: Marti, Marcelo Adrian. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria; Argentina. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Cálculo; Argentina