Evolution of the Kdo2-lipid A biosynthesis in bacteria
Background: Lipid A is the highly immunoreactive endotoxic center of lipopolysaccharide (LPS). It anchors the LPS into the outer membrane of most Gram-negative bacteria. Lipid A can be recognized by animal cells, triggers defense-related responses, and causes Gram-negative sepsis. The biosynthesis of Kdo2-lipid A, the LPS substructure, involves nine enzymatic steps. Results: In order to elucidate the evolutionary pathway of Kdo2-lipid A biosynthesis, we examined the distribution of genes encoding the nine enzymes across bacteria. We found that not all Gram-negative bacteria have all nine enzymes. Some Gram-negative bacteria have no genes encoding these enzymes and others have genes only for the first four enzymes (LpxA, LpxC, LpxD, and LpxB). Among the nine enzymes, five appeared to have arisen from three independent gene duplication events. Two of these events happened within the Proteobacteria lineage, followed by functional specialization of the duplicated genes and pathway optimization in these bacteria. Conclusions: The nine-enzyme pathway, which was established based on studies mainly in Escherichia coli K12, appears to be the most derived and optimized form. It is found only in E. coli and related Proteobacteria. Simpler and probably less efficient pathways are found in other bacterial groups, with Kdo2-lipid A variants as the likely end products. The Kdo2-lipid A biosynthetic pathway exemplifies extremely plastic evolution of bacterial genomes, especially those of Proteobacteria, and how these mainly pathogenic bacteria have adapted to their environment.
Next-Generation Transcriptome Assembly: Strategies and Performance Analysis
Accurate and comprehensive transcriptome assemblies lay the foundation for a range of analyses, such as differential gene expression analysis, metabolic pathway reconstruction, novel gene discovery, or metabolic flux analysis. With the arrival of next-generation sequencing technologies, it has become possible to acquire the whole transcriptome data rapidly even from non-model organisms. However, the problem of accurately assembling the transcriptome for any given sample remains extremely challenging, especially in species with a high prevalence of recent gene or genome duplications, those with alternative splicing of transcripts, or those whose genomes are not well studied. In this chapter, we provided a detailed overview of the strategies used for transcriptome assembly. We reviewed the different statistics available for measuring the quality of transcriptome assemblies with the emphasis on the types of errors each statistic does and does not detect. We also reviewed simulation protocols to computationally generate RNAseq data that present biologically realistic problems such as gene expression bias and alternative splicing. Using such simulated RNAseq data, we presented a comparison of the accuracy, strengths, and weaknesses of nine representative transcriptome assemblers including de novo, genome-guided, and ensemble methods
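As an illustration of why it matters which errors a quality statistic does and does not detect, the sketch below computes N50, a widely reported contiguity statistic. N50 is chosen here as an assumption for illustration; the abstract does not say which statistics the chapter reviews. The function name and the toy contig lengths are hypothetical.

```python
def n50(lengths):
    """Return the N50 of a list of contig/transcript lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        return 0
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical assembly with five correctly assembled transcripts.
correct = [100, 200, 300, 400, 500]

# Same bases, but the two longest transcripts are wrongly fused
# into one chimeric contig of length 900.
chimeric = [100, 200, 300, 900]

print(n50(correct))   # 400
print(n50(chimeric))  # 900
```

In this toy case the chimeric assembly scores a higher N50 than the correct one, which shows that a contiguity statistic alone does not detect misassembly.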
Evolution of SET-Domain Protein Families in the Unicellular and Multicellular Ascomycota Fungi
Background: The evolution of multicellularity is accompanied by the occurrence of differentiated tissues, of organismal developmental programs, and of mechanisms keeping the balance between proliferation and differentiation. Initially, the SET-domain proteins were associated exclusively with regulation of developmental genes in metazoa. However, the finding of SET-domain genes in the unicellular yeasts Saccharomyces cerevisiae and Schizosaccharomyces pombe suggested that SET-domain proteins regulate a much broader variety of biological programs. Intuitively, it is expected that the numbers, types, and biochemical specificity of SET-domain proteins of multicellular versus unicellular forms would reflect the differences in their biology. However, comparisons across the unicellular and multicellular domains of life are complicated by the lack of knowledge of the ancestral SET-domain genes. Even within the crown group, different biological systems might use the epigenetic 'code' differently, adapting it to organism-specific needs. Simplifying the model, we undertook a systematic phylogenetic analysis of one monophyletic fungal group (Ascomycetes) containing unicellular yeasts, Saccharomycotina (hemiascomycetes), and a filamentous fungal group, Pezizomycotina (euascomycetes). Results: Systematic analysis of the SET-domain genes across an entire eukaryotic phylum has outlined clear distinctions in the SET-domain gene collections in the unicellular and in the multicellular (filamentous) relatives; diversification of SET-domain gene families has increased further with the expansion and elaboration of multicellularity in animal and plant systems. We found several Ascomycota-specific SET-domain gene groups; each was unique to either Saccharomycotina or Pezizomycotina fungi.
Our analysis revealed that the numbers and types of SET-domain genes in the Saccharomycotina did not reflect the habitats, pathogenicity, mechanisms of sexuality, or the ability to undergo morphogenic transformations. However, novel genes have appeared for functions associated with the transition to multicellularity. Descendants of most of the SET-domain gene families found in the filamentous fungi could be traced in the genomes of extant animals and plants, albeit as more complex structural forms. Conclusion: SET-domain genes found in the filamentous species but absent from the unicellular sister group reflect two alternative evolutionary events: deletion from the yeast genomes or appearance of novel structures in filamentous fungal groups. There were no Ascomycota-specific SET-domain gene families (i.e., absent from animal and plant genomes); however, plants and animals share SET-domain gene subfamilies that do not exist in the fungi. Phylogenetic and gene-structure analyses defined several animal and plant SET-domain genes as sister groups while those of fungal origin were basal to them. Plants and animals also share SET-domain subfamilies that do not exist in fungi
Codon usage in twelve species of Drosophila
Background: Codon usage bias (CUB), the uneven use of synonymous codons, is a ubiquitous observation in virtually all organisms examined. The pattern of codon usage is generally similar among closely related species, but differs significantly among distantly related organisms, e.g., bacteria, yeast, and Drosophila. Several explanations for CUB have been offered and some have been supported by observations and experiments, although a thorough understanding of the evolutionary forces (random drift, mutation bias, and selection) and their relative importance remains to be determined. The recently available complete genome DNA sequences of twelve phylogenetically defined species of Drosophila offer a hitherto unprecedented opportunity to examine these problems. We report here the patterns of codon usage in the twelve species and offer insights on possible evolutionary forces involved. Results: (1) Codon usage is quite stable across 11/12 of the species: G- and especially C-ending codons are used most frequently, thus defining the preferred codons. (2) The only amino acid that changes in preferred codon is Serine with six species of the melanogaster group favoring TCC while the other species, particularly subgenus Drosophila species, favor AGC. (3) D. willistoni is an exception to these generalizations in having a shifted codon usage for seven amino acids toward A/T in the wobble position. (4) Amino acids differ in their contribution to overall CUB, Leu having the greatest and Asp the least. (5) Among two-fold degenerate amino acids, A/G ending amino acids have more selection on codon usage than T/C ending amino acids. (6) Among the different chromosome arms or elements, genes on the non-recombining element F (dot chromosome) have the least CUB, while genes on the element A (X chromosome) have the most. (7) Introns indicate that mutation bias in all species is approximately 2:1, AT:GC, the opposite of codon usage bias. 
(8) There is also evidence for some overall regional bias in base composition that may influence codon usage. Conclusion: Overall, these results suggest that natural selection has acted on codon usage in the genus Drosophila, at least often enough to leave a footprint of selection in modern genomes. However, there is evidence in the data that random forces (drift and mutation) have also left patterns in the data, especially in genes under weak selection for codon usage, for example genes in regions of low recombination. The documentation of codon usage patterns in each of these twelve genomes also aids in ongoing annotation efforts
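The idea of uneven use of synonymous codons can be made concrete with a small sketch. It computes relative synonymous codon usage (RSCU), one common measure of codon usage bias; whether the study used RSCU or another statistic is not stated in the abstract, so treat the choice as an assumption. The function names, the two-amino-acid codon table, and the toy coding sequence are all hypothetical.

```python
from collections import Counter

# Synonymous codon families from the standard genetic code.
# Only two amino acids are listed to keep the example short.
SYNONYMOUS = {
    "Ser": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
    "Asp": ["GAT", "GAC"],
}


def codon_counts(cds):
    """Count in-frame codons, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def rscu(cds, families=SYNONYMOUS):
    """RSCU = observed count / count expected under equal usage."""
    counts = codon_counts(cds)
    result = {}
    for amino_acid, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(codons)
        result[amino_acid] = {c: counts[c] / expected for c in codons}
    return result


def preferred_codon(cds, amino_acid, families=SYNONYMOUS):
    """Return the codon with the highest RSCU, or None if absent."""
    values = rscu(cds, families).get(amino_acid)
    if not values:
        return None
    return max(values, key=values.get)


# Toy coding sequence: TCC TCC AGC GAC GAT
toy_cds = "TCCTCCAGCGACGAT"
print(rscu(toy_cds)["Ser"]["TCC"])      # 4.0
print(preferred_codon(toy_cds, "Ser"))  # TCC
```

An RSCU of 1.0 means a codon is used as often as expected under equal usage; values above 1.0 mark over-represented codons, which is how a preferred codon such as TCC or AGC for serine would show up.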
Assessing Multiple Sequence Alignments Using Visual Tools
Bioinformatics and molecular evolutionary analyses most often start with comparing DNA or amino acid sequences by aligning them. Pairwise alignment, for example, is used to measure the similarities between a query sequence and each of those in a database in BLAST similarity search, the most used bioinformatics tool (Altschul et al., 1990; Camacho et al.
7TMRmine: a Web server for hierarchical mining of 7TMR proteins
Background: Seven-transmembrane region-containing receptors (7TMRs) play central roles in eukaryotic signal transduction. Due to their biomedical importance, thorough mining of 7TMRs from diverse genomes has been an active target of bioinformatics and pharmacogenomics research. The need for new and accurate 7TMR/GPCR prediction tools is paramount with the accelerated rate of acquisition of diverse sequence information. Currently available and often used protein classification methods (e.g., profile hidden Markov Models) are highly accurate for identifying their membership information among already known 7TMR subfamilies. However, these alignment-based methods are less effective for identifying remote similarities, e.g., identifying proteins from highly divergent or possibly new 7TMR families. In this regard, more sensitive (e.g., alignment-free) methods are needed to complement the existing protein classification methods. A better strategy would be to combine different classifiers, from more specific to more sensitive methods, to identify a broader spectrum of 7TMR protein candidates. Description: We developed a Web server, 7TMRmine, by integrating alignment-free and alignment-based classifiers specifically trained to identify candidate 7TMR proteins as well as transmembrane (TM) prediction methods. This new tool enables researchers to easily assess the distribution of GPCR functionality in diverse genomes or individual newly-discovered proteins. 7TMRmine is easily customized and facilitates exploratory analysis of diverse genomes. Users can integrate various alignment-based, alignment-free, and TM-prediction methods in any combination and in any hierarchical order. Sixteen classifiers (including two TM-prediction methods) are available on the 7TMRmine Web server. Not only can the 7TMRmine tool be used for 7TMR mining, but also for general TM-protein analysis. Users can submit protein sequences for analysis, or explore pre-analyzed results for multiple genomes. 
The server currently includes prediction results and the summary statistics for 68 genomes. Conclusion: 7TMRmine facilitates the discovery of 7TMR proteins. By combining prediction results from different classifiers in a multi-level filtering process, prioritized sets of 7TMR candidates can be obtained for further investigation. 7TMRmine can also be used as a general TM-protein classifier. Comparisons of TM and 7TMR protein distributions among 68 genomes revealed interesting differences in the evolution of these protein families among major eukaryotic phyla
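The multi-level filtering idea, in which classifiers are applied in a user-chosen hierarchical order and only the survivors of one level are passed to the next, can be sketched as follows. This is not the 7TMRmine implementation: the two toy classifiers, their thresholds, and the example sequences are hypothetical stand-ins for the trained classifiers on the server.

```python
from typing import Callable, Dict, List, Tuple

Classifier = Callable[[str], bool]

HYDROPHOBIC = set("AILMFVWC")


def toy_length_filter(seq: str) -> bool:
    """Hypothetical first level: discard very short sequences."""
    return len(seq) >= 20


def toy_tm_filter(seq: str) -> bool:
    """Hypothetical second level: crude hydrophobicity cutoff."""
    if not seq:
        return False
    fraction = sum(residue in HYDROPHOBIC for residue in seq) / len(seq)
    return fraction >= 0.4


def hierarchical_filter(
    proteins: Dict[str, str],
    levels: List[Tuple[str, Classifier]],
) -> List[Tuple[str, List[str]]]:
    """Apply classifiers in order; each level sees only prior survivors.

    Returns the sorted surviving protein IDs after every level, so the
    candidate set can be inspected at each stage of the hierarchy.
    """
    survivors = dict(proteins)
    history = []
    for name, classifier in levels:
        survivors = {pid: seq for pid, seq in survivors.items() if classifier(seq)}
        history.append((name, sorted(survivors)))
    return history


proteins = {
    "p1": "AILMFVWCAILMFVWCAILMFVWC",  # long and hydrophobic
    "p2": "DDDDEEEEKKKKRRRRDDDDEEEE",  # long but charged
    "p3": "AILM",                      # too short
}
levels = [("length", toy_length_filter), ("tm", toy_tm_filter)]
for level_name, ids in hierarchical_filter(proteins, levels):
    print(level_name, ids)
```

Reordering the entries in `levels` changes which proteins each classifier sees, which is the sense in which the order of the hierarchy matters.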