SoyDB: a knowledge database of soybean transcription factors
Background: Transcription factors play a crucial role in regulating gene expression and influence almost all biological processes. Systematically identifying and annotating transcription factors can greatly aid further understanding of their functions and mechanisms. In this article, we present SoyDB, a user-friendly database containing comprehensive knowledge of soybean transcription factors. Description: The soybean genome was recently sequenced by the Department of Energy-Joint Genome Institute (DOE-JGI) and is publicly available. Mining of this sequence identified 5,671 soybean genes as putative transcription factors. These genes were comprehensively annotated as an aid to the soybean research community. We developed SoyDB, a knowledge database for all the transcription factors in the soybean genome. The database contains protein sequences, predicted tertiary structures, putative DNA binding sites, domains, homologous templates in the Protein Data Bank (PDB), protein family classifications, multiple sequence alignments, consensus protein sequence motifs, a web logo for each family, and web links to the soybean transcription factor database PlantTFDB, known EST sequences, and other general protein databases including Swiss-Prot, Gene Ontology, KEGG, EMBL, TAIR, InterPro, SMART, PROSITE, NCBI, and Pfam. The database can be accessed via an interactive and convenient web server, which supports full-text search, PSI-BLAST sequence search, database browsing by protein family, and automatic classification of a new protein sequence into one of 64 annotated transcription factor families by hidden Markov models. Conclusions: A comprehensive soybean transcription factor database was constructed and made publicly accessible at http://casp.rnet.missouri.edu/soydb/.
Comparative mapping and targeted-capture sequencing of the gametocidal loci in Aegilops sharonensis
Gametocidal (Gc) chromosomes or elements in species such as Aegilops sharonensis are preferentially transmitted to the next generation through both the male and female gametes when introduced into wheat. Furthermore, any genes showing complete linkage with gametocidal elements, e.g. genes that control agronomically important traits, are also transmitted preferentially to the next generation without the need for selection. The preferential transmission of the gametocidal elements appears to occur through the induction of extensive chromosome damage in any gametes that lack the gametocidal chromosome in question. Previous studies on the mechanism of the gametocidal action in Ae. sharonensis indicate that at least two linked elements are involved. The first, the ‘breaker’ element, induces chromosome breakage in gametes that have lost the gametocidal elements, while the second, the ‘inhibitor’ element, prevents the chromosome breakage action of the ‘breaker’ element in gametes that carry the Gc elements. In this study, we have used comparative genomic studies to map 54 single nucleotide polymorphism (SNP) markers in an Ae. sharonensis 4SshL introgression segment in wheat and have also identified 18 candidate genes in Ae. sharonensis for the ‘breaker’ element through targeted sequencing of this 4SshL introgression segment. This valuable genomic resource will aid in further mapping the Gc locus that could be exploited in wheat breeding to produce new, superior varieties of wheat
3: Whole Genome Shotgun Sequencing
A Howard Hughes Medical Institute (HHMI) video production describing the Whole Genome Shotgun Sequencing process at the US Department of Energy's Joint Genome Institute (JGI)
Pathways of carbon assimilation and ammonia oxidation suggested by environmental genomic analyses of marine Crenarchaeota.
Marine Crenarchaeota represent an abundant component of oceanic microbiota with potential to significantly influence biogeochemical cycling in marine ecosystems. Prior studies using specific archaeal lipid biomarkers and isotopic analyses indicated that planktonic Crenarchaeota have the capacity for autotrophic growth, and more recent cultivation studies support an ammonia-based chemolithoautotrophic energy metabolism. We report here analysis of fosmid sequences derived from the uncultivated marine crenarchaeote, Cenarchaeum symbiosum, focused on the reconstruction of carbon and energy metabolism. Genes predicted to encode multiple components of a modified 3-hydroxypropionate cycle of autotrophic carbon assimilation were identified, consistent with utilization of carbon dioxide as a carbon source. Additionally, genes predicted to encode a nearly complete oxidative tricarboxylic acid cycle were identified, consistent with the consumption of organic carbon and the production of intermediates for amino acid and cofactor biosynthesis. Therefore, C. symbiosum has the potential to function either as a strict autotroph, or as a mixotroph utilizing both carbon dioxide and organic material as carbon sources. From the standpoint of energy metabolism, genes predicted to encode ammonia monooxygenase subunits, ammonia permease, urease, and urea transporters were identified, consistent with the use of reduced nitrogen compounds as energy sources fueling autotrophic metabolism. Homologues of these genes, recovered from ocean waters worldwide, demonstrate the conservation and ubiquity of crenarchaeal pathways for carbon assimilation and ammonia oxidation. These findings further substantiate the likely global metabolic importance of Crenarchaeota with respect to key steps in the biogeochemical transformation of carbon and nitrogen in marine ecosystems
Comparative Genomics Suggests an Independent Origin of Cytoplasmic Incompatibility in Cardinium hertigii
Terrestrial arthropods are commonly infected with maternally inherited bacterial symbionts that cause cytoplasmic incompatibility (CI). In CI, the outcome of crosses between symbiont-infected males and uninfected females is reproductive failure, increasing the relative fitness of infected females and leading to spread of the symbiont in the host population. CI symbionts have profound impacts on host genetic structure and ecology and may lead to speciation and the rapid evolution of sex determination systems. Cardinium hertigii, a member of the Bacteroidetes and symbiont of the parasitic wasp Encarsia pergandiella, is the only known bacterium other than the Alphaproteobacteria Wolbachia to cause CI. Here we report the genome sequence of Cardinium hertigii cEper1. Comparison with the genomes of CI-inducing Wolbachia pipientis strains wMel, wRi, and wPip provides a unique opportunity to pinpoint shared proteins mediating host cell interaction, including some candidate proteins for CI that have not previously been investigated. The genome of Cardinium lacks all major biosynthetic pathways but harbors a complete biotin biosynthesis pathway, suggesting a potential role for Cardinium in host nutrition. Cardinium lacks known protein secretion systems but encodes a putative phage-derived secretion system distantly related to the antifeeding prophage of the entomopathogen Serratia entomophila. Lastly, while Cardinium and Wolbachia genomes show only a functional overlap of proteins, they show no evidence of laterally transferred elements that would suggest common ancestry of CI in both lineages. Instead, comparative genomics suggests an independent evolution of CI in Cardinium and Wolbachia and provides a novel context for understanding the mechanistic basis of CI. © 2012 Penz et al.
Variant profiling of evolving prokaryotic populations
Genomic heterogeneity of bacterial species is observed and studied in experimental evolution experiments and clinical diagnostics, and occurs as micro-diversity of natural habitats. The challenge for genome research is to accurately capture this heterogeneity with the currently used short sequencing reads. Recent advances in NGS technologies improved the speed and coverage and thus allowed for deep sequencing of bacterial populations. This facilitates the quantitative assessment of genomic heterogeneity, including low frequency alleles or haplotypes. However, false positive variant predictions due to sequencing errors and mapping artifacts of short reads need to be prevented. We therefore created VarCap, a workflow for the reliable prediction of different types of variants even at low frequencies. In order to predict SNPs, InDels and structural variations, we evaluated the sensitivity and accuracy of different software tools using synthetic read data. The results suggested that the best sensitivity could be reached by a union of different tools, albeit at the price of increased false positives. We identified possible reasons for false predictions and used this knowledge to improve the accuracy by post-filtering the predicted variants according to properties such as frequency, coverage, genomic environment/localization and co-localization with other variants. We observed that the best precision was achieved by using an intersection of at least two tools per variant. This resulted in the reliable prediction of variants above a minimum relative abundance of 2%. VarCap is designed for routine use within experimental evolution experiments or for clinical diagnostics. The detected variants are reported as frequencies within a VCF file and as a graphical overview of the distribution of the different variant/allele/haplotype frequencies. The source code of VarCap is available at https://github.com/ma2o/VarCap. In order to provide this workflow to a broad community, we implemented VarCap on a Galaxy webserver, which is accessible at http://galaxy.csb.univie.ac.at. © 2017 Zojer et al.
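The union-versus-intersection trade-off described in the VarCap abstract can be illustrated with a minimal Python sketch. This is not VarCap's code: the function name `consensus_variants`, the tool names, the variant key `(contig, position, ref, alt)`, the use of the mean reported frequency, and the example calls are all hypothetical. Only the two thresholds (support from at least two tools, minimum relative abundance of 2%) come from the abstract.

```python
from collections import defaultdict


def consensus_variants(calls_by_tool, min_tools=2, min_freq=0.02):
    """Keep variants reported by at least `min_tools` callers whose mean
    reported frequency is at or above `min_freq`.

    `calls_by_tool` maps a tool name to {variant_key: frequency}. The layout
    is an assumption made for this sketch, not VarCap's internal format.
    """
    # Collect the frequencies that each tool reported for every variant.
    support = defaultdict(list)
    for calls in calls_by_tool.values():
        for variant, freq in calls.items():
            support[variant].append(freq)

    kept = {}
    for variant, freqs in support.items():
        mean_freq = sum(freqs) / len(freqs)
        # Intersection of at least `min_tools` tools plus a frequency floor.
        if len(freqs) >= min_tools and mean_freq >= min_freq:
            kept[variant] = mean_freq
    return kept


# Hypothetical calls from three unnamed tools.
example_calls = {
    "tool_a": {("chr", 100, "A", "G"): 0.05, ("chr", 200, "C", "T"): 0.01},
    "tool_b": {("chr", 100, "A", "G"): 0.07, ("chr", 200, "C", "T"): 0.01},
    "tool_c": {("chr", 300, "G", "A"): 0.30},
}

# Union of all tools: most sensitive, but keeps single-tool calls.
union = set().union(*(calls.keys() for calls in example_calls.values()))
# Consensus: variants supported by two or more tools and above 2% frequency.
consensus = consensus_variants(example_calls)
```

In this toy input the union holds three variants, while the consensus keeps only the variant at position 100. The one at position 200 has two supporting tools but falls below the 2% floor, and the one at position 300 is reported by a single tool. The other post-filters the abstract mentions (coverage, genomic environment, co-localization) are left out of the sketch.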