388 research outputs found
MetaGeneAnnotator: Detecting Species-Specific Patterns of Ribosomal Binding Site for Precise Gene Prediction in Anonymous Prokaryotic and Phage Genomes
Recent advances in DNA sequencers are accelerating genome sequencing, especially in microbes, and complete and draft genomes from various species have been sequenced in rapid succession. Here, we present a comprehensive gene prediction tool, the MetaGeneAnnotator (MGA), which precisely predicts all kinds of prokaryotic genes from a single anonymous genomic sequence or a set of such sequences of varying lengths. The MGA integrates statistical models of prophage genes, in addition to those of bacterial and archaeal genes, and also uses a self-training model built from the input sequences for predictions. As a result, the MGA sensitively detects not only typical genes but also atypical genes, such as horizontally transferred and prophage genes in a prokaryotic genome. In this paper, we also propose a novel approach for analyzing the ribosomal binding site (RBS), which enables us to detect species-specific patterns of the RBSs. The MGA includes an RBS model based on this approach and precisely predicts the translation starts of genes. The MGA also improves prediction accuracy for short sequences by using the adapted RBS models (96% sensitivity and 93% specificity for 700 bp fragments). These features of the MGA expedite a wide range of microbial genome studies, such as genome annotations and metagenome analyses.
ProTISA: a comprehensive resource for translation initiation site annotation in prokaryotic genomes
Correct annotation of translation initiation sites (TISs) is essential for both experimental and bioinformatics studies of the prokaryotic translation initiation mechanism, as well as for understanding gene regulation and gene structure. Here we describe a comprehensive database, ProTISA, which collects TISs confirmed through a variety of available evidence for prokaryotic genomes, including Swiss-Prot experimental records, literature, conserved domain hits and sequence alignment between orthologous genes. Moreover, by combining the predictions from our recently developed TIS post-processor, ProTISA provides a refined annotation for the public database RefSeq. Furthermore, the database annotates the potential regulatory signals associated with translation initiation in the region upstream of the TIS. As of July 2007, ProTISA includes 440 microbial genomes with more than 390 000 confirmed TISs. The database is available at http://mech.ctb.pku.edu.cn/protis
Draft Genome Sequence of the Marine Streptomyces sp. Strain PP-C42, Isolated from the Baltic Sea
Streptomyces, a branch of aerobic Gram-positive bacteria, represents the largest genus of actinobacteria. The streptomycetes are characterized by a complex secondary metabolism and produce over two-thirds of the clinically used natural antibiotics today. Here we report the draft genome sequence of Streptomyces sp. strain PP-C42, isolated from the marine environment. A subset of unique genes and gene clusters for diverse secondary metabolites as well as antimicrobial peptides (AMPs) could be identified from the genome, showing great promise as a source for novel bioactive compounds.
Draft Genome Sequencing and Comparative Analysis of Aspergillus sojae NBRC4239
We conducted genome sequencing of the filamentous fungus Aspergillus sojae NBRC4239 isolated from the koji used to prepare Japanese soy sauce. We used the 454 pyrosequencing technology and investigated the genome with respect to enzymes and secondary metabolites in comparison with other sequenced Aspergilli. Assembly of 454 reads generated a non-redundant sequence of 39.5 Mb possessing 13 033 putative genes and 65 scaffolds composed of 557 contigs. Of the 2847 open reading frames with Pfam domain scores of >150 found in A. sojae NBRC4239, 81.7% had a high degree of similarity with the genes of A. oryzae. Comparative analysis identified serine carboxypeptidase and aspartic protease genes unique to A. sojae NBRC4239. While A. oryzae possessed three copies of the α-amylase gene, A. sojae NBRC4239 possessed only a single copy. Comparison of 56 gene clusters for secondary metabolites between A. sojae NBRC4239 and A. oryzae revealed that 24 clusters were conserved, whereas 32 clusters differed, including a deletion of 18 508 bp containing mfs1, mao1, dmaT, and pks-nrps for cyclopiazonic acid (CPA) biosynthesis, which explains the lack of CPA production in A. sojae. The A. sojae NBRC4239 genome data will be useful to characterize functional features of the koji moulds used in Japanese industries.
Comparative analysis of an experimental subcellular protein localization assay and in silico prediction methods
The subcellular localization of a protein can provide important information about its function within the cell. As eukaryotic cells and particularly mammalian cells are characterized by a high degree of compartmentalization, most protein activities can be assigned to particular cellular compartments. The categorization of proteins by their subcellular localization is therefore one of the essential goals of the functional annotation of the human genome. We previously performed a subcellular localization screen of 52 proteins encoded on human chromosome 21. In the current study, we compared the experimental localization data to the in silico results generated by nine leading software packages with different prediction resolutions. The comparison revealed striking differences between the programs in the accuracy of their subcellular protein localization predictions. Our results strongly suggest that the recently developed predictors utilizing multiple prediction methods tend to provide significantly better performance over purely sequence-based or homology-based predictions
Bacterial Lifestyle in a Deep-sea Hydrothermal Vent Chimney Revealed by the Genome Sequence of the Thermophilic Bacterium Deferribacter desulfuricans SSM1
The complete genome sequence of the thermophilic sulphur-reducing bacterium Deferribacter desulfuricans SSM1, isolated from a hydrothermal vent chimney, has been determined. The genome comprises a single circular chromosome of 2 234 389 bp and a megaplasmid of 308 544 bp. Many genes encoded in the genome are most similar to the genes of sulphur- or sulphate-reducing bacterial species within Deltaproteobacteria. The reconstructed central metabolism showed a heterotrophic lifestyle primarily driven by C1 to C3 organics, e.g. formate, acetate, and pyruvate, and also suggested that the inability to grow autotrophically via a reductive tricarboxylic acid cycle may be due to the lack of ATP-dependent citrate lyase. In addition, the genome encodes numerous genes for chemoreceptors, chemotaxis-like systems, and signal transduction machineries. These signalling networks may be linked to this bacterium's versatile energy metabolism and may provide ecophysiological advantages for D. desulfuricans SSM1 thriving in the physically and chemically fluctuating environments near hydrothermal vents. This is the first genome sequence from the phylum Deferribacteres.
Mobile Regulatory Cassettes Mediate Modular Shuffling in T4-Type Phage Genomes
Coliphage phi1, which was isolated for phage therapy in the Republic of Georgia,
is closely related to the T-like myovirus RB49. The ∼275 open reading
frames encoded by each phage have an average level of amino acid identity of
95.8%. RB49 lacks 7 phi1 genes while 10 phi1 genes are missing from RB49. Most
of these unique genes encode functions without known homologs. Many of the
insertion, deletion, and replacement events that distinguish the two phages are
in the hyperplastic regions (HPRs) of their genomes. The HPRs are rich in both
nonessential genes and small regulatory cassettes (promoter-early
stem-loops [PeSLs]) composed of strong σ70-like promoters
and stem-loop structures, which are effective transcription terminators. Modular
shuffling mediated by recombination between PeSLs has caused much of the
sequence divergence between RB49 and phi1. We show that exchanges between nearby
PeSLs can also create small circular DNAs that are apparently encapsidated by
the virus. Such PeSL “mini-circles” may be important vectors
for horizontal gene transfer