157 research outputs found

    Identification of homologs in insignificant blast hits by exploiting extrinsic gene properties

    BACKGROUND: Homology is a key concept in both evolutionary biology and genomics. Detection of homology is crucial in fields like the functional annotation of protein sequences and the identification of taxon specific genes. Basic homology searches are still frequently performed by pairwise search methods such as BLAST. Vast improvements have been made in the identification of homologous proteins by using more advanced methods that use sequence profiles. However additional improvement could be made by exploiting sources of genomic information other than the primary sequence or tertiary structure. RESULTS: We test the hypothesis that extrinsic gene properties gene length and gene order can be of help in differentiating spurious sequence similarity from homology in the gray zone. Sharing gene order and similarity in size dramatically increase the chance of a query-hit pair being homologous: gray zone query-hit pairs of similar size and with conserved gene order are homologous in 99% of all cases, while for query-hit pairs without gene order conservation and with different sizes this is only 55%. CONCLUSION: We have shown that using gene length and gene order drastically improves the detection of homologs within the BLAST gray zone. Our findings suggest that the use of such extrinsic gene properties can also improve the performance of homology detection by more advanced methods, and our study thereby underscores the importance of true data integration for fully exploiting genomic information.

    LocateP: Genome-scale subcellular-location predictor for bacterial proteins

    BACKGROUND: In the past decades, various protein subcellular-location (SCL) predictors have been developed. Most of these predictors, like TMHMM 2.0, SignalP 3.0, PrediSi and Phobius, aim at the identification of one or a few SCLs, whereas others such as CELLO and Psortb.v.2.0 aim at a broader classification. Although these tools and pipelines can achieve a high precision in the accurate prediction of signal peptides and transmembrane helices, they have a much lower accuracy when other sequence characteristics are concerned. For instance, it proved notoriously difficult to identify the fate of proteins carrying a putative type I signal peptidase (SPIase) cleavage site, as many of those proteins are retained in the cell membrane as N-terminally anchored membrane proteins. Moreover, most of the SCL classifiers are based on the classification of the Swiss-Prot database and consequently inherited the inconsistency of that SCL classification. As accurate and detailed SCL prediction on a genome scale is highly desired by experimental researchers, we decided to construct a new SCL prediction pipeline: LocateP. RESULTS: LocateP combines many of the existing high-precision SCL identifiers with our own newly developed identifiers for specific SCLs. The LocateP pipeline was designed such that it mimics protein targeting and secretion processes. It distinguishes 7 different SCLs within Gram-positive bacteria: intracellular, multi-transmembrane, N-terminally membrane anchored, C-terminally membrane anchored, lipid-anchored, LPxTG-type cell-wall anchored, and secreted/released proteins. Moreover, it distinguishes pathways for Sec- or Tat-dependent secretion and alternative secretion of bacteriocin-like proteins. The pipeline was tested on data sets extracted from literature, including experimental proteomics studies. The tests showed that LocateP performs as well as, or even slightly better than other SCL predictors for some locations and outperforms current tools especially where the N-terminally anchored and the SPIase-cleaved secreted proteins are concerned. Overall, the accuracy of LocateP was always higher than 90%. LocateP was then used to predict the SCLs of all proteins encoded by completed Gram-positive bacterial genomes. The results are stored in the database LocateP-DB http://www.cmbi.ru.nl/locatep-db1. CONCLUSION: LocateP is by far the most accurate and detailed protein SCL predictor for Gram-positive bacteria currently available.

    Comparative phosphoproteomics reveals evolutionary and functional conservation of phosphorylation across eukaryotes

    A comparison of phosphoproteomics datasets of six eukaryotes shows significant overlap between phosphoproteomes

    Lactobacillus plantarum gene clusters encoding putative cell-surface protein complexes for carbohydrate utilization are conserved in specific gram-positive bacteria

    BACKGROUND: Genomes of gram-positive bacteria encode many putative cell-surface proteins, of which the majority has no known function. From the rapidly increasing number of available genome sequences it has become apparent that many cell-surface proteins are conserved, and frequently encoded in gene clusters or operons, suggesting common functions, and interactions of multiple components. RESULTS: A novel gene cluster encoding exclusively cell-surface proteins was identified, which is conserved in a subgroup of gram-positive bacteria. Each gene cluster generally has one copy of four new gene families called cscA, cscB, cscC and cscD. Clusters encoding these cell-surface proteins were found only in complete genomes of Lactobacillus plantarum, Lactobacillus sakei, Enterococcus faecalis, Listeria innocua, Listeria monocytogenes, Lactococcus lactis ssp lactis and Bacillus cereus and in incomplete genomes of L. lactis ssp cremoris, Lactobacillus casei, Enterococcus faecium, Pediococcus pentosaceus, Lactobacillus brevis, Oenococcus oeni, Leuconostoc mesenteroides, and Bacillus thuringiensis. These genes are neither present in the genomes of streptococci, staphylococci and clostridia, nor in the Lactobacillus acidophilus group, suggesting a niche-specific distribution, possibly relating to association with plants. All encoded proteins have a signal peptide for secretion by the Sec-dependent pathway, while some have cell-surface anchors, novel WxL domains, and putative domains for sugar binding and degradation. Transcriptome analysis in L. plantarum shows that the cscA-D genes are co-expressed, supporting their operon organization. Many gene clusters are significantly up-regulated in a glucose-grown, ccpA-mutant derivative of L. plantarum, suggesting catabolite control. This is supported by the presence of predicted CRE-sites upstream or inside the up-regulated cscA-D gene clusters. CONCLUSION: We propose that the CscA, CscB, CscC and CscD proteins form cell-surface protein complexes and play a role in carbon source acquisition. Primary occurrence in plant-associated gram-positive bacteria suggests a possible role in degradation and utilization of plant oligo- or poly-saccharides.

    Draft Whole-Genome Sequences of 11 Bacillus cereus Food Isolates

    Bacillus cereus is a foodborne pathogen causing emetic and diarrheal-type syndromes. Here, we report the whole-genome sequences of 11 B. cereus food isolates.

    High-Level Heat Resistance of Spores of Bacillus amyloliquefaciens and Bacillus licheniformis Results from the Presence of a spoVA Operon in a Tn1546 Transposon

    Bacterial endospore formers can produce spores that are resistant to many food processing conditions, including heat. Some spores may survive heating processes aimed at production of commercially sterile foods. Recently, it was shown that a spoVA operon, designated spoVA(2mob), present on a Tn1546 transposon in Bacillus subtilis, leads to profoundly increased wet heat resistance of B. subtilis spores. Such Tn1546 transposon elements including the spoVA(2mob) operon were also found in several strains of Bacillus amyloliquefaciens and Bacillus licheniformis, and these strains were shown to produce spores with significantly higher resistances to wet heat than their counterparts lacking this transposon. In this study, the locations and compositions of Tn1546 transposons encompassing the spoVA(2mob) operons in B. amyloliquefaciens and B. licheniformis were analyzed. Introduction of these spoVA(2mob) operons into B. subtilis 168 (producing spores that are not highly heat resistant) rendered mutant 168 strains that produced high-level heat resistant spores, demonstrating that these elements in B. amyloliquefaciens and B. licheniformis are responsible for high level heat resistance of spores. Assessment of growth of the nine strains of each species between 5.2°C and 57.7°C showed some differences between strains, especially at lower temperatures, but all strains were able to grow at 57.7°C. Strains of B. amyloliquefaciens and B. licheniformis that contain the Tn1546 elements (and produce high-level heat resistant spores) grew at temperatures similar to those of their Tn1546-negative counterparts that produce low-level heat resistant spores. The findings presented in this study allow for detection of B. amyloliquefaciens and B. licheniformis strains that produce highly heat resistant spores in the food chain

    Transcriptomics in serum and culture medium reveal shared and differential gene regulation in pathogenic and commensal Streptococcus suis

    Streptococcus suis colonizes the upper respiratory tract of healthy pigs at high abundance but can also cause opportunistic respiratory and systemic disease. Disease-associated S. suis reference strains are well studied, but less is known about commensal lineages. It is not known what mechanisms enable some S. suis lineages to cause disease while others persist as commensal colonizers, or to what extent gene expression in disease-associated and commensal lineages diverges. In this study we compared the transcriptomes of 21 S. suis strains grown in active porcine serum and Todd–Hewitt yeast broth. These strains included both commensal and pathogenic strains, including several strains of sequence type (ST) 1, which is responsible for most cases of human disease and is considered to be the most pathogenic S. suis lineage. We sampled the strains during their exponential growth phase and mapped RNA sequencing reads to the corresponding strain genomes. We found that the transcriptomes of pathogenic and commensal strains with large genomic divergence were unexpectedly conserved when grown in active porcine serum, but that regulation and expression of key pathways varied. Notably, we observed strong variation of expression across media of genes involved in capsule production in pathogens, and of the agmatine deiminase system in commensals. ST1 strains displayed large differences in gene expression between the two media compared to strains from other clades. Their capacity to regulate gene expression across different environmental conditions may be key to their success as zoonotic pathogens.

    Extensive Study of Breast Milk and Infant Growth: Protocol of the Cambridge Baby Growth and Breastfeeding Study (CBGS-BF).

    Funder: Medical Research Council; Grant(s): Unit programmes: MC_UU_12015/2 and MC_UU_00006/2. Funder: Wellcome Trust. Growth and nutrition during early life have been strongly linked to future health and metabolic risks. The Cambridge Baby Growth Study (CBGS), a longitudinal birth cohort of 2229 mother-infant pairs, was set up in 2001 to investigate early life determinant factors of infant growth and body composition in the UK setting. To carry out extensive profiling of breast milk intakes and composition in relation to infancy growth, the Cambridge Baby Growth and Breastfeeding Study (CBGS-BF) was established upon the original CBGS. Strict inclusion criteria were applied, focusing on a cohort of normal birth weight, vaginally delivered infants born to healthy and non-obese mothers. Crucially, only infants who were exclusively breastfed for the first 6 weeks of life were retained in the analysed study sample. At each visit (birth, 2 weeks, 6 weeks, and then 3, 6, 12, 24, and 36 months), longitudinal anthropometric measurements and blood spot collections were conducted. Infant body composition was assessed using air displacement plethysmography (ADP) at 6 weeks and 3 months of age. Breast milk was collected for macronutrient and human milk oligosaccharide (HMO) measurements. Breast milk intake volume was also estimated, and sterile breast milk and infant stool samples were collected for microbiome study.