    Correlation between nucleotide composition and folding energy of coding sequences with special attention to wobble bases

    Background: The secondary structure and complexity of mRNA influence its accessibility to regulatory molecules (proteins, micro-RNAs), its stability and its level of expression. The mobile elements of the RNA sequence, the wobble bases, are expected to regulate the formation of structures encompassing coding sequences. Results: The sequence/folding energy (FE) relationship was studied by statistical, bioinformatic methods in 90 CDS containing 26,370 codons. I found that the FE (dG) associated with coding sequences is significant and negative (407 kcal/1000 bases, mean +/- S.E.M.) indicating that these sequences are able to form structures. However, the FE has only a small free component, less than 10% of the total. The contribution of the 1st and 3rd codon bases to the FE is larger than the contribution of the 2nd (central) bases. It is possible to achieve a ~4-fold change in FE by altering the wobble bases in synonymous codons. The sequence/FE relationship can be described with a simple algorithm, and the total FE can be predicted solely from the sequence composition of the nucleic acid. The contributions of different synonymous codons to the FE are additive and one codon cannot replace another. The accumulated contributions of synonymous codons of an amino acid to the total folding energy of an mRNA are strongly correlated to the relative amount of that amino acid in the translated protein. Conclusion: Synonymous codons are not interchangeable with regard to their role in determining the mRNA FE and the relative amounts of amino acids in the translated protein, even if they are indistinguishable in respect of amino acid coding. Comment: 14 pages including 6 figures and 1 table.
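
The abstract above relates folding energy to sequence composition and singles out the wobble (3rd) codon base. As a rough illustration of the kind of composition tally such an analysis starts from, the sketch below counts nucleotides at each codon position of a coding sequence. The function name `codon_position_composition` and the toy sequence are assumptions made for illustration; the sketch does not compute folding energy and is not the author's algorithm.

```python
from collections import Counter


def codon_position_composition(cds):
    """Return three Counters: bases at codon positions 1, 2 and 3 (wobble)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = [Counter(), Counter(), Counter()]
    for i in range(usable):
        counts[i % 3][cds[i]] += 1
    return counts


# Hypothetical toy sequence (codons ATG GCC GCG TAA), not from the study's data.
first, second, wobble = codon_position_composition("ATGGCCGCGTAA")
print(wobble)
```

Keeping one counter per codon position makes it straightforward to compare the composition of the wobble position with that of the 1st and 2nd positions across many sequences.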

    Regulation of Human Formyl Peptide Receptor 1 Synthesis: Role of Single Nucleotide Polymorphisms, Transcription Factors, and Inflammatory Mediators

    The gene encoding the human formyl peptide receptor 1 (FPR1) is heterogeneous, containing numerous single nucleotide polymorphisms (SNPs). Here, we examine the effect of these SNPs on gene transcription and protein translation. We also identify gene promoter sequences and putative FPR1 transcription factors. To test the effect of codon bias and codon pair bias on FPR1 expression, four FPR1 genetic variants were expressed in human myeloid U937 cells fused to a reporter gene encoding firefly luciferase. No significant differences in luciferase activity were detected, suggesting that the translational regulation and protein stability of FPR1 are modulated by factors other than the SNP codon bias and the variant amino acid properties. Deletion and mutagenesis analysis of the FPR1 promoter showed that a CCAAT box is not required for gene transcription. A −88/41 promoter construct resulted in the strongest transcriptional activity, whereas a −72/41 construct showed a large reduction in activity. The region between −88 and −72 contains a consensus binding site for the transcription factor PU.1. Mutagenesis of this site caused a significant reduction in reporter gene expression. The PU.1 binding was confirmed in vivo by chromatin immunoprecipitation, and the binding to nucleotides −84 to −76 (TTCCTATTT) was confirmed in vitro by an electrophoretic mobility shift assay. Thus, similar to many other myeloid genes, FPR1 promoter activity requires PU.1. Two single nucleotide polymorphisms at −56 and −54 did not significantly affect FPR1 gene expression, despite differences in binding of transcription factor IRF1 in vitro. Inflammatory mediators such as interferon-γ, tumor necrosis factor-α, and lipopolysaccharide did not increase FPR1 promoter activity in myeloid cells, whereas differentiation induced by DMSO and retinoic acid enhanced the activity. This implies that the expression of FPR1 in myeloid cells is developmentally regulated, and that the differentiated cells are equipped for immediate response to microbial infections.

    A Shigella boydii bacteriophage which resembles Salmonella phage ViI

    Background: Lytic bacteriophages have been applied successfully to control the growth of various foodborne pathogens. Sequencing of their genomes is considered an important preliminary step to ensure their safety prior to food applications. Results: The lytic bacteriophage, ΦSboM-AG3, targets the important foodborne pathogen, Shigella. It is morphologically similar to phage ViI of Salmonella enterica serovar Typhi and a series of phages of Acinetobacter calcoaceticus and Rhizobium meliloti. The complete genome of ΦSboM-AG3 was determined to be 158 kb and was terminally redundant and circularly permuted. Two hundred and sixteen open reading frames (ORFs) were identified and annotated, most of which displayed homology to proteins of Salmonella phage ViI. The genome also included four genes specifying tRNAs. Conclusions: This is the first time that a Vi-specific phage for Shigella has been described. There is no evidence for the presence of virulence and lysogeny-associated genes. In conclusion, the genome analysis of ΦSboM-AG3 indicates that this phage can be safely used for biocontrol purposes.

    Genetic Analysis of Anti-Amoebae and Anti-Bacterial Activities of the Type VI Secretion System in Vibrio cholerae

    A type VI secretion system (T6SS) was recently shown to be required for full virulence of Vibrio cholerae O37 serogroup strain V52. In this study, we systematically mutagenized each individual gene in the T6SS locus and characterized their functions based on expression and secretion of the hemolysin co-regulated protein (Hcp), virulence towards amoebae of Dictyostelium discoideum and killing of Escherichia coli bacterial cells. We group the 17 proteins characterized in the T6SS locus into four categories: twelve (VipA, VipB, VCA0109–VCA0115, ClpV, VCA0119, and VasK) are essential for Hcp secretion and bacterial virulence, and thus likely function as structural components of the apparatus; two (VasH and VCA0122) are regulators that are required for T6SS gene expression and virulence; another two, VCA0121 and valine-glycine repeat protein G 3 (VgrG-3), are not essential for Hcp expression, secretion or bacterial virulence, and their functions are unknown; the last group is represented by VCA0118, which is not required for Hcp expression or secretion but still plays a role in both amoebae and bacterial killing and may therefore be an effector protein. We also showed that the clpV gene product is required for Dictyostelium virulence but is less important for killing E. coli. In addition, one vgrG gene (vgrG-2) outside of the T6SS gene cluster was required for bacterial killing but another (vgrG-1) was not. However, a bacterial killing defect was observed when vgrG-1 and vgrG-3 were both deleted. Several genes encoded in the same putative operon as vgrG-1 and vgrG-2 also contribute to virulence toward Dictyostelium but have a smaller effect on bacterial killing. Our results provide new insights into the functional requirements of V. cholerae's T6SS in the context of secretion as well as killing of bacterial and eukaryotic phagocytic cells.

    Identifying Cognate Binding Pairs among a Large Set of Paralogs: The Case of PE/PPE Proteins of Mycobacterium tuberculosis

    We consider the problem of how to detect cognate pairs of proteins that bind when each belongs to a large family of paralogs. To illustrate the problem, we have undertaken a genomewide analysis of interactions of members of the PE and PPE protein families of Mycobacterium tuberculosis. Our computational method uses structural information, operon organization, and protein coevolution to infer the interaction of PE and PPE proteins. Some 289 PE/PPE complexes were predicted out of a possible 5,590 PE/PPE pairs genomewide. Thirty-five of these predicted complexes were also found to have correlated mRNA expression, providing additional evidence for these interactions. We show that our method is applicable to other protein families by analyzing interactions of the Esx family of proteins. Our resulting set of predictions is a starting point for genomewide experimental interaction screens of the PE and PPE families, and our method may be generally useful for detecting interactions of proteins within families having many paralogs.

    Single-nucleotide resolution analysis of the transcriptome structure of Clostridium beijerinckii NCIMB 8052 using RNA-Seq

    Background: Clostridium beijerinckii is an important solvent-producing microorganism. The genome of C. beijerinckii NCIMB 8052 has recently been sequenced. Although transcriptome structure is important in order to reveal the functional and regulatory architecture of the genome, the physical structure of the transcriptome for this strain, such as the operon linkages and transcript boundaries, is not well understood. Results: In this study, we conducted a single-nucleotide resolution analysis of the C. beijerinckii NCIMB 8052 transcriptome using high-throughput RNA-Seq technology. We identified the transcription start sites and operon structure throughout the genome. We confirmed the structure of important gene operons involved in metabolic pathways for acid and solvent production in C. beijerinckii 8052, including pta-ack, ptb-buk, hbd-etfA-etfB-crt (bcs) and ald-ctfA-ctfB-adc (sol) operons; we also defined important operons related to chemotaxis/motility, transcriptional regulation, stress response and fatty acid biosynthesis along with others. We discovered 20 previously non-annotated regions with significant transcriptional activities and 15 genes whose translation start codons were likely mis-annotated. As a consequence, the accuracy of existing genome annotation was significantly enhanced. Furthermore, we identified 78 putative silent genes and 177 putative housekeeping genes based on normalized transcription measurement with the sequence data. We also observed that more than 30% of pseudogenes had significant transcriptional activities during the fermentation process. Strong correlations exist between the expression values derived from RNA-Seq analysis and microarray data or qRT-PCR results. Conclusions: Transcriptome structural profiling in this research provided important supplemental information on the accuracy of genome annotation, and revealed additional gene functions and regulation in C. beijerinckii.

    The obesity gene, TMEM18, is of ancient origin, found in majority of neuronal cells in all major brain regions and associated with obesity in severely obese children

    Background: TMEM18 is a hypothalamic gene that has recently been linked to obesity and BMI in genome wide association studies. However, the functional properties of TMEM18 are obscure. Methods: The evolutionary history of TMEM18 was inferred using phylogenetic and bioinformatic methods. The gene's expression profile was investigated with real-time PCR in a panel of rat and mouse tissues and with immunohistochemistry in the mouse brain. Also, gene expression changes were analyzed in three feeding-related mouse models: food deprivation, reward and diet-induced increase in body weight. Finally, we genotyped 502 severely obese and 527 healthy Swedish children for two SNPs near TMEM18 (rs6548238 and rs756131). Results: TMEM18 was found to be remarkably conserved and present in species that diverged from the human lineage over 1500 million years ago. The TMEM18 gene was widely expressed and detected in the majority of cells in all major brain regions, but was more abundant in neurons than other cell types. We found no significant changes in the hypothalamic and brainstem expression in the feeding-related mouse models. There was a strong association for two SNPs (rs6548238 and rs756131) of the TMEM18 locus with an increased risk for obesity (p = 0.001 and p = 0.002). Conclusion: We conclude that TMEM18 is involved in both adult and childhood obesity. It is one of the most conserved human obesity genes and it is found in the majority of all brain sites, including the hypothalamus and the brain stem, but it is not regulated in these regions in classical energy homeostatic models.

    Predicting protein linkages in bacteria: Which method is best depends on task

    Background: Applications of computational methods for predicting protein functional linkages are increasing. In recent years, several bacteria-specific methods for predicting linkages have been developed. The four major genomic context methods are: Gene cluster, Gene neighbor, Rosetta Stone, and Phylogenetic profiles. These methods have been shown to be powerful tools and this paper provides guidelines for when each method is appropriate by exploring different features of each method and potential improvements offered by their combination. We also review many previous treatments of these prediction methods, use the latest available annotations, and offer a number of new observations. Results: Using Escherichia coli K12 and Bacillus subtilis, linkage predictions made by each of these methods were evaluated against three benchmarks: functional categories defined by COG and KEGG, known pathways listed in EcoCyc, and known operons listed in RegulonDB. Each evaluated method had strengths and weaknesses, with no one method dominating all aspects of predictive ability studied. For functional categories, as previous studies have shown, the Rosetta Stone method was individually best at detecting linkages and predicting functions among proteins with shared KEGG categories while the Phylogenetic profile method was best for linkage detection and function prediction among proteins with common COG functions. Differences in performance under COG versus KEGG may be attributable to the presence of paralogs. Better function prediction was observed when using a weighted combination of linkages based on reliability versus using a simple unweighted union of the linkage sets. For pathway reconstruction, 99 complete metabolic pathways in E. coli K12 (out of the 209 known, non-trivial pathways) and 193 pathways with 50% of their proteins were covered by linkages from at least one method. Gene neighbor was most effective individually on pathway reconstruction, with 48 complete pathways reconstructed. For operon prediction, Gene cluster predicted completely 59% of the known operons in E. coli K12 and 88% (333/418) in B. subtilis. Comparing two versions of the E. coli K12 operon database, many of the unannotated predictions in the earlier version were updated to true predictions in the later version. Using only linkages found by both Gene Cluster and Gene Neighbor improved the precision of operon predictions. Additionally, as previous studies have shown, combining features based on intergenic region and protein function improved the specificity of operon prediction. Conclusion: A common problem for computational methods is the generation of a large number of false positives that might be caused by an incomplete source of validation. By comparing two versions of a database, we demonstrated the dramatic differences in reported results. We used several benchmarks on which we have shown the comparative effectiveness of each prediction method, as well as provided guidelines as to which method is most appropriate for a given prediction task.
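
The observation that keeping only linkages found by two methods improves precision can be sketched with plain set operations. This is a minimal illustration, assuming predictions and the benchmark are represented as sets of unordered gene pairs; the gene names, the `precision` and `pair` helpers, and the toy data are hypothetical and are not results from the paper.

```python
def precision(predicted, known):
    """Fraction of predicted pairs that also appear in the benchmark set."""
    return len(predicted & known) / len(predicted) if predicted else 0.0


def pair(a, b):
    """Unordered gene pair, so (a, b) and (b, a) compare equal."""
    return frozenset((a, b))


# Hypothetical benchmark and predictions, for illustration only.
known_operon_pairs = {pair("g1", "g2"), pair("g2", "g3")}
gene_cluster = {pair("g1", "g2"), pair("g2", "g3"), pair("g3", "g4")}
gene_neighbor = {pair("g1", "g2"), pair("g3", "g2"), pair("g5", "g6")}

union = gene_cluster | gene_neighbor  # pairs predicted by either method
both = gene_cluster & gene_neighbor   # pairs predicted by both methods

print(precision(union, known_operon_pairs))
print(precision(both, known_operon_pairs))
```

In this toy case the intersection discards the pairs that only one method proposed, which raises precision at the cost of fewer predictions overall.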

    Diversification of Genes Encoding Granule-Bound Starch Synthase in Monocots and Dicots Is Marked by Multiple Genome-Wide Duplication Events

    Starch is one of the major components of cereals, tubers, and fruits. Genes encoding granule-bound starch synthase (GBSS), which is responsible for amylose synthesis, have been extensively studied in cereals but little is known about them in fruits. Due to their low copy number, GBSS genes have been used to study plant phylogenetic and evolutionary relationships. In this study, GBSS genes have been isolated and characterized in three fruit trees: apple, peach, and orange. Moreover, a comprehensive evolutionary study of GBSS genes has also been conducted across monocots and eudicots. Results have revealed that genomic structures of GBSS genes in plants are conserved, suggesting they all have evolved from a common ancestor. In addition, the GBSS gene in an ancestral angiosperm must have undergone genome duplication ∼251 million years ago (MYA) to generate two families, GBSSI and GBSSII. Both GBSSI and GBSSII are found in monocots; however, GBSSI is absent in eudicots. The ancestral GBSSII must have undergone further divergence when monocots and eudicots split ∼165 MYA. This is consistent with expression profiles of GBSS genes, wherein these profiles are more similar to those of GBSSII in eudicots than to those of GBSSI genes in monocots. In dicots, GBSSII must have undergone further divergence when rosids and asterids split from each other ∼126 MYA. Taken together, these findings suggest that it is GBSSII rather than GBSSI of monocots that has orthologous relationships with GBSS genes of eudicots. Moreover, diversification of GBSS genes is mainly associated with genome-wide duplication events throughout the evolutionary history of monocots and eudicots.

    Genome-wide inference of regulatory networks in Streptomyces coelicolor

    Background: The onset of antibiotic production in Streptomyces species is co-ordinated with differentiation events. An understanding of the genetic circuits that regulate these coupled biological phenomena is essential to discover and engineer the pharmacologically important natural products made by these species. The availability of genomic tools and access to a large warehouse of transcriptome data for the model organism, Streptomyces coelicolor, provides incentive to decipher the intricacies of the regulatory cascades and develop biologically meaningful hypotheses. Results: In this study, more than 500 samples of genome-wide temporal transcriptome data, comprising wild-type and more than 25 regulatory gene mutants of Streptomyces coelicolor probed across multiple stress and medium conditions, were investigated. Information based on transcript and functional similarity was used to update a previously-predicted whole-genome operon map and further applied to predict transcriptional networks constituting modules enriched in diverse functions such as secondary metabolism and sigma factors. The predicted network displays a scale-free architecture with a small-world property observed in many biological networks. The networks were further investigated to identify functionally-relevant modules that exhibit functional coherence and a consensus motif in the promoter elements indicative of DNA-binding elements. Conclusions: Despite the enormous experimental as well as computational challenges, a systems approach for integrating diverse genome-scale datasets to elucidate complex regulatory networks is beginning to emerge. We present an integrated analysis of transcriptome data and genomic features to refine a whole-genome operon map and to construct regulatory networks at the cistron level in Streptomyces coelicolor. The functionally-relevant modules identified in this study are potential targets for further studies and verification.