65 research outputs found

    S-MART, A Software Toolbox to Aid RNA-seq Data Analysis

    High-throughput sequencing is now routinely performed in many experiments, but analyzing the millions of sequences generated is often beyond the expertise of wet labs that have no personnel specializing in bioinformatics. Whereas several tools are now available to map high-throughput sequencing data to a genome, few of them can extract biological knowledge from the mapped reads. We have developed a toolbox called S-MART, which handles mapped RNA-Seq data. S-MART is an intuitive and lightweight tool that performs many of the tasks usually required for the analysis of mapped RNA-Seq reads. S-MART does not require any computer science background and thus can be used by the whole biologist community through a graphical interface. S-MART can run on any personal computer, yielding results within an hour for most queries, even for Gb of data. S-MART can perform the entire analysis of the mapped reads without any need for other ad hoc scripts. With this tool, biologists can easily perform most analyses of their RNA-Seq data on their own computer, from the mapped data to the discovery of important loci.

    Spatial patterns of transcriptional activity in the chromosome of Escherichia coli

    BACKGROUND: Although genes on the chromosome are organized in a fixed order, the spatial correlations in transcription have not been systematically evaluated. We used a combination of genomic and signal processing techniques to investigate the properties of transcription in the genome of Escherichia coli K12 as a function of the position of genes on the chromosome. RESULTS: Spectral analysis of transcriptional series revealed the existence of statistically significant patterns in the spatial series of transcriptional activity. These patterns could be classified into three categories: short-range, of up to 16 kilobases (kb); medium-range, over 100-125 kb; and long-range, over 600-800 kb. We show that the significant similarities in gene activities extend beyond the length of an operon and that local patterns of coexpression are dependent on DNA supercoiling. Unlike short-range patterns, the formation of medium and long-range transcriptional patterns does not strictly depend on the level of DNA supercoiling. The long-range patterns appear to correlate with the patterns of distribution of DNA gyrase on the bacterial chromosome. CONCLUSIONS: Localization of structural components in the transcriptional signal revealed an asymmetry in the distribution of transcriptional patterns along the bacterial chromosome. The demonstration that spatial patterns of transcription could be modulated pharmacologically and genetically, along with the identification of molecular correlates of transcriptional patterns, offers for the first time strong evidence of physiologically determined higher-order organization of transcription in the bacterial chromosome.
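
The spectral analysis mentioned in this abstract can be pictured with a small periodogram. The sketch below is illustrative only and is not the authors' pipeline: it assumes expression values already ordered by chromosomal position and evenly spaced, and the function names `periodogram` and `dominant_period` are hypothetical. It does not attempt the significance testing the abstract refers to.

```python
import cmath
import math


def periodogram(series):
    """Return (k, power) pairs for frequency indices k = 1 .. n//2.

    Uses a plain discrete Fourier transform on the mean-centered series,
    so it needs nothing beyond the standard library.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    result = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        result.append((k, abs(coeff) ** 2 / n))
    return result


def dominant_period(series):
    """Period (in positions along the series) with the largest spectral power."""
    k, _ = max(periodogram(series), key=lambda pair: pair[1])
    return len(series) / k


# Synthetic stand-in for a spatial expression series: one pattern
# repeating every 8 positions along 64 positions.
signal = [math.sin(2 * math.pi * t / 8) for t in range(64)]
print(dominant_period(signal))
```

On real data the positions would be genes or fixed-width windows along the chromosome, so a period in positions would still have to be converted to kilobases.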

    Operon information improves gene expression estimation for cDNA microarrays

    BACKGROUND: In prokaryotic genomes, genes are organized in operons, and the genes within an operon tend to have similar levels of expression. Because of co-transcription of genes within an operon, borrowing information from other genes within the same operon can improve the estimation of relative transcript levels; the estimation of relative levels of transcript abundances is one of the most challenging tasks in experimental genomics due to the high noise level in microarray data. Therefore, techniques that can improve such estimations, and moreover are based on sound biological premises, are expected to benefit the field of microarray data analysis. RESULTS: In this paper, we propose a hierarchical Bayesian model, which relies on borrowing information from other genes within the same operon, to improve the estimation of gene expression levels and, hence, the detection of differentially expressed genes. The simulation studies and the analysis of experimental data demonstrated that the proposed method outperformed other techniques that are routinely used to estimate transcript levels and detect differentially expressed genes, including the sample mean and SAM t statistics. The improvement became more significant as the noise level in microarray data increased. CONCLUSION: By borrowing information about transcriptional activity of genes within classified operons, we improved the estimation of gene expression levels and the detection of differentially expressed genes.
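
The idea of "borrowing information" within an operon can be illustrated with a much simpler model than the hierarchical Bayesian one the abstract describes. The sketch below assumes a normal-normal model with known variances, in which each gene's sample mean is pulled toward the mean of its operon. The function name `shrink_to_operon` and all parameter names are hypothetical, and using the average of the gene means as the operon mean is a plug-in assumption, not the paper's method.

```python
from statistics import mean


def shrink_to_operon(gene_means, noise_var, operon_var, n_reps):
    """Shrink per-gene sample means toward their operon mean.

    gene_means : dict mapping gene name -> sample mean log-ratio
    noise_var  : assumed measurement variance of a single replicate
    operon_var : assumed variance of true gene levels around the operon mean
    n_reps     : number of replicates behind each sample mean

    Under a normal-normal model with known variances, the posterior mean is
    a weighted average of the gene's own mean and the operon mean.
    """
    operon_mean = mean(gene_means.values())  # plug-in estimate (assumption)
    weight = operon_var / (operon_var + noise_var / n_reps)
    return {
        gene: weight * m + (1 - weight) * operon_mean
        for gene, m in gene_means.items()
    }


# Hypothetical three-gene operon with noisy single-replicate measurements.
observed = {"geneA": 1.0, "geneB": 2.0, "geneC": 3.0}
shrunk = shrink_to_operon(observed, noise_var=1.0, operon_var=1.0, n_reps=1)
print(shrunk)
```

The weight falls as the noise variance rises, so noisier data lean more heavily on the operon mean. That matches the abstract's observation that the benefit grows with the noise level.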

    Genome-wide localization of mobile elements: experimental, statistical and biological considerations

    BACKGROUND: The distribution and location of insertion elements in a genome is an excellent tool to track the evolution of bacterial strains and a useful molecular marker to distinguish between closely related bacterial isolates. The information about the genomic locations of IS elements is available in public sequence databases. However, the locations of mobile elements may vary from strain to strain and within the population of an individual strain. Tools that allow de novo localization of IS elements and are independent of existing sequence information are essential to map insertion elements and advance our knowledge of the role that such elements play in gene regulation and genome plasticity in bacteria. RESULTS: In this study, we present an efficient and reliable method for linear mapping of mobile elements using whole-genome DNA microarrays. In addition, we describe an algorithm for analysis of microarray data that can be applied to find DNA sequences physically juxtaposed with a target sequence of interest. This approach was used to map the locations of the IS5 elements in the genome of Escherichia coli K12. All IS5 elements present in the E. coli genome known from GenBank sequence data were identified. Furthermore, previously unknown insertion sites were predicted with high sensitivity and specificity. Two variants of E. coli K-12 MG1655 within a population of this strain were predicted by our analysis. The only significant difference between these two isolates was the presence of an IS5 element upstream of the main flagellar regulator, flhDC. Additional experiments confirmed this prediction and showed that these isolates were phenotypically distinct. The effect of IS5 on the transcriptional activity of motility and chemotaxis genes in the genome of E. coli strain MG1655 was examined. Comparative analysis of expression profiles revealed that the presence of IS5 results in a mild enhancement of transcription of the flagellar genes that translates into a slight increase in motility. CONCLUSION: In summary, this work presents a case study of an experimental and analytical application of DNA microarrays to map insertion elements in bacteria and offers insight into biological processes that might otherwise be overlooked by relying solely on the available genome sequence data.

    RecA can stimulate the relaxation activity of topoisomerase I: Molecular basis of topoisomerase-mediated genome-wide transcriptional responses in Escherichia coli

    The superhelicity of the chromosome, which is controlled by DNA topoisomerases, modulates global gene expression. Investigations of transcriptional responses to the modulation of gyrase function have identified two types of topoisomerase-mediated transcriptional responses: (i) steady-state changes elicited by a mutation in gyrase, such as the D82G mutation in GyrA, and (ii) dynamic changes elicited by the inhibition of gyrase. We hypothesize that the steady-state effects are due to the changes in biochemical properties of gyrase, whereas the dynamic effects are due to an imbalance between supercoiling and relaxation activities, which appears to be influenced by the RecA activity. Herein, we present biochemical evidence for hypothesized mechanisms. GyrA D82G gyrase exhibits a reduced supercoiling activity. The RecA protein can influence the balance between supercoiling and relaxation activities either by interfering with the activity of DNA gyrase or by facilitating the relaxation reaction. RecA has no effect on the supercoiling activity of gyrase but stimulates the relaxation activity of topoisomerase I. This stimulation is specific and requires formation of an active RecA filament. These results suggest that the functional interaction between RecA and topoisomerase I is responsible for RecA-mediated modulation of the relaxation-dependent transcriptional activity of the Escherichia coli chromosome

    Limited functional conservation of a global regulator among related bacterial genera: Lrp in Escherichia, Proteus and Vibrio

    BACKGROUND: Bacterial genome sequences are being determined rapidly, but few species are physiologically well characterized. Predicting regulation from genome sequences usually involves extrapolation from better-studied bacteria, using the hypothesis that a conserved regulator, conserved target gene, and predicted regulator-binding site in the target promoter imply conserved regulation between the two species. However, many compared organisms are ecologically and physiologically diverse, and the limits of extrapolation have not been well tested. In E. coli K-12 the leucine-responsive regulatory protein (Lrp) affects expression of ~400 genes. Proteus mirabilis and Vibrio cholerae have highly conserved lrp orthologs (98% and 92% identity to E. coli lrp). The functional equivalence of Lrp from these related species was assessed. RESULTS: Heterologous Lrp regulated gltB, livK and lrp transcriptional fusions in an E. coli background in the same general way as the native Lrp, though with significant differences in extent. Microarray analysis of these strains revealed that the heterologous Lrp proteins significantly influence only about half of the genes affected by native Lrp. In P. mirabilis, heterologous Lrp restored swarming, though with some pattern differences. P. mirabilis produced substantially more Lrp than E. coli or V. cholerae under some conditions. Lrp regulation of target gene orthologs differed among the three native hosts. Strikingly, while Lrp negatively regulates its own gene in E. coli, and was shown to do so even more strongly in P. mirabilis, Lrp appears to activate its own gene in V. cholerae. CONCLUSION: The overall similarity of regulatory effects of the Lrp orthologs supports the use of extrapolation between related strains for general purposes. However, this study also revealed intrinsic differences even between orthologous regulators sharing >90% overall identity, and 100% identity for the DNA-binding helix-turn-helix motif, as well as differences in the amounts of those regulators. These results suggest that predicting regulation of specific target genes based on genome sequence comparisons alone should be done on a conservative basis.

    Mariprofundus ferrooxydans PV-1 the First Genome of a Marine Fe(II) Oxidizing Zetaproteobacterium

    © The Author(s), 2011. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in PLoS One 6 (2011): e25386, doi:10.1371/journal.pone.0025386. Mariprofundus ferrooxydans PV-1 has provided the first genome of the recently discovered Zetaproteobacteria subdivision. Genome analysis reveals a complete TCA cycle, the ability to fix CO2, carbon-storage proteins and a sugar phosphotransferase system (PTS). The latter could facilitate the transport of carbohydrates across the cell membrane and possibly aid in stalk formation, a matrix composed of exopolymers and/or exopolysaccharides, which is used to store oxidized iron minerals outside the cell. Two-component signal transduction system genes, including histidine kinases, GGDEF domain genes, and response regulators containing CheY-like receivers, are abundant and widely distributed across the genome. Most of these are located in close proximity to genes required for cell division, phosphate uptake and transport, exopolymer and heavy metal secretion, flagellar biosynthesis and pilus assembly, suggesting that these functions are highly regulated. Similar to many other motile, microaerophilic bacteria, genes encoding aerotaxis as well as antioxidant functionality (e.g., superoxide dismutases and peroxidases) are predicted to sense and respond to oxygen gradients, as would be required to maintain cellular redox balance in the specialized habitat where M. ferrooxydans resides. Comparative genomics with other Fe(II) oxidizing bacteria residing in freshwater and marine environments revealed similar content, synteny, and amino acid similarity of coding sequences potentially involved in Fe(II) oxidation, signal transduction and response regulation, oxygen sensation and detoxification, and heavy metal resistance. This study has provided novel insights into the molecular nature of Zetaproteobacteria. Funding has been provided by the NSF Microbial Observatories Program (KJE, DE), NSF’s Science and Technology Program, by the Gordon and Betty Moore Foundation (KJE), the College of Letters, Arts, and Sciences at the University of Southern California (KJE), and by the NASA Astrobiology Institute (KJE, DE). Advanced Light Source analyses at the Lawrence Berkeley National Lab are supported by the Office of Science, Basic Energy Sciences, Division of Materials Science of the United States Department of Energy (DE-AC02-05CH11231).

    Algebraic Comparison of Partial Lists in Bioinformatics

    The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Due to resampling protocols or just within a meta-analysis comparison, instead of one list it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset
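
To make the notion of comparing ranked lists of unequal length concrete, here is a minimal sketch of one possible dissimilarity. It is a Canberra-style distance on ranks, chosen purely for illustration: the abstract does not name the metric, so this should not be read as the paper's algorithm. The convention that a feature missing from a list gets the rank just past that list's end is an assumption, and the names `canberra_partial` and `mean_pairwise_distance` are hypothetical.

```python
from itertools import combinations


def canberra_partial(list_a, list_b):
    """Canberra-style distance between two ranked partial lists.

    Ranks start at 1. A feature absent from a list is assigned the rank
    just past the end of that list (an illustrative convention).
    Only features occurring in at least one of the two lists are compared.
    """
    rank_a = {feature: i + 1 for i, feature in enumerate(list_a)}
    rank_b = {feature: i + 1 for i, feature in enumerate(list_b)}
    total = 0.0
    for feature in set(list_a) | set(list_b):
        ra = rank_a.get(feature, len(list_a) + 1)
        rb = rank_b.get(feature, len(list_b) + 1)
        total += abs(ra - rb) / (ra + rb)
    return total


def mean_pairwise_distance(lists):
    """Average distance over all pairs of lists; lower means more stable."""
    pairs = list(combinations(lists, 2))
    return sum(canberra_partial(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ranked gene lists of different lengths.
run_1 = ["g1", "g2", "g3"]
run_2 = ["g2", "g1"]
print(canberra_partial(run_1, run_2))
```

Dividing each rank difference by the rank sum weights disagreements near the top of the lists more heavily than those near the bottom, which is usually what matters when only the top-ranked features are carried forward.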