12 research outputs found

    neo4jsbml: import systems biology markup language data into the graph database Neo4j

    Systems Biology Markup Language (SBML) has emerged as a standard for representing biological models, facilitating model sharing and interoperability. It stores many types of data and complex relationships, complicating data management and analysis. Traditional database management systems struggle to effectively capture these complex networks of interactions within biological systems. Graph-oriented databases perform well in managing interactions between different entities. We present neo4jsbml, a new solution that bridges the gap between the Systems Biology Markup Language data and the Neo4j database, for storing, querying and analyzing data. The Systems Biology Markup Language organizes biological entities in a hierarchical structure, reflecting their interdependencies. The inherent graphical structure represents these hierarchical relationships, offering a natural and efficient means of navigating and exploring the model’s components. Neo4j is an excellent solution for handling this type of data. By representing entities as nodes and their relationships as edges, Cypher, Neo4j’s query language, efficiently traverses this type of graph representing complex biological networks. We have developed neo4jsbml, a Python library for importing Systems Biology Markup Language data into a Neo4j database using a user-defined schema. By leveraging Neo4j’s graphical database technology, exploration of complex biological networks becomes intuitive and information retrieval efficient. Neo4jsbml is a tool designed to import Systems Biology Markup Language data into a Neo4j database. Only the desired data is loaded into the Neo4j database. neo4jsbml is user-friendly and can become a useful new companion for visualizing and analyzing metabolic models through the Neo4j graphical database. neo4jsbml is open source software and available at https://github.com/brsynth/neo4jsbml
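The node-and-edge mapping described in this abstract can be sketched with the Python standard library alone. This is a minimal illustration, not neo4jsbml's actual API or schema: the toy SBML document, the function names `sbml_to_graph` and `to_cypher`, and the labels `Species`, `Reaction`, `IS_REACTANT_OF` and `HAS_PRODUCT` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 Core.
SBML_NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Hand-written toy model for illustration only; it is not validated SBML.
SBML_TEXT = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="glc" compartment="c"/>
      <species id="g6p" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="HEX1" reversible="false">
        <listOfReactants>
          <speciesReference species="glc" stoichiometry="1"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="g6p" stoichiometry="1"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""


def sbml_to_graph(text):
    """Return (nodes, edges) extracted from an SBML string.

    nodes: list of (label, id); edges: list of (source_id, relation, target_id).
    The labels and relation names are hypothetical, chosen for this sketch.
    """
    root = ET.fromstring(text)
    model = root.find("sbml:model", SBML_NS)
    nodes, edges = [], []
    for sp in model.findall("sbml:listOfSpecies/sbml:species", SBML_NS):
        nodes.append(("Species", sp.get("id")))
    for rx in model.findall("sbml:listOfReactions/sbml:reaction", SBML_NS):
        rx_id = rx.get("id")
        nodes.append(("Reaction", rx_id))
        for ref in rx.findall("sbml:listOfReactants/sbml:speciesReference", SBML_NS):
            edges.append((ref.get("species"), "IS_REACTANT_OF", rx_id))
        for ref in rx.findall("sbml:listOfProducts/sbml:speciesReference", SBML_NS):
            edges.append((rx_id, "HAS_PRODUCT", ref.get("species")))
    return nodes, edges


def to_cypher(nodes, edges):
    """Render nodes and edges as Cypher MERGE statements (plain strings).

    String interpolation keeps the sketch short; code talking to a real
    database should pass values as query parameters instead.
    """
    stmts = [f"MERGE (:{label} {{id: '{node_id}'}})" for label, node_id in nodes]
    for src, rel, dst in edges:
        stmts.append(
            f"MATCH (a {{id: '{src}'}}), (b {{id: '{dst}'}}) MERGE (a)-[:{rel}]->(b)"
        )
    return stmts


nodes, edges = sbml_to_graph(SBML_TEXT)
statements = to_cypher(nodes, edges)
for stmt in statements:
    print(stmt)
```

Once such a graph is loaded, a Cypher pattern like `MATCH (s:Species)-[:IS_REACTANT_OF]->(r:Reaction) RETURN s, r` would traverse it. The real tool lets the user define the schema, so the labels above should not be read as its defaults.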

    Staphylococcus aureus Transcriptome Architecture: From Laboratory to Infection-Mimicking Conditions

    Staphylococcus aureus is a major pathogen that colonizes about 20% of the human population. Intriguingly, this Gram-positive bacterium can survive and thrive under a wide range of different conditions, both inside and outside the human body. Here, we investigated the transcriptional adaptation of S. aureus HG001, a derivative of strain NCTC 8325, across experimental conditions ranging from optimal growth in vitro to intracellular growth in host cells. These data establish an extensive repertoire of transcription units and non-coding RNAs, a classification of 1412 promoters according to their dependence on the RNA polymerase sigma factors SigA or SigB, and allow identification of new potential targets for several known transcription factors. In particular, this study revealed a relatively low abundance of antisense RNAs in S. aureus, where they overlap only 6% of the coding genes, and only 19 antisense RNAs not co-transcribed with other genes were found. Promoter analysis and comparison with Bacillus subtilis links the small number of antisense RNAs to a less profound impact of alternative sigma factors in S. aureus. Furthermore, we revealed that Rho-dependent transcription termination suppresses pervasive antisense transcription, presumably originating from abundant spurious transcription initiation in this A+T-rich genome, which would otherwise affect expression of the overlapped genes. In summary, our study provides genome-wide information on transcriptional regulation and non-coding RNAs in S. aureus as well as new insights into the biological function of Rho and the implications of spurious transcription in bacteria

    Genoscapist: online exploration of quantitative profiles along genomes via interactively customized graphical representations

    Genoscapist is a web-tool generating high-quality images for interactive visualization of hundreds of quantitative profiles along a reference genome together with various annotations. Relevance is demonstrated by deployment of two websites dedicated to large condition-dependent transcriptome datasets available for Bacillus subtilis and Staphylococcus aureus.

    Direct comparison of spatial transcriptional heterogeneity across diverse Bacillus subtilis biofilm communities

    Bacillus subtilis can form various types of spatially organised communities on surfaces, such as colonies, pellicles and submerged biofilms. These communities share similarities and differences, and phenotypic heterogeneity has been reported for each type of community. Here, we studied spatial transcriptional heterogeneity across the three types of surface-associated communities. Using RNA-seq analysis of different regions or populations for each community type, we identified genes that are specifically expressed within each selected population. We constructed fluorescent transcriptional fusions for 17 of these genes, and observed their expression in submerged biofilms using time-lapse confocal laser scanning microscopy (CLSM). We found mosaic expression patterns for some genes; in particular, we observed spatially segregated cells displaying opposite regulation of carbon metabolism genes (gapA and gapB), indicative of distinct glycolytic or gluconeogenic regimes coexisting in the same biofilm region. Overall, our study provides a direct comparison of spatial transcriptional heterogeneity, at different scales, for the three main models of B. subtilis surface-associated communities.

    Transcriptional landscape reconstruction leads to a new annotation of the S. aureus HG001 genome.

    Panels (A-G) show examples of the different categories of transcription segments outside annotated CDSs and RNA genes. Each panel shows from top to bottom (i) the original GenBank annotation, (ii) a selection of 30 representative expression profiles (horizontal black lines show for each strand the chromosome median, and the associated 5-fold and 10-fold cut-offs) colored according to the position of the hybridization in 3D PCA, (iii) the detected up-shifts, the associated transcription units, and the down-shift positions, (iv) the new annotation with unannotated expressed segments colored according to the classification based on the transcriptional context. The different categories of terminal regions are 5’UTR (green boxes) and three classes of 3’ regions: 3’UTR (red) ending with a defined termination site, 3’NT (orange) without defined termination site, and 3’PT (old yellow) downstream of a site of partial termination. Two categories of intergenic regions are distinguished: intra (dark blue) for strictly intracistronic regions, and inter (light blue) for regions where the downstream gene can be transcribed from its own promoter. Finally, depending on the presence or absence of a defined termination site, independent segments decompose into two categories: indep (black) and indep-NT (brown). Transcription segments overlapping (≥100bp or ≥50%) GenBank annotated genes on the opposite strand are referred to as antisense (AS).
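The antisense criterion at the end of this caption (overlap of ≥100bp or ≥50% with an annotated gene on the opposite strand) can be sketched as a small predicate. This is only an illustration under stated assumptions: the caption does not say what the 50% is measured against, so the sketch assumes it is the fraction of the segment's own length, and the function names and the `(start, end, strand)` tuple layout are hypothetical.

```python
def overlap_bp(a, b):
    """Number of shared bases between two closed, 1-based intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def is_antisense(segment, gene, min_bp=100, min_fraction=0.5):
    """Return True if `segment` counts as antisense to `gene`.

    segment, gene: (start, end, strand) with strand "+" or "-".
    Assumption: the 50% threshold is taken relative to the segment length.
    """
    if segment[2] == gene[2]:
        return False  # same strand, so not antisense by definition
    shared = overlap_bp(segment[:2], gene[:2])
    segment_length = segment[1] - segment[0] + 1
    return shared >= min_bp or shared / segment_length >= min_fraction


# A long segment sharing 101 bases with a gene on the opposite strand.
print(is_antisense((100, 400, "+"), (300, 900, "-")))
# A short segment: only 30 shared bases, but that is half of its length.
print(is_antisense((100, 159, "+"), (130, 900, "-")))
```

Either threshold alone is sufficient in this sketch, which mirrors the "or" in the caption; a stricter reading would need the authors' methods section.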

    Promoter tree, sigma-factor and TFBS predictions.

    From top to bottom, the figure includes the following elements: A promoter tree built by hierarchical clustering of promoter activities across RNA samples based on pairwise correlations. The classification of up-shifts according to the type of sigma-factor binding sites identified (black bars for SigA, orange bars for SigB, gray bars for lack of sigma-factor binding site identification). The clusters of size ≥15 promoters obtained when splitting the tree at an average Pearson correlation coefficient of 0.6. The TFBSs identified by MAST search. Here, the different transcription factors are listed on the left-hand side of the plot along with the counts for three different categories of up-shifts. The color codes used for counts and symbols are: blue for sites predicted by MAST search and included in the training (RegPrecise) set, red for sites in the training set but not identified by our MAST search, green for sites predicted by MAST search but not listed in the training set, thus representing newly identified potential TFBSs.

    Context and impact of elevated antisense expression levels in the Δrho mutant.

    (A) Transcription profiles for selected regions showing different effects of rho deletion, from left to right: 1) flattening of the downstream drift expression patterns typical of regions lacking defined termination sites (3’PT and 3’NT); 2) & 3) expression of regions for which no promoters are detected in the wild-type; 4) & 5) transcriptional read-through at defined termination sites; 6) higher transcript levels of coding genes. For the first two regions we show the transcription profiles for the four growth conditions examined, with wild-type profiles in black and Δrho mutant profiles in condition-specific colors, as well as the 30 representative wild-type profiles. As seen in these examples, the impact of rho deletion tends to be stronger in RPMI than in TSB medium and in exponential growth than in stationary phase. Some degree of decrease of sense transcript levels, which may be caused by elevated antisense levels, is seen in these examples. (B) Sense versus antisense transcription levels in the wild-type and in the Δrho mutant for exponential growth in RPMI. Each annotated gene is represented by a point. There is a strong negative correlation between sense and antisense expression in the Δrho mutant (Pearson correlation coefficient r = -0.73) that is also visible but much weaker in the wild-type (r = -0.30). Indeed, antisense levels tend to increase genome-wide in the Δrho mutant, except for the antisense strand of the most highly expressed genes. The most down-regulated genes (expression level in the Δrho mutant is ≤50% of the wild-type) are highlighted in blue; they face antisense transcripts with particularly elevated levels in the Δrho mutant. Horizontal and vertical lines indicate the medians (global in gray, most down-regulated genes in blue).