8 research outputs found

    Local Renyi entropic profiles of DNA sequences

    Abstract
    Background: In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition explored there was based on the Rényi entropy of a probability density function (pdf) estimated with Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concept of continuous entropy by defining DNA sequence entropic profiles that use the new pdf estimations to refine the density estimation of motifs.
    Results: The new methodology enables two results. First, it shows that the entropic profiles are directly related to the statistical significance of motifs, allowing the study of under- and over-representation of segments. Second, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/.
    Conclusion: The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.
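The abstract above combines a Chaos Game Representation of a DNA sequence with a Parzen-window estimate of Rényi entropy. The following is only a minimal sketch of that general idea, not the authors' Matlab implementation: it assumes the order-2 (quadratic) Rényi entropy, an isotropic Gaussian kernel, a hypothetical corner assignment for the four bases, and an arbitrary kernel width `sigma`. The fractal kernel and the per-position entropic profiles described in the abstract are not reproduced here.

```python
import math

# Assumed corner assignment for the unit square; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Map a DNA string to CGR coordinates: each point is the midpoint
    between the previous point and the corner of the current base."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def renyi_quadratic_entropy(points, sigma=0.1):
    """Order-2 Renyi entropy of a Gaussian Parzen-window density estimate.

    For Gaussian kernels the integral of the squared density reduces to a
    double sum of Gaussians with variance 2*sigma^2 over all point pairs.
    """
    n = len(points)
    var = 2.0 * sigma ** 2
    norm = 1.0 / (2.0 * math.pi * var)  # 2-D isotropic Gaussian normaliser
    total = 0.0
    for xi, yi in points:
        for xj, yj in points:
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += norm * math.exp(-d2 / (2.0 * var))
    return -math.log(total / (n * n))
```

Under these assumptions a highly repetitive sequence, whose CGR points cluster together, yields a lower entropy than a sequence whose points spread over the square.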

    SIGffRid: A tool to search for sigma factor binding sites in bacterial genomes using comparative approach and biologically driven statistics

    Abstract
    Background: Many programs have been developed to identify transcription factor binding sites. However, most of them are not able to infer two-word motifs with variable spacer lengths. This case is encountered for RNA polymerase Sigma (σ) Factor Binding Sites (SFBSs), which are usually composed of two boxes, called -35 and -10 in reference to the transcription initiation point. Our goal is to design an algorithm that detects SFBSs by using combinational and statistical constraints deduced from biological observations.
    Results: We describe a new approach to identify SFBSs by comparing two related bacterial genomes. The method, named SIGffRid (SIGma Factor binding sites Finder using R'MES to select Input Data), performs a simultaneous analysis of pairs of promoter regions of orthologous genes. SIGffRid uses a prior identification of over-represented patterns in whole genomes as selection criteria for potential -35 and -10 boxes. These patterns are then grouped using pairs of short seeds (of which one is possibly gapped), allowing a variable-length spacer between them. Next, the motifs are extended under the guidance of statistical considerations, which ensures that the selected motifs have statistically relevant properties. We applied our method to the pair of related bacterial genomes of Streptomyces coelicolor and Streptomyces avermitilis. A cross-check with the well-defined SFBSs of the SigR regulon in S. coelicolor is detailed, validating the algorithm. SFBSs for HrdB and BldN were also found, and the results suggested some new targets for these σ factors. In addition, consensus motifs for BldD and new SFBSs were defined, overlapping previously proposed consensuses. Relevant tests were also carried out on bacteria with moderate GC content (i.e. the Escherichia coli/Salmonella typhimurium and Bacillus subtilis/Bacillus licheniformis pairs). Motifs of house-keeping σ factors were found, as well as other SFBSs such as that of SigW in Bacillus strains.
    Conclusion: We demonstrate that our approach, which combines statistical and biological criteria, successfully predicts SFBSs. The versatility of the method allows the recognition of other kinds of two-box regulatory sites.
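To make the notion of a two-box motif with a variable-length spacer concrete, here is a minimal sketch that scans a sequence for a fixed -35 box and a fixed -10 box separated by a spacer within a given length range. It is not the SIGffRid algorithm, which discovers the boxes from over-represented patterns in two genomes. The function name, the exact-match boxes and the default spacer range of 15 to 19 nucleotides are assumptions chosen for illustration.

```python
import re


def find_two_box_sites(seq, box35, box10, min_spacer=15, max_spacer=19):
    """Return (start, spacer_length) for every position where box35 is
    followed by box10 after a spacer of min_spacer..max_spacer bases.

    A lookahead is used so that overlapping candidate sites are all
    reported; the lazy quantifier picks the shortest valid spacer.
    """
    pattern = re.compile(
        "(?=(%s)([ACGT]{%d,%d}?)(%s))"
        % (re.escape(box35), min_spacer, max_spacer, re.escape(box10))
    )
    return [(m.start(), len(m.group(2))) for m in pattern.finditer(seq.upper())]
```

For example, calling `find_two_box_sites(promoter, "TTGACA", "TATAAT")` searches for the well-known E. coli σ70 consensus boxes; real binding sites are degenerate, so a practical tool would score approximate matches rather than require exact ones.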

    Towards a Real-time Mitigation of High Temperature while Drilling using a Multi-agent System

    In oilfield wells, while drilling several kilometers below the surface, high temperature damages the drilling tools. This costs money and time for the tripping operations needed to change the damaged tool. Existing temperature mitigation techniques have several drawbacks, including a long response time, analogue signal issues and the need for human intervention. In this work, we empower the down-hole tools with a coordination mechanism that mitigates high temperature in soft real time by controlling a down-hole actuator through a voting process. The tools are represented by agents that control the sensors and actuators embedded in these tools. To implement the proposed system properly, a model of the drilling domain is constructed that takes into consideration all drilling mechanics and parameters, along with the well trajectory and temperature equations. The proposed model is implemented and tested using AgentOil, a multi-agent-based simulation tool, and the results are evaluated. Furthermore, the requirements of a real-time temperature mitigation system for Oil & Gas drilling operations are identified and the constraints of such systems are analyzed.

    AgentOil: A Multiagent-Based Simulation of the Drilling Process in Oilfields

    Oil & Gas have been the world's most important source of energy since the mid-1950s. For instance, British oilfields produce about 76 million tonnes of oil equivalent each year. This provides 76% of the UK's total primary energy [5]. In oilfield wells, a drilling rig is used to create a bore-hole in the earth's sub-surface with a Bottom Hole Assembly (BHA), which is a composition of several drilling tools with various functionalities, searching for natural resources.

    A Cyber-Physical System for Semi-autonomous Oil & Gas Drilling Operations

    In Oil & Gas drilling operations, once great depths are reached, the temperature rises enough to damage the down-hole drilling tools, and the existing mitigation process is insufficient. In this paper, we propose a Cyber-Physical System (CPS) in which agents represent the collaborating entities in Oil & Gas fields, both up-hole and down-hole. With the proposed CPS, down-hole tools respond to high temperature autonomously through a decentralized collective vote based on the tools' internal decision model, while waiting for the cooling performed up-hole by the field engineer. This decision model, driven by the tools' specifications, aims to withstand high temperature. The proposed CPS is implemented using a multiagent simulation environment, and the results show that it mitigates high temperature properly with both the voting and the cooling mechanisms.
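The abstract does not specify the voting rule or the tools' decision model, so the following is purely an illustrative sketch of what a collective vote among tool agents could look like. Everything in it is an assumption: the `ToolAgent` class, the rated temperature limit `max_temp_c`, the safety fraction `margin`, and the simple-majority rule are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ToolAgent:
    """Hypothetical down-hole tool agent with a rated temperature limit."""
    name: str
    max_temp_c: float    # assumed rated limit from the tool's specification
    margin: float = 0.9  # assumed fraction of the limit that triggers a vote

    def vote(self, temp_c: float) -> bool:
        """Vote for mitigation once the temperature nears the rated limit."""
        return temp_c >= self.margin * self.max_temp_c


def collective_decision(agents, temp_c):
    """Return True when a strict majority of agents votes for mitigation."""
    yes_votes = sum(agent.vote(temp_c) for agent in agents)
    return yes_votes * 2 > len(agents)
```

In this sketch the actuator would be triggered only when `collective_decision` returns True; a weighted vote or a veto by the most temperature-sensitive tool would be equally plausible design choices.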

    Using RSAT oligo-analysis and dyad-analysis tools to discover regulatory signals in nucleic sequences.

    This protocol explains how to discover functional signals in genomic sequences by detecting over- or under-represented oligonucleotides (words) or spaced pairs thereof (dyads) with the Regulatory Sequence Analysis Tools (http://rsat.ulb.ac.be/rsat/). Two typical applications are presented: (i) predicting transcription factor-binding motifs in promoters of coregulated genes and (ii) discovering phylogenetic footprints in promoters of orthologous genes. The steps of this protocol include purging genomic sequences to discard redundant fragments, discovering over-represented patterns and assembling them to obtain degenerate motifs, scanning sequences and drawing feature maps. The main strength of the method is its statistical grounding: the binomial significance provides an efficient control on the rate of false positives. In contrast with optimization-based pattern discovery algorithms, the method supports the detection of under- as well as over-represented motifs. Computation times vary from seconds (gene clusters) to minutes (whole genomes). The execution of the whole protocol should take approximately 1 h.
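The binomial significance mentioned in the abstract above can be illustrated with a small sketch. This is not the RSAT implementation: it assumes an equiprobable background for the four nucleotides, counts overlapping occurrences on a single strand, and multiplies each P-value by the number of possible words to obtain an E-value. The function names are hypothetical.

```python
import math
from collections import Counter


def count_kmers(seqs, k):
    """Count all (overlapping) words of length k across the sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def binomial_upper_tail(n, p, x):
    """P(X >= x) for X ~ Binomial(n, p), summed exactly."""
    return sum(
        math.comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(x, n + 1)
    )


def overrepresented_words(seqs, k):
    """Rank words by binomial P-value of observing at least their count.

    Assumes each word has background probability 0.25**k (equiprobable
    nucleotides); a real analysis would use a calibrated background.
    """
    counts = count_kmers(seqs, k)
    n_positions = sum(counts.values())
    p_word = 0.25 ** k
    n_tests = 4 ** k  # number of distinct words, used for the E-value
    results = []
    for word, observed in counts.items():
        p_value = binomial_upper_tail(n_positions, p_word, observed)
        results.append((word, observed, p_value, p_value * n_tests))
    results.sort(key=lambda r: r[2])
    return results
```

Because overlapping occurrences of a word are not independent, the binomial model is only an approximation here; the lower tail `P(X <= x)` could be computed in the same way to look for under-represented words.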