46 research outputs found

    Superhelical Duplex Destabilization and the Recombination Position Effect

    The susceptibility to recombination of a plasmid inserted into a chromosome varies with its genomic position. This recombination position effect is known to correlate with the average G+C content of the flanking sequences. Here we propose that this effect could be mediated by the accompanying changes in susceptibility to superhelical duplex destabilization. We use standard nonparametric statistical tests, regression analysis and principal component analysis to identify statistically significant differences in the destabilization profiles calculated for the plasmid in different contexts, and correlate the results with the measured recombination rates. We show that the flanking sequences significantly affect the free energy of denaturation at specific sites interior to the plasmid. These changes correlate well with experimentally measured variations of the recombination rates within the plasmid. This correlation of recombination rate with superhelical destabilization properties of the inserted plasmid DNA is stronger than that with average G+C content of the flanking sequences. This model suggests a possible mechanism by which flanking sequence base composition, which is not itself a context-dependent attribute, can affect recombination rates at positions within the plasmid.
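
The "average G+C content of the flanking sequences" mentioned in this abstract can be illustrated with a minimal Python sketch. The function names and the flank width of 1000 bp are assumptions made for illustration; the abstract does not say how wide a flank the authors used.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    return sum(b in "GC" for b in bases) / len(bases)


def flanking_gc(genome, start, end, flank=1000):
    """G+C content of the sequence flanking an insert at genome[start:end].

    `flank` is a hypothetical window size in bp, not a value from the paper.
    """
    left = genome[max(0, start - flank):start]   # upstream flank
    right = genome[end:end + flank]              # downstream flank
    return gc_content(left + right)
```

For example, `flanking_gc("GGGG" + "ATAT" + "CCAA", 4, 8, flank=4)` considers the flanks `GGGG` and `CCAA` and ignores the insert itself.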

    Promoter prediction and annotation of microbial genomes based on DNA sequence and structural responses to superhelical stress

    BACKGROUND: In our previous studies, we found that the sites in prokaryotic genomes which are most susceptible to duplex destabilization under the negative superhelical stresses that occur in vivo are highly significantly associated with intergenic regions that are known or inferred to contain promoters. In this report we investigate how this structural property, either alone or together with other structural and sequence attributes, may be used to search prokaryotic genomes for promoters. RESULTS: We show that the propensity for stress-induced DNA duplex destabilization (SIDD) is closely associated with specific promoter regions. The extent of destabilization in promoter-containing regions is found to be bimodally distributed. When compared with DNA curvature, deformability, thermostability or sequence motif scores within the -10 region, SIDD is found to be the most informative DNA property regarding promoter locations in the E. coli K12 genome. SIDD properties alone perform better at detecting promoter regions than other programs trained on this genome. Because this approach has a very low false positive rate, it can be used to predict with high confidence the subset of promoters that are strongly destabilized. When SIDD properties are combined with -10 motif scores in a linear classification function, they predict promoter regions with better than 80% accuracy. When these methods were tested with promoter and non-promoter sequences from Bacillus subtilis, they achieved similar or higher accuracies. We also present a strictly SIDD-based predictor for annotating promoter sequences in complete microbial genomes. CONCLUSION: In this report we show that the propensity to undergo stress-induced duplex destabilization (SIDD) is a distinctive structural attribute of many prokaryotic promoter sequences. We have developed methods to identify promoter sequences in prokaryotic genomes that use SIDD either as a sole predictor or in combination with other DNA structural and sequence properties. Although these methods cannot predict all the promoter-containing regions in a genome, they do find large sets of potential regions that have high probabilities of being true positives. This approach could be especially valuable for annotating those genomes about which there is limited experimental data.

    Coupling models of cattle and farms with models of badgers for predicting the dynamics of bovine tuberculosis (TB)

    Bovine TB is a major problem for the agricultural industry in several countries. TB can be contracted and spread by species other than cattle, and this can cause a problem for disease control. In the UK and Ireland, badgers are a recognised reservoir of infection and there has been substantial discussion about potential control strategies. We present a coupling of individual-based models of bovine TB in badgers and cattle, which aims to capture the key details of the natural history of the disease and of both species at approximately county scale. The model is spatially explicit: it follows a very large number of cattle and badgers on a different grid size for each species, and it also includes winter housing. We show that the model can replicate the reported dynamics of both cattle and badger populations as well as the increasing prevalence of the disease in cattle. The parameter space used as input to the simulations was swept using Latin hypercube sampling, and sensitivity analysis of the model outputs was conducted using mixed effect models. By exploring a large and computationally intensive parameter space we show that, of the available control strategies, it is the frequency of TB testing and whether or not winter housing is practised that have the most significant effects on the number of infected cattle, with the effect of winter housing becoming stronger as farm size increases. Whether badgers were culled or not explained about 5% of the variance in the number of infected cattle, while the accuracy of the test employed to detect infected cattle explained less than 3%.
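
Latin hypercube sampling, which this abstract names as the way the parameter space was swept, can be sketched as follows. This is a generic textbook version written with the Python standard library only, not the authors' code; the function name, the bounds and the seed are placeholders.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples points so that each parameter's range is split into
    n_samples equal-width strata and every stratum is sampled exactly once.

    bounds is a list of (low, high) pairs, one per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum of the unit interval.
        unit = [(i + rng.random()) / n_samples for i in range(n_samples)]
        # Shuffle so strata are paired at random across parameters.
        rng.shuffle(unit)
        columns.append([low + u * (high - low) for u in unit])
    # Transpose columns into one tuple of parameter values per sample.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]
```

Each returned tuple would correspond to one simulation run. Compared with independent uniform draws, every parameter is covered evenly across its range even when only a small number of runs is affordable.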

    The distribution of inverted repeat sequences in the Saccharomyces cerevisiae genome

    Although a variety of possible functions have been proposed for inverted repeat sequences (IRs), it is not known which of them might occur in vivo. We investigate this question by assessing the distributions and properties of IRs in the Saccharomyces cerevisiae genome. Using the IRFinder algorithm we detect 100,514 IRs having copy length greater than 6 bp and spacer length less than 77 bp. To assess statistical significance we also determine the IR distributions in two types of randomization of the S. cerevisiae genome. We find that the S. cerevisiae genome is significantly enriched in IRs relative to the randomized genomes. The S. cerevisiae IRs are significantly longer and contain fewer imperfections than those from the randomized genomes, suggesting that processes to lengthen and/or correct errors in IRs may be operative in vivo. The S. cerevisiae IRs are highly clustered in intergenic regions, while their occurrence in coding sequences is consistent with random expectation. Clustering is stronger in the 3′ flanks of genes than in their 5′ flanks. However, the S. cerevisiae genome is not enriched in those IRs that would extrude cruciforms, suggesting that this is not a common event. Various explanations for these results are considered.
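
To make the copy-length and spacer-length criteria concrete, here is a minimal brute-force sketch in Python. It is not the IRFinder algorithm: it reports only perfect inverted repeats at a fixed seed length and does not merge or extend overlapping hits, whereas the abstract also discusses imperfect repeats. The function names are illustrative, and the defaults simply encode "copy length greater than 6 bp" and "spacer length less than 77 bp".

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_perfect_irs(seq, min_copy=7, max_spacer=76):
    """Return (start, copy_len, spacer_len) for each perfect inverted repeat.

    A hit is a min_copy-long segment followed, after 0..max_spacer bases,
    by its exact reverse complement.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    for start in range(n - 2 * min_copy + 1):
        target = reverse_complement(seq[start:start + min_copy])
        for spacer in range(max_spacer + 1):
            right = start + min_copy + spacer
            if right + min_copy > n:
                break  # right copy would run past the end of the sequence
            if seq[right:right + min_copy] == target:
                hits.append((start, min_copy, spacer))
    return hits
```

Running the same function on shuffled versions of a sequence gives a crude analogue of the randomized-genome comparison described in the abstract, although the two randomization schemes used there are not specified here.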