202,684 research outputs found

    Relationship between promoter sequence and its strength in gene expression

    In this study, through various tests, a theoretical model is presented to describe the relationship between promoter strength and nucleotide sequence. Our analysis shows that promoter strength is greatly influenced by groups of three adjacent nucleotides in the sequence, and that nucleotides in different regions of the promoter sequence have different effects on promoter strength. Based on experimental data for E. coli promoters, our calculations indicate that nucleotides in the -10 region, the -35 region, and the discriminator region of the promoter sequence are more important than those in the spacing region for determining promoter strength. With model parameter values obtained by fitting to experimental data, four promoter libraries are built theoretically for the corresponding experimental environments under which promoter strength in gene expression has previously been measured.

    A novel method for prokaryotic promoter prediction based on DNA stability

    Background: In the post-genomic era, correct gene prediction has become one of the biggest challenges in genome annotation. Improved promoter prediction methods can be one step towards developing more reliable ab initio gene prediction methods. This work presents a novel prokaryotic promoter prediction method based on DNA stability. Results: The promoter region is less stable, and hence more prone to melting, than other genomic regions. Our analysis shows that a method of promoter prediction based on the differences in the stability of DNA sequences in the promoter and non-promoter regions works much better than existing prokaryotic promoter prediction programs, which are based on sequence motif searches. At present the method works optimally for genomes such as that of Escherichia coli, which have near 50% G+C composition, and also performs satisfactorily for other prokaryotic promoters. Conclusions: Our analysis shows that the change in stability of DNA provides a much better clue than the usual sequence motifs, such as the Pribnow box and the -35 sequence, for differentiating promoter regions from non-promoter regions. To a certain extent, it is more general and is likely to be applicable across organisms. Hence incorporating such features in addition to the signature motifs can greatly improve the presently available promoter prediction programs.
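
The stability idea in the abstract above can be illustrated with a sliding-window profile. This is only a minimal sketch under stated assumptions: the abstract does not give the energy parameters or window size, so `toy_step_energies` is a hypothetical placeholder table (AT-rich steps scored as less stable than GC-rich ones), not measured values, and all function names are invented for illustration.

```python
from itertools import product


def toy_step_energies():
    """Illustrative dinucleotide stability scores (NOT measured values).

    More negative means more stable; each G or C in the step adds stability.
    """
    table = {}
    for a, b in product("ACGT", repeat=2):
        gc_count = (a in "GC") + (b in "GC")
        table[a + b] = -1.0 - 0.5 * gc_count
    return table


def window_stability(seq, energies, window):
    """Mean dinucleotide score for every window of `window` nucleotides."""
    steps = [energies[seq[i:i + 2]] for i in range(len(seq) - 1)]
    n_steps = window - 1  # a window of w nucleotides contains w - 1 steps
    return [
        sum(steps[i:i + n_steps]) / n_steps
        for i in range(len(steps) - n_steps + 1)
    ]


def least_stable_window(seq, energies, window):
    """Start index of the window with the highest (least negative) mean score."""
    profile = window_stability(seq, energies, window)
    return max(range(len(profile)), key=profile.__getitem__)
```

With the toy table, `least_stable_window("GCGCGCGCATATATATGCGCGCGC", toy_step_energies(), 8)` picks out the AT-only stretch; a real analysis would substitute experimentally derived nearest-neighbour parameters.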

    Investigation of Streptomyces promoters

    Bibliography: leaves 221-[234]. The work described here had multiple aims: to create a promoter probe suitable for the isolation of developmentally regulated Streptomyces promoters, to isolate such promoters, to develop a computer-assisted analysis system whereby potential promoter sequences could be determined, and to use this system in the analysis of the cloned promoters. Initially the suitability of the Streptomyces antibioticus melC operon for use as a reporter system in Streptomyces was investigated. It was established that late-expressed promoters could be identified and that it was possible to use the melC2 gene alone for this purpose. However, it was shown that the use of both melC1 and melC2 resulted in a more sensitive reporter system. High copy number promoter probe vectors were constructed and tested. A low copy number promoter probe (which used the Streptomyces penemefaciens pSPN1 origin of replication) was also constructed. The characteristics (copy number, stability and mobility) of the probe were established. The conditions in which sporulation was induced by phosphate limitation were identified. Under such conditions late-expressing, phosphate-dependent promoters were isolated using the promoter probes previously developed. The expression of these promoters was tested in Streptomyces coelicolor bldA mutants, and the bldA-dependent promoters were identified. These were sequenced. Computer-assisted analysis of DNA sequence bias was conducted, with the intention of using bias patterns to identify potential regulatory regions. The initial approach of using the sequence bias of protein coding regions (based on the premise that regulatory sites are likely to be under-represented in these regions) was unsuccessful. Further analysis was conducted of the positional preference of sequences that were over-represented in regulatory regions. Based on this, the known promoters of Streptomyces were partially classified.
The sequence bias of protein coding DNA regions was used to develop a novel method to identify the protein coding regions of Streptomyces DNA. The computer programs were then used to identify protein coding and potential regulatory regions.

    Genomic analysis of gene regulation complexity

    With multiple metazoan genomes in each family being sequenced, promoter analysis is becoming a useful tool in genomic analysis. Aligning the promoter regions in the DNA of C. elegans and C. briggsae identifies conserved promoter elements. While not all promoter elements are conserved and not all conserved regions are promoter elements, we find that conservation is a useful method for determining promoter complexity. Promoter complexity identifies which genes have particularly interesting regulation, identifying gene groups with a strong promoter complexity signal and cases where a gene's promoter complexity differs from the group's promoter complexity. We identify potential promoter sequence by several local sequence alignment methods. Instead of studying individual promoter elements we are looking at patterns of promoter complexity; the total conserved sequence for each gene gives us a measure for promoter complexity. Monte Carlo random sampling is used to identify Gene Ontology and KEGG Pathway annotated gene groups that appear to have significantly low or high complexity. Developmental genes were found to have low complexity while growth genes have high complexity. Other groups that we expected to have high significance show none at all or had low promoter complexity. Genes contributing to the extracellular region scored high in promoter complexity while basal transcription factors often scored low in complexity. Genes annotated with GO terms transcription factors, signalling genes, genes with multiple alternative splice products, and developmental genes had significant promoter scores. We examined gene expression in the published C. elegans microarray experiments and found a strong positive correlation between gene group expression variation and promoter complexity. Promoter complexity tends to be an accurate predictor of the complexity of a gene's pattern of expression and also gives us another tool to find anomalous genes.
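
The Monte Carlo step mentioned in the abstract above can be sketched as an empirical test of whether an annotated gene group has unusually high or low mean complexity. This is a minimal sketch, not the authors' procedure: the function name, the use of the group mean as the statistic, and the add-one p-value correction are all assumptions made for illustration.

```python
import random


def group_complexity_pvalues(scores, group, n_samples=1000, seed=0):
    """Empirical p-values for a gene group's mean complexity score.

    scores: dict mapping gene name -> complexity score (hypothetical input).
    group:  list of gene names forming the annotated group.
    Returns (p_high, p_low): the add-one-corrected fraction of random
    same-size gene sets whose mean is >= or <= the observed group mean.
    """
    rng = random.Random(seed)
    genes = list(scores)
    k = len(group)
    observed = sum(scores[g] for g in group) / k

    high = low = 0
    for _ in range(n_samples):
        sample = rng.sample(genes, k)  # random gene set without replacement
        mean = sum(scores[g] for g in sample) / k
        high += mean >= observed
        low += mean <= observed

    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (high + 1) / (n_samples + 1), (low + 1) / (n_samples + 1)
```

A small `p_high` would flag a group as high-complexity and a small `p_low` as low-complexity; in practice a multiple-testing correction across groups would also be needed.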

    Integrating genomic resources to present full gene and putative promoter capture probe sets for bread wheat

    BACKGROUND: Whole-genome shotgun resequencing of wheat is expensive because of its large, repetitive genome. Moreover, sequence data can fail to map uniquely to the reference genome, making it difficult to unambiguously assign variation. Resequencing using target capture enables sequencing of large numbers of individuals at high coverage to reliably identify variants associated with important agronomic traits. Previous studies have implemented complementary DNA/exon or gene-based probe sets in which the promoter and intron sequence is largely missing alongside newly characterized genes from the recent improved reference sequences. RESULTS: We present and validate 2 gold standard capture probe sets for hexaploid bread wheat, a gene and a putative promoter capture, which are designed using recently developed genome sequence and annotation resources. The captures can be combined or used independently. We demonstrate that the capture probe sets effectively enrich the high-confidence genes and putative promoter regions that were identified in the genome alongside a large proportion of the low-confidence genes and associated promoters. Finally, we demonstrate successful sample multiplexing that allows generation of adequate sequence coverage for single-nucleotide polymorphism calling while significantly reducing cost per sample for gene and putative promoter capture. CONCLUSIONS: We show that a capture design employing an "island strategy" can enable analysis of the large gene/putative promoter space of wheat with only 2 × 160 Mbp probe sets. Furthermore, these assays extend the regions of the wheat genome that are amenable to analyses beyond its exome, providing tools for detailed characterization of these regulatory regions in large populations

    Transcriptional regulation of the hepatitis B virus large surface antigen gene

    Hepatitis B virus (HBV) is a hepatotropic virus of highly restricted host range and tissue specificity. Although the mechanisms governing this tropism are not fully understood, it is likely that restrictions occur at multiple steps in the viral life cycle. The liver-specific regulation of HBV gene expression suggests that transcription may be an important factor in the hepatotropism of the virus. An analysis of tissue- or cell-line-specific regulation of the HBV promoters may elucidate the role of transcriptional regulation in the hepatotropism of the virus. The major aim of this project was to characterize the transcriptional regulation of the large surface antigen gene of hepatitis B virus. To achieve this, the regions of the HBV genome involved in the regulation of the expression of the large surface antigen gene were identified using a transient transfection system in mammalian cell lines. The transcriptional activities of the four HBV promoters were compared in the human differentiated hepatoma cell lines Hep3B, PLC/PRF/5, HepG2 and Huh7, a human dedifferentiated hepatoma cell line HepG2.1, and the nonhepatoma cell lines HeLa S3 and NIH 3T3. To determine the relative transcriptional activities of the four HBV promoters, reporter gene plasmids were generated such that the expression of the firefly luciferase gene was under the control of each of the HBV promoters in the context of the complete genome. The nucleocapsid promoter and large surface antigen promoter displayed higher relative activities in the differentiated hepatoma cell lines, indicating that these promoters are preferentially active in these cell lines. A series of large surface antigen promoter deletion plasmids were constructed to identify the important regulatory regions of the large surface antigen promoter.
The deletion analysis demonstrated that the region responsible for the high relative activity in differentiated hepatoma cell lines is located between -90 and -76 relative to the transcription initiation site (+1) located at map position 2809. This sequence element contains the binding site (GTTAATCATTACT) for the liver-enriched transcription factor hepatocyte nuclear factor 1, HNF1. A eukaryotic expression vector containing the HNF1 cDNA under the control of the mouse metallothionein I promoter was cotransfected with the HBV promoter constructs in Huh7 and HepG2.1 cells, and the relative levels of activity were determined. The Huh7 cell line was used because it is one of the cell lines in which HBV replication and particle production can occur and may represent the tissue culture system closest to the natural environment for the HBV life cycle, the liver cell. The cloned transcription factor HNF1 activated transcription from the large surface antigen promoter, but not from any of the other HBV promoters. Cotransfection experiments using the HNF1 cDNA expression vector and large surface antigen promoter deletion constructs demonstrated that this transactivation was mediated through the HNF1 binding site located between -90 and -76 in the large surface antigen promoter. A series of deletion mutants of the cDNA in the HNF1 expression vector was generated to determine the transcriptional activation domain of the HNF1 polypeptide. The major domain of the HNF1 polypeptide involved in transcriptional activation of the large surface antigen promoter in the human hepatoma cell line HepG2.1 was mapped to a region rich in glutamine and proline residues (9 of 18 residues).
To demonstrate directly that the HNF1 polypeptide produced by the expression of the HNF1 cDNA could bind the large surface antigen promoter HNF1 recognition sequence, and to determine whether a protein present in the differentiated hepatoma cell line Huh7 bound the HNF1 element, gel mobility shift analysis was performed. This analysis demonstrated that a protein present in nuclear extracts from Huh7 cells formed a specific complex with the HNF1 binding site which had similar migration properties to the complex formed between exogenously expressed HNF1 and the HNF1 recognition sequence. DNase I footprinting analysis demonstrated the binding of a protein present in the differentiated hepatoma cell line Huh7 to the HNF1 recognition sequence in the large surface antigen promoter. DNase I footprinting also showed that purified TATA binding protein binds the TATA box element located between -31 and -23 in the large surface antigen promoter. The analysis of synthetic promoter constructs suggested that the HNF1 and TATA box elements were the only elements necessary for maximal activity from the large surface antigen promoter, and analysis of clustered point mutations in the large surface antigen minimal promoter region demonstrated that sequences between the HNF1 and TATA box elements were not required for the HNF1-dependent activity of the large surface antigen promoter. These studies suggested that the liver-enriched transcription factor HNF1 plays a critical role in the cell-line and tissue-specific regulation of the HBV large surface antigen promoter.

    Weighted Alignment Free Dissimilarity Metric for Promoter Sequence Comparison

    Comparative sequence analysis has been a powerful tool in bioinformatics, inferring the functionality of a sequence from its structural information. Among the non-coding regions of DNA, the comparison of promoter sequences has received a great deal of attention in medical science, as promoter regions play a crucial role in gene regulation. In this work we propose an alignment-free metric for the comparison of promoter sequences. We use the binary and decimal position specific motif matrices (PSMMs) of the promoters, which were created for our experiments using the TFSEARCH tool. A simple weighted algorithm is used to compute the dissimilarity between the PSMMs of promoter sequences, thereby analyzing their underlying homology and functionality. The NCBI database was used to obtain the promoter sequences, 500 nucleotides upstream of the transcription start site (TSS), of the enzyme pyruvate kinase (PKLR) from the glycolysis pathway of different organisms for one experiment, and of all the enzymes from the human glycolysis pathway for the other. The proposed dissimilarity metric successfully brings out differences in both datasets, and the results regarding similarities and differences in promoter sequences could be essential for a clear understanding of the transcription regulation process in different organisms. The results reveal some useful findings which can be extended for a broader investigation.
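
The abstract above does not spell out its weighting scheme, so the following is only one plausible reading of a weighted, alignment-free comparison of binary PSMMs. The data layout (one 0/1 row per motif, one column per position bin), the per-motif weights, the normalisation, and the motif names in the usage note are all assumptions for illustration.

```python
def weighted_dissimilarity(psmm_a, psmm_b, weights):
    """Weighted fraction of mismatching cells between two binary PSMMs.

    psmm_a, psmm_b: dict mapping motif name -> list of 0/1 flags, one per
                    position bin (hypothetical layout).
    weights:        dict mapping motif name -> non-negative weight.
    Returns a value in [0, 1]; 0 means identical motif occupancy.
    """
    total = 0.0
    norm = 0.0
    for motif, weight in weights.items():
        row_a, row_b = psmm_a[motif], psmm_b[motif]
        if len(row_a) != len(row_b):
            raise ValueError(f"row length mismatch for motif {motif!r}")
        mismatches = sum(x != y for x, y in zip(row_a, row_b))
        total += weight * mismatches
        norm += weight * len(row_a)
    return total / norm if norm else 0.0
```

For example, with hypothetical rows `{"SP1": [1, 0, 0, 1], "TATA": [0, 1, 0, 0]}` and `{"SP1": [1, 0, 1, 1], "TATA": [0, 0, 0, 0]}` and weights `{"SP1": 1.0, "TATA": 2.0}`, each motif contributes one mismatch and the score is (1 + 2) / (4 + 8) = 0.25.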

    Multiple non-collinear TF-map alignments of promoter regions

    Background: The analysis of the promoter sequence of genes with similar expression patterns is a basic tool to annotate common regulatory elements. Multiple sequence alignments are the basis of most comparative approaches. The characterization of regulatory regions from co-expressed genes at the sequence level, however, often does not yield satisfactory results, as promoter regions of genes sharing similar expression programs often do not show nucleotide sequence conservation. Results: In a recent approach to circumvent this limitation, we proposed to align the maps of predicted transcription factors (referred to as TF-maps) instead of the nucleotide sequence of two related promoters, taking into account the label of the corresponding factor and the position in the primary sequence. We have now extended the basic algorithm to permit multiple promoter comparisons using the progressive alignment paradigm. In addition, non-collinear conservation blocks might now be identified in the resulting alignments. We have optimized the parameters of the algorithm in a small, but well-characterized collection of human-mouse-chicken-zebrafish orthologous gene promoters. Conclusion: Results in this dataset indicate that TF-map alignments are able to detect high-level regulatory conservation at the promoter and the 3'UTR gene regions, which cannot be detected by the typical sequence alignments. Three particular examples are introduced here to illustrate the power of the multiple TF-map alignments to characterize conserved regulatory elements in the absence of sequence similarity. We consider that this kind of approach can be extremely useful in the future to annotate potential transcription factor binding sites on sets of co-regulated genes from high-throughput expression experiments.