5 research outputs found

    ARGO: a web system for the detection of degenerate motifs and large-scale recognition of eukaryotic promoters

    Reliable recognition of the promoters in eukaryotic genomes remains an open issue. This is largely owing to the poor understanding of the features of the structural–functional organization of the eukaryotic promoters essential for their function and recognition. However, it was demonstrated that detection of ensembles of regulatory signals characteristic of specific promoter groups increases the accuracy of promoter recognition and prediction of specific expression features of the queried genes. The ARGO_Motifs package was developed for the detection of sets of region-specific degenerate oligonucleotide motifs in the regulatory regions of the eukaryotic genes. The ARGO_Viewer package was developed for the recognition of tissue-specific gene promoters based on the presence and distribution of oligonucleotide motifs obtained by the ARGO_Motifs program. Analysis and recognition of tissue-specific promoters in five gene samples demonstrated high quality of promoter recognition. The public version of the ARGO system is available at and
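    As a rough illustration of what scanning for a degenerate oligonucleotide motif involves, the sketch below expands a motif written in the standard IUPAC nucleotide codes into a regular expression and reports every (possibly overlapping) match position. It is not the ARGO_Motifs algorithm; the function name `motif_positions` and the example motif are hypothetical and chosen only for illustration.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def motif_positions(motif, seq):
    """Return 0-based start positions of all matches of a degenerate motif.

    Hypothetical helper: a zero-width lookahead is used so that
    overlapping occurrences are reported as well.
    """
    body = "".join(IUPAC[c] for c in motif.upper())
    pattern = re.compile("(?=" + body + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]

# Example: a TATA-like motif where W stands for A or T.
hits = motif_positions("TATAWA", "GGTATAAAGGTATATACC")
```

    Here `hits` lists the start positions of both TATA-like occurrences in the toy sequence.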

    Flanking monomer repeats define lower context complexity of sites containing single nucleotide polymorphisms in the human genome

    We investigated mutation frequency within the human genome for the set of known single nucleotide polymorphisms (SNPs) from the “1000 Genomes” project. We developed and applied novel statistical computational methods to analyze genetic text based on its complexity. Complexity profiling in a sliding window was applied to sites containing SNPs within the human genome, and a local decrease in text complexity at SNP-containing sites was shown. Analysis of the complexity profiles for SNP-containing sites shows that flanking monomer repeats define the lower context complexity of these sites. The local decrease in text complexity at SNP-containing sites is confirmed by analysis of polymorphisms in the rat and mouse genomes. We found context differences between coding and regulatory sequences; these differences reflect the complexity of SNP-containing loci. Changes in point mutation frequency were shown previously for microsatellite-containing sequences. Using enhanced mathematical tools and larger data sets, this work shows enrichment of polytracks and simple sequence repeats in the local genomic surroundings of SNP-containing sites. We found high-frequency oligonucleotides within genomic regions containing SNPs; such oligonucleotides are related to nucleotide polytracks. The presence of poly-A tracks might be associated with an increased probability of double-helix DNA breaks around mutable loci and subsequent fixation of nucleotide changes. The complexity estimates were computed using a previously developed program tool. This tool allows for both (i) complexity estimation of phased samples, and (ii) rapid and effective identification of the frequency spectrum of oligonucleotides with fixed lengths, and a comparison of oligonucleotide frequencies in different sample
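    The abstract does not say which complexity measure the authors' tool uses. As a minimal sketch of sliding-window complexity profiling, the code below assumes Shannon entropy of k-mer frequencies as a stand-in measure; the function names, window size, and toy sequence are all hypothetical.

```python
import math
from collections import Counter

def window_entropy(seq, k=1):
    """Shannon entropy (in bits) of the k-mer distribution in seq.

    Assumption: entropy is used here only as a simple proxy for
    "text complexity"; the original tool may use a different measure.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    total = len(kmers)
    counts = Counter(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def complexity_profile(seq, window=8, step=1, k=1):
    """Entropy of each window of the given size, moved along seq by step."""
    return [
        window_entropy(seq[i:i + window], k)
        for i in range(0, len(seq) - window + 1, step)
    ]

# Toy sequence: a mixed stretch followed by a poly-A run.
profile = complexity_profile("ACGTACGTAAAAAAAA", window=8)
```

    In this toy profile the windows that overlap the poly-A run score lower than the mixed stretch, which is the kind of local drop in complexity the abstract describes around SNP-containing sites.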

    Computer analysis of co-localization of transcription factor binding sites in genome by ChIP-seq data

    Statistical features of the distribution of transcription factor binding sites in the mouse genome, obtained by ChIP-seq experiments in embryonic stem cells, have been considered. Clusters that contain binding sites for four or more different transcription factors have been defined in the mouse genome, and their location relative to the regulatory regions of genes has been described. Two types of site co-localization have been shown: clusters containing binding sites for the factors Oct4, Nanog, and Sox2, located in distal regions, and clusters containing binding sites for n-Myc and c-Myc, located mainly in the promoter regions of mouse genes. Analysis of new ChIP-seq data on binding of the transcription factors Nr5a2 and Tbx3 in the same cell type has confirmed the division of clusters of transcription factor binding sites into two types: those containing the binding sites of regulators of pluripotency (Oct4, Nanog, and others) and those not. A computer program has been developed for statistical processing of gene location and chromatin domain data; it analyzes experimental site localization data obtained by ChIP-seq in the mouse and human genomes. Positional preferences of transcription factor binding sites of various types have been revealed, and the distances between the nearest groups of Oct4, Nanog, and Sox2 binding sites and n-Myc and c-Myc binding sites have been calculated using this program. The presence of nucleotide motifs of transcription factor binding sites in the selected ChIP-seq regions has been estimated, and the nucleotide motifs have been refined. A correlation between the presence of motifs and the intensity of ChIP-seq binding has been shown. Computer methods for estimating the clustering of binding sites of different transcription factors in new ChIP-seq data have been developed. Programs are available from the authors upon request
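    The cluster definition above (binding sites for four or more different factors) can be sketched as follows. The abstract does not state how close sites must be to count as one cluster, so the `max_gap` distance, the `(chromosome, position, factor)` input format, and the function name are assumptions made only for this illustration; the coordinates in the example are made up.

```python
from itertools import groupby

def find_clusters(sites, max_gap=500, min_factors=4):
    """Group binding sites into clusters with enough distinct factors.

    sites: iterable of (chromosome, position, factor) tuples.
    max_gap: assumed maximum distance between neighbouring sites
             in one cluster (not taken from the abstract).
    """
    clusters = []
    ordered = sorted(sites)  # sorts by chromosome, then position
    for _chrom, group in groupby(ordered, key=lambda s: s[0]):
        current = []
        for site in group:
            # Close the running cluster when the gap is too large.
            if current and site[1] - current[-1][1] > max_gap:
                if len({s[2] for s in current}) >= min_factors:
                    clusters.append(current)
                current = []
            current.append(site)
        if len({s[2] for s in current}) >= min_factors:
            clusters.append(current)
    return clusters

# Made-up coordinates for illustration only.
example_sites = [
    ("chr1", 100, "Oct4"), ("chr1", 150, "Nanog"),
    ("chr1", 300, "Sox2"), ("chr1", 420, "Nr5a2"),
    ("chr1", 5000, "c-Myc"), ("chr1", 5100, "n-Myc"),
    ("chr2", 100, "Oct4"),
]
clusters = find_clusters(example_sites)
```

    With these toy inputs only the first four chr1 sites form a qualifying cluster; the two Myc sites lie close together but involve just two factors.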

    Evolution of CpG-islands by means of tandem duplications

    CG-rich islands (CpG-islands, or CGI) are important functional elements in vertebrate genomes. In particular, they: a) initiate transcription as promoters in most (> 50%) vertebrate genes, in some cases bidirectionally, owing to the self-complementary nature of CG dinucleotides; b) form a global methylation landscape; c) act as a transcription “switch” via methylation. The degenerate nature of a CpG-island (elevated CG composition) implies an increased probability of tandem repeats and palindromes within it. This work is devoted to the identification of tandem duplications of complete CpG-islands, i.e. considering mega monomers of size 400–5000 bp, in the human genome. We found a range of inter- and intragenic tandem duplications of CpG-islands. Intergenic CpG-island duplication is mediated by CG-rich telomeric satellites, as well as by SINE elements. One of the most pronounced tandems is located on chromosome 19, known for its abundance of segmental duplications and gene expansion. We also highlight a unique genomic segment, the DXZ4 mega satellite in the q arm of chromosome X, which also falls into the category of CpG-islands that evolved through rounds of tandem duplication

    Characterization and functional analysis of the P2Y₂R gene promoter

    The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file (viewed on April 21, 2009). Includes bibliographical references. Thesis (M.S.) University of Missouri-Columbia 2006. Dissertations, Academic -- University of Missouri--Columbia -- Biochemistry (Agriculture).
    Extracellular nucleotides can bind to the P2Y₂R and modulate proliferation and migration of smooth muscle cells, which is known to be involved in intimal hyperplasia that accompanies atherosclerosis and post-angioplasty restenosis. Moreover, the P2Y₂R is upregulated in vascular smooth muscle cells and endothelial cells in response to tissue injury. These findings suggest that the P2Y₂R is a potential target for the pharmacological control of progression of atherosclerosis and post-angioplasty restenosis. However, the mechanisms governing P2Y₂R up-regulation remain unknown. In this study, we have cloned a 2071 bp 5'-flanking region of the P2Y₂R gene in a reporter vector and carried out a serial deletion analysis. The deletion of a 175 bp region completely abolished promoter function and results further indicate that the P2Y₂R gene promoter uses an array of positive and negative response elements in the regulation of gene expression. Furthermore, other results show that the cytokine IL-1β may be involved in down-regulation of P2Y₂R activity in human coronary artery endothelial cells. Further studies will potentially lead to the identification of novel pathways involved in the regulation of P2Y₂R gene expression, information that might be useful to suppress neointimal hyperplasia in atherosclerosis and the restenosis of angioplasty