An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs
Background: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. It has previously been shown that, when scoring matches to known TFBS motifs, interdependencies between positions within a motif should be taken into account. However, this remains a challenging task owing to the fact that sequences similar to those of known TFBSs can occur by chance with a relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for tackling problems that embody a high degree of uncertainty.
Results: We propose SCintuit, a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SCintuit is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SCintuit is computed not only as a function of their combined probability of occurrence, but also taking into account the individual importance of each single base at its corresponding position. We used SCintuit to identify known TFBSs in DNA sequences. Our method provides excellent results when dealing with both synthetic and real data, outperforming the sensitivity and the specificity of two existing methods in all the experiments we performed.
Conclusions: The results show that SCintuit improves on the prediction quality of existing approaches for TFs without compromising sensitivity. In addition, we show how SCintuit can be successfully applied to real research problems. This study demonstrates the reliability of IFS theory for motif discovery tasks.
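For orientation, the conventional baseline that position-dependency methods are contrasted with can be sketched as follows. This is not SCintuit, whose formula the abstract does not give; it is a plain position-independent log-odds weight matrix, and the function names, the pseudocount, and the uniform background of 0.25 are all illustrative assumptions.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from aligned, equal-length sites.

    Assumption: uniform background frequency and a simple additive pseudocount.
    """
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def score_window(pwm, window):
    """Sum per-position log-odds; each position is treated independently."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_match(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max(
        (score_window(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical toy motif and sequence, for illustration only.
sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_pwm(sites)
score, start = best_match(pwm, "GGCCTATAATGGCC")
```

Because each column is scored on its own, a sketch like this cannot capture the interdependencies between positions that the abstract says should be taken into account.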
Quantitative model for inferring dynamic regulation of the tumour suppressor gene p53
Background: The availability of various "omics" datasets creates the prospect of studying genome-wide genetic regulatory networks. However, one of the major challenges of using mathematical models to infer genetic regulation from microarray datasets is the lack of information on protein concentrations and activities. Most previous studies assumed that the mRNA levels of a gene are consistent with its protein activities, though this is not always the case. Therefore, a more sophisticated modelling framework, together with the corresponding inference methods, is needed to accurately estimate genetic regulation from "omics" datasets.
Results: This work developed a novel approach, based on a nonlinear mathematical model, to infer genetic regulation from microarray gene expression data. Using the p53 network as a test system, we used the nonlinear model to estimate the activities of the transcription factor (TF) p53 from the expression levels of its target genes, and to identify whether p53 activates or inhibits each of its target genes. The predicted top 317 putative p53 target genes were supported by DNA sequence analysis. A comparison between our prediction and other published predictions of p53 targets suggests that most of the putative p53 targets may share a common depleted or enriched sequence signal in their upstream non-coding regions.
Conclusions: The proposed quantitative model can be used not only to infer the regulatory relationship between a TF and its downstream genes, but also to estimate the protein activities of a TF from the expression levels of its target genes.
Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data
Background: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein coding genes. miRNAs play a very important role in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcription regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein coding genes) are co-transcribed with their host genes and that most intergenic miRNAs are transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and the promoter organization are currently unknown. Methodology: We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion: miRNA promoters have similar features to those of protein coding genes, but miRNA transcript organization is more complex. © 2009 Corcoran et al.
Systematic analysis of the ability of Nitric Oxide donors to dislodge biofilms formed by Salmonella enterica and Escherichia coli O157:H7
Biofilms in the industrial environment can be problematic. Encased in extracellular polymeric substances, pathogens within biofilms are significantly more resistant to chlorine and other disinfectants. Recent studies suggest that compounds capable of manipulating nitric oxide-mediated signaling in bacteria could induce dispersal of sessile bacteria and provide a foundation for novel approaches to controlling biofilms formed by some microorganisms. In this work, we compared the ability of five nitric oxide donors (molsidomine, MAHMA NONOate, diethylamine NONOate, diethylamine NONOate diethylammonium salt, spermine NONOate) to dislodge biofilms formed by non-typhoidal Salmonella enterica and pathogenic E. coli on plastic and stainless steel surfaces at different temperatures. All five nitric oxide donors induced significant (35-80%) dispersal of biofilms; however, the degree of dispersal and the optimal dispersal conditions varied. MAHMA NONOate and molsidomine were strong dispersants of the Salmonella biofilms formed on polystyrene. Importantly, molsidomine induced dispersal of up to 50% of the pre-formed Salmonella biofilm at 4 °C, suggesting that it could be effective even under refrigerated conditions. Biofilms formed by E. coli O157:H7 were also significantly dispersed. Nitric oxide donor molecules were highly active within 6 hours of application. To better understand the mode of action of these compounds, we identified the Salmonella genomic region recA-hydN, deletion of which led to insensitivity to the nitric oxide donors.
SOX2 Co-Occupies Distal Enhancer Elements with Distinct POU Factors in ESCs and NPCs to Specify Cell State
SOX2 is a master regulator of both pluripotent embryonic stem cells (ESCs) and multipotent neural progenitor cells (NPCs); however, we currently lack a detailed understanding of how SOX2 controls these distinct stem cell populations. Here we show by genome-wide analysis that, while SOX2 bound to a distinct set of gene promoters in ESCs and NPCs, the majority of regions coincided with unique distal enhancer elements, important cis-acting regulators of tissue-specific gene expression programs. Notably, SOX2 bound the same consensus DNA motif in both cell types, suggesting that additional factors contribute to target specificity. We found that, similar to its association with OCT4 (Pou5f1) in ESCs, the related POU family member BRN2 (Pou3f2) co-occupied a large set of putative distal enhancers with SOX2 in NPCs. Forced expression of BRN2 in ESCs led to functional recruitment of SOX2 to a subset of NPC-specific targets and to precocious differentiation toward a neural-like state. Further analysis of the bound sequences revealed differences in the distances of SOX and POU peaks in the two cell types and identified motifs for additional transcription factors. Together, these data suggest that SOX2 controls a larger network of genes than previously anticipated through binding of distal enhancers and that transitions in POU partner factors may control tissue-specific transcriptional programs. Our findings have important implications for understanding lineage specification and somatic cell reprogramming, where SOX2, OCT4, and BRN2 have been shown to be key factors
Occupancy maps of 208 chromatin-associated proteins in one human cell type
Transcription factors are DNA-binding proteins that have key roles in gene regulation. Genome-wide occupancy maps of transcriptional regulators are important for understanding gene regulation and its effects on diverse biological processes. However, only a minority of the more than 1,600 transcription factors encoded in the human genome has been assayed. Here we present, as part of the ENCODE (Encyclopedia of DNA Elements) project, data and analyses from chromatin immunoprecipitation followed by high-throughput sequencing (ChIP–seq) experiments using the human HepG2 cell line for 208 chromatin-associated proteins (CAPs). These comprise 171 transcription factors and 37 transcriptional cofactors and chromatin regulator proteins, and represent nearly one-quarter of CAPs expressed in HepG2 cells. The binding profiles of these CAPs form major groups associated predominantly with promoters or enhancers, or with both. We confirm and expand the current catalogue of DNA sequence motifs for transcription factors, and describe motifs that correspond to other transcription factors that are co-enriched with the primary ChIP target. For example, FOX family motifs are enriched in ChIP–seq peaks of 37 other CAPs. We show that motif content and occupancy patterns can distinguish between promoters and enhancers. This catalogue reveals high-occupancy target regions at which many CAPs associate, although each contains motifs for only a minority of the numerous associated transcription factors. These analyses provide a more complete overview of the gene regulatory networks that define this cell type, and demonstrate the usefulness of the large-scale production efforts of the ENCODE Consortium
A new analysis approach of epidermal growth factor receptor pathway activation patterns provides insights into cetuximab resistance mechanisms in head and neck cancer
The pathways downstream of the epidermal growth factor receptor (EGFR) have often been implicated in the development and progression of various cancer types. Different authors have proposed cell line models in which they study pathway activity after perturbation experiments. It is reasonable to believe that a better understanding of these pathway activation patterns might lead to novel treatment concepts for cancer patients, or at least allow a better stratification of patient collectives into different risk groups or into groups that might respond to different treatments. Traditionally, such analyses focused on the individual players of the pathways. More recently, the field of systems biology has developed a plethora of approaches that take a more holistic view of the signaling pathways and their downstream transcriptional targets. Fertig et al. have recently developed a new method to identify patterns and biological process activity from transcriptomics data, and they demonstrate the utility of this methodology by analyzing gene expression activity downstream of the EGFR in head and neck squamous cell carcinoma to study cetuximab resistance. Please see related article: http://www.biomedcentral.com/1471-2164/13/16
UCbase & miRfunc: a database of ultraconserved sequences and microRNA function
Four hundred and eighty-one ultraconserved sequences (UCRs) longer than 200 bases were discovered in the genomes of human, mouse and rat. These are DNA sequences showing 100% identity among the three species. UCRs are frequently located at genomic regions involved in cancer, are differentially expressed in human leukemias and carcinomas, and in some instances are regulated by microRNAs (miRNAs). Here we present UCbase & miRfunc, the first database that provides ultraconserved sequence data and shows miRNA function. It also links UCRs and miRNAs with related human disorders and genomic properties. The current release contains over 2000 sequences from three species (human, mouse and rat). As a web application, UCbase & miRfunc is platform independent and is accessible at http://microrna.osu.edu/.UCbase4
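The definition of a UCR given above (100% identity across the three species over more than 200 bases) can be illustrated with a minimal sketch. It assumes the sequences have already been aligned to equal length with "-" as the gap character; the function name and the toy sequences are hypothetical, and this is not the procedure used to build the database.

```python
def ultraconserved_runs(aligned, min_length=200):
    """Return half-open (start, end) column ranges identical in every sequence.

    Assumption: `aligned` holds pre-aligned, equal-length strings with '-'
    marking gaps. Only runs strictly longer than `min_length` are reported.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None
    for i, column in enumerate(zip(*aligned)):
        identical = len(set(column)) == 1 and column[0] != "-"
        if identical and start is None:
            start = i  # a new identical run begins here
        elif not identical and start is not None:
            if i - start > min_length:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(aligned[0]) - start > min_length:
        runs.append((start, len(aligned[0])))
    return runs


# Toy alignment; a tiny threshold stands in for the 200-base cutoff.
human = "ACGTACGTTT"
mouse = "ACGTACGATT"
rat = "ACGTACGTTT"
toy_runs = ultraconserved_runs([human, mouse, rat], min_length=3)
```

With real genomes the alignment step dominates the work; the scan itself is a single linear pass over the alignment columns.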
YEASTRACT: providing a programmatic access to curated transcriptional regulatory associations in Saccharomyces cerevisiae through a web services interface
The YEAst Search for Transcriptional Regulators And Consensus Tracking (YEASTRACT) information system (http://www.yeastract.com) was developed to support the analysis of transcription regulatory associations in Saccharomyces cerevisiae. Last updated in June 2010, this database contains over 48 200 regulatory associations between transcription factors (TFs) and target genes, including 298 specific DNA-binding sites for 110 characterized TFs. All regulatory associations stored in the database were revisited, and detailed information on the experimental evidence supporting those associations was added and classified as direct or indirect. The inclusion of this new data, gathered in response to requests from YEASTRACT users, allows users to restrict their queries to subsets of the data based on whether experimental evidence exists for the direct action of the TFs in the promoter region of their target genes. Another new feature of this release is the availability of all data through a machine-readable web service interface. Users are no longer restricted to the set of queries offered through the existing web interface, and can use the web service interface to query, retrieve and exploit the YEASTRACT data using their own implementations of additional functionality. The YEASTRACT information system is further complemented with several computational tools that facilitate the use of the curated data when answering a number of important biological questions. Since its first release in 2006, YEASTRACT has been extensively used by hundreds of researchers from all over the world. We expect that by making the new data and services available, the system will continue to be instrumental for yeast biologists and systems biology researchers.
