78 research outputs found

    Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

    We introduce a novel method to screen the promoters of a set of genes with shared biological function against a precompiled library of motifs, and to find those motifs that are statistically over-represented in the gene set. The gene sets were obtained from the functional Gene Ontology (GO) classification; for each set and motif we optimized the sequence similarity score threshold independently for every location window (measured with respect to the TSS), taking into account the location-dependent nucleotide heterogeneity along the promoters of the target genes. We performed a high-throughput analysis, searching the promoters (from 200 bp downstream to 1000 bp upstream of the TSS) of more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology classes and for 412 known DNA motifs. When combined with binding site and location conservation between human and mouse, the method identifies with high probability functional binding sites that regulate groups of biologically related genes. We found many location-sensitive functional binding events and showed that they clustered close to the TSS. Our method and findings were put to several experimental tests. By allowing a "flexible" threshold and combining our functional class and location-specific search method with conservation between human and mouse, we are able to reliably identify functional TF binding sites. This is an essential step towards constructing regulatory networks and elucidating the design principles that govern transcriptional regulation of expression. The promoter region proximal to the TSS appears to be of central importance for regulation of transcription in human and mouse, just as it is in bacteria and yeast. Comment: 31 pages, including Supplementary Information and figure
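
The over-representation test with a per-window threshold can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, which the abstract does not name; the function names and the `counts` layout are hypothetical and are not taken from the paper.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric: N genes in total, K of them
    carrying the motif, n genes drawn (the functional gene set)."""
    total = comb(N, n)
    upper = min(K, n)
    # comb() returns 0 when the second argument exceeds the first,
    # so impossible configurations contribute nothing to the sum.
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(max(k, 0), upper + 1)) / total

def best_threshold(counts, set_size, background_size):
    """Pick the score threshold with the smallest enrichment p-value
    for one location window.

    counts maps threshold -> (hits_in_set, hits_in_background), where
    the background includes the gene set itself (an assumption).
    """
    scored = ((t, hypergeom_sf(k, background_size, K, set_size))
              for t, (k, K) in counts.items())
    return min(scored, key=lambda pair: pair[1])

# Toy numbers, purely illustrative: 10 genes overall, 4 in the set.
window_counts = {0.8: (3, 3), 0.6: (3, 9)}
threshold, p_value = best_threshold(window_counts, set_size=4,
                                    background_size=10)
```

Choosing the threshold that minimizes the p-value inflates significance, so a real analysis would need a correction for the number of thresholds, windows, motifs and gene sets tried.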

    Reverse Engineering the Yeast RNR1 Transcriptional Control System

    Transcription is controlled by multi-protein complexes binding to short non-coding regions of genomic DNA. These complexes interact combinatorially. A major goal of modern biology is to provide simple models that predict this complex behavior. The yeast gene RNR1 is transcribed periodically during the cell cycle. Here, we present a pilot study to demonstrate a new method of deciphering the logic behind transcriptional regulation. We took regular samples from cell-cycle-synchronized cultures of Saccharomyces cerevisiae and extracted nuclear protein. We tested these samples to measure the amount of protein that bound to seven different 16 base pair sequences of DNA that have been previously identified as protein binding locations in the promoter of the RNR1 gene. These tests were performed using surface plasmon resonance. We found that the surface plasmon resonance signals showed significant variation throughout the cell cycle. We correlated the protein binding data with previously published mRNA expression data and interpreted this to show that transcription requires protein bound to a particular site and to either five different sites or one additional site. We conclude that this demonstrates the feasibility of this approach to decipher the combinatorial logic of transcription.

    Impact of disaster-related mortality on gross domestic product in the WHO African Region

    BACKGROUND: Disaster-related mortality is a growing public health concern in the African Region. These deaths are hypothesized to have a significantly negative effect on per capita gross domestic product (GDP). The objective of this study was to estimate the loss in GDP attributable to natural and technological disaster-related mortality in the WHO African Region. METHODS: The impact of disaster-related mortality on GDP was estimated using a double-log econometric model and cross-sectional data on various Member States in the WHO African Region. The analysis was based on 45 of the 46 countries in the Region. The data were obtained from various UNDP and World Bank publications. RESULTS: The coefficients for capital (K), educational enrolment (EN), life expectancy (LE) and exports (X) had a positive sign, while imports (M) and disaster mortality (DS) were found to impact negatively on GDP. The above-mentioned explanatory variables were found to have a statistically significant effect on GDP at the 5% level in a t-distribution test. Disaster mortality of a single person was found to reduce GDP by US$0.01828. CONCLUSIONS: We have demonstrated that disaster-related mortality has a significant negative effect on GDP. Thus, as policy-makers strive to increase GDP through capital investment, export promotion and increased educational enrolment, they should always keep in mind that investments made in the strengthening of national capacity to mitigate the effects of national disasters expeditiously and effectively will yield significant economic returns.
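
For readers unfamiliar with the term, a double-log (log-log) model of the kind described might take the following form. The abstract does not give the exact specification, so this is an assumed sketch: the variable abbreviations come from the abstract, while the coefficient symbols, the country index $i$ and the error term are illustrative.

```latex
\ln \mathrm{GDP}_i = \beta_0 + \beta_1 \ln K_i + \beta_2 \ln EN_i
  + \beta_3 \ln LE_i + \beta_4 \ln X_i + \beta_5 \ln M_i
  + \beta_6 \ln DS_i + \varepsilon_i

% Each slope is an elasticity, and the marginal effect follows from it:
\beta_6 = \frac{\partial \ln \mathrm{GDP}}{\partial \ln DS},
\qquad
\frac{\partial\, \mathrm{GDP}}{\partial DS} = \beta_6 \,\frac{\mathrm{GDP}}{DS}
```

Under this form, the reported signs correspond to $\beta_1, \ldots, \beta_4 > 0$ and $\beta_5, \beta_6 < 0$. A per-death dollar figure is a marginal effect, which depends on the GDP and DS values at which it is evaluated; the abstract does not state those values.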

    An Integrated Approach to Identifying Cis-Regulatory Modules in the Human Genome

    In eukaryotic genomes, it is challenging to accurately determine target sites of transcription factors (TFs) using sequence information alone. Previous efforts tackled this task by exploiting the facts that TF binding sites tend to be more conserved than other functional sites and that the binding sites of several TFs are often clustered. Recently, data from ChIP-chip and ChIP-sequencing experiments have accumulated, identifying TF binding sites and surveying the chromatin modification patterns at regulatory elements such as promoters and enhancers. We propose here a hidden Markov model (HMM) that incorporates sequence motif information, TF-DNA interaction data and chromatin modification patterns to precisely identify cis-regulatory modules (CRMs). We conducted ChIP-chip experiments on four TFs, CREB, E2F1, MAX, and YY1, in 1% of the human genome. We then trained the HMM to identify the labels of the CRMs by incorporating the sequence motifs recognized by these TFs and the ChIP-chip ratio. Chromatin modification data were used to predict the functional sites and to further remove false positives. Cross-validation showed that our integrated HMM outperformed other existing methods at predicting CRMs. Incorporating histone signature information successfully penalized false predictions and improved overall performance. The dataset we used and the software are available at http://nash.ucsd.edu/CIS/
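
To make the HMM labeling idea concrete, here is a minimal sketch of Viterbi decoding over two hidden states. It is not the authors' model. The state names, the "low"/"high" discretization of a combined motif and ChIP signal, and all probabilities are assumptions chosen for illustration.

```python
from math import log

def viterbi(obs, states, start, trans, emit):
    """Return the most probable hidden-state path for obs (log space).
    All probabilities used here must be strictly positive."""
    # scores[s]: best log-probability of any path ending in state s
    scores = {s: log(start[s]) + log(emit[s][obs[0]]) for s in states}
    back = []  # one back-pointer table per step after the first
    for o in obs[1:]:
        prev, scores, pointers = scores, {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + log(trans[r][s]))
            pointers[s] = best
            scores[s] = prev[best] + log(trans[best][s]) + log(emit[s][o])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical parameters: background ("bg") versus module ("crm").
states = ["bg", "crm"]
start = {"bg": 0.9, "crm": 0.1}
trans = {"bg": {"bg": 0.9, "crm": 0.1}, "crm": {"bg": 0.2, "crm": 0.8}}
emit = {"bg": {"low": 0.9, "high": 0.1}, "crm": {"low": 0.2, "high": 0.8}}

signal = ["low", "low", "high", "high", "high", "low", "low"]
labels = viterbi(signal, states, start, trans, emit)
```

With these toy parameters the three consecutive "high" observations are labeled as one module. In a trained model, the emission and transition probabilities would be estimated from data rather than set by hand.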

    Assigning Backbone NMR Resonances for Full Length Tau Isoforms: Efficient Compromise between Manual Assignments and Reduced Dimensionality

    Tau protein is the longest disordered protein for which nearly complete backbone NMR resonance assignments have been reported. Full-length tau protein was initially assigned using a laborious combination of bootstrapping assignments from shorter tau fragments and conventional triple resonance NMR experiments. Subsequently it was reported that assignments of comparable quality could be obtained in a fully automated fashion from data obtained using reduced dimensionality NMR (RDNMR) experiments employing a large number of indirect dimensions. Although the latter strategy offers many advantages, it presents some difficulties if manual intervention, confirmation, or correction of the assignments is desirable, as may often be the case for long disordered and degenerate polypeptide sequences. Here we demonstrate that nearly complete backbone resonance assignments for full-length tau isoforms can be obtained without resorting either to bootstrapping from smaller fragments or to very high dimensionality experiments and automation. Instead, a set of RDNMR triple resonance experiments of modest dimensionality lends itself readily to efficient and unambiguous manual assignments. An analysis of the backbone chemical shifts obtained in this fashion indicates several regions in full-length tau with a notable propensity for helical or strand-like structure, in good agreement with previous observations.

    Integrating Phosphorylation Network with Transcriptional Network Reveals Novel Functional Relationships

    Phosphorylation and transcriptional regulation events are critical for cells to transmit and respond to signals. In spite of their importance, systems-level strategies that couple these two networks have yet to be presented. Here we introduce a novel approach that integrates the physical and functional aspects of the phosphorylation network with the transcription network in S. cerevisiae, and demonstrate that different network motifs are involved in these networks, which should be considered when interpreting and integrating large-scale datasets. Based on this understanding, we introduce a HeRS score (hetero-regulatory similarity score) to systematically characterize the functional relevance of kinase/phosphatase involvement with transcription factors, and present an algorithm that predicts hetero-regulatory modules. When extended to the signaling network, this approach confirmed the structure and cross talk of MAPK pathways, inferred a novel functional transcription factor, Sok2, in the high osmolarity glycerol pathway, and explained the mechanism of reduced mating efficiency upon Fus3 deletion. This strategy is applicable to other organisms as large-scale datasets become available, providing a means to identify the functional relationships between kinases/phosphatases and transcription factors.

    Microarray Profiling of Phage-Display Selections for Rapid Mapping of Transcription Factor–DNA Interactions

    Modern computational methods are revealing putative transcription-factor (TF) binding sites at an extraordinary rate. However, the major challenge in studying transcriptional networks is to map these regulatory element predictions to the protein transcription factors that bind them. We have developed a microarray-based profiling of phage-display selection (MaPS) strategy that allows a rapid and global survey of an organism's proteome for sequence-specific interactions with such putative DNA regulatory elements. Application to a variety of known yeast TF binding sites successfully identified the cognate TF from the background of a complex whole-proteome library. These factors contain DNA-binding domains from diverse families, including Myb, TEA, MADS box, and C2H2 zinc-finger. Using MaPS, we identified Dot6 as a trans-active partner of the long-predicted orphan yeast element Polymerase A & C (PAC). MaPS technology should enable rapid and proteome-scale study of bi-molecular interactions within transcriptional networks.

    A Feature-Based Approach to Modeling Protein–DNA Interactions

    Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model used to represent TF binding specificities is a position specific scoring matrix (PSSM), which assumes independence between binding positions. However, in many cases, this simplifying assumption does not hold. Here, we present feature motif models (FMMs), a novel probabilistic method for modeling TF–DNA interactions, based on log-linear models. Our approach uses sequence features to represent TF binding specificities, where each feature may span multiple positions. We develop the mathematical formulation of our model and devise an algorithm for learning its structural features from binding site data. We also developed a discriminative motif finder, which discovers de novo FMMs that are enriched in target sets of sequences compared to background sets. We evaluate our approach on synthetic data and on the widely used TF chromatin immunoprecipitation (ChIP) dataset of Harbison et al. We then apply our algorithm to high-throughput TF ChIP data from mouse and human, reveal sequence features that are present in the binding specificities of mouse and human TFs, and show that FMMs explain TF binding significantly better than PSSMs. Our FMM learning and motif finder software are available at http://genie.weizmann.ac.il/
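
The independence assumption of a PSSM can be seen in a small sketch. This shows only the baseline model that the abstract contrasts with FMMs, not the FMM method itself. The pseudocount, the uniform background and the toy binding sites are assumptions.

```python
from math import log2

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM from equal-length aligned binding sites.
    Each column is estimated independently of every other column."""
    total = len(sites) + 4 * pseudocount
    pssm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pssm.append({base: log2(((column.count(base) + pseudocount) / total)
                                / background)
                     for base in "ACGT"})
    return pssm

def score(pssm, seq):
    """Sum of per-position log-odds scores: positions never interact."""
    return sum(col[base] for col, base in zip(pssm, seq))

# Toy sites in which the two positions are perfectly correlated:
# only "AA" and "TT" are ever observed.
sites = ["AA", "TT", "AA", "TT"]
pssm = build_pssm(sites)

observed = score(pssm, "AA")    # a sequence seen in the training sites
unobserved = score(pssm, "AT")  # never seen, yet scored identically
```

Because each column is scored separately, the PSSM cannot tell the observed "AA" from the unobserved "AT". A feature spanning both positions, as the abstract describes for FMMs, could capture that dependency.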

    WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    BACKGROUND: This work addresses the problem of detecting conserved transcription factor binding sites, and regulatory regions in general, through the analysis of sequences from homologous genes, an approach that is becoming more and more widely used given the ever-increasing amount of genomic data available. RESULTS: We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Unlike the most commonly used methods, the approach we present neither needs nor computes an alignment of the sequences investigated, nor does it resort to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation than the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. CONCLUSION: Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performance than other available methods for the same task. We also show examples of how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes.

    Sequential Logic Model Deciphers Dynamic Transcriptional Control of Gene Expressions

    Cellular signaling involves a sequence of events from ligand binding to membrane receptors through transcription factor activation and the induction of mRNA expression. The transcriptional-regulatory system plays a pivotal role in the control of gene expression. A novel computational approach to the study of gene regulation circuits is presented here. Based on the concept of a finite state machine, which provides a discrete view of gene regulation, a novel sequential logic model (SLM) is developed to decipher control mechanisms of dynamic transcriptional regulation of gene expression. The SLM technique is also used to systematically analyze the dynamic function of transcriptional inputs and the dependency and cooperativity, such as synergy effects, among the binding sites with respect to when, how much and how fast the gene of interest is expressed. [...] expression and additional activities of binding sites are required. Further analyses suggest a detailed mechanism of R switch activity in which indirect dependency occurs between UI activity and the R switch during the specification-to-differentiation stage. [...] is a promising step for further application of the proposed method.
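
The finite-state-machine view of gene regulation can be sketched with a toy example. This is not the SLM from the paper. The three states, the two binary inputs and the transition rules are hypothetical, chosen only to show that the next state depends on both the current inputs and the stored state.

```python
def step(state, activator, repressor):
    """One transition of a hypothetical three-state gene model."""
    if repressor:
        return "off"            # repressor binding overrides everything
    if activator:
        # The same input gives different results depending on history.
        return "primed" if state == "off" else "on"
    return state                # no input bound: the state is remembered

def run(inputs, state="off"):
    """Apply a sequence of (activator, repressor) inputs and return
    the state reached after each step."""
    trajectory = []
    for activator, repressor in inputs:
        state = step(state, activator, repressor)
        trajectory.append(state)
    return trajectory

# Activator bound twice, then nothing bound, then repressor bound.
trajectory = run([(1, 0), (1, 0), (0, 0), (0, 1)])
```

The second activator input turns the gene on only because the first one left it primed. That dependence on stored state is what distinguishes sequential logic from purely combinational logic.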