
    Characterization of chromatin accessibility with a transposome hypersensitive sites sequencing (THS-seq) assay.

    Chromatin accessibility captures in vivo protein-chromosome binding status, and is considered an informative proxy for protein-DNA interactions. DNase I and Tn5 transposase assays require thousands to millions of fresh cells for comprehensive chromatin mapping. Applying Tn5 tagmentation to hundreds of cells results in sparse chromatin maps. We present a transposome hypersensitive sites sequencing assay for highly sensitive characterization of chromatin accessibility. Linear amplification of accessible DNA ends with in vitro transcription, coupled with an engineered Tn5 super-mutant, demonstrates improved sensitivity on limited input materials, and accessibility of small regions near distal enhancers, compared with ATAC-seq.

    Extensive chromatin fragmentation improves enrichment of protein binding sites in chromatin immunoprecipitation experiments

    Extensive sonication of formaldehyde-crosslinked chromatin can generate DNA fragments averaging 200 bp in length (range 75–300 bp). Fragmentation is largely random with respect to genomic region and nucleosome position. ChIP experiments employing such extensively fragmented samples show 2- to 4-fold increased enrichment of protein binding sites over control genomic regions, when compared to samples sonicated to a more conventional size range (300–500 bp). The basis of improved fold enrichments is that immunoprecipitation of protein-bound regions is unaffected by fragment size, whereas immunoprecipitation of control genomic regions decreases progressively along with reduced fragment size due to fewer nonspecific binding sites. The use of extensively sonicated samples improves mapping of protein binding sites, and it extends the dynamic range for quantitative measurements of histone density. We show that many yeast promoter regions are virtually devoid of histones.
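The fold-enrichment reasoning in this abstract can be illustrated with a small numerical sketch. The recovery values below are hypothetical placeholders chosen for illustration, not data from the study, and `fold_enrichment` is a name introduced here. Fold enrichment is taken as the ratio of immunoprecipitation recovery at a bound site to recovery at a control region, so holding the numerator constant while the control recovery falls raises the ratio.

```python
def fold_enrichment(target_recovery, control_recovery):
    """Ratio of IP recovery at a bound site to recovery at a control region."""
    if control_recovery <= 0:
        raise ValueError("control recovery must be positive")
    return target_recovery / control_recovery


# Hypothetical recoveries (fraction of input DNA immunoprecipitated).
# Recovery at the bound site is held constant across fragment sizes;
# recovery at the control region is assumed to drop with shorter fragments.
conventional = fold_enrichment(target_recovery=0.02, control_recovery=0.0010)
extensive = fold_enrichment(target_recovery=0.02, control_recovery=0.0004)

# Relative gain in fold enrichment from the more extensive fragmentation.
gain = extensive / conventional
print(conventional, extensive, gain)
```

With these assumed numbers the gain falls inside the 2- to 4-fold range quoted in the abstract, purely because the denominator shrinks.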

    Study of macromolecular interactions using computational solvent mapping

    The term "binding hot spots" refers to regions of a protein surface with large contributions to the binding free energy. Computational solvent mapping serves as an analog to the major experimental techniques developed for the identification of such hot spots using X-ray and nuclear magnetic resonance (NMR) methods. Applications of the fast Fourier-transform-based mapping algorithm FTMap show that similar binding hot spots also occur in DNA molecules and interact with small molecules that bind to DNA with high affinity. Solvent mapping results on B-DNA, with or without Hoogsteen (HG) base pairing, have revealed the significance of "HG breathing" on the reactivity of DNA with formaldehyde. Extending the method to RNA molecules, I applied the FTMap algorithm to flexible structures of HIV-1 transactivation response element (TAR) RNA and Tau exon 10 RNA. Results show that despite the extremely flexible nature of these small RNA molecules, nucleic acid bases that interact with ligands consistently have high hit rates, and thus binding sites can be successfully identified. Based on this experience as well as the prior work on DNA, I extended the FTMap algorithm to mapping nucleic acids and implemented it in an automated online server available to the research community. FTSite, a related server for finding binding sites of proteins, was also extended to develop PeptiMap, an accurate and robust protocol that can determine peptide binding sites on proteins. Analyses of structural ensembles of ligand-free proteins using solvent mapping have shown that such ensembles contain pre-existing binding hot spots, and that such hot spots can be identified without any a priori knowledge of the ligand-bound structure. 
Furthermore, the structures in the ensemble having the highest binding-site hit rate are closest to the ligand-bound structure, and a higher hit rate implies improved structural similarity between the unbound protein and its bound state, resulting in a high correlation coefficient between the two measures. These advances should greatly enhance researchers' ability to identify functionally important interactions among biomolecules in silico.

    A workflow for genome-wide mapping of archaeal transcription factors with ChIP-seq

    Deciphering the structure of gene regulatory networks across the tree of life remains one of the major challenges in postgenomic biology. We present a novel ChIP-seq workflow for the archaea using the model organism Halobacterium salinarum sp. NRC-1 and demonstrate its application for mapping the genome-wide binding sites of natively expressed transcription factors. This end-to-end pipeline is the first protocol for ChIP-seq in archaea, with methods and tools for each stage from gene tagging to data analysis and biological discovery. Genome-wide binding sites for transcription factors with many binding sites (TfbD) are identified with sensitivity, while retaining specificity in the identification of the smaller regulons (bacteriorhodopsin-activator protein). Chromosomal tagging of target proteins with a compact epitope facilitates a standardized and cost-effective workflow that is compatible with high-throughput immunoprecipitation of natively expressed transcription factors. The Pique package, an open-source bioinformatics method, is presented for identification of binding events. Relative to ChIP-Chip and qPCR, this workflow offers a robust catalog of protein–DNA binding events with improved spatial resolution and significantly decreased cost. While this study focuses on the application of ChIP-seq in H. salinarum sp. NRC-1, our workflow can also be adapted for use in other archaea and bacteria with basic genetic tools.

    Enhanced maps of transcription factor binding sites improve regulatory networks learned from accessible chromatin data

    Determining where transcription factors (TFs) bind in genomes provides insight into which transcriptional programs are active across organs, tissue types, and environmental conditions. Recent advances in high-throughput profiling of regulatory DNA have yielded large amounts of information about chromatin accessibility. Interpreting the functional significance of these data sets requires knowledge of which regulators are likely to bind these regions. This can be achieved by using information about TF-binding preferences, or motifs, to identify TF-binding events that are likely to be functional. Although different approaches exist to map motifs to DNA sequences, a systematic evaluation of these tools in plants is missing. Here, we compare four motif-mapping tools widely used in the Arabidopsis (Arabidopsis thaliana) research community and evaluate their performance using chromatin immunoprecipitation data sets for 40 TFs. Downstream gene regulatory network (GRN) reconstruction was found to be sensitive to the motif mapper used. We further show that the low recall of Find Individual Motif Occurrences (FIMO), one of the most frequently used motif-mapping tools, can be overcome by using an Ensemble approach, which combines results from different mapping tools. Several examples are provided demonstrating how the Ensemble approach extends our view on transcriptional control for TFs active in different biological processes. Finally, a protocol is presented to effectively derive more complete cell type-specific GRNs through the integrative analysis of open chromatin regions, known binding site information, and expression data sets. This approach will pave the way to increase our understanding of GRNs in different cellular conditions.
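How combining motif mappers can raise recall is sketched below. The abstract does not state the combination rule, so a plain union of matches is an assumption made here, and the function names, intervals, and peaks are all hypothetical. Recall is taken as the fraction of ChIP peaks overlapped by at least one motif match.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def recall(matches, peaks):
    """Fraction of ChIP peaks overlapped by at least one motif match."""
    if not peaks:
        return 0.0
    hit = sum(any(overlaps(m, p) for m in matches) for p in peaks)
    return hit / len(peaks)


def ensemble(*match_sets):
    """Union of motif matches from several tools (assumed combination rule)."""
    combined = set()
    for matches in match_sets:
        combined |= set(matches)
    return combined


# Hypothetical ChIP peaks and motif matches from two made-up tools.
peaks = [("chr1", 100, 200), ("chr1", 500, 600),
         ("chr2", 50, 150), ("chr2", 900, 1000)]
tool_a = {("chr1", 120, 130)}
tool_b = {("chr1", 550, 560), ("chr2", 60, 70)}

combined = ensemble(tool_a, tool_b)
print(recall(tool_a, peaks), recall(tool_b, peaks), recall(combined, peaks))
```

A union can only keep or raise recall relative to each individual tool; it says nothing about precision, which would have to be checked separately.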

    Mapping the druggable allosteric space of G-protein coupled receptors: a fragment-based molecular dynamics approach.

    To address the problem of specificity in G-protein coupled receptor (GPCR) drug discovery, there has been tremendous recent interest in allosteric drugs that bind at sites topographically distinct from the orthosteric site. Unfortunately, structure-based drug design of allosteric GPCR ligands has been frustrated by the paucity of structural data for allosteric binding sites, making a strong case for predictive computational methods. In this work, we map the surfaces of the beta1 (beta1AR) and beta2 (beta2AR) adrenergic receptor structures to detect a series of five potentially druggable allosteric sites. We employ the FTMAP algorithm to identify 'hot spots' with affinity for a variety of organic probe molecules corresponding to drug fragments. Our work is distinguished by an ensemble-based approach, whereby we map diverse receptor conformations taken from molecular dynamics (MD) simulations totaling approximately 0.5 µs. Our results reveal distinct pockets formed at both solvent-exposed and lipid-exposed cavities, which we interpret in light of experimental data and which may constitute novel targets for GPCR drug discovery. These mapping data can now serve to drive a combination of fragment-based and virtual screening approaches for the discovery of small molecules that bind at these sites and which may offer highly selective therapies.

    Identifying Interaction Sites in "Recalcitrant" Proteins: Predicted Protein and RNA Binding Sites in Rev Proteins of HIV-1 and EIAV Agree with Experimental Data

    Protein-protein and protein-nucleic acid interactions are vitally important for a wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses. We have developed machine learning approaches for predicting which amino acids of a protein participate in its interactions with other proteins and/or nucleic acids, using only the protein sequence as input. In this paper, we describe an application of classifiers trained on datasets of well-characterized protein-protein and protein-RNA complexes for which experimental structures are available. We apply these classifiers to the problem of predicting protein and RNA binding sites in the sequence of a clinically important protein for which the structure is not known: the regulatory protein Rev, essential for the replication of HIV-1 and other lentiviruses. We compare our predictions with published biochemical, genetic and partial structural information for HIV-1 and EIAV Rev and with our own published experimental mapping of RNA binding sites in EIAV Rev. The predicted and experimentally determined binding sites are in very good agreement. The ability to predict reliably the residues of a protein that directly contribute to specific binding events - without the requirement for structural information regarding either the protein or complexes in which it participates - can potentially generate new disease intervention strategies. Comment: Pacific Symposium on Biocomputing, Hawaii, In press, Accepted, 200

    From Nonspecific DNA–Protein Encounter Complexes to the Prediction of DNA–Protein Interactions

    ©2009 Gao, Skolnick. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. doi:10.1371/journal.pcbi.1000341
    DNA–protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA–protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA–protein interaction modes without knowing the protein's specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA–protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA–protein interaction modes exhibit some similarity to specific DNA–protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests, it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that our method performs satisfactorily on protein models whose Cα root-mean-square deviation from the native structure is up to 5 Å. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. 
The similarity between the specific DNA–protein interaction mode and nonspecific interaction modes may reflect an important sampling step in search of its specific DNA targets by a DNA-binding protein.
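The "clustering and ranking by interfacial energy" step can be sketched generically as follows. This is not the authors' procedure: it assumes a simple greedy scheme in which each docked pose is reduced to a single 3D point (for example the DNA centroid) plus an energy, and the function name, the radius, and the pose values are all hypothetical.

```python
import math


def select_representatives(poses, radius):
    """Greedy clustering of docked poses, ranked by energy.

    poses  : list of (position, energy), position being an (x, y, z) tuple
    radius : poses closer than this to an already chosen representative
             are treated as members of that representative's cluster

    Returns the representatives ordered from lowest to highest energy.
    """
    representatives = []
    # Visit poses from most to least favourable (lowest energy first),
    # so each cluster is represented by its lowest-energy member.
    for position, energy in sorted(poses, key=lambda pose: pose[1]):
        if all(math.dist(position, rep_pos) > radius
               for rep_pos, _ in representatives):
            representatives.append((position, energy))
    return representatives


# Hypothetical docking poses: (centroid in Å, interfacial energy, arbitrary units).
poses = [
    ((0.0, 0.0, 0.0), -5.0),
    ((1.0, 0.0, 0.0), -7.0),
    ((10.0, 0.0, 0.0), -3.0),
    ((10.5, 0.0, 0.0), -2.0),
]
representatives = select_representatives(poses, radius=2.0)
print(representatives)
```

Seeding clusters from the lowest-energy pose makes the clustering and the ranking fall out of a single pass; a real pipeline would compare poses by an RMSD-like measure rather than a single point.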

    Evaluation of protein surface roughness index using its heat denatured aggregates

    Recent research on the potential of protein-surface descriptors to predict protein surface properties has gained significance for its possible use in extracting clues about a protein's functional site. In this direction, the Surface Roughness Index, a surface topological parameter, showed potential to predict a protein's SCOP family. The present work builds on these studies: a semi-empirical method for evaluating the Surface Roughness Index of a protein directly from its heat denatured protein aggregates (HDPA) was designed and demonstrated successfully. The steps consist of extracting a feature, the Intensity Level Multifractal Dimension (ILMFD), from microscopic images of HDPA, followed by mapping ILMFD to the Surface Roughness Index (SRI) through a recurrent backpropagation network (RBPN). Finally, the SRI for a particular protein was predicted by clustering the decisions obtained by feeding multiple data into the RBPN, both to obtain the general tendency of the decisions and to discard noisy data. The cluster centre of the largest cluster was found to be the best match for the Surface Roughness Index of each protein in our study. The semi-empirical approach adopted in this paper shows a way to evaluate a protein's surface property without depending on its already evaluated structure.
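The final step, taking the centre of the largest cluster of network outputs as the prediction, can be sketched in one dimension. The abstract does not name the clustering algorithm, so the gap-based grouping below is an assumption, and the function name, tolerance, and values are hypothetical.

```python
def largest_cluster_centre(values, tol):
    """Group sorted 1-D values into clusters and return the mean of the largest.

    Consecutive sorted values that differ by at most `tol` share a cluster.
    Smaller clusters are treated as noise and ignored.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("at least one value is required")
    clusters = [[ordered[0]]]
    for value in ordered[1:]:
        if value - clusters[-1][-1] <= tol:
            clusters[-1].append(value)
        else:
            clusters.append([value])
    largest = max(clusters, key=len)
    return sum(largest) / len(largest)


# Hypothetical SRI predictions from repeated network evaluations;
# 0.10 and 0.90 play the role of noisy outputs.
predictions = [0.50, 0.52, 0.51, 0.90, 0.10]
centre = largest_cluster_centre(predictions, tol=0.05)
print(centre)
```

Using the largest cluster rather than the overall mean keeps isolated outliers from pulling the estimate.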