46,610 research outputs found
From Nonspecific DNA–Protein Encounter Complexes to the Prediction of DNA–Protein Interactions
©2009 Gao, Skolnick. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.doi:10.1371/journal.pcbi.1000341DNA–protein interactions are involved in many essential biological activities. Because there is no simple mapping code between DNA base pairs and protein amino acids, the prediction of DNA–protein interactions is a challenging problem. Here, we present a novel computational approach for predicting DNA-binding protein residues and DNA–protein interaction modes without knowing its specific DNA target sequence. Given the structure of a DNA-binding protein, the method first generates an ensemble of complex structures obtained by rigid-body docking with a nonspecific canonical B-DNA. Representative models are subsequently selected through clustering and ranking by their DNA–protein interfacial energy. Analysis of these encounter complex models suggests that the recognition sites for specific DNA binding are usually favorable interaction sites for the nonspecific DNA probe and that nonspecific DNA–protein interaction modes exhibit some similarity to specific DNA–protein binding modes. Although the method requires as input the knowledge that the protein binds DNA, in benchmark tests, it achieves better performance in identifying DNA-binding sites than three previously established methods, which are based on sophisticated machine-learning techniques. We further apply our method to protein structures predicted through modeling and demonstrate that our method performs satisfactorily on protein models whose root-mean-square Ca deviation from native is up to 5 Å from their native structures. This study provides valuable structural insights into how a specific DNA-binding protein interacts with a nonspecific DNA sequence. The similarity between the specific DNA–protein interaction mode and nonspecific interaction modes may reflect an important sampling step in search of its specific DNA targets by a DNA-binding protein
Predicting variation of DNA shape preferences in protein-DNA interaction in cancer cells with a new biophysical model
DNA shape readout is an important mechanism of target site recognition by
transcription factors, in addition to the sequence readout. Several models of
transcription factor-DNA binding which consider DNA shape have been developed
in recent years. We present a new biophysical model of protein-DNA interaction
by considering the DNA shape features, which is based on a neighbour
dinucleotide dependency model BayesPI2. The parameters of the new model are
restricted to a subspace spanned by the 2-mer DNA shape features, which
allowing a biophysical interpretation of the new parameters as
position-dependent preferences towards certain values of the features. Using
the new model, we explore the variation of DNA shape preferences in several
transcription factors across cancer cell lines and cellular conditions. We find
evidence of DNA shape variations at FOXA1 binding sites in MCF7 cells after
treatment with steroids. The new model is useful for elucidating finer details
of transcription factor-DNA interaction. It may be used to improve the
prediction of cancer mutation effects in the future
Local pre-processing for node classification in networks : application in protein-protein interaction
Network modelling provides an increasingly popular conceptualisation in a wide range of domains, including the analysis of protein structure. Typical approaches to analysis model parameter values at nodes within the network. The spherical locality around a node provides a microenvironment that can be used to characterise an area of a network rather than a particular point within it. Microenvironments that centre on the nodes in a protein chain can be used to quantify parameters that are related to protein functionality. They also permit particular patterns of such parameters in node-centred microenvironments to be used to locate sites of particular interest. This paper evaluates an approach to index generation that seeks to rapidly construct microenvironment data. The results show that index generation performs best when the radius of microenvironments matches the granularity of the index. Results are presented to show that such microenvironments improve the utility of protein chain parameters in classifying the structural characteristics of nodes using both support vector machines and neural networks
A flexible integrative approach based on random forest improves prediction of transcription factor binding sites
Transcription factor binding sites (TFBSs) are DNA sequences of 6-15 base pairs. Interaction of these TFBSs with transcription factors (TFs) is largely responsible for most spatiotemporal gene expression patterns. Here, we evaluate to what extent sequence-based prediction of TFBSs can be improved by taking into account the positional dependencies of nucleotides (NPDs) and the nucleotide sequence-dependent structure of DNA. We make use of the random forest algorithm to flexibly exploit both types of information. Results in this study show that both the structural method and the NPD method can be valuable for the prediction of TFBSs. Moreover, their predictive values seem to be complementary, even to the widely used position weight matrix (PWM) method. This led us to combine all three methods. Results obtained for five eukaryotic TFs with different DNA-binding domains show that our method improves classification accuracy for all five eukaryotic TFs compared with other approaches. Additionally, we contrast the results of seven smaller prokaryotic sets with high-quality data and show that with the use of high-quality data we can significantly improve prediction performance. Models developed in this study can be of great use for gaining insight into the mechanisms of TF binding
RosettaBackrub--a web server for flexible backbone protein structure modeling and design.
The RosettaBackrub server (http://kortemmelab.ucsf.edu/backrub) implements the Backrub method, derived from observations of alternative conformations in high-resolution protein crystal structures, for flexible backbone protein modeling. Backrub modeling is applied to three related applications using the Rosetta program for structure prediction and design: (I) modeling of structures of point mutations, (II) generating protein conformational ensembles and designing sequences consistent with these conformations and (III) predicting tolerated sequences at protein-protein interfaces. The three protocols have been validated on experimental data. Starting from a user-provided single input protein structure in PDB format, the server generates near-native conformational ensembles. The predicted conformations and sequences can be used for different applications, such as to guide mutagenesis experiments, for ensemble-docking approaches or to generate sequence libraries for protein design
Recommended from our members
Bind-n-Seq: high-throughput analysis of in vitro protein-DNA interactions using massively parallel sequencing.
Transcription factor-DNA interactions are some of the most important processes in biology because they directly control hereditary information. The targets of most transcription factor are unknown. In this report, we introduce Bind-n-Seq, a new high-throughput method for analyzing protein-DNA interactions in vitro, with several advantages over current methods. The procedure has three steps (i) binding proteins to randomized oligonucleotide DNA targets, (ii) sequencing the bound oligonucleotide with massively parallel technology and (iii) finding motifs among the sequences. De novo binding motifs determined by this method for the DNA-binding domains of two well-characterized zinc-finger proteins were similar to those described previously. Furthermore, calculations of the relative affinity of the proteins for specific DNA sequences correlated significantly with previous studies (R(2 )= 0.9). These results present Bind-n-Seq as a highly rapid and parallel method for determining in vitro binding sites and relative affinities
- …