214 research outputs found
Data-Driven Theory Refinement Algorithms for Bioinformatics
Bioinformatics and related applications call for efficient algorithms for knowledge-intensive learning and data-driven knowledge refinement. Knowledge-based artificial neural networks offer an attractive approach to extending or modifying incomplete knowledge bases or domain theories. We present results of experiments with several such algorithms for data-driven knowledge discovery and theory refinement in some simple bioinformatics applications. Results of experiments on the ribosome binding site and promoter site identification problems indicate that the performance of the KBDistAl and Tiling Pyramid algorithms compares quite favorably with that of substantially more computationally demanding techniques.
A motif-based method for predicting interfacial residues in both the RNA and protein components of protein-RNA complexes
Efforts to predict interfacial residues in protein-RNA complexes have largely focused on predicting RNA-binding residues in proteins. Predicting protein-binding residues in RNA sequences, however, is a problem that has received relatively little attention to date. Although the value of sequence motifs for classifying and annotating protein sequences is well established, sequence motifs have not been widely applied to predicting interfacial residues in macromolecular complexes. Here, we propose a novel sequence motif-based method for "partner-specific" interfacial residue prediction. Given a specific protein-RNA pair, the goal is to simultaneously predict RNA-binding residues in the protein sequence and protein-binding residues in the RNA sequence. In 5-fold cross-validation experiments, our method, PS-PRIP, achieved 92% Specificity and 61% Sensitivity, with a Matthews correlation coefficient (MCC) of 0.58 in predicting RNA-binding sites in proteins. The method achieved 69% Specificity and 75% Sensitivity, but with a low MCC of 0.13, in predicting protein-binding sites in RNAs. Similar performance results were obtained when PS-PRIP was tested on two independent "blind" datasets of experimentally validated protein-RNA interactions, suggesting the method should be widely applicable and valuable for identifying potential interfacial residues in protein-RNA complexes for which structural information is not available. The PS-PRIP webserver and datasets are available at: http://pridb.gdcb.iastate.edu/PSPRIP/
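The three measures quoted in this abstract (Sensitivity, Specificity and MCC) can all be derived from a confusion matrix. The sketch below uses the standard textbook definitions; the function name `binary_metrics` is hypothetical and is not taken from the PS-PRIP software, and the counts in the usage lines are invented for illustration rather than drawn from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    Standard definitions; a zero denominator yields 0.0 by convention.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts (not from the paper): a balanced and an imbalanced case.
balanced = binary_metrics(tp=50, fp=10, tn=90, fn=50)
imbalanced = binary_metrics(tp=30, fp=300, tn=660, fn=10)
print(balanced)
print(imbalanced)
```

Because MCC uses all four cells of the confusion matrix, it can stay low when positives are rare even if Sensitivity and Specificity both look reasonable, which is one plausible reading of the low MCC reported for the RNA side.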
Comparing Kernels For Predicting Protein Binding Sites From Amino Acid Sequence
The ability to identify protein binding sites and to detect specific amino acid residues that contribute to the specificity and affinity of protein interactions has important implications for problems ranging from rational drug design to analysis of metabolic and signal transduction networks. Support vector machines (SVM) and related kernel methods offer an attractive approach to predicting protein binding sites. An appropriate choice of the kernel function is critical to the performance of SVM. Kernel functions offer a way to incorporate domain-specific knowledge into the classifier. We compare the performance of three types of kernel functions (identity kernel, sequence-alignment kernel, and amino acid substitution matrix kernel) for predicting protein-protein, protein-DNA and protein-RNA binding sites. The results show that the identity kernel is quite effective on all three tasks, and that the substitution kernel, based on amino acid substitution matrices that take into account structural or evolutionary conservation or physicochemical properties of amino acids, yields a modest improvement in the performance of the resulting SVM classifiers for predicting protein-protein, protein-DNA and protein-RNA binding sites.
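The abstract does not define its kernels, so the following is only a minimal sketch of one common reading: each residue is represented by a fixed-length sequence window, an identity kernel counts positions where two windows carry the same residue, and a substitution kernel replaces that 0/1 match with a similarity score. The function names and the toy similarity table are assumptions made for illustration, not the authors' implementation.

```python
def identity_kernel(window_a, window_b):
    """Count positions at which two equal-length windows have the same residue."""
    if len(window_a) != len(window_b):
        raise ValueError("windows must have equal length")
    return sum(1 for a, b in zip(window_a, window_b) if a == b)


def substitution_kernel(window_a, window_b, similarity):
    """Sum per-position similarities: 1.0 for identical residues, otherwise
    a score looked up symmetrically in `similarity` (0.0 if absent)."""
    if len(window_a) != len(window_b):
        raise ValueError("windows must have equal length")
    total = 0.0
    for a, b in zip(window_a, window_b):
        if a == b:
            total += 1.0
        else:
            total += similarity.get((a, b), similarity.get((b, a), 0.0))
    return total


# Toy similarity table (hypothetical values, not a real substitution matrix).
toy_similarity = {("K", "N"): 0.5}
print(identity_kernel("ARKD", "ARND"))
print(substitution_kernel("ARKD", "ARND", toy_similarity))
```

A sum of raw substitution scores is not guaranteed to be a valid (positive semidefinite) kernel, so a real implementation would likely need to transform the matrix first; that step is omitted here.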
BRST-antifield-treatment of metric-affine gravity
The metric-affine gauge theory of gravity provides a broad framework in which
gauge theories of gravity can be formulated. In this article we fit
metric-affine gravity into the covariant BRST--antifield formalism in order to
obtain gauge fixed quantum actions. As an example the gauge fixing of a general
two-dimensional model of metric-affine gravity is worked out explicitly. The
result is shown to contain the gauge fixed action of the bosonic string in
conformal gauge as a special case. Comment: 19 pages LaTeX, to appear in Phys. Rev.
Sequence-Based Prediction of RNA-Binding Residues in Proteins
Identifying individual residues in the interfaces of protein-RNA complexes is important for understanding the molecular determinants of protein-RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein-RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein-RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner.
Template-based protein-protein docking exploiting pairwise interfacial residue restraints
Although many advanced and sophisticated ab initio approaches for modeling protein-protein complexes have been proposed in past decades, template-based modeling (TBM) remains the most accurate and widely used approach, provided a reliable template is available. However, there are many different ways to exploit template information in the modeling process. Here, we systematically evaluate and benchmark a TBM method that uses conserved interfacial residue pairs as docking distance restraints [referred to as alpha carbon-alpha carbon (CA-CA)-guided docking]. We compare it with two other template-based protein-protein modeling approaches, including a conserved non-pairwise interfacial residue restrained docking approach [referred to as the ambiguous interaction restraint (AIR)-guided docking] and a simple superposition-based modeling approach. Our results show that, for most cases, the CA-CA-guided docking method outperforms both superposition with refinement and the AIR-guided docking method. We emphasize the superiority of the CA-CA-guided docking on cases with medium to large conformational changes, and interactions mediated through loops, tails or disordered regions. Our results also underscore the importance of a proper refinement of superimposition models to reduce steric clashes. In summary, we provide a benchmarked TBM protocol that uses conserved pairwise interface distance as restraints in generating realistic 3D protein-protein interaction models, when reliable templates are available. The described CA-CA-guided docking protocol is based on the HADDOCK platform, which allows users to incorporate additional prior knowledge of the target system to further improve the quality of the resulting models.
Identifying Interaction Sites in "Recalcitrant" Proteins: Predicted Protein and RNA Binding Sites in Rev Proteins of HIV-1 and EIAV Agree with Experimental Data
Protein-protein and protein-nucleic acid interactions are vitally important
for a wide range of biological processes, including regulation of gene
expression, protein synthesis, and replication and assembly of many viruses. We
have developed machine learning approaches for predicting which amino acids of
a protein participate in its interactions with other proteins and/or nucleic
acids, using only the protein sequence as input. In this paper, we describe an
application of classifiers trained on datasets of well-characterized
protein-protein and protein-RNA complexes for which experimental structures are
available. We apply these classifiers to the problem of predicting protein and
RNA binding sites in the sequence of a clinically important protein for which
the structure is not known: the regulatory protein Rev, essential for the
replication of HIV-1 and other lentiviruses. We compare our predictions with
published biochemical, genetic and partial structural information for HIV-1 and
EIAV Rev and with our own published experimental mapping of RNA binding sites
in EIAV Rev. The predicted and experimentally determined binding sites are in
very good agreement. The ability to predict reliably the residues of a protein
that directly contribute to specific binding events - without the requirement
for structural information regarding either the protein or complexes in which
it participates - can potentially generate new disease intervention strategies. Comment: Pacific Symposium on Biocomputing, Hawaii, In press, Accepted, 200
DockRank: Ranking docked conformations using partner-specific sequence homology-based protein interface prediction
Selecting near-native conformations from the immense number of conformations generated by docking programs remains a major challenge in molecular docking. We introduce DockRank, a novel approach to scoring docked conformations based on the degree to which the interface residues of the docked conformation match a set of predicted interface residues. DockRank uses interface residues predicted by the partner-specific sequence homology-based protein-protein interface predictor (PS-HomPPI), which predicts the interface residues of a query protein with a specific interaction partner. We compared the performance of DockRank with several state-of-the-art docking scoring functions using Success Rate (the percentage of cases that have at least one near-native conformation among the top m conformations) and Hit Rate (the percentage of near-native conformations that are included among the top m conformations). In cases where it is possible to obtain partner-specific (PS) interface predictions from PS-HomPPI, DockRank consistently outperforms both (i) ZRank and IRAD, two state-of-the-art energy-based scoring functions (improving Success Rate by up to 4-fold); and (ii) variants of DockRank that use predicted interface residues obtained from several protein interface predictors that do not take into account the binding partner in making interface predictions (improving Success Rate by up to 39-fold). The latter result underscores the importance of using partner-specific interface residues in scoring docked conformations. We show that DockRank, when used to re-rank the conformations returned by ClusPro, improves upon the original ClusPro rankings in terms of both Success Rate and Hit Rate. DockRank is available as a server at http://einstein.cs.iastate.edu/DockRank/.
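The two evaluation measures defined in this abstract can be sketched directly from their parenthetical definitions. The code below assumes each docking case is given as a list of booleans ordered best-ranked first, where True marks a near-native conformation, and that Hit Rate is computed per case; both the data layout and the function names are assumptions for illustration, not part of DockRank.

```python
def success_rate(cases, m):
    """Percentage of cases with at least one near-native conformation
    among the top m ranked conformations."""
    if not cases:
        return 0.0
    successes = sum(1 for ranked in cases if any(ranked[:m]))
    return 100.0 * successes / len(cases)


def hit_rate(ranked, m):
    """Percentage of a case's near-native conformations that appear
    among its top m ranked conformations."""
    total_near_native = sum(ranked)
    if total_near_native == 0:
        return 0.0
    return 100.0 * sum(ranked[:m]) / total_near_native


# Two hypothetical docking cases; True marks a near-native conformation.
cases = [
    [False, True, False, True],
    [False, False, True, False],
]
print(success_rate(cases, m=2))
print(hit_rate(cases[0], m=2))
```

Under this layout, re-ranking with a different scoring function only changes the order of each list, so the same two functions can be used to compare scoring functions on identical sets of conformations.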
Zinc Finger Targeter (ZiFiT): an engineered zinc finger/target site design tool
Zinc Finger Targeter (ZiFiT) is a simple and intuitive web-based tool that facilitates the design of zinc finger proteins (ZFPs) that can bind to specific DNA sequences. The current version of ZiFiT is based on a widely employed method of ZFP design, the "modular assembly" approach, in which pre-existing individual zinc fingers are linked together to recognize desired target DNA sequences. Several research groups have described experimentally characterized zinc finger modules that bind many of the 64 possible DNA triplets. ZiFiT leverages the combined capabilities of three of the largest and best characterized module archives by enabling users to select fingers from any of these sets. ZiFiT searches a query DNA sequence for target sites for which a ZFP can be designed using modules available in one or more of the three archives. In addition, ZiFiT output facilitates identification of specific zinc finger modules that are publicly available from the Zinc Finger Consortium. ZiFiT is freely available at http://bindr.gdcb.iastate.edu/ZiFiT/
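The search step described here, scanning a query sequence for sites whose triplets are all covered by available modules, can be illustrated with a simple sliding window. This is a rough sketch only: the function name, the three-finger default and the example triplet set are hypothetical, only the given strand is scanned, and it does not reflect how ZiFiT itself is implemented.

```python
def find_target_sites(dna, available_triplets, fingers=3):
    """Return (start, site) pairs for every window of 3 * fingers bases
    whose consecutive triplets all have an available module."""
    dna = dna.upper()
    site_len = 3 * fingers
    sites = []
    for start in range(len(dna) - site_len + 1):
        site = dna[start:start + site_len]
        triplets = [site[i:i + 3] for i in range(0, site_len, 3)]
        if all(t in available_triplets for t in triplets):
            sites.append((start, site))
    return sites


# Hypothetical module archive covering three triplets (illustration only).
modules = {"GAA", "GCC", "GGG"}
print(find_target_sites("TTGAAGCCGGGAA", modules))
```

Supporting several archives, as the abstract describes, could be approximated by passing the union of their triplet sets, or by running the scan once per archive and merging the results.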
Computational modeling suggests dimerization of equine infectious anemia virus Rev is required for RNA binding
Background
The lentiviral Rev protein mediates nuclear export of intron-containing viral RNAs that encode structural proteins or serve as the viral genome. Following translation, HIV-1 Rev localizes to the nucleus and binds its cognate sequence, termed the Rev-responsive element (RRE), in incompletely spliced viral RNA. Rev subsequently multimerizes along the viral RNA and associates with the cellular Crm1 export machinery to translocate the RNA-protein complex to the cytoplasm. Equine infectious anemia virus (EIAV) Rev is functionally homologous to HIV-1 Rev, but shares very little sequence similarity and differs in domain organization. EIAV Rev also contains a bipartite RNA binding domain comprising two short arginine-rich motifs (designated ARM-1 and ARM-2) spaced 79 residues apart in the amino acid sequence. To gain insight into the topology of the bipartite RNA binding domain, a computational approach was used to model the tertiary structure of EIAV Rev.
Results
The tertiary structure of EIAV Rev was modeled using several protein structure prediction and model quality assessment servers. Two types of structures were predicted: an elongated structure with an extended central alpha helix, and a globular structure with a central bundle of helices. Assessment of models on the basis of biophysical properties indicated they were of average quality. In almost all models, ARM-1 and ARM-2 were spatially separated by >15 Å, suggesting that they do not form a single RNA binding interface on the monomer. A highly conserved canonical coiled-coil motif was identified in the central region of EIAV Rev, suggesting that an RNA binding interface could be formed through dimerization of Rev and juxtaposition of ARM-1 and ARM-2. In support of this, purified Rev protein migrated as a dimer in Blue native gels, and mutation of a residue predicted to form a key coiled-coil contact disrupted dimerization and abrogated RNA binding. In contrast, mutation of residues outside the predicted coiled-coil interface had no effect on dimerization or RNA binding.
Conclusions
Our results suggest that EIAV Rev binding to the RRE requires dimerization via a coiled-coil motif to juxtapose two RNA binding motifs, ARM-1 and ARM-2.