69 research outputs found
Unraveling proteins: A molecular mechanics study
An internal coordinate molecular mechanics study of unfolding peptide chains by external stretching has been carried out to predict the type of force spectra that may be expected from single-molecule manipulation experiments currently being prepared. Rather than modeling the stretching of a given protein, we have looked at the behavior of simple secondary structure elements (alpha-helix, beta-ribbon, and interacting alpha-helices) to estimate the magnitude of the forces involved in their unfolding or separation and the dependence of these forces on the way pulling is carried out as well as on the length of the structural elements. The results point to a hierarchy of forces covering a surprisingly large range and to important orientational effects in the response to external stress
Protein Peeling 2: a web server to convert protein structures into series of protein units
Protein Peeling 2 (PP2) is a web server for the automatic identification of protein units (PUs) given the 3D coordinates of a protein. PUs are an intermediate level of protein structure description between protein domains and secondary structures. PP2 is a new tool to better understand and analyze the organization of protein structures. It uses only the matrices of protein contact probabilities and cuts the protein structures optimally using the Matthews correlation coefficient. An index assesses the compactness quality of each PU. Results are given both textually and graphically using the Jmol and PyMOL software. The server can be accessed from
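The Matthews correlation coefficient named in this abstract can be illustrated with a minimal sketch. The function name `matthews_cc` and the counts passed to it are hypothetical; how PP2 actually derives the counts from its contact probability matrices is not described here, so this shows only the generic formula on confusion-matrix counts.

```python
import math


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Generic textbook formula; not PP2's own implementation.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when one of the marginal totals is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    print(matthews_cc(tp=40, tn=45, fp=5, fn=10))
```

The value ranges from -1 (complete disagreement) through 0 (no better than chance) to +1 (perfect agreement), which is what makes it usable as a criterion for choosing a cut position.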
Aminopeptidase B, a glucagon-processing enzyme: site directed mutagenesis of the Zn2+-binding motif and molecular modelling
Background: Aminopeptidase B (Ap-B; EC 3.4.11.6) catalyzes the cleavage of basic residues at the N-terminus of peptides and processes glucagon into miniglucagon. The enzyme exhibits, in vitro, a residual ability to hydrolyze leukotriene A4 into the pro-inflammatory lipid mediator leukotriene B4. The potential bi-functional nature of Ap-B is supported by close structural relationships with LTA4 hydrolase (LTA4H; EC 3.3.2.6). A structure-function analysis is necessary for the detailed understanding of the enzymatic mechanisms of Ap-B and to design inhibitors, which could be used to determine the complete in vivo functions of the enzyme. Results: The rat Ap-B cDNA was expressed in E. coli and the purified recombinant enzyme was characterized. Eighteen mutants of the H325EXXHX18E348 Zn2+-binding motif were constructed and expressed. All mutations were found to abolish the aminopeptidase activity. A multiple alignment of 500 sequences of the M1 family of aminopeptidases was performed to identify 3 sub-families of exopeptidases and to build a structural model of Ap-B using the x-ray structure of LTA4H as a template. Although the 3D structures of the two enzymes resemble each other, they differ in certain details. The role that a loop delimiting the active center of Ap-B plays in discriminating basic substrates, as well as the function of consensus motifs such as the RNP1 and Armadillo domains, are discussed. Examination of electrostatic potentials and hydrophobic patches revealed important differences between Ap-B and LTA4H and suggested that Ap-B is involved in protein-protein interactions. Conclusion: Alignment of the primary structures of the M1 family members clearly demonstrates the existence of different sub-families and highlights crucial residues in the enzymatic activity of the whole family. The E. coli recombinant enzyme and the Ap-B structural model constitute powerful tools for investigating the importance and possible roles of these conserved residues in the Ap-B, LTA4H and M1 aminopeptidase catalytic sites and for gaining new insight into their physiological functions. Analysis of the Ap-B structural model indicates that several interactions between Ap-B and other proteins can occur and suggests that endopeptidases might form a complex with Ap-B during hormone processing.
Quality measures for protein alignment benchmarks
Multiple protein sequence alignment methods are central to many applications in molecular biology. These methods are typically assessed on benchmark datasets including BALIBASE, OXBENCH, PREFAB and SABMARK, which are important to biologists in making informed choices between programs. In this article, annotations of domain homology and secondary structure are used to define new measures of alignment quality and are used to make the first systematic, independent evaluation of these benchmarks. These measures indicate sensitivity and specificity while avoiding the ambiguous residue correspondences and arbitrary distance cutoffs inherent to structural superpositions. Alignments by selected methods that indicate high-confidence columns (ALIGN-M, DIALIGN-T, FSA and MUSCLE) are also assessed. Fold space coverage and effective benchmark database sizes are estimated by reference to domain annotations, and significant redundancy is found in all benchmarks except SABMARK. Questionable alignments are found in all benchmarks, especially in BALIBASE where 87% of sequences have unknown structure, 20% of columns contain different folds according to SUPERFAMILY and 30% of ‘core block’ columns have conflicting secondary structure according to DSSP. A careful analysis of current protein multiple alignment benchmarks calls into question their ability to determine reliable algorithm rankings
svmPRAT: SVM-based Protein Residue Annotation Toolkit
Background: Over the last decade several prediction methods have been developed for determining the structural and functional properties of individual protein residues using sequence and sequence-derived information. Most of these methods are based on support vector machines, as they provide accurate and generalizable prediction models. Results: We present a general-purpose protein residue annotation toolkit (svmPRAT) to allow biologists to formulate residue-wise prediction problems. svmPRAT formulates the annotation problem as a classification or regression problem using support vector machines. One of the key features of svmPRAT is its ease of use in incorporating any user-provided information in the form of feature matrices. For every residue, svmPRAT captures local information around the residue to create fixed-length feature vectors. svmPRAT implements accurate and fast kernel functions, and also introduces a flexible window-based encoding scheme that accurately captures signals and patterns for training effective predictive models. Conclusions: In this work we evaluate svmPRAT on several classification and regression problems including disorder prediction, residue-wise contact order estimation, DNA-binding site prediction, and local structure alphabet prediction. svmPRAT has also been used for the development of a state-of-the-art transmembrane helix prediction method called TOPTMH and a secondary structure prediction method called YASSPP. The toolkit provides practitioners with an efficient and easy-to-use tool for a wide variety of annotation problems. Availability: http://www.cs.gmu.edu/~mlbio/svmprat
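The idea of turning local information around each residue into a fixed-length feature vector can be sketched as follows. This is a generic sliding-window encoding with zero padding at the chain ends, offered as an assumption about the general technique; the function name `window_encode` and the padding choice are hypothetical and do not reproduce svmPRAT's actual encoding scheme or file formats.

```python
from typing import List


def window_encode(features: List[List[float]], w: int) -> List[List[float]]:
    """Build one fixed-length vector per residue from a (2*w + 1)-wide window.

    `features` holds one feature row per residue (for example a profile row).
    Window positions that fall outside the chain are filled with zeros
    (a hypothetical choice; other padding schemes are possible).
    """
    n = len(features)
    dim = len(features[0]) if n else 0
    pad = [0.0] * dim
    encoded: List[List[float]] = []
    for i in range(n):
        vec: List[float] = []
        for j in range(i - w, i + w + 1):
            vec.extend(features[j] if 0 <= j < n else pad)
        encoded.append(vec)
    return encoded


if __name__ == "__main__":
    # Three residues with one feature each, window half-width 1.
    for row in window_encode([[1.0], [2.0], [3.0]], w=1):
        print(row)
```

Every output vector has length (2w + 1) times the per-residue feature dimension, so the vectors can be passed directly to a classifier or regressor that expects fixed-size input.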
Automated Alphabet Reduction for Protein Datasets
Background: We investigate automated and generic alphabet reduction techniques for protein structure prediction datasets. Reducing alphabet cardinality without losing key biochemical information opens the door to potentially faster machine learning, data mining and optimization applications in structural bioinformatics. Furthermore, reduced but informative alphabets often result in, e.g., more compact and human-friendly classification/clustering rules. In this paper we propose a robust and sophisticated alphabet reduction protocol based on mutual information and state-of-the-art optimization techniques. Results: We applied this protocol to the prediction of two protein structural features: contact number and relative solvent accessibility. For both features we generated alphabets of two, three, four and five letters. The five-letter alphabets gave prediction accuracies statistically similar to those obtained using the full amino acid alphabet. Moreover, the automatically designed alphabets outperformed other reduced alphabets, whether taken from the literature or designed by hand. The differences between our alphabets and the alphabets taken from the literature were quantitatively analyzed. All of the above was performed using a primary sequence representation of proteins. As a final experiment, we applied the obtained five-letter alphabet to reduce a much richer protein representation, based on evolutionary information, for the prediction of the same two features. Again, the performance gap between the full representation and the reduced representation was small, showing that the results of our automated alphabet reduction protocol, even though they were obtained using a simple representation, also capture the crucial information needed for state-of-the-art protein representations. Conclusion: Our automated alphabet reduction protocol generates competent reduced alphabets tailored specifically for a variety of protein datasets. This process is done without any domain knowledge, using information theory metrics instead. The reduced alphabets contain some unexpected (but sound) groups of amino acids, thus suggesting new ways of interpreting the data.
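To make the role of mutual information concrete, the sketch below scores a candidate letter grouping by the mutual information between the reduced letters and a class label. It covers only the scoring step; the optimization procedure that searches over groupings is not shown, and the function names, the toy residues, the labels and both groupings are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, List, Sequence


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Empirical mutual information (in bits) between two paired sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def reduce_letters(letters: Sequence[str], groups: Dict[str, str]) -> List[str]:
    """Map each amino acid letter to the letter of its group."""
    return [groups[a] for a in letters]


if __name__ == "__main__":
    residues = list("AAVVKKDD")          # toy data, not from the paper
    labels = [1, 1, 1, 1, 0, 0, 0, 0]    # e.g. buried vs. exposed (hypothetical)
    good = {"A": "h", "V": "h", "K": "p", "D": "p"}
    bad = {"A": "x", "K": "x", "V": "y", "D": "y"}
    print(mutual_information(residues, labels))
    print(mutual_information(reduce_letters(residues, good), labels))
    print(mutual_information(reduce_letters(residues, bad), labels))
```

In this toy case the first grouping keeps all of the information the full alphabet carries about the label, while the second destroys it, which is the kind of contrast a search over groupings would exploit.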
Assignment of PolyProline II Conformation and Analysis of Sequence – Structure Relationship
BACKGROUND: Secondary structures are elements of great importance in structural biology, biochemistry and bioinformatics. They are broadly composed of two repetitive structures, namely α-helices and β-sheets, apart from turns, and the remainder is assigned to coil. These repetitive secondary structures have specific and conserved biophysical and geometric properties. The PolyProline II (PPII) helix is yet another interesting repetitive structure, which is less frequent and not usually associated with stabilizing interactions. Recent studies have shown that PPII frequency is higher than expected, and that PPII helices could have an important role in protein-protein interactions. METHODOLOGY/PRINCIPAL FINDINGS: A major factor that limits the study of PPII is that its assignment cannot be carried out with the most commonly used secondary structure assignment methods (SSAMs). The purpose of this work is to propose a PPII assignment methodology that can be defined in the frame of DSSP secondary structure assignment. Considering the ambiguity in PPII assignments by different methods, a consensus assignment strategy was utilized. To define the most consensual rule of PPII assignment, three SSAMs that can assign PPII were compared and analyzed. The assignment rule was defined to have a maximum coverage of all assignments made by these SSAMs. Few constraints were added to the assignment, and only PPII helices at least 2 residues long are defined. CONCLUSIONS/SIGNIFICANCE: The simple rules designed in this study for characterizing PPII conformation lead to the assignment of 5% of all amino acids as PPII. Sequence-structure relationships associated with PPII, as defined by the different SSAMs, underline a few striking differences. A specific study of amino acid preferences in the N- and C-cap regions was carried out, along with an analysis of their solvent accessibility and contact patterns. The assignment of PPII can thus be coupled with DSSP, which opens a simple way for further analysis in this field.
Describing protein structure: a general algorithm yielding complete helicoidal parameters and a unique overall axis.
We present a general and mathematically rigorous algorithm which allows the helicoidal structure of a protein to be calculated starting from the atomic coordinates of its peptide backbone. This algorithm yields a unique curved axis which quantifies the folding of the backbone and a full set of helicoidal parameters describing the location of each peptide unit. The parameters obtained form a complete and independent set and can therefore be used for analyzing, comparing, or reconstructing protein backbone geometry. This algorithm has been implemented in a computer program named P-Curve. Several examples of its possible applications are discussed.
On the orientation of a designed transmembrane peptide: toward the right tilt angle?
The orientation of the transmembrane peptide WALP23 under small hydrophobic mismatch has been assessed through long-time-scale molecular dynamics simulations of hundreds of nanoseconds. Each simulation systematically gives large tilt angles (>30°). In addition, the peptide visits various azimuthal rotations that mostly depend on the initial conditions and converge very slowly. In contrast, small tilt angles as well as a well-defined azimuthal rotation were suggested by recent solid-state 2H NMR studies on the same system. To optimally compare our simulations with NMR data, we concatenated the different trajectories in order to increase the sampling. The agreement with 2H NMR quadrupolar splittings is spectacularly better when the splittings are back-calculated from the concatenated trajectory than from any individual simulation. To these ensemble-averaged quadrupolar splittings, we then applied the GALA method as described by Strandberg et al. (Biophys J. 2004, 86, 3709-3721), which basically derives the peptide orientation (tilt and azimuth) from the splittings. We find small tilt angles (6.5°), whereas the tilt actually observed in the concatenated trajectory has a higher value (33.5°). We thus propose that the small tilt angles estimated by the GALA method are the result of averaging effects, provided that the peptide visits many states of different azimuthal rotations. We discuss how to improve the method and suggest some other experiments to confirm this hypothesis. This work also highlights the need to run several rather long trajectories in order to predict the peptide orientation from computer simulations.