A Simple Extension to the CMASA Method for the Prediction of Catalytic Residues in the Presence of Single Point Mutations
<div><p>The automatic identification of catalytic residues remains an important challenge in structural bioinformatics. Sequence-based methods are good alternatives when the query shares a high percentage of identity with a well-annotated enzyme. However, when the homology is not apparent, which occurs with many structures from the structural genome initiative, structural information should be exploited. A local structural comparison is preferred to a global structural comparison when predicting functional residues. CMASA is a recently proposed method for predicting catalytic residues based on a local structure comparison. The method achieves high accuracy and a high value for the Matthews correlation coefficient. However, point substitutions or a lack of relevant data strongly affect the performance of the method. In the present study, we propose a simple extension to the CMASA method to overcome this difficulty. Extensive computational experiments are presented, both as proof-of-concept instances and for a few real cases. The results show that the extension performs well when the catalytic site contains mutated residues or when some residues are missing. The proposed modification correctly predicted the catalytic residues of a mutant thymidylate synthase, 1EVF. It also successfully predicted the catalytic residues for 3HRC despite the lack of information for a relevant side chain atom in the PDB file.</p></div>
Comparison of local catalytic structures in 3HRC and 2OIC.
<p>The template of 2OIC has a catalytic site structure similar to that of 3HRC, with the exception of residue A315. The proposed xCMASA is able to detect the catalytic residues D205-K207-N210-T245. Superposition of the catalytic residues in the query structure 3HRC (the colors differ for each element, and the identifiers are underlined) and the residues of the associated 2OIC (the elements are in grey, and the identifiers are not underlined).</p>
Computing the number of comparisons to handle single point mutations.
<p>Using the input shown in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0108513#pone-0108513-g002" target="_blank">Figure 2</a> and the substitution matrix approach, the residues of type H can be interchanged with E, D, and K. These substitutions generate the combinations shown in (A). In contrast, xCMASA does not require additional information because the sub-templates are derived from <i>t<sub>i</sub></i>, as shown in (B).</p>
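<p>The difference in the number of comparisons can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the caption: the query composition is hypothetical, the template is taken to be (E, D, K), and the sub-templates are assumed to be formed by leaving out one template residue at a time. The function names are illustrative and are not part of CMASA or xCMASA, and the resulting counts are not the numbers shown in the figure.</p>

```python
from itertools import combinations
from math import prod


def count_with_substitutions(type_counts, template, substitutes):
    """Count candidate residue sets when substituted types are also allowed.

    `substitutes` maps a query residue type to the template types it may
    stand in for (an assumed encoding of the substitution-matrix idea).
    """
    per_position = []
    for t in template:
        # Types accepted at this template position: the type itself plus
        # every query type that the substitution rule allows in its place.
        allowed = {t} | {q for q, targets in substitutes.items() if t in targets}
        per_position.append(sum(type_counts.get(a, 0) for a in allowed))
    return prod(per_position)


def count_with_subtemplates(type_counts, template):
    """Count candidate residue sets over leave-one-out sub-templates.

    Assumption: each sub-template drops exactly one template residue and
    only exact type matches are considered.
    """
    size = len(template) - 1
    return sum(
        prod(type_counts.get(t, 0) for t in sub)
        for sub in combinations(template, size)
    )


# Hypothetical query composition: residue type -> number of occurrences.
type_counts = {"E": 2, "D": 1, "K": 2, "H": 1}
template = ("E", "D", "K")
substitutes = {"H": {"E", "D", "K"}}  # H may replace E, D, or K

with_sm = count_with_substitutions(type_counts, template, substitutes)
with_sub = count_with_subtemplates(type_counts, template)
print(with_sm, with_sub)
```

<p>Under these assumptions the substitution approach multiplies the enlarged candidate pools, whereas the sub-template approach sums over smaller products and needs no substitution table as input.</p>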
Examples of the predictions for catalytic sites that were evaluated as FNs by CMASA-SM and as TPs by xCMASA.
Performance measures as a function of CMAD.
<p>The graph shows the relations among the sensitivity, accuracy, and MCC of CMASA and xCMASA for scenarios with and without single point mutations.</p>
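<p>For reference, the three measures named above can be computed from confusion-matrix counts using their standard definitions. The sketch below is illustrative only: the function name and the example counts are hypothetical and are not taken from the study.</p>

```python
from math import sqrt


def performance(tp, fp, tn, fn):
    """Return (sensitivity, accuracy, MCC) from confusion-matrix counts."""
    # Sensitivity: fraction of true catalytic residues that were predicted.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: fraction of all residues that were classified correctly.
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # Matthews correlation coefficient; defined here as 0 when any
    # marginal total is zero, a common convention.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, accuracy, mcc


# Hypothetical counts, for illustration only.
sens, acc, mcc = performance(tp=8, fp=2, tn=85, fn=5)
print(f"sensitivity={sens:.3f} accuracy={acc:.3f} MCC={mcc:.3f}")
```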
Performance criteria for CMASA (MT) and xCMASA (ET) with CMAD = 1.2. Mutated (M) and non-mutated (NoM) queries.
Comparison of local catalytic structures in 1EVF and 1BQ1.
<p>Residue number 167 in 1EVF is mutated from SER to THR, and the proposed extension could detect all of the remaining catalytic residues, which were not detected by CMASA or CMASA-SM. Superposition of the catalytic residues in the query 1EVF (colors by element with the identifiers underlined) and the residues of the associated template 1BQ1 (the elements are in grey, and the identifiers are not underlined). The similarity of the local structures of the non-mutated residues is shown.</p>
Emulation of local structures in CMASA.
<p>The query protein <i>Q</i> is hypothetical; the template <i>t<sub>i</sub></i> is associated with the catalytic site of protein 1ADO to emulate the local structures. (A) Input to the method: <i>Q</i> is the sequence of residues in the query structure; <i>t<sub>i</sub></i> (<i>E</i>, <i>D</i>, and <i>K</i>) is the template used to emulate the local structures in <i>Q</i>. (B) Emulated structures of <i>t<sub>i</sub></i> in <i>Q</i>; there are four possible combinations (<i>lq<sub>1</sub></i>, <i>lq<sub>2</sub></i>, <i>lq<sub>3</sub></i>, <i>lq<sub>4</sub></i>) that may match the template <i>t<sub>i</sub></i>.</p>
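<p>The emulation step described above can be sketched as an enumeration of query residues grouped by type. This is a minimal illustration, not the CMASA implementation: the query residues and positions below are invented, the function name is hypothetical, and the sketch assumes that each template residue has a distinct type.</p>

```python
from itertools import product


def emulate_local_structures(query, template_types):
    """Enumerate residue sets in `query` whose types match the template.

    `query` is a list of (position, one-letter residue type) pairs.
    Assumes the template types are all distinct.
    """
    # Group query residues by their residue type.
    by_type = {}
    for position, residue_type in query:
        by_type.setdefault(residue_type, []).append((residue_type, position))
    # One candidate pool per template residue; a type absent from the
    # query gives an empty pool and therefore no combinations at all.
    pools = [by_type.get(t, []) for t in template_types]
    return list(product(*pools))


# Hypothetical query Q and the template types (E, D, K).
query = [(10, "E"), (25, "D"), (31, "K"), (48, "E"), (52, "K"), (60, "H")]
candidates = emulate_local_structures(query, ["E", "D", "K"])
for number, local_structure in enumerate(candidates, start=1):
    print(f"lq{number}:", local_structure)
```

<p>With two E, one D, and two K residues in this invented query, the enumeration yields four candidate local structures; each would then be compared structurally against the template.</p>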
Comparison of local catalytic structures in 3HRC and 1UU9.
<p>Although both 3HRC (colored by element, with the identifiers underlined) and 1UU9 (in grey, with the identifiers not underlined) carry the A209E variant, 3HRC lacks an atom in the side chain (<i>O</i>ε<i><sub>2</sub></i>). CMASA fails in this situation, but xCMASA is able to detect the non-mutated residues.</p>
Flowchart of CMASA and the proposed extension.
<p>The diagrams with a dashed background indicate added characteristics, the grey shadowed regions correspond to the process of introducing the SM, and the dashed arrows indicate optional flow.</p>