How accurate and statistically robust are catalytic site predictions based on closeness centrality?

A Armon; A del Sol; A Gutteridge; AG Murzin; AH Elcock; AR Panchenko; B Thibert; CA Innis; D La; D La; Dennis R Livesay; DR Livesay; DR Livesay; DR Livesay; Eric Chea; F Pazos; F Pazos; G Cheng; GJ Bartlett; GM Alter; H Yao; JD Watson; KC Usher; KV Brinda; LC Kurz; LH Greene; M Vendruscolo; MA del Sol; MJ Ondrechen; MT Neves-Petersen; NV Dokholyan; O Lichtarge; OS Soyer; P Aloy; PJ Bickel; PP Wangikar; R Landgraf; RJ Russell; S Jones; S Madabushi; W Kabsch

Publication date: 1 January 2007
Publisher: BioMed Central

Abstract

Background: We examine the accuracy of enzyme catalytic residue predictions from a network representation of protein structure. In this model, amino acid α-carbons specify vertices within a graph, and edges connect vertices that are proximal in structure. Closeness centrality, which has shown promise in previous investigations, is used to identify important positions within the network. Closeness centrality (CC), a global measure of network centrality, is calculated as the reciprocal of the average distance between vertex i and all other vertices.

Results: We benchmark the approach against 283 structurally unique proteins within the Catalytic Site Atlas. Our results, which are in line with previous investigations of smaller datasets, indicate that closeness centrality predictions are statistically significant. However, unlike previous approaches, we specifically focus on residues with the very best scores. Over the top five closeness centrality scores, we observe an average true to false positive rate ratio of 6.8 to 1. As demonstrated previously, adding a solvent accessibility filter significantly improves predictive power; the average ratio is increased to 15.3 to 1. We also demonstrate (for the first time) that filtering the predictions by residue identity improves the results even more than accessibility filtering. Here, we simply eliminate from consideration residues whose physicochemical properties are unlikely to be compatible with catalytic requirements. Residue identity filtering improves the average true to false positive rate ratio to 26.3 to 1. Combining the two filters has little effect on the results. Calculated p-values for the three prediction schemes range from 2.7E-9 to less than 8.8E-134. Finally, the sensitivity of the predictions to structure choice and slight perturbations is examined.

Conclusion: Our results resolutely confirm that closeness centrality is a viable prediction scheme whose predictions are statistically significant. Simple filtering schemes substantially improve the method's predictive power. Moreover, no clear effect on performance is observed when comparing ligated and unligated structures. Similarly, the CC prediction results are robust to slight structural perturbations from molecular dynamics simulation.
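To make the closeness centrality definition in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes an unweighted contact graph built from α-carbon coordinates, measures distance between vertices as shortest-path length in edges, and uses an 8.5 Å contact cutoff; the cutoff value, the function names, and the toy coordinates are all illustrative assumptions that do not come from the abstract.

```python
from collections import deque
from math import dist


def contact_graph(ca_coords, cutoff=8.5):
    """Link residues whose alpha-carbons lie within `cutoff` angstroms.

    The 8.5 A default is an assumed value chosen for illustration only.
    """
    n = len(ca_coords)
    adjacency = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if dist(ca_coords[i], ca_coords[j]) <= cutoff:
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency


def closeness_centrality(adjacency):
    """Reciprocal of the average shortest-path length from each vertex.

    Path lengths are counted in edges via breadth-first search. Only
    reachable vertices are averaged over (an assumption for disconnected
    graphs); an isolated vertex scores 0.0.
    """
    scores = []
    for source in range(len(adjacency)):
        hops = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in hops:
                    hops[v] = hops[u] + 1
                    queue.append(v)
        total = sum(hops.values())
        scores.append((len(hops) - 1) / total if total > 0 else 0.0)
    return scores


# Hypothetical toy "protein": five alpha-carbons on a line, 5 A apart,
# so with a 6 A cutoff only sequence neighbours are in contact.
toy_coords = [(5.0 * k, 0.0, 0.0) for k in range(5)]
toy_scores = closeness_centrality(contact_graph(toy_coords, cutoff=6.0))

# Rank residues from most to least central, as a top-N prediction would.
ranking = sorted(range(len(toy_scores)), key=lambda k: toy_scores[k], reverse=True)
```

In this toy chain the middle residue ranks first because it has the shortest average path to every other vertex. A top-five prediction of the kind described in the abstract would take the first five entries of such a ranking, optionally after applying a solvent accessibility or residue identity filter.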
