8 research outputs found

Correlation of Features with Principal Components.

Author: Amruta C. Mahadik (565032)
Brian W. Beck (565034)
Isha Mehta (565033)
Sanjana Sudarshan (565030)
Sasi B. Kodathala (565031)

Loading plots of the eigenvector coefficients of each feature analyzed by PCA show the influence of each variable on, and its correlation with, the principal components (PCs). Eight features were analyzed to identify the set of features that could represent ∼80% of the data variation in the first two principal components (see text for feature descriptions). (a) The first two PCs accounted for 80.3% of the total variance of all eight features, though R2_ΔΔG (red) had demonstrably smaller coefficients. (b) Excluding R2_ΔΔG produced a PCA over 7 features whose PC1 and PC2 accounted for 87.9% of the variance. (c) After removal of the 50 interfaces predicted to be FLIP in the first PCA, a second round of PCA was calculated using the same seven features but only the data for the remaining 110 protein interfaces. This PCA produced eigenvectors that captured 84.2% of the variance in the first two PCs. [Figure generated using JMP [46] and Microsoft Excel, 2008].
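The quantities in this caption (eigenvector coefficients per feature and the share of variance carried by PC1 and PC2) can be illustrated with a minimal sketch. It is not the authors' JMP workflow: the data are synthetic, the six features are hypothetical stand-ins, and the eigenvectors are found with a simple power iteration chosen only to keep the example dependency-free.

```python
import math
import random

rng = random.Random(0)

# Hypothetical data: 200 samples, 6 features. Features 0-2 follow latent
# factor A, features 3-4 follow latent factor B, feature 5 is pure noise.
rows = []
for _ in range(200):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    rows.append([a + rng.gauss(0, 0.3) for _ in range(3)]
                + [b + rng.gauss(0, 0.3) for _ in range(2)]
                + [rng.gauss(0, 1)])

def correlation_matrix(data):
    """Correlation matrix, i.e. the covariance of standardized columns."""
    n, m = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(m)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in data) / (n - 1))
           for j in range(m)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in data]
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(m)]
            for i in range(m)]

def mat_vec(a, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def top_eigenpairs(matrix, count, iterations=500):
    """Leading eigenpairs of a symmetric matrix by power iteration with deflation."""
    m = len(matrix)
    a = [row[:] for row in matrix]
    pairs = []
    for _ in range(count):
        v = [rng.gauss(0, 1) for _ in range(m)]
        for _ in range(iterations):
            w = mat_vec(a, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigenvalue = sum(x * y for x, y in zip(v, mat_vec(a, v)))
        pairs.append((eigenvalue, v))
        # Deflate: remove the component just found before the next pass.
        a = [[a[i][j] - eigenvalue * v[i] * v[j] for j in range(m)]
             for i in range(m)]
    return pairs

corr = correlation_matrix(rows)
(l1, pc1), (l2, pc2) = top_eigenpairs(corr, 2)
# The trace of a correlation matrix equals the number of features.
explained = (l1 + l2) / len(corr)
print(f"variance in PC1 + PC2: {explained:.1%}")
for j, (c1, c2) in enumerate(zip(pc1, pc2)):
    print(f"feature {j}: PC1 {c1:+.2f}, PC2 {c2:+.2f}")
```

In this toy setup the pure-noise feature receives small coefficients on both leading components, which is the kind of pattern that would motivate dropping a feature, as the caption describes for R2_ΔΔG.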

FigShare

Distribution of alanine substitution energies in FLIP and FunC interfaces.

Author: Amruta C. Mahadik (565032)
Brian W. Beck (565034)
Isha Mehta (565033)
Sanjana Sudarshan (565030)
Sasi B. Kodathala (565031)

(a) and (b) show histogrammed contour plots, colored blue to red, of the ΔΔGala for substitution of interfacial residues to alanine (blue: more favorable values, red: more disruptive values). The plot axes are the first two principal components of the geometric distribution of alanine Cα positions. PCA was used to align the interface along the X- and Y-axes. Axes are normalized. (a) ΔΔGala of the FunC interface from PDBid: 1c02, chains A&B. (b) ΔΔGala of the FLIP interface from PDBid: 1b5e_AB, chains A&B. (c) Linear regressions of ΔΔGala vs. distance from the interface center. Regressions are shown for the interfaces in the FLIPdb training set with the 10 most positive [1acy_HP, 1biq_AB, 2cii_AC, 1b5e_AB, 1edh_AB, 1pky_BD, 1tx4_AB, 1hjc_AC, x1bsf8_AJ, 1bo5_OZ] and 10 most negative [1tzi_AV, 1acy_LP, x1ppf2_EZ, x1dv82_AC, x1wtl_BZ, x1py94_AE, x1erv2_AC, x1gaf2_LY, 1scu_BD, 1c02_AB] intercepts. FLIP are shown in green and blue [1tzi_AV, 1acy_LP]. FunC are shown in red and yellow [x1bsf8_AJ, 1bo5_OZ]. ΔΔGala values are normalized to MAX(ABS(ΔΔGala)), while the distances of each residue's Cα from the mean of the Cα positions (Center of Interface) are normalized to MAX(distance). All 3 plots generally show that FLIP interfaces are more centralized and radially symmetric than FunC interfaces. 80% of the shown positive intercepts are FLIP and 80% of the shown negative intercepts are FunC. [Figures (a,b) generated using JMP [46]. Figure (c) generated using Microsoft Excel, 2008]
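The normalization and regression described for panel (c) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the Cα coordinates and ΔΔGala values are invented, and the helper names `normalized_profile` and `linear_fit` are hypothetical. The fit is ordinary least squares.

```python
import math

def normalized_profile(ca_coords, ddg):
    """Return (normalized distance from interface center, normalized ddG)."""
    n = len(ca_coords)
    # Center of interface: mean of the C-alpha positions.
    center = tuple(sum(c[i] for c in ca_coords) / n for i in range(3))
    dists = [math.dist(c, center) for c in ca_coords]
    max_dist = max(dists)                  # MAX(distance)
    max_abs = max(abs(e) for e in ddg)     # MAX(ABS(ddG))
    return [d / max_dist for d in dists], [e / max_abs for e in ddg]

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical interface: five residues on a line, with the largest
# substitution energy at the center and smaller values toward the rim.
coords = [(-2.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
          (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
ddg_ala = [0.5, 1.25, 2.0, 1.25, 0.5]

dist_norm, ddg_norm = normalized_profile(coords, ddg_ala)
slope, intercept = linear_fit(dist_norm, ddg_norm)
print(f"slope {slope:.2f}, intercept {intercept:.2f}")
```

A positive intercept with a negative slope, as in this toy case, corresponds to energies concentrated near the interface center, which is the pattern the caption associates mostly with FLIP interfaces.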

FigShare

Protein-Protein Interface Detection Using the Energy Centrality Relationship (ECR) Characteristic of Proteins

Author: Amruta C. Mahadik (565032)
Brian W. Beck (565034)
Isha Mehta (565033)
Sanjana Sudarshan (565030)
Sasi B. Kodathala (565031)
Publication date: 15/05/2014

Specific protein interactions are responsible for most biological functions. Distinguishing Functionally Linked Interfaces of Proteins (FLIPs) from Functionally uncorrelated Contacts (FunCs) is therefore important to characterizing these interactions. To achieve this goal, we have created a database of protein structures called FLIPdb, containing proteins belonging to various functional sub-categories. Here, we use geometric features coupled with Kortemme and Baker's computational alanine scanning method to calculate the energetic sensitivity of each amino acid at the interface to substitution, identify hotspots, and identify other factors that may contribute towards an interface being FLIP or FunC. Using Principal Component Analysis and K-means clustering on a training set of 160 interfaces, we could distinguish FLIPs from FunCs with an accuracy of 76%. When these methods were applied to two test sets of 18 and 170 interfaces, we achieved similar accuracies of 78% and 80%. We have identified that FLIP interfaces have a stronger central organizing tendency than FunCs, due, we suggest, to greater specificity. We also observe that certain functional sub-categories, such as enzymes, antibody-heavy-light, antibody-antigen, and enzyme-inhibitors, form distinct sub-clusters. The antibody-antigen and enzyme-inhibitor interfaces have patterns of physical characteristics similar to those of FunCs, which agrees with the fact that the selection pressures on these interfaces are evolutionarily driven in a different way. As such, our ECR model also successfully describes the impact of evolution and natural selection on protein-protein interfaces. Finally, we indicate how our ECR method may be of use in reducing the false positive rate of docking calculations.

Directory of Open Access Journals

PubMed Central

FigShare

The Energy Centrality Relationship (ECR) for interface evolution.

Author: Amruta C. Mahadik (565032)
Brian W. Beck (565034)
Isha Mehta (565033)
Sanjana Sudarshan (565030)
Sasi B. Kodathala (565031)

The ECR hypothesis is that upon initial fortuitous protein-protein association, residues in a nascent interface have a selective pressure to maintain or improve the affinity arising from the initial contact, while simultaneously having a similar pressure on residues surrounding that contact. (a) and (b) show a conceptual PPI that has a radially symmetric distribution of ‘hot’ (energetically favorable, red) and ‘cold’ (energetically unfavorable, blue) residues in a FLIP, while (c) and (d) are example residue energy distributions of weaker (c) and stronger (d) affinity FunC. Over evolutionary time (c–f), selective activity, affinity, and specificity pressures on residues in a FunC produce a radially symmetric pattern in the energetics of the interface. The resulting interface should demonstrate “stronger” energies near the “older” regions of the interface. These “older” regions may or may not demonstrate sequence conservation as the pressure is on energy, not identity. As natural interfaces are generally more punctate than the ideal model, we expect that while both FLIP and FunC interfaces may demonstrate multiple contacts, only FLIP interfaces will maintain overall centrality (e–f).

FigShare

Summary of protein and protein interface counts in FLIPdb.

Author: Amruta C. Mahadik (565032)
Brian W. Beck (565034)
Isha Mehta (565033)
Sanjana Sudarshan (565030)
Sasi B. Kodathala (565031)

* Protein chains are common to multiple sub-categories, though the interfaces are distinct.
‡ Interfaces are constructed from existing FLIPs through coordinate transformations arising from the symmetry of the source X-ray crystal structure (XFunCs).

FLIPdb contains 160 interfaces in 94 structures involving 219 individual protein chains. These interfaces have been assigned to FLIP or FunC functional categories and 9 functional sub-categories based on a review of the literature (see Supplement Table S1). Due to the reuse of some chains, the totals represented in the first two columns do not sum across sub-categories.

FigShare

Accuracy of clustering in Training and Test-18 sets.

Author: Amruta C. Mahadik (565032)
Brian W. Beck (565034)
Isha Mehta (565033)
Sanjana Sudarshan (565030)
Sasi B. Kodathala (565031)

† TP: FLIP found in Cluster 1
TN: FunC found in Cluster 2
FP: FunC found in Cluster 1
FN: FLIP found in Cluster 2

The accuracy and Matthews correlation coefficient (MCC, a measure of the quality of a binary classification) of the results of the clusterings shown in Figure 4 are indicated. The overall accuracy is 76% and 78% for the training and Test-18 sets, respectively. TPs are quite readily identified in both training and Test-18 sets (80% and 69% sensitivity, respectively). The majority of TPs are enzymes and immunoglobulin heavy chain-light chain interactions. TNs are less well identified (70% and 56% negative predictive values, respectively). MCCs of 0.50 and 0.62 indicate that our simple two-category approach is generally appropriate.
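As a minimal sketch of how accuracy, sensitivity, and MCC follow from the four counts defined above: the function name `binary_metrics` is hypothetical, and the counts used are illustrative values chosen to be roughly consistent with the training-set percentages quoted in this listing. They are not taken from the paper's table.

```python
import math

def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity and Matthews correlation coefficient."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, sensitivity, mcc

# Illustrative counts only (assumed, not from the source table):
# 100 FLIP and 60 FunC interfaces split across two clusters.
accuracy, sensitivity, mcc = binary_metrics(tp=80, tn=42, fp=18, fn=20)
print(f"accuracy {accuracy:.2%}, sensitivity {sensitivity:.0%}, MCC {mcc:.2f}")
```

With these assumed counts the accuracy is about 76%, the sensitivity is 80%, and the MCC is close to 0.50. An MCC of 0 would indicate a classification no better than chance, and 1 a perfect one.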

FigShare

PCA and K-means clustering of Training and Test-18 sets.

Author: Amruta C. Mahadik (565032)
Brian W. Beck (565034)
Isha Mehta (565033)
Sanjana Sudarshan (565030)
Sasi B. Kodathala (565031)

Principal component analysis followed by K-means clustering was performed on the residues in the 100 FLIP and 60 FunC interfaces in the FLIPdb. The same 7 features identified in Figure 3 are used here and the number of clusters was set to k = 2. Green (“cluster 1”) and red (“cluster 2”) ovals represent 1 standard deviation for Euclidean distances around the cluster centroid marked by “x”. Interfaces are indicated with symbols representing their functional sub-category. Green and blue symbols are FLIP structures, but blue symbols are specifically AbAg and Inhibitor sub-categories. Red symbols are FunCs. (a) and (b): training set. (c) and (d): Test-18 testing set. (a) 49 FLIP interfaces (mostly enzymes and immunoglobulin Heavy-Light chains) and 1 FunC are identified in cluster 1 (98% precision). (b) After removal of these 50 interfaces, a second PCA of the remaining 110 interfaces produces new clusters with 48 and 62 members, respectively. PCA 2 Cluster 1 is 64% FLIP and cluster 2 is 68% FunC. Overall accuracy across both (a)+(b) is 76%. (c) and (d) show the projection of the 7 feature values of 18 unrelated PPIs in the Test-18 set through the principal components developed on the training set. Enzymes and immunoglobulin Heavy-Light chains again dominate cluster 1 (100%) and overall accuracy in both clusterings is 78%. [Figure generated with JMP [46] and Microsoft Excel, 2008].
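A minimal sketch of the clustering step, assuming interfaces have already been reduced to two principal-component scores: it runs a plain Lloyd's K-means with k = 2 and then reports the fraction of FLIPs in a cluster, the same ratio as the 49/50 = 98% precision quoted above. The points, class labels, and function names are hypothetical, and JMP's K-means may differ in initialization and stopping rules.

```python
import math
import random

def kmeans(points, k=2, iterations=100, seed=0):
    """Lloyd's algorithm: returns a cluster label per point and the centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            updated.append(
                tuple(sum(axis) / len(members) for axis in zip(*members))
                if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids

def flip_precision(labels, classes, cluster):
    """Fraction of the interfaces in a cluster whose known class is FLIP."""
    members = [cls for lab, cls in zip(labels, classes) if lab == cluster]
    return members.count("FLIP") / len(members)

# Hypothetical PC1/PC2 scores: two well-separated groups of 30 interfaces.
rng = random.Random(1)
points = ([(rng.gauss(0, 0.5), rng.gauss(0, 0.5)) for _ in range(30)]
          + [(rng.gauss(5, 0.5), rng.gauss(5, 0.5)) for _ in range(30)])
known = ["FLIP"] * 30 + ["FunC"] * 30

labels, centroids = kmeans(points, k=2)
flip_cluster = labels[0]
print("FLIP precision:", flip_precision(labels, known, flip_cluster))
```

Real interface data would overlap far more than these synthetic groups, which is why the caption reports mixed clusters after the second PCA.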

FigShare

Protein-Protein Interface Detection Using the Energy Centrality Relationship (ECR) Characteristic of Proteins

Author: Amruta C. Mahadik
Brian W. Beck
Isha Mehta
Sanjana Sudarshan
Sasi B. Kodathala
Publication venue: 'Public Library of Science (PLoS)'

Crossref