This thesis presents an analysis of the relationship between single nucleotide
polymorphisms (SNPs) and protein–protein interactions. The aim of the thesis is to
investigate the distribution of non-synonymous single nucleotide polymorphisms
(nsSNPs) across three locations: the protein core, protein–protein interface
sites and the remaining protein surface. The analysis used experimentally
verified human protein–protein interactions and nsSNPs from the UniProt humsavar
database. A further investigation was performed on a larger SNP dataset from the 1000
Genomes Project (1KGP). Both investigations identified a significant preference for
disease-causing SNPs to occur at the protein interface compared to other areas on the
protein surface. The three-dimensional structures of protein–protein interfaces were
examined in order to propose stereo-chemical explanations for the disease-causing
effect of nsSNPs in the humsavar dataset. In addition, three methodologies that could
help identify pathogenic variants were presented: use of a SNP server, structural
analysis and use of the global minor allele frequency (GMAF). Structural analysis was
also performed on non-disease-causing SNPs in order to investigate their possible
effects on protein–protein interactions. The results showed that some SNPs previously
classified as non-disease-causing could potentially be disease-causing. The myVARIANT
program was developed to obtain SNPs from 1KGP, map them to structures, evaluate
their distribution on structures and perform a structural analysis. In conclusion, the
thesis demonstrates the important role that protein–protein interactions play in disease
pathogenesis.