CaMELS: In silico prediction of calmodulin binding proteins and their binding sites
Due to Ca2+-dependent binding and the sequence diversity of Calmodulin (CaM) binding proteins, identifying CaM interactions and binding sites in the wet-lab is tedious and costly. Therefore, computational methods for this purpose are crucial to the design of such wet-lab experiments. We present an algorithm suite called CaMELS (CalModulin intEraction Learning System) for predicting proteins that interact with CaM as well as their binding sites using sequence information alone. CaMELS offers state-of-the-art accuracy for both CaM interaction and binding site prediction and can aid biologists in studying CaM binding proteins. For CaM interaction prediction, CaMELS uses protein sequence features coupled with a large-margin classifier. CaMELS models the binding site prediction problem using multiple instance machine learning with a custom optimization algorithm which allows more effective learning over imprecisely annotated CaM-binding sites during training. CaMELS has been extensively benchmarked using a variety of data sets, mutagenic studies, proteome-wide Gene Ontology enrichment analyses and protein structures. Our experiments indicate that CaMELS outperforms simple motif-based search and other existing methods for interaction and binding site prediction. We have also found that the whole sequence of a protein, rather than just its binding site, is important for predicting its interaction with CaM. Using the machine learning model in CaMELS, we have identified important features of protein sequences for CaM interaction prediction as well as characteristic amino acid sub-sequences and their relative position for identifying CaM binding sites. Python code for training and evaluating CaMELS together with a webserver implementation is available at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#camels
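To make the two ideas in the abstract concrete, the following is a minimal illustrative sketch, not the CaMELS implementation. It assumes amino acid composition as the sequence feature, a linear classifier trained by plain subgradient descent on the hinge loss as a stand-in for the large-margin classifier, and a simple maximum over sliding windows as a stand-in for the multiple instance view of binding sites. The sequences are synthetic toy strings, and every function name and parameter is hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Fraction of each of the 20 standard amino acids in seq (assumed feature)."""
    seq = seq.upper()
    n = max(len(seq), 1)
    return [seq.count(a) / n for a in AMINO_ACIDS]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_hinge(X, y, epochs=200, lr=0.1, lam=0.01, seed=0):
    """Linear large-margin classifier: subgradient descent on the
    L2-regularised hinge loss. Labels in y must be +1 or -1."""
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (dot(w, X[i]) + b)
            # Regularisation step: shrink the weights slightly.
            w = [wi * (1.0 - lr * lam) for wi in w]
            # Hinge step: update only when the margin is violated.
            if margin < 1.0:
                w = [wi + lr * y[i] * xi for wi, xi in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def score(w, b, seq):
    """Decision value for a sequence; positive means 'predicted to bind'."""
    return dot(w, composition(seq)) + b

def best_window(w, b, seq, width=20):
    """Score every window of the given width and return (start, score) of
    the highest-scoring one, treating the protein as a bag of windows."""
    starts = range(max(len(seq) - width, 0) + 1)
    return max(((s, score(w, b, seq[s:s + width])) for s in starts),
               key=lambda t: t[1])

# Synthetic toy sequences (NOT real data): basic/hydrophobic vs. acidic/polar.
POSITIVES = ["KRKWKKLFRAVKRKLLRKWA", "RRKLKKAIRWVKKLFRKAIL"]
NEGATIVES = ["DEDGSDEESGDDTEPSEDGS", "EDSGETDDPSGEEDSTGDEE"]

X = [composition(s) for s in POSITIVES + NEGATIVES]
y = [1] * len(POSITIVES) + [-1] * len(NEGATIVES)
w, b = train_hinge(X, y)

# Locate the most binding-like window in a longer toy sequence.
protein = NEGATIVES[0] + POSITIVES[0] + NEGATIVES[1]
start, value = best_window(w, b, protein, width=20)
```

In this sketch the window-level maximum is applied only at prediction time. The abstract describes multiple instance learning with a custom optimization algorithm during training, which this example does not attempt to reproduce.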