Shape Analysis in Bioinformatics

Abstract

In this thesis we explore two main themes, both of which involve proteins. The first area of research focuses on the analyses of proteins displayed as spots on 2-dimensional planes. The second area of research focuses on a specific protein and how interactions with this protein can naturally prevent or, in the presence of a pesticide, cause toxicity. The first area of research builds on previously developed EM methodology to infer the matching and transformation necessary to superimpose two partially labelled point configurations, focusing on the application to 2D protein images. We modify the methodology to account for the possibility of missing and misallocated markers, where markers make up the labelled proteins manually located across images. We provide a way to account for the likelihood of an increased edge variance within protein images. We find that slight marker misallocations do not greatly influence the final output superimposition when considering data simulated to mimic the given dataset. The methodology is also successfully used to automatically locate and remove a grossly misallocated marker within the given dataset before further analyses is carried out. We develop a method to create a union of replicate images, which can then be used alone in further analyses to reduce computational expense. We describe how the data can be modelled to enable the inference on the quality of a dataset, a property often overlooked in protein image analysis. To complete this line of research we provide a method to rank points that are likely to be present in one group of images but absent in a second group. The produced score is used to highlight the proteins that are not present in both image sets representing control or diseased tissue, therefore providing biological indicators which are vitally important to improve the accuracy of diagnosis. In the second area of research, we test the hypothesis that pesticide toxicity is related to the shape similarity between the pesticide molecule itself and the natural ligand of the protein to which a pesticide will bind (and ultimately cause toxicity). A ligand of aprotein is simply a small molecule that will bind to that protein. It seems intuitive that the similarities between a naturally formed ligand and a synthetically developed ligand (the pesticide) may be an indicator of how well a pesticide and the protein bind, as well as provide an indicator of pesticide toxicity. A graphical matching algorithm is used to infer the atomic matches across ligands, with Procrustes methodology providing the final superimposition before a measure of shape similarity is defined considering the aligned molecules. We find evidence that the measure of shape similarity does provide a significant indicator of the associated pesticide toxicity, as well as providing a more significant indicator than previously found biological indicators. Previous research has found that the properties of a molecule in its bioactive form are more suitable indicators of an associated activity. Here, these findings dictate that the docked conformation of a pesticide within the protein will provide more accurate indicators of the associated toxicity. So next we use a docking program to predict the docked conformation of a pesticide. We provide a technique to calculate the similarity between the docks of both the pesticide and the natural ligand. A similar technique is used to provide a measure for the closeness of fit between a pesticide and the protein. Both measures are then considered as independent variables for the prediction of toxicity. In this case the results show potential for the calculated variables to be useful toxicity predictors, though further analysis is necessary to properly explore their significance

Similar works

This paper was published in White Rose E-theses Online.

Having an issue?

Is data on this page outdated, violates copyrights or anything else? Report the problem now and we will take corresponding actions after reviewing your request.