    Fragment Based Protein Active Site Analysis Using Markov Random Field Combinations of Stereochemical Feature-Based Classifications

    Recent improvements in structural genomics efforts have greatly increased the number of hypothetical proteins in the Protein Data Bank. Several computational methodologies have been developed to determine the function of these proteins, but none has been able to account successfully for the diversity in the sequences and structural conformations observed in proteins that have the same function. An additional complication is the flexibility of both the protein active site and the ligand. In this dissertation, novel approaches to deal with both ligand flexibility and the diversity in stereochemistry are proposed. The active site analysis problem is formalized as a classification problem in which, for a given test protein, the goal is to predict the class of ligand most likely to bind the active site based on its stereochemical nature and thereby define the protein's function. Traditional methods that have adopted a similar methodology have struggled to account for the flexibility observed in large ligands. Therefore, I propose a novel fragment-based approach to dealing with larger ligands. The advantage of the fragment-based methodology is that considering the protein-ligand interactions in a piecewise manner does not affect the active site patterns, and it also provides a way to account for the problems associated with flexible ligands. I also propose two feature-based methodologies to account for the diversity observed in sequences and structural conformations among proteins with the same function. The feature-based methodologies provide detailed descriptions of the active site stereochemistry and are capable of identifying stereochemical patterns within the active site despite this diversity. Finally, I propose a Markov Random Field (MRF) approach to combine the individual ligand fragment classifications (based on the stereochemical descriptors) into a single multi-fragment ligand class. This probabilistic framework combines the information provided by stereochemical features with the information regarding geometric constraints between ligand fragments to make a final ligand class prediction. The feature-based fragment identification methodology had an accuracy of 84% across a diverse set of ligand fragments, and the MRF analysis was able to successfully combine the various ligand fragments (identified by feature-based analysis) into one final ligand based on statistical models of ligand fragment distances. This novel approach to protein active site analysis was additionally tested on three proteins with very low sequence and structural similarity to other proteins in the PDB (a challenge for traditional methods), and in each of these cases it successfully identified the cognate ligand. The approach thus addresses the two main issues that affect the accuracy of current automated methodologies in protein function assignment.
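The combination step described in the abstract, per-fragment class probabilities merged with inter-fragment distance constraints, can be illustrated with a toy sketch. This is not the dissertation's implementation: the function names, the Gaussian distance model, the brute-force search, and every number below are assumptions chosen for illustration only. For simplicity, both candidate ligands have the same number of fragments so that their log-scores are directly comparable.

```python
import itertools
import math


def log_gaussian(x, mean, std):
    """Log-density of a normal distribution (assumed distance model)."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def score_ligand(ligand, site_probs, site_dists):
    """Best log-score over all assignments of ligand fragments to distinct sites.

    Unary terms: classifier probability that a site binds a given fragment.
    Pairwise terms: log-likelihood of the observed site-to-site distance
    under the ligand's expected fragment-to-fragment distance.
    """
    frags = ligand["fragments"]
    best = -math.inf
    for sites in itertools.permutations(range(len(site_probs)), len(frags)):
        # Unary potentials (a small floor avoids log(0) for unseen fragments).
        score = sum(math.log(site_probs[s].get(f, 1e-9))
                    for s, f in zip(sites, frags))
        # Pairwise potentials on the fragment pairs the ligand constrains.
        for (i, j), (mean, std) in ligand["distances"].items():
            a, b = sites[i], sites[j]
            score += log_gaussian(site_dists[(min(a, b), max(a, b))], mean, std)
        best = max(best, score)
    return best


def predict_ligand(ligands, site_probs, site_dists):
    """Return the highest-scoring ligand class and all scores."""
    scores = {name: score_ligand(lig, site_probs, site_dists)
              for name, lig in ligands.items()}
    return max(scores, key=scores.get), scores


# Hypothetical classifier output for three sub-sites of an active site.
site_probs = [
    {"adenine": 0.8, "flavin": 0.2},
    {"ribose": 0.6, "ribitol": 0.4},
    {"phosphate": 0.9, "adenine": 0.1},
]
# Hypothetical distances (in angstroms) between sub-site centers.
site_dists = {(0, 1): 4.2, (1, 2): 5.1, (0, 2): 8.0}
# Hypothetical ligand models: fragment list plus (mean, std) distances.
ligands = {
    "ATP": {"fragments": ("adenine", "ribose", "phosphate"),
            "distances": {(0, 1): (4.0, 0.5), (1, 2): (5.0, 0.5)}},
    "FMN": {"fragments": ("flavin", "ribitol", "phosphate"),
            "distances": {(0, 1): (6.5, 0.5), (1, 2): (5.0, 0.5)}},
}

best, scores = predict_ligand(ligands, site_probs, site_dists)
print(best, scores)
```

With these invented inputs the ATP-like model scores higher, because its fragments agree with both the per-site probabilities and the observed spacing. Exhaustive enumeration is only workable for a handful of fragments; a realistic MRF would use a proper inference procedure instead.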

    Automated Diagnosis of Retinal Images Using Evidential Reasoning

    TEXTAL™: Artificial Intelligence Techniques for Automated Protein Structure Determination

    X-ray crystallography is the most widely used method for determining the three-dimensional structures of proteins and other macromolecules. One of the most difficult steps in crystallography is interpreting the 3D image of the electron density cloud surrounding the protein. This is often done manually by crystallographers and is very time-consuming and error-prone. The difficulties stem from the fact that the domain knowledge required for interpreting electron density data is uncertain. Thus, crystallographers often have to resort to intuition and heuristics for decision-making. The problem is compounded by the fact that, in most cases, the available data are noisy and blurred. TEXTAL is a system designed to automate this challenging process of inferring the atomic structure of proteins from electron density data.