13 research outputs found

    Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines

    Abstract
    Background: Human genetic variation results primarily from single nucleotide polymorphisms (SNPs), which occur approximately every 1000 bases in the overall human population. Non-synonymous SNPs (nsSNPs), which lead to amino acid changes in the protein product, may account for nearly half of the known genetic variations linked to inherited human diseases. One of the key problems of medical genetics today is to identify nsSNPs that underlie disease-related phenotypes in humans. Computational tools that can identify such nsSNPs would therefore enhance our understanding of genetic diseases and help predict disease.
    Results: We propose a method, named Parepro (Predicting the amino acid replacement probability), to identify nsSNPs having either deleterious or neutral effects on the resulting protein function. Two independent datasets, HumVar and NewHumVar, taken from the PhD-SNP server, were used to train the model and test the robustness of Parepro. In a 20-fold cross-validation test on the HumVar dataset, Parepro achieved a Matthews correlation coefficient (MCC) of 50% and an overall accuracy (Q2) of 76%, both higher than those obtained by methods such as PolyPhen, SIFT, and HybridMeth. Further analysis on an additional dataset (NewHumVar) using Parepro yielded similar results.
    Conclusion: The performance of Parepro indicates that it is a powerful tool for predicting the effect of nsSNPs on protein function and would be useful for large-scale analysis of genomic nsSNP data.
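The two metrics quoted in this abstract, overall accuracy (Q2) and the Matthews correlation coefficient (MCC), can both be computed from a binary confusion matrix. The following is a minimal sketch using the standard definitions; the function name and the counts are made up for illustration and are not taken from the Parepro paper.

```python
import math


def q2_and_mcc(tp, tn, fp, fn):
    """Return (Q2, MCC) for a binary confusion matrix.

    tp/tn/fp/fn are counts of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    # Q2: fraction of all predictions that are correct.
    q2 = (tp + tn) / total
    # MCC: correlation between predicted and observed classes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, report 0 when a row or column of the matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return q2, mcc


# Hypothetical counts, chosen only to illustrate the calculation.
q2, mcc = q2_and_mcc(tp=40, tn=36, fp=14, fn=10)
print(f"Q2 = {q2:.2f}, MCC = {mcc:.2f}")
```

Unlike Q2, MCC stays informative when the deleterious and neutral classes are unbalanced, which is one reason the two are often reported together.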

    Predicting changes in protein thermostability brought about by single- or multi-site mutations

    Abstract
    Background: An important aspect of protein design is the ability to predict changes in protein thermostability arising from single- or multi-site mutations. Protein thermostability is reflected in the change in free energy (ΔΔG) of thermal denaturation.
    Results: We have developed predictive software, Prethermut, based on machine learning methods, to predict the effect of single- or multi-site mutations on protein thermostability. The input vector of Prethermut is based on known structural changes and empirical measurements of changes in potential energy due to protein mutations. In a 10-fold cross-validation test on the M-dataset, consisting of 3366 mutant proteins from ProTherm, the classification accuracy of random forests and the regression accuracy of random forest regression were slightly better than those of support vector machines and support vector regression; the overall classification accuracy and the Pearson correlation coefficient of regression were 79.2% and 0.72, respectively. Prethermut performs better on proteins containing multi-site mutations than on those with single mutations.
    Conclusions: The performance of Prethermut indicates that it is a useful tool for predicting changes in protein thermostability brought about by single- or multi-site mutations and will be valuable in the rational design of proteins.

    Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines-1

    No full text
    Copyright information: Taken from "Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines" (http://www.biomedcentral.com/1471-2105/8/450), BMC Bioinformatics 2007;8():450-450. Published online 16 Nov 2007. PMCID:PMC2216041.
    [...]ws correlation coefficient (MCC)

    Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines-2

    No full text
    Copyright information: Taken from "Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines" (http://www.biomedcentral.com/1471-2105/8/450), BMC Bioinformatics 2007;8():450-450. Published online 16 Nov 2007. PMCID:PMC2216041.
    [...]is based on the NumVar dataset

    INTEGRATIVE SYSTEM BIOLOGY STUDIES ON HIGH THROUGHPUT GENOMICS AND PROTEOMICS DATASET

    Indiana University-Purdue University Indianapolis (IUPUI)
    The post-genomic era has propelled us to the view that biological systems are complex networks of interacting genes, proteins and small molecules that give rise to biological form and function. The past decade has seen the advent of a number of new technologies designed to study biological systems on a genome-wide scale. These technologies offer insight into the activity of thousands of genes and proteins in a cell and have thereby changed the conventional reductionist view of such systems. However, the deluge of data surpasses the analytical and critical abilities of researchers and thereby demands the development of new computational methods. The challenge no longer lies in the acquisition of expression profiles, but rather in the interpretation of the results to gain insights into biological mechanisms. In three case studies, we applied various systems biology techniques to publicly available and in-house genomics and proteomics data sets to identify sub-network signatures. In the first study, we integrated prior knowledge from gene signatures, GSEA and gene/protein network modeling to identify pathways involved in colorectal cancer. In the second, we identified plasma-based network signatures for Alzheimer's disease by combining various feature selection and classification approaches. In the final study, we performed an integrated miRNA-mRNA analysis to identify the role of myeloid-derived suppressor cells (MDSCs) in T-cell suppression.

    Computational and Experimental Approaches to Reveal the Effects of Single Nucleotide Polymorphisms with Respect to Disease Diagnostics

    DNA mutations are the cause of many human diseases, and they account for natural differences among individuals by affecting the structure, function, interactions, and other properties of DNA and expressed proteins. The ability to predict whether a given mutation is disease-causing or harmless is of great importance for the early detection of patients with a high risk of developing a particular disease and would pave the way for personalized medicine and diagnostics. Here we review existing methods and techniques to study and predict the effects of DNA mutations from three different perspectives: in silico, in vitro and in vivo. It is emphasized that the problem is complicated and that successful detection of a pathogenic mutation frequently requires a combination of several methods and a knowledge of the biological phenomena associated with the corresponding macromolecules.

    Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines-3

    No full text
    Copyright information: Taken from "Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines" (http://www.biomedcentral.com/1471-2105/8/450), BMC Bioinformatics 2007;8():450-450. Published online 16 Nov 2007. PMCID:PMC2216041.
    [...]ribute sets are constructed using the PSAP information in combination with the RD, MI, and IE properties of the amino acids. Finally, the complex vector of Parepro is integrated and used to predict the effect of an nsSNP

    Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines-0

    No full text
    Copyright information: Taken from "Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines" (http://www.biomedcentral.com/1471-2105/8/450), BMC Bioinformatics 2007;8():450-450. Published online 16 Nov 2007. PMCID:PMC2216041.
    [...]ribute sets are constructed using the PSAP information in combination with the RD, MI, and IE properties of the amino acids. Finally, the complex vector of Parepro is integrated and used to predict the effect of an nsSNP

    Incidence and regulatory implications of single nucleotide polymorphisms among established ovarian cancer genes

    Magister Scientiae - MSc
    Ovarian cancer research focuses on answering important questions related to the disease and on determining whether new approaches can feasibly improve current treatments or lead to new ones. This study focused on the transcriptional regulation of genes that have been implicated in ovarian cancer, based on the occurrence of single nucleotide polymorphisms (SNPs) within transcription factor binding sites (TFBSs). Through the application of several in silico tools, databases and custom programs, this research aimed to contribute toward the identification of potentially biomedically important genes or SNPs for pre-diagnosis and subsequent treatment planning of ovarian cancer. A total of 379 candidate genes that have been experimentally associated with ovarian cancer were analyzed. This led to the identification of 121 SNPs that coincide with putative TFBSs, potentially influencing a total of 57 transcription factors that would normally bind to these TFBSs. These SNPs with potential phenotypic effect were then evaluated among several population groups defined by the International HapMap Consortium, resulting in the identification of three SNPs present in five or more of the eleven population groups that have been sampled.
    South Afric

    Mapping the genotype-phenotype relationship in complex disease.

    Computational methods to identify harmful variations in humans perform well for rare diseases such as Huntington's but not for common diseases like hypertension or diabetes. A modelling approach that takes protein context into account was illustrated to identify harmful variants involved in complex diseases.