18 research outputs found

    Functional impacts of non-synonymous single nucleotide polymorphisms: Selective constraint and structural environments

    In this work, we studied the correlations between selective constraint, structural environments and functional impacts of non-synonymous single nucleotide polymorphisms (nsSNPs). We found that the relation between solvent accessibility and functional impacts of nsSNPs is not as simple as generally thought. Finer structural classifications need to be taken into account to reveal the complex relations between the characteristics of a structural environment and its influence on the functional impacts of nsSNPs. We introduced two parameters for each structural environment, consensus residue percentage and residue distribution distance, to characterize the selective constraint imposed by the environment. Both parameters significantly correlate with the functional bias of nsSNPs across the structural environments. This result shows that selective constraint underlies the bias of a structural environment towards a certain type of nsSNPs (disease-associated or benign).

    Analyzing Effects of Naturally Occurring Missense Mutations

    A single-point mutation in the genome, for example a single-nucleotide polymorphism (SNP) or a rare genetic mutation, is the change of a single nucleotide for another in the genome sequence. Some of these mutations produce an amino acid substitution in the corresponding protein sequence (missense mutations); others do not. This paper focuses on genetic mutations resulting in a change in the amino acid sequence of the corresponding protein and how to assess their effects on protein wild-type characteristics. The existing methods and approaches for predicting the effects of mutation on protein stability, structure, and dynamics are outlined and discussed with respect to their underlying principles. Available resources, either as stand-alone applications or webservers, are pointed out as well. It is emphasized that understanding the molecular mechanisms behind the effects of these missense mutations is of critical importance for detecting disease-causing mutations. The paper also provides several examples of applying 3D structure-based methods to model the effects of missense mutations on protein stability and protein-protein interactions.

    Computational Analysis of Missense Mutations Causing Snyder-Robinson Syndrome

    The Snyder-Robinson syndrome is caused by missense mutations in the spermine synthase gene that encodes a protein (SMS) of 529 amino acids. Here we investigate, in silico, the molecular effect of three missense mutations, c.267G>A (p.G56S), c.496T>G (p.V132G), and c.550T>C (p.I150T) in SMS that were clinically identified to cause the disease. Single-point energy calculations, molecular dynamics simulations, and pKa calculations revealed the effects of these mutations on SMS's stability, flexibility, and interactions. It was predicted that the catalytic residue, Asp276, should be protonated prior to binding the substrates. The pKa calculations indicated that the p.I150T mutation causes pKa changes with respect to the wild-type SMS, which involve titratable residues interacting with the S-methyl-5′-thioadenosine (MTA) substrate. The p.I150T missense mutation was also found to decrease the stability of the C-terminal domain and to induce structural changes in the vicinity of the MTA binding site. The other two missense mutations, p.G56S and p.V132G, are away from the active site and do not perturb its wild-type properties, but affect the stability of both the monomers and the dimer. Specifically, the p.G56S mutation is predicted to greatly reduce the affinity of monomers to form a dimer, and therefore should have a dramatic effect on SMS function because dimerization is essential for SMS activity. Hum Mutat 31:1043–1049, 2010

    Exploring the genomic basis of traits relevant to evolution and ecology of chestnuts (Castanea) using high-throughput DNA sequencing and bioinformatics.

    Introduced pests and pathogens have devastated forest ecosystems in the temperate zone; in eastern North America, introduced pests and pathogens have led to the elimination of most mature elms (Ulmus), ashes (Fraxinus), hemlocks (Tsuga) and chestnuts (Castanea) over large areas where these genera were formerly abundant and important for local ecosystems. The restoration of species affected by introduced pests and pathogens requires the development and propagation of trees that possess heritable resistance. High-throughput DNA sequencing and genomics provide opportunities for researchers to identify resistance gene candidates, screen germplasm, and develop markers for marker-assisted selection in breeding programs, with the goal of restoring ecologically important wild trees to the landscape. American chestnut (Castanea dentata) is currently the focus of a major research effort that intends to restore the species by incorporating blight resistance from Chinese chestnut (Castanea mollissima), a species that is generally resistant to chestnut blight. I investigated several aspects of chestnut genomics and blight resistance with the goal of aiding the blight resistance breeding program for American chestnut. I tested a detached-leaf assay for chestnut blight resistance and learned that it may not be useful for screening advanced backcross (BC3) progeny in chestnut blight resistance breeding programs (Chapter 2). Utilizing a recent draft assembly of the Chinese chestnut reference genome, I analyzed patterns of genetic variation across regions associated with chestnut blight resistance, and found that several loci associated with blight resistance show markedly elevated nucleotide diversity in the most resistant Chinese chestnuts relative to more susceptible trees. At other blight-associated loci, genetic diversity was low in all C. mollissima (Chapter 3). This indicates that while maintaining high allelic diversity at blight resistance loci is desirable for a resistance breeding program, it may not be essential. Assessing potential unintended effects of hybrid breeding on the ecological behavior of restored chestnuts, I found that several genetic loci in third backcross (BC3) chestnut appear to affect caching decisions by squirrels due to inheritance of C. mollissima alleles that influence seed traits (Chapter 4). The reason for backcrossing in the American chestnut breeding program is to avoid the short, branchy mature form of C. mollissima. By sequencing the genomes of wild and orchard-derived Chinese chestnuts, I showed that some genomic loci under selection in orchard chestnuts (i.e., artificially selected by humans) may influence crown form (Chapter 5). This work should provide the basis for further investigations that validate the phenotypic effects of the proposed candidate genes, and utilize information on genetic polymorphisms identified here to accelerate chestnut improvement programs.

    Doctor of Philosophy

    Rapidly evolving technologies such as chip arrays and next-generation sequencing are uncovering human genetic variants at an unprecedented pace. Unfortunately, this ever-growing collection of gene sequence variation has limited clinical utility without clear association to disease outcomes. As electronic medical records begin to incorporate genetic information, gene variant classification and accurate interpretation of gene test results play a critical role in customizing patient therapy. To verify the functional impact of a given gene variant, laboratories rely on confirming evidence such as previous literature reports, patient history and disease segregation in a family. By definition, variants of uncertain significance (VUS) lack this supporting evidence, and in such cases computational tools are often used to evaluate the predicted functional impact of a gene mutation. This study evaluates leveraging high-quality genotype-phenotype disease variant data from 20 genes and 3986 variants to develop gene-specific predictors utilizing a combination of changes in primary amino acid sequence, amino acid properties as descriptors of mutation severity, and Naïve Bayes classification. A Primary Sequence Amino Acid Properties (PSAAP) prediction algorithm was then combined with well-established predictors in a weighted Consensus sum in the context of gene-specific reference intervals for known phenotypes. PSAAP and Consensus were also used to evaluate known variants of uncertain significance in the RET proto-oncogene as a model gene. The PSAAP algorithm was successfully extended to many genes and diseases. Gene-specific algorithms typically outperform generalized prediction tools. Characteristic mutation properties of a given gene and disease may be lost when diluted into genome-wide data sets. A reliable computational phenotype classification framework with quantitative metrics and disease-specific reference ranges allows objective evaluation of novel or uncertain gene variants and augments decision making when confirming clinical information is limited.

    SNPs3D: Candidate gene and SNP selection for association studies

    The relationship between disease susceptibility and genetic variation is complex, and many different types of data are relevant. We describe a web resource and database that provides and integrates as much information as possible on disease/gene relationships at the molecular level. The resource http://www.SNPs3D.org has three primary modules. One module identifies which genes are candidates for involvement in a specified disease. A second module provides information about the relationships between sets of candidate genes. The third module analyzes the likely impact of non-synonymous SNPs on protein function. Disease/candidate gene relationships and gene-gene relationships are derived from the literature using simple but effective text profiling. SNP/protein function relationships are derived by two methods, one using principles of protein structure and stability, the other based on sequence conservation. Entries for each gene include a number of links to other data, such as expression profiles, pathway context, mouse knockout information and papers. Gene-gene interactions are presented in an interactive graphical interface, providing rapid access to the underlying information, as well as convenient navigation through the network. Use of the resource is illustrated with aspects of the inflammatory response and hypertension. The combination of SNP impact analysis, a knowledge-based network of gene relationships and candidate genes, and access to a wide range of data and literature allows a user to quickly assimilate available information, and so develop models of gene-pathway-disease interaction. https://doi.org/10.1186/1471-2105-7-16

    Contextual Analysis of Large-Scale Biomedical Associations for the Elucidation and Prioritization of Genes and their Roles in Complex Disease

    Vast amounts of biomedical associations are easily accessible in public resources, spanning gene-disease associations, tissue-specific gene expression, gene function and pathway annotations, and many other data types. Despite this mass of data, information most relevant to the study of a particular disease remains loosely coupled and difficult to incorporate into ongoing research. Current public databases are difficult to navigate and do not interoperate well due to the plethora of interfaces and varying biomedical concept identifiers used. Because no coherent display of data within a specific problem domain is available, finding the latent relationships associated with a disease of interest is impractical. This research describes a method for extracting the contextual relationships embedded within associations relevant to a disease of interest. After applying the method to a small test data set, a large-scale integrated association network is constructed for application of a network propagation technique that helps uncover more distant latent relationships. Together these methods are adept at uncovering highly relevant relationships without any a priori knowledge of the disease of interest. The combined contextual search and relevance methods power a tool which makes pertinent biomedical associations easier to find, easier to assimilate into ongoing work, and more prominent than in currently available databases. Increasing the accessibility of current information is an important component of understanding high-throughput experimental results and surviving the data deluge.

    Assessing the pathogenicity of insertion and deletion variants with the Variant Effect Scoring Tool (VEST-Indel)

    Insertion/deletion variants (indels) alter protein sequence and length, yet are highly prevalent in healthy populations, presenting a challenge to bioinformatics classifiers. Commonly used features (DNA and protein sequence conservation, indel length, and occurrence in repeat regions) are useful for inference of protein damage. However, these features can cause false positives when predicting the impact of indels on disease. Existing methods for indel classification suffer from low specificities, severely limiting clinical utility. Here, we further develop our variant effect scoring tool (VEST) to include the classification of in-frame and frameshift indels (VEST-indel) as pathogenic or benign. We apply 24 features, including a new “PubMed” feature, to estimate a gene's importance in human disease. When compared with four existing indel classifiers, our method achieves a drastically reduced false-positive rate, improving specificity by as much as 90%. This approach of estimating gene importance might be generally applicable to missense and other bioinformatics pathogenicity predictors, which often fail to achieve high specificity. Finally, we tested all possible meta-predictors that can be obtained from combining the four different indel classifiers using Boolean conjunctions and disjunctions, and derived a meta-predictor with improved performance over any individual method.

    IN SILICO MODELING THE EFFECT OF SINGLE POINT MUTATIONS AND RESCUING THE EFFECT BY SMALL MOLECULES BINDING

    A single-point mutation in the genome, for example a single-nucleotide polymorphism (SNP) or a rare genetic mutation, is the change of a single nucleotide for another in the genome sequence. Some of these mutations result in an amino acid substitution in the corresponding protein sequence (missense mutations); others do not. This investigation focuses on genetic mutations resulting in a change in the amino acid sequence of the corresponding protein. This choice is motivated by the fact that missense mutations are frequently found to affect the native function of proteins by altering their structure, interactions and other properties, and thereby to cause disease. One such disease is the Snyder-Robinson syndrome (SRS), an X-linked mental retardation disorder found to be caused by missense mutations in human spermine synthase (SMS). In this thesis, a rational approach to predicting the effects of missense mutations on SMS wild-type characteristics was carried out. Following this work, a structure-based virtual screening of small molecules was applied to rescue the disease-causing effect by searching for small molecules that stabilize the malfunctioning SMS mutant dimer.