108 research outputs found

    SNPeffect 4.0: on-line prediction of molecular and structural effects of protein-coding variants

    Single nucleotide variants (SNVs) are, together with copy number variation, the primary source of variation in the human genome and are associated with phenotypic variation such as altered response to drug treatment and susceptibility to disease. Linking structural effects of non-synonymous SNVs to functional outcomes is a major issue in structural bioinformatics. The SNPeffect database (http://snpeffect.switchlab.org) uses sequence- and structure-based bioinformatics tools to predict the effect of protein-coding SNVs on the structural phenotype of proteins. It integrates aggregation prediction (TANGO), amyloid prediction (WALTZ), chaperone-binding prediction (LIMBO) and protein stability analysis (FoldX) for structural phenotyping. Additionally, SNPeffect holds information on affected catalytic sites and a number of post-translational modifications. The database contains all known human protein variants from UniProt, but users can now also submit custom protein variants for a SNPeffect analysis, including automated structure modeling. The new meta-analysis application allows plotting correlations between phenotypic features for a user-selected set of variants

    Developing an optimised activity type annotation method based on classification accuracy and entropy indices

    The generation of substantial amounts of travel and mobility related data has spawned the emergence of the era of big data. However, this data generally lacks activity-travel information such as trip purpose. This deficiency led to the development of trip purpose inference (activity type imputation / annotation) techniques, whose performance depends on the available input data and the (number of) activity type classes to infer. Aggregating activity types strongly increases the inference accuracy and is usually left to the discretion of the researcher. Because this choice is open to interpretation, it undermines the reported inference accuracy. This study developed an optimised classification methodology by identifying classes of activity types with an optimal balance between improving model accuracy and preserving activity information from the original data set. A sensitivity analysis was performed, and several machine learning algorithms were tested. The proposed method may be applied to any study area.

    An Evolutionary Trade-Off between Protein Turnover Rate and Protein Aggregation Favors a Higher Aggregation Propensity in Fast Degrading Proteins

    We previously showed the existence of selective pressure against protein aggregation by the enrichment of aggregation-opposing ‘gatekeeper’ residues at strategic places along the sequence of proteins. Here we analyzed the relationship between protein lifetime and protein aggregation by combining experimentally determined turnover rates, expression data, structural data and chaperone interaction data on a set of more than 500 proteins. We find that selective pressure on protein sequences against aggregation is not homogeneous but that short-living proteins on average have a higher aggregation propensity and fewer chaperone interactions than long-living proteins. We also find that short-living proteins are more often associated with deposition diseases. These findings suggest that the efficient degradation of high-turnover proteins is sufficient to preclude aggregation, but also that factors that inhibit proteasomal activity, such as physiological ageing, will primarily affect the aggregation of short-living proteins.

    MutDB: update on development of tools for the biochemical analysis of genetic variation

    Understanding how genetic variation affects the molecular function of gene products is an emergent area of bioinformatic research. Here, we present updates to MutDB (http://www.mutdb.org), a tool aiming to aid bioinformatic studies by integrating publicly available databases of human genetic variation with molecular features and clinical phenotype data. MutDB, first developed in 2002, integrates annotated SNPs in dbSNP and amino acid substitutions in Swiss-Prot with protein structural information, links to scores that predict functional disruption and other useful annotations. Though these functional annotations are mainly focused on nonsynonymous SNPs, some information on other SNP types included in dbSNP is also provided. Additionally, we have developed a new functionality that facilitates KEGG pathway visualization of genes containing SNPs and a SNP query tool for visualizing and exporting sets of SNPs that share selected features based on certain filters

    Improved Detection of Rare Genetic Variants for Diseases

    Technological advances have promoted gene-based sequencing studies with the aim of identifying rare mutations responsible for complex diseases. A complication in these types of association studies is that the vast majority of non-synonymous mutations are believed to be neutral to phenotypes. It is thus critical to distinguish potential causative variants from neutral variation before performing association tests. In this study, we used existing prediction algorithms to predict functional amino acid substitutions, and incorporated that information into association tests. Using simulations, we comprehensively studied the effects of several influential factors, including the sensitivity and specificity of functional variant predictions, number of variants, and proportion of causative variants, on the performance of association tests. Our results showed that incorporating information regarding functional variants obtained from existing prediction algorithms improves statistical power under certain conditions, particularly when the proportion of causative variants is moderate. The application of the proposed tests to a real sequencing study confirms our conclusions. Our work may help investigators who are planning to pursue gene-based sequencing studies.

    Genetic polymorphisms are associated with serum levels of sex hormone binding globulin in postmenopausal women

    Background: Estrogen activity plays a critical role in bone homeostasis. The serum levels of sex hormone binding globulin (SHBG) influence free estrogen levels and activity on target tissues. The objective of this study was to analyze the influence of common polymorphisms of the SHBG gene on serum SHBG, bone mineral density (BMD), and osteoporotic fractures. Methods: Four biallelic polymorphisms of the SHBG gene were studied by means of Taqman assays in 753 postmenopausal women. BMD was measured by DXA and serum SHBG was measured by ELISA. Results: Age, body weight, and two polymorphisms of the SHBG gene (rs6257 and rs1799941 [A/G]) were significantly associated with serum SHBG in unadjusted and age- and weight-adjusted models. Alleles at the rs1799941 locus showed the strongest association with serum SHBG (p = 0.0004). The difference in SHBG levels between women with AA and GG genotypes at the rs1799941 locus was 39%. There were no significant differences in BMD across SHBG genotypes. The genotypes showed similar frequency distributions in control women and women with vertebral or hip fractures. Conclusion: Some common genetic variants of the SHBG gene, and particularly an A/G polymorphism situated in the 5' region, influence serum SHBG levels. However, a significant association with BMD or osteoporotic fractures has not been demonstrated.

    Domain Altering SNPs in the Human Proteome and Their Impact on Signaling Pathways

    Single nucleotide polymorphisms (SNPs) constitute an important mode of genetic variations observed in the human genome. A small fraction of SNPs, about four thousand out of the ten million, has been associated with genetic disorders and complex diseases. The present study focuses on SNPs that fall on protein domains, 3D structures that facilitate connectivity of proteins in cell signaling and metabolic pathways. We scanned the human proteome using the PROSITE web tool and identified proteins with SNP containing domains. We showed that SNPs that fall on protein domains are highly statistically enriched among SNPs linked to hereditary disorders and complex diseases. Proteins whose domains are dramatically altered by the presence of an SNP are even more likely to be present among proteins linked to hereditary disorders. Proteins with domain-altering SNPs comprise highly connected nodes in cellular pathways such as the focal adhesion, the axon guidance pathway and the autoimmune disease pathways. Statistical enrichment of domain/motif signatures in interacting protein pairs indicates extensive loss of connectivity of cell signaling pathways due to domain-altering SNPs, potentially leading to hereditary disorders

    Distribution and Effects of Nonsense Polymorphisms in Human Genes

    BACKGROUND: A great amount of data has been accumulated on genetic variations in the human genome, but we still do not know much about how the genetic variations affect gene function. In particular, little is known about the distribution of nonsense polymorphisms in human genes despite their drastic effects on gene products. METHODOLOGY/PRINCIPAL FINDINGS: To detect polymorphisms affecting gene function, we analyzed all publicly available polymorphisms in a database for single nucleotide polymorphisms (dbSNP build 125) located in the exons of 36,712 known and predicted protein-coding genes that were defined in an annotation project of all human genes and transcripts (H-InvDB ver3.8). We found a total of 252,555 single nucleotide polymorphisms (SNPs) and 8,479 insertions and deletions in the representative transcripts in these genes. The SNPs located in ORFs include 40,484 synonymous and 53,754 nonsynonymous SNPs, and 1,258 SNPs that were predicted to be nonsense SNPs or read-through SNPs. We estimated the density of nonsense SNPs to be 0.85 × 10^-3 per site, which is lower than that of nonsynonymous SNPs (2.1 × 10^-3 per site). On average, nonsense SNPs were located 250 codons upstream of the original termination codon, with the substitution occurring most frequently at the first codon position. Of the nonsense SNPs, 581 were predicted to cause nonsense-mediated decay (NMD) of transcripts that would prevent translation. We found that nonsense SNPs causing NMD were more common in genes involving kinase activity and transport. The remaining 602 nonsense SNPs are predicted to produce truncated polypeptides, with an average truncation of 75 amino acids. In addition, 110 read-through SNPs at termination codons were detected. CONCLUSION/SIGNIFICANCE: Our comprehensive exploration of nonsense polymorphisms showed that nonsense SNPs exist at a lower density than nonsynonymous SNPs, suggesting that nonsense mutations have more severe effects than amino acid changes. The correspondence of nonsense SNPs to known pathological variants suggests that phenotypic effects of nonsense SNPs have been reported for only a small fraction of nonsense SNPs, and that nonsense SNPs causing NMD are more likely to be involved in phenotypic variations. These nonsense SNPs may include pathological variants that have not yet been reported. These data are available from Transcript View of H-InvDB and VarySysDB (http://h-invitational.jp/varygene/).

    Genetically Engineered iPSC-Derived FTDP-17 MAPT Neurons Display Mutation-Specific Neurodegenerative and Neurodevelopmental Phenotypes

    Tauopathies such as frontotemporal dementia (FTD) remain incurable to date, partially due to the lack of translational in vitro disease models. The MAPT gene, encoding the microtubule-associated protein tau, has been shown to play an important role in FTD pathogenesis. Therefore, we used zinc finger nucleases to introduce two MAPT mutations into healthy donor induced pluripotent stem cells (iPSCs). The IVS10+16 mutation increases the expression of 4R tau, while the P301S mutation is pro-aggregant. Whole-transcriptome analysis of MAPT IVS10+16 neurons reveals neuronal subtype differences, reduced neural progenitor proliferation potential, and aberrant WNT/SHH signaling. Notably, these neurodevelopmental phenotypes could be recapitulated in neurons from patients carrying the MAPT IVS10+16 mutation. Moreover, the additional pro-aggregant P301S mutation revealed additional phenotypes, such as an increased calcium burst frequency, reduced lysosomal acidity, tau oligomerization, and neurodegeneration. This series of iPSCs could serve as a platform to unravel a potential link between pathogenic 4R tau and FTD

    Predicting disease-associated substitution of a single amino acid by analyzing residue interactions

    Background: The rapid accumulation of data on non-synonymous single nucleotide polymorphisms (nsSNPs, also called SAPs) should allow us to further our understanding of the underlying disease-associated mechanisms. Here, we use complex networks to study the role of an amino acid in both local and global structures and determine the extent to which disease-associated and polymorphic SAPs differ in terms of their interactions to other residues. Results: We found that SAPs can be well characterized by network topological features. Mutations are probably disease-associated when they occur at a site with a high centrality value and/or high degree value in a protein structure network. We also discovered that study of the neighboring residues around a mutation site can help to determine whether the mutation is disease-related or not. We compiled a dataset from the Swiss-Prot variant pages and constructed a model to predict disease-associated SAPs based on the random forest algorithm. The values of total accuracy and MCC were 83.0% and 0.64, respectively, as determined by 5-fold cross-validation. With an independent dataset, our model achieved a total accuracy of 80.8% and MCC of 0.59, respectively. Conclusions: The satisfactory performance suggests that network topological features can be used as quantification measures to determine the importance of a site on a protein, and this approach can complement existing methods for prediction of disease-associated SAPs. Moreover, the use of this method in SAP studies would help to determine the underlying linkage between SAPs and diseases through extensive investigation of mutual interactions between residues.