151 research outputs found

    Identification of deleterious non-synonymous single nucleotide polymorphisms using sequence-derived information

    Background: As the number of non-synonymous single nucleotide polymorphisms (nsSNPs), also known as single amino acid polymorphisms (SAPs), increases rapidly, computational methods that can distinguish disease-causing SAPs from neutral SAPs are needed. Many methods have been developed to distinguish disease-causing SAPs based on both structural and sequence features of the mutation point. One limitation of these methods is that they are not applicable to cases where protein structures are not available. In this study, we explore the feasibility of classifying SAPs into disease-causing and neutral mutations using only information derived from protein sequence. Results: We compiled a set of 686 features that were derived from protein sequence. For each feature, the distance between the wild-type residue and mutant-type residue was computed. Then a greedy approach was used to select the features that were useful for the classification of SAPs. Ten features were selected. Using the selected features, a decision tree method can achieve 82.6% overall accuracy with 0.607 Matthews Correlation Coefficient (MCC) in cross-validation. When tested on an independent set that was not seen by the method during training and feature selection, the decision tree method achieves 82.6% overall accuracy with 0.604 MCC. We also evaluated the proposed method on all SAPs obtained from Swiss-Prot, where it achieves 0.42 MCC with 73.2% overall accuracy. This method allows users to make reliable predictions when protein structures are not available. Unlike previous studies, in which only a small set of features was arbitrarily chosen and considered, here we used an automated method to systematically discover useful features from a large set of features well annotated in public databases. Conclusion: The proposed method is a useful tool for the classification of SAPs, especially when the structure of the protein is not available.
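The abstract above reports overall accuracy and the Matthews Correlation Coefficient (MCC). As a minimal sketch of how both follow from a binary confusion matrix, the function below uses the standard MCC definition; the function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (overall accuracy, MCC) for binary confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0 when any marginal total is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts chosen only for illustration.
acc, mcc = accuracy_and_mcc(tp=40, tn=45, fp=5, fn=10)
print(f"accuracy={acc:.3f}, MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is presumably why the abstract reports it alongside accuracy for datasets where disease-causing and neutral SAPs are unevenly represented.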

    Improving the prediction of disease-related variants using protein three-dimensional structure

    Background: Single Nucleotide Polymorphisms (SNPs) are an important source of human genome variability. Non-synonymous SNPs occurring in coding regions result in single amino acid polymorphisms (SAPs) that may affect protein function and lead to pathology. Several methods attempt to estimate the impact of SAPs using different sources of information. Although sequence-based predictors have shown good performance, the quality of these predictions can be further improved by introducing new features derived from three-dimensional protein structures. Results: In this paper, we present a structure-based machine learning approach for predicting disease-related SAPs. We have trained a Support Vector Machine (SVM) on a set of 3,342 disease-related mutations and 1,644 neutral polymorphisms from 784 protein chains. We use SVM input features derived from the protein's sequence, structure, and function. After dataset balancing, the structure-based method (SVM-3D) reaches an overall accuracy of 85%, a correlation coefficient of 0.70, and an area under the receiver operating characteristic curve (AUC) of 0.92. When compared with a similar sequence-based predictor, SVM-3D results in an increase of the overall accuracy and AUC by 3%, and correlation coefficient by 0.06. The robustness of this improvement has been tested on different datasets, and in all cases SVM-3D performs better than previously developed methods, even when compared with PolyPhen2, which explicitly takes protein structure information as input. Conclusion: This work demonstrates that structural information can increase the accuracy of disease-related SAP identification. Our results also quantify the magnitude of improvement on a large dataset. This improvement is in agreement with previously observed results, where structure information enhanced the prediction of protein stability changes upon mutation. Although the structural information contained in the Protein Data Bank limits the application and the performance of our structure-based method, we expect that SVM-3D will result in higher accuracy when more structural data become available. © 2011 Capriotti; licensee BioMed Central Ltd
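The AUC quoted above can be read as the probability that a randomly chosen disease-related variant receives a higher classifier score than a randomly chosen neutral one. The sketch below computes it that way by pairwise comparison; the function name, scores, and labels are hypothetical, and this is not the evaluation code used for SVM-3D.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for disease-related, 0 for neutral. Ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores, for illustration only.
print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

The pairwise form is quadratic in the number of examples; it is used here for clarity rather than efficiency.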

    wKinMut: An integrated tool for the analysis and interpretation of mutations in human protein kinases

    BACKGROUND: Protein kinases are involved in relevant physiological functions, and a large number of mutations in this superfamily have been reported in the literature to affect protein function and stability. Unfortunately, exploring the phenotypic consequences of each individual mutation remains a considerable challenge. RESULTS: The wKinMut web-server offers direct prediction of the potential pathogenicity of mutations from a number of methods, including our recently developed prediction method, which combines information from a range of diverse sources: physicochemical properties and functional annotations from FireDB and Swissprot; kinase-specific characteristics such as membership in specific kinase groups, annotation with disease-associated GO terms, or the occurrence of the mutation in PFAM domains; and the relevance of the residues in determining kinase subfamily specificity from S3Det. This predictor yields interesting results that compare favourably with other methods in the field when applied to protein kinases. Together with the predictions, wKinMut offers a number of integrated services for the analysis of mutations. These include the classification of the kinase, information about associations of the kinase with other proteins extracted from iHop, the mapping of the mutations onto PDB structures, pathogenicity records from a number of databases, and the classification of mutations in large-scale cancer studies. Importantly, wKinMut is connected with the SNP2L system, which extracts mentions of mutations directly from the literature and therefore increases the chances of finding interesting functional information associated with the studied mutations. 
CONCLUSIONS: wKinMut facilitates the exploration of the information available about individual mutations by integrating prediction approaches with the automatic extraction of information from the literature (text mining) and several state-of-the-art databases. wKinMut has been used during the last year for the analysis of the consequences of mutations in the context of a number of cancer genome projects, including the recent analysis of Chronic Lymphocytic Leukemia cases and is publicly available at http://wkinmut.bioinfo.cnio.es

    Characterization of pathogenic germline mutations in human Protein Kinases

    Background: Protein Kinases are a superfamily of proteins involved in crucial cellular processes such as cell cycle regulation and signal transduction. Accordingly, they play an important role in cancer biology. To contribute to the study of the relation between kinases and disease, we compared pathogenic mutations to neutral mutations as an extension to our previous analysis of cancer somatic mutations. First, we analyzed native and mutant proteins in terms of amino acid composition. Second, mutations were characterized according to their potential structural effects. Finally, we assessed the location of the different classes of polymorphisms with respect to kinase-relevant positions in terms of subfamily specificity, conservation, accessibility and functional sites. Results: Pathogenic Protein Kinase mutations perturb essential aspects of protein function, including disruption of substrate binding and/or effector recognition at family-specific positions. Interestingly, these mutations in Protein Kinases display a tendency to avoid structurally relevant positions, which represents a significant difference with respect to the average distribution of pathogenic mutations in other protein families. Conclusions: Disease-associated mutations display clear differences with respect to neutral mutations: several amino acids are specific to each mutation type, different structural properties characterize each class, and the distribution of pathogenic mutations within the consensus structure of the Protein Kinase domain is substantially different from that of non-pathogenic mutations. This preferential distribution confirms previous observations about the functional and structural distribution of the controversial cancer driver and passenger somatic mutations and their use as a proxy for the study of the involvement of somatic mutations in cancer development.

    Prediction of Deleterious Non-Synonymous SNPs Based on Protein Interaction Network and Hybrid Properties

    Non-synonymous SNPs (nsSNPs), also known as Single Amino acid Polymorphisms (SAPs), account for the majority of human inherited diseases. It is important to distinguish deleterious SAPs from neutral ones. Most traditional computational methods to classify SAPs are based on sequential or structural features. However, these features cannot fully explain the association between a SAP and the observed pathophysiological phenotype. We believe a better rationale for deleterious SAP prediction is this: if a SAP lies in a protein with important functions and severely changes the protein sequence and structure, it is more likely to be related to disease. We therefore established a method to predict deleterious SAPs based on both the protein interaction network and traditional hybrid properties. Each SAP is represented by 472 features that include sequential features, structural features and network features. The Maximum Relevance Minimum Redundancy (mRMR) method and Incremental Feature Selection (IFS) were applied to obtain the optimal feature set, and the prediction model was the Nearest Neighbor Algorithm (NNA). In jackknife cross-validation, 83.27% of SAPs were correctly predicted when the optimized 263 features were used. The optimized predictor with 263 features was also tested on an independent dataset, where the accuracy was still 80.00%. In contrast, SIFT, a widely used predictor of deleterious SAPs based on sequential features, has a prediction accuracy of 71.05% on the same dataset. In our study, network features were found to be most important for accurate prediction and can significantly improve the prediction performance. Our results suggest that the protein interaction context could provide important clues to help better illustrate a SAP's functional association. This research will facilitate post genome-wide association studies
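The pipeline described above (rank features, then grow the feature set incrementally and score each prefix with a nearest-neighbor classifier under jackknife validation) can be sketched as follows. This is a minimal illustration under stated assumptions: the feature ranking is taken as given (the mRMR step is not implemented), squared Euclidean distance stands in for whatever distance the authors used, and all function names and toy data are hypothetical.

```python
def nearest_neighbor_label(query, others, feature_idx):
    """Label of the closest sample, using only the chosen feature indices."""
    best_label, best_dist = None, float("inf")
    for vec, label in others:
        # Assumption: squared Euclidean distance; the abstract does not say.
        dist = sum((query[i] - vec[i]) ** 2 for i in feature_idx)
        if dist < best_dist:
            best_dist, best_label = dist, label
    return best_label


def jackknife_accuracy(samples, feature_idx):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for k, (vec, label) in enumerate(samples):
        rest = samples[:k] + samples[k + 1:]
        if nearest_neighbor_label(vec, rest, feature_idx) == label:
            correct += 1
    return correct / len(samples)


def incremental_feature_selection(samples, ranked_features):
    """Try the top-1, top-2, ... ranked features; keep the best-scoring prefix."""
    best_n, best_acc = 0, -1.0
    for n in range(1, len(ranked_features) + 1):
        acc = jackknife_accuracy(samples, ranked_features[:n])
        if acc > best_acc:
            best_n, best_acc = n, acc
    return ranked_features[:best_n], best_acc


# Toy data: feature 0 separates the classes, feature 1 is noise.
toy = [([0.0, 5.0], 0), ([0.1, -5.0], 0), ([1.0, 5.1], 1), ([1.1, -5.1], 1)]
selected, acc = incremental_feature_selection(toy, ranked_features=[0, 1])
print(selected, acc)
```

In the toy data, adding the noisy second feature lowers the jackknife accuracy, so only the first-ranked feature is retained. This mirrors how the paper's procedure settles on a subset (263 of 472 features) rather than using all of them.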

    Re-Patterning Sleep Architecture in Drosophila through Gustatory Perception and Nutritional Quality

    Organisms perceive changes in their dietary environment and enact a suite of behavioral and metabolic adaptations that can impact motivational behavior, disease resistance, and longevity. However, the precise nature and mechanism of these dietary responses is not known. We have uncovered a novel link between dietary factors and sleep behavior in Drosophila melanogaster. Dietary sugar rapidly altered sleep behavior by modulating the number of sleep episodes during both the light and dark phase of the circadian period, independent of an intact circadian rhythm and without affecting total sleep, latency to sleep, or waking activity. The effect of sugar on sleep episode number was consistent with a change in arousal threshold for waking. Dietary protein had no significant effect on sleep or wakefulness. Gustatory perception of sugar was necessary and sufficient to increase the number of sleep episodes, and this effect was blocked by activation of bitter-sensing neurons. Further addition of sugar to the diet blocked the effects of sweet gustatory perception through a gustatory-independent mechanism. However, gustatory perception was not required for diet-induced fat accumulation, indicating that sleep and energy storage are mechanistically separable. We propose a two-component model where gustatory and metabolic cues interact to regulate sleep architecture in response to the quantity of sugar available from dietary sources. Reduced arousal threshold in response to low dietary availability may have evolved to provide increased responsiveness to cues associated with alternative nutrient-dense feeding sites. These results provide evidence that gustatory perception can alter arousal thresholds for sleep behavior in response to dietary cues and provide a mechanism by which organisms tune their behavior and physiology to environmental cues

    VLPs and particle strategies for cancer vaccines


    Increased Monocyte Turnover from Bone Marrow Correlates with Severity of SIV Encephalitis and CD163 Levels in Plasma

    Cells of the myeloid lineage are significant targets for human immunodeficiency virus (HIV) in humans and simian immunodeficiency virus (SIV) in monkeys. Monocytes play critical roles in innate and adaptive immunity during inflammation. We hypothesize that specific subsets of monocytes expand with AIDS and drive central nervous system (CNS) disease. Additionally, there may be expansion of cells from the bone marrow through blood with subsequent macrophage accumulation in tissues driving pathogenesis. To identify monocytes that recently emigrated from bone marrow, we used 5-bromo-2′-deoxyuridine (BrdU) labeling in a longitudinal study of SIV-infected CD8+ T lymphocyte depleted macaques. Monocyte expansion and kinetics in blood was assessed and newly migrated monocyte/macrophages were identified within the CNS. Five animals developed rapid AIDS with differing severity of SIVE. The percentages of BrdU+ monocytes in these animals increased dramatically, early after infection, peaking at necropsy where the percentage of BrdU+ monocytes correlated with the severity of SIVE. Early analysis revealed changes in the percentages of BrdU+ monocytes between slow and rapid progressors as early as 8 days and consistently by 27 days post infection. Soluble CD163 (sCD163) in plasma correlated with the percentage of BrdU+ monocytes in blood, demonstrating a relationship between monocyte activation and expansion with disease. BrdU+ monocytes/macrophages were found within perivascular spaces and SIVE lesions. The majority (80–90%) of the BrdU+ cells were Mac387+ that were not productively infected. There was a minor population of CD68+BrdU+ cells (<10%), very few of which were infected (<1% of total BrdU+ cells). Our results suggest that an increased rate of monocyte recruitment from bone marrow into the blood correlates with rapid progression to AIDS, and the magnitude of BrdU+ monocytes correlates with the severity of SIVE

    Chronic Hypoxia Impairs Muscle Function in the Drosophila Model of Duchenne's Muscular Dystrophy (DMD)

    Duchenne's muscular dystrophy (DMD) is a severe progressive myopathy caused by mutations in the DMD gene leading to a deficiency of the dystrophin protein. Due to ongoing muscle necrosis in respiratory muscles late-stage DMD is associated with respiratory insufficiency and chronic hypoxia (CH). To understand the effects of CH on dystrophin-deficient muscle in vivo, we exposed the Drosophila model for DMD (dmDys) to CH during a 16-day ascent to the summit of Mount Denali/McKinley (6194 meters above sea level). Additionally, dmDys and wild type (WT) flies were also exposed to CH in laboratory simulations of high altitude hypoxia. Expression profiling was performed using Affymetrix GeneChips® and validated using qPCR. Hypoxic dmDys differentially expressed 1281 genes, whereas the hypoxic WT flies differentially expressed 56 genes. Interestingly, a number of genes (e.g. heat shock proteins) were discordantly regulated in response to CH between dmDys and WT. We tested the possibility that the disparate molecular responses of dystrophin-deficient tissues to CH could adversely affect muscle by performing functional assays in vivo. Normoxic and CH WT and dmDys flies were challenged with acute hypoxia and time-to-recover determined as well as subjected to climbing tests. Impaired performance was noted for CH-dmDys compared to normoxic dmDys or WT flies (rank order: Normoxic-WT ≈ CH-WT> Normoxic-dmDys> CH-dmDys). These data suggest that dystrophin-deficiency is associated with a disparate, pathological hypoxic stress response(s) and is more sensitive to hypoxia induced muscle dysfunction in vivo. We hypothesize that targeting/correcting the disparate molecular response(s) to hypoxia may offer a novel therapeutic strategy in DMD

    Distinct Type of Transmission Barrier Revealed by Study of Multiple Prion Determinants of Rnq1

    Prions are self-propagating protein conformations. Transmission of the prion state between non-identical proteins, e.g. between homologous proteins from different species, is frequently inefficient. Transmission barriers are attributed to sequence differences in prion proteins, but their underlying mechanisms are not clear. Here we use a yeast Rnq1/[PIN+]-based experimental system to explore the nature of transmission barriers. [PIN+], the prion form of Rnq1, is common in wild and laboratory yeast strains, where it facilitates the appearance of other prions. Rnq1's prion domain carries four discrete QN-rich regions. We start by showing that Rnq1 encompasses multiple prion determinants that can independently drive amyloid formation in vitro and transmit the [PIN+] prion state in vivo. Subsequent analysis of [PIN+] transmission between Rnq1 fragments with different sets of prion determinants established that (i) one common QN-rich region is required and usually sufficient for the transmission; (ii) despite identical sequences of the common QNs, such transmissions are impeded by barriers of different strength. Existence of transmission barriers in the absence of amino acid mismatches in transmitting regions indicates that in complex prion domains multiple prion determinants act cooperatively to attain the final prion conformation, and reveals transmission barriers determined by this cooperative fold