243 research outputs found

    Identification of deleterious non-synonymous single nucleotide polymorphisms using sequence-derived information

    Background: As the number of non-synonymous single nucleotide polymorphisms (nsSNPs), also known as single amino acid polymorphisms (SAPs), increases rapidly, computational methods that can distinguish disease-causing SAPs from neutral SAPs are needed. Many methods have been developed to identify disease-causing SAPs based on both structural and sequence features of the mutation point. One limitation of these methods is that they cannot be applied when protein structures are not available. In this study, we explore the feasibility of classifying SAPs into disease-causing and neutral mutations using only information derived from protein sequence. Results: We compiled a set of 686 features derived from protein sequence. For each feature, the distance between the wild-type residue and the mutant-type residue was computed. A greedy approach was then used to select the features that were useful for the classification of SAPs, and ten features were selected. Using the selected features, a decision tree method achieves 82.6% overall accuracy with a Matthews Correlation Coefficient (MCC) of 0.607 in cross-validation. When tested on an independent set that was not seen by the method during training and feature selection, the decision tree method achieves 82.6% overall accuracy with an MCC of 0.604. We also evaluated the proposed method on all SAPs obtained from Swiss-Prot, where it achieves an MCC of 0.42 with 73.2% overall accuracy. This method allows users to make reliable predictions when protein structures are not available. Unlike previous studies, in which only a small set of arbitrarily chosen features was considered, here we used an automated method to systematically discover useful features from a large set of features well annotated in public databases. Conclusion: The proposed method is a useful tool for the classification of SAPs, especially when the structure of the protein is not available.
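The abstract above reports both overall accuracy and MCC. As a rough illustration of how the two relate, here is a minimal sketch that computes both from the four cells of a binary confusion matrix. The counts are hypothetical and are not taken from the paper; the function names are illustrative only.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from true/false positive and negative counts.

    Returns 0.0 when any marginal total is zero (the usual convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts (assumption): disease-causing SAPs are the positive class.
tp, tn, fp, fn = 80, 85, 15, 20
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy={acc:.3f}, MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the two classes differ in size.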

    Exhaustive prediction of disease susceptibility to coding base changes in the human genome

    Background: Single Nucleotide Polymorphisms (SNPs) are the most abundant form of genomic variation and can cause phenotypic differences between individuals, including diseases. Bases are subject to various levels of selection pressure, reflected in their inter-species conservation. Results: We propose a method that does not depend on transcription information to score each coding base in the human genome, reflecting the disease probability associated with its mutation. Twelve factors likely to be associated with disease alleles were chosen as the input for a support vector machine prediction algorithm. The analysis yielded 83% sensitivity and 84% specificity in separating disease-like alleles, as found in the Human Gene Mutation Database, from non-disease-like alleles, as found in the Database of Single Nucleotide Polymorphisms. This algorithm was subsequently applied to each base within all known human genes, exhaustively confirming that interspecies conservation is the strongest factor for disease association. For each gene, the length-normalized average disease potential score was calculated. Of the 30 genes with the highest scores, 21 are directly associated with a disease. In contrast, of the 30 genes with the lowest scores, only one is associated with a disease in the published literature. The results strongly suggest that the highest-scoring genes are enriched for those that might contribute to disease if mutated. Conclusion: This method provides valuable information to help researchers identify sensitive positions in genes that have a high disease probability, enabling them to optimize experimental designs and interpret data emerging from genetic and epidemiological studies.

    Genome-Wide Analysis of Human Disease Alleles Reveals That Their Locations Are Correlated in Paralogous Proteins

    The millions of mutations and polymorphisms that occur in human populations are potential predictors of disease, of our reactions to drugs, of predisposition to microbial infections, and of age-related conditions such as impaired brain and cardiovascular functions. However, predicting the phenotypic consequences and eventual clinical significance of a sequence variant is not an easy task. Computational approaches have found perturbation of conserved amino acids to be a useful criterion for identifying variants likely to have phenotypic consequences. To our knowledge, however, no study to date has explored the potential of variants that occur at homologous positions within paralogous human proteins as a means of identifying polymorphisms with likely phenotypic consequences. To investigate the potential of this approach, we have assembled a unique collection of known disease-causing variants from OMIM and the Human Gene Mutation Database (HGMD) and used them to identify and characterize pairs of sequence variants that occur at homologous positions within paralogous human proteins. Our analyses demonstrate that the locations of variants are correlated in paralogous proteins. Moreover, if one member of a variant-pair is disease-causing, its partner is likely to be disease-causing as well. Thus, information about variant-pairs can be used to identify potentially disease-causing variants, extend existing procedures for polymorphism prioritization, and provide a suite of candidates for further diagnostic and therapeutic purposes.

    Characteristics of transposable element exonization within human and mouse

    Insertion of transposed elements within mammalian genes is thought to be an important contributor to mammalian evolution and speciation. Insertion of transposed elements into introns can lead to their activation as alternatively spliced cassette exons, an event called exonization. Elucidation of the evolutionary constraints that have shaped fixation of transposed elements within human and mouse protein coding genes and subsequent exonization is important for understanding how the exonization process has affected transcriptome and proteome complexities. Here we show that exonization of transposed elements is biased towards the beginning of the coding sequence in both human and mouse genes. Analysis of single nucleotide polymorphisms (SNPs) revealed that exonization of transposed elements can be population-specific, implying that exonizations may enhance divergence and lead to speciation. SNP density analysis revealed differences between Alu and other transposed elements. Finally, we identified cases of primate-specific Alu elements that depend on RNA editing for their exonization. These results shed light on TE fixation and the exonization process within human and mouse genes. Comment: 11 pages, 4 figures

    Apolipoprotein M Gene (APOM) Polymorphism Modifies Metabolic and Disease Traits in Type 2 Diabetes

    This study aimed to substantiate the associations of the apolipoprotein M gene (APOM) with type 2 diabetes (T2D) as well as with metabolic traits in Hong Kong Chinese. In addition, APOM gene function was further characterized to elucidate its activity in cholesterol metabolism. Seventeen APOM SNPs documented in the NCBI database were genotyped. Five SNPs were confirmed in our study cohort of 1234 T2D and 606 control participants. Three of the five SNPs, rs707921(C+1871A), rs707922(G+1837T) and rs805264(G+203A), were in linkage disequilibrium (LD). We chose rs707922 to tag this LD region for downstream association analyses and characterized the function of this SNP at the molecular level. No association between APOM and T2D susceptibility was detected in our Hong Kong Chinese cohort. Interestingly, the C allele of rs805297 was significantly associated with T2D duration of longer than 10 years (OR = 1.245, p = 0.015). The rs707922 TT genotype was significantly associated with elevated plasma total and LDL cholesterol levels (p = 0.006 and p = 0.009, respectively) in T2D patients. Molecular analyses of rs707922 led to the discovery of a novel transcript, APOM5, as well as the cryptic nature of exon 5 of the gene. Ectopic expression of the APOM5 transcript confirmed rs707922 allele-dependent activity of the transcript in modifying cholesterol homeostasis in vitro. In conclusion, the results here did not support APOM as a T2D susceptibility gene in Hong Kong Chinese. However, in T2D patients, a subset of APOM SNPs was associated with disease duration and metabolic traits. Further molecular analysis proved the functional activity of rs707922 in APOM expression and in regulation of cellular cholesterol content.

    Rhinovirus Genome Variation during Chronic Upper and Lower Respiratory Tract Infections

    Routine screening of lung transplant recipients and hospital patients for respiratory virus infections allowed the identification of human rhinovirus (HRV) in the upper and lower respiratory tracts, including in immunocompromised hosts chronically infected with the same strain over weeks or months. Phylogenetic analysis of 144 HRV-positive samples showed no apparent correlation between a given viral genotype or species and its ability to invade the lower respiratory tract or lead to protracted infection. By contrast, protracted infections were found almost exclusively in immunocompromised patients, suggesting that host factors, in particular the immune response, rather than the virus genotype modulate disease outcome. Complete genome sequencing of five chronic cases to study rhinovirus genome adaptation showed that the calculated mutation frequency was in the range observed during acute human infections. Analysis of mutation hot spot regions between specimens collected at different times or in different body sites revealed that non-synonymous changes were mostly concentrated in the viral capsid genes VP1, VP2 and VP3, independent of the HRV type. In an immunosuppressed lung transplant recipient infected with the same HRV strain for more than two years, both classical and ultra-deep sequencing of samples collected at different time points in the upper and lower respiratory tracts showed that these virus populations were phylogenetically indistinguishable over the course of infection, except for the last month. Specific signatures were found in the last two lower respiratory tract populations, including changes in the 5′UTR polypyrimidine tract and the VP2 immunogenic site 2. These results highlight for the first time the ability of a given rhinovirus to evolve in the course of a natural infection in immunocompromised patients and complement data obtained from previous experimental inoculation studies in immunocompetent volunteers.

    Building Babies - Chapter 16

    In contrast to birds, male mammals rarely help to raise the offspring. Of all mammals, only among rodents, carnivores, and primates are males sometimes intensively engaged in providing infant care (Kleiman and Malcolm 1981). Male caretaking of infants has long been recognized in nonhuman primates (Itani 1959). Given that infant care behavior can have a positive effect on the infant’s development, growth, well-being, or survival, why are male mammals not more frequently involved in “building babies”? We begin the chapter by defining a few relevant terms and introducing the theory and hypotheses that have historically addressed the evolution of paternal care. We then review empirical findings on male care among primate taxa, before focusing, in the final section, on our own work on paternal care in South American owl monkeys (Aotus spp.). We conclude the chapter with some suggestions for future studies. Funding: Deutsche Forschungsgemeinschaft (HU 1746/2-1), Wenner-Gren Foundation, the L.S.B. Leakey Foundation, the National Geographic Society, the National Science Foundation (BCS-0621020), the University of Pennsylvania Research Foundation, the Zoological Society of San Diego

    Predicting disease-associated substitution of a single amino acid by analyzing residue interactions

    Background: The rapid accumulation of data on non-synonymous single nucleotide polymorphisms (nsSNPs, also called SAPs) should allow us to further our understanding of the underlying disease-associated mechanisms. Here, we use complex networks to study the role of an amino acid in both local and global structures and determine the extent to which disease-associated and polymorphic SAPs differ in terms of their interactions with other residues. Results: We found that SAPs can be well characterized by network topological features. Mutations are probably disease-associated when they occur at a site with a high centrality value and/or high degree value in a protein structure network. We also discovered that studying the neighboring residues around a mutation site can help to determine whether the mutation is disease-related or not. We compiled a dataset from the Swiss-Prot variant pages and constructed a model to predict disease-associated SAPs based on the random forest algorithm. The total accuracy and MCC were 83.0% and 0.64, respectively, as determined by 5-fold cross-validation. With an independent dataset, our model achieved a total accuracy of 80.8% and an MCC of 0.59. Conclusions: The satisfactory performance suggests that network topological features can be used as quantitative measures of the importance of a site on a protein, and this approach can complement existing methods for prediction of disease-associated SAPs. Moreover, the use of this method in SAP studies would help to determine the underlying linkage between SAPs and diseases through extensive investigation of mutual interactions between residues.
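The abstract above relates disease association to the degree and centrality of a residue in a protein structure network. The sketch below shows one simple way such a value could be computed, using normalized degree centrality on an undirected contact graph. The contact list is hypothetical, and the abstract does not say how the authors built their network or which centrality measures they used, so this is an illustration under those assumptions rather than their method.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected graph.

    `edges` is an iterable of (residue_a, residue_b) contact pairs.
    Each node's degree is divided by (n - 1), where n is the node count.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-contacts
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}


# Hypothetical residue contacts (assumption): pairs of residue indices
# considered to be in contact in a solved structure.
contacts = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]
centrality = degree_centrality(contacts)
most_connected = max(centrality, key=centrality.get)
print(most_connected, centrality[most_connected])
```

In this toy graph, residue 1 touches three of the four other residues, so a substitution there would sit at the most connected site.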

    An integrated approach to the interpretation of Single Amino Acid Polymorphisms within the framework of CATH and Gene3D

    Background: The phenotypic effects of sequence variations in protein-coding regions come about primarily via their effects on the resulting structures, for example by disrupting active sites or affecting structural stability. To better understand the mechanisms behind known mutant phenotypes, and to predict the effects of novel variations, biologists need tools to gauge the impacts of DNA mutations in terms of their structural manifestation. Although many mutations occur within domains whose structure has been solved, many more occur within genes whose protein products have not been structurally characterized. Results: Here we present 3DSim (3D Structural Implication of Mutations), a database and web application facilitating the localization and visualization of single amino acid polymorphisms (SAAPs) mapped to protein structures even where the structure of the protein of interest is unknown. The server displays information on 6514 point mutations, 4865 of them known to be associated with disease. These polymorphisms are drawn from SAAPdb, which aggregates data from various sources including dbSNP and several pathogenic mutation databases. While the SAAPdb interface displays mutations on known structures, 3DSim projects mutations onto known sequence domains in Gene3D. This resource contains sequences annotated with domains predicted to belong to structural families in the CATH database. Mappings between domain sequences in Gene3D and known structures in CATH are obtained using a MUSCLE alignment. 1210 three-dimensional structures corresponding to CATH structural domains are currently included in 3DSim; these domains are distributed across 396 CATH superfamilies, and provide a comprehensive overview of the distribution of mutations in structural space. Conclusion: The server is publicly available at http://3DSim.bioinfo.cnio.es/. In addition, the database containing the mapping between SAAPdb, Gene3D and CATH is available on request and most of the functionality is available through programmatic web service access.

    Malarial Hemozoin Activates the NLRP3 Inflammasome through Lyn and Syk Kinases

    The intraerythrocytic parasite Plasmodium, the causative agent of malaria, produces an inorganic crystal called hemozoin (Hz) during the heme detoxification process, which is released into the circulation during erythrocyte lysis. Hz is rapidly ingested by phagocytes and induces the production of several pro-inflammatory mediators such as interleukin-1β (IL-1β). However, the mechanism regulating Hz recognition and IL-1β maturation has not been identified. Here, we show that Hz induces IL-1β production. Using knockout mice, we showed that Hz-induced IL-1β and inflammation are dependent on NOD-like receptor containing pyrin domain 3 (NLRP3), ASC and caspase-1, but not NLRC4 (NLR containing CARD domain). Furthermore, the absence of NLRP3 or IL-1β augmented survival to malaria caused by P. chabaudi adami DS. Although much has been discovered regarding the NLRP3 inflammasome induction, the mechanism whereby this intracellular multimolecular complex is activated remains unclear. We further demonstrate, using pharmacological and genetic intervention, that the tyrosine kinases Syk and Lyn play a critical role in activation of this inflammasome. These findings not only identify one way by which the immune system is alerted to malarial infection but also are one of the first to suggest a role for tyrosine kinase signaling pathways in regulation of the NLRP3 inflammasome.