    Pairwise statistical significance of local sequence alignment using multiple parameter sets and empirical justification of parameter set change penalty

    Background: Accurate estimation of statistical significance of a pairwise alignment is an important problem in sequence comparison. Recently, a comparative study of pairwise statistical significance with database statistical significance was conducted. In this paper, we extend the earlier work on pairwise statistical significance by incorporating the use of multiple parameter sets. Results: Results for a knowledge discovery application of homology detection reveal that using multiple parameter sets for pairwise statistical significance estimates gives better coverage than using a single parameter set, at least at some error levels. Further, the results of pairwise statistical significance using multiple parameter sets are shown to be significantly better than database statistical significance estimates reported by BLAST and PSI-BLAST, and comparable to, and at times significantly better than, those of SSEARCH. Using non-zero parameter set change penalty values gives better performance than zero penalty. Conclusion: The fact that the homology detection performance does not degrade when using multiple parameter sets is strong evidence for the validity of the assumption that the alignment score distribution follows an extreme value distribution even when using multiple parameter sets. Parameter set change penalty is a useful parameter for alignment using multiple parameter sets. Pairwise statistical significance using multiple parameter sets can be effectively used to determine the relatedness of one or a few pairs of sequences without performing a time-consuming database search.

    Island method for estimating the statistical significance of profile-profile alignment scores

    Background: In the last decade, a significant improvement in detecting remote similarity between protein sequences has been made by utilizing alignment profiles in place of amino-acid strings. Unfortunately, no analytical theory is available for estimating the significance of a gapped alignment of two profiles. Many experiments suggest that the distribution of local profile-profile alignment scores is of the Gumbel form. However, estimating distribution parameters by random simulations turns out to be computationally very expensive. Results: We demonstrate that the background distribution of profile-profile alignment scores heavily depends on profiles' composition and thus the distribution parameters must be estimated independently, for each pair of profiles of interest. We also show that accurate estimates of statistical parameters can be obtained using the "island statistics" for profile-profile alignments. Conclusion: The island statistics can be generalized to profile-profile alignments to provide an efficient method for the alignment score normalization. Since multiple island scores can be extracted from a single comparison of two profiles, the island method has a clear speed advantage over the direct shuffling method for comparable accuracy in parameter estimates.

    A Probabilistic Model of Local Sequence Alignment That Simplifies Statistical Significance Estimation

    Sequence database searches require accurate estimation of the statistical significance of scores. Optimal local sequence alignment scores follow Gumbel distributions, but determining an important parameter of the distribution (λ) requires time-consuming computational simulation. Moreover, optimal alignment scores are less powerful than probabilistic scores that integrate over alignment uncertainty (“Forward” scores), but the expected distribution of Forward scores remains unknown. Here, I conjecture that both expected score distributions have simple, predictable forms when full probabilistic modeling methods are used. For a probabilistic model of local sequence alignment, optimal alignment bit scores (“Viterbi” scores) are Gumbel-distributed with constant λ = log 2, and the high scoring tail of Forward scores is exponential with the same constant λ. Simulation studies support these conjectures over a wide range of profile/sequence comparisons, using 9,318 profile-hidden Markov models from the Pfam database. This enables efficient and accurate determination of expectation values (E-values) for both Viterbi and Forward scores for probabilistic local alignments
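
The score distributions conjectured in the abstract above can be turned into P-values and E-values with a few lines of arithmetic. The following is a minimal sketch, not the paper's implementation: it assumes bit scores, and the location parameters `mu` and `tau` and the function names are hypothetical placeholders that would have to be calibrated for each model, since the abstract gives only the slope λ = log 2.

```python
import math

# Slope conjectured in the abstract for bit scores: lambda = log 2.
LAMBDA = math.log(2)


def viterbi_pvalue(score, mu):
    """P(S > score) for a Gumbel distribution with slope LAMBDA.

    `mu` is a hypothetical location parameter (an assumption here;
    it must be calibrated separately and is not given in the abstract).
    """
    # 1 - exp(-exp(-lambda * (s - mu))), computed stably with expm1.
    return -math.expm1(-math.exp(-LAMBDA * (score - mu)))


def forward_tail_pvalue(score, tau):
    """P(S > score) for an exponential tail with slope LAMBDA.

    `tau` is a hypothetical tail offset (an assumption). The formula
    describes only the high-scoring tail, so values are capped at 1.
    """
    return min(1.0, math.exp(-LAMBDA * (score - tau)))


def evalue(pvalue, n_comparisons):
    """Expected number of chance hits at this P-value in n comparisons."""
    return pvalue * n_comparisons
```

With λ = log 2, each additional bit of score halves the exponential tail probability, which is what makes the constant slope convenient for bit scores.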

    Genetic variation in a member of the laminin gene family affects variation in body composition in Drosophila and humans

    Background: The objective of the present study was to map candidate loci influencing naturally occurring variation in triacylglycerol (TAG) storage using quantitative complementation procedures in Drosophila melanogaster. Based on our results from Drosophila, we performed a human population-based association study to investigate the effect of natural variation in LAMA5 gene on body composition in humans. Results: We identified four candidate genes that contributed to differences in TAG storage between two strains of D. melanogaster, including Laminin A (LanA), which is a member of the α subfamily of laminin chains. We confirmed the effects of this gene using a viable LanA mutant and showed that female flies homozygous for the mutation had significantly lower TAG storage, body weight, and total protein content than control flies. Drosophila LanA is closely related to human LAMA5 gene, which maps to the well-replicated obesity-linkage region on chromosome 20q13.2-q13.3. We tested for association between three common single nucleotide polymorphisms (SNPs) in the human LAMA5 gene and variation in body composition and lipid profile traits in a cohort of unrelated women of European American (EA) and African American (AA) descent. In both ethnic groups, we found that SNP rs659822 was associated with weight (EA: P = 0.008; AA: P = 0.05) and lean mass (EA: P = 0.003; AA: P = 0.03). We also found this SNP to be associated with height (P = 0.01), total fat mass (P = 0.01), and HDL-cholesterol (P = 0.003) but only in EA women. Finally, significant associations of SNP rs944895 with serum TAG levels (P = 0.02) and HDL-cholesterol (P = 0.03) were observed in AA women. Conclusion: Our results suggest an evolutionarily conserved role of a member of the laminin gene family in contributing to variation in weight and body composition.

    Prevalence of sexual dimorphism in mammalian phenotypic traits

    The role of sex in biomedical studies has often been overlooked, despite evidence of sexually dimorphic effects in some biological studies. Here, we used high-throughput phenotype data from 14,250 wildtype and 40,192 mutant mice (representing 2,186 knockout lines), analysed for up to 234 traits, and found a large proportion of mammalian traits both in wildtype and mutants are influenced by sex. This result has implications for interpreting disease phenotypes in animal models and humans

    Vegetation fire smoke, indigenous status and cardio-respiratory hospital admissions in Darwin, Australia, 1996–2005: a time-series study

    Background: Air pollution in Darwin, Northern Australia, is dominated by smoke from seasonal fires in the surrounding savanna that burn during the dry season from April to November. Our aim was to study the association between particulate matter less than or equal to 10 microns diameter (PM10) and daily emergency hospital admissions for cardio-respiratory diseases for each fire season from 1996 to 2005. We also investigated whether the relationship differed in indigenous Australians, a disadvantaged population sub-group. Methods: Daily PM10 exposure levels were estimated for the population of the city from visibility data using a previously validated model. We used over-dispersed Poisson generalized linear models with parametric smoothing functions for time and meteorology to examine the association between admissions and PM10 up to three days prior. An interaction between indigenous status and PM10 was included to examine differences in the impact on indigenous people. Results: We found both positive and negative associations and our estimates had wide confidence intervals. There were generally positive associations between respiratory disease and PM10 but not with cardiovascular disease. An increase of 10 μg/m³ in same-day estimated ambient PM10 was associated with a 4.81% (95% CI: -1.04%, 11.01%) increase in total respiratory admissions. When the interaction between indigenous status and PM10 was assessed, a statistically different association was found between PM10 and admissions three days later for respiratory infections of indigenous people (15.02%; 95% CI: 3.73%, 27.54%) than for non-indigenous people (0.67%; 95% CI: -7.55%, 9.61%). There were generally negative estimates for cardiovascular conditions. For non-indigenous admissions, the estimated association between same-day ambient PM10 and total cardiovascular admissions was -3.43% (95% CI: -9.00%, 2.49%) and the estimate for indigenous admissions was -3.78% (95% CI: -13.4%, 6.91%), although ambient PM10 did have positive (non-significant) associations with cardiovascular admissions of indigenous people two and three days later. Conclusion: We observed positive associations between vegetation fire smoke and daily hospital admissions for respiratory diseases that were stronger in indigenous people. While this study was limited by the use of estimated rather than measured exposure data, the results are consistent with the currently small evidence base concerning this source of air pollution.

    Adenosine A2A receptors: localization and function

    Adenosine is an endogenous purine nucleoside, present in all mammalian tissues, that originates from the breakdown of ATP. By binding to its four receptor subtypes (A1, A2A, A2B, and A3), adenosine regulates several important physiological functions at both the central and peripheral levels. Therefore, ligands for the different adenosine receptors are attracting increasing attention as new potential drugs to be used in the treatment of several diseases. This chapter is aimed at providing an overview of adenosine metabolism, adenosine receptor localization, and the receptors' signal transduction pathways. Particular attention will be paid to the biochemistry and pharmacology of A2A receptors, since antagonists of these receptors have emerged as promising new drugs for the treatment of Parkinson's disease. The interactions of A2A receptors with other nonadenosinergic receptors and the effects of the pharmacological manipulation of A2A receptors on different body organs will be discussed, together with the usefulness of A2A receptor antagonists for the treatment of Parkinson's disease and the potential adverse effects of these drugs.