46 research outputs found

    Comparison study on k-word statistical measures for protein: From sequence to 'sequence space'

    Background: Many proposed statistical measures can efficiently compare protein sequences in order to infer protein structure, function and evolutionary information. They share the same idea of using k-word frequencies of protein sequences. Given a protein sequence, the information on its related protein sequences has not been used for protein sequence comparison until now. This paper proposes a scheme to construct a protein 'sequence space' associated with the protein sequences related to the given protein, and compares the performance of statistical measures with and without the information in this 'sequence space'. This paper also presents two statistical measures for protein: gre.k (generalized relative entropy) and gsm.k (gapped similarity measure). Results: We tested statistical measures with and without protein 'sequence space' on three data sets. This not only offers a systematic and quantitative experimental assessment of these statistical measures, but also naturally complements the available comparisons of statistical measures based on protein sequence. Moreover, we compared our statistical measures with alignment-based measures and the existing statistical measures. The experiments were grouped into two sets. The first, performed via ROC (receiver operating characteristic) analysis, aims at assessing the intrinsic ability of the statistical measures to discriminate and classify protein sequences. The second set of experiments aims at assessing how well our measure does in phylogenetic analysis. Based on the experiments, several conclusions can be drawn and, from them, novel and valuable guidelines for the use of protein 'sequence space' and statistical measures were obtained. Conclusion: Alignment-based measures have a clear advantage when the data is highly redundant. Among the statistical measures, the novel gsm.k introduced in this article is the most efficient, followed by cos.k. When the data becomes less redundant, the gre.k measure proposed here achieves a better performance, but all the other measures perform poorly on classification tasks. Almost all the statistical measures improve by exploring the information on 'sequence space' as word length increases, especially for less redundant data. The reasonable results of phylogenetic analysis confirm that Gdis.k based on 'sequence space' is a reliable measure for phylogenetic analysis. In summary, our quantitative analysis verifies that exploring the information on 'sequence space' is a promising way to improve the abilities of statistical measures for protein comparison.
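    To make the idea of comparing sequences by k-word frequencies concrete, here is a minimal sketch of a cosine similarity between overlapping k-word count vectors. The abstract names cos.k but does not define it, so this formulation is an assumption and may differ from the paper's definition; the function names `kword_counts` and `cosine_similarity` are hypothetical, and the sketch makes no use of 'sequence space'.

```python
from collections import Counter
from math import sqrt


def kword_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-word (k-mer) in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: str, b: str, k: int = 2) -> float:
    """Cosine of the angle between the k-word count vectors of a and b.

    Assumed formulation for illustration only; not taken from the paper.
    """
    ca, cb = kword_counts(a, k), kword_counts(b, k)
    # Only k-words present in both sequences contribute to the dot product.
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    # A sequence shorter than k has no k-words; treat that as zero similarity.
    return dot / norm if norm else 0.0
```

    Sequences with identical k-word composition score close to 1, and sequences sharing no k-word score 0. No alignment is computed at any point, which is the property that distinguishes such measures from the alignment-based ones mentioned in the abstract.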

    Discovery of permuted and recently split transfer RNAs in Archaea

    Background: As in eukaryotes, precursor transfer RNAs in Archaea often contain introns that are removed in tRNA maturation. Two unrelated archaeal species display unique pre-tRNA processing complexity in the form of split tRNA genes, in which two to three segments of tRNAs are transcribed from different loci, then trans-spliced to form a mature tRNA. Another rare type of pre-tRNA, found only in eukaryotic algae, is permuted, where the 3′ half is encoded upstream of the 5′ half, and must be processed to be functional. Results: Using an improved version of the gene-finding program tRNAscan-SE, comparative analyses and experimental verifications, we have now identified four novel trans-spliced tRNA genes, each in a different species of the Desulfurococcales branch of the Archaea: tRNA Asp(GUC) in Aeropyrum pernix and Thermosphaera aggregans, and tRNA Lys(CUU) in Staphylothermus hellenicus and Staphylothermus marinus. Each of these includes features surprisingly similar to previously studied split tRNAs, yet comparative genomic context analysis and phylogenetic distribution suggest several independent, relatively recent splitting events. Additionally, we identified the first examples of permuted tRNA genes in Archaea: tRNA iMet(CAU) and tRNA Tyr(GUA) in Thermofilum pendens, which appear to be permuted in the same arrangement seen previously in red alga. Conclusions: Our findings illustrate that split tRNAs are sporadically spread across a major branch of the Archaea

    The Cyst-Dividing Bacterium Ramlibacter tataouinensis TTB310 Genome Reveals a Well-Stocked Toolbox for Adaptation to a Desert Environment

    Ramlibacter tataouinensis TTB310T (strain TTB310), a betaproteobacterium isolated from a semi-arid region of South Tunisia (Tataouine), is characterized by the presence of both spherical and rod-shaped cells in pure culture. Cell division of strain TTB310 occurs by the binary fission of spherical “cyst-like” cells (“cyst-cyst” division). The rod-shaped cells formed at the periphery of a colony (consisting mainly of cysts) are highly motile and colonize a new environment, where they form a new colony by reversion to cyst-like cells. This unique cell cycle of strain TTB310, with desiccation-tolerant cyst-like cells capable of division and desiccation-sensitive motile rods capable of dissemination, appears to be a novel adaptation for life in a hot and dry desert environment. In order to gain insights into strain TTB310's underlying genetic repertoire and possible mechanisms responsible for its unusual lifestyle, the genome of strain TTB310 was completely sequenced and subsequently annotated. The complete genome consists of a single circular chromosome of 4,070,194 bp with an average G+C content of 70.0%, the highest among the Betaproteobacteria sequenced to date, with a total of 3,899 predicted coding sequences covering 92% of the genome. We found that strain TTB310 has developed a highly complex network of two-component systems, which may utilize responses to light and perhaps a rudimentary circadian hourglass to anticipate water availability at the dew time in the middle/end of the desert winter nights and thus direct the growth window to cyclic water availability times. Other interesting features of the strain TTB310 genome that appear to be important for desiccation tolerance, including intermediary metabolism compounds such as trehalose or polyhydroxyalkanoate, and signal transduction pathways, are presented and discussed

    Indiferença, simetria e perfeição segundo Leibniz

    In the Theodicy, Leibniz presents three different solutions to the sophism of Buridan's Ass, and more generally, to the problem of the liberty of indifference. The first criticizes the idea that, even in a situation of perfect equilibrium and complete absence of a determining reason, men (as opposed to irrational animals) would be able to act. The other two deny the very possibility of perfect equilibrium and symmetry in the universe, such that this sophism loses its meaning