
    SATCHMO-JS: a webserver for simultaneous protein multiple sequence alignment and phylogenetic tree construction.

    We present the jump-start simultaneous alignment and tree construction using hidden Markov models (SATCHMO-JS) web server for simultaneous estimation of protein multiple sequence alignments (MSAs) and phylogenetic trees. The server takes as input a set of sequences in FASTA format and outputs a phylogenetic tree and an MSA; these can be viewed online or downloaded from the website. SATCHMO-JS is an extension of the SATCHMO algorithm and employs a divide-and-conquer strategy to jump-start SATCHMO at a higher point in the phylogenetic tree, reducing the computational complexity of the progressive all-versus-all HMM-HMM scoring and alignment. Results on 983 structurally aligned pairs from the PREFAB benchmark dataset show that SATCHMO-JS provides a statistically significant improvement in alignment accuracy over MUSCLE, Multiple Alignment using Fast Fourier Transform (MAFFT), ClustalW and the original SATCHMO algorithm. The SATCHMO-JS web server is available at http://phylogenomics.berkeley.edu/satchmo-js. The datasets used in these experiments are available for download at http://phylogenomics.berkeley.edu/satchmo-js/supplementary/
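
As an illustration of the FASTA input mentioned in the abstract above, the following is a minimal sketch of reading a set of sequences from FASTA text. It is not part of SATCHMO-JS; the function name `parse_fasta` and the example sequences are hypothetical and chosen only for illustration.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A '>' line starts a new record; the rest of the line is the header.
            header = line[1:].strip()
            if header in records:
                raise ValueError(f"duplicate header: {header!r}")
            records[header] = []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            # Sequence data may be wrapped over several lines.
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical two-sequence input, for illustration only.
example = """>seqA hypothetical protein
MKVLA
GTW
>seqB
MKILAGSW
"""
sequences = parse_fasta(example)
print(sequences)
```

The sketch joins wrapped sequence lines and rejects duplicate headers; a production pipeline would more likely rely on an established parser.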

    Disease Sequences High-Accuracy Alignment Based on the Precision Medicine


    Phylogenetic assessment of alignments reveals neglected tree signal in gaps

    Tree-based tests of alignment methods enable the evaluation of the effect of gap placement on the inference of phylogenetic relationships

    Molecular evolution of RRM-containing proteins and glycine-rich RNA-binding proteins in plants

    *Abstract*

*Background:*
In angiosperms, RNA-binding proteins with an RNA recognition motif (RRM)-type RNA interaction domain play an important role in developmental and environmental responses. Despite their pivotal role, a comprehensive analysis of their number and diversity has only been performed in _Arabidopsis_ so far.

*Results:*
Here we present a detailed phylogenetic analysis of RRM-containing proteins in plants, the red alga _Cyanidioschyzon merolae_ and cyanobacteria. We identified two major events during the diversification of the RRM in plants, one at the emergence of green plants, and the other at the water-to-land transition. We focused on proteins that combine a single RRM with a glycine-rich stretch, known as glycine-rich RNA-binding proteins (GRPs). We found that GRPs are present in cyanobacteria; however, plant and cyanobacterial GRPs are not of monophyletic origin. We provide evidence that plant GRPs form a polyphyletic group.
 
*Conclusion:*
Our work provides insights into the origin of GRPs in plants. We determined that the RRMs from plants and cyanobacteria do not have a common origin. We could also determine that the acquisition of the glycine-rich stretch has happened on at least three separate occasions during the evolution of GRPs. One event led to the emergence of cyanobacterial GRPs, while later acquisition events led to the emergence of GRPs in the green lineage. No GRPs were found in red or marine green algae. We found a subgroup of GRPs exclusive to land plants, and its appearance may be linked to challenges related to the water-to-land transition.

    Assessing Multiple Sequence Alignments Using Visual Tools

    Bioinformatics and molecular evolutionary analyses most often start with comparing DNA or amino acid sequences by aligning them. Pairwise alignment, for example, is used to measure the similarities between a query sequence and each of those in a database in BLAST similarity search, the most used bioinformatics tool (Altschul et al., 1990; Camacho et al.

    MTRAP: Pairwise sequence alignment algorithm by a new measure based on transition probability between two consecutive pairs of residues

    BACKGROUND: Sequence alignment is one of the most important techniques for analyzing biological systems. Current alignment methods are nevertheless imperfect, and more accurate methods are still needed. In particular, alignment of homologous sequences with low sequence similarity is not yet satisfactory. Recent methods for aligning protein sequences typically use an empirically determined measure. For example, a measure is usually defined by combining two quantities: (1) the sum of substitution scores between two residue segments and (2) the sum of gap penalties in insertion/deletion regions. Such a measure assumes that there is no intersite correlation in the sequences. In this paper, we improve the alignment by taking the correlation of consecutive residues into account. RESULTS: We introduce a new alignment method, called MTRAP, based on a metric defined on compound systems of two sequences. In benchmark tests on PREFAB 4.0 and HOMSTRAD, our pairwise alignment method gives higher accuracy than other methods such as ClustalW2, TCoffee and MAFFT. For sequences with less than 15% sequence identity in particular, our method improves alignment accuracy significantly. Moreover, we show that our algorithm works well with consistency-based progressive multiple alignment by modifying TCoffee to use our measure. CONCLUSIONS: Our method leads to a significant increase in alignment accuracy compared with other methods. The improvement is especially clear in the low-identity range. The source code is available at our web page, whose address is found in the section "Availability and requirements".
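
The conventional measure described in the abstract above, a sum of substitution terms plus a sum of gap penalties, can be sketched as follows. This is not the MTRAP measure: it scores each column independently, which is exactly the no-intersite-correlation assumption the abstract says MTRAP relaxes. The function names, the toy match/mismatch scores, and the affine gap parameters (`gap_open`, `gap_extend`) are assumptions made for illustration.

```python
def toy_subst(x, y):
    """Hypothetical substitution score: +2 for identical residues, -1 otherwise."""
    return 2.0 if x == y else -1.0


def alignment_score(row_a, row_b, subst=toy_subst, gap_open=10.0, gap_extend=0.5):
    """Score a gapped pairwise alignment column by column.

    Assumed convention: the first position of a gap costs gap_open and each
    further position of the same gap costs gap_extend.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    score = 0.0
    in_gap = None  # which row is currently inside a gap: "a", "b" or None
    for x, y in zip(row_a, row_b):
        if x == "-" and y == "-":
            continue  # a column of two gaps carries no information
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            score -= gap_extend if in_gap == side else gap_open
            in_gap = side
        else:
            score += subst(x, y)  # (1) substitution term for this column
            in_gap = None
    return score


# Hypothetical alignment: four matches, one single-position gap, one gap of length two.
print(alignment_score("ACDE--G", "AC-EFFG"))  # prints -12.5
```

Because each aligned column contributes on its own, swapping two non-gap columns leaves the score unchanged; a measure that depends on consecutive residue pairs would not have this property.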