44 research outputs found

    PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information

    Get PDF
    PRofile ALIgNEment (PRALINE) is a fully customizable multiple sequence alignment application. In addition to a number of available alignment strategies, PRALINE can integrate information from database homology searches to generate a homology-extended multiple alignment. PRALINE also provides a choice of seven different secondary structure prediction programs that can be used individually or in combination as a consensus for integrating structural information into the alignment process. The program can be used through two separate interfaces: one has been designed to cater to more advanced needs of researchers in the field, and the other for standard construction of high confidence alignments. The web-based output is designed to facilitate the comprehensive visualization of the generated alignments by means of five default colour schemes based on: residue type, position conservation, position reliability, residue hydrophobicity and secondary structure, depending on the options set. A user can also define a custom colour scheme by selecting which colour will represent one or more amino acids in the alignment. All generated alignments are also made available in the PDF format for easy figure generation for publications. The grouping of sequences, on which the alignment is based, can also be visualized as a dendrogram. PRALINE is available at

    The database of experimentally supported targets: a functional update of TarBase

    Get PDF
    TarBase5.0 is a database which houses a manually curated collection of experimentally supported microRNA (miRNA) targets in several animal species of central scientific interest, plants and viruses. MiRNAs are small non-coding RNA molecules that exhibit an inhibitory effect on gene expression, interfering with the stability and translational efficiency of the targeted mature messenger RNAs. Even though several computational programs exist to predict miRNA targets, there is a need for a comprehensive collection and description of miRNA targets with experimental support. Here we introduce a substantially extended version of this resource. The current version includes more than 1300 experimentally supported targets. Each target site is described by the miRNA that binds it, the gene in which it occurs, the nature of the experiments that were conducted to test it, the sufficiency of the site to induce translational repression and/or cleavage, and the paper from which all these data were extracted. Additionally, the database is functionally linked to several other relevant and useful databases such as Ensembl, Hugo, UCSC and SwissProt. The TarBase5.0 database can be queried or downloaded from http://microrna.gr/tarbase

    PROMALS3D web server for accurate multiple protein sequence and structure alignments

    Get PDF
    Multiple sequence alignments are essential in computational sequence and structural analysis, with applications in homology detection, structure modeling, function prediction and phylogenetic analysis. We report PROMALS3D web server for constructing alignments for multiple protein sequences and/or structures using information from available 3D structures, database homologs and predicted secondary structures. PROMALS3D shows higher alignment accuracy than a number of other advanced methods. Input of PROMALS3D web server can be FASTA format protein sequences, PDB format protein structures and/or user-defined alignment constraints. The output page provides alignments with several formats, including a colored alignment augmented with useful information about sequence grouping, predicted secondary structures and consensus sequences. Intermediate results of sequence and structural database searches are also available. The PROMALS3D web server is available at: http://prodata.swmed.edu/promals3d/

    The Opportunistic Pathogen Propionibacterium acnes: Insights into Typing, Human Disease, Clonal Diversification and CAMP Factor Evolution

    Get PDF
    We previously described a Multilocus Sequence Typing (MLST) scheme based on eight genes that facilitates population genetic and evolutionary analysis of P. acnes. While MLST is a portable method for unambiguous typing of bacteria, it is expensive and labour intensive. Against this background, we now describe a refined version of this scheme based on two housekeeping (aroE; guaA) and two putative virulence (tly; camp2) genes (MLST4) that correctly predicted the phylogroup (IA1, IA2, IB, IC, II, III), clonal complex (CC) and sequence type (ST) (novel or described) status for 91% isolates (n = 372) via cross-referencing of the four gene allelic profiles to the full eight gene versions available in the MLST database (http:// pubmlst.org/pacnes/). Even in the small number of cases where specific STs were not completely resolved, the MLST4 method still correctly determined phylogroup and CC membership. Examination of nucleotide changes within all the MLST loci provides evidence that point mutations generate new alleles approximately 1.5 times as frequently as recombination; although the latter still plays an important role in the bacterium’s evolution. The secreted/cell-associated ‘virulence’ factors tly and camp2 show no clear evidence of episodic or pervasive positive selection and have diversified at a rate similar to housekeeping loci. The co-evolution of these genes with the core genome might also indicate a role in commensal/normal existence constraining their diversity and preventing their loss from the P. acnes population. The possibility that members of the expanded CAMP factor protein family, including camp2, may have been lost from other propionibacteria, but not P. acnes, would further argue for a possible role in niche/host adaption leading to their retention within the genome. These evolutionary insights may prove important for discussions surrounding camp2 as an immunotherapy target for acne, and the effect such treatments may have on commensal lineages

    PROMALS3D: a tool for multiple protein sequence and structure alignments

    Get PDF
    Although multiple sequence alignments (MSAs) are essential for a wide range of applications from structure modeling to prediction of functional sites, construction of accurate MSAs for distantly related proteins remains a largely unsolved problem. The rapidly increasing database of spatial structures is a valuable source to improve alignment quality. We explore the use of 3D structural information to guide sequence alignments constructed by our MSA program PROMALS. The resulting tool, PROMALS3D, automatically identifies homologs with known 3D structures for the input sequences, derives structural constraints through structure-based alignments and combines them with sequence constraints to construct consistency-based multiple sequence alignments. The output is a consensus alignment that brings together sequence and structural information about input proteins and their homologs. PROMALS3D can also align sequences of multiple input structures, with the output representing a multiple structure-based alignment refined in combination with sequence constraints. The advantage of PROMALS3D is that it gives researchers an easy way to produce high-quality alignments consistent with both sequences and structures of proteins. PROMALS3D outperforms a number of existing methods for constructing multiple sequence or structural alignments using both reference-dependent and reference-independent evaluation methods

    DIANA-microT web server: elucidating microRNA functions through target prediction

    Get PDF
    Computational microRNA (miRNA) target prediction is one of the key means for deciphering the role of miRNAs in development and disease. Here, we present the DIANA-microT web server as the user interface to the DIANA-microT 3.0 miRNA target prediction algorithm. The web server provides extensive information for predicted miRNA:target gene interactions with a user-friendly interface, providing extensive connectivity to online biological resources. Target gene and miRNA functions may be elucidated through automated bibliographic searches and functional information is accessible through Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The web server offers links to nomenclature, sequence and protein databases, and users are facilitated by being able to search for targeted genes using different nomenclatures or functional features, such as the genes possible involvement in biological pathways. The target prediction algorithm supports parameters calculated individually for each miRNA:target gene interaction and provides a signal-to-noise ratio and a precision score that helps in the evaluation of the significance of the predicted results. Using a set of miRNA targets recently identified through the pSILAC method, the performance of several computational target prediction programs was assessed. DIANA-microT 3.0 achieved there with 66% the highest ratio of correctly predicted targets over all predicted targets. The DIANA-microT web server is freely available at www.microrna.gr/microT

    MUMMALS: multiple sequence alignment improved by using hidden Markov models with local structural information

    Get PDF
    We have developed MUMMALS, a program to construct multiple protein sequence alignment using probabilistic consistency. MUMMALS improves alignment quality by using pairwise alignment hidden Markov models (HMMs) with multiple match states that describe local structural information without exploiting explicit structure predictions. Parameters for such models have been estimated from a large library of structure-based alignments. We show that (i) on remote homologs, MUMMALS achieves statistically best accuracy among several leading aligners, such as ProbCons, MAFFT and MUSCLE, albeit the average improvement is small, in the order of several percent; (ii) a large collection (>10 000) of automatically computed pairwise structure alignments of divergent protein domains is superior to smaller but carefully curated datasets for estimation of alignment parameters and performance tests; (iii) reference-independent evaluation of alignment quality using sequence alignment-dependent structure superpositions correlates well with reference-dependent evaluation that compares sequence-based alignments to structure-based reference alignments

    Towards realistic benchmarks for multiple alignments of non-coding sequences

    Get PDF
    <p><b>Abstract</b></p> <p>Background</p> <p>With the continued development of new computational tools for multiple sequence alignment, it is necessary today to develop benchmarks that aid the selection of the most effective tools. Simulation-based benchmarks have been proposed to meet this necessity, especially for non-coding sequences. However, it is not clear if such benchmarks truly represent real sequence data from any given group of species, in terms of the difficulty of alignment tasks.</p> <p>Results</p> <p>We find that the conventional simulation approach, which relies on empirically estimated values for various parameters such as substitution rate or insertion/deletion rates, is unable to generate synthetic sequences reflecting the broad genomic variation in conservation levels. We tackle this problem with a new method for simulating non-coding sequence evolution, by relying on genome-wide distributions of evolutionary parameters rather than their averages. We then generate synthetic data sets to mimic orthologous sequences from the <it>Drosophila </it>group of species, and show that these data sets truly represent the variability observed in genomic data in terms of the difficulty of the alignment task. This allows us to make significant progress towards estimating the alignment accuracy of current tools in an absolute sense, going beyond only a relative assessment of different tools. We evaluate six widely used multiple alignment tools in the context of <it>Drosophila </it>non-coding sequences, and find the accuracy to be significantly different from previously reported values. Interestingly, the performance of most tools degrades more rapidly when there are more insertions than deletions in the data set, suggesting an asymmetric handling of insertions and deletions, even though none of the evaluated tools explicitly distinguishes these two types of events. We also examine the accuracy of two existing tools for annotating insertions versus deletions, and find their performance to be close to optimal in <it>Drosophila </it>non-coding sequences if provided with the true alignments.</p> <p>Conclusion</p> <p>We have developed a method to generate benchmarks for multiple alignments of <it>Drosophila </it>non-coding sequences, and shown it to be more realistic than traditional benchmarks. Apart from helping to select the most effective tools, these benchmarks will help practitioners of comparative genomics deal with the effects of alignment errors, by providing accurate estimates of the extent of these errors.</p

    UniProt Tools

    No full text
    corecore