44 research outputs found

PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information

Author: Heringa J.
Simossis V. A.
Publication venue: Oxford University Press
Publication date: 27/06/2005

PRofile ALIgNEment (PRALINE) is a fully customizable multiple sequence alignment application. In addition to a number of available alignment strategies, PRALINE can integrate information from database homology searches to generate a homology-extended multiple alignment. PRALINE also provides a choice of seven different secondary structure prediction programs that can be used individually or in combination as a consensus for integrating structural information into the alignment process. The program can be used through two separate interfaces: one has been designed to cater to more advanced needs of researchers in the field, and the other for standard construction of high confidence alignments. The web-based output is designed to facilitate the comprehensive visualization of the generated alignments by means of five default colour schemes based on: residue type, position conservation, position reliability, residue hydrophobicity and secondary structure, depending on the options set. A user can also define a custom colour scheme by selecting which colour will represent one or more amino acids in the alignment. All generated alignments are also made available in the PDF format for easy figure generation for publications. The grouping of sequences, on which the alignment is based, can also be visualized as a dendrogram. PRALINE is available at

Crossref

PubMed Central

The database of experimentally supported targets: a functional update of TarBase

Author: A. G. Hatzigeorgiou
Addo-Quaye
Baek
Bartel
G. L. Papadopoulos
Griffiths-Jones
Lagos-Quintana
Lau
Lee
M. Reczko
Mangan
P. Sethupathy
Selbach
Sethupathy
V. A. Simossis
Wu
Publication venue: Oxford University Press
Publication date: 01/01/2009

TarBase5.0 is a database which houses a manually curated collection of experimentally supported microRNA (miRNA) targets in several animal species of central scientific interest, plants and viruses. MiRNAs are small non-coding RNA molecules that exhibit an inhibitory effect on gene expression, interfering with the stability and translational efficiency of the targeted mature messenger RNAs. Even though several computational programs exist to predict miRNA targets, there is a need for a comprehensive collection and description of miRNA targets with experimental support. Here we introduce a substantially extended version of this resource. The current version includes more than 1300 experimentally supported targets. Each target site is described by the miRNA that binds it, the gene in which it occurs, the nature of the experiments that were conducted to test it, the sufficiency of the site to induce translational repression and/or cleavage, and the paper from which all these data were extracted. Additionally, the database is functionally linked to several other relevant and useful databases such as Ensembl, Hugo, UCSC and SwissProt. The TarBase5.0 database can be queried or downloaded from http://microrna.gr/tarbase

CiteSeerX

Crossref

PubMed Central

PROMALS3D web server for accurate multiple protein sequence and structure alignments

Author: Bateman
Chandonia
Do
Edgar
Henikoff
Holm
J. Pei
Jones
Katoh
Lipman
M. Tang
Murzin
N. V. Grishin
Notredame
O'Sullivan
Pei
Simossis
Thompson
Thompson
Thompson
Wu
Zhang
Zhu
Publication venue: Oxford University Press

Multiple sequence alignments are essential in computational sequence and structural analysis, with applications in homology detection, structure modeling, function prediction and phylogenetic analysis. We report the PROMALS3D web server for constructing alignments of multiple protein sequences and/or structures using information from available 3D structures, database homologs and predicted secondary structures. PROMALS3D shows higher alignment accuracy than a number of other advanced methods. Input to the PROMALS3D web server can be FASTA format protein sequences, PDB format protein structures and/or user-defined alignment constraints. The output page provides alignments in several formats, including a colored alignment augmented with useful information about sequence grouping, predicted secondary structures and consensus sequences. Intermediate results of sequence and structural database searches are also available. The PROMALS3D web server is available at: http://prodata.swmed.edu/promals3d/

Crossref

PubMed Central

The Opportunistic Pathogen Propionibacterium acnes: Insights into Typing, Human Disease, Clonal Diversification and CAMP Factor Evolution

Author: A Holmberg
A McDowell
A McDowell
A McDowell
A McDowell
A McDowell
A Perry
A Rahman
A Stirling
A Vörös
AC Goodwin
AC Jahns
AE Darling
Andrew McDowell
B Haubold
BA Shannon
BG Spratt
C Dessinioti
C Hill
C Holland
DA Fitzpatrick
DG Torgerson
DH Huson
Dongsheng Zhou
E Nagy
EA McGraw
EJ Feil
EJ Feil
Emma Barnard
FM Cohan
G Sanchez-Perez
GC McLorinan
H Bruggemann
H Brüggemann
H Brüggemann
H Falentin
HB Lomholt
I Nagy
I Nagy
István Nagy
J Ayer
J Ho-Huu
J Hunyadkürti
J Olsson
J Olsson
JA Eisen
K Minegishi
KA Jolley
KE Piper
L Fassi Fehri
L Ördögh
LP Parizzi
M Kellis
M Miragaia
M O'Connell-Motherway
M Shu
M Sörensen
MCJ Maiden
MF Sampedro
MJ Lodes
MM Tunney
Márta Magyari
NE Arenas
O O'Sullivan
P Librado
PF Liu
PR Hunter
R Leon-Kempis Mdel
R Pushker
RD Sleator
RJ Cohen
S Tomida
S Valanne
SA Niazi
Sheila Patrick
SL Kosakovsky Pond
SL Kosakovsky Pond
T Nakatsuji
T Yasuhara
TJ Treangen
TN Mak
TN Petersen
V Pancholi
V Zeller
VA Simossis
VJ Lynch
Y Eishi
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

We previously described a Multilocus Sequence Typing (MLST) scheme based on eight genes that facilitates population genetic and evolutionary analysis of P. acnes. While MLST is a portable method for unambiguous typing of bacteria, it is expensive and labour intensive. Against this background, we now describe a refined version of this scheme based on two housekeeping (aroE; guaA) and two putative virulence (tly; camp2) genes (MLST4) that correctly predicted the phylogroup (IA1, IA2, IB, IC, II, III), clonal complex (CC) and sequence type (ST) (novel or described) status for 91% of isolates (n = 372) via cross-referencing of the four-gene allelic profiles to the full eight-gene versions available in the MLST database (http://pubmlst.org/pacnes/). Even in the small number of cases where specific STs were not completely resolved, the MLST4 method still correctly determined phylogroup and CC membership. Examination of nucleotide changes within all the MLST loci provides evidence that point mutations generate new alleles approximately 1.5 times as frequently as recombination, although the latter still plays an important role in the bacterium’s evolution. The secreted/cell-associated ‘virulence’ factors tly and camp2 show no clear evidence of episodic or pervasive positive selection and have diversified at a rate similar to housekeeping loci. The co-evolution of these genes with the core genome might also indicate a role in commensal/normal existence constraining their diversity and preventing their loss from the P. acnes population. The possibility that members of the expanded CAMP factor protein family, including camp2, may have been lost from other propionibacteria, but not P. acnes, would further argue for a possible role in niche/host adaption leading to their retention within the genome. These evolutionary insights may prove important for discussions surrounding camp2 as an immunotherapy target for acne, and the effect such treatments may have on commensal lineages.
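
The cross-referencing step described in this abstract can be pictured as a lookup from a four-gene allelic profile to the eight-gene records that share it. The sketch below is only an illustration of that idea, not the authors' implementation: the sequence type names, the allele numbers and the `build_index` and `predict_st` helpers are hypothetical, and the four loci outside MLST4 are represented by an anonymous tuple.

```python
from collections import defaultdict

MLST4_LOCI = ("aroE", "guaA", "tly", "camp2")

# Hypothetical records: sequence type -> (MLST4 alleles, alleles at the other four loci).
FULL_PROFILES = {
    "ST_A": ((1, 1, 1, 1), (1, 1, 1, 1)),
    "ST_B": ((1, 1, 1, 1), (1, 2, 1, 1)),
    "ST_C": ((2, 1, 3, 1), (1, 1, 4, 1)),
}

def build_index(full_profiles):
    """Group eight-gene sequence types by their four-gene (MLST4) allelic profile."""
    index = defaultdict(set)
    for st, (mlst4_alleles, _other_alleles) in full_profiles.items():
        index[mlst4_alleles].add(st)
    return index

def predict_st(index, profile):
    """Return every sequence type compatible with a four-gene profile."""
    key = tuple(profile[locus] for locus in MLST4_LOCI)
    return sorted(index.get(key, ()))

index = build_index(FULL_PROFILES)
print(predict_st(index, {"aroE": 2, "guaA": 1, "tly": 3, "camp2": 1}))  # ['ST_C']
print(predict_st(index, {"aroE": 1, "guaA": 1, "tly": 1, "camp2": 1}))  # ['ST_A', 'ST_B']
```

A single match corresponds to a fully resolved ST. Several matches correspond to the unresolved cases mentioned in the abstract, and an empty list to a profile with no described counterpart.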

Queen's University Belfast Research Portal

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Repository of the Academy's Library

The Francis Crick Institute

PROMALS3D: a tool for multiple protein sequence and structure alignments

Author: Altschul
Armougom
Bong-Hyun Kim
Boutonnet
Chandonia
Do
Edgar
Edgar
Henikoff
Holm
Jimin Pei
Jones
Katoh
Konagurthu
Murzin
Nick V. Grishin
Notredame
Notredame
O'Sullivan
Pei
Pei
Shindyalov
Simossis
Taylor
Thompson
Thompson
Van Walle
Wu
Zemla
Zhang
Zhou
Zhu
Publication venue: Oxford University Press
Publication date: 01/01/2008

Although multiple sequence alignments (MSAs) are essential for a wide range of applications from structure modeling to prediction of functional sites, construction of accurate MSAs for distantly related proteins remains a largely unsolved problem. The rapidly increasing database of spatial structures is a valuable source to improve alignment quality. We explore the use of 3D structural information to guide sequence alignments constructed by our MSA program PROMALS. The resulting tool, PROMALS3D, automatically identifies homologs with known 3D structures for the input sequences, derives structural constraints through structure-based alignments and combines them with sequence constraints to construct consistency-based multiple sequence alignments. The output is a consensus alignment that brings together sequence and structural information about input proteins and their homologs. PROMALS3D can also align sequences of multiple input structures, with the output representing a multiple structure-based alignment refined in combination with sequence constraints. The advantage of PROMALS3D is that it gives researchers an easy way to produce high-quality alignments consistent with both sequences and structures of proteins. PROMALS3D outperforms a number of existing methods for constructing multiple sequence or structural alignments using both reference-dependent and reference-independent evaluation methods

CiteSeerX

Crossref

PubMed Central

DIANA-microT web server: elucidating microRNA functions through target prediction

Author: A. G. Hatzigeorgiou
Bartel
Brennecke
E. Koukis
G. Giannopoulos
G. Goumas
G. L. Papadopoulos
Gaidatzis
Grimson
K. Kourtis
Kanehisa
Kawahara
Kertesz
Kwon
Lee
Lewis
M. Maragkakis
M. Reczko
N. Koziris
Nollmann
P. Alexiou
P. Tsanakas
Selbach
Sethupathy
T. Dalamagas
T. Sellis
T. Vergoulis
V. A. Simossis
Publication venue: Oxford University Press
Publication date: 01/01/2009

Computational microRNA (miRNA) target prediction is one of the key means for deciphering the role of miRNAs in development and disease. Here, we present the DIANA-microT web server as the user interface to the DIANA-microT 3.0 miRNA target prediction algorithm. The web server provides extensive information for predicted miRNA:target gene interactions with a user-friendly interface, providing extensive connectivity to online biological resources. Target gene and miRNA functions may be elucidated through automated bibliographic searches, and functional information is accessible through Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The web server offers links to nomenclature, sequence and protein databases, and users can search for targeted genes using different nomenclatures or functional features, such as a gene's possible involvement in biological pathways. The target prediction algorithm supports parameters calculated individually for each miRNA:target gene interaction and provides a signal-to-noise ratio and a precision score that help in evaluating the significance of the predicted results. Using a set of miRNA targets recently identified through the pSILAC method, the performance of several computational target prediction programs was assessed. DIANA-microT 3.0 achieved the highest ratio of correctly predicted targets over all predicted targets, at 66%. The DIANA-microT web server is freely available at www.microrna.gr/microT
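
The ratio of correctly predicted targets over all predicted targets reported in this abstract is what is usually called precision. As a minimal sketch, assuming predictions and experimentally supported targets are both available as sets of gene identifiers, it could be computed as follows. The `precision` helper and the gene names are hypothetical and are not taken from the DIANA-microT server.

```python
def precision(predicted: set[str], validated: set[str]) -> float:
    """Fraction of predicted targets that are also experimentally supported."""
    if not predicted:
        raise ValueError("no predictions to evaluate")
    return len(predicted & validated) / len(predicted)

# Hypothetical gene identifiers, for illustration only.
predicted = {"geneA", "geneB", "geneC"}
validated = {"geneA", "geneB", "geneD"}
print(f"{precision(predicted, validated):.2f}")  # 0.67
```

Because the denominator is the number of predictions, targets that were validated but never predicted (here `geneD`) do not lower the score.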

Crossref

PubMed Central

Research Repository RMIT University

DSpace@NTUA (National Technical Univ. of Athens)

Swinburne Research Bank

MUMMALS: multiple sequence alignment improved by using hidden Markov models with local structural information

Author: Altschul
Bahr
Blake
Boutonnet
Chandonia
Chivian
Cline
Dayhoff
de Bakker
Do
Durbin
Eddy
Eddy
Edgar
Edgar
Ginalski
Gotoh
Henikoff
Holm
Holm
Huang
Hubbard
Jimin Pei
Jones
Jones
Kabsch
Katoh
Kinch
Lichtarge
Marchler-Bauer
Miyazawa
Murzin
Needleman
Nick V. Grishin
Notredame
O'Sullivan
O'Sullivan
Pei
Pei
Prlic
Rost
Rychlewski
Sadreyev
Shindyalov
Simossis
Smith
Thompson
Thompson
Thompson
Van Walle
Venclovas
Wallace
Wallace
Wallner
Wang
Zemla
Zhang
Zhou
Publication venue: Oxford University Press
Publication date: 26/08/2006

We have developed MUMMALS, a program to construct multiple protein sequence alignments using probabilistic consistency. MUMMALS improves alignment quality by using pairwise alignment hidden Markov models (HMMs) with multiple match states that describe local structural information without exploiting explicit structure predictions. Parameters for such models have been estimated from a large library of structure-based alignments. We show that (i) on remote homologs, MUMMALS achieves the statistically best accuracy among several leading aligners, such as ProbCons, MAFFT and MUSCLE, although the average improvement is small, on the order of several percent; (ii) a large collection (>10 000) of automatically computed pairwise structure alignments of divergent protein domains is superior to smaller but carefully curated datasets for estimation of alignment parameters and performance tests; (iii) reference-independent evaluation of alignment quality using sequence alignment-dependent structure superpositions correlates well with reference-dependent evaluation that compares sequence-based alignments to structure-based reference alignments.

Crossref

PubMed Central

Towards realistic benchmarks for multiple alignments of non-coding sequences

Author: A Loytynoja
A Prakash
A Prakash
A Siepel
AB Diallo
AG Clark
AP Dempster
AR Subramanian
AW Dress
B Paten
BG Hall
C Notredame
CM Bergman
D Karolchik
D Tian
DA Pollard
DA Pollard
G Bejerano
G Landan
G Landan
G Lunter
G Lunter
I Van Walle
J Felsenstein
J Kim
J Kim
J Stoye
Jaebum Kim
JD Thompson
K Katoh
K Mizuguchi
L Chindelevitch
M Blanchette
M Blanchette
M Brudno
MA Larkin
MS Rosenberg
N Bray
RA Cartwright
RC Edgar
RK Bradley
RK Bradley
S Sinha
S Snir
Saurabh Sinha
TH Ogden
V Simossis
W Fletcher
W Huang
W Pirovano
X He
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: With the continued development of new computational tools for multiple sequence alignment, it is necessary today to develop benchmarks that aid the selection of the most effective tools. Simulation-based benchmarks have been proposed to meet this necessity, especially for non-coding sequences. However, it is not clear if such benchmarks truly represent real sequence data from any given group of species, in terms of the difficulty of alignment tasks.
Results: We find that the conventional simulation approach, which relies on empirically estimated values for various parameters such as substitution rate or insertion/deletion rates, is unable to generate synthetic sequences reflecting the broad genomic variation in conservation levels. We tackle this problem with a new method for simulating non-coding sequence evolution, by relying on genome-wide distributions of evolutionary parameters rather than their averages. We then generate synthetic data sets to mimic orthologous sequences from the Drosophila group of species, and show that these data sets truly represent the variability observed in genomic data in terms of the difficulty of the alignment task. This allows us to make significant progress towards estimating the alignment accuracy of current tools in an absolute sense, going beyond only a relative assessment of different tools. We evaluate six widely used multiple alignment tools in the context of Drosophila non-coding sequences, and find the accuracy to be significantly different from previously reported values. Interestingly, the performance of most tools degrades more rapidly when there are more insertions than deletions in the data set, suggesting an asymmetric handling of insertions and deletions, even though none of the evaluated tools explicitly distinguishes these two types of events. We also examine the accuracy of two existing tools for annotating insertions versus deletions, and find their performance to be close to optimal in Drosophila non-coding sequences if provided with the true alignments.
Conclusion: We have developed a method to generate benchmarks for multiple alignments of Drosophila non-coding sequences, and shown it to be more realistic than traditional benchmarks. Apart from helping to select the most effective tools, these benchmarks will help practitioners of comparative genomics deal with the effects of alignment errors, by providing accurate estimates of the extent of these errors.
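
The contrast drawn in this abstract, between simulating with averaged parameters and simulating with genome-wide distributions of them, can be illustrated with a toy model. The sketch below is not the authors' simulator. It assumes a single per-site substitution probability per locus, draws a made-up "genome-wide" distribution from a beta distribution, and uses a hypothetical `simulate_divergence` helper to compare the spread of simulated divergence under the two approaches.

```python
import random
import statistics

def simulate_divergence(rates, length=200, seed=0):
    """Return the fraction of substituted sites per locus, one locus per rate."""
    rng = random.Random(seed)
    return [sum(rng.random() < r for _ in range(length)) / length for r in rates]

rng = random.Random(1)
# Hypothetical genome-wide distribution of per-site substitution probabilities.
genome_wide = [rng.betavariate(2, 5) for _ in range(500)]
mean_rate = statistics.mean(genome_wide)

# Conventional approach: every locus evolves at the average rate.
averaged = simulate_divergence([mean_rate] * len(genome_wide))
# Distribution-based approach: each locus gets its own rate.
sampled = simulate_divergence(genome_wide)

print("spread with averaged rate:", statistics.pstdev(averaged))
print("spread with sampled rates:", statistics.pstdev(sampled))
```

In this toy setting the loci simulated at the average rate differ only by sampling noise, while the loci simulated with their own rates cover a much wider range of conservation levels, which is the kind of variation the abstract says conventional benchmarks fail to reproduce.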

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central