6 research outputs found

Automatic extraction of reliable regions from multiple sequence alignments

Author: B Morgenstern
B Morgenstern
C Grasso
C Lee
C Notredame
CB Do
D Morrison
Erik LL Sonnhammer
I Van Walle
IM Wallace
J Stoye
J Thompson
J Thompson
JD Thompson
JD Thompson
K Katoh
K Katoh
K Sjolander
O Lecompte
RC Edgar
T Lassmann
T Lassmann
T Lassmann
TH Ogdenw
Timo Lassmann
Publication venue: 'Springer Science and Business Media LLC'

Reproducing the manual annotation of multiple sequence alignments using a SVM classifier

Author: Allan Lavell
Altschul
Andrew J. Roger
Beiko
Bradley
Castresana
Chang
Christian Blouin
Do
Dutheil
Eddy
Edgar
Edward Susko
Fawcett
Feng
Finn
Hall
Holmes
Jones
Landan
Landan
Lassmann
Lassmann
Lunter
Löytynoja
Needleman
Notredame
Notredame
Nuin
Ogdenw
Pei
R Development Core Team
Roettger
Saitou
Scott Perry
Shan
Sing
Smith
Thompson
Thompson
Thompson
Van Walle
Wong
Publication venue: Oxford University Press

Motivation: Aligning protein sequences with the best possible accuracy requires sophisticated algorithms. Since the optimal alignment is not guaranteed to be the correct one, it is expected that even the best alignment will contain sites that do not respect the assumption of positional homology. Because formulating rules to identify these sites is difficult, it is common practice to remove them manually. Although considered necessary in some cases, manual editing is time-consuming and not reproducible. We present here an automated editing method based on the classification of ‘valid’ and ‘invalid’ sites.

Crossref

PubMed Central

AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

Author: Bautista Rocío
Cantón Francisco R
Claros M Gonzalo
Guerrero Darío
Villalobos David P
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results: AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms; the choice of algorithm does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded to the user's disk in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". Conclusions: AlignMiner can be used to reliably detect divergent regions via several scoring methods that provide different levels of selectivity. Its predictions have been verified by experimental means. Hence, it is expected that its usage will save researchers' time and ensure an objective selection of the best-possible divergent region when closely related sequences are analysed. AlignMiner is freely available at http://www.scbi.uma.es/alignminer.
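
To make the idea of entropy-based scoring concrete, here is a minimal sketch of per-column Shannon entropy followed by a simple search for runs of high-entropy (divergent) columns. It is not AlignMiner's implementation: the function names, the treatment of gaps as an ordinary symbol, and the `threshold` and `min_length` parameters are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropies(alignment):
    """Shannon entropy (in bits) of every column of an alignment.

    `alignment` is a list of equal-length strings. Gap characters are
    counted like any other symbol (an assumption of this sketch).
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    entropies = []
    for col in range(length):
        counts = Counter(seq[col] for seq in alignment)
        total = sum(counts.values())
        # H = sum over symbols of p * log2(1 / p)
        entropies.append(
            sum((n / total) * math.log2(total / n) for n in counts.values())
        )
    return entropies


def divergent_regions(entropies, threshold, min_length):
    """Half-open (start, end) column ranges whose entropy exceeds `threshold`.

    Runs shorter than `min_length` columns are discarded. Both parameters
    are hypothetical knobs, not AlignMiner settings.
    """
    regions, start = [], None
    # A sentinel below any threshold closes a run that reaches the last column.
    for i, h in enumerate(list(entropies) + [float("-inf")]):
        if h > threshold and start is None:
            start = i
        elif h <= threshold and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    return regions
```

For example, `divergent_regions(column_entropies(["ACGT", "ACCA", "ACTG"]), 1.0, 2)` reports the last two columns as one divergent run, since the first two columns are fully conserved.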

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

H2r: Identification of evolutionary important residues by means of an entropy based analysis of multiple sequence alignments

Author: A del Sol Mesa
AL Barabási
B Rost
C Notredame
C Ouzounis
C Sander
C Steegborn
CC Hyde
CE Shannon
D Altschuh
DR Caffrey
E Eyal
E Neher
E Weber-Ban
E Zuckerkandl
ER Tillier
F Pearl
GB Gloor
GM Süel
HO Villar
I Kass
IM Wallace
J Tsai
JA Capra
JP Dekker
K Katoh
K Wang
LA Kelley
LC Martin
M Landau
Matthias Zwick
MC Saraf
ME Noble
O Noivirt
O Olmea
OV Kalinina
OV Kalinina
R Merkl
RA Estabrook
RA Laskowski
Rainer Merkl
RD Finn
RI Dima
S Henikoff
SJ Fleishman
SM Larson
SW Lockless
T Lassmann
T Sato
TD Schneider
U Göbel
V Kulik
V Kulik
WH Press
WR Atchley
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: A multiple sequence alignment (MSA) generated for a protein can be used to characterise residues by means of a statistical analysis of single columns. In addition to the examination of individual positions, the investigation of co-variation of amino acid frequencies offers insights into function and evolution of the protein and residues. RESULTS: We introduce conn(k), a novel parameter for the characterisation of individual residues. For each residue k, conn(k) is the number of most extreme signals of co-evolution. These signals were deduced from a normalised mutual information (MI) value U(k, l) computed for all pairs of residues k, l. We demonstrate that conn(k) is a more robust indicator than an individual MI-value for the prediction of residues most plausibly important for the evolution of a protein. This proposition was inferred by means of statistical methods. It was further confirmed by the analysis of several proteins. A server that computes conn(k)-values is available at http://www-bioinf.uni-regensburg.de. CONCLUSION: The algorithm H2r, which analyses MSAs and computes conn(k)-values, characterises a specific class of residues. In contrast to strictly conserved ones, these residues possess some flexibility in the composition of side chains. However, their allocation is sensibly balanced with several other positions, as indicated by conn(k).
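
The relationship between U(k, l) and conn(k) can be sketched in a few lines. The abstract does not give the normalisation or the cutoff, so this example assumes the common symmetric form U = 2 (H(k) + H(l) - H(k, l)) / (H(k) + H(l)) and a hypothetical `top_n` parameter that defines how many of the highest-scoring pairs count as the "most extreme" signals. It illustrates the idea and is not the H2r code.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(symbols):
    """Shannon entropy (bits) of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = len(symbols)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def normalised_mi(col_k, col_l):
    """Assumed normalisation: 2 * MI / (H(k) + H(l)), ranging over [0, 1]."""
    h_k, h_l = entropy(col_k), entropy(col_l)
    h_kl = entropy(list(zip(col_k, col_l)))  # joint entropy of the column pair
    if h_k + h_l == 0:
        return 0.0  # two fully conserved columns carry no co-variation signal
    return 2 * (h_k + h_l - h_kl) / (h_k + h_l)


def conn(alignment, top_n):
    """For each column k, count its occurrences among the top_n U(k, l) pairs.

    `top_n` is a hypothetical cutoff chosen for this sketch.
    """
    columns = list(zip(*alignment))
    scores = {
        (k, l): normalised_mi(columns[k], columns[l])
        for k, l in combinations(range(len(columns)), 2)
    }
    top_pairs = sorted(scores, key=scores.get, reverse=True)[:top_n]
    counts = [0] * len(columns)
    for k, l in top_pairs:
        counts[k] += 1
        counts[l] += 1
    return counts
```

In the toy alignment `["AAC", "AAC", "GGC", "GGT"]`, the first two columns vary in lockstep, so with `top_n=1` only those two columns receive a count.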

University of Regensburg Publication Server

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Automatic extraction of reliable regions from multiple sequence alignments-1

Author: Erik LL Sonnhammer (16370)
Timo Lassmann (14297)

Copyright information: Taken from "Automatic extraction of reliable regions from multiple sequence alignments", http://www.biomedcentral.com/1471-2105/8/S5/S9, BMC Bioinformatics 2007;8(Suppl 5):S9-S9. Published online 24 May 2007. PMCID: PMC1892097.

n to the cumulative running time of the alignment programs used to generate the input alignments. The running times of Mumsa were multiplied by 100 to be visible in the plot. The sequence files were generated by ROSE [16] using an average sequence length of 500 residues and an average evolutionary distance of 250. It is clear that the running time of Mumsa is at least two orders of magnitude lower than that required by the alignment programs.

FigShare

Automatic extraction of reliable regions from multiple sequence alignments-0

Author: Erik LL Sonnhammer (16370)
Timo Lassmann (14297)

Copyright information: Taken from "Automatic extraction of reliable regions from multiple sequence alignments", http://www.biomedcentral.com/1471-2105/8/S5/S9, BMC Bioinformatics 2007;8(Suppl 5):S9-S9. Published online 24 May 2007. PMCID: PMC1892097.

d Dialign alignment of the Balibase 3.0 test case BB20007. The parameter was chosen to be two, requiring that residues in the output alignment appear in at least two input alignments. Each residue is colored according to the average occurrence of the POARs it is involved in. Regions that appear in red are identically aligned in all 5 input alignments, while green and blue regions are aligned identically in fewer and fewer cases. It is clear that all alignment programs find conserved motifs in the sequences but disagree on how the residues in between should be aligned.
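
The "appear in at least two input alignments" criterion can be illustrated with a short sketch. It assumes that a POAR is a pair of residues placed in the same column, identified by (sequence index, ungapped position), and that `-` marks a gap; the function names and the `min_support` parameter are hypothetical and are not taken from Mumsa.

```python
from collections import Counter
from itertools import combinations


def aligned_pairs(alignment):
    """Set of residue pairs that share a column in one alignment.

    Each residue is identified as (sequence index, ungapped position),
    so the same pair can be recognised across different alignments.
    """
    positions = [0] * len(alignment)  # next ungapped position per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        residues = []
        for s, seq in enumerate(alignment):
            if seq[col] != "-":
                residues.append((s, positions[s]))
                positions[s] += 1
        pairs.update(combinations(residues, 2))
    return pairs


def pair_support(alignments):
    """Count in how many input alignments each aligned pair occurs."""
    support = Counter()
    for aln in alignments:
        support.update(aligned_pairs(aln))
    return support


def supported_pairs(alignments, min_support):
    """Pairs found in at least `min_support` of the input alignments."""
    return {p for p, n in pair_support(alignments).items() if n >= min_support}
```

Given the two toy alignments `["AC-G", "ACTG"]` and `["ACG-", "ACTG"]`, only the first two residue pairs are aligned identically in both, so `supported_pairs(..., 2)` keeps those and drops the pairs on which the alignments disagree.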

FigShare