Search CORE

333 research outputs found

"Multiple Sequence Alignment Using External Sources Of Information"

Author: Yasin Layal
Publication venue: University Goettingen Repository
Publication date: 28/01/2016

Multiple sequence alignment is the alignment of three or more protein or nucleic acid sequences. Alignment has long been of great interest to researchers because many scientific workflows depend on sequence alignments, so high-quality alignments are important. Much work has been done, and is still being carried out, to improve alignment quality. Many approaches have been developed for pairwise and multiple sequence alignment, yet most of them rely on the sequences to be aligned as their only input. Recently, some approaches have begun to incorporate additional sources of information into the alignment process; this external data can come from user knowledge or from online databases. When integrated into the workflow of alignment programs, such data may add new constraints to the produced alignment and improve its quality by making it more biologically meaningful. In this thesis, I introduce new approaches to multiple sequence alignment that use the alignment software DIALIGN together with external information from databases: useful information is extracted and then integrated into the alignment process. By testing these approaches on benchmark databases, I show that using additional data during alignment produces better results than using DIALIGN alone with no input other than the sequences to be aligned.

New algorithms and methods for protein and DNA sequence comparison

Author: Crook James
Publication venue: The University of Edinburgh
Publication date: 01/01/1991

The SeqFEATURE library of 3D functional site models: comparison to existing methods and applications to protein function annotation

Author: Altman Russ B; Liang Mike P; Wu Shirley
Publication venue: BioMed Central
Publication date: 01/01/2008

SeqFEATURE, a tool for protein function annotation, models protein functions described by sequence motifs using a structural representation. The tool shows significantly improved performance over other methods when sequence and structural similarity are low.

Springer - Publisher Connector

A tool for reconstructing phylogenies from the composition of protein motifs

The aim of this work was to develop a tool for phylogenetic analysis. In particular, the tool implements an alignment-free approach that considers biological signals as vector units. We called it TBP, for Trees from Biologically significant Patterns. Preliminary experiments hint that some evolutionary signal may indeed be encoded in the presence or absence of biologically significant patterns.
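
The presence/absence idea in this abstract can be illustrated with a minimal sketch. This is not the TBP tool: the patterns, taxon names, sequences, and the choice of Jaccard distance are all assumptions made for illustration. Each taxon is encoded as a 0/1 vector over a pattern list, and pairwise distances between those vectors could then be passed to any distance-based tree-building method.

```python
import re
from itertools import combinations


def presence_vector(sequences, patterns):
    """Return a 0/1 tuple: 1 if the pattern matches at least one sequence."""
    return tuple(
        int(any(re.search(p, s) for s in sequences)) for p in patterns
    )


def jaccard_distance(u, v):
    """1 - |shared patterns| / |patterns present in either vector|."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return 0.0 if either == 0 else 1.0 - both / either


# Hypothetical toy patterns (regular expressions) and toy protein sets.
patterns = [r"C..C", r"N[^P][ST]", r"RGD"]
taxa = {
    "taxonA": ["MKCAACLL", "GGNGSAA"],
    "taxonB": ["MKCTTCLL", "PPRGDPP"],
    "taxonC": ["AAAAAA"],
}

# One presence/absence vector per taxon.
vectors = {name: presence_vector(seqs, patterns) for name, seqs in taxa.items()}

# Pairwise distances, keyed by alphabetically ordered taxon pairs.
distances = {
    (a, b): jaccard_distance(vectors[a], vectors[b])
    for a, b in combinations(sorted(vectors), 2)
}

for pair, d in distances.items():
    print(pair, round(d, 3))
```

Jaccard distance is used here only because it ignores patterns absent from both taxa; other binary distances would fit the same scheme.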

Padua Thesis and Dissertation Archive

A structural study for the optimisation of functional motifs encoded in protein sequences

Author: Helmer-Citterich Manuela; Via Allegra
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: A large number of PROSITE patterns select false positives and/or miss known true positives. It is possible that, at least in some cases, the weak specificity and/or sensitivity of a pattern is due to the fact that one or more functional and/or structural key residues are not represented in the pattern. Multiple sequence alignments are commonly used to build functional sequence patterns. If residues structurally conserved in proteins sharing a function cannot be aligned in a multiple sequence alignment, they are likely to be missed in a standard pattern construction procedure. RESULTS: Here we present a new procedure aimed at improving the sensitivity and/or specificity of poorly performing patterns. The procedure can be summarised as follows: (1) residues structurally conserved in different proteins that are true positives for a pattern are identified by means of a computational technique and by visual inspection; (2) the sequence positions of the structurally conserved residues falling outside the pattern are used to build extended sequence patterns; (3) the extended patterns are optimised on the SWISS-PROT database for their sensitivity and specificity. The method was applied to eight PROSITE patterns. Whenever structurally conserved residues are found in the surface region close to the pattern (seven out of eight cases), the addition of information inferred from structural analysis is shown to improve pattern selectivity and, in some cases, selectivity and sensitivity as well. In some of the cases considered, the procedure allowed the identification of functionally interesting residues, whose biological role is also discussed. CONCLUSION: Our method can be applied to any type of functional motif or pattern (not only PROSITE ones) that is not able to select all and only the true positive hits and for which at least two true positive structures are available. The computational technique for the identification of structurally conserved residues is already available on request and will soon be accessible on our web server. The procedure is intended for use by pattern database curators and by scientists interested in a specific protein family for which no specific or selective patterns are yet available.
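
To make the sensitivity and specificity terms in this abstract concrete, here is a minimal sketch of how a PROSITE-style pattern could be scored against labelled sequences. It is not the authors' procedure: the pattern, the toy sequences, and the helper names are hypothetical, the converter handles only basic PROSITE syntax, and the definitions used (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP)) are an assumption, since the abstract does not define them.

```python
import re


def prosite_to_regex(pattern):
    """Translate a basic PROSITE pattern, e.g. 'C-x(2,4)-C-{P}-[ST]', to a regex."""
    pattern = pattern.rstrip(".")
    parts = []
    for element in pattern.split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # {P}  -> [^P]
        element = element.replace("(", "{").replace(")", "}")   # (2,4) -> {2,4}
        element = element.replace("x", ".")                     # x    -> any residue
        element = element.replace("<", "^").replace(">", "$")   # sequence termini
        parts.append(element)
    return re.compile("".join(parts))


def sensitivity_specificity(regex, positives, negatives):
    """Score a pattern on sequences known to have (or lack) the function."""
    tp = sum(1 for s in positives if regex.search(s))
    fn = len(positives) - tp
    fp = sum(1 for s in negatives if regex.search(s))
    tn = len(negatives) - fp
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical pattern and toy sequences, for illustration only.
regex = prosite_to_regex("C-x(2,4)-C-{P}-[ST]")
positives = ["AACAACGSAA", "CAAAACASKK", "MMMMMM"]
negatives = ["CAACPSAA", "GGGGGG", "CAACGTKK", "LLLL"]

sens, spec = sensitivity_specificity(regex, positives, negatives)
print(regex.pattern, round(sens, 3), round(spec, 3))
```

An extended pattern of the kind described above could be scored with the same function and compared with the original pattern on the same labelled sets.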

Springer - Publisher Connector

Directory of Open Access Journals

Archivio della ricerca- Università di Roma La Sapienza

Plant protein-coding gene families: emerging bioinformatics approaches

Author: Altschul; Andreeva; Attwood; Beers; Benson; Bru; Cambra; Carretero-Paulet; Chain; Chen; Cochrane; Cuff; de Lima Morais; Del Bem; Enright; Faro; Feng; Finn; Fraser; Frech; Garcia-Lorenzo; Guilfoyle; Guindon; Haft; Hunter; Kaminuma; Kersey; Klimke; Kolodziejczyk; Kotsyfakis; Lees; Leinonen; Letunic; Li; Li; Lijavetzky; Lima; Liolios; Lu; Manuel Martinez; Marchler-Bauer; Martinez; Martinez; Martinez; Mi; Moreno-Risueno; Mugford; Nikolskaya; Nissen; Paterson; Pearson; Perez-Rodriguez; Philippe; Plett; Proost; Pruitt; Rautengarten; Rawlings; Remington; Roberts; Rouard; Sigrist; Singh; Swaminathan; Takahashi; Tatusov; Tian; Tyler; UniProt_Consortium; Van de Peer; Vercammen; Wang; Yu
Publication venue: Elsevier BV
Publication date: 01/01/2011

Protein-coding gene families are sets of similar genes with a shared evolutionary origin and, generally, with similar biological functions. In plants, the size and role of gene families have been only partially addressed. However, suitable bioinformatics tools are being developed to cluster the enormous number of sequences currently available in databases. Specifically, comparative genomic databases promise to become powerful tools for gene family annotation in plant clades. In this review, I evaluate the data retrieved from various gene family databases, the ease with which they can be extracted, and how useful the extracted information is.