923 research outputs found

New algorithms and methods for protein and DNA sequence comparison

Author: Crook James
Publication venue: The University of Edinburgh
Publication date: 01/01/1991

Phylogenetic analysis of the SAP30 family of transcriptional regulators reveals functional divergence in the domain that binds the nuclear matrix

Author: Heinonen TYK
Lohi O
Maki M
Viiri KM
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Background: Deacetylation of histones plays a fundamental role in gene silencing, and this is mediated by a corepressor complex containing Sin3 as an essential scaffold protein. In this report we examine the evolution of two proteins in this complex, the Sin3-associated proteins SAP30L and SAP30, by using an archive of protein sequences from 62 species. Results: Our analysis indicates that in tetrapods SAP30L is more similar than SAP30 to the ancestral protein, and the two copies in this group originated by gene duplication which occurred after the divergence of Actinopterygii and Sarcopterygii about 450 million years ago (Mya). The phylogenetic analysis and biochemical experiments suggest that SAP30 has diverged functionally from the ancestral SAP30L by accumulating mutations that have caused attenuation of one of the original functions, association with the nuclear matrix. This function is mediated by a nuclear matrix association sequence, which consists of a conserved motif in the C-terminus and the adjacent nucleolar localization signal (NoLS). Conclusion: These results add further insight into the evolution and function of proteins of the SAP30 family, which share many characteristics with nuclear scaffolding proteins that are intimately involved in regulation of gene expression. Furthermore, SAP30L seems essential to eukaryotic biology, as it is found in animals, plants, fungi, as well as some taxa of unicellular eukaryotes.

Springer - Publisher Connector

UCL Discovery

PubMed Central

TamPub Julkaisuarkisto - TamPub Institutional Repository

Trepo - Institutional Repository of Tampere University

ClustalXeed: a GUI-based grid computation version for high performance and terabyte size multiple sequence alignment

Author: A Boukerche
B Rost
D Mikhailov
DG Higgins
Hyun Joo
J Garnier
J Kleinjung
JD Thompson
K-B Li
M Schmollinger
MA Larkin
N Essoussi
O Trelles
R Chenna
RD Page
T Hagerup
Taeho Kim
V Chaudhary
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: There is an increasing demand to assemble and align large-scale biological sequence data sets. The commonly used multiple sequence alignment programs are still limited in their ability to handle very large numbers of sequences because they lack a scalable high-performance computing (HPC) environment with a greatly extended data storage capacity. Results: We designed ClustalXeed, a software system for multiple sequence alignment with incremental improvements over previous versions of the ClustalX and ClustalW-MPI software. The primary advantage of ClustalXeed over other multiple sequence alignment software is its ability to align a large family of protein or nucleic acid sequences. To solve the conventional memory-dependency problem, ClustalXeed uses both physical random access memory (RAM) and a distributed file-allocation system for distance matrix construction and pair-align computation. The computational efficiency of the disk-storage system was markedly improved by implementing an efficient load-balancing algorithm, called "idle node-seeking task algorithm" (INSTA). The new editing option and the graphical user interface (GUI) provide ready access to a parallel-computing environment for users who seek fast and easy alignment of large DNA and protein sequence sets. Conclusions: ClustalXeed can now process large volumes of biological sequence data that were not tractable in any other parallel or single MSA program. The main developments include: 1) the ability to tackle larger sequence alignment problems than possible with previous systems, through markedly improved storage-handling capabilities; 2) an efficient task load-balancing algorithm, INSTA, which improves overall processing times for multiple sequence alignment with input sequences of non-uniform length; 3) support for both single PC and distributed cluster systems.
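
The abstract names INSTA but does not describe how it works, so the following is only a minimal sketch of the general idea of handing work to whichever node is idle first. It is a generic greedy scheduler, not the ClustalXeed implementation. The function names, and the assumption that the cost of a pairwise alignment is proportional to the product of the two sequence lengths, are hypothetical.

```python
import heapq
from itertools import combinations


def pairwise_costs(lengths):
    """Enumerate all sequence pairs with an assumed cost of len(a) * len(b)."""
    pairs = list(combinations(range(len(lengths)), 2))
    costs = [lengths[i] * lengths[j] for i, j in pairs]
    return pairs, costs


def assign_tasks(task_costs, n_workers):
    """Greedy scheduling: each task goes to the worker that becomes idle first.

    Tasks are handed out longest-first, which tends to even out the load
    when sequence lengths (and therefore task costs) are non-uniform.
    Returns (schedule, makespan), where schedule maps worker id -> task ids.
    """
    # Min-heap of (time at which the worker becomes idle, worker id).
    idle_at = [(0, w) for w in range(n_workers)]
    heapq.heapify(idle_at)
    schedule = {w: [] for w in range(n_workers)}

    order = sorted(range(len(task_costs)), key=lambda i: -task_costs[i])
    for task_id in order:
        t, w = heapq.heappop(idle_at)          # earliest-idle worker
        schedule[w].append(task_id)
        heapq.heappush(idle_at, (t + task_costs[task_id], w))

    makespan = max(t for t, _ in idle_at)
    return schedule, makespan


if __name__ == "__main__":
    pairs, costs = pairwise_costs([100, 10, 10])
    schedule, makespan = assign_tasks(costs, n_workers=2)
    print(pairs, costs, schedule, makespan)
```

In a real cluster the costs are not known exactly in advance, so a dynamic scheme would let idle nodes request the next task at run time. The static simulation above only illustrates why non-uniform sequence lengths make the assignment order matter.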

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MAP2: multiple alignment of syntenic genomic sequences

Author: Huang Xiaoqiu
Ye Liang
Publication venue: Oxford University Press
Publication date: 01/01/2005

We describe a multiple alignment program named MAP2 based on a generalized pairwise global alignment algorithm for handling long, different intergenic and intragenic regions in genomic sequences. The MAP2 program produces an ordered list of local multiple alignments of similar regions among sequences, where different regions between local alignments are indicated by reporting only similar regions. We propose two similarity measures for the evaluation of the performance of MAP2 and existing multiple alignment programs. Experimental results produced by MAP2 on four real sets of orthologous genomic sequences show that MAP2 rarely missed a block of transitively similar regions and that MAP2 never produced a block of regions that are not transitively similar. Experimental results by MAP2 on six simulated data sets show that MAP2 found the boundaries between similar and different regions precisely. This feature is useful for finding conserved functional elements in genomic sequences. The MAP2 program is freely available in source code form at for academic use

Digital Repository @ Iowa State University (ISU)

CiteSeerX

Crossref

PubMed Central

Simultaneous phylogeny reconstruction and multiple sequence alignment

Author: BME Moret
C Notredame
D Higgins
D Huson
D Powell
D Sankoff
E Myers
F Yue
Feng Yue
G Lancia
J Hein
J Stoye
J Strugnell
J Thompson
Jian Shi
Jijun Tang
K Wong
L Wang
M Vingron
N Goldman
N Saitou
O Gotoh
R Robinson
S Henikoff
T Jiang
T Ogden
U Roshan
W Pearson
Publication venue: BioMed Central
Publication date: 01/01/2009

Crossref

Springer - Publisher Connector

PubMed Central

MISHIMA - a new method for high speed multiple alignment of nucleotide sequences of bacterial genome scale data

Author: ACC Shih
AL Delcher
B Morgenstern
C Notredame
D Mikhailov
DF Feng
DG Higgins
DJ Lipman
F Corpet
GJ Barton
J Cheetham
J Stoye
JD Thompson
K Katoh
K Kryukov
K Reinert
KB Li
Kirill Kryukov
M Brudno
M Kimura
N Bray
Naruya Saitou
O Gotoh
RC Edgar
U Tonges
WR Taylor
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: Large nucleotide sequence datasets are becoming increasingly common objects of comparison. Complete bacterial genomes are reported almost every day. This creates challenges for developing new multiple sequence alignment methods. Conventional multiple alignment methods are based on pairwise alignment and/or progressive alignment techniques. These approaches have performance problems when the number of sequences is large and when dealing with genome-scale sequences. Results: We present a new method of multiple sequence alignment, called MISHIMA (Method for Inferring Sequence History In terms of Multiple Alignment), that does not depend on pairwise sequence comparison. A new algorithm is used to quickly find rare oligonucleotide sequences shared by all sequences. A divide-and-conquer approach is then applied to break the sequences into fragments that can be aligned independently by an external alignment program. These partial alignments are assembled together to form a complete alignment of the original sequences. Conclusions: MISHIMA provides improved performance compared to the commonly used multiple alignment methods. As an example, six complete genome sequences of the bacterial species Helicobacter pylori (about 1.7 Mb each) were successfully aligned in about 6 hours using a single PC.
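
The anchor-and-split idea described in this abstract can be sketched roughly as follows. This is not the MISHIMA algorithm itself. As a stand-in for "rare oligonucleotides shared by all sequences", the sketch assumes k-mers that occur exactly once in every sequence, and it keeps a collinear subset with a simple greedy pass. All function names and the choice of k are hypothetical.

```python
from collections import Counter


def unique_shared_kmers(seqs, k):
    """Return the k-mers that occur exactly once in every sequence."""
    shared = None
    for s in seqs:
        counts = Counter(s[i:i + k] for i in range(len(s) - k + 1))
        once = {kmer for kmer, c in counts.items() if c == 1}
        shared = once if shared is None else shared & once
    return shared or set()


def collinear_anchors(seqs, k):
    """Greedily keep anchors that appear in the same order, without overlap,
    in all sequences. Each anchor is a list of start positions, one per sequence."""
    anchors = unique_shared_kmers(seqs, k)
    positions = sorted([s.find(a) for s in seqs] for a in anchors)
    chain, last = [], [-k] * len(seqs)
    for pos in positions:
        if all(p >= prev + k for p, prev in zip(pos, last)):
            chain.append(pos)
            last = pos
    return chain


def fragments(seqs, k):
    """Cut every sequence at the anchor starts; each returned group holds one
    fragment per sequence and could be aligned independently."""
    chain = collinear_anchors(seqs, k)
    cuts = [[0] * len(seqs)] + chain + [[len(s) for s in seqs]]
    return [[s[a[i]:b[i]] for i, s in enumerate(seqs)]
            for a, b in zip(cuts, cuts[1:])]


if __name__ == "__main__":
    for group in fragments(["TTACGTTGGCAA", "GACGTAGGCT"], k=3):
        print(group)
```

Each fragment group would then go to an external aligner, and the partial alignments would be concatenated in order. The greedy chain is a simplification; it does not guarantee the longest consistent set of anchors.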

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Maize Streak Virus: diversity and virulence

Author: Martin Darren Patrick
Publication venue: Department of Molecular and Cell Biology
Publication date: 01/01/2000

Zea mays was first introduced to Africa in Ghana by Portuguese traders in the 16th century. The steady spread of maize cultivation since then has made it the most important cereal crop in Africa today. Whereas improved maize genotypes and agricultural techniques enable yearly yields above 10 tons hectare⁻¹ in the developed world, yearly yields across Africa have remained low at about 1 ton hectare⁻¹ in most countries. Although outmoded agricultural practices are the main reason for poor yields, maize pathogens inflict substantial additional losses. Of the many pathogens currently confronting maize farmers in Africa, Maize streak virus (MSV) is the most significant.

Cape Town University OpenUCT

ReformAlign: improved multiple sequence alignments using a profile-based meta-alignment approach

Author: Lyras Dimitrios P.
Metzler Dirk
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Background: Obtaining an accurate sequence alignment is fundamental for consistently analyzing biological data. Although this problem may be efficiently solved when only two sequences are considered, the exact inference of the optimal alignment quickly becomes computationally intractable for multiple sequence alignment. To cope with the high computational cost, approximate heuristic methods have been proposed that address the problem indirectly by progressively aligning the sequences in pairs according to their relatedness. These methods, however, cannot readily revise the alignment of an already aligned group of sequences in light of new data, which compromises the quality of the resulting alignment. In this paper we present ReformAlign, a novel meta-alignment approach that may significantly improve the quality of the alignments produced by popular aligners. We call ReformAlign a meta-aligner as it requires an initial alignment, for which a variety of alignment programs can be used. The main idea behind ReformAlign is quite straightforward: at first, an existing alignment is used to construct a standard profile which summarizes the initial alignment, and then all sequences are individually re-aligned against the formed profile. From each sequence-profile comparison, the alignment of each sequence against the profile is recorded, and the final alignment is indirectly inferred by merging all the individual sub-alignments into a unified set. The employment of ReformAlign may often result in alignments which are significantly more accurate than the starting alignments. Results: We evaluated the effect of ReformAlign on the alignments generated by ten leading alignment methods using real data of variable size and sequence identity. The experimental results suggest that the proposed meta-aligner approach may often lead to statistically significantly more accurate alignments. Furthermore, we show that ReformAlign results in more substantial improvement in cases where the starting alignment is of relatively inferior quality or when the input sequences are harder to align. Conclusions: The proposed profile-based meta-alignment approach seems to be a promising and computationally efficient method that can be combined with practically all popular alignment methods and may lead to significant improvements in the generated alignments.
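
The first two steps described above (build a profile from an initial alignment, then re-align each sequence against it) might look roughly like the sketch below. It is an illustration under stated assumptions, not ReformAlign's implementation. The match/mismatch scores, the linear gap penalty, the treatment of profile gaps as mismatches, and all function names are hypothetical, and the final merging step is omitted.

```python
from collections import Counter

GAP = "-"


def build_profile(alignment):
    """Column-wise symbol frequencies (gaps included) of an alignment."""
    n = len(alignment)
    return [{sym: c / n for sym, c in Counter(col).items()}
            for col in zip(*alignment)]


def column_score(residue, column, match=1.0, mismatch=-1.0):
    """Expected score of a residue against one profile column."""
    return sum(f * (match if sym == residue else mismatch)
               for sym, f in column.items())


def align_to_profile(seq, profile, gap=-2.0):
    """Needleman-Wunsch global alignment of one sequence against a profile.

    Returns (score, aligned_seq, cols), where cols[i] is the profile column
    matched at position i, or None for a residue inserted relative to the profile.
    """
    n, m = len(seq), len(profile)
    S = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = S[i - 1][0] + gap
    for j in range(1, m + 1):
        S[0][j] = S[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            S[i][j] = max(
                S[i - 1][j - 1] + column_score(seq[i - 1], profile[j - 1]),
                S[i - 1][j] + gap,   # residue inserted relative to the profile
                S[i][j - 1] + gap,   # gap in the sequence at this column
            )

    # Trace back from the bottom-right corner to recover the alignment.
    row, cols = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                S[i][j] == S[i - 1][j - 1] + column_score(seq[i - 1], profile[j - 1])):
            row.append(seq[i - 1]); cols.append(j - 1); i -= 1; j -= 1
        elif j > 0 and S[i][j] == S[i][j - 1] + gap:
            row.append(GAP); cols.append(j - 1); j -= 1
        else:
            row.append(seq[i - 1]); cols.append(None); i -= 1
    return S[n][m], "".join(reversed(row)), cols[::-1]


if __name__ == "__main__":
    profile = build_profile(["ACGT", "ACGT"])
    print(align_to_profile("ACT", profile))
```

Running `align_to_profile` once per input sequence yields the individual sequence-profile sub-alignments; merging them into one alignment requires reconciling the inserted columns across sequences, which this sketch leaves out.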

Springer - Publisher Connector

Open Access LMU

PubMed Central

Genetic sequences: tracing the mutations of a disease.

Author: Mitchell I.
Passmore P.
Xu K.
Publication venue
Publication date: 01/01/2010

This entry addresses Mini Challenge 3 of the IEEE VAST Challenge 2010. A patient, identified by Interpol as Nicolai Kuryakin, is admitted to a hospital in Paris after being removed from a flight to Moscow due to illness. The patient, now deceased, was admitted with an unidentified illness and later developed symptoms consistent with Drafa Fever. An autopsy confirmed the presence of the Drafa virus in the patient’s bloodstream. Visual analytics methods are applied to help understand the evolution of the current Drafa virus outbreak, as it may shed some light on Nicolai’s contacts.

Middlesex University Research Repository