
29 research outputs found

Positional Homology in Bacterial Genomes

Author: Burgetz Ingrid J.
Pang Andy
Shariff Salimah
Tillier Elisabeth R. M.
Publication venue: Libertas Academica
Publication date: 01/01/2006

In comparative genomic studies, syntenic groups of homologous sequences in the same order have been used as supplementary information to help determine the orthology of the compared sequences. The assumption is that orthologous gene copies are more likely to share the same genome positions and the same gene neighbors. In this study we have defined positional homologs as those that also have homologous neighboring genes, and we investigated the usefulness of this distinction for bacterial comparative genomics. We considered the identification of positionally homologous gene pairs in bacterial genomes using protein and DNA sequence level alignments and found that the positional homologs had on average relatively lower rates of substitution at the DNA level (synonymous substitutions) than duplicate homologs in different genomic locations, regardless of the level of protein sequence divergence (measured with the non-synonymous substitution rate). Since gene order conservation can indicate the accuracy of orthology assignments, we also considered the effect of imposing certain alignment quality requirements on the sensitivity and specificity of identification of protein pairs by BLAST and FASTA when neighboring information is not available and in comparisons where gene order is not conserved. We found that the addition of a stringency filter based on the second best hits was an efficient way to remove dubious ortholog identifications in BLAST and FASTA analyses. Gene order conservation and DNA sequence homology are useful to consider in comparative genomic studies, as they may indicate different orthology assignments than protein sequence homology alone.
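
The abstract does not spell out how the second-best-hit filter is defined. As a rough illustration only, one plausible reading is to accept a query's top hit when its score clearly exceeds the runner-up's. The function name, the `min_ratio` threshold, and the score values below are all assumptions made for this sketch, not details from the paper.

```python
def filter_by_second_best(hits, min_ratio=1.2):
    """Keep a query's best hit only if it clearly beats the second-best hit.

    hits: dict mapping query id -> list of (subject id, score) tuples,
          e.g. parsed from BLAST or FASTA output (parsing not shown).
    min_ratio: hypothetical threshold; the best score must be at least
               min_ratio times the second-best score.
    """
    kept = {}
    for query, scored in hits.items():
        if not scored:
            continue
        # Rank candidate subjects from highest to lowest score.
        ranked = sorted(scored, key=lambda hit: hit[1], reverse=True)
        best_subject, best_score = ranked[0]
        # A query with a single hit has no competitor, so it is kept.
        if len(ranked) == 1 or best_score >= min_ratio * ranked[1][1]:
            kept[query] = best_subject
    return kept


# Made-up scores for illustration.
example_hits = {
    "geneA": [("x1", 200.0), ("x2", 90.0)],   # clear winner
    "geneB": [("y1", 100.0), ("y2", 95.0)],   # ambiguous, dropped
    "geneC": [("z1", 50.0)],                  # only one hit
}
print(filter_by_second_best(example_hits))
```

A ratio test is only one option; a fixed score gap or an E-value comparison would follow the same pattern.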

Directory of Open Access Journals

PubMed Central

Multidimensional mutual information methods for the analysis of covariation in multiple sequence alignments

Author: Ackerman Sharon H.
Clark Greg W.
Gatti Domenico L.
Tillier Elisabeth R.
Publication date: 01/01/2014

Several methods are available for the detection of covarying positions from a multiple sequence alignment (MSA). If the MSA contains a large number of sequences, information about the proximities between residues derived from covariation maps can be sufficient to predict a protein fold. If the structure is already known, information on the covarying positions can be valuable for understanding the protein mechanism. In this study we have sought to determine whether a multivariate extension of traditional mutual information (MI) can be an additional tool to study covariation. The performance of two multidimensional MI (mdMI) methods, designed to remove the effect of ternary/quaternary interdependencies, was tested with a set of 9 MSAs each containing <400 sequences, and was shown to be comparable to that of methods based on maximum entropy/pseudolikelihood statistical models of protein sequences. However, while all the methods tested detected a similar number of covarying pairs among the residues separated by < 8 Å in the reference X-ray structures, there was on average less than 65% overlap between the top scoring pairs detected by methods that are based on different principles. We have also attempted to identify whether the difference in performance among methods is due to different efficiency in removing covariation originating from chains of structural contacts. We found that the reason why methods that derive partial correlation between the columns of an MSA provide a better recognition of close contacts is not because they remove chaining effects, but because they filter out the correlation between distant residues that originates from general fitness constraints.
In contrast, we found that true chaining effects are an expression of real physical perturbations that propagate inside proteins, and therefore are not removed by the derivation of partial correlation between variables.
Comment: 21 pages, 4 figures, 1 table, supporting information containing 2 additional figures is included at the end of the manuscript
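
For orientation, the traditional pairwise MI that the abstract takes as its starting point can be sketched as follows. This is the standard two-column definition, not the multidimensional (mdMI) extension the paper evaluates. The function name and the toy alignment are assumptions for illustration.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a, col_b: equal-length sequences of residue symbols, one entry
    per aligned sequence. Frequencies are plain counts with no
    pseudocounts or gap handling, which real analyses usually add.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        # Ratio of the joint frequency to the product of the marginals.
        mi += p_ab * log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


# Hypothetical toy alignment: four sequences, four columns.
msa = ["ACDA", "ACDA", "GTDA", "GTEA"]
columns = list(zip(*msa))

# Columns 0 and 1 change together; column 3 is invariant.
print(column_mutual_information(columns[0], columns[1]))
print(column_mutual_information(columns[0], columns[3]))
```

In this toy case the first pair covaries perfectly while the invariant column carries no information about any other column, which is the contrast a covariation map is built on.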

arXiv.org e-Print Archive

Crossref

A Transition Probability Model for Amino Acid Substitutions from Blocks

Author: Andrew Smith
Dayhoff M.R.
Elisabeth R. M. Tillier
Jones D.T.
Krause A.J.
Shalini Veerassamy
Publication venue: 'Mary Ann Liebert Inc'

Crossref

A new, fast algorithm for detecting protein coevolution using maximum compatible cliques

Author: A Rodionov
A Valencia
AK Ramani
Alex Rodionov
Alexandr Bezginov
AM Altenhoff
D MacLeod
D Robinson
Elisabeth RM Tillier
ERM Tillier
F Pazos
GW Clark
J Felsenstein
Jonathan Rose
K Katoh
MK Kuhner
PRJ Östergård
R Jothi
RG Beiko
RM Karp
S Razick
T Sato
V Soria-Carrasco
W Li
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The MatrixMatchMaker (MMM) algorithm was recently introduced to detect the similarity between phylogenetic trees and thus the coevolution between proteins. MMM finds the largest common submatrices between pairs of phylogenetic distance matrices, and has numerous advantages over existing methods of coevolution detection. However, these advantages came at the cost of a very long execution time. Results: In this paper, we show that the problem of finding the maximum submatrix reduces to a multiple maximum clique subproblem on a graph of protein pairs. This allowed us to develop a new algorithm and program implementation, MMMvII, which achieved more than 600× speedup with comparable accuracy to the original MMM. Conclusions: MMMvII will thus allow for more extensive and intricate analyses of coevolution. Availability: An implementation of the MMMvII algorithm is available at: http://www.uhnresearch.ca/labs/tillier/MMMWEBvII/MMMWEBvII.php

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MSH3 polymorphisms and protein levels affect CAG repeat instability in Huntington's disease mice

Author: A Lloret
A Lopez Castel
A Rosenblatt
A Seriola
A Watanabe
AA Fuller
AC Haugen
AM Marcelino
Anne Messer
AV Goula
C Blake
C Chiang
C Kumar
C Savouret
CE Nestor
CE Pearson
Christopher E. Pearson
CJ Otto
CM Venkatachalam
Darren G. Monckton
DG Monckton
DK Chang
E Dragileva
E Taherzadeh-Fard
EG Hutchinson
EL McCallister
Elisabeth R. M. Tillier
EM Ramos
F Coppede
F Morales
G Gourdon
GB Panigrahi
GF Crouse
GG Krivov
GJ Brock
Greg W. Clark
Gregory S. Barsh
H Fu
H Hashida
H Takano
H Telenius
HM Berman
HM Kim
I Holt
IV Kovtun
J Conde
J Du
J Genschel
J Jiricny
JA Ybe
JD Cleary
JL Li
JL Weber
JM Harrington
JM Lee
Jodie P. Simard
JP Linton
JV Olsen
K Katoh
K Manley
K Takano
KE De Rooij
Kevin Manley
KL Burr
L Foiry
L Giunti
L Hubert Jr
L Kennedy
L Mangiarini
L Mollersen
L Tian
LN Johnson
M Clamp
M Gomes-Pereira
M Mangoni
M Swami
Meera Swami
Meghan M. Slean
MH Lamers
MM Slean
NS Wexler
Peggy F. Shelbourne
PF Shelbourne
RM Cowin
RP Chen
RT Libby
S Ku
S Michiels
S Oda
S Tome
SC Vatsavayai
SC Warby
SF Altschul
SJ Littman
SL Martinez
SN Thibodeau
SR Trevino
Stéphanie Tomé
T Kin
V Ezzatizadeh
VC Wheeler
W Kabsch
WJ van Den Broek
WJ van den Broek
X Dong
XY Hauge
Y Lin
Y Watanabe
Y Zhang
YC Hsieh
Z Yang
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Expansions of trinucleotide CAG/CTG repeats in somatic tissues are thought to contribute to ongoing disease progression through an affected individual's life with Huntington's disease or myotonic dystrophy. Broad ranges of repeat instability arise between individuals with expanded repeats, suggesting the existence of modifiers of repeat instability. Mice with expanded CAG/CTG repeats show variable levels of instability depending upon mouse strain. However, to date the genetic modifiers underlying these differences have not been identified. We show that in liver and striatum the R6/1 Huntington's disease (HD) (CAG)~100 transgene, when present in a congenic C57BL/6J (B6) background, incurred expansion-biased repeat mutations, whereas the repeat was stable in a congenic BALB/cByJ (CBy) background. Reciprocal congenic mice revealed the Msh3 gene as the determinant for the differences in repeat instability. Expansion bias was observed in congenic mice homozygous for the B6 Msh3 gene on a CBy background, while the CAG tract was stabilized in congenics homozygous for the CBy Msh3 gene on a B6 background. The CAG stabilization was as dramatic as genetic deficiency of Msh2. The B6 and CBy Msh3 genes had identical promoters but differed in coding regions and showed strikingly different protein levels. B6 MSH3 variant protein is highly expressed and associated with CAG expansions, while the CBy MSH3 variant protein is expressed at barely detectable levels, associating with CAG stability. The DHFR protein, which is divergently transcribed from a promoter shared by the Msh3 gene, did not show varied levels between mouse strains. Thus, naturally occurring MSH3 protein polymorphisms are modifiers of CAG repeat instability, likely through variable MSH3 protein stability. 
Since evidence supports that somatic CAG instability is a modifier and predictor of disease, our data are consistent with the hypothesis that variable levels of CAG instability associated with polymorphisms of DNA repair genes may have prognostic implications for various repeat-associated diseases.

Crossref

Directory of Open Access Journals

PubMed Central

Enlighten

FigShare

Corresponding author:

Author: Andrew D. Smith
Elisabeth R. M. Tillier
Thomas W. H. Lui

(PMB), ribosomal RNA (rRNA), transfer RNA (tRNA), Hidden Markov Model (HMM)
Copyright (c) 2003 Society for Molecular Biology and Evolution
Empirical models of substitution are often used in protein sequence analysis because the large alphabet of amino acids requires that many parameters be estimated in all but the simplest parametric models. When information about structure is used in the analysis of substitutions in structured RNA, a similar situation occurs. The number of parameters necessary to adequately describe the substitution process increases in order to model the substitution of paired bases. We have developed a method to obtain substitution rate matrices empirically from RNA alignments that include structural information in the form of base pairs. Our data consisted of alignments from the European ribosomal RNA database of Bacterial and Eukaryotic Small Subunit and Large Subunit ribosomal RNA (Wuyts et al., 2001a; Wuyts et al., 2002). Using secondary structural information, we converted each sequence in the alignments into

CiteSeerX

BIOINFORMATICS ORIGINAL PAPER doi:10.1093/bioinformatics/btm114 Sequence analysis

Author: Elisabeth R. M. Tillier
Shengzhong Feng

A fast and flexible approach to oligonucleotide probe design for genomes and gene families

CiteSeerX