23 research outputs found

Non-Negative Matrix Factorization for Learning Alignment-Specific Models of Protein Evolution

Author: Ben Murrell
C Kosiol
D Posada
D Robinson
Daniel Kaliski
DC Nickle
DD Lee
DJ Lipman
DT Jones
F Abascal
Gerdus Benade
J Adachi
J Felsenstein
Jan Buys
K Devarajan
Konrad Scheffler
KP Burnham
L Stanfel
Lise du Buisson
MO Dayhoff
MW Dimmic
N Goldman
N Lartillot
Robert Ketteringham
S Whelan
S Zoller
SA Guindon
Sasha Moola
SL Kosakovsky Pond
SQ Le
Thomas Mailund
Thomas Weighill
Tristan Hands
W Delport
Y Cao
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2011
Models of protein evolution currently come in two flavors: generalist and specialist. Generalist models (e.g. PAM, JTT, WAG) adopt a one-size-fits-all approach, where a single model is estimated from a number of different protein alignments. Specialist models (e.g. mtREV, rtREV, HIVbetween) can be estimated when a large quantity of data are available for a single organism or gene, and are intended for use on that organism or gene only. Unsurprisingly, specialist models outperform generalist models, but in most instances there simply are not enough data available to estimate them. We propose a method for estimating alignment-specific models of protein evolution in which the complexity of the model is adapted to suit the richness of the data. Our method uses non-negative matrix factorization (NNMF) to learn a set of basis matrices from a general dataset containing a large number of alignments of different proteins, thus capturing the dimensions of important variation. It then learns a set of weights that are specific to the organism or gene of interest and for which only a smaller dataset is available. Thus the alignment-specific model is obtained as a weighted sum of the basis matrices. Having been constrained to vary along only as many dimensions as the data justify, the model has far fewer parameters than would be required to estimate a specialist model. We show that our NNMF procedure produces models that outperform existing methods on all but one of 50 test alignments. The basis matrices we obtain confirm the expectation that amino acid properties tend to be conserved, and allow us to quantify, on specific alignments, how the strength of conservation varies across different properties. We also apply our new models to phylogeny inference and show that the resulting phylogenies are different from, and have improved likelihood over, those inferred under standard models.
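
The weighted-sum step in the abstract above can be illustrated with a minimal sketch. It assumes the basis matrices are already known and flattened into lists, and it fits non-negative weights with a Lee and Seung style multiplicative update for squared error. The abstract does not state the fitting objective, so squared error is an assumption here, and the names `combine` and `fit_weights` and the toy numbers are hypothetical, not taken from the paper.

```python
def combine(basis, weights):
    """Weighted sum of flattened basis matrices, entry by entry."""
    n = len(basis[0])
    return [sum(w * b[i] for w, b in zip(weights, basis)) for i in range(n)]


def fit_weights(basis, target, iters=500, eps=1e-12):
    """Fit non-negative weights so that combine(basis, w) approximates target.

    Assumption: squared-error objective with multiplicative updates, which
    keep every weight non-negative when all inputs are non-negative.
    """
    w = [1.0] * len(basis)
    for _ in range(iters):
        approx = combine(basis, w)
        # Simultaneous update: every weight uses the same current approximation.
        w = [
            wj * sum(b * t for b, t in zip(bj, target))
            / (sum(b * a for b, a in zip(bj, approx)) + eps)
            for wj, bj in zip(w, basis)
        ]
    return w


# Toy data (hypothetical): two overlapping basis "matrices" of four entries.
basis = [[1.0, 0.5, 0.0, 0.5], [0.0, 0.5, 1.0, 0.5]]
target = combine(basis, [2.0, 3.0])   # data generated from known weights
weights = fit_weights(basis, target)  # only len(basis) parameters are fitted
model = combine(basis, weights)       # alignment-specific model
```

Only one weight per basis matrix is estimated from the small dataset, which is how such a model can have far fewer free parameters than a full specialist matrix.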

Public Library of Science (PLOS)

Cape Town University OpenUCT

Crossref

Directory of Open Access Journals

PubMed Central

Stellenbosch University SUNScholar Repository

CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences

Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have independent rates of substitution offer improved fit over single rate models. However, these approaches have been limited by the necessity for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of rate classes, where the number of classes is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths, are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene- and organism-specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes.
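
The idea of assigning amino acid pairs to shared rate classes can be sketched as a small data structure plus one GA-style mutation step. This is only an illustration under stated assumptions: the function names (`pair_key`, `pair_rate`, `mutate`), the example pairs and the rate values are hypothetical, and the likelihood evaluation and selection steps of the actual method are not shown.

```python
import random


def pair_key(a, b):
    """Unordered amino acid pair, so (A, V) and (V, A) share one entry."""
    return tuple(sorted((a, b)))


def pair_rate(a, b, assignment, class_rates):
    """Non-synonymous rate for exchanging a and b: the rate of the pair's class."""
    return class_rates[assignment[pair_key(a, b)]]


def mutate(assignment, n_classes, rng):
    """Return a copy in which one randomly chosen pair is reassigned to a random class."""
    child = dict(assignment)
    pair = rng.choice(sorted(child))
    child[pair] = rng.randrange(n_classes)
    return child


# Hypothetical example: three pairs, two rate classes (fast and slow exchange).
assignment = {
    pair_key("A", "V"): 0,
    pair_key("K", "R"): 0,
    pair_key("D", "W"): 1,
}
class_rates = [1.0, 0.1]
candidate = mutate(assignment, n_classes=2, rng=random.Random(0))
```

In a full search, each candidate assignment would be scored by fitting the remaining model parameters and comparing model fit, with the better candidates kept for the next generation.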

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

TUScholarShare (Temple University)

PubMed Central

Cronfa at Swansea University

Stellenbosch University SUNScholar Repository

The Francis Crick Institute

Bayesian data mining of protein domains gives an efficient predictive algorithm and new insight

Author: AG Murzin
CA Orengo
EG Hutchinson
HM Berman
J Sim
LE Stanfel
LJ McGuffin
M Suyama
MR Batencourt
R Sowdhamini
RA George
Rajani R. Joshi
RL Marsden
RR Joshi
S Jyothi
SJ Wheelan
T Tanaka
Vivekanand V. Samant
W Kabsch
Publication venue: Springer Science and Business Media LLC
Crossref

Tuberous sclerosis complex–mediated mTORC1 overactivation promotes age-related hearing loss

Author: Aizhen Zhang
Deosaran
Ding
Ebnoether
Fu
Fujimoto
Haibo Wang
Hine
Jianfeng Li
Jiangang Gao
Kim
Le
Lili Yang
Linqing Zhang
Renjie Chai
Sha
Stanfel
Xiangguo Liu
Xiaochun Bai
Xiaolong Fu
Xiaoyang Sun
Yamasoba
Yecheng Jin
Zheng
Publication venue: American Society for Clinical Investigation
Crossref

Fast prediction of protein domain boundaries using conserved local patterns

Author: AV Tendulkar
EG Hutchinson
J Sim
LE Stanfel
LJ McGuffin
M Suyama
MR Batencourt
R Sowdhamini
Rajani R. Joshi
RL Marsden
RR Joshi
S Jones
S Jyothi
SJ Wheelan
T Tanaka
Vivekanand V. Samant
Publication venue: Springer Science and Business Media LLC
Crossref

Gene expression analysis of mTOR pathway: association with human longevity

Author: Barzilai
Brown
Deelen
Derhovanessian
Dowling
Fang
Fontana
Garcia-Martinez
Goeman
Guertin
Harrison
Heijmans
Holloszy
Ikenoue
Kaeberlein
Kapahi
Koch
Krebs
Kumar
Lamming
Laplante
Le
Liu
Livak
Morath
Newgard
Passtoors
Polak
Purcell
Rozing
Sarbassov
Schoenmaker
Sengupta
Soukas
Spindler
Stanfel
StataCorp
Strimpakos
Thedieck
Vellai
Westendorp
Wijsman
Yap
Publication venue: Wiley
Publication date: 16/10/2012
mTOR signalling is implicated in the development of disease and in lifespan extension in model organisms. This pathway has been associated with human diseases such as diabetes and cancer, but has not been investigated for its impact on longevity per se. Here, we investigated whether transcriptional variation within the mTOR pathway is associated with human longevity using whole-blood samples from the Leiden Longevity Study. This is a unique cohort of Dutch families with extended survival across generations, decreased morbidity and beneficial metabolic profiles in middle age. By comparing mRNA levels of nonagenarians and middle-aged controls, the mTOR signalling gene set was found to associate with old age (P = 4.6×10⁻⁷). Single gene analysis showed that seven of 40 mTOR pathway genes had a significant differential expression of at least 5%. Of these, the RPTOR (Raptor) gene was found to be differentially expressed also when the offspring of nonagenarians were compared with their spouses, indicating association with familial longevity in middle age. This association was not explained by variation between the groups in the prevalence of type 2 diabetes and cancer or glucose levels. Thus, the mTOR pathway not only plays a role in the regulation of disease and aging in animal models, but also in human health and longevity.

Crossref

VU Research Portal

Leiden University Scholarly Publications

MPG.PuRe