arXiv.org e-Print Archive

Interplay between pleiotropy and secondary selection determines rise and fall of mutators in stress response

Author: A Giraud
A Sali
AC Shaver
B Li
CS Wylie
DK Klimov
E Dekel
E Denamur
E Denamur
E Shakhnovich
E Shakhnovich
EI Shakhnovich
EI Shakhnovich
EJ Deeds
ES Slechta
Eugene I. Shakhnovich
F Taddei
G Feng
GM Suel
HC Tsui
I Bjedov
I Matic
IN Berezovsky
JAGM de Visser
JB Andre
JD Bloom
JM Raser
JW Drake
KB Zeldovich
KB Zeldovich
KB Zeldovich
KK Yan
KL Huisinga
KP Bjornson
L Lopez-Maury
M Heo
M Wrande
MB Elowitz
MJ Schofield
ML Mendillo
Muyoung Heo
N Rosenfeld
O Tenaillon
O Tenaillon
PD Sniegowski
PD Sniegowski
PJ Gerrish
PL Foster
R Du
RE Lenski
RR Iyer
RS Galhardo
RS Harris
Rustom Antia
S Maslov
S Miyazawa
WA Rosche
WJ Blake
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 23/12/2009

A dramatic rise of mutators has been found to accompany the adaptation of bacteria in response to many kinds of stress. Two views on the evolutionary origin of this phenomenon have emerged: the pleiotropic hypothesis, which posits that it is a byproduct of environmental stress or other specific stress response mechanisms, and second-order selection, which states that mutators hitchhike to fixation with unrelated beneficial alleles. Conventional population genetics models could not fully resolve this controversy because they are based on certain assumptions about the fitness landscape. Here we address this problem using a microscopic multiscale model, which couples physically realistic molecular descriptions of proteins and their interactions with population genetics of carrier organisms without assuming any a priori fitness landscape. We found that both pleiotropy and second-order selection play a crucial role at different stages of adaptation: the supply of mutators is provided through destabilization of error correction complexes or fluctuations of production levels of prototypic mismatch repair proteins (pleiotropic effects), while rise and fixation of mutators occur when there is a sufficient supply of beneficial mutations in replication-controlling genes. This general mechanism assures a robust and reliable adaptation of organisms to unforeseen challenges. This study highlights the physical principles underlying biological mechanisms of stress response and adaptation.

Digital Repository @ Iowa State University (ISU)

Use of machine learning algorithms to classify binary protein sequences as highly-designable or poorly-designable

Author: A Kloczkowski
A Kloczkowski
A Kloczkowski
A Kloczkowski
AJ Guttmann
AM Gutin
Andrzej Kloczkowski
B Shakhnovich
C Cejtin
CL Dias
DG Covell
E Shakhnovich
EI Shakhnovich
EI Shakhnovich
EI Shakhnovich
GM Crippen
H Li
H Li
H Li
HS Chan
HS Chan
HS Chan
I Jensen
IH Witten
IN Berezovsky
IN Berezovsky
J des Cloizeaux
JL England
JR Quinlan
K Yue
M Peto
M Peto
ML Mansfield
MR Ejtehadi
MR Ejtehadi
Myron Peto
N Madras
N Wingreen
Robert L Jernigan
TG Schmalz
V Shahrezaei
V Shahrezaei
Vasant Honavar
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: By using a standard Support Vector Machine (SVM) with a Sequential Minimal Optimization (SMO) method of training, Naïve Bayes, and other machine learning algorithms, we are able to distinguish between two classes of protein sequences: those folding to highly-designable conformations and those folding to poorly- or non-designable conformations. Results: First, we generate all possible compact lattice conformations for the specified shape (a hexagon or a triangle) on the 2D triangular lattice. Then we generate all possible binary hydrophobic/polar (H/P) sequences and, using a specified energy function, thread them through all of these compact conformations. If for a given sequence the lowest energy is obtained for a particular lattice conformation, we assume that this sequence folds to that conformation. Highly-designable conformations have many H/P sequences folding to them, while poorly-designable conformations have few or no H/P sequences. We classify sequences as folding to either highly- or poorly-designable conformations. We randomly selected subsets of the sequences belonging to highly-designable and poorly-designable conformations and used them to train several different standard machine learning algorithms. Conclusion: By using these machine learning algorithms with ten-fold cross-validation, we are able to classify the two classes of sequences with high accuracy, in some cases exceeding 95%.
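
The enumerate-and-thread procedure in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a 3x3 square lattice instead of the hexagon or triangle on the 2D triangular lattice, and a simple energy of -1 per non-bonded H-H contact instead of the paper's unspecified energy function. Conformations are identified by their contact maps, and a sequence is counted for a conformation only when that conformation is its unique lowest-energy state. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

SIDE = 3  # assumption: small square lattice, not the paper's triangular lattice


def compact_contact_maps(side=SIDE):
    """Enumerate all walks that fill the side x side grid and return their
    distinct contact maps (sets of non-bonded neighbouring chain positions)."""
    n = side * side
    maps = set()

    def extend(path, visited):
        if len(path) == n:
            index = {site: i for i, site in enumerate(path)}
            contacts = set()
            for (x, y), i in index.items():
                # Looking only right and up counts each lattice edge once.
                for nb in ((x + 1, y), (x, y + 1)):
                    j = index.get(nb)
                    if j is not None and abs(i - j) > 1:
                        contacts.add((min(i, j), max(i, j)))
            maps.add(frozenset(contacts))
            return
        x, y = path[-1]
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nb[0] < side and 0 <= nb[1] < side and nb not in visited:
                visited.add(nb)
                path.append(nb)
                extend(path, visited)
                path.pop()
                visited.remove(nb)

    for start in product(range(side), repeat=2):
        extend([start], {start})
    return sorted(maps, key=sorted)


def energy(seq, contacts):
    """Assumed energy function: -1 for every non-bonded H-H contact."""
    return -sum(1 for i, j in contacts if seq[i] == "H" and seq[j] == "H")


def designability(side=SIDE):
    """Thread every H/P sequence through every conformation and count, per
    conformation, the sequences that have it as their unique ground state."""
    confs = compact_contact_maps(side)
    counts = Counter()
    for seq in product("HP", repeat=side * side):
        energies = [energy(seq, c) for c in confs]
        best = min(energies)
        if energies.count(best) == 1:  # skip degenerate ground states
            counts[energies.index(best)] += 1
    return confs, counts


if __name__ == "__main__":
    confs, counts = designability()
    for k, contacts in enumerate(confs):
        print(k, sorted(contacts), counts.get(k, 0))
```

Sequences labelled by the designability of their ground-state conformation could then serve as training examples for a classifier, which is the step the abstract goes on to describe.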

Springer - Publisher Connector

Public Library of Science (PLOS)

Mutation Bias Favors Protein Folding Stability in the Evolution of Small Populations

Author: A Babajide
A Godzik
A Muto
B Derrida
CO Wilke
D Graur
DJ Lambert
DM Taverna
DM Taverna
E Bornberg-Bauer
E Denamur
E Duarte
E Loh
E van Nimwegen
EI Shakhnovich
EP Rocha
Eugene I. Shakhnovich
FLatorre Silva
G Bernardi
G D'Onofrio
G Parisi
G Sella
G Tiana
GI Peterson
H Musto
H Naya
HJ Bussemaker
HJ Muller
IS Novella
J Berg
J Clune
JD Bloom
JD Bloom
JD Bloom
JP McCutcheon
JW Drake
KB Zeldovich
Kettler
LA Mirny
M dos Reis
M Eigen
M Hasegawa
M Kimura
M Kimura
M Nilsson
MA DePristo
MA Fares
MA Fares
Markus Porto
Miriam Fritsche
N Petit
N Sueoka
NA Moran
NV Dokholyan
OG Berg
P Chen
PM Sharp
R Durrett
R Guerois
RA Fisher
Raul Mendez
RC van Ham
S Govindarajan
Scanlan
SG Wright
SL Chen
SY Ho
T Banerjee
T Itoh
T Ohta
U Bastolla
U Bastolla
U Bastolla
U Bastolla
U Bastolla
Ugo Bastolla
V Daubin
VN Uversky
W Kauzmann
Y Brumer
Publication venue: Public Library of Science
Publication date: 01/05/2010

Mutation bias in prokaryotes varies from extreme adenine and thymine (AT) in obligatory endosymbiotic or parasitic bacteria to extreme guanine and cytosine (GC), for instance in actinobacteria. GC mutation bias deeply influences the folding stability of proteins, making proteins on average less hydrophobic and therefore less stable with respect to unfolding but also less susceptible to misfolding and aggregation. We study a model where proteins evolve subject to selection for folding stability under given mutation bias, population size, and neutrality. We find a non-neutral regime where, for any given population size, there is an optimal mutation bias that maximizes fitness. Interestingly, this optimal GC usage is small for small populations, large for intermediate populations, and around 50% for large populations. This result is robust with respect to the definition of the fitness function and to the protein structures studied. Our model suggests that small populations evolving with small GC usage eventually accumulate a significant selective advantage over populations evolving without this bias. This provides a possible explanation for the observation that most species adopting obligatory intracellular lifestyles, with a consequent reduction of effective population size, shifted their mutation spectrum towards AT. The model also predicts that large GC usage is optimal for intermediate population size. To test these predictions we estimated the effective population sizes of bacterial species using the optimal codon usage coefficients computed by dos Reis et al. and the synonymous to non-synonymous substitution ratio computed by Daubin and Moran. We found that the population sizes estimated in these ways are significantly smaller for species with small and large GC usage compared to species with no bias, which supports our prediction.

Digital.CSIC

Optimal enumeration of state space of finitely buffered stochastic molecular networks and exact computation of steady state landscape probability

Author: A Arkin
A Sali
A Samant
B Munsky
D Hawley
D Hawley
D Schultz
D Volfson
DK Klimov
DT Gillespie
DT Gillespie
EI Shakhnovich
EM Ozbudak
H Salis
HH McAdams
J Hasty
J Little
J Paulsson
JE Hornos
Jie Liang
JT Mettetal
KA Dill
KA Dill
KY Kim
M Li
M Samoilov
MD Levin
ND Socci
NG Van Kampen
NI Markevich
P Ao
R Lehoucq
S Kachalo
SB Ozkan
T Kepler
T Zhou
TH Cormen
TL Hill
XM Zhu
Y Cao
Y Morishita
Y Morishita
Youfang Cao
Publication venue: 'Springer Science and Business Media LLC'

Mathematical modeling and comparison of protein size distribution in different plant, animal, fungal and microbial species reveals a negative correlation between protein size and protein number, thus providing insight into the evolution of proteomes

Author: A Gierlik
A McLachlan
AA Gimelbrant
AG Murzin
AS Warren
Axel Tiessen
C Blake
C Chothia
DL Nelson
E Mayr
EI Shakhnovich
F Lewis
GA Petsko
H Akaike
H Akashi
H Naora
H Seligmann
J Darnell
J Kiraga
JL Oliver
JM Chandonia
JV White
JZ Zhang
KA Dill
KF Lau
L Aravind
L Brocchieri
LH Chen
Luis José Delaye-Arredondo
M Nei
M Schlegela
MJ Denton
P Mackiewicz
Paulino Pérez-Rodríguez
R Development Core Team
R Dorit
R Holmquist
R Jain
R Kolodny
RL Dorit
RV Eck
S Cebrat
S Nandi
S Sommer
SH White
SH White
SH White
SH White
SM Ross
TH Jukes
TH Jukes
TO Yeates
WN Venables
Y Zhang
Publication venue: BioMed Central
Publication date: 01/01/2012

Background: The sizes of proteins are relevant to their biochemical structure and to their biological function. The statistical distribution of protein lengths across a diverse set of taxa can provide hints about the evolution of proteomes. Results: Using the full genomic sequences of over 1,302 prokaryotic and 140 eukaryotic species, two datasets containing 1.2 and 6.1 million proteins were generated and analyzed statistically. The lengthwise distribution of proteins can be roughly described with a gamma-type or log-normal model, depending on the species. However, the shape parameter of the gamma model does not have a fixed value of 2, as previously suggested, but varies between 1.5 and 3 in different species. A gamma model with unrestricted shape parameter best described the distributions in ~48% of the species, whereas the log-normal distribution better described the observed protein sizes in 42% of the species. The restricted gamma function and the sum of exponentials distribution fit better in only ~5% of the species. Eukaryotic proteins have an average size of 472 aa, whereas bacterial (320 aa) and archaeal (283 aa) proteins are significantly smaller (33-40% on average). Average protein sizes in different phylogenetic groups were: Alveolata (628 aa), Amoebozoa (533 aa), Fornicata (543 aa), Placozoa (453 aa), Eumetazoa (486 aa), Fungi (487 aa), Stramenopila (486 aa), Viridiplantae (392 aa). Amino acid composition is biased according to protein size. Protein length correlated negatively with %C, %M, %K, %F, %R, %W, %Y and positively with %D, %E, %Q, %S and %T. Prokaryotic proteins had a different protein size bias for %E, %G, %K and %M as compared to eukaryotes. Conclusions: Mathematical modeling of empirical protein length distributions can be used to assess the quality of small ORF annotation in genomic releases (detection of too many false positive small ORFs). There is a negative correlation between average protein size and total number of proteins among eukaryotes but not in prokaryotes. The %GC content is positively correlated with total protein number and protein size in prokaryotes but not in eukaryotes. Small proteins have a different amino acid bias than larger proteins. Compared to prokaryotic species, the evolution of eukaryotic proteomes was characterized by increased protein number (massive gene duplication) and substantial changes of protein size (domain addition/subtraction).
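
The comparison of candidate length distributions described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis. It fits only two of the models mentioned (the gamma model with shape restricted to 2 and the log-normal model), because both have closed-form maximum-likelihood estimates. It ranks them by the Akaike information criterion, which is an assumed choice of criterion. The protein lengths are synthetic, and all function names are hypothetical.

```python
import math
import random


def fit_lognormal(lengths):
    """Maximum-likelihood log-normal fit: mean and std of the log-lengths."""
    logs = [math.log(x) for x in lengths]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    loglik = sum(
        -math.log(x * sigma * math.sqrt(2 * math.pi))
        - (math.log(x) - mu) ** 2 / (2 * sigma ** 2)
        for x in lengths
    )
    # Two free parameters (mu, sigma).
    return {"mu": mu, "sigma": sigma, "loglik": loglik, "aic": 2 * 2 - 2 * loglik}


def fit_gamma_fixed_shape(lengths, shape=2.0):
    """Maximum-likelihood gamma fit with the shape held fixed: scale = mean / shape."""
    n = len(lengths)
    scale = sum(lengths) / (n * shape)
    loglik = sum(
        (shape - 1) * math.log(x) - x / scale
        - shape * math.log(scale) - math.lgamma(shape)
        for x in lengths
    )
    # One free parameter (scale).
    return {"shape": shape, "scale": scale, "loglik": loglik, "aic": 2 * 1 - 2 * loglik}


def synthetic_lengths(n=2000, mu=5.5, sigma=1.0, seed=0):
    """Synthetic stand-in for one species' protein lengths (not real data)."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


if __name__ == "__main__":
    data = synthetic_lengths()
    for name, fit in (("log-normal", fit_lognormal(data)),
                      ("gamma, shape 2", fit_gamma_fixed_shape(data))):
        print(name, round(fit["aic"], 1))
```

The model with the lower AIC is preferred for that species. Fitting the unrestricted gamma model would additionally require a numerical solve for the shape parameter, which is omitted here.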

Springer - Publisher Connector

Public Library of Science (PLOS)

Protein 3D Structure Computed from Evolutionary Sequence Variation

Author: A Kryshtafovych
A Roy
A Schug
A Zemla
AA Fodor
AF Poon
AF Poon
Andrea Pagnani
Andrej Sali
AP Kamat
AR Ortiz
AR Ortiz
ASGB Lapedes
AT Brunger
B Reva
BG Giraud
C Chothia
Chris Sander
CS Miller
D Altschuh
D Altschuh
D Cozzetto
DE Kim
DE Shaw
Debora S. Marks
E Neher
E Schneidman
EI Shakhnovich
F Morcos
G Kolesov
H Fehlhammer
HRFB Kappen
IN Shindyalov
J DeBartolo
J Moult
J Moult
J Moult
J Qiu
J Skolnick
JM Duarte
JM Skerker
JS Yang
JW Locasale
KT Simons
L Burger
L Burger
L Holm
Lucy J. Colwell
M Mezard
M Miyano
M Vendruscolo
M Weigt
MMT Mezard
N Halabi
N Siew
P Bradley
P Bradley
P Fariselli
P Joost
PMJW Ravikumar
R Das
R Nair
R Sathyapriya
RD Finn
Riccardo Zecchina
RO Dror
Robert Sheridan
S Raman
S Raman
S Wu
S Wu
S Yooseph
SD Dunn
T Mora
TF Havel
Thomas A. Hopf
TR Lezon
TR Lezon
U Göbel
V Morea
VMR Sessak
WP Russ
WR Atchley
WR Taylor
WR Taylor
Y Duan
Y Zhang
Y Zhang
YJAH Roudi
Publication venue: Public Library of Science
Publication date: 01/01/2011

The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Development of a knowledge-based potential for crystals of small organic molecules: Calculation of energy surfaces for C=O···H-N hydrogen bonds

Author: DeWitte RS
Grzybowski BA
Ishchenko AV
Shakhnovich EI
Whitesides GM
Publication venue: 'American Chemical Society (ACS)'
Publication date: 01/08/2000

This paper describes the derivation of a knowledge-based potential for intermolecular interactions from the statistical information stored in the Cambridge Structural Database. We develop a statistical mechanical method that relates the occurrences of intermolecular contacts in the database to their energies. Our approach allows us to quantify (in the form of energy) the geometrical preferences of interactions. We use our method to construct energy maps for a hydrogen bond between carbonyl oxygen and amino hydrogen. Our results demonstrate high orientational selectivity of this type of hydrogen bonding.
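
One common way to relate contact occurrences to energies is inverse Boltzmann weighting, E = -kT ln(p_observed / p_reference). The sketch below illustrates that generic technique only. The abstract does not state the paper's actual derivation, so the formula, the reference-state choice, the pseudocount, and the function name are all assumptions made for illustration.

```python
import math


def knowledge_based_energies(observed, reference, kT=1.0, pseudocount=1e-9):
    """Convert per-bin contact counts into energies by inverse Boltzmann
    weighting (assumed form): E = -kT * ln(p_observed / p_reference).

    observed  : contact counts per geometric bin (e.g. distance or angle bin)
    reference : counts expected in the same bins without any interaction
    """
    total_obs = sum(observed)
    total_ref = sum(reference)
    energies = []
    for n_obs, n_ref in zip(observed, reference):
        p_obs = n_obs / total_obs
        p_ref = n_ref / total_ref
        # The pseudocount avoids log(0) for empty bins.
        energies.append(-kT * math.log((p_obs + pseudocount) / (p_ref + pseudocount)))
    return energies


if __name__ == "__main__":
    # Hypothetical counts: the third bin is over-represented, so it is favourable.
    print(knowledge_based_energies([10, 30, 60], [1, 1, 1]))
```

Bins that occur more often than the reference predicts receive negative (favourable) energies, and under-represented bins receive positive ones. An energy map over two geometric variables would apply the same conversion to a two-dimensional histogram.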

ScholarWorks@UNIST

Conserved residues and the mechanism of protein folding

Author: A Fersht
A Kolinski
A Matouschek
A Matouschek
A Sali
A Sali
AM Gutin
AM Gutin
AR Viguera
BB Kragelund
C Sander
DE Otzen
EI Shakhnovich
EI Shakhnovich
EI Shakhnovich
EI Shakhnovich
JU Bowie
L Itzhaki
M-H Hao
N Metropolis
P Alexander
PG Wolynes
R Goldstein
S Miyazawa
SE Jackson
SE Jackson
T Schindler
TR Sosnick
VI Abkevich
VI Abkevich
Z Guo
Publication venue: 'Springer Science and Business Media LLC'

Cosmology and proteins: landscape of possibilities

Author: C Dobson
Collin M. Stultz
EI Shakhnovich
JN Onuchic
L Susskind
Publication venue: 'Springer Science and Business Media LLC'