Rational Design of Temperature-Sensitive Alleles Using Computational Structure Prediction

Author: B Cunningham
B Lee
C Cortes
Ca Rohl
Christopher S. Poultney
CJ Burges
David Gresham
Dennis E. Shasha
EH Kellogg
G Chakshusmathi
Glenn L. Butterfoss
HM Muller
JM Word
JR Quinlan
K Bajaj
K Drew
KD Pruitt
Kevin Drew
Kristin C. Gunsalus
M Hall
Michelle R. Gutwein
N Eswar
N Siew
R Varadarajan
Richard Bonneau
RJ Dohmen
S Tweedie
SF Altschul
TW Harris
Vladimir N. Uversky
WS Noble
WS Sandberg
Publication venue: Public Library of Science
Publication date: 02/09/2011

Temperature-sensitive (ts) mutations are mutations that exhibit a mutant phenotype at high or low temperatures and a wild-type phenotype at normal temperature. Temperature-sensitive mutants are valuable tools for geneticists, particularly in the study of essential genes. However, finding ts mutations typically relies on generating and screening many thousands of mutations, which is an expensive and labor-intensive process. Here we describe an in silico method that uses Rosetta and machine learning techniques to predict a highly accurate “top 5” list of ts mutations given the structure of a protein of interest. Rosetta is a protein structure prediction and design code, used here to model and score how proteins accommodate point mutations with side-chain and backbone movements. We show that integrating Rosetta relax-derived features with sequence-based features results in accurate temperature-sensitive mutation predictions.

Public Library of Science (PLOS)

Crossref

PubMed Central

Sites of predicted temperature-sensitive mutations.

Author: Christopher S. Poultney (345054)
David Gresham (148568)
Dennis E. Shasha (345059)
Glenn L. Butterfoss (345055)
Kevin Drew (345057)
Kristin C. Gunsalus (345058)
Michelle R. Gutwein (345056)
Richard Bonneau (4318)

The crystal structure of one domain of yeast calmodulin is shown in cartoon representation in green. Residues in the hydrophobic core are shown as green sticks, and hydrophobic core residues with predicted ts mutations are shown in purple. Of the top 20 predictions on calmodulin (10 each from SVM-LIN and SVM-RBF), 15 occur at these six sites.

FigShare

Typical ensembles of structures produced by Rosetta relax runs for calmodulin.

Author: Christopher S. Poultney (345054)
David Gresham (148568)
Dennis E. Shasha (345059)
Glenn L. Butterfoss (345055)
Kevin Drew (345057)
Kristin C. Gunsalus (345058)
Michelle R. Gutwein (345056)
Richard Bonneau (4318)

Shown here are structures generated by Rosetta relax runs, which allow protein structures to “relax” to a lower energy state. The starting structure, one domain of yeast calmodulin, is shown in green, and the generated structures are shown in gray, with runs starting from the native structure on the left and runs starting from a mutant (F89I) on the right. The mutated site is shown in red in the mutant structure. The wild-type (wt) ensemble varies less than the mutant ensemble, both relative to the starting structure and within the ensemble. The differences between wild-type and mutant ensembles are quantified by comparing distributions of Rosetta score terms.

FigShare

Quartile method for comparing distributions of Rosetta score terms.

Author: Christopher S. Poultney (345054)
David Gresham (148568)
Dennis E. Shasha (345059)
Glenn L. Butterfoss (345055)
Kevin Drew (345057)
Kristin C. Gunsalus (345058)
Michelle R. Gutwein (345056)
Richard Bonneau (4318)

Quartiles 1–3 (Q1–Q3) were calculated for the mutant ensemble distribution (top) of the omega score term, which measures deviation of the omega bond angle from its ideal value. Q1–Q3 are indicated by red lines, with the corresponding values above and percentiles below. The mutant Q1–Q3 values were then mapped to locations in the wild-type (wt) ensemble distribution (bottom). Q1–Q3 of the mutant distribution are again indicated by red lines, with their percentiles relative to the wt distribution shown below. Wild-type ensemble Q1–Q3 are shown in blue for reference.
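The mapping described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function name `quartile_features`, the "inclusive" quartile convention, the "less than or equal" percentile definition, and the score values are all hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def quartile_features(mutant_scores, wt_scores):
    """Map mutant quartiles Q1-Q3 to percentiles of the wild-type ensemble."""
    # Q1, Q2, Q3 of the mutant ensemble for one Rosetta score term.
    q1, q2, q3 = quantiles(mutant_scores, n=4, method="inclusive")
    wt_sorted = sorted(wt_scores)

    def percentile_in_wt(value):
        # Percentage of wild-type scores less than or equal to `value`
        # (an assumed percentile convention).
        return 100.0 * bisect_right(wt_sorted, value) / len(wt_sorted)

    return {
        "Q1": percentile_in_wt(q1),
        "Q2": percentile_in_wt(q2),
        "Q3": percentile_in_wt(q3),
    }


# Hypothetical score values for one term; not data from the paper.
wt = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9]
mutant = [1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3]
features = quartile_features(mutant, wt)
```

When the mutant ensemble is shifted toward higher scores, as in this toy input, its quartiles land in the upper tail of the wt distribution; identical ensembles would give percentiles near 25, 50, and 75.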

FigShare

Classifier Performance.

Author: Christopher S. Poultney (345054)
David Gresham (148568)
Dennis E. Shasha (345059)
Glenn L. Butterfoss (345055)
Kevin Drew (345057)
Kristin C. Gunsalus (345058)
Michelle R. Gutwein (345056)
Richard Bonneau (4318)

Receiver operating characteristic (ROC) curves are shown for SVM-LIN, SVM-RBF, and SVM-seq (an RBF classifier trained only on sequence data). Each curve plots true positive rate (tpr) against false positive rate (fpr); the reference line for random classification is shown in gray. The distance between each classifier curve and the reference line shows the improvement of our method over random classification. The steep slope at the lower left of the classifier curves indicates that the highest-ranked predictions are most likely to be accurate for all three classifiers. Area under curve: SVM-LIN = 0.713, SVM-RBF = 0.734, SVM-seq = 0.563.
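For readers unfamiliar with the area-under-curve figure quoted above, the sketch below computes it from classifier scores using the standard pairwise (Mann-Whitney) formulation: the fraction of positive/negative pairs in which the positive example receives the higher score. The function name `roc_auc`, the labels, and the scores are illustrative assumptions, not values or code from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = ts, 0 = non-ts) and classifier scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
```

A value of 0.5 corresponds to the gray reference line (random ranking) and 1.0 to a perfect ranking.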

FigShare

Rosetta score terms and derived features.

Author: Christopher S. Poultney (345054)
David Gresham (148568)
Dennis E. Shasha (345059)
Glenn L. Butterfoss (345055)
Kevin Drew (345057)
Kristin C. Gunsalus (345058)
Michelle R. Gutwein (345056)
Richard Bonneau (4318)

Rosetta score terms and descriptions. Three features were derived from each Rosetta score term, denoted by suffix Q1, Q2, or Q3, based on mutant distribution quartiles 1–3 as described in Methods (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0023947#s2) and Fig. 5 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0023947#pone-0023947-g005). Superscripts denote feature groups removed from the final training set: (1) removed due to high correlation with other feature(s); (2) always zero.

FigShare

SVM-RBF parameter space.

Author: Christopher S. Poultney (345054)
David Gresham (148568)
Dennis E. Shasha (345059)
Glenn L. Butterfoss (345055)
Kevin Drew (345057)
Kristin C. Gunsalus (345058)
Michelle R. Gutwein (345056)
Richard Bonneau (4318)

SVM-RBF precision on the ts class is shown as a function of the two SVM-RBF parameters. Values shown are the mean across the five leave-out CV runs, and range from 0.5822 to 0.788. Blue circles indicate the parameter values yielding the highest ts precision for each of the five leave-out CV runs. The final median parameter values are indicated by the black cross. While the optimum parameter values across the five leave-out CV runs differ, they are all located along the “valley” of high precision that is visible running from upper right to lower left, indicating that multiple combinations of parameter values lead to classifiers having similarly good performance.
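The selection step described here (best parameter pair per CV run, then a median across runs) can be sketched as follows. Several things are assumptions: the parameter symbols are missing from this listing, so the pair is taken to be the usual RBF-SVM cost `c` and kernel width `gamma`; the median is taken on the raw values rather than, say, in log space; and the precision grids are invented for illustration.

```python
from statistics import median


def best_params(grid):
    """Return the (c, gamma) pair with the highest ts precision in one CV run."""
    return max(grid, key=grid.get)


def median_params(runs):
    """Take the per-parameter median of the best pair from each CV run."""
    best = [best_params(grid) for grid in runs]
    return median(c for c, _ in best), median(g for _, g in best)


# Hypothetical precision grids, one per leave-out CV run,
# keyed by an assumed (c, gamma) pair.
runs = [
    {(1, 0.01): 0.70, (10, 0.001): 0.75, (100, 0.0001): 0.72},
    {(1, 0.01): 0.74, (10, 0.001): 0.71, (100, 0.0001): 0.69},
    {(1, 0.01): 0.66, (10, 0.001): 0.70, (100, 0.0001): 0.73},
    {(1, 0.01): 0.68, (10, 0.001): 0.76, (100, 0.0001): 0.71},
    {(1, 0.01): 0.65, (10, 0.001): 0.72, (100, 0.0001): 0.77},
]
final_c, final_gamma = median_params(runs)
```

In this toy input the per-run optima differ but all sit on the same diagonal of the grid, which mirrors the "valley" behavior the caption describes.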

FigShare

Developmental dynamics of gene expression and alternative polyadenylation in the Caenorhabditis elegans germline

Author: A Brummer
A Wilczynska
AA Mueller
AC Jungkamp
AF Severson
AM Kershner
AR Gruber
AR Jones
B Thompson
B Tian
B Utama
BJ Lesch
C Cheadle
C Frokjaer-Jensen
C Mayr
C Merritt
C Scheckel
C Yu
CC Lee
CC MacDonald
CE Holt
CH Jan
D Korčeková
D Liu
David Aristizábal-Corrales
DC Di Giammartino
Desirea Mecenas
DJ Frank
E Beaudoing
E Kaymak
E Rogers
EC Lai
EM Hedgecock
ET Wang
F Ozsolak
Fabio Piano
H Mi
H Racher
HO Iwakawa
I Gupta
I Loedige
I Taniguchi
I Ulitsky
J Kimble
J Li
JA Waddle
JD Laver
JE Wright
JG Hardy
K Okonechnikov
KC Martin
Kristin C. Gunsalus
LR Baugh
LW Lee
M Kertesz
M Mangone
M Nousch
M Stoeckius
MA Ortiz
MI Love
Michelle Gutwein
N Suh
P Miura
P Smibert
PA Pinto
PJ Shepard
R Elkon
R Francis
R Voutev
S Haenni
S Kuersten
S Robertson
S Strome
Sean M. West
SH Stubbs
SI Bukhari
SL Crittenden
SM Blazie
T Huang
TD Schmittgen
V Reinke
W Li
WG Kelly
X Li
X Ma
X Wang
X Wu
Y Li
Y Shen
Y Shi
YH Yi
YQ Su
Z Ji
Publication venue: Springer Science and Business Media LLC

Crossref