936 research outputs found

Predicting Positive p53 Cancer Rescue Regions Using Most Informative Positive (MIP) Active Learning

Author: A Friedler
A Petitjean
A Ventura
AC Joerger
AC Martin
AL Cuff
AN Bullock
AR Fersht
BG Buchanan
CL Brooks
DA Case
DA Cohn
EF Pettersen
F Francois
F Glaser
G Dantas
G. Wesley Hatfield
IH Witten
J Feng
James M. Briggs
JM Lambert
JS Huston
K Otsuka
Kirsty Salmon
L Itti
Linda Hall
Lydia Ho
M Hollstein
M Saar-Tsechansky
MA Hearst
N Roy
NE Sharpless
NG Karaguler
P Baldi
Peter Kaiser
PV Nikolova
R Jones
Richard H. Lathrop
RJ Fox
RK Brachmann
Roberta Baronio
S Kato
S Lain
SA Danziger
Samuel A. Danziger
SM Leach
TE Baroni
VJ Bykov
W Wang
W Xue
Y Cho
Publication venue: Public Library of Science
Publication date: 01/01/2009

Many protein engineering problems involve finding mutations that produce proteins with a particular function. Computational active learning is an attractive approach to discover desired biological activities. Traditional active learning techniques have been optimized to iteratively improve classifier accuracy, not to quickly discover biologically significant results. We report here a novel active learning technique, Most Informative Positive (MIP), which is tailored to biological problems because it seeks novel and informative positive results. MIP active learning differs from traditional active learning methods in two ways: (1) it preferentially seeks Positive (functionally active) examples; and (2) it may be effectively extended to select gene regions suitable for high throughput combinatorial mutagenesis. We applied MIP to discover mutations in the tumor suppressor protein p53 that reactivate mutated p53 found in human cancers. This is an important biomedical goal because p53 mutants have been implicated in half of all human cancers, and restoring active p53 in tumors leads to tumor regression. MIP found Positive (cancer rescue) p53 mutants in silico using 33% fewer experiments than traditional non-MIP active learning, with only a minor decrease in classifier accuracy. Applying MIP to in vivo experimentation yielded immediate Positive results. Ten different p53 mutations found in human cancers were paired in silico with all possible single amino acid rescue mutations, from which MIP was used to select a Positive Region predicted to be enriched for p53 cancer rescue mutants. 
In vivo assays showed that the predicted Positive Region: (1) had significantly more (p < 0.01) new strong cancer rescue mutants than control regions (Negative, and non-MIP active learning); (2) had slightly more new strong cancer rescue mutants than an Expert region selected for purely biological considerations; and (3) rescued for the first time the previously unrescuable p53 cancer mutant P152L.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

A comparative analysis to predict p53 activity using classification models

Author: Setty Priyanka
Publication date: 26/09/2019

Mutation studies of TP53, the gene coding for the tumor protein p53, have become increasingly common in cancer research as a way to understand the protein's structural changes and their implications for tumor suppression. The protein consists of four identical chains of 393 amino acids each. This homo-tetrameric configuration of p53 plays an important role in suppressing tumors, so it is important to understand its structure-function dynamics and their role in cancer development. A p53 mutant dataset was obtained from the University of California at Irvine (UCI) Machine Learning Repository to infer the p53 protein's ability to suppress tumors based on its two-dimensional (2D) and three-dimensional (3D) structural features. The dataset consisted of 31,283 instances (observations) and 5,408 numerical features. The first 4,826 features were 2D structural features based on electrostatic and surface properties. The remaining 582 were 3D features: distance maps between mutant and wild-type p53. After selecting a subset of features that were statistically relevant in predicting the outcome (n=100), three classification algorithms, Logistic Regression (LR), Support Vector Machine (SVM) and Random Forest (RF), were fit to the data and tuned using a cross-validation scheme to distinguish active p53 mutants from inactive ones. Accuracy and area under the curve (AUC) were used to evaluate each classification model. Among the three algorithms, LR appeared to outperform SVM and RF, with accuracy ranging from 0.75 to 0.81 and AUC ranging from 0.75 to 0.88. The LR model identified 2D feature numbers 60, 74, 49, 40, and 73 as features of high importance in predicting the activity of p53.
The public health significance of this study is that it advances the understanding of p53, which is critical to tumor suppression, by helping to predict p53 activity from a set of structural features using simple classification models.
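The two evaluation metrics named in this abstract, accuracy and AUC, can be illustrated with a minimal sketch. The labels and scores below are made up for illustration and are not taken from the study; the function names are hypothetical, and the 0.5 decision threshold is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney U formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (assumed): 1 = active mutant, 0 = inactive; scores are invented.
y_true = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason to report both.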

D-Scholarship@Pitt

Predicting Transcriptional Activity of Multiple Site p53 Mutants Based on Hybrid Properties

Author: A Efeyan
AC Martin
AP Bom
B Ma
CW Lee
DP Lane
G Bossi
H Mohabatkar
H Peng
IK Jordan
JM Smith
JP Qi
K Peng
KC Chou
KK Kandaswamy
Kuo-Chen Chou
L Meng
M Hayat
M Oren
MS Greenblatt
P Baldi
P Wang
P Zakeri
Q Gu
R Grantham
R Rainwater
Reiner Albert Veitia
RR Joshi
S Kato
S Kawashima
S Niu
SA Danziger
SF Altschul
Shen Niu
T Huang
Tao Huang
UK Mukhopadhyay
WR Atchley
XB Zhou
Xiangyin Kong
Y Cai
YD Cai
Yu-Dong Cai
Yun Huang
Z Qian
Z Yang
Zhongping Xu
Publication venue: Public Library of Science
Publication date: 08/08/2011

p53 is an important tumor suppressor protein. Mutated p53 is found in many kinds of human cancers, and restoring active p53 can lead to tumor regression. In this work, we developed a new computational method to predict the transcriptional activity of one-, two-, three- and four-site p53 mutants. Following the general form of pseudo amino acid composition, we used eight types of features to represent each mutation and then selected the optimal prediction features using the maximum relevance, minimum redundancy, and incremental feature selection methods. The Matthews correlation coefficients (MCC) obtained with the nearest neighbor algorithm and jackknife cross-validation for one-, two-, three- and four-site p53 mutants were 0.678, 0.314, 0.705, and 0.907, respectively. Further analysis of the optimal feature sets revealed that 2D (two-dimensional) structure features made up the largest part of each set and may have played the most important role in predicting the active status of all four types of p53 mutant. The optimal feature sets, especially their top-ranked features, also showed that 3D structure features, conservation, and the physicochemical and biochemical properties of amino acids near the mutation site played important roles in predicting p53 mutant active status. Our study provides a new and promising approach for finding functionally important sites and the relevant features for in-depth study of the p53 protein and its mechanism of action.
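The evaluation scheme named in this abstract (nearest neighbor prediction, jackknife cross-validation, MCC) can be sketched as follows. This is a minimal illustration on invented two-feature data, assuming Euclidean distance and a single nearest neighbor; the function names are hypothetical and the feature encoding used in the paper is not reproduced here.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def jackknife_1nn(X, y):
    """Leave-one-out: predict each sample from its nearest other sample."""
    preds = []
    for i, xi in enumerate(X):
        others = (j for j in range(len(X)) if j != i)
        nearest = min(others, key=lambda j: math.dist(xi, X[j]))
        preds.append(y[nearest])
    return preds


# Invented toy features (assumed): two well-separated groups of mutants.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [0, 0, 0, 1, 1, 1]

print(mcc(y, jackknife_1nn(X, y)))
```

MCC ranges from -1 to 1 and stays informative when active and inactive classes are imbalanced, which plain accuracy does not.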

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

All-codon scanning identifies p53 cancer rescue mutations

Author: Baedeker
Baroni
Boeckler
Brachmann
Breslauer
Bullock
Bykov
Chalmers
Cho
Cuff
Cunningham
Danziger
G. Wesley Hatfield
Gao
Grosjean
Gutman
Hatfield
Hogrefe
Hollstein
Hoover
Irwin
Jackel
Joerger
Kato
Kegler-Ebo
Kirsty Salmon
Lai
Lain
Lambert
Larsen
Linda V. Hall
Lio
Liu
Martin
Moore
Muhlrad
Otsuka
Peter Kaiser
Pettersen
Reetz
Richard H. Lathrop
Roberta Baronio
Samuel A. Danziger
Sharp
Sharpless
Sorensen
Stemmer
Tian
Vartanian
Ventura
Vogelstein
Wassman
Welch
Withers-Martinez
Xue
Yuen
Publication venue: Oxford University Press
Publication date: 01/01/2010

In vitro scanning mutagenesis strategies are valuable tools to identify critical residues in proteins and to generate proteins with modified properties. We describe the fast and simple All-Codon Scanning (ACS) strategy that creates a defined gene library wherein each individual codon within a specific target region is changed into all possible codons with only a single codon change per mutagenesis product. ACS is based on a multiplexed overlapping mutagenesis primer design that saturates only the targeted gene region with single codon changes. We have used ACS to produce single amino-acid changes in small and large regions of the human tumor suppressor protein p53 to identify single amino-acid substitutions that can restore activity to inactive p53 found in human cancers. Single-tube reactions were used to saturate defined 30-nt regions with all possible codon changes. The same technique was used in 20 parallel reactions to scan the 600-bp fragment encoding the entire p53 core domain. Identification of several novel p53 cancer rescue mutations demonstrated the utility of the ACS approach. ACS is a fast, simple and versatile method, which is useful for protein structure–function analyses and protein design or evolution problems.
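The library that ACS defines (every codon in a target region replaced by every other codon, one change per product) can be enumerated in silico. The sketch below only lists the expected sequences; it does not model the primer design or the wet-lab reaction. The 30-nt region is an arbitrary placeholder, not a p53 sequence, and the function name is hypothetical.

```python
from itertools import product

# All 64 possible codons over the DNA alphabet.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def all_codon_scan(region):
    """Return every sequence that differs from `region` in exactly one codon."""
    if len(region) % 3:
        raise ValueError("region length must be a multiple of 3")
    variants = []
    for pos in range(0, len(region), 3):
        original = region[pos:pos + 3]
        for codon in CODONS:
            if codon != original:
                variants.append(region[:pos] + codon + region[pos + 3:])
    return variants


# Arbitrary 30-nt placeholder region (assumed), i.e. 10 codons.
region = "ACGTACGTACGTACGTACGTACGTACGTAC"
library = all_codon_scan(region)

print(len(library))  # 10 codons x 63 alternative codons each
```

Under these assumptions a 30-nt region gives 630 distinct products, which shows why a single-tube reaction per short region is a manageable library size.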

CiteSeerX

Crossref

PubMed Central

eScholarship - University of California

Ensemble-Based Computational Approach Discriminates Functional Activity of p53 Cancer and Rescue Mutants

Author: A Ventura
AC Joerger
AJ Levine
AM Wieczorek
AN Bullock
B Hess
B Vogelstein
BA Foster
Christopher D. Wassman
CP Martins
DA Case
DM Feldser
Faezeh Salehi
FM Boeckler
G. Wesley Hatfield
GJ Martyna
HC Ang
J Hill
J-P Ryckaert
JC Phillips
JM Canadillas
KH Vousden
Linda Hall
Michael Gilson
MR Junttila
Peter Kaiser
PV Nikolova
R Baronio
R Rodriguez
Richard Chamberlin
Richard H. Lathrop
RK Brachmann
Roberta Baronio
Rommie E. Amaro
RS Sikorski
S Miyamoto
S North
SA Danziger
SE Feller
SMA Rauf
T Darden
TE Baroni
V Hornak
VB Chen
VJ Bykov
WL Jorgensen
X Daura
Y Cho
YP Pang
Özlem Demir
Publication venue: Public Library of Science
Publication date: 01/10/2011

The tumor suppressor protein p53 can lose its function upon single-point missense mutations in the core DNA-binding domain (“cancer mutants”). Activity can be restored by second-site suppressor mutations (“rescue mutants”). This paper relates the functional activity of p53 cancer and rescue mutants to their overall molecular dynamics (MD), without focusing on local structural details. A novel global measure of protein flexibility for the p53 core DNA-binding domain, the number of clusters at a certain RMSD cutoff, was computed by clustering over 0.7 µs of explicitly solvated all-atom MD simulations. For wild-type p53 and a sample of p53 cancer or rescue mutants, the number of clusters was a good predictor of in vivo p53 functional activity in cell-based assays. This number-of-clusters (NOC) metric was strongly correlated (r² = 0.77) with reported values of experimentally measured ΔΔG protein thermodynamic stability. Interpreting the number of clusters as a measure of protein flexibility: (i) p53 cancer mutants were more flexible than wild-type protein, (ii) second-site rescue mutations decreased the flexibility of cancer mutants, and (iii) negative controls of non-rescue second-site mutants did not. This new method reflects the overall stability of the p53 core domain and can discriminate which second-site mutations restore activity to p53 cancer mutants.
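The idea of counting clusters at an RMSD cutoff can be sketched with a toy example. The abstract does not state which clustering algorithm was used, so the greedy "leader" clustering below is a stand-in chosen for brevity, not the paper's procedure. The sketch also assumes frames are already superimposed and share atom order; the coordinates, cutoff, and function names are invented.

```python
import math


def rmsd(a, b):
    """RMSD between two pre-aligned frames, each a list of (x, y, z) atoms."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def number_of_clusters(frames, cutoff):
    """Greedy leader clustering: a frame starts a new cluster only if it is
    farther than `cutoff` from every existing cluster center."""
    centers = []
    for frame in frames:
        if not any(rmsd(frame, c) <= cutoff for c in centers):
            centers.append(frame)
    return len(centers)


# Invented two-atom trajectories (assumed): one rigid, one flexible.
rigid = [
    [(0, 0, 0), (1, 0, 0)],
    [(0.1, 0, 0), (1.1, 0, 0)],
    [(0, 0.1, 0), (1, 0.1, 0)],
]
flexible = [
    [(0, 0, 0), (1, 0, 0)],
    [(2, 0, 0), (3, 0, 0)],
    [(0, 2, 0), (1, 2, 0)],
]

print(number_of_clusters(rigid, 0.5), number_of_clusters(flexible, 0.5))
```

A trajectory that explores more distinct conformations yields more clusters at the same cutoff, which is the sense in which the count serves as a flexibility measure.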

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Improving SNR and reducing training time of classifiers in large datasets via kernel averaging

Author: A Gonzalez-Moreno
B Schölkopf
Chih-Chung Chang
DC Dima
Diogo Ayres‐de‐Campos
Frank Jäkel
G Orrù
GE Hinton
HJ Hwang
J Hainmueller
J Schrouff
JN Weinstein
MS Treder
RM Cichy
S Choudhury
SA Danziger
V Youssofzadeh
X Wang
Publication venue: Springer Science and Business Media LLC
Publication date: 07/12/2018

Kernel methods are of growing importance in neuroscience research. As an elegant extension of linear methods, they are able to model complex non-linear relationships. However, since the kernel matrix grows with data size, the training of classifiers is computationally demanding in large datasets. Here, a technique developed for linear classifiers is extended to kernel methods: In linearly separable data, replacing sets of instances by their averages improves signal-to-noise ratio (SNR) and reduces data size. In kernel methods, data is linearly non-separable in input space, but linearly separable in the high-dimensional feature space that kernel methods implicitly operate in. It is shown that a classifier can be efficiently trained on instances averaged in feature space by averaging entries in the kernel matrix. Using artificial and publicly available data, it is shown that kernel averaging improves classification performance substantially and reduces training time, even in non-linearly separable data.
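The statement that averaging in feature space amounts to averaging kernel matrix entries can be checked on a small example. The sketch below uses a linear kernel, for which the feature space is the input space, so the averaged kernel can be compared directly with the kernel of the group means. The data, the grouping, and the function names are assumptions for illustration.

```python
def linear_kernel(x, z):
    """Dot product; its feature map is the identity."""
    return sum(a * b for a, b in zip(x, z))


def kernel_matrix(X, kernel):
    """Full pairwise kernel matrix for the samples in X."""
    return [[kernel(x, z) for z in X] for x in X]


def average_kernel(K, groups):
    """Kernel between group averages: the mean of K over each pair of groups."""
    return [
        [sum(K[i][j] for i in ga for j in gb) / (len(ga) * len(gb))
         for gb in groups]
        for ga in groups
    ]


def mean_vector(X, idx):
    """Average of the samples X[i] for i in idx."""
    return [sum(X[i][d] for i in idx) / len(idx) for d in range(len(X[0]))]


# Invented toy data (assumed): four samples averaged in two groups of two.
X = [(1.0, 2.0), (3.0, 4.0), (0.0, 1.0), (2.0, 1.0)]
groups = [[0, 1], [2, 3]]

K_avg = average_kernel(kernel_matrix(X, linear_kernel), groups)
means = [mean_vector(X, g) for g in groups]
K_of_means = kernel_matrix(means, linear_kernel)

print(K_avg, K_of_means)
```

For a non-linear kernel the feature-space means are not available explicitly, but `average_kernel` applies unchanged because it only needs the entries of the kernel matrix, and the averaged matrix is smaller than the original by the group size in each dimension.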

Crossref

Online Research @ Cardiff