Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model

Author: Li Zhen
Sun Siqi
Wang Sheng
Xu Jinbo
Zhang Renyu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 27/11/2016

Recently exciting progress has been made on protein contact prediction, but the predicted contacts for proteins without many sequence homologs are still of low quality and not very useful for de novo structure prediction. This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual networks. This deep neural network allows us to model very complex sequence-contact relationships as well as long-range inter-contact correlations. Our method greatly outperforms existing contact prediction methods and leads to much more accurate contact-assisted protein folding. Tested on three datasets of 579 proteins, the average top L long-range prediction accuracy obtained by our method, the representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints can yield correct folds (i.e., TMscore>0.6) for 203 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 proteins, respectively. Further, our contact-assisted models have much better quality than template-based models. Using our predicted contacts as restraints, we can (ab initio) fold 208 of the 398 membrane proteins with TMscore>0.5. By contrast, when the training proteins of our method are used as templates, homology modeling can only do so for 10 of them. One interesting finding is that even if we do not train our prediction models with any membrane proteins, our method works very well on membrane protein prediction. Finally, in the recent blind CAMEO benchmark, our method successfully folded 5 test proteins with a novel fold.
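The "top L" and "top L/10" long-range accuracies quoted in this abstract can be illustrated with a minimal sketch. It assumes that "long-range" means a sequence separation of at least 24 residues (a common convention that the abstract itself does not state) and that contacts are supplied as a boolean matrix; the function name and parameters are hypothetical and not taken from the paper.

```python
import numpy as np

def top_k_long_range_accuracy(pred, true, k_div=1, min_sep=24):
    """Fraction of the top L/k_div predicted long-range pairs that are true contacts.

    pred: (L, L) array of predicted contact scores (assumed symmetric).
    true: (L, L) boolean array of native contacts.
    min_sep: minimum sequence separation counted as long-range (assumed: 24).
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=bool)
    L = pred.shape[0]
    # Upper-triangle pairs (i < j) with separation j - i >= min_sep.
    i, j = np.triu_indices(L, k=min_sep)
    if i.size == 0:
        return float("nan")
    # Evaluate L // k_div pairs, but never more than the pairs available.
    n_top = min(max(1, L // k_div), i.size)
    # Highest-scoring pairs first; a stable sort keeps ties deterministic.
    order = np.argsort(-pred[i, j], kind="stable")[:n_top]
    return float(true[i[order], j[order]].mean())
```

Calling the function with `k_div=1` gives the top L figure and `k_div=10` the top L/10 figure for a single protein; the averages reported in the abstract would be means of such values over a test set.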

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

Coevolved mutations reveal distinct architectures for two core proteins in the bacterial flagellar motor

Author: A Pandini
AC Lowenthal
Alessandro Pandini
AM Waterhouse
Anna Roujeinikova
AS Vartanian
B Ruhnau
BJ Grant
BJ Lowder
CJ Tsai
CM Dyer
D de Juan
D Stock
DL Guzman
DR Livesay
DR Thomas
DS Bischoff
DT Jones
F Pazos
H Ashkenazy
H Shimodaira
H Sockett
H Szurmant
HC Berg
J Friedman
J Yuan
Jens Kleinjung
JP Armitage
JS Parkinson
K Paul
KA Reynolds
KH Lam
L Cavallo
LK Lee
M Punta
MK Sarkar
MN Price
NA Rosenberg
NJ Delalez
P Cluzel
PN Brown
Q Ma
R Saito
RC Edgar
RD Finn
RW Branch
S Chen
S Pronk
SA Lloyd
SD Dunn
Shafqat Rasool
Shahid Khan
SM Van Way
SY Park
T Minamino
T Pilizota
TA Duke
VM Irikura
WR Taylor
X Zhao
Y Tu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2015

Switching of bacterial flagellar rotation is caused by large domain movements of the FliG protein triggered by binding of the signal protein CheY to FliM. FliG and FliM form adjacent multi-subunit arrays within the basal body C-ring. The movements alter the interaction of the FliG C-terminal (FliGC) "torque" helix with the stator complexes. Atomic models based on the Salmonella entrovar C-ring electron microscopy reconstruction have implications for switching, but lack consensus on the relative locations of the FliG armadillo (ARM) domains (amino-terminal (FliGN), middle (FliGM) and FliGC) as well as changes during chemotaxis. The generality of the Salmonella model is challenged by the variation in motor morphology and response between species. We studied coevolved residue mutations to determine the unifying elements of switch architecture. Residue interactions, measured by their coevolution, were formalized as a network, guided by structural data. Our measurements reveal a common design with dedicated switch and motor modules. The FliM middle domain (FliMM) has extensive connectivity most simply explained by conserved intra- and inter-subunit contacts. In contrast, FliG has a patchy, complex architecture. Conserved structural motifs form interacting nodes in the coevolution network that wire FliMM to the FliGC C-terminal, four-helix motor module (C3-6). FliG C3-6 coevolution is organized around the torque helix, differently from other ARM domains. The nodes form separated, surface-proximal patches that are targeted by deleterious mutations as in other allosteric systems. The dominant node is formed by the EHPQ motif at the FliMM-FliGM contact interface and adjacent helix residues at a central location within FliGM. The node interacts with nodes in the N-terminal FliGC α-helix triad (ARM-C) and FliGN. ARM-C, separated from C3-6 by the MFVF motif, has poor intra-network connectivity consistent with its variable orientation revealed by structural data. ARM-C could be the convertor element that provides mechanistic and species diversity.

JK was supported by Medical Research Council grant U117581331. SK was supported by seed funds from Lahore University of Management Sciences (LUMS) and the Molecular Biology Consortium.
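As a rough illustration of how residue coevolution can be formalized as a network, the sketch below scores pairs of alignment columns by mutual information and keeps the pairs above a cutoff as edges. Mutual information is one common coevolution measure and is an assumption here; the abstract does not say which measure the authors used. The function names, the toy alignment and the threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2

def column_mutual_information(msa, a, b):
    """Mutual information (in bits) between columns a and b of an alignment.

    msa: list of equal-length aligned sequences (strings).
    """
    n = len(msa)
    pa = Counter(seq[a] for seq in msa)                # residue counts, column a
    pb = Counter(seq[b] for seq in msa)                # residue counts, column b
    pab = Counter((seq[a], seq[b]) for seq in msa)     # joint residue-pair counts
    return sum(
        (c / n) * log2((c / n) / ((pa[x] / n) * (pb[y] / n)))
        for (x, y), c in pab.items()
    )

def coevolution_edges(msa, threshold):
    """Column pairs whose mutual information exceeds the (assumed) threshold."""
    width = len(msa[0])
    return [
        (a, b)
        for a, b in combinations(range(width), 2)
        if column_mutual_information(msa, a, b) > threshold
    ]

# Toy alignment: columns 0 and 1 covary perfectly, column 2 varies independently.
toy_msa = ["AAC", "AAD", "GGC", "GGD"]
edges = coevolution_edges(toy_msa, threshold=0.5)
```

In practice, raw mutual information is usually corrected for phylogenetic and sampling biases before any network analysis; this sketch omits those corrections.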

Directory of Open Access Journals

Brunel University Research Archive

FigShare

Inverse Statistical Physics of Protein Sequences: A Key Issues Review

Author: Cocco Simona
Feinauer Christoph
Figliuzzi Matteo
Monasson Remi
Weigt Martin
Publication venue: 'IOP Publishing'
Publication date: 03/03/2017

In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques, sequence data accumulate at an unprecedented pace. This provides large sets of so-called homologous, i.e. evolutionarily related, protein sequences, to which methods of inverse statistical physics can be applied. Using sequence data as the basis for the inference of Boltzmann distributions from samples of microscopic configurations or observables, it is possible to extract information about evolutionary constraints and thus protein function and structure. Here we give an overview of some biologically important questions, and how statistical-mechanics inspired modeling approaches can help to answer them. Finally, we discuss some open questions, which we expect to be addressed over the next years. Comment: 18 pages, 7 figures
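For orientation, Boltzmann distributions inferred from aligned sequences are commonly written in a pairwise (Potts-type) form. The expression below is a standard sketch and is not quoted from the review. The symbols are notation assumed here: a_i is the residue at alignment position i, L the alignment length, h_i the single-site fields, J_ij the pairwise couplings, and Z the normalizing partition function.

```latex
P(a_1, \dots, a_L) = \frac{1}{Z}
  \exp\Big( \sum_{i=1}^{L} h_i(a_i) + \sum_{1 \le i < j \le L} J_{ij}(a_i, a_j) \Big),
\qquad
Z = \sum_{a_1, \dots, a_L}
  \exp\Big( \sum_{i=1}^{L} h_i(a_i) + \sum_{1 \le i < j \le L} J_{ij}(a_i, a_j) \Big)
```

The inverse problem is then to estimate h_i and J_ij from the observed sequences, with strong couplings often interpreted as signs of evolutionary constraints between positions.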

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Bocconi

HAL-INSU

Network measures for protein folding state discrimination

Author: Fariselli Piero
Menichetti Giulia
Remondini Daniel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/12/2015

Proteins fold using two-state or multi-state kinetic mechanisms, but up to now there is no first-principles model to explain this different behavior. We exploit the network properties of protein structures by introducing novel observables to address the problem of classifying the different types of folding kinetics. These observables display a plain physical meaning, in terms of vibrational modes, possible configurations compatible with the native protein structure, and folding cooperativity. The relevance of these observables is supported by a classification performance of up to 90%, even with simple classifiers such as discriminant analysis.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Archivio istituzionale della ricerca - Università di Padova