arXiv.org e-Print Archive

Number of natively unfolded proteins scales with genome size

Author: Deiana Antonio
Giansanti Andrea
Publication date: 01/01/2008

Natively unfolded proteins exist as an ensemble of flexible conformations lacking a well-defined tertiary structure along a large portion of their polypeptide chain. Despite the absence of a stable configuration, they are involved in important cellular processes. In this work we used three indicators of folding status, derived from the analysis of mean packing and mean contact energy of a protein sequence as well as from VSL2, a disorder predictor, and combined them into a consensus score to identify natively unfolded proteins in several genomes from Archaea, Bacteria and Eukarya. We found a high correlation between the number of predicted natively unfolded proteins and the number of proteins in the genomes. More specifically, the number of natively unfolded proteins scaled with the number of proteins in the genomes, with exponent 1.81 ± 0.10. This scaling law may be important for understanding the relation between the number of natively unfolded proteins and their roles in cellular processes. Comment: Submitted to Biophysics and Bioengineering Letters, http://padis2.uniroma1.it:81/ojs/index.php/CISB-BB
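A scaling exponent of this kind is commonly estimated by a straight-line fit in log-log space. The following is a minimal sketch of that idea, not the authors' procedure: the function name `fit_power_law`, the prefactor, and the proteome sizes are all assumptions made up for illustration, and the counts are synthetic and noise-free rather than taken from the paper.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the regression line log(y) = log(a) + b * log(x)
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Hypothetical proteome sizes and synthetic counts (NOT data from the paper):
# counts are generated from an exact power law so the fit can be checked.
proteome_sizes = [500, 1500, 4000, 6000, 20000]
unfolded_counts = [1e-4 * n ** 1.81 for n in proteome_sizes]

prefactor, exponent = fit_power_law(proteome_sizes, unfolded_counts)
print(f"exponent = {exponent:.2f}")
```

With real counts the points scatter around the line, and the uncertainty on the exponent (such as the ± 0.10 quoted above) would come from the standard error of the slope, which this sketch does not compute.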

Archivio della ricerca- Università di Roma La Sapienza

Disorder Predictors Also Predict Backbone Dynamics for a Family of Disordered Proteins

Author: A Mohan
AK Dunker
B Al-Lazikani
BA Johnson
C Chothia
CJ Oldfield
D Baker
D Eliezer
DB Woods
DF Lowry
E Bochkareva
F Ferron
GA Petsko
Gary W. Daughdrill
GW Daughdrill
H Lee
HB Xie
HJ Dyson
Hongwei Wu
I Simon
J Bargonetti
K Peng
L Kaustov
L Li
M Wells
P Radivojac
P Romero
P Tompa
PD Vise
PH Kussie
R Dawson
S Liang
S Vucetic
Vladimir N. Uversky
VN Uversky
VP Kutyshenko
Wade M. Borcherds
X Li
Z Dosztanyi
Z Obradovic
Publication venue: Public Library of Science
Publication date: 01/01/2011

Several algorithms have been developed that use amino acid sequences to predict whether or not a protein or a region of a protein is disordered. These algorithms make accurate predictions for disordered regions that are 30 amino acids or longer, but it is unclear whether the predictions can be directly related to the backbone dynamics of individual amino acid residues. The nuclear Overhauser effect between the amide nitrogen and hydrogen (NHNOE) provides an unambiguous measure of backbone dynamics at single-residue resolution and is an excellent tool for characterizing the dynamic behavior of disordered proteins. In this report, we show that the NHNOE values for several members of a family of disordered proteins are highly correlated with the output from three popular algorithms used to predict disordered regions from amino acid sequence. This is the first test between an experimental measure of residue-specific backbone dynamics and disorder predictions. The results suggest that some disorder predictors can accurately estimate the backbone dynamics of individual amino acids in a long disordered region.
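A per-residue comparison like this can be summarised with a Pearson correlation coefficient. The sketch below assumes that statistic for illustration; the abstract does not say which measure the authors used. The function name `pearson_r` and both value lists are invented, so the sign and size of the result reflect only these made-up numbers.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented per-residue values, one entry per residue (NOT measured data).
nhnoe = [0.75, 0.60, 0.20, -0.10, -0.40]           # hypothetical NHNOE values
disorder_score = [0.10, 0.25, 0.55, 0.80, 0.95]    # hypothetical predictor output

r = pearson_r(nhnoe, disorder_score)
print(f"r = {r:.3f}")
```

Whether a strong relationship shows up as a positive or negative coefficient depends on the predictor's score convention (for example, whether higher scores mean more disorder).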

CiteSeerX

Public Library of Science (PLOS)

Multidisciplinary Digital Publishing Institute

Prediction of Lysine Ubiquitylation with Ensemble Classifier and Feature Selection

Author: Aguilar
Altschul
Anand
Atchey
Boeckmann
Bordoli
Breiman
Cai
Chou
Chou
Chou
Cover
Denis
Dunker
Fleuret
He
Herrmann
Hershko
Hicke
Hicke
Hitchcock
Jeon
Jones
Kaur
Kawashima
Kim
Kirkpatrick
Levi
Li
Liu
Liu
Liu
Ma
Matsumoto
Minghao Yin
Peng
Peng
Peng
Peng
Pickart
Pugalenthi
Radivojac
Saghatelian
Shen
Sikic
Skurichina
Tompa
Tung
Welchman
Wright
Wu
Xiangtao Li
Xiao
Xiaowei Zhao
Yu
Zheng
Zhiqiang Ma
Publication venue: Molecular Diversity Preservation International (MDPI)
Publication date: 01/11/2011

Ubiquitylation is an important process of post-translational modification. Correct identification of protein lysine ubiquitylation sites is of fundamental importance to understand the molecular mechanism of lysine ubiquitylation in biological systems. This paper develops a novel computational method to effectively identify the lysine ubiquitylation sites based on the ensemble approach. In the proposed method, 468 ubiquitylation sites from 323 proteins retrieved from the Swiss-Prot database were encoded into feature vectors by using four kinds of protein sequence information. An effective feature selection method was then applied to extract informative feature subsets. After different feature subsets were obtained by setting different starting points in the search procedure, they were used to train multiple random forest classifiers, which were then aggregated into a consensus classifier by majority voting. Evaluated by jackknife tests and independent tests respectively, the accuracy of the proposed predictor reached 76.82% for the training dataset and 79.16% for the test dataset, indicating that this predictor is a useful tool to predict lysine ubiquitylation sites. Furthermore, site-specific feature analysis was performed and it was shown that ubiquitylation is intimately correlated with the features of its surrounding sites in addition to features derived from the lysine site itself. The feature selection method is available upon request.
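The aggregation step can be sketched as follows. This is a minimal illustration of majority voting only, not the authors' implementation: the function name `majority_vote` and the example labels are hypothetical, and the individual classifiers are represented simply by lists of their predicted labels.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one consensus label per sample.

    `predictions` holds one list per classifier, all of the same length.
    """
    consensus = []
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count
        label, _count = Counter(votes).most_common(1)[0]
        consensus.append(label)
    return consensus

# Hypothetical outputs of three classifiers on four candidate lysine sites
# (1 = predicted ubiquitylation site, 0 = not a site).
classifier_outputs = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]

consensus_labels = majority_vote(classifier_outputs)
print(consensus_labels)
```

Using an odd number of binary classifiers avoids ties; with an even number, a tie-breaking rule would have to be chosen explicitly.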

Repository of the Academy's Library

D2P2: database of disordered protein predictions

Author: Dosztányi Zsuzsanna
Ishida Takashi
Oates Matt E.
Romero Pedro
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2013

We present the Database of Disordered Protein Prediction (D2P2), available at http://d2p2.pro (including website source code). A battery of disorder predictors and their variants, VL-XT, VSL2b, PrDOS, PV2, Espritz and IUPred, were run on all protein sequences from 1765 complete proteomes (to be updated as more genomes are completed). Integrated with these results are all of the predicted (mostly structured) SCOP domains using the SUPERFAMILY predictor. These disorder/structure annotations together enable comparison of the disorder predictors with each other and examination of the overlap between disordered predictions and SCOP domains on a large scale. D2P2 will increase our understanding of the interplay between disorder and structure, the genomic distribution of disorder, and its evolutionary history. The parsed data are made available in a unified format for download as flat files or SQL tables either by genome, by predictor, or for the complete set. An interactive website provides a graphical view of each protein annotated with the SCOP domains and disordered regions from all predictors overlaid (or shown as a consensus). There are statistics and tools for browsing and comparing genomes and their disorder within the context of their position on the tree of life. © The Author(s) 2012. Published by Oxford University Press

MobiDB-lite 3.0: fast consensus annotation of intrinsic disorder flavors in proteins

Author: Damiano Clementel
Damiano Piovesan
Marco Necci
Silvio C. E. Tosatto
Zsuzsanna Dosztányi
Publication date: 01/12/2020

Motivation: The earlier version of MobiDB-lite is currently used in large-scale proteome annotation platforms to detect intrinsic disorder. However, new theoretical models allow for the classification of intrinsically disordered regions into subtypes from sequence features associated with specific polymeric properties or compositional bias. Results: MobiDB-lite 3.0 maintains its previous speed and performance but also provides a finer classification of disorder by identifying regions with characteristics of polyampholytes, positive or negative polyelectrolytes, low-complexity regions, or regions enriched in cysteine, proline, glycine or polar residues. Subregions are abundantly detected in IDRs of the human proteome. The new version of MobiDB-lite represents a new step for the proteome-level analysis of protein disorder. Availability and implementation: Both the MobiDB-lite 3.0 source code and a docker container are available from the GitHub repository: https://github.com/BioComputingUP/MobiDB-lit

Public Library of Science (PLOS)

Open Access Repository

The N-terminal intrinsically disordered domain of Mgm101p is localized to the mitochondrial nucleoid

Author: A Moya
A Schlessinger
AEA Hobbs
AK Dunker
AM Waterhouse
B He
BA Kaufman
C Galea
CJ Brown
D Tillett
David C. Hayward
DF Bogenhagen
E Garner
GD Clark-Walker
George Desmond Clark-Walker
GW Daughdrill
H Dyson
H Hegyi
HJ Dyson
I Miyakawa
JD Nardozzi
JJ Ward
K Itoh
K Okamoto
K Peng
L Dente
M Ito
M Kucej
M Mbantenkhu
MA Larkin
MG Claros
MW Gray
P Tompa
PE Wright
RD Gietz
RW Gilkerson
S Meeusen
Vladimir N. Uversky
X Chen
X Zuo
XJ Chen
Y Elbaz
Z Dosztányi
Zsuzsanna Dosztányi
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

The mitochondrial genome maintenance gene, MGM101, is essential for yeasts that depend on mitochondrial DNA replication. Previously, in Saccharomyces cerevisiae, it has been found that the carboxy-terminal two-thirds of Mgm101p has a functional core. Furthermore, there is a high level of amino acid sequence conservation in this region from widely diverse species. By contrast, the amino-terminal region, which is also essential for function, does not have recognizable conservation. Using a bioinformatic approach we find that the functional core from yeast and a corresponding region of Mgm101p from the coral Acropora millepora have an ordered structure, while the N-terminal domains of sequences from yeast and coral are predicted to be disordered. To examine whether ordered and disordered domains of Mgm101p have specific or general functions we made chimeric proteins from yeast and coral by swapping the two regions. We find, by an in vivo assay in S. cerevisiae, that the ordered domain of A. millepora can functionally replace the yeast core region but the disordered domain of the coral protein cannot substitute for its yeast counterpart. Mgm101p is found in the mitochondrial nucleoid along with enzymes and proteins involved in mtDNA replication. By attaching green fluorescent protein to the N-terminal disordered domain of yeast Mgm101p we find that GFP is still directed to the mitochondrial nucleoid where full-length Mgm101p-GFP is targeted.

The Australian National University

Repository of the Academy's Library

FigShare

Buried and accessible surface area control intrinsic protein flexibility

Author: Marsh Joseph A
Publication venue: Elsevier BV
Publication date: 13/08/2013

Proteins experience a wide variety of conformational dynamics that can be crucial for facilitating their diverse functions. How is the intrinsic flexibility required for these motions encoded in their three-dimensional structures? Here, the overall flexibility of a protein is demonstrated to be tightly coupled to the total amount of surface area buried within its fold. A simple proxy for this, the relative solvent accessible surface area (Arel), therefore shows excellent agreement with independent measures of global protein flexibility derived from various experimental and computational methods. Application of Arel on a large scale demonstrates its utility by revealing unique sequence and structural properties associated with intrinsic flexibility. In particular, flexibility as measured by Arel shows little correspondence with intrinsic disorder, but instead tends to be associated with multiple domains and increased α-helical structure. Furthermore, the apparent flexibility of monomeric proteins is found to be useful for identifying quaternary structure errors in published crystal structures. There is also a strong tendency for the crystal structures of more flexible proteins to be solved to lower resolutions. Finally, local solvent accessibility is shown to be a primary determinant of local residue flexibility. Overall this work provides both fundamental mechanistic insight into the origin of protein flexibility and a simple, practical method for predicting flexibility from protein structures. Comment: 36 pages, 11 figures, author's manuscript, accepted for publication in Journal of Molecular Biolog

arXiv.org e-Print Archive