
A structural annotation resource for the selection of putative target proteins in the malaria parasite

Author: A Krogh
AV McDonnell
C Chothia
CJ Stoeckert Jr.
D Frishman
DJ LaCount
DR Brooks
DS Peterson
DT Jones
E Wallin
EE Abola
F Lu
FE Herrera
Fourie Joubert
GN Sarma
J Carlton
J Gough
J Liu
J Liu
JM Carlton
LH Miller
LJ McGuffin
M Berriman
M Marti
M Sickmeier
MJ Gardner
P Rice
PJ Rozmajzl
SF Altschul
SR Eddy
SS Velanker
TA de Beer
Y Yuthavong
Yolandi Joubert
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract
Background: Protein structure plays a pivotal role in elucidating mechanisms of parasite functioning and drug resistance. Moreover, protein structure aids the determination of protein function, which, together with the structure, can be used to identify novel drug targets in the parasite. However, various structural features in Plasmodium falciparum proteins complicate the experimental determination of protein structures. Limited similarity to proteins in the Protein Data Bank and the shortage of solved protein structures in the malaria parasite necessitate genome-scale structural annotation of P. falciparum proteins. Additionally, the annotation of a range of structural features facilitates the identification of suitable targets for experimental and computational studies.
Methods: An integrated structural annotation system was developed and applied to P. falciparum, Plasmodium vivax and Plasmodium yoelii. The annotation included searches for sequence similarity, patterns and domains, in addition to the following predictions: secondary structure, transmembrane helices, protein disorder, low complexity, coiled-coils and small molecule interactions. Subsequently, candidate proteins for further structural studies were identified based on the annotated structural features.
Results: The annotation results are accessible through a web interface, enabling users to select groups of proteins which fulfil multiple criteria pertaining to structural and functional features [1]. Analysis of features in the P. falciparum proteome showed that protein-interacting proteins contained a higher percentage of predicted disordered residues than non-interacting proteins. Proteins interacting with 10 or more proteins have a disordered content concentrated in the range of 60–100%, while the disorder distribution for proteins having only one interacting partner was more evenly spread.
Conclusion: A series of P. falciparum protein targets for experimental structure determination, comparative modelling and in silico docking studies were putatively identified. The system is available for public use, where researchers may identify proteins by querying with multiple physico-chemical, sequence similarity and interaction features.
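The multi-criteria selection described in this abstract can be illustrated with a small sketch. It is not the authors' system: the record fields, the protein identifiers, the default thresholds and the `select_candidates` helper are all hypothetical, chosen only to show how several annotated features might be combined in one query.

```python
from dataclasses import dataclass


@dataclass
class ProteinAnnotation:
    """Hypothetical per-protein annotation record (field names are assumptions)."""
    protein_id: str
    length: int                 # number of residues
    disordered_residues: int    # residues predicted to be disordered
    tm_helices: int             # predicted transmembrane helices
    has_pdb_hit: bool           # significant similarity to a PDB entry
    interaction_partners: int   # number of known interaction partners

    @property
    def disorder_percent(self) -> float:
        return 100.0 * self.disordered_residues / self.length


def select_candidates(proteins, max_disorder=20.0, max_tm_helices=0,
                      require_pdb_hit=True):
    """Keep proteins that satisfy every criterion (thresholds are illustrative)."""
    return [
        p for p in proteins
        if p.disorder_percent <= max_disorder
        and p.tm_helices <= max_tm_helices
        and (p.has_pdb_hit or not require_pdb_hit)
    ]


# Toy records, invented for illustration only.
EXAMPLE = [
    ProteinAnnotation("prot_A", 400, 40, 0, True, 1),     # 10% disorder, soluble
    ProteinAnnotation("prot_B", 500, 400, 0, False, 12),  # 80% disorder, hub protein
    ProteinAnnotation("prot_C", 300, 30, 4, True, 2),     # 10% disorder, membrane
]

selected = select_candidates(EXAMPLE)
print([p.protein_id for p in selected])
```

With these toy thresholds only the ordered, soluble protein with a PDB hit is kept; loosening any single argument widens the selection.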

Springer - Publisher Connector

UPSpace at the University of Pretoria

Stability of domain structures in multi-domain proteins

Author: A Bauer-Mehren
A Fernandez
A Hentati
A Pang
A Schlicker
AG Murzin
AS Kondrashov
AW Munro
BB Kragelund
BM Broome
C Chothia
C Chothia
CA Gough
CJ Camacho
D Ekman
DA Di Giusto
DF Burke
DP Grandgenett
DR Caffrey
E Capriotti
E Krissinel
ED Levy
F Ali-Osman
F Dong
G Apic
GC Conant
H Wohlrab
HP Shanahan
HW He
HW He
J Karanicolas
J Rodriguez-Lopez
J Schymkowitz
J Weiner 3rd
JH Fong
JH Han
K Vlahovicek
L Riechmann
M Ryan
MA DePristo
MM Gromiha
N Tokuriki
N Tokuriki
NO Stitziel
O Keskin
PA Ory
Q Wang
R Guerois
R Rajasekaran
R Zhou
RJ Dobson
S Gong
S Teng
SJ Hamill
SO Yesylevskyy
T Tanaka
Y Xia
Z Liu
Publication venue: Nature Publishing Group
Publication date: 18/07/2011

Multi-domain proteins have many advantages with respect to stability and folding inside cells. Here we attempt to understand the intricate relationship between domain-domain interactions and the stability of domains in isolation. We provide a quantitative treatment of, and evidence for, prevailing intuitive ideas on the strategies employed by nature to stabilize otherwise unstable domains. We find that domains incapable of independent stability are stabilized by favourable interactions with tethered domains in the multi-domain context. The ability of such folds to exist independently is optimized by evolution: specific residue mutations at the sites equivalent to the inter-domain interface enhance the overall solvation, thereby stabilizing these domain folds independently. A few naturally occurring variants at these sites alter communication between domains and affect stability, leading to disease manifestation. Our analysis provides safe guidelines for mutagenesis, which have attractive applications in obtaining stable fragments and domain constructs essential for structural studies by crystallography and NMR.

Open Access Repository of IISc Research Publications

Cross-Over between Discrete and Continuous Protein Structure Space: Insights into Automatic Classification and Networks of Protein Structures

Structural classifications of proteins assume the existence of the fold, which is an intrinsic equivalence class of protein domains. Here, we test under which conditions such an equivalence class is compatible with objective similarity measures. We base our analysis on the transitive property of the equivalence relationship, requiring that similarity of A with B and B with C implies that A and C are also similar. Divergent gene evolution leads us to expect that the transitive property should approximately hold. However, if protein domains are a combination of recurrent short polypeptide fragments, as proposed by several authors, then similarity of partial fragments may violate the transitive property, favouring the continuous view of the protein structure space. We propose a measure to quantify the violations of the transitive property when a clustering algorithm joins elements into clusters, and we find that such violations present a well-defined and detectable cross-over point, from an approximately transitive regime at high structure similarity to a regime with large transitivity violations and large differences in length at low similarity. We argue that protein structure space is discrete and hierarchic classification is justified up to this cross-over point, whereas at lower similarities the structure space is continuous and should be represented as a network. We have tested the qualitative behaviour of this measure, varying all the choices involved in the automatic classification procedure, i.e., domain decomposition, alignment algorithm, similarity score, and clustering algorithm, and we have found that this behaviour is quite robust. The final classification depends on the chosen algorithms. We used the values of the clustering coefficient and the transitivity violations to select the optimal choices among those that we tested. Interestingly, this criterion also favours the agreement between automatic and expert classifications.
As a domain set, we have selected a consensus set of 2,890 domains decomposed very similarly in SCOP and CATH. As an alignment algorithm, we used a global version of MAMMOTH developed in our group, which is both rapid and accurate. As a similarity measure, we used the size-normalized contact overlap, and as a clustering algorithm, we used average linkage. The resulting automatic classification at the cross-over point was more consistent than expert ones with respect to the structure similarity measure, with 86% of the clusters corresponding to subsets of either SCOP or CATH superfamilies and fewer than 5% containing domains in distinct folds according to both SCOP and CATH. Almost 15% of SCOP superfamilies and 10% of CATH superfamilies were split, consistent with the notion of fold change in protein evolution. These results were qualitatively robust for all choices that we tested, although we did not try to use alignment algorithms developed by other groups. Folds defined in SCOP and CATH would be completely joined in the regime of large transitivity violations where clustering is more arbitrary. Consistently, the agreement between SCOP and CATH at fold level was lower than their agreement with the automatic classification obtained using as a clustering algorithm, respectively, average linkage (for SCOP) or single linkage (for CATH). The networks representing significant evolutionary and structural relationships between clusters beyond the cross-over point may allow us to perform evolutionary, structural, or functional analyses beyond the limits of classification schemes. These networks and the underlying clusters are available at http://ub.cbm.uam.es/research/ProtNet.ph
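The idea of quantifying transitivity violations can be made concrete with a toy calculation. The abstract does not define the authors' measure, so the sketch below uses a simple stand-in: among all paths A-B-C whose two links are at or above a similarity threshold, it reports the fraction for which A and C fall below that threshold. The function name, the similarity values and the thresholds are all assumptions for illustration.

```python
from itertools import combinations


def transitivity_violation_fraction(items, similarity, threshold):
    """Fraction of two-link paths x-mid-y (both links >= threshold)
    whose end points x, y are NOT similar at the same threshold.

    `similarity` maps frozenset({a, b}) to a score; this is an
    illustrative stand-in, not the measure defined in the paper.
    """
    def sim(a, b):
        return similarity[frozenset((a, b))]

    paths = 0
    violations = 0
    for a, b, c in combinations(items, 3):
        # Each member of the triple can act as the middle element.
        for x, mid, y in ((a, b, c), (b, a, c), (a, c, b)):
            if sim(x, mid) >= threshold and sim(mid, y) >= threshold:
                paths += 1
                if sim(x, y) < threshold:
                    violations += 1
    return violations / paths if paths else 0.0


# Toy similarity scores: A, B, C form a tight group; D resembles only C weakly.
ITEMS = ["A", "B", "C", "D"]
SIMILARITY = {
    frozenset(("A", "B")): 0.90,
    frozenset(("A", "C")): 0.85,
    frozenset(("B", "C")): 0.80,
    frozenset(("C", "D")): 0.40,
    frozenset(("A", "D")): 0.10,
    frozenset(("B", "D")): 0.10,
}

strict = transitivity_violation_fraction(ITEMS, SIMILARITY, threshold=0.7)
loose = transitivity_violation_fraction(ITEMS, SIMILARITY, threshold=0.3)
print(strict, loose)
```

In this toy data the strict threshold yields no violations, while the loose threshold links D to C without linking it to A or B, so violations appear only at low similarity, which is the qualitative pattern the abstract describes.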

Secretaría de Estado de Cultura

Digital.CSIC

The Plasmodium Export Element Revisited

Author: A Shanmugham
AA Zamyatnin
B Martoglio
BM Cooke
C Chothia
CJ Stoeckert Jr
DD Jones
DI Baruch
DM Engelmann
E Knuepfer
F Sargent
Florian Schwarte
G Cochrane
G Schneider
G Schneider
Gisbert Schneider
H Nielsen
I Ansorge
J Benting
J Zuegge
Jan Alexander Hiss
JD Bendtsen
JD Smith
JM Przyborski
Jude Marek Przyborski
Klaus Lingelbach
M Marti
M Marti
M Petter
M Rug
M Schmuker
MC Nunes
ME Wickham
MJ Gardner
N Joannin
NL Hiller
Per Westermark
Q Cheng
RS Hegde
S Baumeister
S Henikoff
SA Kyes
SA Ralph
SF Altschul
T Kohonen
TJ Sargeant
TP Hopp
XZ Su
Publication venue: Public Library of Science
Publication date: 06/02/2008

We performed a bioinformatic analysis of protein export elements (PEXEL) in the putative proteome of the malaria parasite Plasmodium falciparum. A protein family-specific conservation of physicochemical residue profiles was found for PEXEL-flanking sequence regions. We demonstrate that the family members can be clustered based on the flanking regions alone and display characteristic hydrophobicity patterns. This raises the possibility that the flanking regions may contain additional information for a family-specific role of PEXEL. We further show that signal peptide cleavage results in a positional alignment of PEXEL in proteins both with and without a signal peptide.

Hochschulschriftenserver - Universität Frankfurt am Main

Integrated Assessment of Genomic Correlates of Protein Evolutionary Rate

Author: A Krogh
A Wagner
AE Hirsh
AE Hirsh
B Lemos
C Chothia
C Pal
C Pal
C Stark
CA Wilson
CD Warden
CJ Brown
D Graur
D Sherman
DA Drummond
DA Drummond
DP Wall
DT Jones
EP Rocha
EP Rocha
Eric A. Franzosa
FA Kondrashov
GC Conant
H Yu
HB Fraser
I Wapinski
IK Jordan
J Ihmels
JB Plotkin
JD Bloom
JD Bloom
JH Nadeau
JJ Ward
JM Cherry
JO McInerney
LJ Lu
M Ashburner
M Kellis
M Lynch
M Seringhaus
Mark B. Gerstein
Michael Levitt
N Lin
NJ Tourasse
P Zhang
SF Altschul
SH Kim
WH Li
Y Kawahara
Y Xia
YS Lin
Yu Xia
Z Yang
Z Yang
Publication venue: Public Library of Science
Publication date: 01/06/2009

Rates of evolution differ widely among proteins, but the causes and consequences of such differences remain under debate. With the advent of high-throughput functional genomics, it is now possible to rigorously assess the genomic correlates of protein evolutionary rate. However, dissecting the correlations among evolutionary rate and these genomic features remains a major challenge. Here, we use an integrated probabilistic modeling approach to study genomic correlates of protein evolutionary rate in Saccharomyces cerevisiae. We measure and rank degrees of association between (i) an approximate measure of protein evolutionary rate with high genome coverage, and (ii) a diverse list of protein properties (sequence, structural, functional, network, and phenotypic). We observe, among many statistically significant correlations, that slowly evolving proteins tend to be regulated by more transcription factors, deficient in predicted structural disorder, involved in characteristic biological functions (such as translation), biased in amino acid composition, and are generally more abundant, more essential, and enriched for interaction partners. Many of these results are in agreement with recent studies. In addition, we assess information contribution of different subsets of these protein properties in the task of predicting slowly evolving proteins. We employ a logistic regression model on binned data that is able to account for intercorrelation, non-linearity, and heterogeneity within features. Our model considers features both individually and in natural ensembles (“meta-features”) in order to assess joint information contribution and degree of contribution independence. Meta-features based on protein abundance and amino acid composition make strong, partially independent contributions to the task of predicting slowly evolving proteins; other meta-features make additional minor contributions. The combination of all meta-features yields predictions comparable to those based on paired species comparisons, and approaching the predictive limit of optimal lineage-insensitive features. Our integrated assessment framework can be readily extended to other correlational analyses at the genome scale.
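A logistic regression on binned data can be sketched in a few lines. The code below is only an illustration of the general technique, not the authors' model: the abundance values, bin edges, labels, learning rate and all function names are invented. Each feature value is mapped to a bin, the bin is one-hot encoded, and a plain gradient-descent logistic regression learns one weight per bin, which lets the fitted probability vary non-linearly with the raw value.

```python
import math


def bin_index(value, edges):
    """Index of the bin containing `value`, given sorted inner bin edges."""
    return sum(value >= edge for edge in edges)


def one_hot(index, n_bins):
    return [1.0 if i == index else 0.0 for i in range(n_bins)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean logistic loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Invented toy data: protein abundance and whether the protein evolves slowly (1).
ABUNDANCE = [1, 2, 3, 4, 10, 12, 15, 20, 50, 60, 80, 90]
SLOW = [0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0]
EDGES = [5, 30]  # three bins: low, medium, high abundance

X = [one_hot(bin_index(a, EDGES), 3) for a in ABUNDANCE]
w, b = fit_logistic(X, SLOW)
p_low, p_mid, p_high = (predict(w, b, one_hot(k, 3)) for k in range(3))
print(round(p_low, 2), round(p_mid, 2), round(p_high, 2))
```

Because each bin has its own weight, the fitted probabilities approach the observed fraction of slowly evolving proteins in that bin; with several binned features the same setup would estimate their joint contribution.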

Boston University Institutional Repository (OpenBU)

Three-dimensional structure of β-cell-specific zinc transporter, ZnT-8, predicted from the type 2 diabetes-associated gene variant SLC30A8 R325W

Abstract
Background: We examined the effects of the R325W mutation on the three-dimensional (3D) structure of the β-cell-specific Zn2+ (zinc) transporter ZnT-8.
Methods: A model of the C-terminal domain of the human ZnT-8 protein was generated by homology modeling based on the known crystal structure of the Escherichia coli (E. coli) zinc transporter YiiP at 3.8 Å resolution.
Results: The homodimeric ZnT-8 protein structure has a Y-shaped architecture, with Arg325 located at the very bottom of this motif, approximately 13.5 Å from the transmembrane domain juncture. The C-terminal domain sequences of the human ZnT-8 protein and the E. coli zinc transporter YiiP share 12.3% identical and 39.5% homologous residues, resulting in an overall homology of 51.8%. Validation statistics showed the homology model to be of reasonable quality. The C-terminal domain exhibited an αββαβ fold with Arg325 as the penultimate N-terminal residue of the α2-helix. The side chains of both Arg325 and Trp325 point away from the interface with the other monomer, whereas the ε-NH3+ group of Arg325 is predicted to form an ionic interaction with the β-COO- group of Asp326 as well as Asp295. An amino acid alignment of the β2-α2 C-terminal loop domain revealed a variety of neutral amino acids at position 325 of different ZnT-8 proteins.
Conclusions: Our validated homology models predict that both Arg325 and Trp325, amino acids with a helix-forming behavior and penultimate N-terminal residues in the α2-helix of the C-terminal domain, are shielded by the planar surface of the three cytoplasmic β-strands and hence unable to affect the sensing capacity of the C-terminal domain. Moreover, the amino acid residue at position 325 is too far removed from the docking and transporter parts of ZnT-8 to affect their local protein conformations. These data indicate that the inherited R325W abnormality in SLC30A8 may be tolerated and results in adequate zinc transfer to the correct sites in the pancreatic islet cells. They are consistent with the observation that the SLC30A8 gene variant R325W has a low predictive value for future type 2 diabetes at the population level.
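The overall homology quoted in this abstract is the sum of the identical and the similar fractions (12.3% + 39.5% = 51.8%). The sketch below shows how such percentages could be derived from a pairwise alignment. It rests on assumptions the abstract does not state: the similarity groups are a common but arbitrary grouping, gapped columns are excluded from the denominator, and the two aligned sequences are invented toy strings rather than ZnT-8 or YiiP.

```python
# Assumed groups of "similar" residues; other groupings or a substitution
# matrix would give different percentages.
SIMILAR_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"), set("DE"),
    set("ST"), set("NQ"), set("AG"),
]


def alignment_percentages(seq_a, seq_b):
    """Return (% identical, % similar, % overall) over ungapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = similar = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        aligned += 1
        if a == b:
            identical += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if aligned == 0:
        raise ValueError("no ungapped columns to compare")
    pct_identical = 100.0 * identical / aligned
    pct_similar = 100.0 * similar / aligned
    return pct_identical, pct_similar, pct_identical + pct_similar


# Invented toy alignment: 8 ungapped columns, 3 identical and 4 similar.
pct_id, pct_sim, pct_total = alignment_percentages("MKV-LDEAGT", "MRIALEQ-GS")
print(pct_id, pct_sim, pct_total)
```

The choice of denominator matters: dividing by the full alignment length or by the shorter sequence length instead of the ungapped columns would lower all three figures.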

Springer - Publisher Connector

Of Bits and Bugs — On the Use of Bioinformatics and a Bacterial Crystal Structure to Solve a Eukaryotic Repeat-Protein Structure

Author: A Biegert
A Biegert
A Graebsch
A Hildebrand
A Marchler-Bauer
A Sali
A Savchenko
AA Vaguine
AD Bergemann
AJ McCoy
AK Bjorklund
Almut Graebsch
AM Waterhouse
B Rost
BM Lunde
C Chothia
CJ Oldfield
D Fischer
Dierk Niessing
Dirk Kostrewa
DT Jones
EM Marcotte
F Melo
GL Gallia
GL Hura
GM Sheldrick
GN Murshudov
I Wong
J Söding
J Söding
JN Battey
Johannes Söding
K Khalili
M Müller
MA Larkin
MK White
NE Chayen
Niall James Haslam
P Emsley
P Koehl
R Luthy
R Page
RD Finn
RI Sadreyev
RP Bahadur
S Doublie
S Hunter
SF Altschul
SN Ho
Stéphane Roche
TC Terwilliger
W Kabsch
WN Price 2nd
X Gao
Y Kanai
Publication venue: Public Library of Science
Publication date: 01/01/2010

Pur-α is a nucleic acid-binding protein involved in cell cycle control, transcription, and neuronal function. Initially, no prediction of the three-dimensional structure of Pur-α was possible. However, we recently solved the X-ray structure of Pur-α from the fruit fly Drosophila melanogaster and showed that it contains a so-called PUR domain. Here we explain how we exploited bioinformatics tools in combination with X-ray structure determination of a bacterial homolog to obtain diffracting crystals and the high-resolution structure of Drosophila Pur-α. First, we used sensitive methods for remote-homology detection to find three repetitive regions in Pur-α. We realized that our lack of understanding of how these repeats interact to form a globular domain was a major problem for crystallization and structure determination. With our information on the repeat motifs, we then identified a distant bacterial homolog that contains only one repeat. We determined the bacterial crystal structure and found that two of the repeats interact to form a globular domain. Based on this bacterial structure, we calculated a computational model of the eukaryotic protein. The model allowed us to design a crystallizable fragment and to determine the structure of Drosophila Pur-α. The key to success was the fact that single repeats of the bacterial protein self-assembled into a globular domain, instructing us on the number and boundaries of repeats to be included for crystallization trials with the eukaryotic protein. This study demonstrates that the simpler structural domain arrangement of a distant prokaryotic protein can guide the design of eukaryotic crystallization constructs. Since many eukaryotic proteins contain multiple repeats or repeating domains, this approach might be instructive for structural studies of a range of proteins.

PuSH

MPG.PuRe

Maintaining and breaking symmetry in homomeric coiled-coil assemblies

Author: AJ Burton
AJ Mccoy
AN Lupas
AR Thomson
B North
C Chothia
C Cohen
C Gatsogiannis
C Li
CF Xu
CJ Dong
CR Calladine
CW Wood
CW Wood
DD Rodriguez
DN Woolfson
DR Roe
E Krissinel
E Moutevelis
EH Egelman
FHC Crick
G Grigoryan
GG Krivov
GM Ullmann
GN Murshudov
GR Grimsley
HR Powell
J Hume
J Liu
J Sodek
J Walshaw
J Walshaw
J Walshaw
JA Maier
JM Swails
JM Swails
K Oxenoid
KR Mahendran
L Sun
LT Bergendahl
M Wiederstein
MD Winn
NC Burgess
NL Ing
NR Zaccai
OD Testa
OJL Rackham
P Emsley
P Schuck
PB Harbury
PR Evans
PS Huang
PV Afonine
PW Rose
R Lizatovic
RK Spencer
RP Joosten
S Eshaghi
S McIntosh-Smith
S Ramisch
S. E. Ahnert
V Koronakis
VN Malashkevich
W Chen
WR Taylor
Publication venue: Springer Science and Business Media LLC
Publication date: 01/10/2018

Higher-order coiled coils with five or more helices can form α-helical barrels. Here the authors show that placing β-branched aliphatic residues along the lumen yields stable and open α-helical barrels, which is of interest for the rational design of functional proteins, whereas the absence of β-branched side chains leads to unusual low-symmetry α-helical bundles.

Online Research @ Cardiff

Edinburgh Research Explorer

Enlighten

Explore Bristol Research

Coverage of whole proteome by structural genomics observed through protein homology modeling database

Author: A Andreeva
A Krogh
A McPherson
A Stark
A Yamaguchi
AE Todd
Akihiro Yamaguchi
B Contreras-Moreira
B Contreras-Moreira
B John
C Chothia
CA Orengo
CJ Oldfield
D Baker
D Petrey
D Vitkup
DD Leipe
E Dobrovetsky
Editorial Board
EV Koonin
FS Domingues
G Liu
HJ Dyson
HM Berman
IM Wallace
J Kopp
J Kyte
J-M Chandonia
K Kinoshita
K Lundstrom
Kei Yura
L Lo Conte
L Stein
L Xie
LJ DeLucas
M Iwadate
M Ota
MA Marti-Renom
Mitiko Go
MJ Sippl
MO Dayhoff
N O’Toole
O Lichtarge
P Walian
R Linding
R Service
RA Laskowski
RA Laskowski
RF Doolittle
RR Copley
S Goldsmith-Fischman
S Tsoka
S Yokoyama
S-H Kim
S-H Kim
SE Brenner
SJ Campbell
SJ Wodak
SK Burley
SK Burley
T Hirokawa
T Kawabata
U Pieper
V Serre
Y Kyogoku
Publication venue: Kluwer Academic Publishers
Publication date: 01/01/2006

We have been developing FAMSBASE, a protein homology-modeling database of whole ORFs predicted from genome sequences. The latest update of FAMSBASE (http://daisy.nagahama-i-bio.ac.jp/Famsbase/), which is based on the protein three-dimensional (3D) structures released by November 2003, contains modeled 3D structures for 368,724 open reading frames (ORFs) derived from the genomes of 276 species, namely 17 archaebacterial, 130 eubacterial, 18 eukaryotic and 111 phage genomes. Those 276 genomes are predicted to have 734,193 ORFs in total, and the current FAMSBASE contains protein 3D structures for approximately 50% of the ORF products. However, cases in which a modeled 3D structure covers an entire ORF product are rare. When the portion of an ORF covered by a 3D structure is compared across the three kingdoms of life, approximately 60% of the ORFs in archaebacteria and eubacteria have modeled 3D structures covering almost the entire amino acid sequence, whereas the percentage falls to about 30% in eukaryotes. When annual differences in the number of ORFs with modeled 3D structures are calculated, the fraction of modeled 3D structures of soluble proteins increased by 5% for archaebacteria and by 7% for eubacteria over the last 3 years. Assuming that this rate is maintained and that determination of 3D structures for predicted disordered regions is unattainable, model structures for all soluble prokaryotic proteins, excluding the putative disordered regions, will be in hand within 15 years. For eukaryotic proteins, they will be in hand within 25 years. The 3D structures available at those times will not be structures of the entire proteins encoded in single ORFs, but structures of separate structural domains. Measuring or predicting the spatial arrangement of structural domains within an ORF will then become the next issue for structural genomics.
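The headline figures in this abstract can be checked with a few lines of arithmetic. The snippet uses only the numbers quoted above; the variable names are ours.

```python
# Genome counts quoted in the abstract.
genomes = {
    "archaebacterial": 17,
    "eubacterial": 130,
    "eukaryotic": 18,
    "phage": 111,
}
modeled_orfs = 368_724   # ORFs with a modeled 3D structure
total_orfs = 734_193     # ORFs predicted in all genomes

total_genomes = sum(genomes.values())
coverage_percent = 100 * modeled_orfs / total_orfs

print(total_genomes)                 # number of genomes covered
print(round(coverage_percent, 1))    # percent of ORFs with a model
```

The genome counts add up to the stated 276, and the coverage works out to about 50.2%, in line with the "approximately 50%" given in the abstract.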

Springer - Publisher Connector