Warwick Research Archives Portal Repository

Enlighten

The University of Manchester - Institutional Repository

Proceedings - University of Groningen

ARTS repository - University of Groningen

University of Groningen Digital Archive

Radboud Repository

Dissertations of the University of Groningen

Exploring the Evolution of Novel Enzyme Functions within Structurally Defined Protein Superfamilies

Author: A Andreeva
AE Todd
AL Cuff
Alison L. Cuff
AU Tamuri
BE Engelhardt
BH Dessailly
C Chothia
CA Orengo
Christine A. Orengo
DA Benson
DE Almonacid
DM Schmidt
DS Tawfik
G Caetano-Anolles
GA Reeves
Gemma L. Holliday
GJ Bartlett
GJ Binford
GL Holliday
HS Park
I Nobeli
Ian Sillitoe
J Ruan
J Shi
Janet M. Thornton
JP Overington
K Katoh
LH Greene
M Bashton
M Groll
M Xu
ME Glasner
MT Murakami
N Furnham
N Gallastegui
Nicholas Furnham
NJ Mulder
O Khersonsky
PF Gherardini
PJ O'Brien
Roman A. Laskowski
SC Pegg
SD Brown
SF Altschul
W Heinemeyer
WS Valdar
Yanay Ofran
Publication venue: Public Library of Science
Publication date: 01/01/2011

To understand the evolution of enzyme reactions and to gain an overview of biological catalysis, we have combined sequence and structural data to generate phylogenetic trees in an analysis of 276 structurally defined enzyme superfamilies, and used these to study how enzyme functions have evolved. We describe in detail the analysis of two superfamilies to illustrate different paradigms of enzyme evolution. Gathering together data from all the superfamilies supports and develops the observation that they have all evolved to act on a diverse set of substrates, whilst the evolution of new chemistry is much less common. Even so, by bringing together so much data, we can provide a comprehensive overview of the most common and rare types of changes in function. Our analysis demonstrates, on a larger scale than previously studied, that modifications in overall chemistry still occur, with all possible changes at the primary level of the Enzyme Commission (E.C.) classification observed to a greater or lesser extent. The phylogenetic trees map out the evolutionary route taken within a superfamily, as well as all the possible changes within a superfamily. This has been used to generate a matrix of observed exchanges from one enzyme function to another, revealing the scale and nature of enzyme evolution and showing that some types of exchanges between and within E.C. classes are more prevalent than others. Surprisingly, a large proportion (71%) of all known enzyme functions are performed by this relatively small set of 276 superfamilies. This reinforces the hypothesis that relatively few ancient enzymatic domain superfamilies were progenitors for most of the chemistry required for life.
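The "matrix of observed exchanges" mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each tree edge has already been reduced to a (parent E.C. number, child E.C. number) pair; the function names and the example edges are hypothetical, and this is not the authors' pipeline.

```python
from collections import Counter

def ec_class(ec_number: str) -> int:
    """Return the primary E.C. class, i.e. the first field of an E.C. number."""
    return int(ec_number.split(".")[0])

def exchange_matrix(edges):
    """Count changes of primary E.C. class along parent -> child tree edges.

    `edges` is an iterable of (parent_ec, child_ec) string pairs.
    Edges that stay within one primary class are ignored.
    """
    counts = Counter()
    for parent_ec, child_ec in edges:
        a, b = ec_class(parent_ec), ec_class(child_ec)
        if a != b:
            counts[(a, b)] += 1
    return counts

# Hypothetical edges, for illustration only.
edges = [
    ("1.1.1.1", "1.1.1.2"),  # same primary class, not counted
    ("3.1.1.1", "4.2.1.1"),
    ("3.1.1.1", "4.1.1.1"),
    ("2.7.1.1", "3.1.3.1"),
]
matrix = exchange_matrix(edges)
```

Keying the counter on ordered pairs keeps the direction of each exchange; summing `(a, b)` and `(b, a)` would give an undirected view instead.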

LSHTM Research Online

UCL Discovery

FigShare

Bhageerath: an energy based web enabled computer software suite for limiting the search space of tertiary structures of small globular proteins

Author: Al-Lazikani
Altschul
Anfinsen
Aszodi
B. Jayaram
Baker
Bates
Berman
Bradley
Bryson
Cheng
Combet
Cuff
Debashish Sahu
Frishman
Fujitsuka
Guex
Huang
Hubbard
Hung
Kim
Klepeis
Kolinski
Kumkum Bhushan
Lambert
Liwo
Lund
Moult
Narang
Ogata
Ortiz
Panchenko
Pillardy
Pooja Narang
Praveen Agrawal
Rost
Sali
Sandhya R. Shenoy
Scheraga
Sen
Simons
Skolnick
Surojit Bose
Sánchez
Tramontano
Tress
Vasquez
Venclovas
Vidhu Pandey
Zemla
Publication venue: Oxford University Press
Publication date: 07/11/2006

We describe here an energy-based computer software suite for narrowing down the search space of tertiary structures of small globular proteins. The protocol comprises eight different computational modules that form an automated pipeline. It combines physics-based potentials with biophysical filters to arrive at 10 plausible candidate structures starting from sequence and secondary structure information. The methodology has been validated here on 50 small globular proteins consisting of 2–3 helices and strands with known tertiary structures. For each of these proteins, a structure within 3–6 Å RMSD (root mean square deviation) of the native has been obtained in the 10 lowest energy structures. The protocol has been web-enabled and is accessible at
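The RMSD figure quoted in the abstract can be made concrete with a short sketch. It assumes the two structures are already superimposed and that atoms are listed in matching order; no optimal superposition step is performed, and the function name is illustrative rather than part of the described suite.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z).

    Assumes the structures are already superimposed and atom-matched.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Two atoms, each displaced by 3 Å along z: the RMSD is 3.0 Å.
value = rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 3), (1, 0, 3)])
```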

EC-BLAST: a tool to automatically search and compare enzyme reactions.

Author: A Dalby
A Theocharidis
AL Cuff
C Jochum
C Steinbeck
DARSD Latino
F Mu
Gemma L Holliday
I Ugi
J Lees
J-L Faulon
Janet M Thornton
K Tipton
L Chen
M Kanehisa
M Kotera
M Leber
Nicholas Furnham
NM O'Boyle
Q-Y Zhang
RHS Thompson
RS Cahn
S Heller
SA Rahman
Sergio Martinez Cuesta
Syed Asad Rahman
T Sing
V Egelhofer
V Prelog
WL Chen
Y Yamanishi
Publication venue: Springer Science and Business Media LLC
Publication date: 12/01/2014

We present EC-BLAST (http://www.ebi.ac.uk/thornton-srv/software/rbl/), an algorithm and Web tool for quantitative similarity searches between enzyme reactions at three levels: bond change, reaction center and reaction structure similarity. It uses bond changes and reaction patterns for all known biochemical reactions derived from atom-atom mapping across each reaction. EC-BLAST has the potential to improve enzyme classification, identify previously uncharacterized or new biochemical transformations, improve the assignment of enzyme function to sequences, and assist in enzyme engineering.
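As an illustration of what a bond-change similarity could look like, the sketch below scores two reactions with a Tanimoto coefficient over counts of bond changes. This is an assumption for illustration: the abstract does not state the scoring function EC-BLAST uses, and the bond-change labels and function name here are hypothetical.

```python
from collections import Counter

def tanimoto(a: Counter, b: Counter) -> float:
    """Tanimoto similarity between two bond-change count vectors."""
    keys = set(a) | set(b)
    shared = sum(min(a[k], b[k]) for k in keys)
    total = sum(max(a[k], b[k]) for k in keys)
    # Two empty vectors are treated as identical.
    return shared / total if total else 1.0

# Hypothetical bond-change fingerprints for two reactions.
reaction_1 = Counter({"C-O cleaved": 1, "O-H formed": 1})
reaction_2 = Counter({"C-O cleaved": 1, "C-N formed": 1})
score = tanimoto(reaction_1, reaction_2)  # one shared change out of three distinct
```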

LSHTM Research Online

Predicting Positive p53 Cancer Rescue Regions Using Most Informative Positive (MIP) Active Learning

Author: A Friedler
A Petitjean
A Ventura
AC Joerger
AC Martin
AL Cuff
AN Bullock
AR Fersht
BG Buchanan
CL Brooks
DA Case
DA Cohn
EF Pettersen
F Francois
F Glaser
G Dantas
G. Wesley Hatfield
IH Witten
J Feng
James M. Briggs
JM Lambert
JS Huston
K Otsuka
Kirsty Salmon
L Itti
Linda Hall
Lydia Ho
M Hollstein
M Saar-Tsechansky
MA Hearst
N Roy
NE Sharpless
NG Karaguler
P Baldi
Peter Kaiser
PV Nikolova
R Jones
Richard H. Lathrop
RJ Fox
RK Brachmann
Roberta Baronio
S Kato
S Lain
SA Danziger
Samuel A. Danziger
SM Leach
TE Baroni
VJ Bykov
W Wang
W Xue
Y Cho
Publication venue: Public Library of Science
Publication date: 01/01/2009

Many protein engineering problems involve finding mutations that produce proteins with a particular function. Computational active learning is an attractive approach to discover desired biological activities. Traditional active learning techniques have been optimized to iteratively improve classifier accuracy, not to quickly discover biologically significant results. We report here a novel active learning technique, Most Informative Positive (MIP), which is tailored to biological problems because it seeks novel and informative positive results. MIP active learning differs from traditional active learning methods in two ways: (1) it preferentially seeks Positive (functionally active) examples; and (2) it may be effectively extended to select gene regions suitable for high throughput combinatorial mutagenesis. We applied MIP to discover mutations in the tumor suppressor protein p53 that reactivate mutated p53 found in human cancers. This is an important biomedical goal because p53 mutants have been implicated in half of all human cancers, and restoring active p53 in tumors leads to tumor regression. MIP found Positive (cancer rescue) p53 mutants in silico using 33% fewer experiments than traditional non-MIP active learning, with only a minor decrease in classifier accuracy. Applying MIP to in vivo experimentation yielded immediate Positive results. Ten different p53 mutations found in human cancers were paired in silico with all possible single amino acid rescue mutations, from which MIP was used to select a Positive Region predicted to be enriched for p53 cancer rescue mutants. 
In vivo assays showed that the predicted Positive Region: (1) had significantly more (p<0.01) new strong cancer rescue mutants than control regions (Negative, and non-MIP active learning); (2) had slightly more new strong cancer rescue mutants than an Expert region selected for purely biological considerations; and (3) rescued for the first time the previously unrescuable p53 cancer mutant P152L.

eScholarship - University of California

Detection of Alpha-Rod Protein Repeats Using a Neural Network and Application to Huntingtin

Author: A Adami
A Akhmanova
A Cena
A Krogh
A Losada
A Lupas
AF Neuwald
Anup Arumughan
AV Kajava
B Falkowska-Hansen
B Rost
BE McGuinness
BM Collins
C Cole
CL Wellington
D Baillat
E Sontag
E Staub
EE Heldwein
EF Smith
Erich E. Wanker
F Rosenblatt
G Hoffner
Gareth A. Palidwor
HC Gregson
I Letunic
I Melvin
J Al-Bassam
J Nasir
JA Cuff
L Cassimeris
L Chelysheva
L Smith
L Spagnolo
LJ McGuffin
LM Mende-Mueller
Luis Sanchez-Pulido
M Barrios-Rodiles
M Gruber
M Nakayama
M Oeffinger
M Peifer
M Sagermann
M Yao
MA Andrade
Matthew R. Huska
MD Hatfield
Miguel A. Andrade-Navarro
MM Golas
MR Huska
MS Boguski
NC Turner
P Harjes
P Legrand
Pablo Porras
Philip E. Bourne
PJ Preker
R Sapiro
Raphaele Foulle
RD Finn
S Hauf
Sergey Shcherbinin
SF Altschul
SY Lee
T Tukamoto
Tamas Rasko
U Stelzl
Ulrich Stelzl
US Tulu
W Li
Y Bai
Y Mao
Y Matsuura
Y Mimori-Kiyosue
Y Shomura
Y Wang
Publication venue: Public Library of Science
Publication date: 01/01/2009

A growing number of solved protein structures display an elongated structural domain, denoted here as alpha-rod, composed of stacked pairs of anti-parallel alpha-helices. Alpha-rods are flexible and expose a large surface, which makes them suitable for protein interaction. Although they most likely originated by tandem duplication of a two-helix unit, their detection using sequence similarity between repeats is poor. Here, we show that alpha-rod repeats can be detected using a neural network. The network detects more repeats than are identified by domain databases using multiple profiles, with a low level of false positives (<10%). We identify alpha-rod repeats in approximately 0.4% of proteins in eukaryotic genomes. We then investigate the results for all human proteins, identifying alpha-rod repeats for the first time in six protein families, including proteins STAG1-3, SERAC1, and PSMD1-2 & 5. We also characterize a short version of these repeats in eight protein families of Archaeal, Bacterial, and Fungal species. Finally, we demonstrate the utility of these predictions in directing experimental work to demarcate three alpha-rods in huntingtin, a protein mutated in Huntington's disease. Using yeast two-hybrid analysis and an immunoprecipitation technique, we show that the huntingtin fragments containing alpha-rods associate with each other. This is the first definition of domains in huntingtin and the first validation of predicted interactions between fragments of huntingtin, which sets up directions toward functional characterization of this protein. An implementation of the repeat detection algorithm is available as a Web server with a simple graphical output: http://www.ogic.ca/projects/ard. This can be further visualized using BiasViz, a graphic tool for representation of multiple sequence alignments.

Oxford University Research Archive

MDC Repository

MPG.PuRe

Solution Structure and Phylogenetics of Prod1, a Member of the Three-Finger Protein Superfamily Implicated in Salamander Limb Regeneration

Prod1 is a cell-surface molecule of the three-finger protein (TFP) superfamily involved in the specification of newt limb proximodistal (PD) identity. The TFP superfamily is a highly diverse group of metazoan proteins that includes snake venom toxins, mammalian transmembrane receptors and miscellaneous signaling molecules... The available data suggest that Prod1, and thereby its role in encoding PD identity, is restricted to salamanders. The lack of comparable limb-regenerative capability in other adult vertebrates could be correlated with the absence of the Prod1 gene.

UCL Discovery

High pressure near infrared study of the mutated light-harvesting complex LH2

Author: Braun P
Cuff AL
Fowler GJ
Fyfe PK
Gall A
Jones MR
Kwa LG
L. Kwa
McDermott G
Olsen JD
P. Braun
R. Gebhardt
Reddy NRS
Sauer K
Sturgis JN
W. Doster
Zollfrank J
Publication venue: FapUNIFESP (SciELO)

Novel Peptide-Mediated Interactions Derived from High-Resolution 3-Dimensional Structures

Many biological responses to intra- and extracellular stimuli are regulated through complex networks of transient protein interactions where a globular domain in one protein recognizes a linear peptide from another, creating a relatively small contact interface. These peptide stretches are often found in unstructured regions of proteins, and contain a consensus motif complementary to the interaction surface displayed by their binding partners. While most current methods for the de novo discovery of such motifs exploit their tendency to occur in disordered regions, our work here focuses on another observation: upon binding to their partner domain, motifs adopt a well-defined structure. Indeed, through the analysis of all peptide-mediated interactions of known high-resolution three-dimensional (3D) structure, we found that the structure of the peptide may be as characteristic as the consensus motif, and may help identify target peptides even though they do not match the established patterns. Our analyses of the structural features of known motifs reveal that they tend to have a particular stretched and elongated structure, unlike most other peptides of the same length. Accordingly, we have implemented a strategy based on a Support Vector Machine that uses these features, along with other structure-encoded information about binding interfaces, to search the set of protein interactions of known 3D structure and to identify unnoticed peptide-mediated interactions among them. We have also derived consensus patterns for these interactions, whenever enough information was available, and compared our results with established linear motif patterns and their binding domains. Finally, to cross-validate our identification strategy, we scanned interactome networks from four model organisms with our newly derived patterns to see if any of them occurred more often than expected.
Indeed, we found significant over-representations for 64 domain-motif interactions, 46 of which had not been described before, involving over 6,000 interactions in total for which we could suggest the molecular details determining the binding.
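The idea of a pattern occurring "more often than expected" can be sketched as a one-sided binomial test. This is an assumption for illustration: the abstract does not name the statistical test used, and the function name and the numbers below are hypothetical.

```python
from math import comb

def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p).

    Here n is the number of interactions scanned, k the number that match
    a domain-motif pattern, and p the background probability of a match.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical counts: 12 matches among 100 interactions, background rate 5%.
p_value = binomial_tail(12, 100, 0.05)
```

A small `p_value` would indicate over-representation relative to the assumed background rate; scanning many patterns at once would also call for a multiple-testing correction.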

An Atlas of the Thioredoxin Fold Class Reveals the Complexity of Function-Enabling Adaptations

The group of proteins that contain a thioredoxin (Trx) fold is huge and diverse. Assessment of the variation in catalytic machinery of Trx fold proteins is essential in providing a foundation for understanding their functional diversity and predicting the function of the many uncharacterized members of the class. The proteins of the Trx fold class retain common features, including variations on a dithiol CxxC active site motif, that lead to delivery of function. We use protein similarity networks to guide an analysis of how structural and sequence motifs track with catalytic function and taxonomic categories for 4,082 representative sequences spanning the known superfamilies of the Trx fold. Domain structure in the fold class is varied and modular, with 2.8% of sequences containing more than one Trx fold domain. Most member proteins are bacterial. The fold class exhibits many modifications to the CxxC active site motif: only 56.8% of proteins have both cysteines, and no functional groupings have absolute conservation of the expected catalytic motif. Only a small fraction of Trx fold sequences have been functionally characterized. This work provides a global view of the complex distribution of domains and catalytic machinery throughout the fold class, showing that each superfamily contains remnants of the CxxC active site. The unifying context provided by this work can guide the comparison of members of different Trx fold superfamilies to gain insight about their structure-function relationships, illustrated here with the thioredoxins and peroxiredoxins.
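A CxxC motif (two cysteines separated by any two residues) is simple to search for in a sequence. The sketch below is illustrative only: it scans whole sequences, whereas the study presumably examines the aligned active-site position, and the function names and example sequences are hypothetical.

```python
import re

# The lookahead lets overlapping occurrences (e.g. in "CAACAAC") all be reported.
CXXC = re.compile(r"(?=(C..C))")

def find_cxxc(sequence: str):
    """Return (0-based start, matched text) for every CxxC occurrence."""
    return [(m.start(), m.group(1)) for m in CXXC.finditer(sequence.upper())]

def fraction_with_cxxc(sequences) -> float:
    """Fraction of sequences that contain at least one CxxC motif."""
    sequences = list(sequences)
    if not sequences:
        return 0.0
    return sum(1 for s in sequences if find_cxxc(s)) / len(sequences)

# Hypothetical fragments: the second has the first cysteine replaced by serine.
hits = find_cxxc("MKWCGPCKA")
fraction = fraction_with_cxxc(["MKWCGPCKA", "MKWSGPCKA"])
```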