78 research outputs found

DIMA 3.0: Domain Interaction Map

Author: B. Vilne
D. Frishman
Davis
Edgar
Fodor
Gong
Halperin
Jefferson
Jothi
Kass
Ng
Olmea
P. Pagel
Pagel
Pellegrini
Q. Luo
Riley
Schuster-Bockler
Sonnhammer
Tatusov
Winter
Yeang
Zhao
Publication venue: Oxford University Press
Publication date: 01/01/2011

Domain Interaction MAp (DIMA, available at http://webclu.bio.wzw.tum.de/dima) is a database of predicted and known interactions between protein domains. It integrates 5807 structurally known interactions imported from the iPfam and 3did databases and 46 900 domain interactions predicted by four computational methods: domain phylogenetic profiling, the domain pair exclusion algorithm, correlated mutations, and domain interaction prediction in a discriminative way. Additionally, predictions are filtered to exclude those domain pairs that are reported as non-interacting by the Negatome database. The DIMA Web site allows users to calculate domain interaction networks either for a domain of interest or for entire organisms, and to explore them interactively using the Flash-based Cytoscape Web software.

Crossref

PubMed Central

PuSH

Riga Stradins university

Single cell RNA-seq reveals profound transcriptional similarity between Barrett's oesophagus and oesophageal submucosal glands

Author: Bailey A
Braden B
Buck D
Goldin R
Green A
Lu X
Maynard ND
Middleton MR
Owen RP
Piazza P
Ponting CP
Ruiz-Puig C
Schuster-Bockler B
Severson DT
Wang LM
White MJ
Publication venue: Springer Science and Business Media LLC
Publication date: 19/09/2018

Barrett’s oesophagus is a precursor of oesophageal adenocarcinoma. In this common condition, squamous epithelium in the oesophagus is replaced by columnar epithelium in response to acid reflux. Barrett’s oesophagus is highly heterogeneous and its relationships to normal tissues are unclear. Here we investigate the cellular complexity of Barrett’s oesophagus and the upper gastrointestinal tract using RNA-sequencing of single cells from multiple biopsies from six patients with Barrett’s oesophagus and two patients without oesophageal pathology. We find that cell populations in Barrett’s oesophagus, marked by LEFTY1 and OLFM4, exhibit a profound transcriptional overlap with oesophageal submucosal gland cells, but not with gastric or duodenal cells. Additionally, SPINK4 and ITLN1 mark cells that precede morphologically identifiable goblet cells in colon and Barrett’s oesophagus, potentially aiding the identification of metaplasia. Our findings reveal striking transcriptional relationships between normal tissue populations and cells in a premalignant condition, with implications for clinical practice

Spiral - Imperial College Digital Repository

Domain-Domain Interactions Underlying Herpesvirus-Human Protein-Protein Interaction Networks

Author: A Chatr-aryamontri
A Schlicker
A Stein
B Aranda
B Schuster-Bockler
BD Greenbaum
C Uniprot
D Maglott
E Fossum
EJG Pitman
EV Koonin
EW Verschuren
I Bahir
LM Iyer
MA Calderwood
MD Dyer
RD Finn
Robert Belshaw
S Costa
S Redpath
SF Altschul
SI Yoon
T Driscoll
T Pawson
TM Nye
V Navratil
Y Nakamura
Z Itzhaki
Zohar Itzhaki
Publication venue: Public Library of Science
Publication date: 01/01/2011

Protein-domains play an important role in mediating protein-protein interactions. Furthermore, the same domain-pairs mediate different interactions in different contexts and in various organisms, and therefore domain-pairs are considered as the building blocks of interactome networks. Here we extend these principles to the host-virus interface and find the domain-pairs that potentially mediate human-herpesvirus interactions. Notably, we find that the same domain-pairs used by other organisms for mediating their interactions underlie statistically significant fractions of human-virus protein inter-interaction networks. Our analysis shows that viral domains tend to interact with human domains that are hubs in the human domain-domain interaction network. This may enable the virus to easily interfere with a variety of mechanisms and processes involving various and different human proteins carrying the relevant hub domain. Comparative genomics analysis provides hints at a molecular mechanism by which the virus acquired some of its interacting domains from its human host

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

A Score of the Ability of a Three-Dimensional Protein Model to Retrieve Its Own Sequence as a Quantitative Measure of Its Quality and Appropriateness

Author: A Abyzov
A Aguzzi
A Andreeva
B Elshorst
B Kuhlman
B Schuster-Bockler
B Wallner
CA Orengo
CM Leslin
CT Saunders
D Chivian
D Röthlisberger
DE Tronrud
DL Wheeler
F Abascal
F Melo
G Dantas
J Pei
JD Thompson
JS Richardson
K Karplus
K Raha
KD Pruitt
L Jiang
León P. Martínez-Castilla
M Wiederstein
MR Gómez-García
Niall James Haslam
R Lüthy
Rogelio Rodríguez-Sotres
S Guindon
SR Eddy
TC Pochapsky
VA Ilyin
WA Sheffler
Publication venue: Public Library of Science
Publication date: 07/09/2010

BACKGROUND: Despite the remarkable progress of bioinformatics, how the primary structure of a protein leads to a three-dimensional fold, and in turn determines its function, remains an elusive question. Alignments of sequences with known function can be used to identify proteins with the same or similar function with high success. However, identification of function-related and structure-related amino acid positions is only possible after a detailed study of every protein. Folding pattern diversity seems to be much narrower than sequence diversity, and the amino acid sequences of natural proteins have evolved under a selective pressure comprising structural and functional requirements acting in parallel. PRINCIPAL FINDINGS: The approach described in this work begins by generating a large number of amino acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449-460], a program with notable robustness in the assignment of amino acids to a known three-dimensional structure. The resulting sequence sets showed no conservation of amino acids at active sites or protein-protein interfaces. Hidden Markov models built from the resulting sequence sets were used to search sequence databases. Surprisingly, the sequences that the models retrieved from the database belonged to proteins with the same or a very similar function. Given an appropriate cutoff, the rate of false positives was zero. According to our results, this protocol, here referred to as Rd.HMM, detects fine structural details of the folding patterns that seem to be tightly linked to the fitness of a structural framework for a specific biological function. CONCLUSION: Because the sequence of the native protein used to create the Rd.HMM model was always amongst the top hits, the procedure is a reliable tool to score, very accurately, the quality and appropriateness of computer-modeled 3D-structures, without the need for spectroscopy data. However, Rd.HMM is very sensitive to the conformational features of the models' backbone.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Identification of residues in the N-terminal PAS domains important for dimerization of Arnt and AhR

Author: Amezcua
Amoutzias
Anne Chapman-Smith
Antonsson
Arpiainen
Buttani
Card
Chapman-Smith
Costanzo
DeYoung
Dhayalan
Dove
Erbel
Ian B. Dodd
Keith E. Shearwin
Kewley
Kikuchi
Kim
Kirk
Lee
Lievens
Lindebro
Ma
Makino
McGuire
McIntosh
Moffett
Murray
Murray L. Whitelaw
Nair
Nan Hao
Numayama-Tsuruta
Palmer
Pereira-Leal
Pinkett
Pongratz
Raussens
Ravasi
Rigaut
Scheuermann
Schubert
Schuster-Bockler
Shih
Soshilov
Sun
Taylor
Ullmann
Vivat-Hannah
von Mering
Whelan
Yang
Yildiz
Zhong
Zoltowski
Publication venue: Oxford University Press
Publication date: 01/01/2010

The basic helix–loop–helix (bHLH).PAS dimeric transcription factors have crucial roles in development, stress response, oxygen homeostasis and neurogenesis. Their target gene specificity depends in part on partner protein choices, where dimerization with common partner Aryl hydrocarbon receptor nuclear translocator (Arnt) is an essential step towards forming active, DNA binding complexes. Using a new bacterial two-hybrid system that selects for loss of protein interactions, we have identified 22 amino acids in the N-terminal PAS domain of Arnt that are involved in heterodimerization with aryl hydrocarbon receptor (AhR). Of these, Arnt E163 and Arnt S190 were selective for the AhR/Arnt interaction, since mutations at these positions had little effect on Arnt dimerization with other bHLH.PAS partners, while substitution of Arnt D217 affected the interaction with both AhR and hypoxia inducible factor-1α but not with single minded 1 and 2 or neuronal PAS4. Arnt uses the same face of the N-terminal PAS domain for homo- and heterodimerization and mutational analysis of AhR demonstrated that the equivalent region is used by AhR when dimerizing with Arnt. These interfaces differ from the PAS β-scaffold surfaces used for dimerization between the C-terminal PAS domains of hypoxia inducible factor-2α and Arnt, commonly used for PAS domain interactions

CiteSeerX

Crossref

Adelaide Research & Scholarship

PubMed Central

Incorporating background frequency improves entropy-based residue conservation measures

Author: B Schuster-Bockler
C Sander
CH Wu
CT Porter
CT Workman
D La
E Bindewald
G Cheng
GD Stormo
GE Crooks
H Yao
I Mihalek
J Pei
JD Watson
JM Johnson
JP Bielawski
K Sjolander
K Wang
Kai Wang
KW Plaxco
L Oliveira
LA Mirny
M Clamp
M Gerstein
M Landau
O Lichtarge
OS Soyer
PC Ng
PS Shenkin
R Greaves
Ram Samudrala
RB Vilim
RM Williamson
S Jones
S Levy
SF Altschul
SR Eddy
SR Sunyaev
SS Hannenhalli
TM Cover
V Chelliah
WS Valdar
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Several entropy-based methods have been developed for scoring sequence conservation in protein multiple sequence alignments. High scoring amino acid positions may correlate with structurally or functionally important residues. However, amino acid background frequencies are usually not taken into account in these entropy-based scoring schemes. RESULTS: We demonstrate that using a relative entropy measure that incorporates amino acid background frequency results in improved performance in identifying functional sites from protein multiple sequence alignments. CONCLUSION: Our results suggest that the application of appropriate background frequency information may lead to more biologically relevant results in many areas of bioinformatics
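The contrast drawn in this abstract, between plain entropy scoring and a relative entropy that uses background frequencies, can be illustrated with a minimal sketch. It uses the standard Kullback-Leibler form, which is an assumption here and may differ from the authors' exact formulation. The function names, the pseudocount, and the uniform background are hypothetical; a real analysis would use empirical background frequencies.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical placeholder background: uniform over the 20 amino acids.
UNIFORM_BACKGROUND = {aa: 1 / 20 for aa in AMINO_ACIDS}


def shannon_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gaps.

    Lower entropy means higher conservation; background is not used.
    """
    residues = [c for c in column if c in AMINO_ACIDS]
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def relative_entropy(column, background, pseudocount=1e-6):
    """Kullback-Leibler divergence (bits) of column frequencies from background.

    Higher values mean the column departs more strongly from what the
    background alone would predict. The pseudocount (an assumption of this
    sketch) keeps unobserved residues from producing log(0).
    """
    residues = [c for c in column if c in background]
    counts = Counter(residues)
    total = len(residues)
    denom = total + pseudocount * len(background)
    score = 0.0
    for aa, q in background.items():
        p = (counts.get(aa, 0) + pseudocount) / denom
        score += p * math.log2(p / q)
    return score
```

With a background in which tryptophan is rare and leucine is common, a fully conserved tryptophan column scores higher than a fully conserved leucine column, whereas Shannon entropy rates the two columns identically.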

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

GPS-ARM: Computational Analysis of the APC/C Recognition Motif by Predicting D-Boxes and KEN-Boxes

Author: A Castro
A Reis
AH Kim
B Schuster-Bockler
BR Thornton
C Liot
CM Pfleger
D Barford
Fang Yuan
G Fang
H Dinkel
H Naoe
HG Nguyen
J Ren
Jian Ren
JM Peters
Jun Cao
K Nasmyth
M Glotzer
M Pagano
M Silies
MD Gurden
MJ Kallio
NE Davey
Niall James Haslam
PC da Fonseca
Qing Yang
RW King
S Michael
S Tudzarova
SL Colombo
TJ Owens
V Sudakin
Y Kurasawa
Y Zhou
Yanhong Zhou
Yu Xue
Z Liu
Zexian Liu
Publication venue: Public Library of Science
Publication date: 29/03/2012

Anaphase-promoting complex/cyclosome (APC/C), an E3 ubiquitin ligase incorporated with Cdh1 and/or Cdc20, recognizes and interacts with specific substrates, and faithfully orchestrates the proper cell cycle events by targeting proteins for proteasomal degradation. Experimental identification of APC/C substrates is largely dependent on the discovery of APC/C recognition motifs, e.g., the D-box and KEN-box. Although a number of either stringent or loosely defined motifs have been proposed, these motif patterns are only of limited use due to their insufficient predictive power. We report the development of a novel GPS-ARM software package for the prediction of D-boxes and KEN-boxes in proteins. Using experimentally identified D-boxes and KEN-boxes as the training data sets, a previously developed GPS (Group-based Prediction System) algorithm was adopted. By extensive evaluation and comparison, the GPS-ARM performance was found to be much better than that of simple motifs. With this powerful tool, we predicted 4,841 potential D-boxes in 3,832 proteins and 1,632 potential KEN-boxes in 1,403 proteins from H. sapiens, while further statistical analysis suggested that both the D-box and KEN-box proteins are involved in a broad spectrum of biological processes beyond the cell cycle. In addition, with the co-localization information, we predicted hundreds of mitosis-specific APC/C substrates with high confidence. As the first computational tool for the prediction of APC/C-mediated degradation, GPS-ARM provides useful information for further experimental investigations. The GPS-ARM is freely accessible for academic researchers at: http://arm.biocuckoo.org
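The kind of "simple motif" baseline that this abstract contrasts with GPS-ARM can be sketched as a regular-expression scan. The patterns below are the commonly cited minimal consensus forms (R-x-x-L for the D-box, K-E-N for the KEN-box). They are assumed for illustration and are not taken from this listing. The function name is hypothetical, and none of this reproduces the GPS algorithm itself.

```python
import re

# Assumed minimal consensus patterns; lookaheads allow overlapping matches.
SIMPLE_MOTIFS = {
    "D-box": re.compile(r"(?=(R..L))"),
    "KEN-box": re.compile(r"(?=(KEN))"),
}


def scan_simple_motifs(sequence):
    """Return (motif name, 1-based start position, matched text) tuples,
    sorted by position, for every pattern occurrence in a protein sequence."""
    seq = sequence.upper()
    hits = []
    for name, pattern in SIMPLE_MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])
```

Because such short patterns occur frequently by chance, a scan like this is expected to produce many false positives, which is consistent with the abstract's point that simple motifs have limited predictive power.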

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The Francis Crick Institute

Comparative analysis of carboxysome shell proteins

Author: AK-C So
B Schuster-Bockler
BM Long
C Lichtle
CA Kerfeld
Cheryl A. Kerfeld
CS Crowley
CV Iancu
E Gantt
E Marco
F Cai
F Partensky
FR Tabita
GC Cannon
GD Price
H Ashkenazy
HJ Tripp
James N. Kinney
JB Parsons
JM Shively
KL Peña
L Fridlyand
M Ludwig
M Sagermann
MF Schmid
MG Klein
MR Badger
MR Sawaya
NA Baker
OS Smart
RC Edgar
S Tanaka
Seth D. Axen
SH Baker
SR Eddy
SS-W Cot
TO Yeates
WB Whitman
Y Marcus
Y Tsai
Z Dou
Z Wunderlich
Publication venue: Springer Netherlands
Publication date: 01/01/2011

Carboxysomes are metabolic modules for CO2 fixation that are found in all cyanobacteria and some chemoautotrophic bacteria. They comprise a semi-permeable proteinaceous shell that encapsulates ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) and carbonic anhydrase. Structural studies are revealing the integral role of the shell protein paralogs to carboxysome form and function. The shell proteins are composed of two domain classes: those with the bacterial microcompartment (BMC; Pfam00936) domain, which oligomerize to form (pseudo)hexamers, and those with the CcmL/EutN (Pfam03319) domain which form pentamers in carboxysomes. These two shell protein types are proposed to be the basis for the carboxysome’s icosahedral geometry. The shell proteins are also thought to allow the flux of metabolites across the shell through the presence of the small pore formed by their hexameric/pentameric symmetry axes. In this review, we describe bioinformatic and structural analyses that highlight the important primary, tertiary, and quaternary structural features of these conserved shell subunits. In the future, further understanding of these molecular building blocks may provide the basis for enhancing CO2 fixation in other organisms or creating novel biological nanostructures

Crossref

Springer - Publisher Connector

PubMed Central

Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans

Author: A Eyre-Walker
A Hodgkinson
A Kong
Adam Eyre-Walker
B Arbeithuber
B Paten
B Schuster-Bockler
C Seoighe
C TEP
DF Conrad
DL Bodian
E Kenigsberg
F Chiaromonte
F Pratto
F Supek
G Bernardi
G McVicker
GP Holmquist
H Jonsson
I Hellmann
J Filipski
J Meunier
JB Haldane
JC Dohm
JJ Cai
JJ Michaelson
K Harris
K Wolfe
KE Lohmueller
KH Wolfe
L Duret
LC Francioli
M Blanchette
MJ Lercher
MW Nachman
NV Terekhanova
P Moorjani
P Polak
Peter F. Arndt
R Burgess
RE Thurman
RS Hansen
S Besenbacher
S Glemin
S Katzman
S Tyekucheva
Shamil R. Sunyaev
Thomas C. A. Smith
TI Gossmann
TN Phung
V Aggarwala
VM Schaibley
WS Wong
Y Benjamini
YH Woo
Publication venue: Public Library of Science (PLoS)
Publication date: 01/03/2018

It has long been suspected that the rate of mutation varies across the human genome at a large scale based on the divergence between humans and other species. However, it is now possible to directly investigate this question using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We investigate a number of questions pertaining to the distribution of mutations using more than 130,000 DNMs from three large datasets. We demonstrate that the amount and pattern of variation differ between datasets at the 1MB and 100KB scales, probably as a consequence of differences in sequencing technology and processing. In particular, datasets show different patterns of correlation to genomic variables such as replication time. Nevertheless, there are many commonalities between datasets, which likely represent true patterns. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scale that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome. We demonstrate that variation in the mutation rate does not generate large-scale variation in GC-content, and hence that mutation bias does not maintain the isochore structure of the human genome. We find that genomic features explain less than 40% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species is correlated to the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence was due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered.

Crossref

ZENODO

Directory of Open Access Journals

Dryad Digital Repository (Duke University)

Electronic Archiving System

Sussex Research Online

MPG.PuRe

The Francis Crick Institute

Depletion of somatic mutations in splicing-associated sequences in cancer genomes

Author: A Busch
A Woolfe
B Schuster-Bockler
BJ Blencowe
Cancer Genome Atlas Research Network
DA Denisov
DB Carlini
E Sebestyen
EF Caceres
EP Rocha
F Pagani
F Supek
H Jung
J-V Chamary
JJ Gartner
JL Parmley
JV Chamary
L Chen
Laurence D. Hurst
LB Alexandrov
LD Hurst
M Raponi
M Secrier
MS Lawrence
N Waddell
Nizar N. Batada
O Soukarieh
P Julien
P Polak
PA Futreal
R Savisaar
R Soemedi
RC Hunt
RD Schreiber
RS Hansen
S Kogan
S Nik-Zainal
S Subramanian
SH Lelieveld
T Derrien
T Khare
T Warnecke
VA Blomen
WG Fairbrother
X Chen
X Wu
XM Wu
Y Xing
ZE Sauna
Publication venue: Springer Science and Business Media LLC
Publication date: 01/11/2017

BACKGROUND: An important goal of cancer genomics is to identify systematically cancer-causing mutations. A common approach is to identify sites with high ratios of non-synonymous to synonymous mutations; however, if synonymous mutations are under purifying selection, this methodology leads to identification of false-positive mutations. Here, using synonymous somatic mutations (SSMs) identified in over 4000 tumours across 15 different cancer types, we sought to test this assumption by focusing on coding regions required for splicing. RESULTS: Exon flanks, which are enriched for sequences required for splicing fidelity, have ~17% lower SSM density compared to exonic cores, even after excluding canonical splice sites. While it is impossible to eliminate a mutation bias of unknown cause, multiple lines of evidence support a purifying selection model above a mutational bias explanation. The flank/core difference is not explained by skewed nucleotide content, replication timing, nucleosome occupancy or deficiency in mismatch repair. The depletion is not seen in tumour suppressors, consistent with their role in positive tumour selection, but is otherwise observed in cancer-associated and non-cancer genes, both essential and non-essential. Consistent with a role in splicing modulation, exonic splice enhancers have a lower SSM density before and after controlling for nucleotide composition; moreover, flanks at the 5’ end of the exons have significantly lower SSM density than at the 3’ end. CONCLUSIONS: These results suggest that the observable mutational spectrum of cancer genomes is not simply a product of various mutational processes and positive selection, but might also be shaped by negative selection.

Crossref

Directory of Open Access Journals

Edinburgh Research Explorer