
115 research outputs found

MemBrain: Improving the Accuracy of Predicting Transmembrane Helices

Author: A Krogh
A Makivirta
AA Schaffer
AK Chamberlain
B Rost
Bostjan Kobe
D Fu
DT Jones
G Shafer
H Zhou
HB Shen
Hongbin Shen
J Abramson
J Kyte
James J. Chou
JM Cuthbertson
KC Chou
L Kall
LM Zouhal
M Cserzo
MG Claros
RL Lieberman
SH White
T Denoeux
T Hirokawa
WC Wimley
Z Yuan
Publication venue: Public Library of Science
Publication date: 11/06/2008

Prediction of transmembrane helices (TMHs) in α-helical membrane proteins provides valuable information about protein topology when high-resolution structures are not available. Many predictors have been developed based on either amino acid hydrophobicity scales or purely statistical approaches. While these predictors perform reasonably well in identifying the number of TMHs in a protein, they are generally inaccurate in predicting the ends of TMHs, or TMHs of unusual length. To improve the accuracy of TMH detection, we developed a machine-learning-based predictor, MemBrain, which integrates a number of modern bioinformatics approaches, including sequence representation by multiple sequence alignment matrix, the optimized evidence-theoretic K-nearest neighbor prediction algorithm, fusion of multiple prediction window sizes, and classification by dynamic threshold. MemBrain demonstrates an overall improvement of about 20% in prediction accuracy, particularly in predicting the ends of TMHs and TMHs that are shorter than 15 residues. It also has the capability to detect N-terminal signal peptides. The MemBrain predictor is a useful sequence-based analysis tool for functional and structural characterization of helical membrane proteins; it is freely available at http://chou.med.harvard.edu/bioinf/MemBrain/
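
As a point of comparison for the hydrophobicity-scale predictors mentioned in the abstract above, here is a minimal sketch of a sliding-window hydropathy scan. It is not the MemBrain method. It uses the Kyte-Doolittle scale; the function names, the 19-residue window, and the 1.6 cutoff are illustrative assumptions, not values taken from this listing.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=19):
    """Mean hydropathy of every full-length window along seq."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_segments(seq, window=19, threshold=1.6):
    """Merge windows above the threshold into (start, end) spans.

    Positions are 0-based and end is exclusive. The window size and
    threshold are assumptions chosen for illustration only.
    """
    segments = []
    for i, mean in enumerate(hydropathy_profile(seq, window)):
        if mean <= threshold:
            continue
        start, end = i, i + window
        if segments and start <= segments[-1][1]:
            # Overlaps or touches the previous span: extend it.
            segments[-1] = (segments[-1][0], max(segments[-1][1], end))
        else:
            segments.append((start, end))
    return segments
```

On a toy sequence with a 20-residue leucine run at positions 10 to 29 flanked by aspartates, `candidate_segments("D" * 10 + "L" * 20 + "D" * 10)` returns `[(5, 35)]`: the hydrophobic stretch is found, but both ends are off by five residues, which is the kind of boundary error the abstract describes.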

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Taxonomic distribution and origins of the extended LHC (light-harvesting complex) antenna protein superfamily

Background: The extended light-harvesting complex (LHC) protein superfamily is a centerpiece of eukaryotic photosynthesis, comprising the LHC family and several families involved in photoprotection, like the LHC-like and the photosystem II subunit S (PSBS). The evolution of this complex superfamily has long remained elusive, partially due to previously missing families. Results: In this study we present a meticulous search for LHC-like sequences in public genome and expressed sequence tag databases covering twelve representative photosynthetic eukaryotes from the three primary lineages of plants (Plantae): glaucophytes, red algae and green plants (Viridiplantae). By introducing a coherent classification of the different protein families based on both hidden Markov model analyses and structural predictions, numerous new LHC-like sequences were identified and several new families were described, including the red lineage chlorophyll a/b-binding-like protein (RedCAP) family from red algae and diatoms. The test of alternative topologies of sequences of the highly conserved chlorophyll-binding core structure of LHC and PSBS proteins significantly supports the independent origins of LHC and PSBS families via two unrelated internal gene duplication events. This result was confirmed by the application of cluster likelihood mapping. Conclusions: The independent evolution of LHC and PSBS families is supported by strong phylogenetic evidence. In addition, a possible origin of LHC and PSBS families from different homologous members of the stress-enhanced protein subfamily, a diverse and anciently paralogous group of two-helix proteins, seems likely. The new hypothesis for the evolution of the extended LHC protein superfamily proposed here is in agreement with the character evolution analysis that incorporates the distribution of families and subfamilies across taxonomic lineages. Intriguingly, stress-enhanced proteins, which are universally found in the genomes of green plants, red algae, glaucophytes and in diatoms with complex plastids, could represent an important and previously missing link in the evolution of the extended LHC protein superfamily.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UPF Digital Repository

Digital.CSIC

A Combination of Compositional Index and Genetic Algorithm for Predicting Transmembrane Helical Segments

Author: A Krogh
A Thomas
B Rost
E Falkenauer
E Wallin
EL Sonnhammer
F Tekaia
G Tusnady
G von Heijne
GE Tusnady
H Berman
H Shen
H Zhou
J Holland
J Pylouster
JM Cuthbertson
L Kall
M Cserzo
M Suyama
MG Claros
Nazar Zaki
Pierandrea Temussi
R Garey
RY Kahsay
S Hosseini
S Jayasinghe
S Roy
Salah Bouktif
Sanja Lazarova-Molnar
T Hirokawa
T Nugent
T Taylor
Publication venue: Public Library of Science
Publication date: 01/01/2011

Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset were 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. Availability: The datasets and software, together with supplementary materials, are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm
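
To make the per-residue figures in the abstract above concrete, the following sketch shows one common way such metrics are computed. It assumes the standard definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the abstract does not state which definitions the authors used, and the function name and the `H`/`-` label encoding are hypothetical.

```python
def residue_metrics(true_labels, pred_labels, positive="H"):
    """Per-residue sensitivity and specificity for two label strings.

    Each character labels one residue; `positive` marks a TMH residue.
    Standard definitions are assumed here, not taken from the paper.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label strings must have the same length")
    tp = fp = tn = fn = 0
    for t, p in zip(true_labels, pred_labels):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Segment-level recall reported in the abstract: 376 of 378 segments.
segment_recall = 376 / 378
```

For example, `residue_metrics("--HHHH--", "-HHHH---")` gives `(0.75, 0.75)`: a prediction shifted by one residue finds the helix but loses accuracy at both ends. The segment-level figure works out to `round(segment_recall, 4) == 0.9947`.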

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Southern Denmark Research Output

The capsule polysaccharide structure and biogenesis for non-O1 Vibrio cholerae NRT36S: genes are embedded in the LPS region

BACKGROUND: In V. cholerae, the biogenesis of capsule polysaccharide is poorly understood. The elucidation of capsule structure and biogenesis is critical to understanding the evolution of surface polysaccharide and the internal relationship between the capsule and LPS in this species. V. cholerae serogroup O31 NRT36S, a human pathogen that produces a heat-stable enterotoxin (NAG-ST), is encapsulated. Here, we report the covalent structure and studies of the biogenesis of the capsule in V. cholerae NRT36S. RESULTS: The structure of the capsular polysaccharide (CPS) was determined by high-resolution NMR spectroscopy and shown to be a complex structure with four residues in the repeating subunit. The gene cluster of capsule biogenesis was identified by transposon mutagenesis combined with whole genome sequencing data (GenBank accession DQ915177). The capsule gene cluster shared the same genetic locus as that of the O-antigen of lipopolysaccharide (LPS) biogenesis gene cluster. Other than V. cholerae O139, this is the first V. cholerae CPS for which a structure has been fully elucidated and the genetic locus responsible for biosynthesis identified. CONCLUSION: The co-location of CPS and LPS biosynthesis genes was unexpected, and would provide a mechanism for simultaneous emergence of new O and K antigens in a single strain. This, in turn, may be a key element for V. cholerae to evolve new strains that can escape immunologic detection by host populations.

Crossref

University of Maryland, Baltimore County

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Maryland, Baltimore County: UMBC Digital Collections

Mycobacterium tuberculosis DosR Regulon Gene Rv0079 Encodes a Putative, ‘Dormancy Associated Translation Inhibitor (DATIN)’

Author: A Bernsel
A Roy
AD Harries
Ashutosh Kumar
Astrid Lewin
CK Stover
D Schneidman-Duhovny
DM Roberts
E Mashiach
EM Leyten
Insaf A. Qureshi
Javed N. Agrewala
JC Betts
JL Flynn
L Dumon-Seignovert
LG Wayne
M Arai
M Cserzo
MA Flores Valdez
MI Voskuil
MM Bradford
Mohammad Majid
Niyaz Ahmed
NY Yu
Pittu Sandhya Rani
RA Laskowski
Ralph Kunisch
S Chauhan
S Mishra
S Sharbati-Tehrani
Seyed E. Hasnain
TR Frieden
WM El-Sharoud
Publication venue: Public Library of Science
Publication date: 01/01/2012

Mycobacterium tuberculosis is a major human pathogen that has evolved survival mechanisms to persist in an immune-competent host under a dormant condition. The regulation of M. tuberculosis metabolism during latent infection is not clearly known. The dormancy survival regulon (DosR regulon) is chiefly responsible for encoding dormancy-related functions of M. tuberculosis. We describe the functional characterization of an important gene of the DosR regulon, Rv0079, which appears to be involved in the regulation of translation through the interaction of its product with bacterial ribosomal subunits. The protein encoded by Rv0079 possibly has an inhibitory role with respect to protein synthesis, as revealed by our experiments. We performed computational modelling and docking simulation studies involving the protein encoded by Rv0079, followed by in vitro translation and growth curve analysis experiments involving recombinant E. coli and Bacille Calmette Guérin (BCG) strains that overexpressed Rv0079. Our observations concerning the interaction of the protein with the ribosomes are supportive of its role in regulation/inhibition of translation. We propose that the protein encoded by locus Rv0079 is a ‘dormancy associated translation inhibitor’ or DATIN.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Publikationsserver des Robert Koch-Instituts

Transcript Expression Analysis of Putative Trypanosoma brucei GPI-Anchored Surface Proteins during Development in the Tsetse and Mammalian Hosts

Author: A Acosta-Serrano
A Pierleoni
A Rojas
AC Ivens
Amy F. Savage
AP Jackson
B Eisenhaber
CS Peacock
D Steverding
DP Nolan
E Vassella
EM Cordero
ES Nakayasu
G Poisson
Gustavo C. Cerqueira
Jesus G. Valenzuela
K Julenius
K Vickerman
KM Esser
LJ Morrison
M Berriman
M Cserzo
MA Ferguson
MG Paulick
MJ Lenardo
N Blom
NA Stephens
Najib M. El Sayed
NG Kolev
NM El-Sayed
OA Weisz
P Urquhart
R Sharma
S Aksoy
S Chatterjee
S Pang
S Ruepp
S Urwyler
Sandesh Regmi
Serap Aksoy
SK Moloo
SM Lanham
T Kinoshita
TK Smith
TN Petersen
Yineng Wu
Publication venue: Public Library of Science
Publication date: 19/06/2012

Human African Trypanosomiasis is a devastating disease caused by the parasite Trypanosoma brucei. Trypanosomes live extracellularly in both the tsetse fly and the mammal. Trypanosome surface proteins can directly interact with the host environment, allowing parasites to effectively establish and maintain infections. Glycosylphosphatidylinositol (GPI) anchoring is a common posttranslational modification associated with eukaryotic surface proteins. In T. brucei, three GPI-anchored major surface proteins have been identified: variant surface glycoproteins (VSGs), procyclic acidic repetitive protein (PARP or procyclins), and brucei alanine rich proteins (BARP). The objective of this study was to select genes encoding predicted GPI-anchored proteins with unknown function(s) from the T. brucei genome and characterize the expression profile of a subset during cyclical development in the tsetse and mammalian hosts. An initial in silico screen of putative T. brucei proteins by the Big PI algorithm identified 163 predicted GPI-anchored proteins, 106 of which had no known functions. Application of a second GPI-anchor prediction algorithm (FragAnchor), together with signal peptide and transmembrane domain prediction software, resulted in the identification of 25 putative hypothetical proteins. Eighty-one gene products with hypothetical functions were analyzed for stage-regulated expression using semi-quantitative RT-PCR. The expression of most of these genes was found to be upregulated in trypanosomes infecting tsetse salivary gland and proventriculus tissues, and 38% were specifically expressed only by parasites infecting salivary gland tissues. Transcripts for all of the genes specifically expressed in salivary glands were also detected in mammalian infective metacyclic trypomastigotes, suggesting a possible role for these putative proteins in invasion and/or establishment processes in the mammalian host. These results represent the first large-scale report of the differential expression of unknown genes encoding predicted T. brucei surface proteins during the complete developmental cycle. This knowledge may form the foundation for the development of future novel transmission blocking strategies against metacyclic parasites.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

BB0172, a Borrelia burgdorferi Outer Membrane Protein That Binds Integrin α3β1

Author: Antonara S
Arnaout MA
Askari JA
Bano-Polo M
Behera AK
Bernardo A
Bjerketorp J
Brown EL
Bryksin AV
Byrne MF
Castaman G
Coburn J
Cserzo M
DiPersio CM
Dulabon L
Emsley J
Esteve-Gassent MD
Fikrig E
Fischer JR
Glasner J
Gross DM
Guo BP
Hedin LE
Hessa T
Hirokawa T
Kenedy MR
Kiefer H
Kim JH
Kreidberg JA
Loftus JC
Martinez-Gil L
Nilsson M
Nishio K
Pikas DS
Pimanda JE
Pinto AF
Plow EF
Ponting CP
Porter S
Probert WS
Raibaud S
Ruoslahti E
Saaf A
Sadler JE
Seshu J
Skare JT
Takada Y
Takagi J
Tamborero S
Vilar M
Xie L
Zheng X
Publication venue: American Society for Microbiology
Publication date: 01/01/2013

Lyme disease is a multisystemic disorder caused by Borrelia burgdorferi infection. Upon infection, some B. burgdorferi genes are upregulated, including members of the microbial surface components recognizing adhesive matrix molecule (MSCRAMM) protein family, which facilitate B. burgdorferi adherence to extracellular matrix components of the host. Comparative genome analysis has revealed a new family of B. burgdorferi proteins containing the von Willebrand factor A (vWFA) domain. In the present study, we characterized the expression and membrane association of the vWFA domain-containing protein BB0172 by using in vitro transcription/translation systems in the presence of microsomal membranes and with detergent phase separation assays. Our results showed evidence of BB0172 localization in the outer membrane, the orientation of the vWFA domain to the extracellular environment, and its function as a metal ion-dependent integrin-binding protein. This is the first report of a borrelial adhesin with a metal ion-dependent adhesion site (MIDAS) motif that is similar to those observed in eukaryotic integrins and has a similar function.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori d'Objectes Digitals per a l'Ensenyament la Recerca i la Cultura

Texas A&M Repository

PubMed Central

More Than 1,001 Problems with Protein Domain Databases: Transmembrane Regions, Signal Peptides and the Issue of Sequence Homology

Author: A Andreeva
A Bahr
A Bateman
A Bernsel
A Kihara
A Klug
A Marchler-Bauer
A Stojmirovic
AA Schaffer
AE Todd
AG Murzin
AL Cuff
AM Schnoes
AM Settles
B Eisenhaber
B Scheres
C Bru
C Sander
C Xu
CA Ouzounis
CH Wu
CP Ponting
D Devos
D Ivanov
D Wilson
DA Uwanogho
DE de Oliveira
DL Burgess
E Portugaly
EL Sonnhammer
F Eisenhaber
Frank Eisenhaber
G Schneider
GC Clark
GE Tusnady
H Ashida
H Johansson
H Mi
H Nielsen
HS Ooi
I Letunic
IL Alberts
J Abendroth
J Gough
J Kota
J Ren
J Schultz
JC McNulty
JC Pizarro
JC Wootton
JD Bendtsen
JD Selengut
JG Henikoff
JH Weiner
JH Zar
JI Shin
JK Tie
L Aravind
L Kall
L Sun
L Zhang
LF Ciufo
LJ Smith
M Cserzo
M Fukuda
M Gruber
M Hedman
M Ikeda
MH Saier Jr
MR Yen
N Hulo
N Kageyama-Yahara
O Leon
P Bork
P Tompa
PH Krebsbach
Philip E. Bourne
R Albrecht
R Durbin
R Janssen
R Watanabe
RD Finn
RF Doolittle
RR Copley
RW Hooft
S Henikoff
S Iuchi
S Ohnishi
S Veretnik
SA Weston
Sebastian Maurer-Stroh
SF Altschul
SJ Sammut
SR Eddy
SS Krishna
T Nakai
TA Holland
TK Attwood
V Anantharaman
V Brendel
VV Lunin
W Li
W Verelst
Wing-Cheong Wong
WR Gilks
Publication venue: Public Library of Science
Publication date: 01/01/2010

Large-scale genome sequencing gained general importance for life science because functional annotation of otherwise experimentally uncharacterized sequences is made possible by the theory of biomolecular sequence homology. Historically, the paradigm of similarity of protein sequences implying common structure, function and ancestry was generalized based on studies of globular domains. Having the same fold imposes strict conditions over the packing in the hydrophobic core requiring similarity of hydrophobic patterns. The implications of sequence similarity among non-globular protein segments have not been studied to the same extent; nevertheless, homology considerations are silently extended for them. This appears especially detrimental in the case of transmembrane helices (TMs) and signal peptides (SPs) where sequence similarity is necessarily a consequence of physical requirements rather than common ancestry. Thus, matching of SPs/TMs creates the illusion of matching hydrophobic cores. Therefore, inclusion of SPs/TMs into domain models can give rise to wrong annotations. More than 1001 domains among the 10,340 models of Pfam release 23 and 18 domains of SMART version 6 (out of 809) contain SP/TM regions. As expected, fragment-mode HMM searches generate promiscuous hits limited solely to the SP/TM part among clearly unrelated proteins. More worryingly, we show explicit examples that the scores of clearly false-positive hits, even in global-mode searches, can be elevated into the significance range just by matching the hydrophobic runs. In the PIR iProClass database v3.74 using conservative criteria, we find that at least between 2.1% and 13.6% of its annotated Pfam hits appear unjustified for a set of validated domain models. Thus, false-positive domain hits enforced by SP/TM regions can lead to dramatic annotation errors where the hit has nothing in common with the problematic domain model except the SP/TM region itself. We suggest a workflow of flagging problematic hits arising from SP/TM-containing models for critical reconsideration by annotation users.
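
The abstract above proposes flagging hits that rest mainly on SP/TM regions, without giving the procedure. The sketch below is one hypothetical way to express such a check, not the authors' workflow: the function names, the coordinate convention (closed, 1-based model positions, non-overlapping SP/TM regions) and the `min_outside` cutoff of 30 positions are all assumptions made for illustration.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Number of positions shared by two closed intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def flag_hit(hit_start, hit_end, sp_tm_regions, min_outside=30):
    """Return True when a domain hit should be reviewed manually.

    A hit is flagged when fewer than `min_outside` of its aligned
    model positions fall outside the model's SP/TM regions. The
    regions are assumed not to overlap one another, and the cutoff
    is an arbitrary illustrative value.
    """
    length = hit_end - hit_start + 1
    inside = sum(
        overlap(hit_start, hit_end, start, end)
        for start, end in sp_tm_regions
    )
    return length - inside < min_outside
```

With an SP/TM region at model positions 3 to 24, `flag_hit(1, 25, [(3, 24)])` is `True` because only three aligned positions lie outside the hydrophobic stretch, while `flag_hit(1, 200, [(3, 24)])` is `False`.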

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Topological Analysis of Small Leucine-Rich Repeat Proteoglycan Nyctalopin

Author: A Krogh
A Nomura
A Pierleoni
AJ Denzer
B Eisenhaber
B Kobe
C Appenzeller-Herzog
C Garnier
C Koike
C Zeitz
CM Pusch
CW Morgans
E Brandan
E O’Connor
GE Tusnady
I Hack
I Saraogi
I Stagljar
J Bella
J Dancourt
J Liu
JC Lee
JH Brandstatter
JM Wahlberg
JM Windisch
JN Pearring
JX Zhu
K Hofmann
L Schaefer
LA Meadows
LH Jiang
M Cserzo
M Felkl
M Sakaguchi
M Yamashita
MA De Matteis
MG Claros
MM Le Goff
N Johnsson
N Perrimon
N Takahashi
N Vardi
Nick Gay
NT Bech-Hansen
Pasano Bojang
PF Egea
PG Scott
RA Shiells
RE Dalbey
RG Gregg
Ronald G. Gregg
RV Iozzo
S Heesen
S Nawy
SH DeVries
T Hirokawa
W Cheng
Y Cao
Y Shen
Publication venue: Public Library of Science
Publication date: 02/04/2012

Nyctalopin is a small leucine-rich repeat proteoglycan (SLRP) whose function is critical for normal vision. The absence of nyctalopin results in the complete form of congenital stationary night blindness. Normally, glutamate released by photoreceptors binds to the metabotropic glutamate receptor type 6 (GRM6), which through a G-protein cascade closes the non-specific cation channel, TRPM1, on the dendritic tips of depolarizing bipolar cells (DBCs) in the retina. Nyctalopin has been shown to interact with TRPM1, and expression of TRPM1 on the dendritic tips of the DBCs is dependent on nyctalopin expression. In the current study, we used yeast two-hybrid and biochemical approaches to investigate whether murine nyctalopin was membrane-bound, and if so by what mechanism, and also whether the functional form was a homodimer. Our results show that murine nyctalopin is anchored to the plasma membrane by a single transmembrane domain, such that the LRR domain is located in the extracellular space.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Membrane Topology and Predicted RNA-Binding Function of the ‘Early Responsive to Dehydration (ERD4)’ Plant Protein

Functional annotation of uncharacterized genes is the main focus of computational methods in the post-genomic era. These tools search for similarity between proteins on the premise that those sharing sequence or structural motifs usually perform related functions, and are thus particularly useful for membrane proteins. Early responsive to dehydration (ERD) genes are rapidly induced in response to dehydration stress in a variety of plant species. In the present work we characterized the function of the Brassica juncea ERD4 gene using computational approaches. The ERD4 protein, whose function is unknown, possesses a ubiquitous DUF221 domain (residues 312–634) and is conserved in all plant species. We suggest that the protein is localized in the chloroplast membrane with at least nine transmembrane helices. We detected a globular domain of 165 amino acid residues (183–347) in plant ERD4 proteins and expect this to be located inside the chloroplast. The structural-functional annotation of the globular domain was arrived at using fold recognition methods, which suggested the presence in its sequence of two tandem RNA-recognition motif (RRM) domains, each folded into a βαββαβ topology. The structure-based sequence alignment with known RNA-binding proteins revealed conservation of two non-canonical ribonucleoprotein sub-motifs in both of the putative RNA-recognition domains of the ERD4 protein. The function of the highly conserved ERD4 protein may thus be associated with its RNA-binding ability during the stress response. This is the first functional annotation of the ERD4 family of proteins, and it can be useful in designing experiments to unravel crucial aspects of the stress tolerance mechanism.

CiteSeerX

Crossref

Directory of Open Access Journals

PubMed Central

FigShare