
24 research outputs found

Representative Proteomes: A Stable, Scalable and Unbiased Proteome Set for Sequence Analysis and Functional Annotation

Author: AN Nikolskaya
BE Suzek
BJ Tindall
Cathy H. Wu
Chuming Chen
Darren A. Natale
EW Sayers
H Huang
Hongzhan Huang
JF Imhoff
Jian Zhang
Jörg D. Hoheisel
P Escobar-Paramo
P Flicek
R Leinonen
R Mazumder
Raja Mazumder
Robert D. Finn
S Hunter
SJ Sammut
T Gabaldon
Publication venue: Public Library of Science
Publication date: 01/04/2011

The accelerating growth in the number of protein sequences taxes both the computational and manual resources needed to analyze them. One approach to dealing with this problem is to minimize the number of proteins subjected to such analysis in a way that minimizes loss of information. To this end we have developed a set of Representative Proteomes (RPs), each selected from a Representative Proteome Group (RPG) containing similar proteomes calculated based on co-membership in UniRef50 clusters. A Representative Proteome is the proteome that can best represent all the proteomes in its group in terms of the majority of the sequence space and information. RPs at 75%, 55%, 35% and 15% co-membership threshold (CMT) are provided to allow users to decrease or increase the granularity of the sequence space based on their requirements. We find that a CMT of 55% (RP55) most closely follows standard taxonomic classifications. Further analysis of this set reveals that sequence space is reduced by more than 80% relative to UniProtKB, while retaining both sequence diversity (over 95% of InterPro domains) and annotation information (93% of experimentally characterized proteins). All sets can be browsed and are available for sequence similarity searches and download at http://www.proteininformationresource.org/rps, while the set of 637 RPs determined using a 55% CMT is also available for text searches. Potential applications include sequence similarity searches, protein classification, and targeted protein annotation and characterization.

Crossref

Directory of Open Access Journals

PubMed Central

Blueprint for a minimal photoautotrophic cell: conserved and variable genes in Synechococcus elongatus PCC 7942

Author: A Danchin
A Dufresne
A Moya
A Tomitani
AN Nikolskaya
Andrés Moya
AY Mulkidjanian
C Sugita
Carmen M González-Domenech
CH Kuo
CK Holtman
E Szathmáry
EC Nowack
EV Koonin
Fernando de la Cruz
G Dong
G Pósfai
GC Kettler
GM Pao
H Tettelin
HA Schmidt
J Castresana
J Felsenstein
JE Stajich
JL Pellequer
Juli Peretó
K Tamura
KW von Nägeli
L Aravind
LB Koski
Luis Delaye
M Breitbart
M Podar
MA Ragan
María P Garcillán-Barcia
MG Langille
ML Coleman
MS Poptsova
NJ Robinson
O Zhaxybayeva
OA Koksharova
PD Karp
PJ Robinson
R Ghai
R Simm
RC Edgar
RD Finn
RDM Page
S Karlin
S Kurtz
S Waack
SF Altschul
SJ Giovannoni
T Shi
V Daubin
V Kolisnychenko
W Hsiao
WD Swingley
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2011

Background: Simpler biological systems should be easier to understand and to engineer towards pre-defined goals. One way to achieve biological simplicity is through genome minimization. Here we looked for genomic islands in the freshwater cyanobacterium Synechococcus elongatus PCC 7942 (genome size 2.7 Mb) that could be used as targets for deletion. We also looked for conserved genes that might be essential for cell survival. Results: By using a combination of methods we identified 170 xenologs, 136 ORFans and 1401 core genes in the genome of S. elongatus PCC 7942. These represent 6.5%, 5.2% and 53.6% of the annotated genes respectively. We considered that genes in genomic islands could be found if they showed a combination of: a) unusual G+C content; b) unusual phylogenetic similarity; and/or c) a small number of the highly iterated palindrome 1 (HIP1) motif plus an unusual codon usage. The origin of the largest genomic island by horizontal gene transfer (HGT) could be corroborated by lack of coverage among metagenomic sequences from a freshwater microbialite. Evidence is also presented that xenologous genes tend to cluster in operons. Interestingly, most genes coding for proteins with a diguanylate cyclase domain are predicted to be xenologs, suggesting a role for horizontal gene transfer in the evolution of Synechococcus sensory systems. Conclusions: Our estimates of genomic islands in PCC 7942 are larger than those predicted by other published methods like SIGI-HMM. Our results set a guide to non-essential genes in S. elongatus PCC 7942, indicating a path towards the engineering of a model photoautotrophic bacterial cell.
Financial support was provided by grants BFU2009-12895-C02-01/BMC (Ministerio de Ciencia e Innovación, Spain), the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement number 212894 and Prometeo/2009/092 (Conselleria d’Educació, Generalitat Valenciana, Spain) to A. Moya. Work in the FdlC laboratory was supported by grants BFU2008-00995/BMC (Spanish Ministry of Education), RD06/0008/1012 (RETICS research network, Instituto de Salud Carlos III, Spanish Ministry of Health) and LSHM-CT-2005_019023 (European VI Framework Program). Dr. González-Domenech was supported by a grant from the University of Granada. LD acknowledges financial support from the Facultad de Ciencias, Universidad Nacional Autónoma de México.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Springer - Publisher Connector

Directory of Open Access Journals

Repositorio Institucional Universidad de Granada

PubMed Central

The Structural Biology Knowledgebase: a portal to protein structures, sequences, functions, and methods

Author: A Andreeva
A Bairoch
A Chatr-aryamontri
A Gattiker
A Hamosh
A Kouranov
A Pitarch
AG Murzin
AL Cuff
AM Waterhouse
AN Nikolskaya
Andrei Kouranov
B Rost
BR Packer
C Bru
C Hoogland
C Stark
C Vijayendran
C Yeats
CA Orengo
CF Schaefer
CF Thorn
CH Wu
D Pal
DA Benson
David I. Micallef
DH Haft
DL Wheeler
DS Wishart
E Chautard
E Hodis
EL Ulrich
G Evans
G Joshi-Tope
G Perriere
GA Thorisson
H Parkinson
Helen M. Berman
HM Berman
HW Mewes
I Letunic
I Mihalek
J Barthelmes
J Goll
J Schultz
J Sprague
JD Thompson
JE Celis
JJ Ward
John D. Westbrook
JT Eppig
Judith Flippen-Andersen
Juergen Haas
K Arnold
KD Pruitt
KE Rudd
Konstantin Arnold
KR Brown
L Chen
L Salwinski
L Slabinski
Lester G. Carter
Lida Gifford
Lorenza Bordoli
M Kanehisa
M Uhlen
MA Crosby
Margaret J. Gabanyi
Matthew Zimmerman
N Hulo
N Imin
P Flicek
P Shannon
Paul D. Adams
PD Karp
R Apweiler
R Karchin
R Wang
RA Laskowski
RA VanBogelen
Raship Shah
RC Edgar
RD Finn
RL Chisholm
S Kerrien
SL Liem
SN Twigger
ST Sherry
SY Rhee
T Bieri
T Hubbard
T Liu
The Gene Ontology Consortium
The UniProt Consortium
TK Attwood
Torsten Schwede
U Guldener
U Pieper
V Praz
William A. McLaughlin
Wladek Minor
WN Price 2nd
X Chen
XP Li
Y Ye
Yi-Ping Tao
Publication venue: Springer Netherlands
Publication date: 01/01/2011

The Protein Structure Initiative’s Structural Biology Knowledgebase (SBKB, URL: http://sbkb.org) is an open web resource designed to turn the products of the structural genomics and structural biology efforts into knowledge that can be used by the biological community to understand living systems and disease. Here we will present examples of how to use the SBKB to enable biological research. For example, a protein sequence or Protein Data Bank (PDB) structure ID search will provide a list of related protein structures in the PDB, associated biological descriptions (annotations), homology models, structural genomics protein target status, experimental protocols, and the ability to order available DNA clones from the PSI:Biology-Materials Repository. A text search will find publication and technology reports resulting from the PSI’s high-throughput research efforts. Web tools that aid in research, including a system that accepts protein structure requests from the community, will also be described. Created in collaboration with the Nature Publishing Group, the Structural Biology Knowledgebase monthly update also provides a research library, editorials about new research advances, news, and an events calendar to present a broader view of structural genomics and structural biology.

Crossref

Springer - Publisher Connector

edoc

PubMed Central

MrkH, a Novel c-di-GMP-Dependent Transcriptional Activator, Controls Klebsiella pneumoniae Biofilm Formation by Regulating Type 3 Fimbriae Expression

Author: A Abdelnour
A Heydorn
A Marchler-Bauer
A Ueda
Abigail Clements
AC Chang
Adam W. Jenney
AF Chalker
AH Sohn
AM Tarkkanen
Ambrose Cheung
AN Nikolskaya
AW Jenney
BL Allen
BS McCrary
Catherine E. James
CD Herring
CL Ong
Cynthia B. Whitchurch
D Amikam
D Mathai
DA Ryjenkov
DB Hornick
DC Old
DJ Sidote
DL Gally
F Bolivar
F Rao
F Tao
FW Studier
G Kovacikova
G Waksman
H Mikkelsen
H Remaut
H Robinson
H Tagami
H Weinhouse
Hanwei Cao
J Benach
J Jagnow
J Ko
J Langstraat
J Marschall
J Yang
Jacinta L. Gabbe
JD Boddicker
JG Johnson
JG Malone
JH Merritt
JH Miller
Ji Yang
JL Carpenter
JL Leduc
JM Langley
Jonathan J. Wilksch
JP Duguid
JW Hickman
K Bryson
K Volz
KA Datsenko
Kirsty R. Short
KS Murakami
L Claret
L Hall-Stoodley
M Hammar
M Larkin
M Merighi
M Schembri
M van der Woude
Mark A. Schembri
Mary L. C. Chuah
MB Miller
MJ Casadaban
MS McClain
MY Galperin
Odilia L. Wijburg
P Di Martino
P Gouet
P Ross
PA Cotter
PP Cherepanov
PV Krasteva
R Hengge
R Mayer
R Simon
R Tamayo
RH Ebright
Richard A. Strugnell
RM Donlan
Rosalia Cavaliere
RP Novick
SE Lizewski
SG Grant
SP Nuccio
T Schirmer
TA Ramelot
TA Sebghati
TI Doran
Trevor Lithgow
U Gerstel
U Jenal
U Romling
U Römling
V de Lorenzo
VL Yu
VT Lee
W Sligl
WV Shaw
Y Qi
YM Kwon
Zhao-Xun Liang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Klebsiella pneumoniae causes significant morbidity and mortality worldwide, particularly amongst hospitalized individuals. The principal mechanism for pathogenesis in hospital environments involves the formation of biofilms, primarily on implanted medical devices. In this study, we constructed a transposon mutant library in a clinical isolate, K. pneumoniae AJ218, to identify the genes and pathways implicated in biofilm formation. Three mutants severely defective in biofilm formation contained insertions within the mrkABCDF genes encoding the main structural subunit and assembly machinery for type 3 fimbriae. Two other mutants carried insertions within the yfiN and mrkJ genes, which encode GGDEF domain- and EAL domain-containing c-di-GMP turnover enzymes, respectively. The remaining two isolates contained insertions that inactivated the mrkH and mrkI genes, which encode novel proteins with a c-di-GMP-binding PilZ domain and a LuxR-type transcriptional regulator, respectively. Biochemical and functional assays indicated that the effects of these factors on biofilm formation accompany concomitant changes in type 3 fimbriae expression. We mapped the transcriptional start site of mrkA, demonstrated that MrkH directly activates transcription of the mrkA promoter and showed that MrkH binds strongly to the mrkA regulatory region only in the presence of c-di-GMP. Furthermore, a point mutation in the putative c-di-GMP-binding domain of MrkH completely abolished its function as a transcriptional activator. In vivo analysis of the yfiN and mrkJ genes strongly indicated their c-di-GMP-specific function as diguanylate cyclase and phosphodiesterase, respectively. In addition, in vitro assays showed that purified MrkJ protein has strong c-di-GMP phosphodiesterase activity.
These results demonstrate for the first time that c-di-GMP can function as an effector to stimulate the activity of a transcriptional activator, and explain how type 3 fimbriae expression is coordinated with other gene expression programs in K. pneumoniae to promote biofilm formation on implanted medical devices.

Public Library of Science (PLOS)

Crossref

OPUS - University of Technology Sydney

Directory of Open Access Journals

PubMed Central

Spiral - Imperial College Digital Repository

University of Melbourne Institutional Repository

University of Queensland eSpace

Characterization of the yehUT Two-Component Regulatory System of Salmonella enterica Serovar Typhi and Typhimurium

Author: A Krogh
AK Dubey
AN Nikolskaya
Andrew J. Page
AR Richardson
Axel Cloeckaert
BL Wanner
BR Bochner
C Kröger
CA MacLennan
Calman A. MacLennan
CE Noriega
Chris J. McGee
CO Tacket
D House
D Pickard
Derek J. Pickard
EJ Richardson
ER Kashket
GC Langridge
Gordon Dougan
H Hirakawa
H Zhang
J McFarland
J Parkhill
JE Schultz
K Rutherford
K Yamamoto
KA Datsenko
Karthikeyan Sivaraman
Kathryn E. Holt
KC Kao
KE Holt
L Zhou
Lars Barquist
Leanne Kane
Linda J. Kenney
Louise Ellison
Lynda F. Mottram
M Abd El Ghany
M Barthel
M Dizdaroglu
M Goujon
M Rathman
MA Larkin
Mark J. Arends
OR Homann
P Maeba
P Roumagnac
Peter J. Hart
PH Blum
PM Wolanin
PP Cherepanov
PP Khil
PW Jones
R Edgar
R Gao
R Rad
RA Irizarry
RA Kingsley
RD Finn
RF Wang
Robert A. Kingsley
Ruben Bautista
S Hunter
S Mirold
S Uzzau
SA Linehan
Sally J. Kay
SE Winter
SK Hoiseth
T Barrett
T Kraxenberger
TD Schmittgen
Thomas M. Wileman
TT Perkins
Vanessa K. Wong
W Chan
W Deng
W Hsing
YS Ho
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2013

DOI: 10.1371/journal.pone.0084567. PLoS ONE 8(12).

Public Library of Science (PLOS)

Crossref

LSHTM Research Online

Directory of Open Access Journals

PubMed Central

Edinburgh Research Explorer

The Novartis Repository

University of Melbourne Institutional Repository

ScholarBank@NUS

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

Investigating perturbed pathway modules from gene expression data via structural equation models

Author: A Franceschini
AH Iglesias
AJ Hartemink
AL Barabási
AL Sheldon
AL Tarca
AN Nikolskaya
AS Chen-Plotkin
B Shipley
B Tackenberg
BS Meldrum
C Brito
C Terao
D Edwards
Daniele Pepe
DC Rao
E Kim
E Rigdon
EY Rosen
F Gilli
F Mori
F Nimmerjahn
G Csardi
G Di Paolo
GJ Rosa
GP Owens
HC von Büdingen
HT Kiiveri
I Ferrer
J Kim
J Pearl
J van de Leemput
J Xie
JB Grace
KA Bollen
L Lamorte
LT Hu
M Kalisch
M Kanehisa
M Mackay
M Xiong
MA Province
Mario Grassi
MC Wu
ME Adriaens
MR Hynd
MW Browne
P Khatri
P Martini
R Development Core Team: The R Stats package
R Gill
R Li
R Tibshirani
S Aburatani
SS Wright
VG Tusher
W Zhu
X Mi
X Wu
Y Benjamini
Y Gong
Y Rosseel
Publication venue: Springer Science and Business Media LLC
Publication date

Crossref

From sequence to enzyme mechanism using multi-label machine learning

Author: A Roussel
AN Nikolskaya
B Kerfelec
C Bru
CJA Sigrist
CS Leslie
CT Porter
CZ Cai
D Aha
DH Haft
E Chea
E Spyromitros
FC Bernstein
FS Mickel
G Tsoumakas
GH John
GL Holliday
H Mi
I Letunic
I Pedruzzi
IH Witten
J Fuernkranz
J Gough
J Mistry
J Platt
JA Barker
JG Lees
John BO Mitchell
K Choi
L Breiman
L De Ferrari
Luna De Ferrari
M Sokolova
ML Zhang
N Furnham
N Mulder
N Nagano
O Dekel
P Artimo
P Jaccard
P Rice
PW Rose
R Quinlan
RA Laskowski
RC Holte
RD Finn
S Brown
S Hunter
SB Needleman
SS Keerthi
T Hastie
T Traube
TK Attwood
U Consortium
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Background: In this work we predict enzyme function at the level of chemical mechanism, providing a finer granularity of annotation than traditional Enzyme Commission (EC) classes. Hence we can predict not only whether a putative enzyme in a newly sequenced organism has the potential to perform a certain reaction, but how the reaction is performed, using which cofactors and with susceptibility to which drugs or inhibitors, details with important consequences for drug and enzyme design. Work that predicts enzyme catalytic activity based on 3D protein structure features limits the prediction of mechanism to proteins already having either a solved structure or a close relative suitable for homology modelling. Results: In this study, we evaluate whether sequence identity, InterPro or Catalytic Site Atlas sequence signatures provide enough information for bulk prediction of enzyme mechanism. By splitting MACiE (Mechanism, Annotation and Classification in Enzymes database) mechanism labels to a finer granularity, which includes the role of the protein chain in the overall enzyme complex, the method can predict at 96% accuracy (and 96% micro-averaged precision, 99.9% macro-averaged recall) the MACiE mechanism definitions of 248 proteins available in the MACiE, EzCatDb (Database of Enzyme Catalytic Mechanisms) and SFLD (Structure Function Linkage Database) databases using an off-the-shelf K-Nearest Neighbours multi-label algorithm. Conclusion: We find that InterPro signatures are critical for accurate prediction of enzyme mechanism. We also find that incorporating Catalytic Site Atlas attributes does not seem to provide additional accuracy. The software code (ml2db), data and results are available online at http://sourceforge.net/projects/ml2db/ and as supplementary files.

Crossref

Springer - Publisher Connector

PubMed Central

Edinburgh Research Explorer

University of St. Andrews - Pure

St Andrews Research Repository