Retrieving sequences of enzymes experimentally characterized but erroneously annotated : the case of the putrescine carbamoyltransferase

Author: A Bairoch
A Sekowska
B Barcelona-Andres
B Labedan
B Wargnies
C Tricot
C Vander Wauven
GH Gonnet
I Paulsen
I Schomburg
J Felsenstein
JA Gerlt
JP Simon
L Grivell
M Kanehisa
M Zuniga
PC Babbitt
PD Karp
R Apweiler
R Cunin
RJ Roon
S Dashuang
SE Brenner
T Janowitz
TA Hall
The Gene Ontology Consortium
V Stalon
Y Nakada
Publication venue: BioMed Central
Publication date: 01/01/2004

BACKGROUND: Annotating genomes remains a hazardous task. Mistakes or gaps in such a complex process may occur when relevant knowledge is ignored, whether lost, forgotten or overlooked. This paper exemplifies an approach which could help to resuscitate such meaningful data. RESULTS: We show that a set of closely related sequences which have been annotated as ornithine carbamoyltransferases are actually putrescine carbamoyltransferases. This demonstration is based on the following points: (i) use of enzymatic data which had been overlooked, (ii) rediscovery of a short NH2-terminal sequence that allows a wrongly annotated ornithine carbamoyltransferase to be reannotated as a putrescine carbamoyltransferase, (iii) identification of conserved motifs that distinguish unambiguously between the two kinds of carbamoyltransferases, and (iv) comparative study of the gene context of these different sequences. CONCLUSIONS: We explain why this specific case of misannotation had not yet been described and draw attention to the fact that analogous instances must be rather frequent. We urge special caution when high sequence similarity is coupled with an apparent lack of biochemical information. Moreover, from the point of view of genome annotation, proteins which have been studied experimentally but are not correlated with sequence data in current databases qualify as "orphans", just as unassigned genomic open reading frames do. The strategy we used in this paper to bridge such gaps in knowledge could work whenever it is possible to collect a body of facts about experimental data, homology, unnoticed sequence data, and accurate information about gene context.

GenoQuery: a new querying module for functional annotation in a genomic warehouse

Author: Altschul
B. Labedan
Bairoch
Birkland
Bryson
C. Froidevaux
Cohen-Boulakia
Durinck
Etzold
F. Lemoine
Gasteiger
Gonnet
Kanehisa
Karp
Kasprzyk
Le Bouder-Langevin
Lee
Lemoine
Lespinet
Pennisi
Sterk
Stevens
Trissl
Publication venue: Oxford University Press
Publication date: 01/07/2008

Motivation: We have to cope with both a deluge of new genome sequences and a huge amount of data produced by the high-throughput approaches used to exploit these genomic features. Crossing and comparing such heterogeneous and disparate data will help improve the functional annotation of genomes. This requires designing elaborate integration systems, such as warehouses, for storing and querying these data.

Origination of the Split Structure of Spliceosomal Genes from Random Genetic Sequences

Author: A Bhasi
AJ McCullough
AR Robart
Ashwini Bhasi
B Labedan
B Lewin
CCF Blake
Chandan Kumar Singh
D Bhattacharya
Dawn Field
F Rodríguez-Trelles
G Caetano-Anollés
H Subak-Sharpe
IB Rogozin
JD Palmer
JM Logsdon Jr
JS Mattick
L Collins
L Fedorova
M Long
M Lynch
M Nei
N Glansdorff
P Forterre
P Senapathy
P Sirand-Pugnet
Periannan Senapathy
Rahul Regulapati
SJ de Souza
SW Roy
T Cavalier-Smith
W Gilbert
WF Doolittle
WG Qiu
Y Lu
Publication venue: Public Library of Science
Publication date: 01/01/2008

The mechanism by which protein-coding portions of eukaryotic genes came to be separated by long non-coding stretches of DNA, and the purpose of this perplexing arrangement, have remained unresolved fundamental biological problems for three decades. We report here a plausible solution to this problem based on analysis of open reading frame (ORF) length constraints in the genomes of nine diverse species. If primordial nucleic acid sequences were random in sequence, functional proteins that are innately long would not be encoded due to the frequent occurrence of stop codons. The best possible way that a long protein-coding sequence could have been derived was by evolving a split structure from the random DNA (or RNA) sequence. Results of the systematic analyses of nine complete genome sequences presented here suggest that the major underlying structural features of split genes may have evolved due to the indigenous occurrence of split protein-coding genes in primordial random nucleotide sequence. The results also suggest that intron-rich genes containing short exons may have been the original form of genes intrinsically occurring in random DNA, and that intron-poor genes containing long exons were perhaps derived from the original intron-rich genes.
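
The ORF length constraint mentioned in this abstract can be illustrated with a short calculation. The following is only a minimal sketch, not the authors' analysis: it assumes independent, equally frequent nucleotides and the standard genetic code (3 stop codons out of 64), and the function names are hypothetical. Under those assumptions the number of sense codons before the first stop is geometrically distributed, so long uninterrupted reading frames become rare very quickly.

```python
import random

# Assumption: standard genetic code, written in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}
P_STOP = 3 / 64  # chance that a uniformly random codon is a stop


def prob_open(n_codons: int) -> float:
    """Probability that n consecutive random codons contain no stop codon."""
    return (1 - P_STOP) ** n_codons


def mean_orf_length() -> float:
    """Expected number of sense codons before the first stop (geometric law)."""
    return (1 - P_STOP) / P_STOP


def simulate_orf_lengths(n_orfs: int, seed: int = 0) -> list:
    """Draw random codons until a stop appears; record the sense-codon count."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(n_orfs):
        length = 0
        while True:
            codon = "".join(rng.choice("ACGT") for _ in range(3))
            if codon in STOP_CODONS:
                break
            length += 1
        lengths.append(length)
    return lengths


if __name__ == "__main__":
    print("mean ORF length (codons):", mean_orf_length())
    print("P(no stop in 100 codons):", prob_open(100))
    sample = simulate_orf_lengths(20000)
    print("simulated mean:", sum(sample) / len(sample))
```

In this simplified model the mean open stretch is about 20 codons, and fewer than 1 in 100 windows of 100 codons is free of stops. Real genomes have biased base composition, so these figures are illustrative only.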

New Insight into the Transcarbamylase Family: The Structure of Putrescine Transcarbamylase, a Key Catalyst for Fermentative Utilization of Agmatine

Author: A Galkin
AD Keefe
Angel Cantín
AR Griswold
AT Brunger
B Clantin
B de Las Rivas
B Labedan
B Wargnies
C Legrain
C Vander Wauven
CW Tabor
D Shi
DE Low
DG Naumoff
DR Evans
E Krissinel
EA Robey
ER Kantrowitz
Fernando Gil-Ortiz
GN Murshudov
H Tigier
H Xi
J Chen
J Massant
J Painter
JD Thompson
JL Llacer
JM Landete
JP Simon
L Rulisek
Laszlo Buday
LC Kuo
Luis Mariano Polo
M Kotaka
M Marshall
MD Winn
ME Jones
MM Bradford
O Gileadi
P Emsley
P Goloubinoff
P Zhang
R Cunin
RA Laskowski
S Ramón-Maiques
UK Laemmli
V Villeret
Vicente Rubio
WN Lipscomb
Y Aoki
Y Ha
Y Liu
Y Xu
Z Otwinowski
Publication venue: Public Library of Science
Publication date: 20/02/2012

Transcarbamylases reversibly transfer a carbamyl group from carbamylphosphate (CP) to an amine. Although aspartate transcarbamylase and ornithine transcarbamylase (OTC) are well characterized, little was known about putrescine transcarbamylase (PTC), the enzyme that generates CP for ATP production in the fermentative catabolism of agmatine. We demonstrate that PTC (from Enterococcus faecalis), in addition to using putrescine, can utilize L-ornithine as a poor substrate. Crystal structures at 2.5 Å and 2.0 Å resolutions of PTC bound to its respective bisubstrate analog inhibitors for putrescine and ornithine use, N-(phosphonoacetyl)-putrescine and δ-N-(phosphonoacetyl)-L-ornithine, shed light on PTC preference for putrescine. Except for a highly prominent C-terminal helix that projects away and embraces an adjacent subunit, PTC closely resembles OTCs, suggesting recent divergence of the two enzymes. Since differences between the respective 230 and SMG loops of PTC and OTC appeared to account for the differential preference of these enzymes for putrescine and ornithine, we engineered the 230-loop of PTC to make it resemble the SMG loop of OTCs, increasing the activity with ornithine and greatly decreasing the activity with putrescine. We also examined the role of the C-terminal helix, which appears to be a constant and exclusive PTC trait. The enzyme lacking this helix remained active, but the PTC trimer stability appeared decreased, since some of the enzyme eluted as monomers from a gel filtration column. In addition, truncated PTC tended to aggregate to hexamers, as shown both chromatographically and by X-ray crystallography. Therefore, the extra C-terminal helix plays a dual role: it stabilizes the PTC trimer and, by shielding helix 1 of an adjacent subunit, it prevents the supratrimeric oligomerizations of obscure significance observed with some OTCs. Guided by the structural data, we identify signature traits that permit easy and unambiguous annotation of PTC sequences.

Gene fusions and gene duplications: relevance to genomic annotation and functional analysis

Author: A Bateman
A Dautry-Varsat
A Maruya
B Labedan
C Vogel
CF Higgins
F Titgemeyer
GH Gonnet
GH Thomas
H Salgado
I Saint-Girons
IP Crawford
J Gough
JA Gerlt
JD Glasner
K Fukami-Kobayashi
LA Nahum
M El Ghachi
M Madera
M Riley
MH Serres
MY Galperin
NB Vartak
P Liang
PD Karp
PJ Piggot
R Jaggi
RM Schwartz
RR Chaudhuri
S Sundararaj
SB Needleman
SF Altschul
SY Yang
TF Smith
WR Gilks
Y Fujita
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Escherichia coli, a model organism, provides information for the annotation of other genomes. Our analysis of its genome has shown that proteins encoded by fused genes need special attention. Such composite (multimodular) proteins consist of two or more components (modules) encoding distinct functions. Multimodular proteins have been found to complicate both annotation and the generation of sequence-similar groups. Previous work overstated the number of multimodular proteins in E. coli. This work corrects the identification of modules by including sequence information from proteins in 50 sequenced microbial genomes. RESULTS: Multimodular E. coli K-12 proteins were identified from sequence similarities between their component modules and non-fused proteins in 50 genomes and from the literature. We found 109 multimodular proteins in E. coli containing either two or three modules. Most modules had standalone sequence relatives in other genomes. The separated modules together with all the single (un-fused) proteins constitute the sum of all unimodular proteins of E. coli. Pairwise sequence relationships among all E. coli unimodular proteins generated 490 sequence-similar, paralogous groups. Groups ranged in size from 92 to 2 members and had varying degrees of relatedness among their members. Some E. coli enzyme groups were compared to homologs in other bacterial genomes. CONCLUSION: The deleterious effects of multimodular proteins on annotation and on the formation of groups of paralogs are emphasized. To improve annotation results, all multimodular proteins in an organism should be detected and, when known, each function should be connected with its location in the sequence of the protein. When transferring functions by sequence similarity, alignment locations must be noted, particularly when alignments cover only part of the sequences, in order to enable transfer of the correct function. Separating multimodular proteins into module units makes it possible to generate protein groups related by both sequence and function, avoiding mixing of unrelated sequences. Organisms differ in sizes of groups of sequence-related proteins. A sample comparison of orthologs to selected E. coli paralogous groups correlates with known physiological and taxonomic relationships between the organisms.

Using Quaternary Structures to Assess the Evolutionary History of Proteins: The Case of the Aspartate Carbamoyltransferase

Author: B. Labedan
Publication venue: Oxford University Press (OUP)

Inter and intraspecies comparison of microbial proteins: learning about gene ancestry, protein function and species life style

Author: Labedan B.
Lespinet O.
Publication venue: HAL CCSD
Publication date: 01/01/2006
