21 research outputs found

GLADX: An Automated Approach to Analyze the Lineage-Specific Loss and Pseudogenization of Genes

Author: A Mitchell
A Varki
AL Hughes
B Paten
CM Zmasek
CP Ponting
D Brawand
D Derouet
D Sankoff
DM Altshuler
DM Krylov
EM Zdobnov
EV Koonin
EW Sayers
F Ronquist
GP Paganini J
IK Jordan
J Zhu
Jacques Dainat
JH Degnan
Julien Paganini
L Aravind
L Zhang
M Brudno
M Goodman
M Lynch
M Nishikimi
M Oda
M Olson
MW Hahn
NA Moran
O Eulenstein
P Chamero
P Flicek
P Gouret
Philippe Gouret
Pierre Pontarotti
RDM Page
RR Freimuth
S Kuraku
SB Needleman
Sergios-Orestis Kolokotronis
SF Altschul
W-H Li
X Wang
XW Wu
Y Go
Z Yang
ZD Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2012

A well-established ancestral gene can usually be found, in one or multiple copies, in different descendant species. Sometimes during the course of evolution, all the representatives of a well-established ancestral gene disappear in specific lineages; such gene losses may occur in the genome by deletion of a DNA fragment or by pseudogenization. The loss of an entire gene family in a given lineage may reflect an important phenomenon, and could be due either to adaptation or to a relaxation of selection that leads to neutral evolution. Therefore, lineage-specific gene loss analyses are important for improving the understanding of the evolutionary history of genes and genomes. To perform this kind of study on the increasing number of complete genome sequences available, we developed a unique new software module called GLADX in the DAGOBAH framework, based on a comparative genomic approach. The software is able to automatically detect, for all the species of a phylum, the presence/absence of a representative of a well-established ancestral gene and, by systematic steps of re-annotation, to confirm losses, detect and analyze pseudogenes, and find novel genes. The approach is based on the use of highly reliable gene phylogenies, on protein predictions, and on the analysis of genomic mutations. All the evidence associated with this evolutionary approach provides accurate information for building an overall view of the evolution of a given gene in a selected phylum. The reliability of GLADX has been successfully tested on a benchmark analysis of 14 reported cases. It is the first tool able to study lineage-specific losses and pseudogenizations fully automatically. GLADX is available at http://ioda.univ-provence.fr/IodaSite/gladx/

Public Library of Science (PLOS)

Crossref

HAL AMU

Directory of Open Access Journals

PubMed Central

More Than 1,001 Problems with Protein Domain Databases: Transmembrane Regions, Signal Peptides and the Issue of Sequence Homology

Author: A Andreeva
A Bahr
A Bateman
A Bernsel
A Kihara
A Klug
A Marchler-Bauer
A Stojmirovic
AA Schaffer
AE Todd
AG Murzin
AL Cuff
AM Schnoes
AM Settles
B Eisenhaber
B Scheres
C Bru
C Sander
C Xu
CA Ouzounis
CH Wu
CP Ponting
D Devos
D Ivanov
D Wilson
DA Uwanogho
DE de Oliveira
DL Burgess
E Portugaly
EL Sonnhammer
F Eisenhaber
Frank Eisenhaber
G Schneider
GC Clark
GE Tusnady
H Ashida
H Johansson
H Mi
H Nielsen
HS Ooi
I Letunic
IL Alberts
J Abendroth
J Gough
J Kota
J Ren
J Schultz
JC McNulty
JC Pizarro
JC Wootton
JD Bendtsen
JD Selengut
JG Henikoff
JH Weiner
JH Zar
JI Shin
JK Tie
L Aravind
L Kall
L Sun
L Zhang
LF Ciufo
LJ Smith
M Cserzo
M Fukuda
M Gruber
M Hedman
M Ikeda
MH Saier Jr
MR Yen
N Hulo
N Kageyama-Yahara
O Leon
P Bork
P Tompa
PH Krebsbach
Philip E. Bourne
R Albrecht
R Durbin
R Janssen
R Watanabe
RD Finn
RF Doolittle
RR Copley
RW Hooft
S Henikoff
S Iuchi
S Ohnishi
S Veretnik
SA Weston
Sebastian Maurer-Stroh
SF Altschul
SJ Sammut
SR Eddy
SS Krishna
T Nakai
TA Holland
TK Attwood
V Anantharaman
V Brendel
VV Lunin
W Li
W Verelst
Wing-Cheong Wong
WR Gilks
Publication venue: Public Library of Science
Publication date: 01/01/2010

Large-scale genome sequencing gained general importance for life science because functional annotation of otherwise experimentally uncharacterized sequences is made possible by the theory of biomolecular sequence homology. Historically, the paradigm of similarity of protein sequences implying common structure, function and ancestry was generalized based on studies of globular domains. Having the same fold imposes strict conditions on the packing in the hydrophobic core, requiring similarity of hydrophobic patterns. The implications of sequence similarity among non-globular protein segments have not been studied to the same extent; nevertheless, homology considerations are silently extended to them. This appears especially detrimental in the case of transmembrane helices (TMs) and signal peptides (SPs), where sequence similarity is necessarily a consequence of physical requirements rather than common ancestry. Thus, matching of SPs/TMs creates the illusion of matching hydrophobic cores. Therefore, inclusion of SPs/TMs into domain models can give rise to wrong annotations. More than 1001 domains among the 10,340 models of Pfam release 23 and 18 domains of SMART version 6 (out of 809) contain SP/TM regions. As expected, fragment-mode HMM searches generate promiscuous hits limited solely to the SP/TM part among clearly unrelated proteins. More worryingly, we show explicit examples that the scores of clearly false-positive hits, even in global-mode searches, can be elevated into the significance range just by matching the hydrophobic runs. In the PIR iProClass database v3.74 using conservative criteria, we find that at least between 2.1% and 13.6% of its annotated Pfam hits appear unjustified for a set of validated domain models. Thus, false-positive domain hits enforced by SP/TM regions can lead to dramatic annotation errors where the hit has nothing in common with the problematic domain model except the SP/TM region itself. We suggest a workflow of flagging problematic hits arising from SP/TM-containing models for critical reconsideration by annotation users.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS