237 research outputs found

Gene conversion in human rearranged immunoglobulin genes

Author: A Martin
B Elliott
C-L Parng
CB Kobrin
CR Winstead
D Lenze
D Sehgal
David I. Stott
E Selsing
EJ Steenbergen
H Arakawa
H Zan
HF Tsai
IB Rogozin
JM Darlow
John M. Darlow
K Itoh
LJ Wysocki
M Reth
MR Lucier
MS Neuberger
N D’Avirro
PC Wilson
PD Weinstein
R Dildrop
R Kleinfield
R Rosenquist
R Wasserman
RC Beale
RS Becker
SJ Foster
U Krawinkel
V David
WT McCormack
Y Choi
Z Zhang
É Gontier
Publication venue: Springer Science and Business Media LLC
Publication date: 17/05/2006

Over the past 20 years, many DNA sequences have been published suggesting that all or part of the VH segment of a rearranged immunoglobulin gene may be replaced in vivo. Two different mechanisms appear to be operating. One of these is very similar to primary V(D)J recombination, involving the RAG proteins acting upon recombination signal sequences, and this has recently been proven to occur. Other sequences, many of which show partial VH replacements with no addition of untemplated nucleotides at the VH–VH joint, have been proposed to occur by an unusual RAG-mediated recombination with the formation of hybrid (coding-to-signal) joints. These appear to occur in cells already undergoing somatic hypermutation in which, some authors are convinced, RAG genes are silenced. We recently proposed that the latter type of VH replacement might occur by homologous recombination initiated by the activity of AID (activation-induced cytidine deaminase), which is essential for somatic hypermutation and gene conversion. Gene conversion has so far been observed in other species but not in human Ig genes. In this paper, we present a new analysis of sequences published as examples of the second type of rearrangement. This shows not only that AID recognition motifs occur in recombination regions but also that some sequences show replacement of central sections by a sequence from another gene, similar to gene conversion in the immunoglobulin genes of other species. These observations support the proposal that this type of rearrangement is likely to be AID-mediated rather than RAG-mediated, and are consistent with gene conversion.

Crossref

Enlighten

The Life-Cycle of Operons

Author: Adam P Arkin
de Hoon MJ Makita Y, Nakai K, Miyano S
Eric J Alm
Ivan Matic
Morgan N Price
Rogozin IB Makarova KS, Murvai J, Czabarka E, Wolf YI, et al.
Publication venue: Public Library of Science
Publication date: 18/11/2005

Operons are a major feature of all prokaryotic genomes, but how and why operon structures vary is not well understood. To elucidate the life-cycle of operons, we compared gene order between Escherichia coli K12 and its relatives and identified the recently formed and destroyed operons in E. coli. This allowed us to determine how operons form, how they become closely spaced, and how they die. Our findings suggest that operon evolution may be driven by selection on gene expression patterns. First, both operon creation and operon destruction lead to large changes in gene expression patterns. For example, the removal of lysA and ruvA from ancestral operons that contained essential genes allowed their expression to respond to lysine levels and DNA damage, respectively. Second, some operons have undergone accelerated evolution, with multiple new genes being added during a brief period. Third, although genes within operons are usually closely spaced because of a neutral bias toward deletion and because of selection against large overlaps, genes in highly expressed operons tend to be widely spaced because of regulatory fine-tuning by intervening sequences. Although operon evolution may be adaptive, it need not be optimal: new operons often comprise functionally unrelated genes that were already in proximity before the operon formed

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

UNT Digital Library

A Detailed History of Intron-rich Eukaryotic Ancestors Inferred from a Global Survey of 100 Complete Genomes

Author: A Fedorov
AB Rose
AG Simpson
BJ Blencowe
Chris P. Ponting
CP Robert
DC Jeffares
E Mossel
ET Wang
Eugene V. Koonin
EV Koonin
F Denoeud
F Lejeune
F Rodriguez-Trelles
G Ast
G Neu-Yilik
H Keren
H Le Hir
HD Nguyen
IB Rogozin
Igor B. Rogozin
J Felsenstein
J Muller
JE Nixon
JE Stajich
JS Farris
L Carmel
L Collins
LK Fritz-Laylin
M Csuros
M Irimia
M Lynch
Miklos Csuros
MP Hoeppner
PJ Keeling
R Nielsen
S Vanacova
SM Adl
SW Roy
T Mourier
W Li
WK Hastings
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2010

Protein-coding genes in eukaryotes are interrupted by introns, but intron densities widely differ between eukaryotic lineages. Vertebrates, some invertebrates and green plants have intron-rich genes, with 6–7 introns per kilobase of coding sequence, whereas most of the other eukaryotes have intron-poor genes. We reconstructed the history of intron gain and loss using a probabilistic Markov model (Markov Chain Monte Carlo, MCMC) on 245 orthologous genes from 99 genomes representing the three of the five supergroups of eukaryotes for which multiple genome sequences are available. Intron-rich ancestors are confidently reconstructed for each major group, with 53 to 74% of the human intron density inferred with 95% confidence for the Last Eukaryotic Common Ancestor (LECA). The results of the MCMC reconstruction are compared with the reconstructions obtained using Maximum Likelihood (ML) and Dollo parsimony methods. An excellent agreement between the MCMC and ML inferences is demonstrated whereas Dollo parsimony introduces a noticeable bias in the estimations, typically yielding lower ancestral intron densities than MCMC and ML. Evolution of eukaryotic genes was dominated by intron loss, with substantial gain only at the bases of several major branches including plants and animals. The highest intron density, 120 to 130% of the human value, is inferred for the last common ancestor of animals. The reconstruction shows that the entire line of descent from LECA to mammals was intron-rich, a state conducive to the evolution of alternative splicing

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Sm/Lsm Genes Provide a Glimpse into the Early Evolution of the Spliceosome

Author: A Drummond
A Roesner
A Stoltzfus
Andrey Rzhetsky
AR Robart
AV Sverdlov
C Gille
C Kambach
Christopher Wills
CJ Wilusz
D Schümperli
I Törö
IB Rogozin
J Yong
JD Beggs
JM Archibald
JS Nielsen
K Nagai
KS Makarova
L Aravind
L Carmel
L Collins
L Lo Conte
M Hetzer
M Hochstrasser
M Seetharaman
MA Schumacher
MP Spiller
MR Lerner
MT Franze de Fernandez
N Toor
P Khusial
PA Sharp
Philip E. Bourne
Philippe Youkharibache
R Edgar
R Tarrío
Ruben E. Valas
S Guindon
S Lin-Chao
S Thore
S Valadkhan
Stella Veretnik
TR Cech
U Narayanan
V Agrawal
W Martin
W Qiu
Y Sato
Publication venue: Public Library of Science
Publication date: 01/01/2009

The spliceosome, a sophisticated molecular machine involved in the removal of intervening sequences from the coding sections of eukaryotic genes, appeared and subsequently evolved rapidly during the early stages of eukaryotic evolution. The last eukaryotic common ancestor (LECA) had both complex spliceosomal machinery and some spliceosomal introns, yet little is known about the early stages of evolution of the spliceosomal apparatus. The Sm/Lsm family of proteins has been suggested as one of the earliest components of the emerging spliceosome and hence provides a first in-depth glimpse into the evolving spliceosomal apparatus. An analysis of 335 Sm and Sm-like genes from 80 species across all three kingdoms of life reveals two significant observations. First, the eukaryotic Sm/Lsm family underwent two rapid waves of duplication with subsequent divergence resulting in 14 distinct genes. Each wave resulted in a more sophisticated spliceosome, reflecting a possible jump in the complexity of the evolving eukaryotic cell. Second, an unusually high degree of conservation in intron positions is observed within individual orthologous Sm/Lsm genes and between some of the Sm/Lsm paralogs. This suggests that functional spliceosomal introns existed before the emergence of the complete Sm/Lsm family of proteins; hence, spliceosomal machinery with considerably fewer components than today's spliceosome was already functional

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Phylogenomics: Gene Duplication, Unrecognized Paralogy and Outgroup Choice

Author: A McLysaght
C Zheng
D Bryant
D Penny
DH Huson
DJ Zwickl
DR Scannell
E Mossel
H Philippe
IB Rogozin
JP Huelsenbeck
JW Leigh
KH Wolfe
M Goodman
M Hendy
M Lynch
M Pagel
M Sémon
MA Fares
MW Hahn
Niyaz Ahmed
Scott William Roy
WM Fitch
WP Maddison
Publication venue: Public Library of Science
Publication date: 01/01/2008

Comparative genomics has revealed the ubiquity of gene and genome duplication and subsequent gene loss. In the case of gene duplication and subsequent loss, gene trees can differ from species trees, thus frequent gene duplication poses a challenge for reconstruction of species relationships. Here I address the case of multi-gene sets of putative orthologs that include some unrecognized paralogs due to ancestral gene duplication, and ask how outgroups should best be chosen to reduce the degree of non-species tree (NST) signal. Consideration of expected internal branch lengths supports several conclusions: (i) when a single outgroup is used, the degree of NST signal arising from gene duplication is either independent of outgroup choice, or is minimized by use of a maximally closely related post-duplication (MCRPD) outgroup; (ii) when two outgroups are used, NST signal is minimized by using one MCRPD outgroup, while the position of the second outgroup is of lesser importance; and (iii) when two outgroups are used, the ability to detect gene trees that are inconsistent with known aspects of the species tree is maximized by use of one MCRPD, and is either independent of the position of the second outgroup, or is maximized for a more distantly related second outgroup. Overall, these results generalize the utility of closely-related outgroups for phylogenetic analysis

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Intron Evolution: Testing Hypotheses of Intron Evolution Using the Phylogenomics of Tetraspanins

BACKGROUND: Although large scale informatics studies on introns can be useful in making broad inferences concerning patterns of intron gain and loss, more specific questions about intron evolution at a finer scale can be addressed using a gene family where structure and function are well known. Genome wide surveys of tetraspanins from a broad array of organisms with fully sequenced genomes are an excellent means to understand specifics of intron evolution. Our approach incorporated several new fully sequenced genomes that cover the major lineages of the animal kingdom as well as plants, protists and fungi. The analysis of exon/intron gene structure in such an evolutionarily broad set of genomes allowed us to identify ancestral intron structure in tetraspanins throughout the eukaryotic tree of life.
METHODOLOGY/PRINCIPAL FINDINGS: We performed a phylogenomic analysis of the intron/exon structure of the tetraspanin protein family. In addition to the already characterized tetraspanin introns numbered 1 through 6 found in animals, three additional ancient, phase 0 introns, which we call 4a, 4b and 4c, were found. These three novel introns, in combination with the ancestral introns 1 to 6, define three basic tetraspanin gene structures which have been conserved throughout the animal kingdom. Our phylogenomic approach also allows the estimation of the time at which the introns of the 33 human tetraspanin paralogs appeared, which in many cases coincides with the concomitant acquisition of new introns. On the other hand, we observed that new introns (introns other than 1-6, 4a, 4b and 4c) were not randomly inserted into the tetraspanin gene structure. The region of tetraspanin genes corresponding to the small extracellular loop (SEL) accounts for only 10.5% of the total sequence length but had 46% of the new animal intron insertions.
CONCLUSIONS/SIGNIFICANCE: Our results indicate that tests of intron evolution are strengthened by the phylogenomic approach with specific gene families like tetraspanins. These tests add to our understanding of genomic innovation coupled to major evolutionary divergence events, functional constraints and the timing of the appearance of evolutionary novelty.

Crossref

Directory of Open Access Journals

PubMed Central

Intron Dynamics in Ribosomal Protein Genes

Author: A Nakao
AG Russell
CJ Venter
D Brett
DC Jeffares
DH Nguyen
EM Zdobnov
ES Maxwell
FU Battistuzzi
H Philippe
HD Nguyen
Hung D. Nguyen
IB Rogozin
IG Wool
J Felsenstein
JD Thompson
JE Nixon
JS Mattick
KT Tycowski
M Csürös
M Yoshihama
Maki Yoshihama
N Kenmochi
Naoya Kenmochi
Oliver Hofmann
P Andolfatto
SB Hedges
SW Roy
VN Babenko
Publication venue: Public Library of Science
Publication date: 03/01/2007

The role of spliceosomal introns in eukaryotic genomes remains obscure. A large scale analysis of intron presence/absence patterns in many gene families and species is a necessary step to clarify the role of these introns. In this analysis, we used a maximum likelihood method to reconstruct the evolution of 2,961 introns in a dataset of 76 ribosomal protein genes from 22 eukaryotes and validated the results by a maximum parsimony method. Our results show that the trends of intron gain and loss differed across species in a given kingdom but appeared to be consistent within subphyla. Most subphyla in the dataset diverged around 1 billion years ago, when the “Big Bang” radiation occurred. We speculate that spliceosomal introns may play a role in the explosion of many eukaryotes at the Big Bang radiation

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

A proteogenomic update to Yersinia: enhancing genome annotation

Author: AJ Link
AM Frank
C Ansong
C Sacerdot
C Wei
CA Ouzounis
D Perlman
IB Rogozin
J Crabtree
JD Bendtsen
JD Jaffe
JE Elias
M Aivaliotis
M Baudet
M Mann
N Gupta
NE Castellana
PR Jungblut
PS Chain
R Pieper
Rembert Pieper
RR Brubaker
S Gallien
S Tanner
Samuel H Payne
Shih-Ting Huang
SL Salzberg
T Dandekar
T Gaasterland
W Deng
Publication venue: BioMed Central
Publication date: 01/08/2010

Background: Modern biomedical research depends on a complete and accurate proteome. With the widespread adoption of new sequencing technologies, genome sequences are generated at a near exponential rate, diminishing the time and effort that can be invested in genome annotation. The resulting gene set contains numerous errors in even the most basic form of annotation: the primary structure of the proteins.
Results: The application of experimental proteomics data to genome annotation, called proteogenomics, can quickly and efficiently discover misannotations, yielding a more accurate and complete genome annotation. We present a comprehensive proteogenomic analysis of the plague bacterium, Yersinia pestis KIM. We discover non-annotated genes, correct protein boundaries, remove spuriously annotated ORFs, and make major advances towards accurate identification of signal peptides. Finally, we apply our data to 21 other Yersinia genomes, correcting and enhancing their annotations.
Conclusions: In total, 141 gene models were altered and have been updated in RefSeq and Genbank, which can be accessed seamlessly through any NCBI tool (e.g. blast) or downloaded directly. Along with the improved gene models we discover new, more accurate means of identifying signal peptides in proteomics data.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Localization of a bacterial group II intron-encoded protein in human cells

Author: A Giacomini
AM Lambowitz
AM Pyle
DL Spector
E Muñoz-Adelantado
F Martínez-Abarca
F Michel
F Zhuang
FM García-Rodríguez
G Qu
H Guo
I Chillon
IB Rogozin
J San Filippo
JD Boeke
KS Keating
M Mastroianni
MD Molina-Sánchez
N Toro
R Nisa-Martínez
RZ Jurkowska
T Cavalier-Smith
VR Chalamcharla
Publication venue: Springer Science and Business Media LLC
Publication date: 05/08/2015

Group II introns are mobile retroelements that self-splice from precursor RNAs to form ribonucleoparticles (RNPs), which can invade new specific genomic DNA sites. This specificity can be reprogrammed for insertion into any desired DNA site, making these introns useful tools for bacterial genetic engineering. However, previous studies have suggested that these elements may function inefficiently in eukaryotes. We investigated the subcellular distribution, in cultured human cells, of the protein encoded by the group II intron RmInt1 (IEP) and several mutants. We created fusions with yellow fluorescent protein (YFP) and with a FLAG epitope. We found that the IEP was localized in the nucleus and nucleolus of the cells. Remarkably, it also accumulated at the periphery of the nuclear matrix. We were also able to identify spliced lariat intron RNA, which co-immunoprecipitated with the IEP, suggesting that functional RmInt1 RNPs can be assembled in cultured human cells.

This work was supported by research grants CSD 2009–0006 from the Consolider-Ingenio, BIO2011-24401 and BIO2014-51953-P from the Spanish Ministerio de Economía y Competitividad, all including ERDF (European Regional Development Funds). We thank Dr. Antonio Barrientos Durán for technical advice. MRC was supported by an FPI Ph.D. grant. J.L.G.P.'s laboratory is supported by CICE-FEDER-P09-CTS-4980, CICE-FEDER-P12-CTS-2256, Plan Nacional de I+D+I 2008–2011 and 2013–2016 (FIS-FEDER-PI11/01489 and FIS-FEDER-PI14/02152), PCIN-2014-115-ERA-NET NEURON II, the European Research Council (ERC-Consolidator ERC-STG-2012-233764) and by an International Early Career Scientist grant from the Howard Hughes Medical Institute (IECS-55007420). Peer reviewed.

Crossref

PubMed Central

Edinburgh Research Explorer

Digital.CSIC

AUG_hairpin: prediction of a downstream secondary structure influencing the recognition of a translation start site

Author: Akinori Sarai
Alex V Kochetov
Andrey Palyanov
AV Kochetov
AV Pisarev
C Touriol
Dmitry Grigorovich
HS Kwon
I Ventoso
IB Rogozin
Igor I Titov
IL Hofacker
IM Meyer
JL Riechmann
JS McCaskill
K Clyde
K Takahashi
K-N Zhao
L Yang
M Ciullo
M Kozak
M Lukaszewicz
M Nguyen
Nikolay A Kolchanov
RJ Jackson
SA Shabalina
SD Baird
SV Sawant
W-L Hwang
Y Kobayashi
Publication venue: BioMed Central
Publication date: 01/08/2007

Background: The translation start site plays an important role in the control of translation efficiency of eukaryotic mRNAs. The recognition of the start AUG codon by eukaryotic ribosomes is considered to depend on its nucleotide context. However, the fraction of eukaryotic mRNAs with the start codon in a suboptimal context is relatively large. It may be expected that mRNA should possess some features providing efficient translation, including the proper recognition of a translation start site. It has been experimentally shown that a downstream hairpin located in certain positions with respect to the start codon can compensate in part for the suboptimal AUG context and also increases translation from non-AUG initiation codons. Prediction of such a compensatory hairpin may be useful in the evaluation of eukaryotic mRNA translation properties.
Results: We evaluated interdependency between the start codon context and mRNA secondary structure at the CDS beginning: it was found that a suboptimal start codon context significantly correlated with higher base pairing probabilities at positions 13–17 of the CDS of human and mouse mRNAs. It is likely that the downstream hairpins are used to enhance translation of some mammalian mRNAs in vivo. Thus, we have developed a tool, AUG_hairpin, to predict local stem-loop structures located within the defined region at the beginning of the mRNA coding part. The implemented algorithm is based on the available published experimental data on the CDS-located stem-loop structures influencing the recognition of upstream start codons.
Conclusion: An occurrence of a potential secondary structure downstream of the start AUG codon in a suboptimal context (or downstream of a potential non-AUG start codon) may provide researchers with a testable assumption on the presence of an additional regulatory signal influencing the mRNA translation initiation rate and the start codon choice. AUG_hairpin, which has a convenient Web-interface with adjustable parameters, will make such an evaluation easy and efficient.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central