
183 research outputs found

Evolutionary Mechanisms of Long-Term Genome Diversification Associated With Niche Partitioning in Marine Picocyanobacteria.

Author: Aury J-M
Bisch A
Brillet-Guéguen L
Choi DH
Corre E
Doré H
Eveillard D
Farrant GK
Garczarek L
Guyet U
Haguait J
Hoebeke M
Humily F
Labadie K
Le Corguillé G
Noh JH
Ostrowski M
Partensky F
Pitt FD
Ratin M
Scanlan DJ
Six C
Wincker P
Publication venue: 'Frontiers Media SA'
Publication date: 18/11/2020

Marine picocyanobacteria of the genera Prochlorococcus and Synechococcus are the most abundant photosynthetic organisms on Earth, an ecological success thought to be linked to the differential partitioning of distinct ecotypes into specific ecological niches. However, the underlying processes that governed the diversification of these microorganisms and the appearance of niche-related phenotypic traits are just starting to be elucidated. Here, by comparing 81 genomes, including 34 new Synechococcus, we explored the evolutionary processes that shaped the genomic diversity of picocyanobacteria. Time-calibration of a core-protein tree showed that gene gain/loss occurred at an unexpectedly low rate between the different lineages, with for instance 5.6 genes gained per million years (My) for the major Synechococcus lineage (sub-cluster 5.1), among which only 0.71/My have been fixed in the long term. Gene content comparisons revealed a number of candidates involved in nutrient adaptation, a large proportion of which are located in genomic islands shared between either closely or more distantly related strains, as identified using an original network construction approach. Interestingly, strains representative of the different ecotypes co-occurring in phosphorus-depleted waters (Synechococcus clades III, WPC1, and sub-cluster 5.3) were shown to display different adaptation strategies to this limitation. In contrast, we found few genes potentially involved in adaptation to temperature when comparing cold and warm thermotypes. Indeed, comparison of core protein sequences highlighted variants specific to cold thermotypes, notably involved in carotenoid biosynthesis and the oxidative stress response, revealing that long-term adaptation to thermal niches relies on amino acid substitutions rather than on gene content variation. 
Altogether, this study not only deciphers the respective roles of gene gains/losses and sequence variation but also uncovers numerous gene candidates likely involved in niche partitioning of two key members of the marine phytoplankton.

OPUS - University of Technology Sydney

Law of Genome Evolution Direction : Coding Information Quantity Grows

Author: A. F. A. Smit
A. G. Matera
A. Mira
B. Charlesworth
C. L. Organ
C. Nusbaum
D. A. Petrov
D. L. Marais Des
D. R. Scannell
E. Schrodinger
E. T. Dermitzakis
F. Clark
G. Bejerano
G. Liu
G. Storz
H. H. Chou
H. H. Kazazian
H. Ozkan
H. Winter
I. J. Leitch
I. Wapinski
I. Wickelgren
International Human Genome Sequencing Consortium
J. Filkowski
J.M. Aury
K. M. Devos
L. F. Luo
L. F. Luo
L. F. Luo
L. He
L. Patthy
L. R. Zhang
Liao-fu Luo
R. J. Taft
R. P. Bininda-Edmonds
S. E. Peters
T. C. Stadtman
T. Kouzarides
T. R. Gregory
The ENCODE Project Consortium
W. Deng
W. Enard
W. H. Li
W. Makalowski
X. Xu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 25/08/2008

The problem of the directionality of genome evolution is studied. Based on an analysis of the C-value paradox and the evolution of genome size, we propose that the function-coding information quantity of a genome always grows in the course of evolution through sequence duplication, expansion of code, and gene transfer from outside. The function-coding information quantity of a genome consists of two parts: the p-coding information quantity, which encodes functional proteins, and the n-coding information quantity, which encodes functional elements other than amino acid sequences. Evidence for this evolutionary law of function-coding information quantity is listed. The need for function is the driving force behind the expansion of coding information quantity, and this expansion is the way a species achieves functional innovation and extension. The increase in the coding information quantity of a genome is therefore a measure of newly acquired function, and it determines the directionality of genome evolution. Comment: 16 pages

arXiv.org e-Print Archive

Crossref

Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing

Author: André Gilles
AR Quinlan
Christopher W. Wheat
D Hamilton
DA Hahn
Emese Meglécz
F Saeed
IW Saunders
J. C. Dohm
Jean-François Martin
JM Aury
JS Reis-Filho
KJ Hoff
KM Wegner
M Lynch
M Lynch
M Margulies
MA Larkin
Maxime Galan
Nicolas Pech
P McCullagh
PJ Campbell
SF Altschul
SM Huse
Steve Hoffmann
Stéphanie Ferreira
Susan M Huse
Sverker Lundin
Thibaut Malausa
V Kunin
W Babik
XiaoGuang Zhou
Y Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: The rapid evolution of 454 GS-FLX sequencing technology has not been accompanied by a reassessment of the quality and accuracy of the sequences obtained. Current strategies for decision-making and error-correction are based on an initial analysis by Huse et al. in 2007, for the older GS20 system based on experimental sequences. We analyze here the quality of 454 sequencing data and identify factors playing a role in sequencing error, through the use of an extensive dataset for Roche control DNA fragments. Results: We obtained a mean error rate for 454 sequences of 1.07%. More importantly, the error rate is not randomly distributed; it occasionally rose to more than 50% in certain positions, and its distribution was linked to several experimental variables. The main factors related to error are the presence of homopolymers, position in the sequence, size of the sequence and spatial localization in PT plates for insertion and deletion errors. These factors can be described by considering seven variables. No single variable can account for the error rate distribution, but most of the variation is explained by the combination of all seven variables. Conclusions: The pattern identified here calls for the use of internal controls and error-correcting base callers, to correct for errors, when available (e.g. when sequencing amplicons). For shotgun libraries, the use of both sequencing primers and deep coverage, combined with the use of random sequencing primer sites, should partly compensate for even high error rates, although it may prove more difficult than previously thought to distinguish between low-frequency alleles and errors.

Crossref

Springer - Publisher Connector

HAL AMU

Directory of Open Access Journals

Elusive Origins of the Extra Genes in Aspergillus oryzae

Author: AH Paterson
C Hall
C Simillion
DM Geiser
F Delsuc
G Ricard
GC Conant
H Nishida
J Castresana
J Felsenstein
J Kamper
JD Thompson
JE Galagan
JE Nixon
JJ Cai
JM Aury
JM Lee
JO Andersson
K Tamano
Kenneth H. Wolfe
KH Wolfe
KP Byrne
M Kellis
M Lynch
M Lynch
M Machida
MA Fares
Nora Khaldi
NU Frigaard
O Jaillon
P Dehal
PM Sharp
RB Langkjaer
S Garcia-Vallve
S Guindon
Sudhindra Gadagkar
T Hamada
T Kobayashi
WC Nierman
Y van de Peer
Y van de Peer
Z Yang
Z Yang
Publication venue: Public Library of Science
Publication date: 01/01/2008

The genome sequence of Aspergillus oryzae revealed unexpectedly that this species has approximately 20% more genes than its congeneric species A. nidulans and A. fumigatus. Where did these extra genes come from? Here, we evaluate several possible causes of the elevated gene number. Many gene families are expanded in A. oryzae relative to A. nidulans and A. fumigatus, but we find no evidence of ancient whole-genome duplication or other segmental duplications, either in A. oryzae or in the common ancestor of the genus Aspergillus. We show that the presence of divergent pairs of paralogs is a feature peculiar to A. oryzae and is not shared with A. nidulans or A. fumigatus. In phylogenetic trees that include paralog pairs from A. oryzae, we frequently find that one of the genes in a pair from A. oryzae has the expected orthologous relationship with A. nidulans, A. fumigatus and other species in the subphylum Eurotiomycetes, whereas the other A. oryzae gene falls outside this clade but still within the Ascomycota. We identified 456 such gene pairs in A. oryzae. Further phylogenetic analysis did not, however, indicate a single consistent evolutionary origin for the divergent members of these pairs. Approximately one-third of them showed phylogenies that are suggestive of horizontal gene transfer (HGT) from Sordariomycete species, and these genes are closer together in the A. oryzae genome than expected by chance, but no unique Sordariomycete donor species was identifiable. The postulated HGTs from Sordariomycetes still leave the majority of extra A. oryzae genes unaccounted for. One possible explanation for our observations is that A. oryzae might have been the recipient of many separate HGT events from diverse donors.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

Irish Universities

PubMed Central

Multichromosomal median and halving problems under different genomic distances

Author: A Bergeron
A Bergeron
A Bergeron
A Caprara
C Zheng
C Zheng
C Zheng
C Zheng
Chunfang Zheng
D Bryant
D Sankoff
David Sankoff
E Ohlebusch
E Tannier
Eric Tannier
G Bourque
G Fertin
G Jean
G Tesler
G Watterson
I Pe'er
J Aury
J Mixtacki
L Lovasz
M Alekseyev
M Bernt
M Ozery-Flato
MR Garey
N El-Mabrouk
P Berman
P Pevzner
R Lenne
R Warren
S Hannenhalli
S Hannenhalli
S Otto
S Yancopoulos
W Xu
X Chen
Y Lin
YC Lin
Z Adam
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: Genome median and genome halving are combinatorial optimization problems that aim at reconstructing ancestral genomes as well as the evolutionary events leading from the ancestor to extant species. Exploring complexity issues is a first step towards devising efficient algorithms. The complexity of the median problem for unichromosomal genomes (permutations) has been settled for both the breakpoint distance and the reversal distance. Although the multichromosomal case has often been assumed to be a simple generalization of the unichromosomal case, it is also a relaxation so that complexity in this context does not follow from existing results, and is open for all distances. Results: We settle here the complexity of several genome median and halving problems, including a surprising polynomial result for the breakpoint median and guided halving problems in genomes with circular and linear chromosomes, showing that the multichromosomal problem is actually easier than the unichromosomal problem. Still other variants of these problems are NP-complete, including the DCJ double distance problem, previously mentioned as an open question. We list the remaining open problems. Conclusion: This theoretical study clears up a wide swathe of the algorithmic study of genome rearrangements with multiple multichromosomal genomes.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

INRIA a CCSD electronic archive server

PubMed Central

Hal-Diderot

The fitness cost of mis-splicing is the main determinant of alternative splicing patterns

Author: A Reyes
AK Ramani
Alexandra Popa
Anamaria Necsulea
Baptiste Saudemont
BJ Blencowe
BR Graveley
C Trapnell
Corinne Blugeon
CR Edwards
E Dubois
E Kim
E Melamud
Eric Meyer
F Abascal
FM Hamid
G Drechsel
I Ezkurdia
J Beisson
J Merkin
J Weischenfeldt
J-M Aury
JE Smith
JJ-L Wong
JJL Wong
JK Pickrell
JM Mudge
Joanna L. Parmley
JZ Ni
L Duret
Laurent Duret
LF Lareau
LF Lareau
M Bulmer
M Graille
M Irimia
M Kalyna
M Wang
ML Tress
ML Tress
MW-L Popp
N Stepankiw
NJ McGlincy
NL Barbosa-Morais
O Garnier
O Jaillon
O Kelemen
PL Boutz
RGH Lindeboom
The 1000 Genomes Project Consortium
TW Nilsen
U Braunschweig
Vincent Rocher
W Sung
W Sung
Y Ge
Y Marquez
Z Wang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Background: Most eukaryotic genes are subject to alternative splicing (AS), which may contribute to the production of protein variants or to the regulation of gene expression via nonsense-mediated messenger RNA (mRNA) decay (NMD). However, a fraction of splice variants might correspond to spurious transcripts and the question of the relative proportion of splicing errors to functional splice variants remains highly debated. Results: We propose a test to quantify the fraction of AS events corresponding to errors. This test is based on the fact that the fitness cost of splicing errors increases with the number of introns in a gene and with expression level. We analyzed the transcriptome of the intron-rich eukaryote Paramecium tetraurelia. We show that in both normal and in NMD-deficient cells, AS rates strongly decrease with increasing expression level and with increasing number of introns. This relationship is observed for AS events that are detectable by NMD as well as for those that are not, which invalidates the hypothesis of a link with the regulation of gene expression. Our results show that in genes with a median expression level, 92–98% of observed splice variants correspond to errors. We observed the same patterns in human transcriptomes and we further show that AS rates correlate with the fitness cost of splicing errors. Conclusions: These observations indicate that genes under weaker selective pressure accumulate more maladaptive substitutions and are more prone to splicing errors. Thus, to a large extent, patterns of gene expression variants simply reflect the balance between selection, mutation, and drift.

Central Archive at the University of Reading

Crossref

ZENODO

Directory of Open Access Journals

HAL-Inserm

INRIA a CCSD electronic archive server

HAL Descartes

Analysis of the P. lividus sea urchin genome highlights contrasting trends of genomic and regulatory evolution in deuterostomes

Author: Anello L
Arnone MI
Aury JM
Barbe V
Ben Tabou de Leon S
Besnardeau L
Borra M
Cavalieri V
Chessel A
Copley RR
Cormier P
Costa C
Couloux A
Croce J
Da Silva C
Di Bernardo M
Di Carlo M
Dru P
Exposito JY
Gache C
Gavriouchkina D
Geneviève AM
Labadie K
Le Gras S
Lepage T
Lhomond G
Lowe EK
Mangenot S
Marlétaz F
Martinez P
Matranga V
Molina MD
Morales J
Nicosia A
Noel B
Oliveri P
Pascual M
Pegueroles C
Poulain J
Poustka AJ
Ragusa MA
Russo R
Turon X
Wincker P
Ye T
Zito F
Publication venue: 'Elsevier BV'
Publication date: 12/04/2023

Sea urchins are emblematic models in developmental biology and display several characteristics that set them apart from other deuterostomes. To uncover the genomic cues that may underlie these specificities, we generated a chromosome-scale genome assembly for the sea urchin Paracentrotus lividus and extensive gene expression and epigenetic profiles of its embryonic development. We found that, unlike vertebrates, sea urchins retained ancestral chromosomal linkages but underwent very fast intrachromosomal gene order mixing. We identified a burst of gene duplication in the echinoid lineage and showed that some of these expanded genes have been recruited in novel structures (water vascular system, Aristotle's lantern, and skeletogenic micromere lineage). Finally, we identified gene-regulatory modules conserved between sea urchins and chordates. Our results suggest that gene-regulatory networks controlling development can be conserved despite extensive gene order rearrangement.

UCL Discovery

The coffee genome provides insight into the convergent evolution of caffeine biosynthesis.

Made available in DSpace on 2018-07-12T01:02:16Z (GMT). No. of bitstreams: 1 Science2014Denoeud11814.pdf: 1559498 bytes, checksum: 6f5cdb42fd6c137acb3220e6d823a1f9 (MD5) Previous issue date: 2015-01-22bitstream/item/116215/1/Science-2014-Denoeud-1181-4.pd

Repository Open Access to Scientific Information from Embrapa

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Viral to metazoan marine plankton nucleotide sequences from the Tara Oceans expedition

Author: Acinas S.G.
Alberti A.
Albini G.
Amid C.
Aury J.M.
Belser C.
Bertrand A.
Bowler C.
Brum J.R.
Cochrane G.
Cornejo-Castillo F.M.
Cruaud C.
Da Silva C.
De Vargas C.
Desgranges E.
Dossat C.
Duhaime M.B.
Engelen S.
Fernandez-Gomez B.
Ferrera I.
Gas S.
Gavory F.
Grimsley N.
Guy J.
Haquelle M.
Hoopen P.T.
Hurwitz B.L.
Jacoby E.
Jaillon O.
Kandels-Lewis S.
Karsenti E.
Labadie K.
Lemainque A.
Logares R.
Ogata H.
Pelletier E.
Pesant S.
Poulain J.
Poulos B.T.
Poulton N.
Romac S.
Royo-Llonch M.
Samson G.
Sieracki M.E.
Stepanauskas R.
Sullivan M.B.
Wessner M.
Wincker P.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2017

A unique collection of oceanic samples was gathered by the Tara Oceans expeditions (2009-2013), targeting plankton organisms ranging from viruses to metazoans, and providing rich environmental context measurements. Thanks to recent advances in the field of genomics, extensive sequencing has been performed for a deep genomic analysis of this huge collection of samples. A strategy based on different approaches, such as metabarcoding, metagenomics, single-cell genomics and metatranscriptomics, has been chosen for analysis of size-fractionated plankton communities. Here, we provide detailed procedures applied for genomic data generation, from nucleic acid extraction to sequence production, and we describe registries of genomics datasets available at the European Nucleotide Archive (ENA, www.ebi.ac.uk/ena). The association of these metadata with the experimental procedures applied for their generation will help the scientific community to access these data and facilitate their analysis. This paper complements other efforts to provide a full description of experiments and open science resources generated from the Tara Oceans project, further extending their value for the study of the world's planktonic ecosystems.

MDC Repository

Distinct Gene Number-Genome Size Relationships for Eukaryotes and Non-Eukaryotes: Gene Content Estimation for Dinoflagellate Genomes

Author: AC Ivens
AG Hinnebusch
AR Loeblich III
CH Slamovits
CH Slamovits
D Lee
DC Sigee
DL Spector
DM Anderson
DW Coats
FM Van Dolah
H Moreau
H Zhang
H Zhang
H Zhang
J Archibald
J Lukes
J Ramsey
J Reichman
JD Hackett
JD Hackett
JM Aury
JR Allen
KH Wolfe
KT Konstantinidis
L Pfiester
L Xu
LY Liu
M Berriman
M Lynch
M Lynch
M McEwan
MJW Veldhuis
NJ Patron
NJ Patron
O Holm-Hansen
P Salois
PJ Rizzo
PJ Rizzo
QH Le
RE Steel
RJ Blank
Rosemary Jeanne Redfield
S Lin
S Lin
Senjie Lin
SR Santos
T Bertomeu
TC LaJeunesse
TM Roberts
TR Bachvaroff
TR Gregory
TR Gregory
TR Gregory
Y Bhaud
Y Bouligand
YH Chan
Yubo Hou
Publication venue: Public Library of Science
Publication date: 01/09/2009

The ability to predict gene content is highly desirable for characterization of not-yet sequenced genomes like those of dinoflagellates. Using data from completely sequenced and annotated genomes from phylogenetically diverse lineages, we investigated the relationship between gene content and genome size using regression analyses. Distinct relationships between log10-transformed protein-coding gene number (Y′) versus log10-transformed genome size (X′, genome size in kbp) were found for eukaryotes and non-eukaryotes. Eukaryotes best fit a logarithmic model, Y′ = ln(−46.200 + 22.678X′), whereas non-eukaryotes a linear model, Y′ = 0.045 + 0.977X′, both with high significance (p < 0.001, R² > 0.91). Total gene number shows similar trends in both groups to their respective protein coding regressions. The distinct correlations reflect lower and decreasing gene-coding percentages as genome size increases in eukaryotes (82%–1%) compared to higher and relatively stable percentages in prokaryotes and viruses (97%–47%). The eukaryotic regression models project that the smallest dinoflagellate genome (3×10⁶ kbp) contains 38,188 protein-coding (40,086 total) genes and the largest (245×10⁶ kbp) 87,688 protein-coding (92,013 total) genes, corresponding to 1.8% and 0.05% gene-coding percentages. These estimates do not likely represent extraordinarily high functional diversity of the encoded proteome but rather highly redundant genomes as evidenced by high gene copy numbers documented for various dinoflagellate species.
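
The two regression models quoted in this abstract can be evaluated with a few lines of Python. This is only an illustrative sketch, not the authors' code: the function names are hypothetical, the closing parenthesis of the eukaryote model is assumed to fall after the X′ term, and the coefficients are the rounded values given in the abstract, so the results need not reproduce the abstract's projected gene counts exactly.

```python
import math


def eukaryote_log10_genes(log10_size_kbp: float) -> float:
    """Eukaryote model as quoted: Y' = ln(-46.200 + 22.678 * X')."""
    arg = -46.200 + 22.678 * log10_size_kbp
    if arg <= 0:
        # The logarithm is undefined here; the genome is too small for this model.
        raise ValueError("genome size is outside the range where the model is defined")
    return math.log(arg)


def non_eukaryote_log10_genes(log10_size_kbp: float) -> float:
    """Non-eukaryote model as quoted: Y' = 0.045 + 0.977 * X'."""
    return 0.045 + 0.977 * log10_size_kbp


def predicted_gene_number(size_kbp: float, eukaryote: bool = True) -> float:
    """Convert genome size in kbp to a predicted protein-coding gene number."""
    x = math.log10(size_kbp)  # X' = log10(genome size in kbp)
    if eukaryote:
        y = eukaryote_log10_genes(x)
    else:
        y = non_eukaryote_log10_genes(x)
    return 10 ** y  # undo the log10 transform of gene number


if __name__ == "__main__":
    # Genome sizes taken from the abstract (kbp); 1e3 kbp is an arbitrary small example.
    for size in (3e6, 245e6):
        print(f"eukaryote, {size:.3g} kbp: {predicted_gene_number(size):,.0f} genes")
    print(f"non-eukaryote, 1e3 kbp: {predicted_gene_number(1e3, eukaryote=False):,.0f} genes")
```

Because the eukaryote model takes the logarithm of a linear term, predicted gene number rises much more slowly than genome size, which matches the falling gene-coding percentages described in the abstract.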

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central