A hybrid-hierarchical genome assembly strategy to sequence the invasive golden mussel Limnoperna fortunei
ABSTRACT Background: For more than 25 years, the golden mussel Limnoperna fortunei has aggressively invaded South American freshwaters, having travelled more than 5,000 km upstream across five countries. Along the way, the golden mussel has outcompeted native species and economically harmed aquaculture, hydroelectric power generation, and ship transit. We have sequenced the complete genome of the golden mussel to understand the molecular basis of its invasiveness and search for ways to control it. Findings: We assembled the 1.6 Gb genome into 20,548 scaffolds with an N50 length of 312 kb using a hybrid and hierarchical assembly strategy that combines short and long DNA reads with transcriptomes. A total of 60,717 coding genes were inferred from a customized transcriptome-trained AUGUSTUS run. We also compared predicted protein sets with those of complete molluscan genomes, revealing an expansion of protein-binding domains in L. fortunei. Conclusions: We built one of the best bivalve genome assemblies available with a cost-effective approach based on Illumina paired-end, mate-pair, and PacBio long reads. We expect that the continuous and careful annotation of L. fortunei's genome will contribute to the investigation of bivalve genetics, evolution, and invasiveness, as well as to the development of biotechnological tools for aquatic pest control.
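The N50 figure quoted in the abstract above is a standard contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the assembly. The following is a minimal sketch of that calculation on made-up toy lengths; the function name `n50` and the toy data are illustrative assumptions, not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical scaffold lengths (arbitrary units), total = 300
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```

A higher N50 for the same total assembly size indicates that more of the genome sits in long scaffolds, which is why the statistic is commonly reported alongside the scaffold count.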
A Long-Term Conserved Satellite DNA That Remains Unexpanded in Several Genomes of Characiformes Fish Is Actively Transcribed
Eukaryotic genomes contain large amounts of repetitive DNA sequences, such as tandemly repeated satellite DNAs (satDNAs). These sequences are highly dynamic and tend to be genus- or species-specific due to their particular evolutionary pathways, although there are few unusual cases of satDNAs conserved over long periods of time. Here, we used multiple approaches to reveal that a satDNA named CharSat01-52 originated in the last common ancestor of Characoidei fish, a superfamily within the Characiformes order, ∼140-78 Ma, whereas its nucleotide composition has remained considerably conserved in several taxa. We show that 14 distantly related species within Characoidei share this satDNA, which is highly amplified and clustered in subtelomeric regions in a single species (Characidium gomesi), while remaining organized as small clusters in all the other species. Defying predictions of the molecular drive of satellite evolution, CharSat01-52 shows similar values of intra- and interspecific divergence. Although we did not provide evidence for a specific functional role of CharSat01-52, its transcriptional activity was demonstrated in different species. In addition, we identified short tandem arrays of CharSat01-52 embedded within single-molecule real-time long reads of Astyanax paranae (536 bp-3.1 kb) and A. mexicanus (501 bp-3.9 kb). Such arrays consisted of head-to-tail repeats and could be found interspersed with other sequences, inverted sequences, or neighbored by other satellites. Our results provide a detailed characterization of an old and conserved satDNA, challenging general predictions of satDNA evolution.
The genome sequence of the Norway rat, Rattus norvegicus Berkenhout 1769.
We present a genome assembly from an individual male Rattus norvegicus (the Norway rat; Chordata; Mammalia; Rodentia; Muridae). The genome sequence is 2.44 gigabases in span. The majority of the assembly is scaffolded into 20 chromosomal pseudomolecules, with both X and Y sex chromosomes assembled. This genome assembly, mRatBN7.2, represents the new reference genome for R. norvegicus and has been adopted by the Genome Reference Consortium.
A high-quality genome and comparison of short- versus long-read transcriptome of the palaearctic duck Aythya fuligula (tufted duck)
Background: The tufted duck is a non-model organism that experiences high mortality in highly pathogenic avian influenza outbreaks. It belongs to the same bird family (Anatidae) as the mallard, one of the best-studied natural hosts of low-pathogenic avian influenza viruses. Studies in non-model bird species are crucial to disentangle the role of the host response in avian influenza virus infection in the natural reservoir. Such an endeavour requires a high-quality genome assembly and transcriptome. Findings: This study presents the first high-quality, chromosome-level reference genome assembly of the tufted duck using the Vertebrate Genomes Project pipeline. We sequenced RNA (complementary DNA) from brain, ileum, lung, ovary, spleen, and testis using Illumina short-read and Pacific Biosciences long-read sequencing platforms, which were used for annotation. We found 34 autosomes plus Z and W sex chromosomes in the curated genome assembly, with 99.6% of the sequence assigned to chromosomes. Functional annotation revealed 14,099 protein-coding genes that generate 111,934 transcripts, which implies a mean of 7.9 isoforms per gene. We also identified 246 small RNA families. Conclusions: This annotated genome contributes to continuing research into the host response in avian influenza virus infections in a natural reservoir. Our findings from a comparison between short-read and long-read reference transcriptomics contribute to a deeper understanding of these competing options. In this study, both technologies complemented each other. We expect this annotation to be a foundation for further comparative and evolutionary genomic studies, including many waterfowl relatives with differing susceptibilities to avian influenza viruses.
The Earth BioGenome Project 2020: Starting the clock.
© The Author(s), 2022. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Lewin, H. A., Richards, S., Lieberman Aiden, E., Allende, M. L., Archibald, J. M., Bálint, M., Barker, K. B., Baumgartner, B., Belov, K., Bertorelle, G., Blaxter, Mark L., Cai, J., Caperello, N. D., Carlson, K., Castilla-Rubio, J. C., Chaw, S-M., Chen, L., Childers, A. K., Coddington, J. A., Conde, D. A., Corominas, M., Crandall, K. A., Crawford, A. J., DiPalma, F., Durbin, R., Ebenezer, T. E., Edwards, S. V., Fedrigo, O., Flicek, P., Formenti, G., Gibbs, R. A., Gilbert, M. Thomas P., Goldstein, M. M., Graves, J. M., Greely, H. T., Grigoriev, I. V., Hackett, K. J., Hall, N., Haussler, D., Helgen, K. M., Hogg, C. J., Isobe, S., Jakobsen, K. S., Janke, A., Jarvis, E. D., Johnson, W. E., Jones, S. J. M., Karlsson, E. K., Kersey, P. J., Kim, J-H., Kress, W. J., Kuraku, S., Lawniczak, M. K. N., Leebens-Mack, J. H., Li, X., Lindblad-Toh, K., Liu, X., Lopez, J. V., Marques-Bonet, T., Mazard, S., Mazet, J. A. K., Mazzoni, C. J., Myers, E. W., O’Neill, R. J., Paez, S., Park, H., Robinson, G. E., Roquet, C., Ryder, O. A., Sabir, J. S. M., Shaffer, H. B., Shank, T. M., Sherkow, J. S., Soltis, P. S., Tang, B., Tedersoo, L., Uliano-Silva, M., Wang, K., Wei, X., Wetzer, R., Wilson, J. L., Xu, X., Yang, H., Yoder, A. D., Zhang, G. The Earth BioGenome Project 2020: starting the clock. Proceedings of the National Academy of Sciences of the United States of America, 119(4), (2022): e2115635118, https://doi.org/10.1073/pnas.2115635118.
November 2020 marked 2 y since the launch of the Earth BioGenome Project (EBP), which aims to sequence all known eukaryotic species in a 10-y timeframe. Since then, significant progress has been made across all aspects of the EBP roadmap, as outlined in the 2018 article describing the project’s goals, strategies, and challenges (1).
The launch phase has ended and the clock has started on reaching the EBP’s major milestones. This Special Feature explores the many facets of the EBP, including a review of progress, a description of major scientific goals, exemplar projects, ethical, legal, and social issues, and applications of biodiversity genomics. In this Introduction, we summarize the current status of the EBP, held virtually October 5 to 9, 2020, including recent updates through February 2021. References to the nine Perspective articles included in this Special Feature are cited to guide the reader toward deeper understanding of the goals and challenges facing the EBP.
Complete vertebrate mitogenomes reveal widespread repeats and gene duplications
Abstract: Background: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly. Results: As part of the Vertebrate Genomes Project (VGP) we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100–300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization. Conclusions: Our results indicate that even in the “simple” case of vertebrate mitogenomes the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone.
The Genetics of the Golden Mussel (Limnoperna fortunei): Are Genes Related to Invasiveness?
Gene Discovery through Transcriptome Sequencing for the Invasive Mussel Limnoperna fortunei
The success of the Asian bivalve Limnoperna fortunei as an invader in South America is related to its high acclimation capability. It can inhabit waters with a wide range of temperatures and salinity and handle long-term periods of air exposure. We describe the transcriptome of L. fortunei, aiming to give a first insight into the phenotypic plasticity that allows non-native taxa to become established and widespread. We sequenced 95,219 reads from five main tissues of the mussel L. fortunei using Roche’s 454 and assembled them to form a set of 84,063 unigenes (contigs and singletons) representing partial or complete gene sequences. We annotated 24,816 unigenes using a BLAST sequence similarity search against the NCBI nr database. Unigenes were divided into 20 eggNOG functional categories and 292 KEGG metabolic pathways. Of the total unigenes, 1,351 represented putative full-length genes, of which 73.2% were functionally annotated. We described the first partial and complete gene sequences in order to start understanding bivalve invasiveness. An expansion of the hsp70 gene family, seen also in other bivalves, is present in L. fortunei and could be involved in its adaptation to extreme environments, e.g. during intertidal periods. The presence of toll-like receptors gives a first insight into an immune system that could be more complex than previously assumed and may be involved in the prevention of disease and extinction when population densities are high. Finally, the apparent lack of special adaptations to extremely low O₂ levels is a target worth pursuing for the development of a molecular control approach.