13 research outputs found

    Into the Dark: Exploring the Deep Ocean with Single-Virus Genomics

    Single-virus genomics (SVGs) has been successfully applied to ocean surface samples, allowing the discovery of widespread dominant viruses overlooked for years by metagenomics, such as the uncultured virus vSAG 37-F6 infecting the ubiquitous Pelagibacter spp. In SVGs, one uncultured virus at a time is sorted from the environmental sample, whole-genome amplified, and sequenced. Here, we have applied SVGs to deep-ocean samples (200–4000 m depth) from global Malaspina and MEDIMAX expeditions, demonstrating the feasibility of this method in deep-ocean samples. A total of 1328 virus-like particles were sorted from the North Atlantic Ocean, the deep Mediterranean Sea, and the Pacific Ocean oxygen minimum zone (OMZ). For this proof of concept, sixty single viruses were selected at random for sequencing. Genome annotation identified 27 of these genomes as bona fide viruses, and detected three auxiliary metabolic genes involved in nucleotide biosynthesis and sugar metabolism. Massive protein profile analysis confirmed that these viruses represented novel viral groups not present in databases. Although they were not previously assembled by viromics, global fragment recruitment analysis showed a conserved profile of relative abundance of these viruses in all analyzed samples spanning different oceans. Altogether, these results reveal the feasibility of using SVGs in this vast environment to unveil the genomes of relevant viruses. This research was funded by the Spanish Ministry of Science and Innovation (RTI2018-094248-B-I00), and Generalitat Valenciana (ACIF/2015/332 and APOSTD/2020/237).

    Benchmarking of single‐virus genomics: a new tool for uncovering the virosphere

    Metagenomics and single‐cell genomics have enabled the discovery of relevant uncultured microbes. Recently, single‐virus genomics (SVG), although still in an incipient stage, has opened new avenues in viral ecology by allowing the sequencing of one single virus at a time. The investigation of methodological alternatives and optimization of existing procedures for SVG is paramount to deliver high‐quality genomic data. We report a sequencing dataset of viral single‐amplified genomes (vSAGs) from cultured and uncultured viruses obtained by applying different conditions in each SVG step, from viral preservation and novel whole‐genome amplification (WGA) to sequencing platforms and genome assembly. Sequencing data showed that cryopreservation and mild fixation were compatible with WGA, although fresh samples delivered better genome quality data. The novel TruPrime WGA, based on primase‐polymerase features, and WGA‐X, employing a thermostable phi29 polymerase, proved sufficiently sensitive for SVG. The Oxford Nanopore (ON) sequencing platform did not provide a significant improvement of vSAG assembly compared to Illumina alone. Finally, the SPAdes assembler performed the best. Overall, our results represent a valuable genomic dataset that will help to standardize and advance new tools in viral ecology. This work has been supported by Gordon and Betty Moore Foundation (grant 5334) and Spanish Ministry of Economy and Competitiveness (refs CGL2013‐40564‐R, RTI2018‐094248‐B‐I00 and SAF2013‐49267‐EXP). Work at CRG, BIST and UPF was in part funded by the Spanish Ministry of Economy and Competitiveness, ‘Centro de Excelencia Severo Ochoa 2013‐2017’ and the Spanish Ministry of Economy and Competitiveness, ‘Centro de Excelencia Maria de Maeztu 2016‐2019’.

    A new chromosome-assigned Mongolian gerbil genome allows characterization of complete centromeres and a fully heterochromatic chromosome

    This is the final version. Available on open access from Oxford University Press via the DOI in this record. Data Availability: All sequencing data and the genome are available under SRA BioProject PRJNA397533. Specific accession numbers can be found in supplementary material S1, Supplementary Material online. This Whole Genome Shotgun project has been deposited at DDBJ/ENA/GenBank under the accession JAODIK000000000. The version described in this paper is version JAODIK010000000. The genetic map, a vcf of the genetic markers and their genotypes in the mapping panel, the gff of the gene annotations, the gff of the repetitive element annotations, and “Supplemental_Material 3_codebase.zip”, can be found in the Dryad repository here: Brekke, Thomas D. (2022), Data for “The origin of a new chromosome in gerbils”, Dryad, Dataset, https://doi.org/10.5061/dryad.1vhhmgqws. Chromosome-scale genome assemblies based on ultralong-read sequencing technologies are able to illuminate previously intractable aspects of genome biology such as fine-scale centromere structure and large-scale variation in genome features such as heterochromatin, GC content, recombination rate, and gene content. We present here a new chromosome-scale genome of the Mongolian gerbil (Meriones unguiculatus), which includes the complete sequence of all centromeres. Gerbils are thus one of the first vertebrates to have their centromeres completely sequenced. Gerbil centromeres are composed of four different repeats of length 6, 37, 127, or 1,747 bp, which occur in simple alternating arrays and span 1-6 Mb. Gerbil genomes have both an extensive set of GC-rich genes and chromosomes strikingly enriched for constitutive heterochromatin. We sought to determine if there was a link between these two phenomena and found that the two heterochromatic chromosomes of the Mongolian gerbil have distinct underpinnings: Chromosome 5 has a large block of intraarm heterochromatin as the result of a massive expansion of centromeric repeats, while chromosome 13 is comprised of extremely large (>150 kb) repeated sequences. In addition to characterizing centromeres, our results demonstrate the importance of including karyotypic features such as chromosome number and the locations of centromeres in the interpretation of genome sequence data and highlight novel patterns involved in the evolution of chromosomes. Leverhulme Trust; Natural Environment Research Council (NERC); Ministerio de Economía y Competitivida

    Transcriptome innovations in primates revealed by single-molecule long-read sequencing

    Transcriptomic diversity greatly contributes to the fundamentals of disease, lineage-specific biology, and environmental adaptation. However, much of the actual isoform repertoire contributing to shaping primate evolution remains unknown. Here, we combined deep long- and short-read sequencing complemented with mass spectrometry proteomics in a panel of lymphoblastoid cell lines (LCLs) from human, three other great apes, and rhesus macaque, producing the largest full-length isoform catalog in primates to date. Around half of the captured isoforms are not annotated in their reference genomes, significantly expanding the gene models in primates. Furthermore, our comparative analyses unveil hundreds of transcriptomic innovations and isoform usage changes related to immune function and immunological disorders. The confluence of these evolutionary innovations with signals of positive selection and their limited impact in the proteome points to changes in alternative splicing in genes involved in immune response as an important target of recent regulatory divergence in primates. This work was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (XDB31020000); National Key R&D Program of China (China's Ministry of Science and Technology [MoST]) grant 2018YFC1406901; the International Partnership Program of the Chinese Academy of Sciences (no. 152453KYSB20170002); the Carlsberg Foundation (CF16-0663); the Villum Foundation (no. 25900) to G.Z.; and the La Caixa Foundation (ID 100010434) Fellowship Code LCF/BQ/DE16/11570011 (L.F.-P.).
The Center for Genomic Regulation (CRG) / Universitat Pompeu Fabra (UPF) Proteomics Unit is part of the Spanish Infrastructure for Omics Technologies (National Map of Unique Scientific and Technical Infrastructures [ICTS] OmicsTech) and a member of the ProteoRed PRB3 Consortium, which is supported by grant PT17/0019 of the PE I + D + i 2013–2016 from the Instituto de Salud Carlos III (ISCIII), European Regional Development Fund (ERDF), and “Secretaria d'Universitats i Recerca del Departament d'Economia i Coneixement de la Generalitat de Catalunya” (2017SGR595). T.M.-B. is supported by funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement no. 864203), BFU2017-86471-P (MINECO/FEDER, UE); “Unidad de Excelencia María de Maeztu,” funded by the Agencia Estatal de Investigación (AEI) (CEX2018-000792-M); Howard Hughes International Early Career; National Institutes of Health 1R01HG010898-01A1; and Secretaria d'Universitats i Recerca and Centres de Recerca de Catalunya (CERCA) Programme del Departament d'Economia i Coneixement de la Generalitat de Catalunya (GRC 2017 SGR 880)

    Deciphering the Human Virome with Single-Virus Genomics and Metagenomics

    Single-cell genomics has unveiled the metabolic potential of dominant microbes inhabiting different environments, including the human body. The lack of genomic information for predominant microbes of the human body, such as bacteriophages, hinders our ability to answer fundamental questions about our viral communities. Here, we applied single-virus genomics (SVGs) to natural human salivary samples in combination with viral metagenomics to gain some insights into the viral community structure of the oral cavity. Saliva samples were processed for viral metagenomics (n = 15) and SVGs (n = 3). A total of 1328 uncultured single viruses were sorted by fluorescence-activated virus sorting followed by whole genome amplification. Sequencing of 24 viral single amplified genomes (vSAGs) showed that half of the vSAGs contained viral hallmark genes. Among those bona fide viruses, the uncultured single virus 92-C13 putatively infecting oral Streptococcus-like species was within the top ≈10 most abundant viruses in the oral virome. Viral gene network and viral metagenomics analyses of 439 oral viruses from cultures, metagenomics, and SVGs revealed that salivary viruses were tentatively structured into ≈200 major viral clusters, corresponding to approximately genus-level groupings. Data showed that none of the publicly available viral isolates, except an Actinomyces phage, were significantly abundant in the oral viromes. In addition, none of the obtained viral contigs and vSAGs from this study were present in all viromes. Overall, the data demonstrate that most viral isolates are not naturally abundant in saliva, and furthermore, the predominant viruses in the oral cavity are yet uncharacterized. Results suggest a variable, complex, and interpersonal viral profile. Finally, we demonstrated the power of SVGs in combination with viral metagenomics to unveil the genetic information of the uncultured viruses of the human virome. This work has been supported by the Spanish Ministry of Economy and Competitiveness (refs. CGL2013-40564-R and SAF2013-49267-EXP), the Generalitat Valenciana (refs. ACOM/2015/133 and ACIF/2015/332), and the Gordon and Betty Moore Foundation (grant 5334). We thank the English editor Karen Neller.

    Ecogenomics of the SAR11 clade

    Members of the SAR11 clade, despite their high abundance, are often poorly represented by metagenome-assembled genomes. This fact has hampered our knowledge about their ecology and genetic diversity. Here we examined 175 SAR11 genomes, including 47 new single-amplified genomes. The presence of the first genomes associated with subclade IV suggests that, in the same way as subclade V, they might be outside the proposed Pelagibacterales order. An expanded phylogenomic classification together with patterns of metagenomic recruitment at a global scale have allowed us to define new ecogenomic units of classification (genomospecies), appearing at different, and sometimes restricted, metagenomic data sets. We detected greater microdiversity across the water column at a single location than in samples collected from similar depth across the global ocean, suggesting little influence of biogeography. In addition, pangenome analysis revealed that the flexible genome was essential to shape genomospecies distribution. In one genomospecies preferentially found within the Mediterranean, a set of genes involved in phosphonate utilization was detected, while another, with a more cosmopolitan distribution, was unique in having an aerobic purine degradation pathway. Together, these results provide a glimpse of the enormous genomic diversity within this clade at a finer resolution than the currently defined clades. This work was supported by grant ‘VIREVO’ CGL2016‐76273‐P [AEI/FEDER, EU] (co-funded with FEDER funds) from the Spanish Ministerio de Economía, Industria y Competitividad to FRV, and grants CGL2013‐40564‐R and SAF2013‐49267‐EXP from the Spanish Ministerio de Economía, Industria y Competitividad, grant ACIF/2015/332 from Generalitat Valenciana and grant 5334 from the Gordon and Betty Moore Foundation to MMG. FRV was also a beneficiary of the 5top100‐program of the Ministry for Science and Education of Russia. JHM was supported by a Ph.D. fellowship from the Spanish Ministerio de Economía y Competitividad (BES‐2014‐067828). MLP was supported by a postdoctoral fellowship from the Spanish Ministerio de Economía, Industria y Competitividad (IJCI‐2017‐34002).

    Flow sorting enrichment and nanopore sequencing of chromosome 1 from a Chinese individual

    Sorting of individual chromosomes by Flow Cytometry (flow-sorting) is an enrichment method to potentially simplify genome assembly by isolating chromosomes from the context of the genome. We have recently developed a workflow to sequence native, unamplified DNA and applied it to the smallest human chromosome, the Y chromosome. Here, we improve upon that workflow to increase DNA recovery from chromosome sorting as well as sequencing yield. We apply it to sequence and assemble the largest human chromosome - chromosome 1 - of a Chinese individual using a single Oxford Nanopore MinION flow cell. We generate a selective and highly continuous assembly whose continuity reaches into the order of magnitude of the human reference GRCh38. We then use this assembly to call candidate structural variants against the reference and find 685 putative novel SV candidates. We propose this workflow as a potential solution to assemble structurally complex chromosomes, or to study very large plant or animal genomes that might challenge traditional assembly strategies. This study was funded by grants RTI2018-096824-B-C22 from the Agencia Estatal de Investigación-Ministerio de Ciencia, Innovación y Universidades (Spain) and FEDER (EU) to OF and FC, SAF2015-68472-C2-2-R from the Ministerio de Economía y Competitividad (Spain) and FEDER (EU) to FC, the Centro de Excelencia Severo Ochoa, and by Direcció General de Recerca, Generalitat de Catalunya (2017SGR-702). TM-B is supported by BFU2017-86471-P (MINECO/FEDER, UE), U01 MH106874 grant, Howard Hughes International Early Career, Obra Social “La Caixa” and Secretaria d'Universitats i Recerca and CERCA Programme del Departament d'Economia i Coneixement de la Generalitat de Catalunya (GRC 2017 SGR 880). LK is supported by an FPI fellowship associated with BFU2014-55090-P (MINECO/FEDER, UE) and by an EMBO Short-Term Fellowship STF-8286. 
LB-M is supported by a Formació de personal Investigador fellowship from Generalitat de Catalunya (2018_FI_B00072). MS-M is supported by the María de Maeztu Programme (MDM-2014-0370-16-3

    Single-virus genomics reveals hidden cosmopolitan and abundant viruses

    Microbes drive ecosystems under constraints imposed by viruses. However, a lack of virus genome information hinders our ability to answer fundamental, biological questions concerning microbial communities. Here we apply single-virus genomics (SVGs) to assess whether portions of marine viral communities are missed by current techniques. The majority of the here-identified 44 viral single-amplified genomes (vSAGs) are more abundant in global ocean virome data sets than published metagenome-assembled viral genomes or isolates. This indicates that vSAGs likely best represent the dsDNA viral populations dominating the oceans. Species-specific recruitment patterns and virome simulation data suggest that vSAGs are highly microdiverse and that microdiversity hinders the metagenomic assembly, which could explain why their genomes have not been identified before. Altogether, SVGs enable the discovery of some of the likely most abundant and ecologically relevant marine viral species, such as vSAG 37-F6, which were overlooked by other methodologies. This work has been supported by Spanish Ministry of Economy and Competitiveness (refs CGL2013-40564-R and SAF2013-49267-EXP), Generalitat Valenciana (ref. ACOM/2015/133 and ACIF/2015/332), the USA National Science Foundation (OCE#1536989), the USA Department of Energy (DE-SC0010580), and Gordon and Betty Moore Foundation (grants 3305, 3790, and 5334). The Ohio Supercomputer supported gene-sharing network high performance compute time. Work at BBMO was funded by Spanish project CT2015-70340-R. Work at CRG, BIST and UPF was in part funded by the Spanish Ministry of Economy and Competitiveness, ‘Centro de Excelencia Severo Ochoa 2013-2017’ and the Spanish Ministry of Economy and Competitiveness, ‘Centro de Excelencia Maria de Maeztu 2016-2019’.

    Selective single molecule sequencing and assembly of a human Y chromosome of African origin

    Mammalian Y chromosomes are often neglected in genomic analyses. Due to their inherent assembly difficulties, high repeat content, and large ampliconic regions, only a handful of species have their Y chromosome properly characterized. To date, just a single human reference quality Y chromosome, of European ancestry, is available due to a lack of accessible methodology. To facilitate the assembly of such complicated genomic territory, we developed a novel strategy to sequence native, unamplified flow sorted DNA on a MinION nanopore sequencing device. Our approach yields a highly continuous assembly of the first human Y chromosome of African origin. It constitutes a significant improvement over comparable previous methods, increasing continuity by more than 800%. Sequencing native DNA also allows us to take advantage of the nanopore signal data to detect epigenetic modifications in situ. This approach is in theory generalizable to any species, simplifying the assembly of extremely large and repetitive genomes.