21 research outputs found
Genomic comparison of the temperate coral Astrangia poculata with tropical corals yields insights into winter quiescence, innate immunity, and sexual reproduction
Facultatively symbiotic corals provide important experimental models to explore the establishment, maintenance, and breakdown of the mutualism between corals and members of the algal family Symbiodiniaceae. The temperate coral Astrangia poculata is one such model as it is not only facultatively symbiotic, but also occurs across a broad temperature and latitudinal gradient. Here, we report the de novo chromosome-scale assembly and annotation of the A. poculata genome. Though widespread segmental/tandem duplications of genomic regions were detected, we did not find strong evidence of a whole genome duplication (WGD) event. Comparison of the gene arrangement between A. poculata and the tropical coral Acropora millepora revealed 56.38% of the orthologous genes were conserved in syntenic blocks despite ~415 million years of divergence. Gene families related to sperm hyperactivation and innate immunity, including lectins, were found to contain more genes in A. millepora relative to A. poculata. Sperm hyperactivation in A. millepora is expected given the extreme requirements of gamete competition during mass spawning events in tropical corals, while lectins are important in the establishment of coral-algal symbiosis. By contrast, gene families involved in sleep promotion, feeding suppression, and circadian sleep/wake cycle processes were expanded in A. poculata. These expanded gene families may play a role in A. poculata's ability to enter a dormancy-like state ("winter quiescence") to survive freezing temperatures at the northern edges of the species' range. Funding: National Science Foundation (IOS-1354935). First author draft.
The European Reference Genome Atlas: piloting a decentralised approach to equitable biodiversity genomics.
ABSTRACT: A global genome database of all of Earth's species diversity could be a treasure trove of scientific discoveries. However, regardless of the major advances in genome sequencing technologies, only a tiny fraction of species have genomic information available. To contribute to a more complete planetary genomic database, scientists and institutions across the world have united under the Earth BioGenome Project (EBP), which plans to sequence and assemble high-quality reference genomes for all ~1.5 million recognized eukaryotic species through a stepwise phased approach. As the initiative transitions into Phase II, where 150,000 species are to be sequenced in just four years, worldwide participation in the project will be fundamental to success. As the European node of the EBP, the European Reference Genome Atlas (ERGA) seeks to implement a new decentralised, accessible, equitable and inclusive model for producing high-quality reference genomes, which will inform EBP as it scales. To embark on this mission, ERGA launched a Pilot Project to establish a network across Europe to develop and test the first infrastructure of its kind for the coordinated and distributed reference genome production on 98 European eukaryotic species from sample providers across 33 European countries. Here we outline the process and challenges faced during the development of a pilot infrastructure for the production of reference genome resources, and explore the effectiveness of this approach in terms of high-quality reference genome production, considering also equity and inclusion. The outcomes and lessons learned during this pilot provide a solid foundation for ERGA while offering key learnings to other transnational and national genomic resource projects.
GraphUnzip: unzipping assembly graphs with long reads and Hi-C
Long reads and Hi-C have revolutionized the field of genome assembly as they have made highly contiguous assemblies accessible even for challenging genomes. As haploid chromosome-level assemblies are now commonly achieved for all types of organisms, phasing assemblies has become the new frontier for genome reconstruction. Several tools have already been released using long reads and/or Hi-C to phase assemblies, but they all start from a set of linear sequences and are ill-suited for non-model organisms with high levels of heterozygosity. We present GraphUnzip, a fast, memory-efficient and flexible tool to unzip assembly graphs into their constituent haplotypes using long reads and/or Hi-C data. As GraphUnzip only connects sequences that already had a potential link in the assembly graph, it yields high-quality gap-less supercontigs. To demonstrate the efficiency of GraphUnzip, we tested it on the human HG00733 and the potato Solanum tuberosum. In both cases, GraphUnzip yielded phased assemblies with improved contiguity.
Overcoming uncollapsed haplotypes in long-read assemblies of non-model organisms
BACKGROUND: Long-read sequencing is revolutionizing genome assembly: as PacBio and Nanopore technologies become technically and financially more accessible, long-read assemblers flourish and are starting to deliver chromosome-level assemblies. However, these long reads are usually error-prone, making the generation of a haploid reference out of a diploid genome a difficult enterprise. Failure to properly collapse haplotypes results in fragmented and structurally incorrect assemblies and wreaks havoc on orthology inference pipelines, yet this serious issue is rarely acknowledged and dealt with in genomic projects, and an independent, comparative benchmark of the capacity of assemblers and post-processing tools to properly collapse or purge haplotypes is still lacking. RESULTS: We tested different assembly strategies on the genome of the rotifer Adineta vaga, a non-model organism for which high coverages of both PacBio and Nanopore reads were available. The assemblers we tested (Canu, Flye, NextDenovo, Ra, Raven, Shasta and wtdbg2) exhibited strikingly different behaviors when dealing with highly heterozygous regions, resulting in variable amounts of uncollapsed haplotypes. Filtering reads generally improved haploid assemblies, and we also benchmarked three post-processing tools aimed at detecting and purging uncollapsed haplotypes in long-read assemblies: HaploMerger2, purge_haplotigs and purge_dups. CONCLUSIONS: We provide a thorough evaluation of popular assemblers on a non-model eukaryote genome with variable levels of heterozygosity. Our study highlights several strategies using pre- and post-processing approaches to generate haploid assemblies with high continuity and completeness. This benchmark will help users to improve haploid assemblies of non-model organisms, and evaluate the quality of their own assemblies. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-021-04118-3
DataSheet1_Revisiting genomes of non-model species with long reads yields new insights into their biology and evolution.pdf
High-quality genomes obtained using long-read data not only allow for a better understanding of heterozygosity levels and repeat content, and for more accurate gene annotation and prediction, than genomes obtained with short-read technologies, but also make it possible to study haplotype divergence. Advances in long-read sequencing technologies in recent years have made it possible to produce such high-quality assemblies for non-model organisms. This allows us to revisit genomes that have been problematic to scaffold to chromosome scale with previous generations of data and assembly software. Nematoda, one of the most diverse and speciose animal phyla within metazoans, remains poorly studied, and many previously assembled genomes are fragmented. Using long reads obtained with Nanopore R10.4.1 and PacBio HiFi, we generated highly contiguous assemblies of a diploid nematode of the Mermithidae family, for which no closely related genomes are available to date, as well as a collapsed assembly and a phased assembly for a triploid nematode from the Panagrolaimidae family. Both genomes had been analysed before, but the fragmented assemblies had scaffold sizes comparable to the length of long reads prior to assembly. Our new assemblies illustrate how long-read technologies allow for a much better representation of species genomes. We are now able to conduct more accurate downstream assays based on more complete gene and transposable element predictions.
SeSAM: software for automatic construction of order-robust linkage maps
Genotyping and sequencing technologies produce increasingly large numbers of genetic markers with potentially high rates of missing or erroneous data. Therefore, the construction of linkage maps is more and more complex. Moreover, the size of segregating populations remains constrained by cost issues and is less and less commensurate with the numbers of SNPs available. Thus, guaranteeing a statistically robust marker order requires that maps include only a carefully selected subset of SNPs. In this context, the SeSAM software allows automatic genetic map construction using seriation and placement approaches, to produce (1) a high-robustness framework map which includes as many markers as possible while keeping the order robustness beyond a given statistical threshold, and (2) a high-density total map including the framework plus almost all polymorphic markers. During this process, care is taken to limit the impact of genotyping errors and of missing data on mapping quality. SeSAM can be used with a wide range of biparental populations, including those from outcrossing species, for which phases are inferred on-the-fly by maximum-likelihood during map elongation. The package also includes functions to simulate data sets, convert data formats, detect putative genotyping errors, visualize data and map quality (including graphical genotypes), and merge several maps into a consensus. SeSAM is also suitable for interactive map construction, by providing lower-level functions for 2-point and multipoint EM analyses. The software is implemented in an R package including functions in C++. SeSAM is a fully automatic linkage mapping software designed to (1) produce a framework map as robust as desired by optimizing the selection of a subset of markers, and (2) produce a high-density map including almost all polymorphic markers. The software can be used with a wide range of biparental mapping populations including cases from outcrossing.
SeSAM is freely available under a GNU GPL v3 license and works on Linux, Windows, and macOS platforms. It is available as Additional file 1 and can be downloaded together with its user-manual and quick-start tutorial from ForgeMIA (SeSAM project) at https://forgemia.inra.fr/gqe-acep/sesam/-/release
Maternal inheritance of functional centrioles in two parthenogenetic nematodes
Centrioles are the core constituent of centrosomes, microtubule-organizing centers involved in directing mitotic spindle assembly and chromosome segregation in animal cells. In sexually reproducing species, centrioles degenerate during oogenesis and female meiosis is usually acentrosomal. Centrioles are retained during male meiosis and, in most species, are reintroduced with the sperm during fertilization, restoring centriole numbers in embryos. In contrast, the presence, origin, and function of centrioles in parthenogenetic species is unknown. We found that centrioles are maternally inherited in two species of asexual parthenogenetic nematodes and identified two different strategies for maternal inheritance evolved in the two species. In Rhabditophanes diutinus, centrioles organize the poles of the meiotic spindle and are inherited by both the polar body and embryo. In Diploscapter pachys, the two pairs of centrioles remain close together and are inherited by the embryo only. Our results suggest that maternally-inherited centrioles organize the embryonic spindle poles and act as a symmetry-breaking cue to induce embryo polarization. Thus, in these parthenogenetic nematodes, centrioles are maternally-inherited and functionally replace their sperm-inherited counterparts in sexually reproducing species.
Chromosome-level genome assembly and annotation of two lineages of the ant Cataglyphis hispanica: stepping stones towards genomic studies of hybridogenesis and thermal adaptation in desert ants
Cataglyphis are thermophilic ants that forage during the day when temperatures are highest and sometimes close to their critical thermal limit. Several Cataglyphis species have evolved unusual reproductive systems such as facultative queen parthenogenesis or social hybridogenesis, which have not yet been investigated in detail at the molecular level. We generated high-quality genome assemblies for two hybridogenetic lineages of the Iberian ant Cataglyphis hispanica using long-read Nanopore sequencing and exploited chromosome conformation capture (3C) sequencing to assemble contigs into 26 and 27 chromosomes, respectively. Further karyotype analyses confirm this difference in chromosome numbers between lineages; however, they also suggest it may not be fixed among lineages. We obtained transcriptomic data to assist gene annotation and built custom repeat libraries for each of the two assemblies. Comparative analyses with 19 other published ant genomes were also conducted. These new genomic resources pave the way for exploring the genetic mechanisms underlying the remarkable thermal adaptation and the molecular mechanisms associated with transitions between different genetic systems characteristic of the ant genus Cataglyphis.