19 research outputs found

    Efficient whole genome haplotyping and single molecule phasing with barcode-linked reads

    The future of human genomics is one that seeks to resolve the entirety of genetic variation through sequencing. The prospect of utilizing genomics for medical purposes requires cost-efficient and accurate base calling, long-range haplotyping capability, and reliable calling of structural variants. Short-read sequencing has led the development towards such a future but has struggled to meet the latter two of these needs. To address this limitation, we developed a technology that preserves the molecular origin of short sequencing reads, with an insignificant increase to sequencing costs. We demonstrate a library preparation method which enables whole genome haplotyping, long-range phasing of single DNA molecules, and de novo genome assembly through barcode-linked reads (BLR). Millions of random barcodes are used to reconstruct megabase-scale phase blocks and call structural variants. We also highlight the versatility of our technology by generating libraries from different organisms using picograms to nanograms of input material.

    Genomic basis for skin phenotype and cold adaptation in the extinct Steller’s sea cow

    Steller’s sea cow, an extinct sirenian and one of the largest Quaternary mammals, was described by Georg Steller in 1741 and eradicated by humans within 27 years. Here, we complement Steller’s descriptions with paleogenomic data from 12 individuals. We identified convergent evolution between Steller’s sea cow and cetaceans but not extant sirenians, suggesting a role of several genes in adaptation to cold aquatic (or marine) environments. Among these are inactivations of lipoxygenase genes, which in humans and mouse models cause ichthyosis, a skin disease characterized by a thick, hyperkeratotic epidermis that recapitulates Steller’s sea cows’ reportedly bark-like skin. We also found that Steller’s sea cows’ abundance had been declining continuously for tens of thousands of years before their description, implying that environmental changes also contributed to their extinction.

    Chromosomal genome assembly of the ethanol production strain CBS 11270 indicates a highly dynamic genome structure in the yeast species Brettanomyces bruxellensis

    Here, we present the genome of the industrial ethanol production strain Brettanomyces bruxellensis CBS 11270. The nuclear genome was found to be diploid, containing four chromosomes with sizes ranging from 2.2 to 4.0 Mbp. A 75 Kbp mitochondrial genome was also identified. Comparing the homologous chromosomes, we detected that 0.32% of nucleotides were polymorphic, i.e. formed single nucleotide polymorphisms (SNPs); 40.6% of these SNPs were found in coding regions (i.e. 0.13% of all nucleotides formed SNPs and were in coding regions). In addition, 8,538 indels were found. The total number of protein-coding genes was 4,897; of these, 4,284 were annotated on chromosomes, and the mitochondrial genome contained 18 protein-coding genes. Additionally, 595 annotated genes were on contigs not associated with chromosomes. A number of genes were duplicated, most of them as tandem repeats, including a six-gene cluster located on chromosome 3. There were also examples of interchromosomal gene duplications, including a duplication of a six-gene cluster, which was found on both chromosomes 1 and 4. Gene copy number analysis suggested loss of heterozygosity for 372 genes. This may reflect adaptation to relatively harsh but constant conditions of continuous fermentation. Analysis of gene topology showed that most of these losses occurred in clusters of more than one gene, the largest cluster comprising 33 genes. Comparative analysis against the wine isolate CBS 2499 revealed 88,534 SNPs and 8,133 indels. Moreover, when the scaffolds of the CBS 2499 genome assembly were aligned against the chromosomes of CBS 11270, many of them aligned completely, some had segments aligned to different chromosomes, and some were in fact rearranged. Our findings indicate a highly dynamic genome within the species B. bruxellensis and a tendency towards reduction of gene number in long-term continuous cultivation.

    Linked-read sequencing enables haplotype-resolved resequencing at population scale

    The ability to sequence entire genomes of virtually any organism provides unprecedented insights into the evolutionary history of populations and species. Nevertheless, many population genomic inferences – including the quantification and dating of admixture, introgression and demographic events, and inference of selective sweeps – are still limited by the lack of high-quality haplotype information. The newest generation of sequencing technology now promises significant progress. To establish the feasibility of haplotype-resolved genome resequencing at population scale, we investigated properties of linked-read sequencing data of songbirds of the genus Oenanthe across a range of sequencing depths. Our results, based on the comparison of downsampled (25x, 20x, 15x, 10x, 7x, and 5x) with high-coverage data (46-68x) of seven bird genomes mapped to a reference, suggest that phasing contiguities and accuracies adequate for most population genomic analyses can already be reached with moderate sequencing effort. At 15x coverage, phased haplotypes span about 90% of the genome assembly, with 50 and 90 percent of phased sequences located in phase blocks longer than 1.25-4.6 Mb (N50) and 0.27-0.72 Mb (N90). Phasing accuracy reaches beyond 99% starting from 15x coverage. Higher coverages yielded higher contiguities (up to about 7 Mb/1 Mb (N50/N90) at 25x coverage), but only marginally improved phasing accuracy. Phase block contiguity improved with input DNA molecule length; thus, higher-quality DNA may help keep sequencing costs down. In conclusion, even for organisms with gigabase-sized genomes like birds, linked-read sequencing at moderate depth opens an affordable avenue towards haplotype-resolved genome resequencing at population scale.

    Funding provided by: German Research Foundation (Crossref Funder Registry ID: http://dx.doi.org/10.13039/501100001659; Award Number: BU3456/3-1) and Science for Life Laboratory Swedish Biodiversity Program (Award Number: 2015-R14).

    10X Genomics linked-reads (60x coverage) were assembled using the Supernova 2.1 assembler. To remove duplicate scaffolds of at least 99% identity from the pseudohaploid assembly, we ran the dedupe procedure in BBTools (https://sourceforge.net/projects/bbmap/) allowing up to 7,000 edits. This reduced the assembly to 11,030 scaffolds. We then aimed to ensure that all duplicate scaffolds were removed and to retain only scaffolds whose integrity could be confirmed by the presence of syntenic regions in another songbird genome. To this end, we performed a lastz alignment against the collared flycatcher assembly version 1.5, which is the highest-quality assembly available from the Muscicapidae family. For this we used lastz 1.04 with settings M=254, K=4500, L=3000, Y=15000, C=2, T=2, and --matchcount=10000. This resulted in 295 scaffolds with unique hits in the flycatcher assembly.
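
    The N50 and N90 phase-block statistics quoted in the linked-read abstract above follow the usual length-weighted definition: the block length L such that blocks of length L or longer together contain at least 50 (or 90) percent of all phased sequence. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name nx and the block lengths are hypothetical and chosen only for illustration.

```python
def nx(block_lengths, percent):
    """Return the length L such that blocks of length >= L together
    hold at least `percent` percent of the total phased sequence."""
    if not block_lengths:
        raise ValueError("no phase blocks given")
    total = sum(block_lengths)
    running = 0
    # Walk from the longest block downwards, accumulating covered sequence.
    for length in sorted(block_lengths, reverse=True):
        running += length
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length


# Hypothetical phase-block lengths in base pairs (made-up values, not study data).
blocks = [5_000_000, 3_000_000, 1_000_000, 500_000, 500_000]
n50 = nx(blocks, 50)
n90 = nx(blocks, 90)
print(f"N50 = {n50} bp, N90 = {n90} bp")
```

    Because the statistic is weighted by length, a few long phase blocks dominate N50, while N90 is more sensitive to the tail of short blocks.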

    Interspecific Gene Flow and the Evolution of Specialization in Black and White Rhinoceros

    Africa’s black (Diceros bicornis) and white (Ceratotherium simum) rhinoceros are closely related sister taxa that evolved highly divergent obligate browsing and grazing feeding strategies. Although their precursor species Diceros praecox and Ceratotherium mauritanicum appear in the fossil record ∼5.2 Ma, by 4 Ma both were still mixed feeders, and were even spatiotemporally sympatric at several Pliocene sites in what is today Africa’s Rift Valley. Here, we ask whether D. praecox and C. mauritanicum were reproductively isolated when they came into Pliocene secondary contact. We sequenced and de novo assembled the first annotated black rhinoceros reference genome and compared it with available genomes of other black and white rhinoceros. We show that ancestral gene flow between D. praecox and C. mauritanicum ceased sometime between 3.3 and 4.1 Ma, despite conventional methods for the detection of gene flow from whole genome data returning false positive signatures of recent interspecific migration due to incomplete lineage sorting. We propose that ongoing Pliocene genetic exchange, for up to 2 My after initial divergence, could have hindered the development of obligate feeding strategies until both species were fully reproductively isolated, but that the more severe and shifting paleoclimate of the early Pleistocene was likely the ultimate driver of ecological specialization in African rhinoceros.

    Moose genomes reveal past glacial demography and the origin of modern lineages

    Background: Numerous megafauna species from northern latitudes went extinct during the Pleistocene/Holocene transition as a result of climate-induced habitat changes. However, several ungulate species managed to successfully track their habitats during this period to eventually flourish and recolonise the Holarctic regions. So far, the genomic impacts of these climate fluctuations on ungulates from high latitudes have been little explored. Here, we assemble a de novo genome for the European moose (Alces alces) and analyse it together with re-sequenced nuclear genomes and ancient and modern mitogenomes from across the moose range in Eurasia and North America. Results: We found that moose demographic history was greatly influenced by glacial cycles, with demographic responses to the Pleistocene/Holocene transition similar to other temperate ungulates. Our results further support that modern moose lineages trace their origin back to populations that inhabited distinct glacial refugia during the Last Glacial Maximum (LGM). Finally, we found that present-day moose in Europe and North America show low to moderate inbreeding levels resulting from post-glacial bottlenecks and founder effects, but no evidence for recent inbreeding resulting from human-induced population declines. Conclusions: Taken together, our results highlight the dynamic recent evolutionary history of the moose and provide an important resource for further genomic studies.