12 research outputs found

    Whole-genome sequencing of Oryza brachyantha reveals mechanisms underlying Oryza genome evolution

    The wild species of the genus Oryza contain a largely untapped reservoir of agronomically important genes for rice improvement. Here we report the 261-Mb de novo assembled genome sequence of Oryza brachyantha. Low activity of long-terminal repeat retrotransposons and massive internal deletions of ancient long-terminal repeat elements lead to the compact genome of Oryza brachyantha. We model 32,038 protein-coding genes in the Oryza brachyantha genome, of which only 70% are located in collinear positions in comparison with the rice genome. Analysing breakpoints of non-collinear genes suggests that double-strand break repair through non-homologous end joining has an important role in gene movement and erosion of collinearity in the Oryza genomes. Transition of euchromatin to heterochromatin in the rice genome is accompanied by segmental and tandem duplications, further expanded by transposable element insertions. The high-quality reference genome sequence of Oryza brachyantha provides an important resource for functional and evolutionary studies in the genus Oryza

    Two Antarctic penguin genomes reveal insights into their evolutionary history and molecular changes related to the Antarctic environment. GigaScience

    Abstract Background: Penguins are flightless aquatic birds widely distributed in the Southern Hemisphere. The distinctive morphological and physiological features of penguins allow them to live an aquatic life, and some of them have successfully adapted to the hostile environments in Antarctica. To study the phylogenetic and population history of penguins and the molecular basis of their adaptations to Antarctica, we sequenced the genomes of the two Antarctic dwelling penguin species, the Adélie penguin [Pygoscelis adeliae] and emperor penguin [Aptenodytes forsteri]. Results: Phylogenetic dating suggests that early penguins arose ~60 million years ago, coinciding with a period of global warming. Analysis of effective population sizes reveals that the two penguin species experienced population expansions from ~1 million years ago to ~100 thousand years ago, but responded differently to the climatic cooling of the last glacial period. Comparative genomic analyses with other available avian genomes identified molecular changes in genes related to epidermal structure, phototransduction, lipid metabolism, and forelimb morphology. Conclusions: Our sequencing and initial analyses of the first two penguin genomes provide insights into the timing of penguin origin, fluctuations in effective population sizes of the two penguin species over the past 10 million years, and the potential associations between these biological patterns and global climate change. The molecular changes compared with other avian genomes reflect both shared and diverse adaptations of the two penguin species to the Antarctic environment.

    Comparison of long-range PE sequencing methods.

    (A–D) Long-range PE sequencing with linker oligonucleotides. In these methods, biotin-labeled linker oligonucleotides are added to the two ends of long-range DNA fragments, followed by enzyme-induced intra-molecule circularization and recovery of the paired ends for sequencing. The addition of linker oligonucleotides and the subsequent complex enzyme reactions require 5–8 recoveries before the paired ends can be captured from circularized DNA fragments. In addition, the use of expensive enzymes involves additional costs. (E) Long-range PE sequencing by direct intra-molecule ligation, or molecular linker-free circularization. In this method, the 3′ ends of long-range DNA fragments were biotin-labeled, followed by direct intra-molecule circularization and recovery of the paired ends. This method requires fewer recovery steps (3–4) and no complex enzyme reaction system. The steps for DNA recovery are in bold. We applied method E in this research.

    Two long insertions in YH genome detected by long-range PE.

    Mapping the long-range PE reads back to the human genome (NCBI build 37) resulted in the detection of a previously identified ∼8 kb insertion in chromosome 7 (A) and a novel ∼7 kb insertion in chromosome 14 (B) in the YH genome. The abnormally mapped PE reads that support the insertions by showing unexpectedly short insert sizes are shown.

    Paired-End Sequencing of Long-Range DNA Fragments for De Novo Assembly of Large, Complex Mammalian Genomes by Direct Intra-Molecule Ligation

    Background: The relatively short read lengths from next generation sequencing (NGS) technologies still pose a challenge for de novo assembly of complex mammalian genomes. One important solution is to use paired-end (PE) sequence information experimentally obtained from long-range DNA fragments (>1 kb). Here, we characterize and extend a long-range PE library construction method based on direct intra-molecule ligation (or molecular linker-free circularization) for NGS. Results: We found that the method performs stably for PE sequencing of 2- to 5-kb DNA fragments, and can be extended to 10–20 kb (and even, in extreme cases, up to ∼35 kb). We also characterized the impact of low-quality input DNA on the method, and developed a whole-genome amplification (WGA) based protocol using limited input DNA (<1 µg). Using this PE dataset, we accurately assembled the YanHuang (YH) genome, the first sequenced Asian genome, into a scaffold N50 size of >2 Mb, which is over 100 times greater than the initial size produced with only small-insert PE reads (17 kb). In addition, we mapped two 7- to 8-kb insertions in the YH genome using the larger insert sizes of the long-range PE data. Conclusions: We demonstrate here the effectiveness of this long-range PE sequencing method and its use for the de novo assembly of a large, complex genome using NGS short reads.

    Summary of de novo YH genome assembly.

    The data from the YH project were used for the contig and initial scaffold assembly. Then, the long-range PE data were added step by step for scaffold construction. Genome coverage and gene coverage were calculated using NCBI build 37 and the RefSeq gene set as references, respectively. The X and Y chromosomes were excluded when calculating genome coverage and gene coverage. For the calculation of scaffold N50, N90 and total length, the intra-scaffold gaps were included.

    De novo assembly of the YH genome.

    (A) The YH scaffold N50 (green bar) and N90 (blue bar) sizes improved dramatically with the addition of long-range PE information (from 2 kb to 35 kb). The trends of improvement are shown as a dashed line. (B) Alignment between the assembled YH scaffolds (y-axis) and the reference human genome (NCBI build 37, x-axis) on chr8. The local repeat level in the reference chr8 (calculated in a 1-kb window) is shown in color along the chromosome in the bar at the top. The white blocks in the bar represent the gaps in the reference genome. (C) Alignment of the YH scaffold 320 onto the reference chr8. The local repeat level in this region of the reference chr8 is also shown in color along the sequence (calculated in a 1-kb window).

    Insert-size distributions of long-range PE sequencing libraries.

    (A) 2- to 35-kb libraries; (B) 10 kb-WGA and 10 kb-dam libraries. The read pairs that were uniquely mapped to the human genome (NCBI build 37) were used for this analysis. The insert size of a library and its corresponding small-insert read contamination are shown in the ‘−’ and ‘+’ directions of the x-axis, respectively. The ‘−’ direction represents the orientation relationship between PEs from circularized long-range DNA molecules (>1 kb) when mapped to the human genome, while ‘+’ represents that between the two ends from linear small DNA fragments (∼500 bp).

    Genome sequence of ground tit Pseudopodoces humilis and its adaptation to high altitude

    Background: The mechanism of high-altitude adaptation has been studied in certain mammals. However, in avian species like the ground tit Pseudopodoces humilis, the adaptation mechanism remains unclear. The phylogeny of the ground tit is also controversial. Results: Using next generation sequencing technology, we generated and assembled a draft genome sequence of the ground tit. The assembly contained 1.04 Gb of sequence that covered 95.4% of the whole genome and had higher N50 values, at the level of both scaffolds and contigs, than other sequenced avian genomes. About 1.7 million SNPs were detected, 16,998 protein-coding genes were predicted and 7% of the genome was identified as repeat sequences. Comparisons between the ground tit genome and other avian genomes revealed a conserved genome structure and confirmed the phylogeny of ground tit as not belonging to the Corvidae family. Gene family expansion and positively selected gene analysis revealed genes that were related to cardiac function. Our findings contribute to our understanding of the adaptation of this species to extreme environmental living conditions. Conclusions: Our data and analysis contribute to the study of avian evolutionary history and provide new insights into the adaptation mechanisms to extreme conditions in animals.