Search CORE

28,884 research outputs found

Multi-platform discovery of haplotype-resolved structural variation in human genomes

Author: Ding Li
Publication venue: Digital Commons@Becker
Publication date: 01/01/2019

Recovering complete and draft population genomes from metagenome datasets.

Author: Gilbert Jack A
Sangwan Naseer
Xia Fangfang
Publication venue: eScholarship, University of California
Publication date: 01/03/2016

Assembly of metagenomic sequence data into microbial genomes is of fundamental value to improving our understanding of microbial ecology and metabolism by elucidating the functional potential of hard-to-culture microorganisms. Here, we provide a synthesis of available methods to bin metagenomic contigs into species-level groups and highlight how genetic diversity, sequencing depth, and coverage influence binning success. Despite the computational cost when applied to deeply sequenced complex metagenomes (e.g., soil), covarying patterns of contig coverage across multiple datasets significantly improve the binning process. We also discuss and compare current genome validation methods and reveal how these methods tackle the problem of chimeric genome bins, i.e., bins containing sequences from multiple species. Finally, we explore how population genome assembly can be used to uncover biogeographic trends and to characterize the effect of in situ functional constraints on the genome-wide evolution.
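
The idea of using covarying contig coverage across datasets can be illustrated with a minimal Python sketch. It is not the method of any tool reviewed in the paper: the greedy grouping rule, the `0.95` threshold, the function names, and the coverage values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length coverage profiles.

    Assumes neither profile is constant (a constant profile has zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def group_by_coverage(profiles, threshold=0.95):
    """Greedily place each contig in the first bin whose seed contig
    (the bin's first member) has a correlated coverage profile."""
    bins = []
    for name, profile in profiles.items():
        for members in bins:
            if pearson(profiles[members[0]], profile) >= threshold:
                members.append(name)
                break
        else:
            # No existing bin matched, so this contig seeds a new bin.
            bins.append([name])
    return bins


# Hypothetical per-sample coverage for three contigs across four datasets.
profiles = {
    "contig_a": [10, 40, 5, 20],
    "contig_b": [12, 38, 6, 22],
    "contig_c": [30, 5, 25, 8],
}
bins = group_by_coverage(profiles)
print(bins)
```

Here `contig_a` and `contig_b` rise and fall together across the four datasets and end up in one group, while `contig_c` follows the opposite pattern and is kept apart. Real binning tools typically combine coverage with sequence composition and use more robust clustering.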

Woods Hole Open Access Server

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

De novo assembly of transcriptomes from a B73 maize line introgressed with a QTL for resistance to gray leaf spot disease reveals a candidate allele of a lectin receptor-like kinase

Author: Berger Dave K.
Pierneef Rian
Scheepers Kevin Daniel
Sterck Lieven
Swart Velushka
Van de Peer Yves
Welgemoed Tanya
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2020

Gray leaf spot (GLS) disease in maize, caused by the fungus Cercospora zeina, is a threat to maize production globally. Understanding the molecular basis for quantitative resistance to GLS is therefore important for food security. We developed a de novo assembly pipeline to identify candidate maize resistance genes. Near-isogenic maize lines with and without a QTL for GLS resistance on chromosome 10 from inbred CML444 were produced in the inbred B73 background. The B73-QTL line showed a 20% reduction in GLS disease symptoms compared to B73 in the field (p = 0.01). B73-QTL leaf samples from this field experiment conducted under GLS disease pressure were RNA sequenced. The reads that did not map to the B73 or C. zeina genomes were expected to contain novel defense genes and were de novo assembled. A total of 141 protein-coding sequences with B73-like or plant annotations were identified from the B73-QTL plants exposed to C. zeina. To determine whether candidate gene expression was induced by C. zeina, the RNAseq reads from C. zeina-challenged and control leaves were mapped to a master assembly of all of the B73-QTL reads, and differential gene expression analysis was conducted. Combining results from both bioinformatics approaches led to the identification of a likely candidate gene, which was a novel allele of a lectin receptor-like kinase named L-RLK-CML that (i) was induced by C. zeina, (ii) was positioned in the QTL region, and (iii) had functional domains for pathogen perception and defense signal transduction. The 817AA L-RLK-CML protein had 53 amino acid differences from its 818AA counterpart in B73. A second "B73-like" allele of L-RLK was expressed at a low level in B73-QTL. Gene copy-specific RT-qPCR confirmed that the l-rlk-cml transcript was the major product induced four-fold by C. zeina. 
Several other expressed defense-related candidates were identified, including a wall-associated kinase, two glutathione S-transferases, a chitinase, a glucan beta-glucosidase, a plasmodesmata callose-binding protein, several other receptor-like kinases, and components of calcium signaling, vesicular trafficking, and ethylene biosynthesis. This work presents a bioinformatics protocol for gene discovery from de novo assembled transcriptomes and identifies candidate quantitative resistance genes.

Ghent University Academic Bibliography

UPSpace at the University of Pretoria

Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans

Author: Andolfatto Peter
Cridland Julie M.
Hu Tina T.
Rogers Rebekah L.
Shao Ling
Thornton Kevin R.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 07/04/2014

We have used whole genome paired-end Illumina sequence data to identify tandem duplications in 20 isofemale lines of D. yakuba and 20 isofemale lines of D. simulans, and performed genome-wide validation with PacBio long molecule sequencing. We identify 1,415 tandem duplications that are segregating in D. yakuba as well as 975 duplications in D. simulans, indicating greater variation in D. yakuba. Additionally, we observe high rates of secondary deletions at duplicated sites, with 8% of duplicated sites in D. simulans and 17% of sites in D. yakuba modified with deletions. These secondary deletions are consistent with the action of the large loop mismatch repair system acting to remove polymorphic tandem duplication, resulting in rapid dynamics of gain and loss in duplicated alleles and a richer substrate of genetic novelty than has been previously reported. Most duplications are present in only single strains, suggesting deleterious impacts are common. D. simulans shows larger numbers of whole gene duplications in comparison to larger proportions of gene fragments in D. yakuba. D. simulans displays an excess of high frequency variants on the X chromosome, consistent with adaptive evolution through duplications on the D. simulans X or demographic forces driving duplicates to high frequency. We identify 78 chimeric genes in D. yakuba and 38 chimeric genes in D. simulans, as well as 143 cases of recruited non-coding sequence in D. yakuba and 96 in D. simulans, in agreement with rates of chimeric gene origination in D. melanogaster. Together, these results suggest that tandem duplications often result in complex variation beyond whole gene duplications that offers a rich substrate of standing variation that is likely to contribute both to detrimental phenotypes and disease, as well as to adaptive evolutionary change. Comment: Revised version, accepted at Molecular Biology and Evolution

arXiv.org e-Print Archive

Princeton University Open Access Repository

PubMed Central

eScholarship - University of California

Ultraaccurate genome sequencing and haplotyping of single human cells.

Author: Bafna Vineet
Bansal Vikas
Chu Wai Keung
Edge Peter
Huang Xiaohua
Lee Ho Suk
Zhang Kun
Publication venue: eScholarship, University of California
Publication date: 01/11/2017

Accurate detection of variants and long-range haplotypes in genomes of single human cells remains very challenging. Common approaches require extensive in vitro amplification of genomes of individual cells using DNA polymerases and high-throughput short-read DNA sequencing. These approaches have two notable drawbacks. First, polymerase replication errors could generate tens of thousands of false-positive calls per genome. Second, relatively short sequence reads contain little to no haplotype information. Here we report a method, dubbed SISSOR (single-stranded sequencing using microfluidic reactors), for accurate single-cell genome sequencing and haplotyping. A microfluidic processor is used to separate the Watson and Crick strands of the double-stranded chromosomal DNA in a single cell and to randomly partition megabase-size DNA strands into multiple nanoliter compartments for amplification and construction of barcoded libraries for sequencing. The separation and partitioning of large single-stranded DNA fragments of the homologous chromosome pairs allows for the independent sequencing of each of the complementary and homologous strands. This enables the assembly of long haplotypes and reduction of sequence errors by using the redundant sequence information and haplotype-based error removal. We demonstrated the ability to sequence single-cell genomes with error rates as low as 10^-8 and average 500-kb-long DNA fragments that can be assembled into haplotype contigs with N50 greater than 7 Mb. The performance could be further improved with more uniform amplification and more accurate sequence alignment. The ability to obtain accurate genome sequences and haplotype information from single cells will enable applications of genome sequencing for diverse clinical needs.
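
For readers unfamiliar with the N50 statistic quoted above, the following short Python sketch shows the standard definition: the length of the contig at which the running total, counted from the longest contig down, first reaches half of the total assembled length. The function name and the example lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total to at least
        # half of the assembly defines the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical haplotype contig lengths in base pairs.
example_lengths = [9_000_000, 7_500_000, 4_000_000, 2_000_000, 500_000]
print(n50(example_lengths))
```

In this made-up example the total is 23 Mb, so the two longest contigs (16.5 Mb combined) are enough to cover half of it, and the N50 is the length of the second one, 7.5 Mb.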

eScholarship - University of California

Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly

Author: Albracht Derek
Fulton Robert S
Graves-Lindsay Tina
Kremitzki Milinn
Magrini Vincent
Markovic Chris
McGrath Sean
Steinberg Karyn Meltz
Wilson Richard K
et al
Publication venue: Digital Commons@Becker
Publication date: 01/01/2017

Digital Commons@Becker

Distinct expression and methylation patterns for genes with different fates following a single whole-genome duplication in flowering plants

Author: Chen Jinming
Gugger Paul F.
Li Hui
Li Zhizhong
Marchal Kathleen
Rahmani Razgar Seyed
Shi Tao
Van de Peer Yves
Wang Muhua
Wang Qingfeng
Zhang Yue
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2020

For most sequenced flowering plants, multiple whole-genome duplications (WGDs) are found. Duplicated genes following a WGD often have different fates: they can quickly disappear again, be retained for long(er) periods, or subsequently undergo small-scale duplications. However, how different expression, epigenetic regulation, and functional constraints are associated with these different gene fates following a WGD still requires further investigation, because successive WGDs in angiosperms complicate the gene trajectories. In this study, we investigate lotus (Nelumbo nucifera), an angiosperm with a single WGD at the K–Pg boundary. Based on improved intraspecific-synteny identification by a chromosome-level assembly, transcriptome, and bisulfite sequencing, we explore not only the fundamental distinctions in genomic features, expression, and methylation patterns of genes with different fates after a WGD but also the factors that shape post-WGD expression divergence and expression bias between duplicates. We found that, after a WGD, genes that returned to single copies show the highest levels and breadth of expression, gene body methylation, and intron numbers, whereas the long-retained duplicates exhibit the highest degrees of protein–protein interactions and protein lengths and the lowest methylation in gene flanking regions. For those long-retained duplicate pairs, the degree of expression divergence correlates with their sequence divergence, degree in protein–protein interactions, and expression level, whereas their biases in expression level reflecting subgenome dominance are associated with the bias of subgenome fractionation. Overall, our study on the paleopolyploid nature of lotus highlights the impact of different functional constraints on gene fate and duplicate divergence following a single WGD in plants.

Ghent University Academic Bibliography

UPSpace at the University of Pretoria