107 research outputs found

    Teaser: Individualized benchmarking and optimization of read mapping results for NGS data

    Mapping reads to a genome remains challenging, especially for non-model organisms with lower quality assemblies, or for organisms with higher mutation rates. While most research has focused on speeding up the mapping process, little attention has been paid to optimizing the choice of mapper and parameters for a user's dataset. Here, we present Teaser, software that assists in these choices through rapid automated benchmarking of different mappers and parameter settings for individualized data. Within minutes, Teaser completes a quantitative evaluation of an ensemble of mapping algorithms and parameters. We use Teaser to demonstrate how Bowtie2 can be optimized for different data

    The Candida albicans Histone Acetyltransferase Hat1 Regulates Stress Resistance and Virulence via Distinct Chromatin Assembly Pathways

    Human fungal pathogens like Candida albicans respond to host immune surveillance by rapidly adapting their transcriptional programs. Chromatin assembly factors are involved in the regulation of stress genes by modulating the histone density at these loci. Here, we report a novel role for the chromatin assembly-associated histone acetyltransferase complex NuB4 in regulating oxidative stress resistance, antifungal drug tolerance and virulence in C. albicans. Strikingly, depletion of the NuB4 catalytic subunit, the histone acetyltransferase Hat1, markedly increases resistance to oxidative stress and tolerance to azole antifungals. Hydrogen peroxide resistance in cells lacking Hat1 results from higher induction rates of oxidative stress gene expression, accompanied by reduced histone density as well as subsequent increased RNA polymerase recruitment. Furthermore, hat1Δ/Δ cells, despite showing growth defects in vitro, display reduced susceptibility to reactive oxygen-mediated killing by innate immune cells. Thus, clearance from infected mice is delayed although cells lacking Hat1 are severely compromised in killing the host. Interestingly, increased oxidative stress resistance and azole tolerance are phenocopied by the loss of histone chaperone complexes CAF-1 and HIR, respectively, suggesting a central role for NuB4 in the delivery of histones destined for chromatin assembly via distinct pathways. Remarkably, the oxidative stress phenotype of hat1Δ/Δ cells is a species-specific trait only found in C. albicans and members of the CTG clade. The reduced azole susceptibility appears to be conserved in a wider range of fungi. Thus, our work demonstrates how highly conserved chromatin assembly pathways can acquire new functions in pathogenic fungi during coevolution with the host

    Paragraph: A graph-based structural variant genotyper for short-read sequence data

    Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations. We demonstrate the accuracy of Paragraph on whole-genome sequence data from three samples using long-read SV calls as the truth set, and then apply Paragraph at scale to a cohort of 100 short-read sequenced samples of diverse ancestry. Our analysis shows that Paragraph has better accuracy than other existing genotypers and can be applied to population-scale studies. © 2019 The Author(s)

    Complex mosaic structural variations in human fetal brains

    Somatic mosaicism, manifesting as single nucleotide variants (SNVs), mobile element insertions and structural changes in the DNA, is a common phenomenon in human brain cells, with potential functional consequences. Using a clonal approach, we previously detected 200-400 mosaic SNVs per cell in three human fetal brains (15 to 21 weeks post-conception). However, structural variation in the human fetal brain has not yet been investigated. Here, we discover and validate four mosaic structural variants (SVs) in the same brains and resolve their precise breakpoints. The SVs were of kilobase scale and complex, consisting of deletion(s) and rearranged genomic fragments, which sometimes originated from different chromosomes. Sequences at the breakpoints of these rearrangements had microhomologies, suggesting their origin from replication errors. One SV was found in two clones and we timed its origin to ~14 weeks post-conception. No large scale mosaic copy number variants (CNVs) were detectable in normal fetal human brains, suggesting that previously reported megabase-scale CNVs in neurons arise at later stages of development. By reanalysis of public single nuclei data from adult brain neurons, we detected an extra-chromosomal circular DNA event. Our study reveals the existence of mosaic SVs in the developing human brain, likely arising from cell proliferation during mid-neurogenesis. Although relatively rare compared to SNVs, and present in ~10% neurons, SVs in developing human brain affect a comparable number of bases in the genome (~6,200 vs ~4,000 bps), implying that they may have similar functional consequences

    RaGOO: fast and accurate reference-guided scaffolding of draft genomes

    We present RaGOO, a reference-guided contig ordering and orienting tool that leverages the speed and sensitivity of Minimap2 to accurately achieve chromosome-scale assemblies in minutes. After the pseudomolecules are constructed, RaGOO identifies structural variants, including those spanning sequencing gaps. We show that RaGOO accurately orders and orients 3 de novo tomato genome assemblies, including the widely used M82 reference cultivar. We then demonstrate the scalability and utility of RaGOO with a pan-genome analysis of 103 Arabidopsis thaliana accessions by examining the structural variants detected in the newly assembled pseudomolecules. RaGOO is available open source at https://github.com/malonge/RaGOO

    Fully resolved assembly of Cryptosporidium parvum

    BACKGROUND: Cryptosporidium parvum is an apicomplexan parasite commonly found across many host species with a global infection prevalence in human populations of 7.6%. Understanding its diversity and genomic makeup can help in fighting established infections and prohibiting further transmission. The basis of every genomic study is a high-quality reference genome that has continuity and completeness, thus enabling comprehensive comparative studies. FINDINGS: Here, we provide a highly accurate and complete reference genome of Cryptosporidium parvum. The assembly is based on Oxford Nanopore reads and was improved using Illumina reads for error correction. We also outline how to evaluate and choose from different assembly methods based on 2 main approaches that can be applied to other Cryptosporidium species. The assembly encompasses 8 chromosomes and includes 13 telomeres that were resolved. Overall, the assembly shows a high completion rate with 98.4% single-copy BUSCO genes. CONCLUSIONS: This high-quality reference genome of a zoonotic IIaA17G2R1 C. parvum subtype isolate provides the basis for subsequent comparative genomic studies across the Cryptosporidium clade. This will enable improved understanding of diversity, functional, and association studies

    Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

    We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that span ~90% of the genome. The population structure of L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics

    NucTools: analysis of chromatin feature occupancy profiles from high-throughput sequencing data

    Background: Biomedical applications of high-throughput sequencing methods generate a vast amount of data in which numerous chromatin features are mapped along the genome. The results are frequently analysed by creating binary data sets that link the presence/absence of a given feature to specific genomic loci. However, the nucleosome occupancy or chromatin accessibility landscape is essentially continuous. It is currently a challenge in the field to cope with continuous distributions of deep sequencing chromatin readouts and to integrate the different types of discrete chromatin features to reveal linkages between them. Results: Here we introduce the NucTools suite of Perl scripts as well as MATLAB- and R-based visualization programs for a nucleosome-centred downstream analysis of deep sequencing data. NucTools accounts for the continuous distribution of nucleosome occupancy. It allows calculations of nucleosome occupancy profiles averaged over several replicates, comparisons of nucleosome occupancy landscapes between different experimental conditions, and the estimation of the changes of integral chromatin properties such as the nucleosome repeat length. Furthermore, NucTools facilitates the annotation of nucleosome occupancy with other chromatin features like binding of transcription factors or architectural proteins, and epigenetic marks like histone modifications or DNA methylation. The applications of NucTools are demonstrated for the comparison of several datasets for nucleosome occupancy in mouse embryonic stem cells (ESCs) and mouse embryonic fibroblasts (MEFs). Conclusions: The typical workflows of data processing and integrative analysis with NucTools reveal information on the interplay of nucleosome positioning with other features, such as binding of the transcription factor CTCF, regions with stable and unstable nucleosomes, and domains of large organized chromatin K9me2 modifications (LOCKs). As potential limitations and problems, we discuss how inter-replicate variability of MNase-seq experiments can be addressed