202 research outputs found

    Single-molecule real-time sequencing combined with optical mapping yields completely finished fungal genome

    Next-generation sequencing (NGS) technologies have increased the scalability, speed, and resolution of genomic sequencing and, thus, have revolutionized genomic studies. However, eukaryotic genome sequencing initiatives typically yield considerably fragmented genome assemblies. Here, we assessed various state-of-the-art sequencing and assembly strategies in order to produce a contiguous and complete eukaryotic genome assembly, focusing on the filamentous fungus Verticillium dahliae. Compared with Illumina-based assemblies of the V. dahliae genome, hybrid assemblies that also include PacBio-generated long reads establish superior contiguity. Intriguingly, provided that sufficient sequence depth is reached, assemblies solely based on PacBio reads outperform hybrid assemblies and even result in fully assembled chromosomes. Furthermore, the addition of optical map data allowed us to produce a gapless and complete V. dahliae genome assembly of the expected eight chromosomes from telomere to telomere. Consequently, we can now study genomic regions that were previously not assembled or poorly assembled, including regions that are populated by repetitive sequences, such as transposons, allowing us to fully appreciate an organism’s biological complexity. Our data show that a combination of PacBio-generated long reads and optical mapping can be used to generate complete and gapless assemblies of fungal genomes. IMPORTANCE Studying whole-genome sequences has become an important aspect of biological research. The advent of next-generation sequencing (NGS) technologies has nowadays brought genomic science within reach of most research laboratories, including those that study nonmodel organisms. However, most genome sequencing initiatives typically yield (highly) fragmented genome assemblies. Nevertheless, considerable relevant information related to genome structure and evolution is likely hidden in those nonassembled regions. 
Here, we investigated a diverse set of strategies to obtain gapless genome assemblies, using the genome of a typical ascomycete fungus as the template. Eventually, we were able to show that a combination of PacBio-generated long reads and optical mapping yields a gapless telomere-to-telomere genome assembly, allowing in-depth genome analyses to facilitate functional studies into an organism’s biology.

    Towards a New Science of a Clinical Data Intelligence

    In this paper we define Clinical Data Intelligence as the analysis of data generated in the clinical routine with the goal of improving patient care. We define a science of a Clinical Data Intelligence as a data analysis that permits the derivation of scientific, i.e., generalizable and reliable results. We argue that a science of a Clinical Data Intelligence is sensible in the context of a Big Data analysis, i.e., with data from many patients and with complete patient information. We discuss that Clinical Data Intelligence requires the joint efforts of knowledge engineering, information extraction (from textual and other unstructured data), and statistics and statistical machine learning. We describe some of our main results as conjectures and relate them to a recently funded research project involving two major German university hospitals. Comment: NIPS 2013 Workshop: Machine Learning for Clinical Data Analysis and Healthcare, 201

    Meiosis Drives Extraordinary Genome Plasticity in the Haploid Fungal Plant Pathogen Mycosphaerella graminicola

    Meiosis in the haploid plant-pathogenic fungus Mycosphaerella graminicola results in eight ascospores due to a mitotic division following the two meiotic divisions. The transient diploid phase allows for recombination among homologous chromosomes. However, some chromosomes of M. graminicola lack homologs and do not pair during meiosis. Because these chromosomes are not present universally in the genome of the organism they can be considered to be dispensable. To analyze the meiotic transmission of unequal chromosome numbers, two segregating populations were generated by crossing genetically unrelated parent isolates originating from Algeria and The Netherlands that had pathogenicity towards durum or bread wheat, respectively. Detailed genetic analyses of these progenies using high-density mapping (1793 DArT, 258 AFLP and 25 SSR markers) and graphical genotyping revealed that M. graminicola has up to eight dispensable chromosomes, the highest number reported in filamentous fungi. These chromosomes vary from 0.39 to 0.77 Mb in size, and represent up to 38% of the chromosomal complement. Chromosome numbers among progeny isolates varied widely, with some progeny missing up to three chromosomes, while other strains were disomic for one or more chromosomes. Between 15–20% of the progeny isolates lacked one or more chromosomes that were present in both parents. The two high-density maps showed no recombination of dispensable chromosomes and hence, their meiotic processing may require distributive disjunction, a phenomenon that is rarely observed in fungi. The maps also enabled the identification of individual twin isolates from a single ascus that shared the same missing or doubled chromosomes indicating that the chromosomal polymorphisms were mitotically stable and originated from nondisjunction during the second division and, less frequently, during the first division of fungal meiosis. 
High genome plasticity could be among the strategies enabling this versatile pathogen to quickly overcome adverse biotic and abiotic conditions in wheat fields.

    Whole Genome Profiling provides a robust framework for physical mapping and sequencing in the highly complex and repetitive wheat genome

    Background: Sequencing projects using a clone-by-clone approach require the availability of a robust physical map. The SNaPshot technology, based on pair-wise comparisons of restriction fragment sizes, has been used recently to build the first physical map of a wheat chromosome and to complete the maize physical map. However, restriction fragment sizes shared randomly between two non-overlapping BACs often lead to chimeric contigs and mis-assembled BACs in such large and repetitive genomes. Whole Genome Profiling (WGP™) was developed recently as a new sequence-based physical mapping technology and has the potential to limit this problem.
    Results: A subset of the wheat 3B chromosome BAC library covering 230 Mb was used to establish a WGP physical map and to compare it to a map obtained with the SNaPshot technology. We first adapted the WGP-based assembly methodology to cope with the complexity of the wheat genome. Then, the results showed that the WGP map covers the same length as the SNaPshot map but with 30% fewer contigs and, more importantly, with 3.5 times fewer mis-assembled BACs. Finally, we evaluated the benefit of integrating WGP tags in different sequence assemblies obtained after Roche/454 sequencing of BAC pools. We showed that while WGP tag integration improves assemblies performed with unpaired reads and with paired-end reads at low coverage, it does not significantly improve sequence assemblies performed at high coverage (25x) with paired-end reads.
    Conclusions: Our results demonstrate that, with a suitable assembly methodology, WGP builds more robust physical maps than the SNaPshot technology in wheat and that WGP can be adapted to any genome. Moreover, WGP tag integration in sequence assemblies improves low-quality assemblies. However, to achieve a high-quality draft sequence assembly, a sequencing depth of 25x paired-end reads is required, at which point WGP tag integration does not provide additional scaffolding value. Finally, we suggest that WGP tags can support the efficient sequencing of BAC pools by enabling reliable assignment of sequence scaffolds to their BAC of origin, a feature that is of great interest when using BAC pooling strategies to reduce the cost of sequencing large genomes.

    ENSO and Pacific decadal variability in the Community Climate System Model Version 4

    Author Posting. © American Meteorological Society, 2012. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Climate 25 (2012): 2622–2651, doi:10.1175/JCLI-D-11-00301.1.
    This study presents an overview of the El Niño–Southern Oscillation (ENSO) phenomenon and Pacific decadal variability (PDV) simulated in a multicentury preindustrial control integration of the NCAR Community Climate System Model version 4 (CCSM4) at nominal 1° latitude–longitude resolution. Several aspects of ENSO are improved in CCSM4 compared to its predecessor CCSM3, including the lengthened period (3–6 yr), the larger range of amplitude and frequency of events, and the longer duration of La Niña compared to El Niño. However, the overall magnitude of ENSO in CCSM4 is overestimated by ~30%. The simulated ENSO exhibits characteristics consistent with the delayed/recharge oscillator paradigm, including correspondence between the lengthened period and increased latitudinal width of the anomalous equatorial zonal wind stress. Global seasonal atmospheric teleconnections with accompanying impacts on precipitation and temperature are generally well simulated, although the wintertime deepening of the Aleutian low erroneously persists into spring. The vertical structure of the upper-ocean temperature response to ENSO in the north and south Pacific displays a realistic seasonal evolution, with notable asymmetries between warm and cold events. The model shows evidence of atmospheric circulation precursors over the North Pacific associated with the “seasonal footprinting mechanism,” similar to observations. Simulated PDV exhibits a significant spectral peak around 15 yr, with generally realistic spatial pattern and magnitude. However, PDV linkages between the tropics and extratropics are weaker than observed.
    M. Alexander, A. Capotondi, and J. Scott’s participation was supported by a grant from the NSF Climate and Large-scale Dynamics Program. Y.-O. Kwon gratefully acknowledges support from a WHOI Heyman fellowship and a grant from the NSF Climate and Large-scale Dynamics Program. The CESM project is supported by the National Science Foundation and the Office of Science (BER) of the U.S. Department of Energy. 2012-10-1

    Appropriation and subversion: pre-communist literacy, communist party saturation, and post-communist democratic outcomes

    Twenty-five years after the collapse of communism in Europe, few scholars disagree that the past continues to shape the democratic trajectories of postcommunist states. Precommunist education has featured prominently in this literature’s bundle of “good” legacies because it ostensibly helped foster resistance to communism. The authors propose a different causal mechanism—appropriation and subversion—that challenges the linearity of the above assumptions by analyzing the effects of precommunist literacy on patterns of Communist Party recruitment in Russia’s regions. Rather than regarding precommunist education as a source of latent resistance to communism, the authors highlight the Leninist regime’s successful appropriation of the more literate strata of the precommunist orders, in the process subverting the past democratic edge of the hitherto comparatively more developed areas. The linear regression analysis of author-assembled statistics from the first Russian imperial census of 1897 supports prior research: precommunist literacy has a strong positive association with postcommunist democratic outcomes. Nevertheless, in pursuing causal mediation analysis, the authors find, in addition, that the above effect is mediated by Communist Party saturation in Russia’s regions. Party functionaries were likely to be drawn from areas that had been comparatively more literate in tsarist times, and party saturation in turn had a dampening effect on the otherwise positive effects of precommunist education on postcommunist democracy.

    Accounting for Centennial Scale Variability when Detecting Changes in ENSO: a study of the Pliocene

    The El Niño Southern Oscillation (ENSO) is the dominant mode of interannual climate variability. However, climate models are inconsistent in future predictions of ENSO, and long-term variations in ENSO cannot be quantified from the short instrumental records available. Here we analyse ENSO behaviour in millennial-scale climate simulations of a warm climate of the past, the mid-Pliocene Warm Period (mPWP; ∼3.3–3.0 Ma). We consider centennial-scale variability in ENSO for both the mPWP and the preindustrial, and consider which changes between the two climates are detectable above this variability. We find that El Niño typically occurred 12% less frequently in the mPWP but with a 20% longer duration, and with stronger amplitude in precipitation and temperature. However, low-frequency variability in ENSO meant that Pliocene-preindustrial changes in El Niño temperature amplitude in the NINO3.4 region (5°N–5°S, 170°W–120°W) were not always detectable. The Pliocene-preindustrial El Niño temperature signal in the NINO4 region (5°N–5°S, 160°E–150°W) and the El Niño precipitation signal are usually larger than centennial-scale variations of El Niño amplitude, and provide consistent indications of ENSO amplitude change. The enhanced mPWP temperature signal in the NINO4 region is associated with an increase in Central Pacific El Niño events similar to those observed in recent decades and predicted for the future. This study highlights the importance of considering centennial-scale variability when comparing ENSO changes between two climate states. If centennial-scale variability in ENSO has not first been established, results suggesting changes in ENSO behaviour may not be robust.

    Diversity arrays technology (DArT) markers in apple for genetic linkage maps

    Diversity Arrays Technology (DArT) provides a high-throughput whole-genome genotyping platform for the detection and scoring of hundreds of polymorphic loci without any need for prior sequence information. The work presented here details the development and performance of a DArT genotyping array for apple. This is the first paper on DArT in horticultural trees. Genetic mapping of DArT markers in two mapping populations and their integration with other marker types showed that DArT is a powerful high-throughput method for obtaining accurate and reproducible marker data, despite the low cost per data point. This method appears to be suitable for aligning the genetic maps of different segregating populations. The standard complexity reduction method, based on the methylation-sensitive PstI restriction enzyme, resulted in a high frequency of markers, although there was 52–54% redundancy due to the repeated sampling of highly similar sequences. Sequencing of the marker clones showed that they are significantly enriched for low-copy, genic regions. The genome coverage using the standard method was 55–76%. For improved genome coverage, an alternative complexity reduction method was examined, which resulted in less redundancy and additional segregating markers. The DArT markers proved to be of high quality and were very suitable for genetic mapping at low cost for the apple, providing moderate genome coverage.