99 research outputs found

    AdapterRemoval v2: rapid adapter trimming, identification, and read merging

    BACKGROUND: As high-throughput sequencing platforms produce longer and longer reads, sequences generated from short inserts, such as those obtained from fossil and degraded material, are increasingly expected to contain adapter sequences. Efficient adapter trimming algorithms are also needed to process the growing amount of data generated per sequencing run. FINDINGS: We introduce AdapterRemoval v2, a major revision of AdapterRemoval v1, which introduces (i) striking improvements in throughput, through the use of single instruction, multiple data (SIMD; SSE1 and SSE2) instructions and multi-threading support, (ii) the ability to handle datasets containing reads or read-pairs with different adapters or adapter pairs, (iii) simultaneous demultiplexing and adapter trimming, (iv) the ability to reconstruct adapter sequences from paired-end reads for poorly documented data sets, and (v) native gzip and bzip2 support. CONCLUSIONS: We show that AdapterRemoval v2 compares favorably with existing tools, while offering superior throughput to most alternatives examined here, both for single and multi-threaded operations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13104-016-1900-2) contains supplementary material, which is available to authorized users.

    Fast, accurate and automatic ancient nucleosome and methylation maps with epiPALEOMIX

    The first epigenomes from archaic hominins (AH) and ancient anatomically modern humans (AMH) have recently been characterized, based, however, on a limited number of samples. The extent to which ancient genome-wide epigenetic landscapes can be reconstructed thus remains contentious. Here, we present epiPALEOMIX, an open-source and user-friendly pipeline that exploits post-mortem DNA degradation patterns to reconstruct ancient methylomes and nucleosome maps from shotgun and/or capture-enrichment data. Applying epiPALEOMIX to the sequence data underlying 35 ancient genomes including AMH, AH, equids and aurochs, we investigate the temporal, geographical and preservation range of ancient epigenetic signatures. We first assess the quality of inferred ancient epigenetic signatures within well-characterized genomic regions. We find that tissue-specific methylation signatures can be obtained across a wider range of DNA preparation types than previously thought, including when no particular experimental procedures have been used to remove deaminated cytosines prior to sequencing. We identify a large subset of samples for which DNA associated with nucleosomes is protected from post-mortem degradation, and nucleosome positioning patterns can be reconstructed. Finally, we describe parameters and conditions such as DNA damage levels and sequencing depth that limit the preservation of epigenetic signatures in ancient samples. When such conditions are met, we propose that epigenetic profiles of CTCF binding regions can be used to help data authentication. Our work, including epiPALEOMIX, opens the way for further investigations of ancient epigenomes through time, especially those aimed at tracking possible epigenetic changes during major evolutionary, environmental, socioeconomic, and cultural shifts.

    Genomic characterization of a South American Phytophthora hybrid mandates reassessment of the geographic origins of Phytophthora infestans

    As the oomycete pathogen causing potato late blight disease, Phytophthora infestans triggered the famous 19th-century Irish potato famine and remains the leading cause of global commercial potato crop destruction. But the geographic origin of the genotype that caused this devastating initial outbreak remains disputed, as does the New World center of origin of the species itself. Both Mexico and South America have been proposed, generating considerable controversy. Here, we readdress the pathogen’s origins using a genomic data set encompassing 71 globally sourced modern and historical samples of P. infestans and the hybrid species P. andina, a close relative known only from the Andean highlands. Previous studies have suggested that the nuclear DNA lineage behind the initial outbreaks in Europe in 1845 is now extinct. Analysis of P. andina’s phased haplotypes recovered eight haploid genome sequences, four of which represent a previously unknown basal lineage of P. infestans closely related to the famine-era lineage. Our analyses further reveal that clonal lineages of both P. andina and historical P. infestans diverged earlier than modern Mexican lineages, casting doubt on recent claims of a Mexican center of origin. Finally, we use haplotype phasing to demonstrate that basal branches of the clade comprising Mexican samples are occupied by clonal isolates collected from wild Solanum hosts, suggesting that modern Mexican P. infestans diversified on Solanum tuberosum after a host jump from a wild species and that the origins of P. infestans are more complex than was previously thought.

    Identification of genetic variants associated with a wide spectrum of phenotypes clinically diagnosed as Sanfilippo and Morquio syndromes using whole genome sequencing

    Mucopolysaccharidoses (MPSs) are inherited lysosomal storage disorders (LSDs). MPSs are caused by excessive accumulation of mucopolysaccharides due to missing or deficient enzymes required for the degradation of specific macromolecules. MPS I-IV, MPS VI, MPS VII, and MPS IX are sub-types of mucopolysaccharidoses. Among these, MPS III (also known as Sanfilippo) and MPS IV (Morquio) syndromes are lethal and prevalent sub-types. This study aimed to identify causal genetic variants in cases of MPS III and MPS IV and characterize genotype-phenotype relations in Pakistan. We performed clinical, biochemical and genetic analysis using Whole Genome Sequencing (WGS) in 14 Pakistani families affected with MPS III or MPS IV. Patients were classified as MPS III based on a history of aggressive behavior, dementia and clear cornea, and as MPS IV based on short trunk, short stature and a reversed ratio of upper segment to lower segment with a short upper segment. Data analysis and variant selections were made based on segregation analysis, examination of known MPS III and MPS IV genes, gene function, gene expression, the pathogenicity of variants based on ACMG guidelines and in silico analysis. In total, 58 individuals from 14 families were included in the present study. Six families were clinically diagnosed with MPS III and eight families with MPS IV. WGS revealed variants in MPS-associated genes including NAGLU, SGSH, GALNS and GNPTG, as well as in the genes VWA3B, BTD, and GNPTG, which have not previously been associated with MPS. One family had causal variants in both GALNS and BTD. Accurate and early diagnosis of MPS in children represents a helpful step for designing therapeutic strategies to protect different organs from permanent damage. In addition, pre-natal screening and identification of genetic etiology will facilitate genetic counselling of the affected families. Identification of novel causal MPS genes might help identify new targeted therapies to treat LSDs.

    Ancient genomics

    The past decade has witnessed a revolution in ancient DNA (aDNA) research. Although the field's focus was previously limited to mitochondrial DNA and a few nuclear markers, whole genome sequences from the deep past can now be retrieved. This breakthrough is tightly connected to the massive sequence throughput of next generation sequencing platforms and the ability to target short and degraded DNA molecules. Many ancient specimens previously unsuitable for DNA analyses because of extensive degradation can now successfully be used as source materials. Additionally, the analytical power obtained by increasing the number of sequence reads to billions effectively means that contamination issues that have haunted aDNA research for decades, particularly in human studies, can now be efficiently and confidently quantified. At present, whole genomes have been sequenced from ancient anatomically modern humans, archaic hominins, ancient pathogens and megafaunal species. Those have revealed important functional and phenotypic information, as well as unexpected adaptation, migration and admixture patterns. As such, the field of aDNA has entered the new era of genomics and has provided valuable information when testing specific hypotheses related to the past.

    Early divergent strains of Yersinia pestis in Eurasia 5,000 years ago.

    The bacterium Yersinia pestis is the etiological agent of plague and has caused human pandemics with millions of deaths in historic times. How and when it originated remains contentious. Here, we report the oldest direct evidence of Yersinia pestis identified by ancient DNA in human teeth from Asia and Europe dating from 2,800 to 5,000 years ago. By sequencing the genomes, we find that these ancient plague strains are basal to all known Yersinia pestis. We find the origins of the Yersinia pestis lineage to be at least two times older than previous estimates. We also identify a temporal sequence of genetic changes that lead to increased virulence and the emergence of the bubonic plague. Our results show that plague infection was endemic in the human populations of Eurasia at least 3,000 years before any historical recordings of pandemics.

    Comparative genomics reveals insights into avian genome evolution and adaptation

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits.

    Dire wolves were the last of an ancient New World canid lineage

    Dire wolves are considered to be one of the most common and widespread large carnivores in Pleistocene America, yet relatively little is known about their evolution or extinction. Here, to reconstruct the evolutionary history of dire wolves, we sequenced five genomes from sub-fossil remains dating from 13,000 to more than 50,000 years ago. Our results indicate that although they were similar morphologically to the extant grey wolf, dire wolves were a highly divergent lineage that split from living canids around 5.7 million years ago. In contrast to numerous examples of hybridization across Canidae, there is no evidence for gene flow between dire wolves and either North American grey wolves or coyotes. This suggests that dire wolves evolved in isolation from the Pleistocene ancestors of these species. Our results also support an early New World origin of dire wolves, while the ancestors of grey wolves, coyotes and dholes evolved in Eurasia and colonized North America only relatively recently.