81 research outputs found

    Hyb-Seq for flowering plant systematics

    High-throughput DNA sequencing (HTS) presents great opportunities for plant systematics, yet genomic complexity needs to be reduced for HTS to be effectively applied. We highlight Hyb-Seq as a promising approach, especially in light of the recent development of probes enriching 353 low-copy nuclear genes from any flowering plant taxon.

    The ecology of palm genomes: repeat-associated genome size expansion is constrained by aridity

    Genome size varies 2400-fold across plants, influencing their evolution through changes in cell size and cell division rates which impact plants' environmental stress tolerance. Repetitive element expansion explains much genome size diversity, and the processes structuring repeat "communities" are analogous to those structuring ecological communities. However, which environmental stressors influence repeat community dynamics has not yet been examined from an ecological perspective. We measured genome size and leveraged climatic data for 91% of genera within the ecologically diverse palm family (Arecaceae). We then generated genomic repeat profiles for 141 palm species, and analysed repeats using phylogenetically informed linear models to explore relationships between repeat dynamics and environmental factors. We show that palm genome size and repeat "community" composition are best explained by aridity. Specifically, Ty3-gypsy and TIR elements were more abundant in palm species from wetter environments, which generally had larger genomes, suggesting amplification. By contrast, Ty1-copia and LINE elements were more abundant in drier environments. Our results suggest that water stress inhibits repeat expansion through selection on upper genome size limits. However, elements that may associate with stress-response genes (e.g. Ty1-copia) have amplified in arid-adapted palm species. Overall, we provide novel evidence of climate influencing the assembly of repeat "communities". JP was supported by a Ramón y Cajal Fellowship (RYC-2017-2274) funded by MCIN/AEI/10.13039/501100011033 and by ‘ESF Investing in your future’. SB was funded by a Garfield Weston Foundation postdoctoral fellowship. PN and JM were supported by the ELIXIR CZ Research Infrastructure Project (Czech Ministry of Education, Youth and Sports; grant no. LM2018131).

    Comparing Quantitative Methods for Analyzing Sediment DNA Records of Cyanobacteria in Experimental and Reference Lakes

    Sediment DNA (sedDNA) analyses are rapidly emerging as powerful tools for the reconstruction of environmental and evolutionary change. While there are an increasing number of studies using molecular genetic approaches to track changes over time, few studies have compared the coherence between quantitative polymerase chain reaction (PCR) methods and metabarcoding techniques. Primer specificity, bioinformatic analyses, and PCR inhibitors in sediments could affect the quantitative data obtained from these approaches. We compared the performance of droplet digital polymerase chain reaction (ddPCR) and high-throughput sequencing (HTS) for the quantification of target genes of cyanobacteria in lake sediments and tested whether the two techniques similarly reveal expected patterns through time. Absolute concentrations of cyanobacterial 16S rRNA genes were compared between ddPCR and HTS using dated sediment cores collected from two experimental (Lake 227, fertilized since 1969 and Lake 223, acidified from 1976 to 1983) and two reference lakes (Lakes 224 and 442) in the Experimental Lakes Area (ELA), Canada. Relative abundances of Microcystis 16S rRNA (MICR) genes were also compared between the two methods. Moderate to strong positive correlations were found between the molecular approaches among all four cores but results from ddPCR were more consistent with the known history of lake manipulations. A 100-fold increase in ddPCR estimates of cyanobacterial gene abundance beginning in ~1968 occurred in Lake 227, in keeping with experimental addition of nutrients and increase in planktonic cyanobacteria. In contrast, no significant rise in cyanobacterial abundance associated with lake fertilization was observed with HTS. Relative abundances of Microcystis between the two techniques showed moderate to strong levels of coherence in top intervals of the sediment cores. Both ddPCR and HTS approaches are suitable for sedDNA analysis, but studies aiming to quantify absolute abundances from complex environments should consider using ddPCR due to its high tolerance to PCR inhibitors.

    A roadmap for global synthesis of the plant tree of life

    Providing science and society with an integrated, up-to-date, high quality, open, reproducible and sustainable plant tree of life would be a huge service that is now coming within reach. However, synthesizing the growing body of DNA sequence data in the public domain and disseminating the trees to a diverse audience are often not straightforward due to numerous informatics barriers. While big synthetic plant phylogenies are being built, they remain static and become quickly outdated as new data are published and tree-building methods improve. Moreover, the body of existing phylogenetic evidence is hard to navigate and access for non-experts. We propose that our community of botanists, tree builders, and informaticians should converge on a modular framework for data integration and phylogenetic analysis, allowing easy collaboration, updating, data sourcing and flexible analyses. With support from major institutions, this pipeline should be re-run at regular intervals, storing trees and their metadata long-term. Providing the trees to a diverse global audience through user-friendly front ends and application development interfaces should also be a priority. Interactive interfaces could be used to solicit user feedback and thus improve data quality and to coordinate the generation of new data. We conclude by outlining a number of steps that we suggest the scientific community should take to achieve global phylogenetic synthesis.

    A universal probe set for targeted sequencing of 353 nuclear genes from any flowering plant designed using k-medoids clustering

    Sequencing of target-enriched libraries is an efficient and cost-effective method for obtaining DNA sequence data from hundreds of nuclear loci for phylogeny reconstruction. Much of the cost of developing targeted sequencing approaches is associated with the generation of preliminary data needed for the identification of orthologous loci for probe design. In plants, identifying orthologous loci has proven difficult due to a large number of whole-genome duplication events, especially in the angiosperms (flowering plants). We used multiple sequence alignments from over 600 angiosperms for 353 putatively single-copy protein-coding genes identified by the One Thousand Plant Transcriptomes Initiative to design a set of targeted sequencing probes for phylogenetic studies of any angiosperm group. To maximize the phylogenetic potential of the probes, while minimizing the cost of production, we introduce a k-medoids clustering approach to identify the minimum number of sequences necessary to represent each coding sequence in the final probe set. Using this method, 5–15 representative sequences were selected per orthologous locus, representing the sequence diversity of angiosperms more efficiently than if probes were designed using available sequenced genomes alone. To test our approximately 80,000 probes, we hybridized libraries from 42 species spanning all higher-order groups of angiosperms, with a focus on taxa not present in the sequence alignments used to design the probes. Out of a possible 353 coding sequences, we recovered an average of 283 per species and at least 100 in all species. Differences among taxa in sequence recovery could not be explained by relatedness to the representative taxa selected for probe design, suggesting that there is no phylogenetic bias in the probe set. Our probe set, which targeted 260 kbp of coding sequence, achieved a median recovery of 137 kbp per taxon in coding regions, a maximum recovery of 250 kbp, and an additional median of 212 kbp per taxon in flanking non-coding regions across all species. These results suggest that the Angiosperms353 probe set described here is effective for any group of flowering plants and would be useful for phylogenetic studies from the species level to higher-order groups, including the entire angiosperm clade itself.

    A nuclear phylogenomic study of the angiosperm order Myrtales, exploring the potential and limitations of the universal Angiosperms353 probe set

    PREMISE: To further advance the understanding of the species-rich, economically and ecologically important angiosperm order Myrtales in the rosid clade, comprising nine families, approximately 400 genera and almost 14,000 species occurring on all continents (except Antarctica), we tested the Angiosperms353 probe kit. METHODS: We combined high-throughput sequencing and target enrichment with the Angiosperms353 probe kit to evaluate a sample of 485 species across 305 genera (76% of all genera in the order). RESULTS: Results provide the most comprehensive phylogenetic hypothesis for the order to date. Relationships at all ranks, such as the relationship of the early-diverging families, often reflect previous studies, but gene conflict is evident, and relationships previously found to be uncertain often remain so. Technical considerations for processing HTS data are also discussed. CONCLUSIONS: High-throughput sequencing and the Angiosperms353 probe kit are powerful tools for phylogenomic analysis, but better understanding of the genetic data available is required to identify genes and gene trees that account for likely incomplete lineage sorting and/or hybridization events.

    Search for dark matter produced in association with bottom or top quarks in √s = 13 TeV pp collisions with the ATLAS detector

    A search for weakly interacting massive particle dark matter produced in association with bottom or top quarks is presented. Final states containing third-generation quarks and missing transverse momentum are considered. The analysis uses 36.1 fb−1 of proton–proton collision data recorded by the ATLAS experiment at √s = 13 TeV in 2015 and 2016. No significant excess of events above the estimated backgrounds is observed. The results are interpreted in the framework of simplified models of spin-0 dark-matter mediators. For colour-neutral spin-0 mediators produced in association with top quarks and decaying into a pair of dark-matter particles, mediator masses below 50 GeV are excluded assuming a dark-matter candidate mass of 1 GeV and unitary couplings. For scalar and pseudoscalar mediators produced in association with bottom quarks, the search sets limits on the production cross-section of 300 times the predicted rate for mediators with masses between 10 and 50 GeV and assuming a dark-matter mass of 1 GeV and unitary coupling. Constraints on colour-charged scalar simplified models are also presented. Assuming a dark-matter particle mass of 35 GeV, mediator particles with mass below 1.1 TeV are excluded for couplings yielding a dark-matter relic density consistent with measurements.

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Measurement of the inclusive isolated-photon cross section in pp collisions at √s = 13 TeV using 36 fb−1 of ATLAS data

    The differential cross section for isolated-photon production in pp collisions is measured at a centre-of-mass energy of 13 TeV with the ATLAS detector at the LHC using an integrated luminosity of 36.1 fb−1. The differential cross section is presented as a function of the photon transverse energy in different regions of photon pseudorapidity. The differential cross section as a function of the absolute value of the photon pseudorapidity is also presented in different regions of photon transverse energy. Next-to-leading-order QCD calculations from Jetphox and Sherpa as well as next-to-next-to-leading-order QCD calculations from Nnlojet are compared with the measurement, using several parameterisations of the proton parton distribution functions. The predictions provide a good description of the data within the experimental and theoretical uncertainties.