
    Selecting optimal partitioning schemes for phylogenomic datasets

    BACKGROUND Partitioning involves estimating independent models of molecular evolution for different subsets of sites in a sequence alignment, and has been shown to improve phylogenetic inference. Current methods for estimating best-fit partitioning schemes, however, are only computationally feasible with datasets of fewer than 100 loci. This is a problem because datasets with thousands of loci are increasingly common in phylogenetics. METHODS We develop two novel methods for estimating best-fit partitioning schemes on large phylogenomic datasets: strict and relaxed hierarchical clustering. These methods use information from the underlying data to cluster together similar subsets of sites in an alignment, and build on clustering approaches that have been proposed elsewhere. RESULTS We compare the performance of our methods to each other, and to existing methods for selecting partitioning schemes. We demonstrate that while strict hierarchical clustering has the best computational efficiency on very large datasets, relaxed hierarchical clustering provides scalable efficiency and returns dramatically better partitioning schemes as assessed by common criteria such as AICc and BIC scores. CONCLUSIONS These two methods provide the best current approaches to inferring partitioning schemes for very large datasets. We provide free open-source implementations of the methods in the PartitionFinder software. We hope that the use of these methods will help to improve the inferences made from large phylogenomic datasets. RL would like to acknowledge support from a National Evolutionary Synthesis Centre (NESCent) short-term visitor grant. We would also like to acknowledge support from NESCent to pay for open-access publishing.
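
    The abstract above ranks partitioning schemes by AICc and BIC scores. As a rough illustration of how such a comparison works, the sketch below computes the standard textbook forms of AIC, AICc and BIC from a log-likelihood, a parameter count and a number of alignment sites. It is not PartitionFinder's code: the function name, the scheme names and all numbers are hypothetical, and using the number of sites as the sample size is an assumption.

```python
import math


def information_criteria(log_likelihood, num_params, num_sites):
    """Return AIC, AICc and BIC for one candidate scheme (lower is better)."""
    k, n = num_params, num_sites
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless num_sites > num_params + 1")
    aic = 2 * k - 2 * log_likelihood
    # Small-sample correction added to AIC.
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)
    bic = k * math.log(n) - 2 * log_likelihood
    return {"AIC": aic, "AICc": aicc, "BIC": bic}


# Hypothetical inputs: (log-likelihood, number of free parameters) per scheme.
schemes = {
    "one_partition": (-10500.0, 10),
    "three_partitions": (-10400.0, 30),
}
n_sites = 2000  # hypothetical alignment length, used as the sample size

scores = {
    name: information_criteria(lnl, k, n_sites)
    for name, (lnl, k) in schemes.items()
}

# The preferred scheme is the one with the lowest score on the chosen criterion.
best_by_bic = min(scores, key=lambda name: scores[name]["BIC"])
print(best_by_bic, scores[best_by_bic])
```

    With these made-up numbers the richer scheme is preferred because its gain in likelihood outweighs the penalty for 20 extra parameters; with a smaller gain the penalty term would favour the simpler scheme.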

    High marker density GWAS provides novel insights into the genomic architecture of terpene oil yield in Eucalyptus

    Terpenoid-based essential oils are economically important commodities, yet beyond their biosynthetic pathways, little is known about the genetic architecture of terpene oil yield from plants. Transport, storage, evaporative loss, transcriptional regulation and precursor competition may be important contributors to this complex trait. Here, we associate 2.39 M single nucleotide polymorphisms derived from shallow whole genome sequencing of 468 Eucalyptus polybractea individuals with 12 traits related to overall terpene yield: eight direct measures of terpene concentration and four biomass‐related traits. Our results show that, in addition to terpene biosynthesis, the development of secretory cavities (where terpenes are both synthesised and stored) and the transport of terpenes were important components of terpene yield. For sesquiterpene concentrations, the availability of precursors in the cytosol was important. Candidate terpene synthase genes for the production of 1,8‐cineole, α‐pinene and β‐pinene (which together made up more than 80% of the total terpenes) were functionally characterised as a 1,8‐cineole synthase and a β / α‐pinene synthase. Our results provide novel insights into the genomic architecture of terpene yield, and we provide candidate genes for breeding or engineering of crops for biofuels or the production of industrially valuable terpenes.

    A phylogenomic approach reveals a low somatic mutation rate in a long-lived plant.

    Somatic mutations can have important effects on the life history, ecology, and evolution of plants, but the rate at which they accumulate is poorly understood and difficult to measure directly. Here, we develop a method to measure somatic mutations in individual plants and use it to estimate the somatic mutation rate in a large, long-lived, phenotypically mosaic Eucalyptus melliodora tree. Despite being 100 times larger than Arabidopsis, this tree has a per-generation mutation rate only ten times greater, which suggests that this species may have evolved mechanisms to reduce the mutation rate per unit of growth. This adds to a growing body of evidence that illuminates the correlated evolutionary shifts in mutation rate and life history in plants.

    Evaluation of methods and marker systems in genomic selection of oil palm (Elaeis guineensis Jacq.)

    Background Genomic selection (GS) uses genome-wide markers in an attempt to accelerate genetic gain in breeding programs of both animals and plants. This approach is particularly useful for perennial crops such as oil palm, which have long breeding cycles, and for which the optimal method for GS is still under debate. In this study, we evaluated the effect of different marker systems and modeling methods for implementing GS in an introgressed dura family of 112 individuals derived from a Deli dura x Nigerian dura (Deli x Nigerian) cross. This family is an important breeding source for developing new mother palms for superior oil yield and bunch characters. The traits of interest selected for this study were fruit-to-bunch (F/B), shell-to-fruit (S/F), kernel-to-fruit (K/F), mesocarp-to-fruit (M/F), oil per palm (O/P) and oil-to-dry mesocarp (O/DM). The marker systems evaluated were simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). RR-BLUP, Bayesian A, B, Cπ, LASSO, Ridge Regression and two machine learning methods (SVM and Random Forest) were used to evaluate GS accuracy of the traits. Results The kinship coefficient between individuals in this family ranged from 0.35 to 0.62. S/F and O/DM had the highest genomic heritability, whereas F/B and O/P had the lowest. The accuracies using 135 SSRs were low, with accuracies of the traits around 0.20. The average accuracy of machine learning methods was 0.24, as compared to 0.20 achieved by other methods. The trait with the highest mean accuracy was F/B (0.28), while the lowest were both M/F and O/P (0.18). By using whole genomic SNPs, the accuracies for all traits, especially for O/DM (0.43), S/F (0.39) and M/F (0.30), were improved. The average accuracy of machine learning methods was 0.32, compared to 0.31 achieved by other methods. Conclusion Due to high genomic resolution, the use of whole-genome SNPs improved the efficiency of GS dramatically for oil palm and is recommended for dura breeding programs. Machine learning slightly outperformed other methods, but required parameter optimization for GS implementation.

    City of Hitchcock Comprehensive Plan 2020-2040

    Hitchcock is a small town located in Galveston County (Figure 1.1), nestled on the Texas Gulf Coast. It lies about 40 miles south-east of Houston. The boundaries of the city enclose 60.46 sq. miles of land and 31.64 sq. miles of water, at an elevation just 16 feet above sea level. Hitchcock has more undeveloped land (~90% of total area) than the county combined. Its strategic location gives it a driving force of opportunities in the Houston-Galveston Region. The guiding principles for this planning process were Hitchcock’s vision statement and its corresponding goals, which were crafted by the task force. The goals focus on factors of growth and development including public participation, development considerations, transportation, community facilities, economic development, parks, and housing and social vulnerability. Texas Target Communitie

    The Effects of partitioning on phylogenetic inference

    Partitioning is a commonly used method in phylogenetics that aims to accommodate variation in substitution patterns among sites. Despite its popularity, there have been few systematic studies of its effects on phylogenetic inference, and there have been no studies that compare the effects of different approaches to partitioning across many empirical data sets. In this study, we applied four commonly used approaches to partitioning to each of 34 empirical data sets, and then compared the resulting tree topologies, branch-lengths, and bootstrap support estimated using each approach. We find that the choice of partitioning scheme often affects tree topology, particularly when partitioning is omitted. Most notably, we find occasional instances where the use of a suboptimal partitioning scheme produces highly supported but incorrect nodes in the tree. Branch-lengths and bootstrap support are also affected by the choice of partitioning scheme, sometimes dramatically so. We discuss the reasons for these effects and make some suggestions for best practice. 17 page(s)