    Research And Application Of Parallel Computing Algorithms For Statistical Phylogenetic Inference

    Estimating the evolutionary history of organisms, phylogenetic inference, is a critical step in many analyses involving biological sequence data such as DNA. The likelihood calculations at the heart of the most effective methods for statistical phylogenetic analyses are extremely computationally intensive, and hence these analyses become a bottleneck in many studies. Recent progress in computer hardware, specifically the increasing pervasiveness of highly parallel, many-core processors, has created opportunities for new approaches to computationally intensive methods, such as those in phylogenetic inference. We have developed an open-source library, BEAGLE, which uses parallel computing methods to greatly accelerate statistical phylogenetic inference, for both maximum likelihood and Bayesian approaches. BEAGLE defines a uniform application programming interface and includes a collection of efficient implementations that use NVIDIA CUDA, OpenCL, and C++ threading frameworks for evaluating likelihoods under a wide variety of evolutionary models, on GPUs as well as on multi-core CPUs. BEAGLE employs a number of different parallelization techniques for phylogenetic inference, at different granularity levels and for distinct processor architectures. On CUDA and OpenCL devices, the library enables concurrent computation of site likelihoods, data subsets, and independent subtrees. The general design features of the library also provide a model for software development using parallel computing frameworks that is applicable to other domains. BEAGLE has been integrated with some of the leading programs in the field, such as MrBayes and BEAST, and is used in a diverse range of evolutionary studies, including those of disease-causing viruses. The library can provide significant performance gains, with the exact increase in performance depending on the specific properties of the data set, evolutionary model, and hardware. 
In general, nucleotide analyses are accelerated on the order of 10-fold and codon analyses on the order of 100-fold.
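
The abstract above notes that site likelihoods can be computed concurrently. The following is a minimal sketch of why that is possible, not BEAGLE's API or implementation: it assumes a Jukes-Cantor substitution model, a fixed three-taxon toy tree ((A,B),C) with equal branch lengths, and hypothetical function names chosen for illustration. Each alignment column is evaluated by a call that shares no state with the others, so the per-site loop could in principle be distributed across threads or GPU work items.

```python
import math

STATES = "ACGT"

def jc69_matrix(t):
    """Jukes-Cantor transition probabilities for branch length t
    (expected substitutions per site). Assumed model, for illustration."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def tip_partials(base):
    """Partial likelihood vector for an observed nucleotide at a tip."""
    return [1.0 if s == base else 0.0 for s in STATES]

def combine(left, t_left, right, t_right):
    """One pruning step: partial likelihoods at a parent node
    computed from the partials of its two children."""
    pl, pr = jc69_matrix(t_left), jc69_matrix(t_right)
    return [
        sum(pl[i][j] * left[j] for j in range(4))
        * sum(pr[i][j] * right[j] for j in range(4))
        for i in range(4)
    ]

def site_likelihood(a, b, c, t=0.1):
    """Likelihood of one alignment column on the toy tree ((A,B),C)."""
    inner = combine(tip_partials(a), t, tip_partials(b), t)
    root = combine(inner, t, tip_partials(c), t)
    # Uniform stationary frequencies under Jukes-Cantor.
    return sum(0.25 * x for x in root)

def log_likelihood(alignment, t=0.1):
    """Sum of per-site log-likelihoods; each column is independent,
    which is what makes site-level parallelism possible."""
    return sum(
        math.log(site_likelihood(a, b, c, t)) for a, b, c in zip(*alignment)
    )

if __name__ == "__main__":
    # Three hypothetical sequences of length four.
    print(log_likelihood(["ACGT", "ACGT", "ACGA"]))
```

Because the columns are independent, replacing the generator in `log_likelihood` with a parallel map would leave the result unchanged apart from floating-point summation order.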

    BEAGLE 3: Improved Performance, Scaling, and Usability for a High-Performance Computing Library for Statistical Phylogenetics

    © 2019 The Author(s). BEAGLE is a high-performance likelihood-calculation library for phylogenetic inference. The BEAGLE library defines a simple, but flexible, application programming interface (API), and includes a collection of efficient implementations for calculation under a variety of evolutionary models on different hardware devices. The library has been integrated into recent versions of popular phylogenetics software packages including BEAST and MrBayes and has been widely used across a diverse range of evolutionary studies. Here, we present BEAGLE 3 with new parallel implementations, increased performance for challenging data sets, improved scalability, and better usability. We have added new OpenCL and central processing unit-threaded implementations to the library, allowing the effective utilization of a wider range of modern hardware. Further, we have extended the API and library to support concurrent computation of independent partial likelihood arrays, for increased performance of nucleotide-model analyses with greater flexibility of data partitioning. For better scalability and usability, we have improved how phylogenetic software packages use BEAGLE in multi-GPU (graphics processing unit) and cluster environments, and introduced an automated method to select the fastest device given the data set, evolutionary model, and hardware. For application developers who wish to integrate the library, we also have developed an online tutorial. To evaluate the effect of the improvements, we ran a variety of benchmarks on state-of-the-art hardware. For a partitioned exemplar analysis, we observe run-time performance improvements as high as 5.9-fold over our previous GPU implementation. BEAGLE 3 is free, open-source software licensed under the Lesser GPL and available at https://beagle-dev.github.io

    Unicompartmental knee arthroplasty: A PearlDiver study evaluating complications rates, opioid use and utilization in the Medicare population

    A grant from the One-University Open Access Fund at the University of Kansas was used to defray the author's publication fees in this Open Access journal. The Open Access Fund, administered by librarians from the KU, KU Law, and KUMC libraries, is made possible by contributions from the offices of KU Provost, KU Vice Chancellor for Research & Graduate Studies, and KUMC Vice Chancellor for Research. For more information about the Open Access Fund, please see http://library.kumc.edu/authors-fund.xml. Purpose: Despite increased utilization of unicompartmental knee arthroplasty (UKA) for unicompartmental knee osteoarthritis, outcomes in Medicare patients are not well-reported. The purpose of this study is to analyze practice patterns and outcome differences between UKA and TKA in the Medicare population. It is hypothesized that UKA utilization will have increased over the course of the study period and that UKA will be associated with reduced opioid use and lower complication rates compared to TKA. Methods: Using PearlDiver, the Humana Claims dataset and the Medicare Standard Analytic File (SAF) were analyzed. Patients who underwent UKA and TKA were identified by CPT codes. Postoperative complications were identified by ICD-9/ICD-10 codes. Opioid use was analyzed by the number of days patients were prescribed opioids postoperatively. Survivorship was defined as conversion to TKA. Results: In the Humana dataset, 7,808 UKA and 150,680 TKA patients were identified. 8-year survivorship was 87.7% (95% CI [0.861,0.894]). Postoperative opioid use was significantly higher after TKA (186.1 days) compared to UKA (144.7 days) (p  80 years old and lowest in patients < 70 years old. In both datasets, postoperative complication rates were higher in TKA patients compared to UKA patients in nearly all categories. 
Conclusions: UKA represents an increasingly utilized treatment for osteoarthritis in the Medicare population and may be comparatively advantageous to TKA due to reduced opioid use and complication rates after surgery.

    Diesel particulate matter emission factors and air quality implications from in–service rail in Washington State, USA

    We sought to evaluate the air quality implications of rail traffic at two sites in Washington State. Our goals were to quantify the exposure to diesel particulate matter (DPM) and airborne coal dust from current trains for residents living near the rail lines and to measure the DPM and black carbon emission factors (EFs). We chose two sites in Washington State, one at a residence along the rail lines in the city of Seattle and one near the town of Lyle in the Columbia River Gorge (CRG). At each site, we made measurements of size–segregated particulate matter (PM1, PM2.5 and PM10), CO2 and meteorology, and used a motion–activated camera to capture video of each train for identification. We measured an average DPM EF of 0.94 g/kg diesel fuel, with an uncertainty of 20%, based on PM1 and CO2 measurements from more than 450 diesel trains. We found no significant difference in the average DPM EFs measured at the two sites. Open coal trains have a significantly higher concentration of particles greater than 1 μm diameter, likely coal dust. Measurements of black carbon (BC) at the CRG site show a strong correlation with PM1 and give an average BC/DPM ratio of 52% from diesel rail emissions. Our measurements of PM2.5 show that living close to the rail lines significantly increases PM2.5 exposure. For the one month of measurements at the Seattle site, the average PM2.5 concentration was 6.8 μg/m3 higher near the rail lines compared to the average from several background locations. Because the excess PM2.5 exposure for residents living near the rail lines is likely to be linearly related to the diesel rail traffic density, a 50% increase in rail traffic may put these residents over the new U.S. National Ambient Air Quality Standards, an annual average of 12 μg/m3.
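
The closing argument of the abstract above rests on a linear scaling of the rail-related PM2.5 excess with traffic density. The sketch below works through that arithmetic under stated assumptions: the 6.8 μg/m3 excess and the 12 μg/m3 annual standard come from the abstract, while the background concentration is a hypothetical input (the abstract does not report it) and the function name is illustrative. It also treats the one-month excess as if it were representative of an annual average, which the abstract does not establish.

```python
NAAQS_ANNUAL_PM25 = 12.0   # ug/m3, annual standard quoted in the abstract
MEASURED_EXCESS = 6.8      # ug/m3, near-rail excess over background (abstract)

def projected_pm25(background, traffic_increase, excess=MEASURED_EXCESS):
    """Near-rail PM2.5 if the rail-related excess scales linearly
    with traffic density.

    background       -- hypothetical background concentration, ug/m3
    traffic_increase -- fractional change in rail traffic (0.5 = +50%)
    """
    return background + excess * (1.0 + traffic_increase)

if __name__ == "__main__":
    background = 3.0  # hypothetical value, NOT taken from the study
    now = projected_pm25(background, 0.0)
    later = projected_pm25(background, 0.5)
    print(f"current: {now:.1f} ug/m3, after +50% traffic: {later:.1f} ug/m3")
    print("exceeds annual standard:", later > NAAQS_ANNUAL_PM25)
```

Under linear scaling, a 50% traffic increase raises the excess itself from 6.8 to about 10.2 μg/m3, so whether the standard is exceeded depends on the background level at the site.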

    Impact of Scottish smoke-free legislation on smoking quit attempts and prevalence

    Objectives: In Scotland, legislation was implemented in March 2006 prohibiting smoking in all wholly or partially enclosed public spaces. We investigated the impact on attempts to quit smoking and smoking prevalence. Methods: We performed time series models using Box-Jenkins autoregressive integrated moving averages (ARIMA) on monthly data on the gross ingredient cost of all nicotine replacement therapy (NRT) prescribed in Scotland in 2003–2009, and quarterly data on self-reported smoking prevalence between January 1999 and September 2010 from the Scottish Household Survey. Results: NRT prescription costs were significantly higher than expected over the three months prior to implementation of the legislation. Prescription costs peaked at £1.3 million in March 2006; £292,005.9 (95% CI £260,402.3, £323,609, p<0.001) higher than the monthly norm. Following implementation of the legislation, costs fell exponentially by around 26% per month (95% CI 17%, 35%, p<0.001). Twelve months following implementation, the costs were not significantly different to monthly norms. Smoking prevalence fell by 8.0% overall, from 31.3% in January 1999 to 23.7% in July–September 2010. In the quarter prior to implementation of the legislation, smoking prevalence fell by 1.7% (95% CI 2.4%, 1.0%, p<0.001) more than expected from the underlying trend. Conclusions: Quit attempts increased in the three months leading up to Scotland's smoke-free legislation, resulting in a fall in smoking prevalence. However, neither has been sustained suggesting the need for additional tobacco control measures and ongoing support.

    BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. 
An example client program is available as public domain software. This work was supported by the National Science Foundation [grant numbers DBI-0755048, DEB-0732920, DEB-1036448, DMS-0931642, EF-0331495, EF-0905606, EF-0949453]; the National Institutes of Health [grant numbers R01-HG006139, R01-GM037841, R01-GM078985, R01-GM086887, R01-NS063897]; the Biotechnology and Biological Sciences Research Council [grant number BB/H011285/1]; the Wellcome Trust [grant number WT092807MA]; and Google Summer of Code.

    Assessing a model of Pacific Northwest harmful algal bloom transport as a decision-support tool

    In the Pacific Northwest, blooms of the diatom Pseudo-nitzschia (PN) sometimes produce domoic acid, a neurotoxin that causes amnesic shellfish poisoning, leading to a Harmful Algal Bloom (HAB) event. The Pacific Northwest (PNW) HAB Bulletin project, a partnership between academic, government, and tribal stakeholders, uses a combination of beach and offshore monitoring data and ocean forecast modeling to better understand the formation, evolution, and transport of HABs in this region. This project produces periodic Bulletins to inform local stakeholders of current and forecasted conditions. The goal of this study was to help improve how the forecast model is used in the Bulletin's preparation through a retrospective particle-tracking experiment. Using past observations of beach PN cell counts, events were identified that likely originated in the Juan de Fuca eddy, a known PN hotspot, and then particle tracks were used in the model to simulate these events. A variety of “beaching definitions” were tested, based on both water depth and distance offshore, to define when a particle in the model was close enough to the coast that it was likely to correspond to cells appearing in the intertidal zone and in shellfish diets, as well as a variety of observed PN cell thresholds to determine what cell count should be used to describe an event that would warrant further action. The skill of these criteria was assessed by determining the fraction of true positives, true negatives, false positives, and false negatives within the model in comparison with observations, as well as a variety of derived model performance metrics. This analysis suggested that for our stakeholders’ purposes, the most useful beaching definition is the 30 m isobath and the most useful PN cell threshold for coincident field-based sample PN density estimates is 10,000 PN cells/L. 
Lastly, the performance of a medium-resolution (1.5 km horizontal resolution) version of the model was compared with that of a high-resolution (0.5 km horizontal resolution) version, the latter currently used in forecasting for the PNW HAB Bulletin project. This analysis includes a direct comparison of the two model resolutions for one overlapping year (2017). These results suggested that a narrower, more realistic beaching definition is most useful in a high-resolution model, while a wider beaching definition is more appropriate in a lower resolution model like the medium-resolution version used in this analysis. Overall, this analysis demonstrated the importance of incorporating stakeholder needs into the statistical approach in order to generate the most effective decision-support information from oceanographic modeling.
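
The skill assessment described in the abstract above counts true and false positives and negatives for model predictions against observations. The sketch below shows one way such a tally could look; it is not the study's code. The 10,000 cells/L threshold is taken from the abstract, but the choice of `>=` at the threshold, the three derived metrics, the function name, and the sample data are all assumptions made for illustration.

```python
def skill(model_beached, observed_cells_per_l, threshold=10_000):
    """Compare model beaching predictions with observed PN cell counts.

    model_beached        -- list of bools, True if the model predicts beaching
    observed_cells_per_l -- list of observed PN cell densities (cells/L)
    threshold            -- count at or above which an observation is an event
                            (treating equality as an event is an assumption)
    """
    observed = [c >= threshold for c in observed_cells_per_l]
    pairs = list(zip(model_beached, observed))
    tp = sum(m and o for m, o in pairs)
    fp = sum(m and not o for m, o in pairs)
    fn = sum((not m) and o for m, o in pairs)
    tn = sum((not m) and (not o) for m, o in pairs)
    n = tp + fp + fn + tn
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        # Derived metrics; None where the denominator is zero.
        "accuracy": (tp + tn) / n if n else None,
        "hit_rate": tp / (tp + fn) if (tp + fn) else None,
        "false_alarm_ratio": fp / (tp + fp) if (tp + fp) else None,
    }

if __name__ == "__main__":
    # Hypothetical data: five sampling dates at one beach.
    model = [True, True, False, False, True]
    cells = [25_000, 4_000, 12_000, 500, 10_000]
    print(skill(model, cells))
```

Re-running the same tally with different `threshold` values, or with predictions generated under different beaching definitions, gives the kind of side-by-side comparison the abstract describes.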

    MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space

    Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site dN/dS ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.