186 research outputs found
Choosing among Partition Models in Bayesian Phylogenetics
Bayesian phylogenetic analyses often depend on Bayes factors (BFs) to determine the optimal way to partition the data. The marginal likelihoods used to compute BFs, in turn, are most commonly estimated using the harmonic mean (HM) method, which has been shown to be inaccurate. We describe a new more accurate method for estimating the marginal likelihood of a model and compare it with the HM method on both simulated and empirical data. The new method generalizes our previously described stepping-stone (SS) approach by making use of a reference distribution parameterized using samples from the posterior distribution. This avoids one challenging aspect of the original SS method, namely the need to sample from distributions that are close (in the Kullback–Leibler sense) to the prior. We specifically address the choice of partition models and find that using the HM method can lead to a strong preference for an overpartitioned model. In contrast to the HM method and the original SS method, we show using simulated data that the generalized SS method is strikingly more precise (repeatable BF values of the same data and partition model) and yields BF values that are much more reasonable than those produced by the HM method. Comparisons of HM and generalized SS methods on an empirical data set demonstrate that the generalized SS method tends to choose simpler partition schemes that are more in line with expectation based on inferred patterns of molecular evolution. The generalized SS method shares with thermodynamic integration the need to sample from a series of distributions in addition to the posterior. Such dedicated path-based Markov chain Monte Carlo analyses appear to be a cost of estimating marginal likelihoods accurately
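As a rough illustration of the two estimator families this abstract compares, the sketch below contrasts a harmonic mean (HM) estimate with a stepping-stone (SS) estimate on a toy model whose marginal likelihood is known exactly. Everything in it is an assumption made for illustration and is not taken from the paper: the model is a binomial likelihood with a Uniform(0, 1) prior (exact marginal likelihood 1/(n + 1)), the power posteriors are sampled directly rather than by MCMC, the cubic spacing of the powers is arbitrary, and the function names are hypothetical. It follows the original SS idea of bridging from the prior to the posterior; it does not implement the generalized SS method with a posterior-based reference distribution.

```python
import math
import random


def log_likelihood(theta, k, n):
    """Binomial log-likelihood of k successes in n trials (toy model, not from the paper)."""
    theta = min(max(theta, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    log_binom = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_binom + k * math.log(theta) + (n - k) * math.log(1.0 - theta)


def log_mean_exp(values):
    """Numerically stable log of the arithmetic mean of exp(values)."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values) / len(values))


def sample_power_posterior(beta, k, n, size, rng):
    """Draw from likelihood**beta * prior, which is Beta(beta*k + 1, beta*(n-k) + 1)
    under the assumed Uniform(0, 1) prior."""
    return [rng.betavariate(beta * k + 1.0, beta * (n - k) + 1.0) for _ in range(size)]


def stepping_stone_log_ml(k, n, n_steps=20, n_samples=2000, seed=1):
    """Stepping-stone estimate of the log marginal likelihood (prior-to-posterior path)."""
    rng = random.Random(seed)
    # Powers from 0 (prior) to 1 (posterior); cubic spacing packs steps near the prior.
    betas = [(i / n_steps) ** 3 for i in range(n_steps + 1)]
    total = 0.0
    for b_prev, b_next in zip(betas[:-1], betas[1:]):
        thetas = sample_power_posterior(b_prev, k, n, n_samples, rng)
        # Each ratio is the mean of likelihood**(b_next - b_prev) under the b_prev power posterior.
        total += log_mean_exp([(b_next - b_prev) * log_likelihood(t, k, n) for t in thetas])
    return total


def harmonic_mean_log_ml(k, n, n_samples=2000, seed=1):
    """Harmonic mean estimate of the log marginal likelihood from posterior samples only."""
    rng = random.Random(seed)
    thetas = sample_power_posterior(1.0, k, n, n_samples, rng)
    return -log_mean_exp([-log_likelihood(t, k, n) for t in thetas])


def exact_log_ml(n):
    """Exact log marginal likelihood for the binomial model with a uniform prior."""
    return -math.log(n + 1)


if __name__ == "__main__":
    k, n = 7, 20
    print("exact:", exact_log_ml(n))
    print("stepping-stone:", stepping_stone_log_ml(k, n))
    print("harmonic mean:", harmonic_mean_log_ml(k, n))
```

Rerunning both estimators with different seeds gives a feel for the repeatability contrast the abstract describes. In this toy setting the HM estimator has infinite variance, whereas each SS ratio averages a quantity that varies little between samples.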
BEAGLE 3: Improved Performance, Scaling, and Usability for a High-Performance Computing Library for Statistical Phylogenetics
© 2019 The Author(s). BEAGLE is a high-performance likelihood-calculation library for phylogenetic inference. The BEAGLE library defines a simple, but flexible, application programming interface (API), and includes a collection of efficient implementations for calculation under a variety of evolutionary models on different hardware devices. The library has been integrated into recent versions of popular phylogenetics software packages including BEAST and MrBayes and has been widely used across a diverse range of evolutionary studies. Here, we present BEAGLE 3 with new parallel implementations, increased performance for challenging data sets, improved scalability, and better usability. We have added new OpenCL and central processing unit-threaded implementations to the library, allowing the effective utilization of a wider range of modern hardware. Further, we have extended the API and library to support concurrent computation of independent partial likelihood arrays, for increased performance of nucleotide-model analyses with greater flexibility of data partitioning. For better scalability and usability, we have improved how phylogenetic software packages use BEAGLE in multi-GPU (graphics processing unit) and cluster environments, and introduced an automated method to select the fastest device given the data set, evolutionary model, and hardware. For application developers who wish to integrate the library, we also have developed an online tutorial. To evaluate the effect of the improvements, we ran a variety of benchmarks on state-of-the-art hardware. For a partitioned exemplar analysis, we observe run-time performance improvements as high as 5.9-fold over our previous GPU implementation. BEAGLE 3 is free, open-source software licensed under the Lesser GPL and available at https://beagle-dev.github.io
BEAGLE: An Application Programming Interface and High-Performance Computing Library for Statistical Phylogenetics
Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software
Assessing the role of live poultry trade in community-structured transmission of avian influenza in China
The live poultry trade is thought to play an important role in the spread and maintenance of highly pathogenic avian influenza A viruses (HP AIVs) in Asia. Despite an abundance of small-scale observational studies, the role of the poultry trade in disseminating AIV over large geographic areas is still unclear, especially for developing countries with complex poultry production systems. Here we combine virus genomes and reconstructed poultry transportation data to measure and compare the spatial spread in China of three key subtypes of AIV: H5N1, H7N9, and H5N6. Although it is difficult to disentangle the contribution of confounding factors, such as bird migration and spatial distance, we find evidence that the dissemination of these subtypes among domestic poultry is geographically continuous and likely associated with the intensity of the live poultry trade in China. Using two independent data sources and network analysis methods, we report a regional-scale community structure in China that might explain the spread of AIV subtypes in the country. The identification of this structure has the potential to inform more targeted strategies for the prevention and control of AIV in China
Spatial Dynamics of Human-Origin H1 Influenza A Virus in North American Swine
The emergence and rapid global spread of the swine-origin H1N1/09 pandemic influenza A virus in humans underscores the importance of swine populations as reservoirs for genetically diverse influenza viruses with the potential to infect humans. However, despite their significance for animal and human health, relatively little is known about the phylogeography of swine influenza viruses in the United States. This study utilizes an expansive data set of hemagglutinin (HA1) sequences (n = 1516) from swine influenza viruses collected in North America during the period 2003–2010. With these data we investigate the spatial dissemination of a novel influenza virus of the H1 subtype that was introduced into the North American swine population via two separate human-to-swine transmission events around 2003. Bayesian phylogeographic analysis reveals that the spatial dissemination of this influenza virus in the US swine population follows long-distance swine movements from the Southern US to the Midwest, a corn-rich commercial center that imports millions of swine annually. Hence, multiple genetically diverse influenza viruses are introduced and co-circulate in the Midwest, providing the opportunity for genomic reassortment. Overall, the Midwest serves primarily as an ecological sink for swine influenza in the US, with sources of virus genetic diversity instead located in the Southeast (mainly North Carolina) and South-central (mainly Oklahoma) regions. Understanding the importance of long-distance pig transportation in the evolution and spatial dissemination of the influenza virus in swine may inform future strategies for the surveillance and control of influenza, and perhaps other swine pathogens
Random-effects substitution models for phylogenetics via scalable gradient approximations
Phylogenetic and discrete-trait evolutionary inference depend heavily on an
appropriate characterization of the underlying character substitution process.
In this paper, we present random-effects substitution models that extend common
continuous-time Markov chain models into a richer class of processes capable of
capturing a wider variety of substitution dynamics. As these random-effects
substitution models often require many more parameters than their usual
counterparts, inference can be both statistically and computationally
challenging. Thus, we also propose an efficient approach to compute an
approximation to the gradient of the data likelihood with respect to all
unknown substitution model parameters. We demonstrate that this approximate
gradient enables scaling of sampling-based inference, namely Bayesian inference
via Hamiltonian Monte Carlo, under random-effects substitution models across
large trees and state-spaces. Applied to a dataset of 583 SARS-CoV-2 sequences,
an HKY model with random-effects shows strong signals of nonreversibility in
the substitution process, and posterior predictive model checks clearly show
that it is a more adequate model than a reversible model. When analyzing the
pattern of phylogeographic spread of 1441 influenza A virus (H3N2) sequences
between 14 regions, a random-effects phylogeographic substitution model infers
that air travel volume adequately predicts almost all dispersal rates. A
random-effects state-dependent substitution model reveals no evidence for an
effect of arboreality on the swimming mode in the tree frog subfamily Hylinae.
Simulations reveal that random-effects substitution models can accommodate both
negligible and radical departures from the underlying base substitution model.
We show that our gradient-based inference approach is over an order of
magnitude more time efficient than conventional approaches
Unifying the spatial epidemiology and molecular evolution of emerging epidemics
We introduce a conceptual bridge between the previously unlinked fields of phylogenetics and mathematical spatial ecology, which enables the spatial parameters of an emerging epidemic to be directly estimated from sampled pathogen genome sequences. By using phylogenetic history to correct for spatial autocorrelation, we illustrate how a fundamental spatial variable, the diffusion coefficient, can be estimated using robust nonparametric statistics, and how heterogeneity in dispersal can be readily quantified. We apply this framework to the spread of the West Nile virus across North America, an important recent instance of spatial invasion by an emerging infectious disease. We demonstrate that the dispersal of West Nile virus is greater and far more variable than previously measured, such that its dissemination was critically determined by rare, long-range movements that are unlikely to be discerned during field observations. Our results indicate that, by ignoring this heterogeneity, previous models of the epidemic have substantially overestimated its basic reproductive number. More generally, our approach demonstrates that easily obtainable genetic data can be used to measure the spatial dynamics of natural populations that are otherwise difficult or costly to quantify
Accurate reconstruction of insertion-deletion histories by statistical phylogenetics
The Multiple Sequence Alignment (MSA) is a computational abstraction that
represents a partial summary either of indel history, or of structural
similarity. Taking the former view (indel history), it is possible to use
formal automata theory to generalize the phylogenetic likelihood framework for
finite substitution models (Dayhoff's probability matrices and Felsenstein's
pruning algorithm) to arbitrary-length sequences. In this paper, we report
results of a simulation-based benchmark of several methods for reconstruction
of indel history. The methods tested include a relatively new algorithm for
statistical marginalization of MSAs that sums over a stochastically-sampled
ensemble of the most probable evolutionary histories. For mammalian
evolutionary parameters on several different trees, the single most likely
history sampled by our algorithm appears less biased than histories
reconstructed by other MSA methods. The algorithm can also be used for
alignment-free inference, where the MSA is explicitly summed out of the
analysis. As an illustration of our method, we discuss reconstruction of the
evolutionary histories of human protein-coding genes.
Comment: 28 pages, 15 figures. arXiv admin note: text overlap with arXiv:1103.434
Antiretroviral therapy partially improves the abnormalities of dendritic cells and lymphoid and myeloid regulatory populations in recently infected HIV patients
This study aimed to evaluate the effects of antiretroviral therapy on plasmacytoid (pDC) and myeloid
(mDC) dendritic cells as well as regulatory T (Treg) and myeloid-derived suppressor (MDSC) cells in
HIV-infected patients. Forty-five HIV-infected patients (20 with detectable HIV load [10 recently
infected and 10 chronically infected], assessed at baseline and after antiretroviral therapy, and 25
with undetectable viral loads) and 20 healthy controls were studied. The influence of HIV load,
bacterial translocation (measured by 16S rDNA and lipopolysaccharide-binding protein) and immune
activation markers (interleukin [IL]-6, soluble CD14, activated T cells) was analyzed. The absolute
numbers and percentages of pDC and mDC were significantly increased in patients. Patients with
detectable viral load exhibited increased intracellular expression of IL-12 by mDCs and interferon
(IFN)-α by pDCs. Activated population markers were elevated, and the proportion of Tregs was
significantly higher in HIV-infected patients. The MDSC percentage was similar in patients and
controls, but the intracellular expression of IL-10 was significantly higher in patients. The
achievement of undetectable HIV load after therapy did not modify bacterial translocation parameters,
but induced an increase in pDCs, mDCs and MDSCs only in recently infected patients. Our data support
the importance of early antiretroviral therapy to preserve dendritic and regulatory cell function in
HIV-infected individuals
Recommended from our members
An assessment of the impact of herb-drug combinations used by cancer patients
Background
Herb/Dietary Supplements (HDS) are the most popular Complementary and Alternative Medicine (CAM) modality used by cancer patients and the only type which involves the ingestion of substances which may interfere with the efficacy and safety of conventional medicines. This study aimed to assess the level of use of HDS in cancer patients undergoing treatment in the UK, and their perceptions of their effects, using 127 case histories of patients who were taking HDS. Previous studies have evaluated the risks of interactions between HDS and conventional drugs on the basis of the number of patients using HDS, so our study aimed to further this exploration by examining the actual drug combinations taken by individual patients and their potential safety.
Method
Three hundred seventy-five cancer patients attending oncology departments and centres of palliative care at the Oxford University Hospitals Trust (OUH), Duchess of Kent House, Sobell House, and Nettlebed Hospice participated in a self-administered questionnaire survey about their HDS use with their prescribed medicines. The classification system of Stockley’s Herbal Medicines Interactions was adopted to assess the potential risk of herb-drug interactions for these patients.
Results
127/375 (34 %; 95 % CI 29, 39) consumed HDS, amounting to 101 different products. Most combinations were assessed as ‘no interaction’, 22 combinations were categorised as ‘doubt about outcomes of use’, 6 combinations as ‘potentially hazardous outcome’, one combination as an interaction with ‘significant hazard’, and one combination as an interaction with ‘life-threatening outcome’. Most patients did not report any adverse events.
Conclusion
Most of the patients sampled were not exposed to any significant risk of harm from interactions with conventional medicines, but it is not possible as yet to conclude that risks in general are over-estimated. The incidence of HDS use was also less than anticipated, and significantly less than reported in other areas, illustrating the problems of extrapolating results from one region (the UK) and one setting (NHS oncology), where patterns of supplement use may be very different from those elsewhere