24 research outputs found

Inferring phylogenetic networks with maximum pseudolikelihood under incomplete lineage sorting

Author: Ané Cécile
Solís-Lemus Claudia
Publication date: 12/02/2016

Phylogenetic networks are necessary to represent the tree of life expanded by edges to represent events such as horizontal gene transfers, hybridizations or gene flow. Not all species follow the paradigm of vertical inheritance of their genetic material. While a great deal of research has flourished into the inference of phylogenetic trees, statistical methods to infer phylogenetic networks are still limited and under development. The main disadvantage of existing methods is a lack of scalability. Here, we present a statistical method to infer phylogenetic networks from multi-locus genetic data in a pseudolikelihood framework. Our model accounts for incomplete lineage sorting through the coalescent model, and for horizontal inheritance of genes through reticulation nodes in the network. Computation of the pseudolikelihood is fast and simple, and it avoids the burdensome calculation of the full likelihood which can be intractable with many species. Moreover, estimation at the quartet-level has the added computational benefit that it is easily parallelizable. Simulation studies comparing our method to a full likelihood approach show that our pseudolikelihood approach is much faster without compromising accuracy. We applied our method to reconstruct the evolutionary relationships among swordtails and platyfishes (Xiphophorus: Poeciliidae), which is characterized by widespread hybridizations
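
To make the quartet-level estimation in the abstract above concrete, the following is a minimal sketch of a quartet log-pseudolikelihood, assuming it takes the usual composite form: a sum, over 4-taxon subsets, of multinomial log-likelihood terms for the observed gene-tree counts given the expected quartet concordance factors (CFs). The function name, data layout, and numbers are hypothetical and are not taken from the authors' software; the multinomial constant is dropped because it does not depend on the network.

```python
import math
from typing import Dict, Tuple

Quartet = Tuple[str, str, str, str]
# One entry per unrooted resolution of the quartet: ab|cd, ac|bd, ad|bc.
Counts = Tuple[int, int, int]
CFs = Tuple[float, float, float]


def log_pseudolikelihood(observed: Dict[Quartet, Counts],
                         expected: Dict[Quartet, CFs]) -> float:
    """Sum of per-quartet multinomial log-likelihood terms (constant omitted)."""
    total = 0.0
    for quartet, counts in observed.items():
        cfs = expected[quartet]
        for count, cf in zip(counts, cfs):
            if count > 0:  # a zero count contributes nothing, even if cf == 0
                total += count * math.log(cf)
    return total


# Hypothetical data: 10 gene trees informative for a single quartet.
observed = {("A", "B", "C", "D"): (6, 2, 2)}
expected = {("A", "B", "C", "D"): (0.6, 0.2, 0.2)}
score = log_pseudolikelihood(observed, expected)
```

Because each quartet contributes an independent term to the sum, the terms can be evaluated in parallel, which is the computational benefit the abstract mentions.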

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

Bayesian species delimitation combining multiple genes and traits in a unified framework

Author: Ané Cécile
Knowles L. Lacey
Solís-Lemus Claudia
Publication venue: 'Wiley'
Publication date: 26/11/2014

Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/110547/1/evo12582.pd

Crossref

ZENODO

Dryad Digital Repository (Duke University)

Electronic Archiving System

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Deep Blue Documents

Sparse Gaussian chain graphs with the spike-and-slab LASSO: Algorithms and asymptotics

Author: Deshpande Sameer K.
Shen Yunyi
Solís-Lemus Claudia
Publication date: 14/07/2022

The Gaussian chain graph model simultaneously parametrizes (i) the direct effects of $p$ predictors on $q$ correlated outcomes and (ii) the residual partial covariance between pairs of outcomes. We introduce a new method for fitting sparse Gaussian chain graph models with spike-and-slab LASSO (SSL) priors. We develop an Expectation-Conditional Maximization algorithm to obtain sparse estimates of the $p \times q$ matrix of direct effects and the $q \times q$ residual precision matrix. Our algorithm iteratively solves a sequence of penalized maximum likelihood problems with self-adaptive penalties that gradually filter out negligible regression coefficients and partial covariances. Because it adaptively penalizes model parameters, our method is seen to outperform fixed-penalty competitors on simulated data. We establish the posterior concentration rate for our model, buttressing our method's excellent empirical performance with strong theoretical guarantees. We use our method to reanalyze a dataset from a study of the effects of diet and residence type on the composition of the gut microbiome of elderly adults
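
The "self-adaptive penalties" in the abstract above can be illustrated with the generic spike-and-slab LASSO penalty weight for a single coefficient. This is a sketch of the standard SSL construction (a mixture of a sharp Laplace spike and a diffuse Laplace slab), not the authors' Expectation-Conditional Maximization algorithm; the function names and parameter values are hypothetical.

```python
import math


def slab_probability(beta: float, theta: float,
                     lam_spike: float, lam_slab: float) -> float:
    """Conditional probability that beta was drawn from the slab.

    The spike and slab are Laplace densities (lam / 2) * exp(-lam * |beta|),
    mixed with prior slab weight theta. Written as 1 / (1 + ratio) so the
    exponent is never positive when lam_spike > lam_slab.
    """
    ratio = ((1.0 - theta) / theta) * (lam_spike / lam_slab) \
        * math.exp(-(lam_spike - lam_slab) * abs(beta))
    return 1.0 / (1.0 + ratio)


def adaptive_penalty(beta: float, theta: float,
                     lam_spike: float, lam_slab: float) -> float:
    """Effective L1 penalty weight: a blend of the two Laplace rates."""
    p_slab = slab_probability(beta, theta, lam_spike, lam_slab)
    return lam_slab * p_slab + lam_spike * (1.0 - p_slab)


# Hypothetical settings: a sharp spike (rate 10) and a diffuse slab (rate 1).
penalty_near_zero = adaptive_penalty(0.0, theta=0.5, lam_spike=10.0, lam_slab=1.0)
penalty_large = adaptive_penalty(5.0, theta=0.5, lam_spike=10.0, lam_slab=1.0)
```

Coefficients near zero receive a penalty close to the spike rate and are shrunk hard, while large coefficients receive a penalty close to the slab rate and are left nearly untouched.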

arXiv.org e-Print Archive

PhyloNetworks: A package for phylogenetic networks

Author: Ané Cécile
Bastide Paul
Solís-Lemus Claudia
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

PhyloNetworks is a Julia package for the inference, manipulation, visualization, and use of phylogenetic networks in an interactive environment. Inference of phylogenetic networks is done with maximum pseudolikelihood from gene trees or multi-locus sequences (SNaQ), with possible bootstrap analysis. PhyloNetworks is the first software providing tools to summarize a set of networks (from a bootstrap or posterior sample) with measures of tree edge support, hybrid edge support, and hybrid node support. Networks can be used for phylogenetic comparative analysis of continuous traits, to estimate ancestral states or do a phylogenetic regression. The software is available in open source and with documentation at https://github.com/crsl4/PhyloNetworks.jl

HAL Descartes

Networks with k = 4 nodes in the reticulation cycle and identical unrooted topologies.

Author: Claudia Solís-Lemus (2547481)
Cécile Ané (257082)

They differ in their hybrid position (left: good diamond, right: bad diamond I). If D2 is not sampled (n = 4), only for i = 1, 2 are identifiable and the 2 networks are not distinguishable from each other.

The Francis Crick Institute

Example of a 4-taxon semi-directed network (left), with known direction of both hybrid edges but unspecified position of the root.

Author: Claudia Solís-Lemus (2547481)
Cécile Ané (257082)

The root can be placed on the internal edges with length t2, t3, t4, or on the external edges to C or D. The quartet CFs on this network are weighted averages of CFs under 4 trees with weights as shown (right).
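
The weighted-average statement in the caption above can be sketched numerically. The sketch assumes the standard coalescent result for a 4-taxon tree with internal edge length t in coalescent units: the quartet matching the tree has CF 1 - (2/3)e^(-t), and each of the two other quartets has CF (1/3)e^(-t). The weights and edge lengths below are invented for illustration; the actual weights are those in the figure, which is not reproduced in this listing.

```python
import math
from typing import Sequence, Tuple

CF = Tuple[float, float, float]  # CFs of the three quartet resolutions


def tree_quartet_cf(internal_length: float, concordant: int) -> CF:
    """Quartet CFs on a 4-taxon tree under the coalescent.

    internal_length is in coalescent units; concordant is the index (0-2)
    of the resolution that matches the tree.
    """
    minor = math.exp(-internal_length) / 3.0
    cf = [minor, minor, minor]
    cf[concordant] = 1.0 - 2.0 * minor
    return (cf[0], cf[1], cf[2])


def network_quartet_cf(weights: Sequence[float],
                       tree_cfs: Sequence[CF]) -> Tuple[float, ...]:
    """Weighted average of per-tree CFs; the weights must sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return tuple(sum(w * cf[i] for w, cf in zip(weights, tree_cfs))
                 for i in range(3))


# Hypothetical example: four trees weighted by products of an assumed
# inheritance probability gamma = 0.1, with made-up internal edge lengths.
gamma = 0.1
weights = [(1 - gamma) ** 2, gamma * (1 - gamma), gamma * (1 - gamma), gamma ** 2]
trees = [tree_quartet_cf(1.0, 0), tree_quartet_cf(0.5, 0),
         tree_quartet_cf(0.5, 1), tree_quartet_cf(2.0, 1)]
mixed = network_quartet_cf(weights, trees)
```

Each per-tree CF triple sums to 1, so the mixed CFs also sum to 1 whenever the weights do.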

The Francis Crick Institute

Example of rooted and semi-directed phylogenetic networks with h = 2 hybridization events and n = 7 sampled taxa.

Author: Claudia Solís-Lemus (2547481)
Cécile Ané (257082)

Inheritance probabilities γ represent the proportion of genes contributed by each parental population to a given hybrid node. Left: rooted network modelling several biological processes. Taxon F is a hybrid between two non-sampled taxa Y and Z with γ2 ≈ 0.50, and the lineage ancestral to taxa C and D has received genes introgressed from a non-sampled taxon X, for which γ1 ≈ 0.10. An alternative process at this event could be the horizontal transfer of only a handful of genes, corresponding to a very small fraction γ1 ≈ 0.001. Center: semi-directed network for the biological scenario just described. Although the root location is unknown, its position is constrained by the direction of hybrid edges (directed by arrows). For example, C, G or E cannot be outgroups. Right: rooted network obtained from the semi-directed network (center) by placing the root on the hybrid edge that leads to taxon F (labeled by 1 − γ2).

The Francis Crick Institute

Data from: Bayesian species delimitation combining multiple genes and traits in a unified framework

Author: Ané Cécile
Knowles L. Lacey
Solís-Lemus Claudia
Publication date: 26/11/2014

Delimitation of species based exclusively on genetic data has been advocated despite a critical knowledge gap: how might such approaches fail because they rely on genetic data alone, and would their accuracy be improved by using multiple data-types? We provide here the requisite framework for addressing these key questions. Because both phenotypic and molecular data can be analyzed in a common Bayesian framework with our program iBPP, we can compare the accuracy of delimited taxa based on genetic data alone versus when integrated with phenotypic data. We can also evaluate how the integration of phenotypic data might improve species delimitation when divergence occurs with gene flow and/or is selectively driven. These two realities of the speciation process are ignored by currently available genetic approaches. Our model accommodates phenotypic characters that exhibit different degrees of divergence, allowing for both neutral traits and traits under selection. We found a greater accuracy of estimated species boundaries with the integration of phenotypic and genetic data, with a strong beneficial influence of phenotypic data from traits under selection when the speciation process involves gene flow. Our results highlight the benefits of multiple data-types, but also draw into question the rationale of species delimitation based exclusively on genetic data

Dryad Digital Repository (Duke University)

Electronic Archiving System

perl script to simulate data and analyze with iBPP

Author: Claudia Solís-Lemus (2547481)
Cécile Ané (257082)

This perl script can be used to reproduce all simulations in the article "Bayesian species delimitation combining multiple genes and traits in a unified framework"

The Francis Crick Institute

Performance (average computing time per replicate) of SNaQ and PhyloNet.

Author: Claudia Solís-Lemus (2547481)
Cécile Ané (257082)

in simulations using true gene trees on networks with n = 6, 10 or 15 taxa and h = 1, 2 or 3. Each replicate consisted of 10 independent runs with full optimization of branch lengths and inheritance probabilities for each run. Pie charts display accuracy (black: probability of recovering the true network). With n = 10 and 300 or more loci, or with n = 15, PhyloNet was too slow to run.

The Francis Crick Institute
