13 research outputs found
Embracing heterogeneity: coalescing the Tree of Life and the future of phylogenomics
Building the Tree of Life (ToL) is a major challenge of modern biology, requiring advances in cyberinfrastructure, data collection, theory, and more. Here, we argue that phylogenomics stands to benefit by embracing the many heterogeneous genomic signals emerging from the first decade of large-scale phylogenetic analysis spawned by high-throughput sequencing (HTS). Such signals include those most commonly encountered in phylogenomic datasets, such as incomplete lineage sorting, but also those reticulate processes emerging with greater frequency, such as recombination and introgression. Here we focus specifically on how phylogenetic methods can accommodate the heterogeneity incurred by such population genetic processes; we do not discuss phylogenetic methods that ignore such processes, such as concatenation or supermatrix approaches or supertrees. We suggest that methods of data acquisition and the types of markers used in phylogenomics will remain restricted until a posteriori methods of marker choice are made possible with routine whole-genome sequencing of taxa of interest. We discuss limitations and potential extensions of a model supporting innovation in phylogenomics today, the multispecies coalescent model (MSC). Macroevolutionary models that use phylogenies, such as character mapping, often ignore the heterogeneity on which building phylogenies increasingly relies, and we suggest that assimilating such heterogeneity is an important goal moving forward. Finally, we argue that an integrative cyberinfrastructure linking all steps of the process of building the ToL, from specimen acquisition in the field to publication and tracking of phylogenomic data, as well as a culture that values contributors at each step, are essential for progress.
The Probability of a Gene Tree Topology within a Phylogenetic Network with Applications to Hybridization Detection
Gene tree topologies have proven a powerful data source for various tasks, including species tree inference and species delimitation. Consequently, methods for computing probabilities of gene trees within species trees have been developed and widely used in probabilistic inference frameworks. All these methods assume an underlying multispecies coalescent model. However, when reticulate evolutionary events such as hybridization occur, these methods are inadequate, as they do not account for such events. Methods that account for both hybridization and deep coalescence in computing the probability of a gene tree topology currently exist for very limited cases. However, no such methods exist for general cases, owing primarily to the fact that it is currently unknown how to compute the probability of a gene tree topology within the branches of a phylogenetic network. Here we present a novel method for computing the probability of gene tree topologies on phylogenetic networks and demonstrate its application to the inference of hybridization in the presence of incomplete lineage sorting. We reanalyze a Saccharomyces species data set for which multiple analyses had converged on a species tree candidate. Using our method, though, we show that an evolutionary hypothesis involving hybridization in this group has better support than one of strict divergence. A similar reanalysis on a group of three Drosophila species shows that the data is consistent with hybridization. Further, using extensive simulation studies, we demonstrate the power of gene tree topologies at obtaining accurate estimates of branch lengths and hybridization probabilities of a given phylogenetic network. Finally, we discuss identifiability issues with detecting hybridization, particularly in cases that involve extinction or incomplete sampling of taxa
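For intuition about the tree-only case that this abstract sets out to generalize, the standard three-taxon result under the multispecies coalescent can be sketched in a few lines. For a species tree ((A,B),C) whose internal branch has length t in coalescent units, the gene tree matching the species tree has probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). The function name below is hypothetical, and this is not the network method described in the abstract, only the simplest species-tree baseline.

```python
import math

def three_taxon_gene_tree_probs(t):
    """Gene tree topology probabilities for species tree ((A,B),C).

    t is the internal branch length in coalescent units (assumed >= 0).
    Illustrative helper only; it covers the tree case, not networks.
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # Probability of each discordant topology: lineages A and B fail to
    # coalesce on the internal branch (prob e^-t), then one of the three
    # equally likely pairs coalesces first in the root population.
    p_discordant = math.exp(-t) / 3.0
    return {
        "((A,B),C)": 1.0 - 2.0 * p_discordant,  # matches the species tree
        "((A,C),B)": p_discordant,
        "((B,C),A)": p_discordant,
    }
```

With t = 0 all three topologies are equally likely, and as t grows the concordant topology dominates. A surplus of one discordant topology over the other is the kind of asymmetry that hybridization-aware models are meant to explain.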
The Signaling Petri Net-Based Simulator: A Non-Parametric Strategy for Characterizing the Dynamics of Cell-Specific Signaling Networks
Reconstructing cellular signaling networks and understanding how they work are major endeavors in cell biology. The scale and complexity of these networks, however, render their analysis using experimental biology approaches alone very challenging. As a result, computational methods have been developed and combined with experimental biology approaches, producing powerful tools for the analysis of these networks. These computational methods mostly fall on either end of a spectrum of model parameterization. On one end is a class of structural network analysis methods; these typically use the network connectivity alone to generate hypotheses about global properties. On the other end is a class of dynamic network analysis methods; these use, in addition to the connectivity, kinetic parameters of the biochemical reactions to predict the network's dynamic behavior. These predictions provide detailed insights into the properties that determine aspects of the network's structure and behavior. However, the difficulty of obtaining numerical values of kinetic parameters is widely recognized to limit the applicability of this latter class of methods
Mapping Network Motif Tunability and Robustness in the Design of Synthetic Signaling Circuits
Cellular networks are highly dynamic in their function, yet evolutionarily conserved in their core network motifs or topologies. Understanding functional tunability and robustness of network motifs to small perturbations in function and structure is vital to our ability to synthesize controllable circuits. In establishing core sets of network motifs, we selected topologies that are overrepresented in mammalian networks, including the linear, feedback, feed-forward, and bifan circuits. Static and dynamic tunability of network motifs were defined as a motif's ability to attain steady-state or transient outputs, respectively, in response to pre-defined input stimuli. Detailed computational analysis suggested that static tunability is insensitive to the circuit topology, since all of the motifs displayed similar ability to attain predefined steady-state outputs in response to constant inputs. Dynamic tunability, in contrast, was tightly dependent on circuit topology, with some motifs performing better in achieving observed time-course outputs. Finally, we mapped dynamic tunability onto motif topologies to determine robustness of motif structures to changes in topology and identify design principles for the rational assembly of robust synthetic networks.
Motif tunability to transient output objectives.
A: Predefined time-courses and B: convergence percentage of network motifs to transient output objectives for: 1) fast; 2) slowly decaying; 3) asymptotic; 4) rapid and delayed; 5) sigmoidal; 6) pulse; 7) biphasic; 8) rapid increasing-slow decaying and asymptotic; and 9) multi-static responses. Given the 12 motif topologies, 9 time-course objectives, and 20 attempts to identify the model parameters, PSO was implemented 2,160 times.
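The caption does not expand "PSO"; assuming it stands for particle swarm optimization, the parameter search it refers to can be sketched generically as below. This is a minimal, textbook-style global-best PSO, not the authors' implementation: the function name, the inertia and acceleration constants, and the box-bound clamping are all illustrative assumptions.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective(params) over box bounds with a basic PSO.

    bounds is a list of (low, high) pairs, one per parameter.
    All hyperparameter defaults are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Clamp the updated position to the allowed range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

In a setting like the one in the caption, the objective would presumably measure the mismatch between a motif's simulated time-course and the target time-course, and the repeated attempts per motif and objective would correspond to calling the routine with different seeds.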
Motif topologies overrepresented in mammalian signaling networks.
In all of the motifs, “I” is the input source that activates signals across the networks; “A, B” are intermediate components; and “X, Y” are downstream effectors. LM: linear motif; NFB: negative feedback; PFB: positive feedback; PNFB: positive-negative feedback; NFF: negative feed-forward; PFF: positive feed-forward; PNFF: positive-negative feed-forward; CB: coherent bifan; IB: incoherent bifan; PCB: partially coherent bifan; iNFB: isolated negative feedback; iPFB: isolated positive feedback.
Robustness index heatmaps of the 81 motif topologies with respect to i-step neighbors.
A: Robustness of functional tunability to slowly decaying responses. B: Robustness of functional tunability to pulse responses. Given the 81 motif topologies, 2 time-course objectives, and 20 attempts to identify the model parameters, PSO was implemented 3,240 times.
Functional tunability of network motifs to attain static output objectives.
A: Predefined steady-state objective: [x_00 x_01] = [0 1], [x_02 x_03] = [4.5 5.5], [x_04 x_05] = [9 10]; [y_00 y_01] = [0 1], [y_02 y_03] = [4.5 5.5], [y_04 y_05] = [9 10]; obj_11, obj_12, and obj_13: X* is low and Y* is low, intermediate, or high; obj_21, obj_22, and obj_23: X* is intermediate and Y* is low, intermediate, or high; obj_31, obj_32, and obj_33: X* is high and Y* is low, intermediate, or high. PSO implementation: steady-state levels of motif output responses were arbitrarily set to the following values: [X* Y*] = [1.5 1.5] (xy_1.5_1.5); [X* Y*] = [1.5 8.5] (xy_1.5_8.5); [X* Y*] = [8.5 1.5] (xy_8.5_1.5); and [X* Y*] = [8.5 8.5] (xy_8.5_8.5). Model parameters of all motifs were initialized to identical values as follows: 1) PSO was pre-implemented to identify model parameters that generated such static outputs for the linear motif; 2) these parameters were used for the core structure of all other network motifs; 3) the kinetic constants of the additional reactions were set to zero. B: Motif tunability to static output objectives obtained through random sampling of model parameters. C: Motif tunability to static output objectives obtained through PSO sampling of model parameters necessary for ODE implementation. Particle positions were initialized to the point xy_1.5_1.5. CV was defined as the ratio between the standard deviation and the mean computed across the 100 parameter sets identified by using PSO. Given the 12 motif topologies, 9 objective areas, and 100 sets of identified parameters, PSO was implemented 10,800 times.
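The two pieces of arithmetic in these captions, the CV definition and the PSO run counts, can be checked with a short sketch. The function names are hypothetical, and the caption does not say whether the standard deviation is the sample or the population form, so the choice is exposed as a parameter and the sample default is an assumption.

```python
import statistics

def coefficient_of_variation(values, population=False):
    """CV = standard deviation / mean, as defined in the caption.

    population=False uses the sample standard deviation (an assumption;
    the caption does not specify which estimator was used).
    """
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    sd = statistics.pstdev(values) if population else statistics.stdev(values)
    return sd / mean

def total_pso_runs(n_motifs, n_objectives, n_repeats):
    """Number of PSO runs when every motif/objective pair is repeated."""
    return n_motifs * n_objectives * n_repeats
```

The run counts quoted in the captions follow directly: total_pso_runs(12, 9, 20) gives 2,160, total_pso_runs(81, 2, 20) gives 3,240, and total_pso_runs(12, 9, 100) gives 10,800.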