Supplementary Material for Bayesian Long-Branch Attraction Bias and Corrections
Source Code for Software
This file contains source code for software to correct Bayesian posterior probabilities of topologies. This is Version 1.0, used for the Systematic Biology publication "Bayesian Long Branch Attraction Bias and Corrections". To access the most recent version of the software, please se
Supplement 1, Revision 1. Matlab code for testing the hypothesis of time homogeneity by parametric bootstrap. Submitted 12 December 2005; Published 6 January 2006.
<h2>File List</h2><blockquote>
<p><b>Supplement 1, original</b> </p>
<p><i><b>All files at once</b></i></p>
<blockquote>
<p><a href="Spencer_code.tar">Spencer_code.tar</a></p>
</blockquote>
<p><i><b>Matlab code</b></i></p>
<blockquote>
<p><a href="get_Q_like.m">get_Q_like.m</a></p>
<p><a href="ML_Q_est5.m">ML_Q_est5.m</a></p>
<p><a href="mlq_ufun3.m">mlq_ufun3.m</a></p>
<p><a href="multinomial.m">multinomial.m</a></p>
<p><a href="plike.m">plike.m</a></p>
<p><a href="Qfromest.m">Qfromest.m</a></p>
<p><a href="Qmatrix_bootstrap.m">Qmatrix_bootstrap.m</a></p>
<p><a href="stationary_probt.m">stationary_probt.m</a></p>
<p><a href="sorted_eigs.m">sorted_eigs.m</a></p>
<p> </p>
</blockquote>
<p><i><b>Example data</b></i></p>
<blockquote>
<p><a href="example_data.mat">example_data.mat</a></p>
<p> </p>
</blockquote>
<p><b>Supplement 1, Revision 1</b></p>
<blockquote>
<p><a href="suppl-1R1.htm">Download files</a> </p>
</blockquote>
</blockquote><h2>Description</h2><blockquote><p> </p>
Matlab code for the bootstrap analysis in Spencer and Susko, Continuous-time Markov models for species interactions. Tested under Matlab release 14. Requires the Optimization Toolbox. <br>
The main function, ML_Q_est5.m, tests the hypothesis that the data can be explained by a homogeneous continuous-time Markov model against the more general alternative that rates are not homogeneous. It first estimates parameters for a homogeneous continuous-time Markov model, then calculates the likelihood ratio between this model and the maximum likelihood discrete-time model. It then generates parametric bootstrap samples from the estimated homogeneous continuous-time model to generate a distribution of the likelihood ratio under the hypothesis of time homogeneity. For detailed instructions on usage, type help ML_Q_est5 in Matlab. Typical usage:
<p> [Q,negl,nobs,neglobs,neglz,twodelta,p,timetaken]=ML_Q_est5(P,obs_csum,reps,nsites,ntimes);</p>
<p> The inputs are:</p>
<p>P is the observed transition matrix for time step of 1 unit.</p>
<p>obs_csum is the number of observed transitions from each state.</p>
<p>reps is the number of parametric bootstrap replicates to run.</p>
<p>nsites is the number of sites sampled in the original data.</p>
<p>ntimes is the number of time periods sampled in the original data.</p>
<p> The outputs are:</p>
<p>Q is the estimated homogeneous instantaneous rate matrix, with all off-diagonal elements constrained to be positive.</p>
<p>negl is the partial negative log likelihood for this Q matrix.</p>
<p>nobs is an array of the number of observations of each transition (with some rounding if the P matrix wasn't reported exactly).</p>
<p>neglobs is the partial negative log likelihood if we use the observed transition frequencies (which are the ML estimates in a discrete-time model, not requiring homogeneity in continuous time).</p>
<p>neglz is the partial negative log likelihood if we set any tiny elements of Q (&lt;2*eps, where eps is machine precision) to zero. This is usually almost identical to negl.</p>
<p>twodelta is twice the difference in partial log likelihoods between the Q matrix and the empirical P matrix: the first row is observed, and the remaining reps rows are from the parametric bootstrap. The first column is for the Q matrix as estimated; the second is with tiny elements of Q set to zero.</p>
<p>p is the proportion of reps with twodelta >= observed (first column: Q as estimated; second column: tiny elements of Q set to zero).</p>
<p>timetaken is clock time used.</p>
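<p>To make the relationship between nobs, negl, neglobs and the first row of twodelta concrete, the following is a minimal sketch and is not taken from the supplement's source files. It assumes that columns of P index the state a transition starts from (suggested by the name obs_csum, but not confirmed here), that "partial" means the constant multinomial coefficient is omitted, and that the statistic is 2*(negl - neglobs). The toy values of P, obs_csum and Q are made up for illustration.</p>

```matlab
% Toy inputs (hypothetical values, not the Tanner et al. data).
P = [0.8 0.3; 0.2 0.7];        % observed one-step transition matrix, columns sum to 1
obs_csum = [50 40];            % number of observed transitions from each state
Q = [-0.3 0.2; 0.3 -0.2];      % hypothetical rate matrix, columns sum to 0

n = size(P, 1);

% Recover transition counts from the reported proportions (rounded to integers).
nobs = round(P .* repmat(obs_csum, n, 1));

% Empirical transition frequencies: the ML estimates under the discrete-time model.
Pemp = nobs ./ repmat(sum(nobs, 1), n, 1);

% One-step transition probabilities implied by the continuous-time model.
Pfit = expm(Q);

% Partial negative log likelihoods (multinomial constant dropped).
% Cells with zero counts are skipped so that 0*log(0) does not produce NaN.
pos = nobs > 0;
neglobs = -sum(nobs(pos) .* log(Pemp(pos)));
negl    = -sum(nobs(pos) .* log(Pfit(pos)));

% Likelihood ratio statistic for the observed data (assumed sign convention).
twodelta_obs = 2 * (negl - neglobs);
```

<p>Because Pemp maximizes the multinomial likelihood for the counts in nobs, twodelta_obs is non-negative under these assumptions. The actual ML_Q_est5.m may differ in orientation, rounding, or handling of zero cells; type help ML_Q_est5 for the authoritative description.</p>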
<p>The other functions (mlq_ufun3.m, get_Q_like.m, Qmatrix_bootstrap.m, stationary_probt.m, sorted_eigs.m, Qfromest.m, plike.m, multinomial.m) are called by ML_Q_est5.m.</p>
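<p>The bootstrap p-value described above can be recomputed from the twodelta output as a check. This is a sketch under the layout stated in the output list (first row observed, remaining rows from the parametric bootstrap); the numbers in the toy twodelta array are invented for illustration, and p_check is a hypothetical variable name.</p>

```matlab
% Toy stand-in for the twodelta output of ML_Q_est5 (made-up values):
% row 1 is the observed statistic, rows 2..end are bootstrap replicates;
% column 1 is for Q as estimated, column 2 for Q with tiny elements zeroed.
twodelta = [4.0 4.1; 1.0 1.2; 5.0 5.3; 3.9 4.1; 6.2 6.0];

obs2d  = twodelta(1, :);        % observed statistic, 1 x 2
boot2d = twodelta(2:end, :);    % bootstrap statistics, reps x 2
reps   = size(boot2d, 1);

% Proportion of bootstrap replicates at least as large as the observed value.
% repmat is used rather than implicit expansion for older Matlab releases.
exceed  = boot2d >= repmat(obs2d, reps, 1);
p_check = sum(exceed, 1) / reps;
```

<p>With real output, p_check should agree with the p returned by ML_Q_est5 if the function computes the proportion exactly as described; a small p suggests the homogeneous continuous-time model fits worse than expected under time homogeneity.</p>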
<p>example_data.mat contains the data described in the paper (Matlab format, compatible with version 6 and later). The original source for these data is Tanner, J., T. Hughes, and J. Connell. 1994. Species coexistence, keystone species, and succession: a sensitivity analysis. Ecology <b>75</b>:2204–2219 (Exposed Crest site, their Table 2, with additional information from J. Tanner, <i>personal communication</i>).</p>
<p></p>
</blockquote>
Supplement 1. Matlab code for testing the hypothesis of time homogeneity by parametric bootstrap.
Additional file 7: of Nuclear genetic codes with a different meaning of the UAG and the UAA codon
Complete 18S rRNA sequences of rhizarian exLh and Iotanema spirale (re)assembled from RNAseq reads. (TXT 7 kb)
Additional file 1: Table S1. of Nuclear genetic codes with a different meaning of the UAG and the UAA codon
List of detected genes, single-gene trees, and hyperconserved regions for the rhizarian exLh. (XLSX 682 kb)
Additional file 6: Figure S2. of Nuclear genetic codes with a different meaning of the UAG and the UAA codon
Sequence comparison of the conserved N-terminal domain of the eukaryotic release factor 1 (eRF1) from various eukaryotes with different types of the nuclear genetic code. (PDF 150 kb)
Additional file 2: Figure S1. of Nuclear genetic codes with a different meaning of the UAG and the UAA codon
Phylogenomic analysis of eukaryotes including the rhizarian exLh and I. spirale based on 70 conserved proteins. (PDF 603 kb)
Additional file 3: Table S2. of Nuclear genetic codes with a different meaning of the UAG and the UAA codon
List of analyzed transcripts, single-gene trees, and hyperconserved regions for Iotanema spirale. (XLSX 787 kb)