
    Reconstructing phylogenies from noisy quartets in polynomial time with a high success probability

    Background: In recent years, quartet-based phylogeny reconstruction methods have received considerable attention in the computational biology community. Traditionally, the accuracy of a phylogeny reconstruction method is measured by simulations on synthetic datasets with known "true" phylogenies, while little theoretical analysis has been done. In this paper, we present a new model-based approach to measuring the accuracy of a quartet-based phylogeny reconstruction method. Under this model, we propose three efficient algorithms to reconstruct the "true" phylogeny with a high success probability.
    Results: The first algorithm can reconstruct the "true" phylogeny from an input quartet topology set without quartet errors in $O(n^2)$ time by querying at most $(n - 4) \log(n - 1)$ quartet topologies, where $n$ is the number of taxa. When the input quartet topology set contains errors, the second algorithm can reconstruct the "true" phylogeny with probability approximately $1 - p$ in $O(n^4 \log n)$ time, where $p$ is the probability that a quartet topology is erroneous. The third algorithm improves this probability to approximately $\frac{1}{1 + q^2 + \frac{1}{2} q^4 + \frac{1}{16} q^5}$, where $q = \frac{p}{1 - p}$, with a running time of $O(n^5)$; this probability is at least 0.984 when $p < 0.05$.
    Conclusion: The three proposed algorithms are mathematically guaranteed to reconstruct the "true" phylogeny with a high success probability. The experimental results showed that the third algorithm produced phylogenies with a higher probability than its aforementioned theoretical lower bound and outperformed some existing phylogeny reconstruction methods in both speed and accuracy.
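
As a rough illustration of the success probabilities quoted in this abstract, the following Python sketch evaluates the approximate bound 1 / (1 + q^2 + q^4/2 + q^5/16) with q = p / (1 - p) for a few error rates. The function name `success_probability` and the sample values of p are assumptions made for this example; the sketch only evaluates the stated formula and does not implement any of the paper's algorithms.

```python
def success_probability(p: float) -> float:
    """Evaluate 1 / (1 + q^2 + q^4/2 + q^5/16), where q = p / (1 - p).

    p is the probability that a single quartet topology is erroneous.
    The name and interface are hypothetical, chosen for illustration only.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    q = p / (1.0 - p)
    return 1.0 / (1.0 + q**2 + 0.5 * q**4 + q**5 / 16.0)


if __name__ == "__main__":
    # Sample error rates (assumed values, not taken from the paper).
    for p in (0.01, 0.05, 0.10, 0.20):
        third = success_probability(p)
        second = 1.0 - p  # approximate probability quoted for the second algorithm
        print(f"p = {p:.2f}  second ~ {second:.4f}  third ~ {third:.4f}")
```

For small p the expression is dominated by the q^2 term, so the third algorithm's approximate success probability falls off roughly quadratically in the error rate, whereas the 1 - p figure for the second algorithm falls off linearly.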

    The Qphyl System: a web-based interactive system for phylogenetic analysis

    Phylogenetic tree reconstruction is a prominent problem in computational biology. Currently, all computational methods have limitations and work well only for simple, small problems. For large and complex real-world problems, no existing method can guarantee that the trees it constructs are the true phylogenetic trees, mainly because the existing computational models are not very biologically realistic. This has become a serious issue for many important real-life applications, which often require accurate results from phylogenetic analysis. It is therefore crucial to incorporate multi-disciplinary analyses effectively and to synthesize results from various sources when answering real-life questions. In this thesis, a novel web-based phylogeny reconstruction system with a real-time interactive environment, called Qphyl (short for quartet-based phylogenetic analysis), is introduced. The Qphyl system uses a new interactive approach that enables biologists to greatly improve the final results through effective dynamic interaction with the computation: users can move the computation back and forth between stages to check intermediate results, compare results from different methods, and carry out manual refinements, using their domain-specific biological knowledge to decide how a tree should be reconstructed. Currently, the alpha version of this web-based interactive system has been released and is accessible through the URL: http://ww-test.it.usyd.edu.au/sogrid/qphyl/

    Constructing Big Trees from Short Sequences

    The construction of evolutionary trees is a fundamental problem in biology, and yet methods for reconstructing evolutionary trees are not reliable when it comes to inferring accurate topologies of large divergent evolutionary trees from realistic-length sequences. We address this problem and present a new polynomial-time algorithm for reconstructing evolutionary trees, called the Short Quartets Method, which is consistent and has greater statistical power than other polynomial-time methods, such as Neighbor-Joining and the 3-approximation algorithm by Agarwala et al. (and the "Double Pivot" variant of the Agarwala et al. algorithm by Cohen and Farach) for the L1-nearest tree problem. Our study indicates that our method will produce the correct topology from shorter sequences than can be guaranteed using these other methods.