10 research outputs found

    Inferring phylogenetic trees under the general Markov model via a minimum spanning tree backbone

    Get PDF
    Phylogenetic trees are models of the evolutionary relationships among species, with species typically placed at the leaves of trees. We address the following problems regarding the calculation of phylogenetic trees. (1) Leaf-labeled phylogenetic trees may not be appropriate models of evolutionary relationships among rapidly evolving pathogens which may contain ancestor-descendant pairs. (2) The models of gene evolution that are widely used unrealistically assume that the base composition of DNA sequences does not evolve. Regarding problem (1) we present a method for inferring generally labeled phylogenetic trees that allow sampled species to be placed at non-leaf nodes of the tree. Regarding problem (2), we present a structural expectation maximization method (SEM-GM) for inferring leaf-labeled phylogenetic trees under the general Markov model (GM) which is the most complex model of DNA substitution that allows the evolution of base composition. In order to improve the scalability of SEM-GM we present a minimum spanning tree (MST) framework called MST-backbone. MST-backbone scales linearly with the number of leaves. However, the unrealistic location of the root as inferred on empirical data suggests that the GM model may be overtrained. MST-backbone was inspired by the topological relationship between MSTs and phylogenetic trees that was introduced by Choi et al. (2011). We discovered that the topological relationship does not necessarily hold if there is no unique MST. We propose so-called vertex-order based MSTs (VMSTs) that guarantee a topological relationship with phylogenetic trees.Phylogenetische BĂ€ume modellieren evolutionĂ€re Beziehungen zwischen Spezies, wobei die Spezies typischerweise an den BlĂ€ttern der BĂ€ume sitzen. Wir befassen uns mit den folgenden Problemen bei der Berechnung von phylogenetischen BĂ€umen. (1) Blattmarkierte phylogenetische BĂ€ume sind möglicherweise keine geeigneten Modelle der evolutionĂ€ren Beziehungen zwischen sich schnell entwickelnden Krankheitserregern, die Vorfahren-Nachfahren-Paare enthalten können. (2) Die weit verbreiteten Modelle der Genevolution gehen unrealistischerweise davon aus, dass sich die Basenzusammensetzung von DNA-Sequenzen nicht Ă€ndert. BezĂŒglich Problem (1) stellen wir eine Methode zur Ableitung von allgemein markierten phylogenetischen BĂ€umen vor, die es erlaubt, Spezies, fĂŒr die Proben vorliegen, an inneren des Baumes zu platzieren. BezĂŒglich Problem (2) stellen wir eine strukturelle Expectation-Maximization-Methode (SEM-GM) zur Ableitung von blattmarkierten phylogenetischen BĂ€umen unter dem allgemeinen Markov-Modell (GM) vor, das das komplexeste Modell von DNA-Substitution ist und das die Evolution von Basenzusammensetzung erlaubt. Um die Skalierbarkeit von SEM-GM zu verbessern, stellen wir ein Minimale Spannbaum (MST)-Methode vor, die als MST-Backbone bezeichnet wird. MST-Backbone skaliert linear mit der Anzahl der BlĂ€tter. Die Tatsache, dass die Lage der Wurzel aus empirischen Daten nicht immer realistisch abgeleitet warden kann, legt jedoch nahe, dass das GM-Modell möglicherweise ĂŒbertrainiert ist. MST-backbone wurde von einer topologischen Beziehung zwischen minimalen SpannbĂ€umen und phylogenetischen BĂ€umen inspiriert, die von Choi et al. 2011 eingefĂŒhrt wurde. Wir entdeckten, dass die topologische Beziehung nicht unbedingt Bestand hat, wenn es keinen eindeutigen minimalen Spannbaum gibt. Wir schlagen so genannte vertex-order-based MSTs (VMSTs) vor, die eine topologische Beziehung zu phylogenetischen BĂ€umen garantieren

    LIPIcs, Volume 274, ESA 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 274, ESA 2023, Complete Volum

    LIPIcs, Volume 244, ESA 2022, Complete Volume

    Get PDF
    LIPIcs, Volume 244, ESA 2022, Complete Volum

    Algorithms for Unit-Disk Graphs and Related Problems

    Get PDF
    In this dissertation, we study algorithms for several problems on unit-disk graphs and related problems. The unit-disk graph can be viewed as an intersection graph of a set of congruent disks. Unit-disk graphs have been extensively studied due to many of their applications, e.g., modeling the topology of wireless sensor networks. Lots of problems on unit-disk graphs have been considered in the literature, such as shortest paths, clique, independent set, distance oracle, diameter, etc. Specifically, we study the following problems in this dissertation: L1 shortest paths in unit-disk graphs, reverse shortest paths in unit-disk graphs, minimum bottleneck moving spanning tree, unit-disk range reporting, distance selection, etc. We develop efficient algorithms for these problems and our results are either first-known solutions or somehow improve the previous work. Given a set P of n points in the plane and a parameter r \u3e 0, a unit-disk graph G(P) can be defined using P as its vertex set and two points of P are connected by an edge if the distance between these two points is at most r. The weight of an edge is one in the unweighted case and is equal to the distance between the two endpoints in the weighted case. Note that the distance between two points can be measured by different metrics, e.g., L1 or L2 metric. In the first problem of L1 shortest paths in unit-disk graphs, we are given a point set P and a source point s ∈ P, the problem is to find all shortest paths from s to all other vertices in the L1 weighted unit-disk graph defined on set P. We present an O(n log n) time algorithm, which matches the Ω(n log n)-time lower bound. In the second problem, we are given a set P of n points, parameters r, λ \u3e 0, and two points s and t of P, the goal is to compute the smallest r such that the shortest path length between s and t in the unit-disk graph with respect to set P and parameter r is at most λ. This problem can be defined in both unweighted and weighted cases. We propose an algorithm of O(⌊λ⌋ · n log n) time and another algorithm of O(n5/4 log7/4 n) time for the unweighted case. We also given an O(n5/4 log5/2 n) time algorithm for the weighted case. In the third problem, we are given a set P of n points that are moving in the plane, the problem is to compute a spanning tree for these moving points that does not change its combinatorial structure during the point movement such that the bottleneck weight of the spanning tree (i.e., the largest Euclidean length of all edges) during the whole movement is minimized. We present an algorithm that runs in O(n4/3 log3 n) time. The fourth problem is unit-disk range reporting in which we are given a set P of n points in the plane and a value r, we need to construct a data structure so that given any query disk of radius r, all points of P in the disk can be reported efficiently. We build a data structure of O(n) space in O(n log n) time that can answer each query in O(k + log n) time, where k is the output size. The time complexity of our algorithm is the same as the previous result but our approach is much simpler. Finally, for the problem of distance selection, we are given a set P of n points in the plane and an integer 1 ≀ k ≀ (n2), the distance selection problem is to find the k-th smallest interpoint distance among all pairs of points of p. We propose an algorithm that runs in O(n4/3 log n) time. Our techniques yield two algorithmic frameworks for solving geometric optimization problems. Many algorithms and techniques developed in this dissertation are quite general and fundamental, and we believe they will find other applications in future

    Social informatics

    Get PDF
    5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings</p

    Programming Languages and Systems

    Get PDF
    This open access book constitutes the proceedings of the 29th European Symposium on Programming, ESOP 2020, which was planned to take place in Dublin, Ireland, in April 2020, as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The actual ETAPS 2020 meeting was postponed due to the Corona pandemic. The papers deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems

    A survey of the application of soft computing to investment and financial trading

    Get PDF
    corecore