Search CORE

10 research outputs found

Inferring phylogenetic trees under the general Markov model via a minimum spanning tree backbone

Author: Kalaghatgi Prabhav
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2020

Phylogenetic trees are models of the evolutionary relationships among species, with species typically placed at the leaves of trees. We address the following problems regarding the calculation of phylogenetic trees. (1) Leaf-labeled phylogenetic trees may not be appropriate models of evolutionary relationships among rapidly evolving pathogens, which may contain ancestor-descendant pairs. (2) The widely used models of gene evolution unrealistically assume that the base composition of DNA sequences does not evolve. Regarding problem (1), we present a method for inferring generally labeled phylogenetic trees that allow sampled species to be placed at non-leaf nodes of the tree. Regarding problem (2), we present a structural expectation maximization method (SEM-GM) for inferring leaf-labeled phylogenetic trees under the general Markov model (GM), which is the most complex model of DNA substitution that allows the evolution of base composition. To improve the scalability of SEM-GM, we present a minimum spanning tree (MST) framework called MST-backbone. MST-backbone scales linearly with the number of leaves. However, the unrealistic location of the root as inferred on empirical data suggests that the GM model may be overtrained. MST-backbone was inspired by the topological relationship between MSTs and phylogenetic trees that was introduced by Choi et al. (2011). We discovered that the topological relationship does not necessarily hold if there is no unique MST. We propose so-called vertex-order based MSTs (VMSTs) that guarantee a topological relationship with phylogenetic trees.

Universaar

LIPIcs, Volume 274, ESA 2023, Complete Volume

Author: Farach-Colton Martin
Herman Grzegorz
Puglisi Simon J.
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 31st Annual European Symposium on Algorithms (ESA 2023)
Publication date: 01/01/2023

LIPIcs, Volume 274, ESA 2023, Complete Volume

Dagstuhl Research Online Publication Server

LIPIcs, Volume 244, ESA 2022, Complete Volume

Author: Chechik Shiri
Herman Grzegorz
Navarro Gonzalo
Rotenberg Eva
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th Annual European Symposium on Algorithms (ESA 2022)
Publication date: 01/01/2022

LIPIcs, Volume 244, ESA 2022, Complete Volume

Dagstuhl Research Online Publication Server

Stochastic Network Design: Models and Scalable Algorithms

Author: Wu Xiaojian
Publication venue: ScholarWorks@UMass Amherst
Publication date: 10/11/2016

Many natural and social phenomena occur in networks. Examples include the spread of information, ideas, and opinions through a social network, the propagation of an infectious disease among people, and the spread of species within an interconnected habitat network. The ability to modify a phenomenon towards some desired outcomes has widely recognized benefits to our society and the economy. The outcome of a phenomenon is largely determined by the topology or properties of its underlying network. A decision maker can take management actions to modify a network and, therefore, change the outcome of the phenomenon. A management action is an activity that changes the topology or other properties of a network. For example, species that live in a small area may expand their population and gradually spread into an interconnected habitat network. However, human development of various structures such as highways and factories may destroy natural habitats or block paths connecting different habitat patches, which results in a population decline. To facilitate the dispersal of species and help the population recover, artificial corridors (e.g., a wildlife highway crossing) can be built to restore connectivity of isolated habitats, and conservation areas can be established to restore historical habitats of species, both of which are examples of management actions. The set of management actions that can be taken is restricted by a budget, so we must find cost-effective allocations of limited funding resources. In the thesis, the problem of finding the (nearly) optimal set of management actions is formulated as a discrete and stochastic optimization problem. Specifically, a general decision-making framework called stochastic network design is defined to model a broad range of similar real-world problems. The framework is defined upon a stochastic network, in which edges are either present or absent with certain probabilities. 
It defines several metrics to measure the outcome of the underlying phenomenon and a set of management actions that modify the network or its parameters in specific ways. The goal is to select a subset of management actions, subject to a budget constraint, to maximize a specified metric. The major contribution of the thesis is to develop scalable algorithms to find high-quality solutions for different problems within the framework. In general, these problems are NP-hard, and their objective functions are neither submodular nor supermodular. Existing algorithms, such as greedy algorithms and heuristic search algorithms, either lack theoretical guarantees or have limited scalability. In the thesis, fast approximate algorithms are developed under three different settings that are gradually more general. The most restricted setting is when a network is tree-structured. For this case, fully polynomial-time approximation schemes (FPTAS) are developed using dynamic programming algorithms and rounding techniques. A more general setting is when networks are general directed graphs. We use a sampling technique to convert the original stochastic optimization problem into a deterministic optimization problem and develop a primal-dual algorithm to solve it efficiently. In the previous two problem settings, the goal is to maximize connectivity of networks. In the most general setting, the goal is to maximize the number of nodes being connected and minimize the distance between these connected nodes. For example, we want the species not only to reach a large number of habitat areas but also to be able to get there within a reasonable amount of time. The scalable algorithms for this setting combine a fast primal-dual algorithm and a sampling procedure. Three real-world problems from the areas of computational sustainability and emergency response are used to evaluate these algorithms.
They are the barrier removal problem, which aims to determine which instream barriers to remove to help fish access their historical habitats in a river network; the spatial conservation planning problem, which aims to determine which habitat units to set as conservation areas to encourage the dispersal of endangered species in a landscape; and the pre-disaster preparation problem, which aims to minimize the disruption of emergency medical services by natural disasters. In these three problems, the developed algorithms are much more scalable than existing state-of-the-art methods and produce high-quality solutions.

ScholarWorks@UMass Amherst

Algorithms for Unit-Disk Graphs and Related Problems

Author: Zhao Yiming
Publication venue: DigitalCommons@USU
Publication date: 01/05/2023

In this dissertation, we study algorithms for several problems on unit-disk graphs and related problems. The unit-disk graph can be viewed as an intersection graph of a set of congruent disks. Unit-disk graphs have been extensively studied due to their many applications, e.g., modeling the topology of wireless sensor networks. Many problems on unit-disk graphs have been considered in the literature, such as shortest paths, clique, independent set, distance oracle, diameter, etc. Specifically, we study the following problems in this dissertation: L1 shortest paths in unit-disk graphs, reverse shortest paths in unit-disk graphs, minimum bottleneck moving spanning tree, unit-disk range reporting, distance selection, etc. We develop efficient algorithms for these problems, and our results are either the first known solutions or improve on previous work. Given a set P of n points in the plane and a parameter r > 0, a unit-disk graph G(P) can be defined using P as its vertex set, with two points of P connected by an edge if the distance between them is at most r. The weight of an edge is one in the unweighted case and is equal to the distance between the two endpoints in the weighted case. Note that the distance between two points can be measured by different metrics, e.g., the L1 or L2 metric. In the first problem of L1 shortest paths in unit-disk graphs, we are given a point set P and a source point s ∈ P; the problem is to find all shortest paths from s to all other vertices in the L1 weighted unit-disk graph defined on set P. We present an O(n log n) time algorithm, which matches the Ω(n log n)-time lower bound. In the second problem, we are given a set P of n points, parameters r, λ > 0, and two points s and t of P; the goal is to compute the smallest r such that the shortest path length between s and t in the unit-disk graph with respect to set P and parameter r is at most λ.
This problem can be defined in both unweighted and weighted cases. We propose an algorithm of O(⌊λ⌋ · n log n) time and another algorithm of O(n^(5/4) log^(7/4) n) time for the unweighted case. We also give an O(n^(5/4) log^(5/2) n) time algorithm for the weighted case. In the third problem, we are given a set P of n points that are moving in the plane; the problem is to compute a spanning tree for these moving points that does not change its combinatorial structure during the point movement such that the bottleneck weight of the spanning tree (i.e., the largest Euclidean length of all edges) during the whole movement is minimized. We present an algorithm that runs in O(n^(4/3) log^3 n) time. The fourth problem is unit-disk range reporting, in which we are given a set P of n points in the plane and a value r, and we need to construct a data structure so that given any query disk of radius r, all points of P in the disk can be reported efficiently. We build a data structure of O(n) space in O(n log n) time that can answer each query in O(k + log n) time, where k is the output size. The time complexity of our algorithm is the same as the previous result, but our approach is much simpler. Finally, for the problem of distance selection, we are given a set P of n points in the plane and an integer k with 1 ≤ k ≤ (n choose 2); the distance selection problem is to find the k-th smallest interpoint distance among all pairs of points of P. We propose an algorithm that runs in O(n^(4/3) log n) time. Our techniques yield two algorithmic frameworks for solving geometric optimization problems. Many algorithms and techniques developed in this dissertation are quite general and fundamental, and we believe they will find other applications in the future.

DigitalCommons@USU

Abstracts for the twentyfirst European workshop on Computational geometry, Technische Universiteit Eindhoven, The Netherlands, March 9-11, 2005

Author
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2005

Pure OAI Repository

Proceedings of the Fifth European Workshop on Probabilistic Graphical Models

Author: Jaakkola Tommi
Myllymäki Petri
Roos Teemu
Publication venue
Publication date: 01/09/2010

Peer reviewed

Helsingin yliopiston digitaalinen arkisto

Social informatics

Author: Bing Tian DAI
DIAS Gael
DING Ying
Ee-peng LIM
FLANAGIN Andrew J.
JATOWT Adam
MIURA Asako
TANAKA Katsumi
TEZUKA Taro
Publication venue: Springer Science and Business Media LLC
Publication date: 01/11/2013

5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings

Institutional Knowledge at Singapore Management University

Programming Languages and Systems

Author
Publication venue: Springer Science and Business Media LLC
Publication date: 10/02/2021

This open access book constitutes the proceedings of the 29th European Symposium on Programming, ESOP 2020, which was planned to take place in Dublin, Ireland, in April 2020, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The actual ETAPS 2020 meeting was postponed due to the Corona pandemic. The papers deal with fundamental issues in the specification, design, analysis, and implementation of programming languages and systems.

Directory of Open Access Books (DOAB)

A survey of the application of soft computing to investment and financial trading

Author: Tan Clarence
Vanstone Bruce J
Publication venue: The Australian Pattern Recognition Society
Publication date: 01/01/2003

Bond University Research Portal