
    Finding and counting vertex-colored subtrees

    The problems studied in this article originate from the Graph Motif problem introduced by Lacroix et al. in the context of biological networks. The problem is to decide if a vertex-colored graph has a connected subgraph whose colors equal a given multiset of colors $M$. It is a variant of graph pattern matching in which the structure of the occurrence of the pattern is not of interest; the only requirement is connectedness. Using an algebraic framework recently introduced by Koutis et al., we obtain new FPT algorithms for Graph Motif and variants, with improved running times. We also obtain results on the counting versions of this problem, proving that the counting problem is FPT if $M$ is a set, but becomes W[1]-hard if $M$ is a multiset with two colors. Finally, we present an experimental evaluation of this approach on real datasets, showing that its performance compares favorably with existing software. Comment: Conference version in International Symposium on Mathematical Foundations of Computer Science (MFCS), Brno, Czech Republic (2010). Journal Version in Algorithmic
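
To make the decision problem in the abstract above concrete, here is a minimal brute-force sketch in Python. It is not the algebraic FPT algorithm the abstract refers to; it simply enumerates vertex subsets of the right size and checks colors and connectedness, so it is exponential and only suitable for tiny inputs. The names `has_graph_motif`, `is_connected`, `adj`, and `color` are illustrative assumptions, not taken from the paper.

```python
from collections import Counter, deque
from itertools import combinations


def is_connected(adj, vertices):
    """Return True if `vertices` induces a connected subgraph of `adj`."""
    vertices = set(vertices)
    if not vertices:
        return False
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            # Only walk along edges that stay inside the candidate subset.
            if v in vertices and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen == vertices


def has_graph_motif(adj, color, motif):
    """Brute-force Graph Motif check (hypothetical helper, exponential time).

    adj   : dict mapping each vertex to an iterable of neighbours (undirected)
    color : dict mapping each vertex to its color
    motif : iterable of colors, interpreted as a multiset M
    """
    target = Counter(motif)
    k = sum(target.values())
    for subset in combinations(list(adj), k):
        same_colors = Counter(color[v] for v in subset) == target
        if same_colors and is_connected(adj, subset):
            return True
    return False


# Path 1 - 2 - 3 - 4 colored red, blue, red, green.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
color = {1: "red", 2: "blue", 3: "red", 4: "green"}
print(has_graph_motif(adj, color, ["red", "blue", "red"]))
print(has_graph_motif(adj, color, ["red", "red"]))
```

On the path above, the multiset {red, blue, red} is matched by vertices 1, 2, 3, whereas {red, red} is not matched because the two red vertices are not adjacent.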

    Improved Algorithms for the Point-Set Embeddability problem for Plane 3-Trees

    In the point set embeddability problem, we are given a plane graph $G$ with $n$ vertices and a point set $S$ with $n$ points. The goal is to decide whether there exists a straight-line drawing of $G$ in which each vertex is represented as a distinct point of $S$, and to provide an embedding if one exists. Recently, in \cite{DBLP:conf/gd/NishatMR10}, a complete characterization for this problem on a special class of graphs known as plane 3-trees was presented, along with an efficient algorithm to solve the problem. In this paper, we use the same characterization to devise an improved algorithm for the same problem. Much of the efficiency we achieve comes from clever uses of the triangular range search technique. We also study a generalized version of the problem and present improved algorithms for this version as well.

    Phase transition in the sample complexity of likelihood-based phylogeny inference

    Reconstructing evolutionary trees from molecular sequence data is a fundamental problem in computational biology. Stochastic models of sequence evolution are closely related to spin systems that have been extensively studied in statistical physics, and that connection has led to important insights on the theoretical properties of phylogenetic reconstruction algorithms as well as the development of new inference methods. Here, we study maximum likelihood, a classical statistical technique which is perhaps the most widely used in phylogenetic practice because of its superior empirical accuracy. At the theoretical level, except for its consistency, that is, the guarantee of eventual correct reconstruction as the size of the input data grows, much remains to be understood about the statistical properties of maximum likelihood in this context. In particular, the best bounds on the sample complexity or sequence-length requirement of maximum likelihood, that is, the amount of data required for correct reconstruction, are exponential in the number, $n$, of tips, far from known lower bounds based on information-theoretic arguments. Here we close the gap by proving a new upper bound on the sequence-length requirement of maximum likelihood that matches up to constants the known lower bound for some standard models of evolution. More specifically, for the $r$-state symmetric model of sequence evolution on a binary phylogeny with bounded edge lengths, we show that the sequence-length requirement behaves logarithmically in $n$ when the expected amount of mutation per edge is below what is known as the Kesten-Stigum threshold. In general, the sequence-length requirement is polynomial in $n$. Our results imply moreover that the maximum likelihood estimator can be computed efficiently on randomly generated data provided sequences are as above. Comment: To appear in Probability Theory and Related Field

    Convex Hull Realizations of the Multiplihedra

    We present a simple algorithm for determining the extremal points in Euclidean space whose convex hull is the nth polytope in the sequence known as the multiplihedra. This answers the open question of whether the multiplihedra could be realized as convex polytopes. We use this realization to unite the approach to A_n-maps of Iwase and Mimura to that of Boardman and Vogt. We include a review of the appearance of the nth multiplihedron for various n in the studies of higher homotopy commutativity, (weak) n-categories, A_infinity-categories, deformation theory, and moduli spaces. We also include suggestions for the use of our realizations in some of these areas as well as in related studies, including enriched category theory and the graph associahedra. Comment: typos fixed, introduction revise

    Connectivity Oracles for Graphs Subject to Vertex Failures

    We introduce new data structures for answering connectivity queries in graphs subject to batched vertex failures. A deterministic structure processes a batch of $d \leq d_{\star}$ failed vertices in $\tilde{O}(d^3)$ time and thereafter answers connectivity queries in $O(d)$ time. It occupies space $O(d_{\star} m \log n)$. We develop a randomized Monte Carlo version of our data structure with update time $\tilde{O}(d^2)$, query time $O(d)$, and space $\tilde{O}(m)$ for any failure bound $d \le n$. This is the first connectivity oracle for general graphs that can efficiently deal with an unbounded number of vertex failures. We also develop a more efficient Monte Carlo edge-failure connectivity oracle. Using space $O(n \log^2 n)$, $d$ edge failures are processed in $O(d \log d \log\log n)$ time and thereafter, connectivity queries are answered in $O(\log\log n)$ time, which are correct w.h.p. Our data structures are based on a new decomposition theorem for an undirected graph $G=(V,E)$, which is of independent interest. It states that for any terminal set $U \subseteq V$ we can remove a set $B$ of $|U|/(s-2)$ vertices such that the remaining graph contains a Steiner forest for $U-B$ with maximum degree $s$.
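
For orientation, the query that such an oracle answers can be stated as a naive baseline: given a set of failed vertices, are two surviving vertices still connected? The Python sketch below answers it with a fresh breadth-first search per query, costing time linear in the graph size. That per-query cost is what the data structures in the abstract are designed to avoid. This is not the paper's method, and the function name `connected_after_failures` and its parameters are illustrative assumptions.

```python
from collections import deque


def connected_after_failures(adj, failed, s, t):
    """Naive baseline: is t reachable from s once `failed` vertices are removed?

    adj    : dict mapping each vertex to an iterable of neighbours (undirected)
    failed : iterable of failed vertices
    s, t   : query endpoints
    """
    failed = set(failed)
    if s in failed or t in failed:
        return False
    if s == t:
        return True
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            # Skip failed vertices and vertices already visited.
            if v in failed or v in seen:
                continue
            if v == t:
                return True
            seen.add(v)
            queue.append(v)
    return False


# 4-cycle 1 - 2 - 3 - 4 - 1.
cycle = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
print(connected_after_failures(cycle, [2], 1, 3))
print(connected_after_failures(cycle, [2, 4], 1, 3))
```

On the 4-cycle, removing vertex 2 leaves 1 and 3 connected through 4, while removing both 2 and 4 separates them.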