8 research outputs found

    Solving a "Hard" Problem to Approximate an "Easy" One: Heuristics for Maximum Matchings and Maximum Traveling Salesman Problems

    We consider geometric instances of the Maximum Weighted Matching Problem (MWMP) and the Maximum Traveling Salesman Problem (MTSP) with up to 3,000,000 vertices. Making use of a geometric duality relationship between the MWMP, the MTSP, and the Fermat-Weber Problem (FWP), we develop a heuristic approach that yields both solutions and upper bounds in near-linear time. Using various computational tools, we obtain solutions within considerably less than 1% of the optimum. An interesting feature of our approach is that, even though the FWP is hard to compute in theory and Edmonds' algorithm for maximum weighted matching yields a polynomial-time solution for the MWMP, the practical behavior is just the opposite: we can solve the FWP with high accuracy in order to find a good heuristic solution for the MWMP.
    Comment: 20 pages, 14 figures, LaTeX, to appear in Journal of Experimental Algorithms, 200
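    The Fermat-Weber subproblem at the heart of this approach can be approximated with Weiszfeld's classic iteration. The sketch below is a minimal illustration of that subproblem and of the triangle-inequality argument that makes the FWP value an upper bound on the matching weight; it is not the authors' code, and all names and parameters are illustrative.

        import numpy as np

        def weiszfeld(points, iters=200, eps=1e-9):
            """Approximate the Fermat-Weber point of a (k, d) array of sites."""
            x = points.mean(axis=0)                      # start at the centroid
            for _ in range(iters):
                d = np.maximum(np.linalg.norm(points - x, axis=1), eps)
                w = 1.0 / d                              # inverse-distance weights
                x_new = (points * w[:, None]).sum(axis=0) / w.sum()
                if np.linalg.norm(x_new - x) < eps:
                    break
                x = x_new
            return x

        pts = np.random.rand(10000, 2)                   # synthetic geometric instance
        fw = weiszfeld(pts)
        # By the triangle inequality, |pq| <= |p - fw| + |fw - q| for every matched
        # pair, so the total distance to the Fermat-Weber point bounds the weight
        # of any perfect matching on the point set.
        print(fw, np.linalg.norm(pts - fw, axis=1).sum())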

    Efficient Exact Inference in Planar Ising Models

    We give polynomial-time algorithms for the exact computation of lowest-energy (ground) states, worst margin violators, log partition functions, and marginal edge probabilities in certain binary undirected graphical models. Our approach provides an interesting alternative to the well-known graph cut paradigm in that it does not impose any submodularity constraints; instead we require planarity to establish a correspondence with perfect matchings (dimer coverings) in an expanded dual graph. We implement a unified framework while delegating complex but well-understood subproblems (planar embedding, maximum-weight perfect matching) to established algorithms for which efficient implementations are freely available. Unlike graph cut methods, we can perform penalized maximum-likelihood as well as maximum-margin parameter estimation in the associated conditional random fields (CRFs), and employ marginal posterior probabilities as well as maximum a posteriori (MAP) states for prediction. Maximum-margin CRF parameter estimation on image denoising and segmentation problems shows our approach to be efficient and effective. A C++ implementation is available from http://nic.schraudolph.org/isinf/
    Comment: Fixed a number of bugs in v1; added 10 pages of additional figures, explanations, proofs, and experiments.
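    As a point of reference for the quantities named above, the brute-force sketch below evaluates the ground state, log partition function, and one marginal edge probability of a tiny binary (Ising-type) model by enumerating all spin configurations. It is exponential in the number of variables and is only meant to make the definitions concrete; the couplings and fields are invented, and it does not use the planar matching construction described in the paper.

        import itertools, math

        edges = {(0, 1): 1.0, (1, 2): -0.5, (2, 3): 0.8, (3, 0): 0.3}  # pairwise couplings (illustrative)
        fields = [0.2, -0.1, 0.0, 0.4]                                 # unary biases (illustrative)
        n = len(fields)

        def energy(s):
            """Ising energy of a spin assignment s in {-1, +1}^n."""
            e = -sum(w * s[i] * s[j] for (i, j), w in edges.items())
            e -= sum(h * s[i] for i, h in enumerate(fields))
            return e

        states = list(itertools.product((-1, 1), repeat=n))
        ground = min(states, key=energy)                               # lowest-energy (ground) state
        log_Z = math.log(sum(math.exp(-energy(s)) for s in states))    # log partition function
        # Marginal probability that spins 0 and 1 agree (one form of marginal edge probability).
        p_agree = sum(math.exp(-energy(s)) for s in states if s[0] == s[1]) / math.exp(log_Z)
        print(ground, energy(ground), log_Z, p_agree)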

    Implementation of O(nm \log n) Weighted Matchings in General Graphs: The Power of Data Structures

    We describe the implementation of an algorithm which solves the weighted matching problem in general graphs with n vertices and m edges in time O(nm \log n). Our algorithm is a variant of the algorithm of Galil, Micali and Gabow [Galil et al., 1986, SIAM J. Computing, 15, 120--130] and extensively uses sophisticated data structures, in particular \emph{concatenable priority queues}, so as to reduce the time needed to perform dual adjustments and to find tight edges in Edmonds' blossom-shrinking algorithm. We compare our implementation to the experimentally fastest implementation, named \emph{Blossom IV}, due to Cook and Rohe [Cook and Rohe, Technical Report 97863, Forschungsinstitut f{\"u}r Diskrete Mathematik, Universit{\"a}t Bonn]. Blossom IV requires only very simple data structures and has an asymptotic running time of O(n^2 m). Our experiments show that our new implementation is superior to Blossom IV. A closer inspection reveals that the running time of Edmonds' blossom-shrinking algorithm in practice depends heavily on the time spent to perform dual adjustments and to find tight edges. Therefore, optimizing these operations, as is done in our implementation, indeed speeds up the practical performance of implementations of Edmonds' algorithm.
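    The key ingredient the abstract points to is a priority queue that supports fast concatenation. The sketch below is not the paper's data structure; it is a minimal pairing-heap-style meldable queue, included only to illustrate the concatenate (meld) operation that such structures provide when blossoms are formed and queues of candidate edges must be combined cheaply.

        class PairingHeap:
            """Node of a simple meldable ("concatenable") priority queue."""
            def __init__(self, key=None, item=None):
                self.key, self.item, self.children = key, item, []

        def meld(a, b):
            """Concatenate two heaps in O(1): the larger root becomes a child of the smaller."""
            if a is None:
                return b
            if b is None:
                return a
            if b.key < a.key:
                a, b = b, a
            a.children.append(b)
            return a

        def insert(h, key, item):
            return meld(h, PairingHeap(key, item))

        def delete_min(h):
            """Remove the minimum and re-meld its children pairwise (two-pass pairing)."""
            kids = h.children
            if not kids:
                return None
            paired = [meld(kids[i], kids[i + 1]) if i + 1 < len(kids) else kids[i]
                      for i in range(0, len(kids), 2)]
            res = paired[0]
            for x in paired[1:]:
                res = meld(res, x)
            return res

        h1 = h2 = None
        for k in (5, 1, 7):
            h1 = insert(h1, k, f"edge{k}")
        for k in (3, 9):
            h2 = insert(h2, k, f"edge{k}")
        h = meld(h1, h2)             # concatenation of two queues
        print(h.key, h.item)         # -> 1 edge1
        h = delete_min(h)
        print(h.key, h.item)         # -> 3 edge3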

    Use of a weighted matching algorithm to sequence clusters in spatial join processing

    One of the most expensive operations in a spatial database is spatial join processing. This study focuses on how to improve the performance of such processing. The main objective is to reduce the Input/Output (I/O) cost of the spatial join process by using a technique called cluster-scheduling. Generally, the spatial join is processed in two steps, namely filtering and refinement. The cluster-scheduling technique is performed after the filtering step and before the refinement step and is part of the housekeeping phase. The key idea of this technique is to produce an ordering in which consecutive clusters share a maximal number of overlapping objects. However, finding the maximal overlapping order has been shown to be NP-complete. This study proposes an algorithm that provides an approximate maximal overlapping (AMO) order in a Cluster Overlapping (CO) graph, using an efficient maximum weighted matching algorithm to find the AMO order. As a result, the I/O cost in spatial join processing can be minimised.
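    A minimal sketch of the idea described above, with invented cluster contents and networkx's general-graph matching routine standing in for the study's own algorithm: build the Cluster Overlapping (CO) graph whose edge weights count the objects shared by two clusters, then take a maximum-weight matching to pick pairs of clusters that should be scheduled back to back.

        import networkx as nx
        from itertools import combinations

        # Object identifiers contained in each cluster after the filtering step (illustrative).
        clusters = {
            "C1": {1, 2, 3, 4},
            "C2": {3, 4, 5},
            "C3": {5, 6, 7},
            "C4": {1, 7, 8},
        }

        co = nx.Graph()                      # Cluster Overlapping graph
        for a, b in combinations(clusters, 2):
            shared = len(clusters[a] & clusters[b])
            if shared:
                co.add_edge(a, b, weight=shared)

        # Pairs with maximal total overlap; scheduling each pair consecutively lets the
        # refinement step reuse the shared objects instead of re-reading them.
        pairs = nx.max_weight_matching(co, weight="weight")
        print(sorted(tuple(sorted(p)) for p in pairs))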

    Development of New Computational Tools for Analyzing Hi-C Data and Predicting Three-Dimensional Genome Organization

    Background: The development of Hi-C (and related methods) has allowed for unprecedented sequence-level investigations into the structure-function relationship of the genome. There has been extensive effort in developing new tools to analyze this data in order to better understand the relationship between 3D genomic structure and function. While useful, the existing tools are far from maturity and (in some cases) lack the generalizability that would be required for application in a diverse set of organisms. This is problematic since the research community has proposed many cross-species "hallmarks" of 3D genome organization without confirming their existence in a variety of organisms.

    Research Objective: Develop new, generalizable computational tools for Hi-C analysis and 3D genome prediction.

    Results: Three new computational tools were developed for Hi-C analysis or 3D genome prediction: GrapHi-C (visualization), GeneRHi-C (3D prediction) and StoHi-C (3D prediction). Each tool has the potential to be used for 3D genome analysis in both model and non-model organisms since the underlying algorithms do not rely on any organism-specific constraints. A brief description of each tool follows. GrapHi-C is a graph-based visualization of Hi-C data. Unlike existing visualization methods, GrapHi-C allows for a more intuitive structural visualization of the underlying data. GeneRHi-C and StoHi-C are tools that can be used to predict 3D genome organizations from Hi-C data (the 3D-genome reconstruction problem). GeneRHi-C uses a combination of mixed integer programming and network layout algorithms to generate 3D coordinates from a ploidy-dependent subset of the Hi-C data. Alternatively, StoHi-C uses t-stochastic neighbour embedding with the complete set of Hi-C data to generate 3D coordinates of the genome. Each tool was applied to multiple, independent existing Hi-C datasets from fission yeast to demonstrate their utility. This is the first time 3D genome prediction has been successfully applied to these datasets. Overall, the tools developed here more clearly recapitulated documented features of fission yeast genomic organization when compared to existing techniques. Future work will focus on extending and applying these tools to analyze Hi-C datasets from other organisms.

    Additional Information: This thesis contains a collection of papers pertaining to the development of new tools for analyzing Hi-C data and predicting 3D genome organization. Each paper's publication status (as of January 2020) has been provided at the beginning of the corresponding chapter. For published papers, reprint permission was obtained and is available in the appendix.
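    A hedged sketch of the embedding idea behind StoHi-C, not the tool's actual code: treat each genomic bin's row of the Hi-C contact matrix as its feature vector and let t-SNE place the bins in three dimensions, so that bins with similar contact profiles land close together. The contact matrix below is synthetic; a real analysis would start from normalized Hi-C counts.

        import numpy as np
        from sklearn.manifold import TSNE

        rng = np.random.default_rng(0)
        n_bins = 200
        # Synthetic contact matrix with the usual distance-decay structure.
        idx = np.arange(n_bins)
        contacts = 1.0 / (1.0 + np.abs(idx[:, None] - idx[None, :]))
        contacts += 0.01 * rng.random((n_bins, n_bins))
        contacts = (contacts + contacts.T) / 2          # keep it symmetric

        coords = TSNE(n_components=3, perplexity=30, init="random",
                      random_state=0).fit_transform(contacts)
        print(coords.shape)   # (200, 3): one 3D coordinate per genomic bin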

    Sixth Biennial Report: August 2001 - May 2003
