46,914 research outputs found

    Convex Relaxations for Permutation Problems

    Full text link
    Seriation seeks to reconstruct a linear order between variables using unsorted, pairwise similarity information. It has direct applications in archeology and shotgun gene sequencing for example. We write seriation as an optimization problem by proving the equivalence between the seriation and combinatorial 2-SUM problems on similarity matrices (2-SUM is a quadratic minimization problem over permutations). The seriation problem can be solved exactly by a spectral algorithm in the noiseless case and we derive several convex relaxations for 2-SUM to improve the robustness of seriation solutions in noisy settings. These convex relaxations also allow us to impose structural constraints on the solution, hence solve semi-supervised seriation problems. We derive new approximation bounds for some of these relaxations and present numerical experiments on archeological data, Markov chains and DNA assembly from shotgun gene sequencing data.Comment: Final journal version, a few typos and references fixe

    Computing the Boolean product of two n\times n Boolean matrices using O(n^2) mechanical operation

    Full text link
    We study the problem of determining the Boolean product of two n\times n Boolean matrices in an unconventional computational model allowing for mechanical operations. We show that O(n^2) operations are sufficient to compute the product in this model.Comment: 11 pages, 7 figure

    Boolean Matrix Factorization Meets Consecutive Ones Property

    No full text
    Boolean matrix factorization is a natural and a popular technique for summarizing binary matrices. In this paper, we study a problem of Boolean matrix factorization where we additionally require that the factor matrices have consecutive ones property (OBMF). A major application of this optimization problem comes from graph visualization: standard techniques for visualizing graphs are circular or linear layout, where nodes are ordered in circle or on a line. A common problem with visualizing graphs is clutter due to too many edges. The standard approach to deal with this is to bundle edges together and represent them as ribbon. We also show that we can use OBMF for edge bundling combined with circular or linear layout techniques. We demonstrate that not only this problem is NP-hard but we cannot have a polynomial-time algorithm that yields a multiplicative approximation guarantee (unless P = NP). On the positive side, we develop a greedy algorithm where at each step we look for the best 1-rank factorization. Since even obtaining 1-rank factorization is NP-hard, we propose an iterative algorithm where we fix one side and and find the other, reverse the roles, and repeat. We show that this step can be done in linear time using pq-trees. We also extend the problem to cyclic ones property and symmetric factorizations. Our experiments show that our algorithms find high-quality factorizations and scale well

    Minimal Conflicting Sets for the Consecutive Ones Property in ancestral genome reconstruction

    Full text link
    A binary matrix has the Consecutive Ones Property (C1P) if its columns can be ordered in such a way that all 1's on each row are consecutive. A Minimal Conflicting Set is a set of rows that does not have the C1P, but every proper subset has the C1P. Such submatrices have been considered in comparative genomics applications, but very little is known about their combinatorial structure and efficient algorithms to compute them. We first describe an algorithm that detects rows that belong to Minimal Conflicting Sets. This algorithm has a polynomial time complexity when the number of 1's in each row of the considered matrix is bounded by a constant. Next, we show that the problem of computing all Minimal Conflicting Sets can be reduced to the joint generation of all minimal true clauses and maximal false clauses for some monotone boolean function. We use these methods on simulated data related to ancestral genome reconstruction to show that computing Minimal Conflicting Set is useful in discriminating between true positive and false positive ancestral syntenies. We also study a dataset of yeast genomes and address the reliability of an ancestral genome proposal of the Saccahromycetaceae yeasts.Comment: 20 pages, 3 figure

    Improved Approximation Algorithms for Segment Minimization in Intensity Modulated Radiation Therapy

    Full text link
    he segment minimization problem consists of finding the smallest set of integer matrices that sum to a given intensity matrix, such that each summand has only one non-zero value, and the non-zeroes in each row are consecutive. This has direct applications in intensity-modulated radiation therapy, an effective form of cancer treatment. We develop three approximation algorithms for matrices with arbitrarily many rows. Our first two algorithms improve the approximation factor from the previous best of 1+log2h1+\log_2 h to (roughly) 3/2(1+log3h)3/2 \cdot (1+\log_3 h) and 11/6(1+log4h)11/6\cdot(1+\log_4{h}), respectively, where hh is the largest entry in the intensity matrix. We illustrate the limitations of the specific approach used to obtain these two algorithms by proving a lower bound of (2b2)blogbh+1b\frac{(2b-2)}{b}\cdot\log_b{h} + \frac{1}{b} on the approximation guarantee. Our third algorithm improves the approximation factor from 2(logD+1)2 \cdot (\log D+1) to 24/13(logD+1)24/13 \cdot (\log D+1), where DD is (roughly) the largest difference between consecutive elements of a row of the intensity matrix. Finally, experimentation with these algorithms shows that they perform well with respect to the optimum and outperform other approximation algorithms on 77% of the 122 test cases we consider, which include both real world and synthetic data.Comment: 18 page

    Cache-Oblivious Selection in Sorted X+Y Matrices

    Full text link
    Let X[0..n-1] and Y[0..m-1] be two sorted arrays, and define the mxn matrix A by A[j][i]=X[i]+Y[j]. Frederickson and Johnson gave an efficient algorithm for selecting the k-th smallest element from A. We show how to make this algorithm IO-efficient. Our cache-oblivious algorithm performs O((m+n)/B) IOs, where B is the block size of memory transfers

    The quadratic assignment problem is easy for Robinsonian matrices with Toeplitz structure

    Get PDF
    We present a new polynomially solvable case of the Quadratic Assignment Problem in Koopmans-Beckman form QAP(A,B)QAP(A,B), by showing that the identity permutation is optimal when AA and BB are respectively a Robinson similarity and dissimilarity matrix and one of AA or BB is a Toeplitz matrix. A Robinson (dis)similarity matrix is a symmetric matrix whose entries (increase) decrease monotonically along rows and columns when moving away from the diagonal, and such matrices arise in the classical seriation problem.Comment: 15 pages, 2 figure
    corecore