Convex Relaxations for Permutation Problems
Seriation seeks to reconstruct a linear order between variables using
unsorted, pairwise similarity information. It has direct applications in
archeology and shotgun gene sequencing, for example. We write seriation as an
optimization problem by proving the equivalence between the seriation and
combinatorial 2-SUM problems on similarity matrices (2-SUM is a quadratic
minimization problem over permutations). The seriation problem can be solved
exactly by a spectral algorithm in the noiseless case and we derive several
convex relaxations for 2-SUM to improve the robustness of seriation solutions
in noisy settings. These convex relaxations also allow us to impose structural
constraints on the solution, hence solve semi-supervised seriation problems. We
derive new approximation bounds for some of these relaxations and present
numerical experiments on archeological data, Markov chains and DNA assembly
from shotgun gene sequencing data. Comment: Final journal version, a few typos and references fixed
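To make the 2-SUM formulation concrete, here is a minimal brute-force sketch. It is not the spectral algorithm or any of the convex relaxations from the abstract; it only evaluates the quadratic objective over all permutations of a tiny noiseless example. The function names and the similarity matrix (similarity decaying linearly with distance in a hidden order) are assumptions made for illustration.

```python
from itertools import permutations


def two_sum(sim, pos):
    """2-SUM objective: sum over all pairs of sim[i][j] * (pos[i] - pos[j])**2."""
    n = len(sim)
    return sum(sim[i][j] * (pos[i] - pos[j]) ** 2
               for i in range(n) for j in range(n))


def brute_force_two_sum(sim):
    """Try every assignment of positions to items (only feasible for tiny n)."""
    n = len(sim)
    return min(permutations(range(n)), key=lambda p: two_sum(sim, p))


# Hypothetical noiseless instance: item i sits at hidden[i] in the true order,
# and similarity decreases linearly with distance in that order.
hidden = [2, 0, 4, 1, 3]
n = len(hidden)
sim = [[n - abs(hidden[i] - hidden[j]) for j in range(n)] for i in range(n)]

best = brute_force_two_sum(sim)
print(best, two_sum(sim, best), two_sum(sim, hidden))
```

On this instance the hidden order attains the minimum 2-SUM value, which is the kind of equivalence the abstract refers to; the brute force may return the hidden order or another ordering with the same cost, such as its reversal.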
Computing the Boolean product of two n\times n Boolean matrices using O(n^2) mechanical operations
We study the problem of determining the Boolean product of two n\times n
Boolean matrices in an unconventional computational model allowing for
mechanical operations. We show that O(n^2) operations are sufficient to compute
the product in this model. Comment: 11 pages, 7 figures
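For reference, the Boolean product itself can be written down directly. The sketch below is the textbook cubic-time definition on a conventional machine, not the mechanical model of the abstract; the function name is an assumption.

```python
def boolean_product(a, b):
    """Boolean product C of two n x n 0/1 matrices: C[i][j] = OR_k (A[i][k] AND B[k][j])."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n)))
             for j in range(n)]
            for i in range(n)]


a = [[1, 0],
     [1, 1]]
b = [[0, 1],
     [0, 0]]
print(boolean_product(a, b))  # → [[0, 1], [0, 1]]
```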
Boolean Matrix Factorization Meets Consecutive Ones Property
Boolean matrix factorization is a natural and popular technique for summarizing binary matrices. In this paper, we study a variant of Boolean matrix factorization in which we additionally require that the factor matrices have the consecutive ones property (OBMF). A major application of this optimization problem comes from graph visualization: standard techniques for visualizing graphs are circular or linear layouts, where nodes are ordered on a circle or on a line. A common problem with visualizing graphs is clutter due to too many edges. The standard approach to deal with this is to bundle edges together and represent them as ribbons. We show that OBMF can be used for edge bundling combined with circular or linear layout techniques. We demonstrate that not only is this problem NP-hard, but there is also no polynomial-time algorithm that yields a multiplicative approximation guarantee (unless P = NP). On the positive side, we develop a greedy algorithm where at each step we look for the best rank-1 factorization. Since even obtaining a rank-1 factorization is NP-hard, we propose an iterative algorithm where we fix one side and find the other, reverse the roles, and repeat. We show that this step can be done in linear time using PQ-trees. We also extend the problem to the cyclic ones property and symmetric factorizations. Our experiments show that our algorithms find high-quality factorizations and scale well.
Minimal Conflicting Sets for the Consecutive Ones Property in ancestral genome reconstruction
A binary matrix has the Consecutive Ones Property (C1P) if its columns can be
ordered in such a way that all 1's on each row are consecutive. A Minimal
Conflicting Set is a set of rows that does not have the C1P, but every proper
subset has the C1P. Such submatrices have been considered in comparative
genomics applications, but very little is known about their combinatorial
structure and efficient algorithms to compute them. We first describe an
algorithm that detects rows that belong to Minimal Conflicting Sets. This
algorithm has a polynomial time complexity when the number of 1's in each row
of the considered matrix is bounded by a constant. Next, we show that the
problem of computing all Minimal Conflicting Sets can be reduced to the joint
generation of all minimal true clauses and maximal false clauses for some
monotone Boolean function. We use these methods on simulated data related to
ancestral genome reconstruction to show that computing Minimal Conflicting Sets
is useful in discriminating between true positive and false positive ancestral
syntenies. We also study a dataset of yeast genomes and address the reliability
of an ancestral genome proposal of the Saccharomycetaceae yeasts. Comment: 20 pages, 3 figures
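The two definitions above (C1P and Minimal Conflicting Set) can be checked by brute force on small matrices. This is only an illustrative sketch that tries every column order; it is not the polynomial-time detection algorithm described in the abstract, and all function names are assumptions.

```python
from itertools import combinations, permutations


def ones_consecutive(row):
    """True if the 1's of a single row form one contiguous block."""
    idx = [c for c, v in enumerate(row) if v]
    return not idx or idx[-1] - idx[0] + 1 == len(idx)


def has_c1p(rows):
    """Brute force: does some column order make the 1's consecutive in every row?"""
    if not rows:
        return True
    m = len(rows[0])
    return any(all(ones_consecutive([r[c] for c in order]) for r in rows)
               for order in permutations(range(m)))


def is_minimal_conflicting_set(rows):
    """No C1P for the whole set, but C1P after dropping any single row.

    Checking subsets of size len(rows) - 1 suffices, because every subset
    of a row set with the C1P also has the C1P.
    """
    if has_c1p(rows):
        return False
    return all(has_c1p(list(sub)) for sub in combinations(rows, len(rows) - 1))


# Three rows that pairwise share a column: no linear order of the three
# columns makes all three pairs adjacent.
triangle = [[1, 1, 0],
            [0, 1, 1],
            [1, 0, 1]]
print(has_c1p(triangle), is_minimal_conflicting_set(triangle))  # → False True
```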
Improved Approximation Algorithms for Segment Minimization in Intensity Modulated Radiation Therapy
The segment minimization problem consists of finding the smallest set of
integer matrices that sum to a given intensity matrix, such that each summand
has only one non-zero value, and the non-zeroes in each row are consecutive.
This has direct applications in intensity-modulated radiation therapy, an
effective form of cancer treatment. We develop three approximation algorithms
for matrices with arbitrarily many rows. Our first two algorithms improve the
approximation factor from the previous best of to (roughly) and , respectively, where is
the largest entry in the intensity matrix. We illustrate the limitations of the
specific approach used to obtain these two algorithms by proving a lower bound
of on the approximation
guarantee. Our third algorithm improves the approximation factor from to , where is (roughly) the largest
difference between consecutive elements of a row of the intensity matrix.
Finally, experimentation with these algorithms shows that they perform well
with respect to the optimum and outperform other approximation algorithms on
77% of the 122 test cases we consider, which include both real world and
synthetic data. Comment: 18 pages
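The definition of a segment decomposition at the start of this abstract can be turned into a small validity check. The sketch below only verifies a candidate decomposition; it does not implement any of the approximation algorithms, and the function names and the example matrices are assumptions.

```python
def is_segment(mat):
    """A segment has exactly one distinct non-zero value, and the non-zero
    entries of each row are consecutive."""
    values = {v for row in mat for v in row if v != 0}
    if len(values) != 1:
        return False
    for row in mat:
        idx = [c for c, v in enumerate(row) if v != 0]
        if idx and idx[-1] - idx[0] + 1 != len(idx):
            return False
    return True


def is_decomposition(target, segments):
    """True if every summand is a segment and the summands add up to target."""
    if not all(is_segment(s) for s in segments):
        return False
    rows, cols = len(target), len(target[0])
    return all(sum(s[r][c] for s in segments) == target[r][c]
               for r in range(rows) for c in range(cols))


# Hypothetical 2 x 2 intensity matrix and a decomposition into two segments.
target = [[2, 3],
          [0, 3]]
segments = [[[2, 2],
             [0, 2]],
            [[0, 1],
             [0, 1]]]
print(is_decomposition(target, segments))  # → True
```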
Cache-Oblivious Selection in Sorted X+Y Matrices
Let X[0..n-1] and Y[0..m-1] be two sorted arrays, and define the m×n matrix A
by A[j][i]=X[i]+Y[j]. Frederickson and Johnson gave an efficient algorithm for
selecting the k-th smallest element from A. We show how to make this algorithm
IO-efficient. Our cache-oblivious algorithm performs O((m+n)/B) IOs, where B is
the block size of memory transfers.
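As a point of comparison, selection in the implicit matrix A[j][i] = X[i] + Y[j] can be sketched with a simple heap-based baseline. This is neither the Frederickson and Johnson algorithm nor the cache-oblivious variant; it takes O(m + k log m) time and makes no attempt at IO efficiency. The function name and the 1-indexed convention for k are assumptions.

```python
import heapq


def kth_smallest_sum(xs, ys, k):
    """k-th smallest (1-indexed) entry of A[j][i] = xs[i] + ys[j], xs and ys sorted.

    Each row j of A is sorted, so the heap holds the smallest entry of each
    row that has not been popped yet; A is never materialized.
    """
    if not 1 <= k <= len(xs) * len(ys):
        raise ValueError("k out of range")
    heap = [(xs[0] + y, j, 0) for j, y in enumerate(ys)]
    heapq.heapify(heap)
    for _ in range(k - 1):
        _, j, i = heapq.heappop(heap)
        if i + 1 < len(xs):
            heapq.heappush(heap, (xs[i + 1] + ys[j], j, i + 1))
    return heap[0][0]


xs = [1, 4, 7]
ys = [0, 2, 10]
print(kth_smallest_sum(xs, ys, 4))  # → 6
```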
The quadratic assignment problem is easy for Robinsonian matrices with Toeplitz structure
We present a new polynomially solvable case of the Quadratic Assignment
Problem in Koopmans-Beckmann form, by showing that the identity
permutation is optimal when the two matrices $A$ and $B$ are respectively a Robinson similarity
and dissimilarity matrix and one of $A$ or $B$ is a Toeplitz matrix. A Robinson
(dis)similarity matrix is a symmetric matrix whose entries (increase) decrease
monotonically along rows and columns when moving away from the diagonal, and
such matrices arise in the classical seriation problem. Comment: 15 pages, 2 figures
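The statement can be checked by brute force on a small instance. The sketch below assumes the usual Koopmans-Beckmann objective, the sum over i, j of A[i][j] * B[p(i)][p(j)] minimized over permutations p, and tests the Robinson property on triples i < j < k. The function names and the two 4 x 4 matrices are assumptions chosen for illustration.

```python
from itertools import permutations


def is_symmetric(m):
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))


def is_robinson_similarity(m):
    """Off-diagonal entries do not increase when moving away from the diagonal."""
    n = len(m)
    return is_symmetric(m) and all(
        m[i][k] <= min(m[i][j], m[j][k])
        for i in range(n) for j in range(i + 1, n) for k in range(j + 1, n))


def is_robinson_dissimilarity(m):
    """Off-diagonal entries do not decrease when moving away from the diagonal."""
    n = len(m)
    return is_symmetric(m) and all(
        m[i][k] >= max(m[i][j], m[j][k])
        for i in range(n) for j in range(i + 1, n) for k in range(j + 1, n))


def is_toeplitz(m):
    n = len(m)
    return all(m[i][j] == m[i + 1][j + 1]
               for i in range(n - 1) for j in range(n - 1))


def qap_cost(a, b, perm):
    n = len(a)
    return sum(a[i][j] * b[perm[i]][perm[j]]
               for i in range(n) for j in range(n))


n = 4
a = [[n - abs(i - j) for j in range(n)] for i in range(n)]  # Toeplitz similarity
b = [[0, 1, 3, 7],
     [1, 0, 2, 5],
     [3, 2, 0, 4],
     [7, 5, 4, 0]]                                          # non-Toeplitz dissimilarity

identity = tuple(range(n))
best_cost = min(qap_cost(a, b, p) for p in permutations(range(n)))
print(qap_cost(a, b, identity) == best_cost)  # → True
```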