Approximating the Geometric Edit Distance

Author: Fox Kyle
Li Xinyi
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 30th International Symposium on Algorithms and Computation (ISAAC 2019)
Publication date: 01/01/2019

Edit distance is a measure of similarity between two sequences such as strings, point sequences, or polygonal curves. Many matching problems from a variety of areas, such as signal analysis and bioinformatics, need to be solved in a geometric space, which has motivated the study of the geometric edit distance (GED). In this paper, we describe the first near-linear time algorithm with a strictly sublinear approximation factor for computing the GED of two point sequences in constant dimensional Euclidean space. Specifically, we present a randomized $O(n \log^2 n)$ time $O(\sqrt{n})$-approximation algorithm. Then, we generalize our result to give a randomized $\alpha$-approximation algorithm for any $\alpha \in [1, \sqrt{n}]$, running in time $\tilde{O}(n^2/\alpha^2)$. Both algorithms are Monte Carlo and return approximately optimal solutions with high probability.
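For orientation, the quantity being approximated can be computed exactly by a quadratic-time dynamic program. The sketch below is that baseline, not the paper's randomized algorithm. The abstract does not spell out the cost model, so the code assumes a common one: a matched pair costs its Euclidean distance and each unmatched point costs a unit gap penalty. The function name and the `gap` parameter are illustrative.

```python
import math


def geometric_edit_distance(p, q, gap=1.0):
    """Exact GED of two point sequences in O(len(p) * len(q)) time.

    Assumed cost model: a matched pair costs its Euclidean distance,
    and every unmatched point costs `gap`.
    """
    m = len(q)
    # prev[j] = cost of aligning the first i-1 points of p with the first j of q
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, len(p) + 1):
        cur = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            match = prev[j - 1] + math.dist(p[i - 1], q[j - 1])
            skip_p = prev[j] + gap      # leave p[i-1] unmatched
            skip_q = cur[j - 1] + gap   # leave q[j-1] unmatched
            cur[j] = min(match, skip_p, skip_q)
        prev = cur
    return prev[m]
```

Under this model, two points that are far apart are cheaper to leave unmatched (cost 2) than to match, which is why the distance between `[(0, 0)]` and `[(3, 4)]` comes out as 2 rather than 5.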

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Approximating Edit Distance Within Constant Factor in Truly Sub-Quadratic Time

Author: Chakraborty Diptarka
Das Debarati
Goldenberg Elazar
Koucky Michal
Saks Michael
Publication date: 08/10/2018

Edit distance is a measure of similarity of two strings based on the minimum number of character insertions, deletions, and substitutions required to transform one string into the other. The edit distance can be computed exactly using a dynamic programming algorithm that runs in quadratic time. Andoni, Krauthgamer and Onak (2010) gave a nearly linear time algorithm that approximates edit distance within approximation factor $\text{poly}(\log n)$. In this paper, we provide an algorithm with running time $\tilde{O}(n^{2-2/7})$ that approximates the edit distance within a constant factor.
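The quadratic-time dynamic program mentioned in the abstract is the textbook one, sketched here for reference. This is the exact baseline, not the sub-quadratic approximation algorithm of the paper, and the function name is illustrative.

```python
def edit_distance(a, b):
    """Exact edit distance between strings a and b in O(len(a) * len(b)) time."""
    # prev[j] = edit distance between the processed prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if the characters match
            ))
        prev = cur
    return prev[-1]
```

Only two rows of the table are kept at a time, so the memory use is linear in the length of `b` while the running time stays quadratic.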

arXiv.org e-Print Archive

Copenhagen University Research Information System

Approximating Weighted Duo-Preservation in Comparative Genomics

Author: AR Mushegian
B Brubach
G Cormode
H Jiang
KM Swenson
L Bulteau
M Chrobak
N Boria
NH Mustafa
RC Hardison
S Beretta
TM Chan
W Chen
X Chen
Publication date: 30/08/2017

Motivated by comparative genomics, Chen et al. [9] introduced the Maximum Duo-preservation String Mapping (MDSM) problem in which we are given two strings $s_1$ and $s_2$ from the same alphabet and the goal is to find a mapping $\pi$ between them so as to maximize the number of duos preserved. A duo is any two consecutive characters in a string and it is preserved in the mapping if its two consecutive characters in $s_1$ are mapped to same two consecutive characters in $s_2$. The MDSM problem is known to be NP-hard and there are approximation algorithms for this problem [3, 5, 13], but all of them consider only the "unweighted" version of the problem in the sense that a duo from $s_1$ is preserved by mapping to any same duo in $s_2$ regardless of their positions in the respective strings. However, it is well-desired in comparative genomics to find mappings that consider preserving duos that are "closer" to each other under some distance measure [19]. In this paper, we introduce a generalized version of the problem, called the Maximum-Weight Duo-preservation String Mapping (MWDSM) problem that captures both duos-preservation and duos-distance measures in the sense that mapping a duo from $s_1$ to each preserved duo in $s_2$ has a weight, indicating the "closeness" of the two duos. The objective of the MWDSM problem is to find a mapping so as to maximize the total weight of preserved duos. In this paper, we give a polynomial-time 6-approximation algorithm for this problem. Comment: Appeared in proceedings of the 23rd International Computing and Combinatorics Conference (COCOON 2017)
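To make the objective concrete, the sketch below scores a given mapping; it does not search for an optimal one and is not the paper's 6-approximation algorithm. It assumes the mapping is a character-preserving bijection between positions of two equal-length strings, given as a list `pi` with `pi[i]` the position in `s2` assigned to position `i` of `s1`. The `weight` callable is a hypothetical stand-in for the closeness weight, which the abstract does not define; the default of 1 per duo gives the unweighted MDSM count.

```python
def preserved_duo_weight(s1, s2, pi, weight=lambda i, j: 1):
    """Total weight of duos of s1 preserved by the position mapping pi.

    Assumptions (not from the abstract): pi is a bijection on positions,
    s1[i] == s2[pi[i]] for every i, and weight(i, j) scores mapping the
    duo starting at position i of s1 to the duo starting at position j of s2.
    """
    if len(s1) != len(s2) or sorted(pi) != list(range(len(s2))):
        raise ValueError("pi must be a bijection between positions of s1 and s2")
    if any(s1[i] != s2[j] for i, j in enumerate(pi)):
        raise ValueError("pi must map each character to an equal character")
    total = 0
    for i in range(len(s1) - 1):
        # The duo (i, i+1) is preserved when its two characters land on
        # consecutive positions of s2, in the same order.
        if pi[i + 1] == pi[i] + 1:
            total += weight(i, pi[i])
    return total
```

For `s1 = "abcab"` and `s2 = "ababc"`, the mapping `[2, 3, 4, 0, 1]` preserves three duos, while `[0, 1, 4, 2, 3]` preserves two.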

arXiv.org e-Print Archive

Crossref

Distributed PCP Theorems for Hardness of Approximation in P

Author: Abboud Amir
Rubinstein Aviad
Williams Ryan
Publication date: 01/01/1952

We present a new distributed model of probabilistically checkable proofs (PCP). A satisfying assignment $x \in \{0,1\}^n$ to a CNF formula $\varphi$ is shared between two parties, where Alice knows $x_1, \dots, x_{n/2}$, Bob knows $x_{n/2+1},\dots,x_n$, and both parties know $\varphi$. The goal is to have Alice and Bob jointly write a PCP that $x$ satisfies $\varphi$, while exchanging little or no information. Unfortunately, this model as-is does not allow for nontrivial query complexity. Instead, we focus on a non-deterministic variant, where the players are helped by Merlin, a third party who knows all of $x$. Using our framework, we obtain, for the first time, PCP-like reductions from the Strong Exponential Time Hypothesis (SETH) to approximation problems in P. In particular, under SETH we show that there are no truly-subquadratic approximation algorithms for Bichromatic Maximum Inner Product over {0,1}-vectors, Bichromatic LCS Closest Pair over permutations, Approximate Regular Expression Matching, and Diameter in Product Metric. All our inapproximability factors are nearly-tight. In particular, for the first two problems we obtain nearly-polynomial factors of $2^{(\log n)^{1-o(1)}}$; only $(1+o(1))$-factor lower bounds (under SETH) were known before.
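As a point of reference for the first of these problems, the sketch below is the trivial exact algorithm for Bichromatic Maximum Inner Product, which examines every pair of vectors. The hardness result says that, under SETH, even approximating this value cannot be done in truly subquadratic time. The function name and the input representation (sequences of 0/1 tuples) are assumptions made for illustration.

```python
def bichromatic_max_inner_product(A, B):
    """Largest inner product <a, b> over a in A and b in B, for {0,1}-vectors.

    Brute force over all len(A) * len(B) pairs; both sets must be non-empty.
    """
    return max(
        sum(x & y for x, y in zip(a, b))  # inner product of two 0/1 vectors
        for a in A
        for b in B
    )
```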

arXiv.org e-Print Archive

Biblioteca Virtual del Patrimonio Bibliográfico (Virtual Library of Bibliographical Heritage)

Crossref

Convex Graph Invariant Relaxations For Graph Edit Distance

Author: Candogan Utkan Onur
Chandrasekaran Venkat
Publication date: 17/04/2019

The edit distance between two graphs is a widely used measure of similarity that evaluates the smallest number of vertex and edge deletions/insertions required to transform one graph to another. It is NP-hard to compute in general, and a large number of heuristics have been proposed for approximating this quantity. With few exceptions, these methods generally provide upper bounds on the edit distance between two graphs. In this paper, we propose a new family of computationally tractable convex relaxations for obtaining lower bounds on graph edit distance. These relaxations can be tailored to the structural properties of the particular graphs via convex graph invariants. Specific examples that we highlight in this paper include constraints on the graph spectrum as well as (tractable approximations of) the stability number and the maximum-cut values of graphs. We prove under suitable conditions that our relaxations are tight (i.e., exactly compute the graph edit distance) when one of the graphs consists of few eigenvalues. We also validate the utility of our framework on synthetic problems as well as real applications involving molecular structure comparison problems in chemistry. Comment: 27 pages, 7 figures

arXiv.org e-Print Archive

Caltech Authors

The Traveling Salesman Problem in the Natural Environment

Author: Flip Phillips
Oliver W. Layton
Thomas O'
Publication date: 07/10/2010

Is it possible for humans to navigate in the natural environment wherein the path taken between various destinations is 'optimal' in some way? In the domain of optimization this challenge is traditionally framed as the "Traveling Salesman Problem" (TSP). What strategies and ecological considerations are plausible for human navigation? When given a two-dimensional map-like presentation of the destinations, participants solve this optimization exceptionally well (only 2-3% longer than optimum) [1, 2]. In the following experiments we investigate the effect of effort and its environmental affordance on navigation decisions when humans solve the TSP in the natural environment. Fifteen locations were marked on two outdoor landscapes with flat and varied terrains respectively. Performance in the flat-field condition was excellent (∼6% error) and was worse but still quite good in the variable-terrain condition (∼20% error), suggesting participants do not globally pre-plan routes but rather develop them on the fly. We suggest that perceived effort guides participant solutions due to the dynamic constraints of effortful locomotion and obstacle avoidance.

Nature Precedings