98 research outputs found

Bounded repairability for regular tree languages

Author: Bourhis P.
Puppis G.
Riveros C.
Staworko S.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016

We study the problem of bounded repairability of a given restriction tree language R into a target tree language T. More precisely, we say that R is bounded repairable w.r.t. T if there exists a bound on the number of standard tree editing operations necessary to apply to any tree in R in order to obtain a tree in T. We consider a number of possible specifications for tree languages: bottom-up tree automata (on curry encoding of unranked trees) that capture the class of XML Schemas and DTDs. We also consider a special case when the restriction language R is universal, i.e., contains all trees over a given alphabet. We give an effective characterization of bounded repairability between pairs of tree languages represented with automata. This characterization introduces two tools, synopsis trees and a coverage relation between them, allowing one to reason about tree languages that undergo a bounded number of editing operations. We then employ this characterization to provide upper bounds to the complexity of deciding bounded repairability and we show that these bounds are tight. In particular, when the input tree languages are specified with arbitrary bottom-up automata, the problem is coNEXPTIME-complete. The problem remains coNEXPTIME-complete even if we use deterministic non-recursive DTDs to specify the input languages. The complexity of the problem can be reduced if we assume that the alphabet, the set of node labels, is fixed: the problem becomes PSPACE-complete for non-recursive DTDs and coNP-complete for deterministic non-recursive DTDs. Finally, when the restriction tree language R is universal, we show that the bounded repairability problem becomes EXPTIME-complete if the target language is specified by an arbitrary bottom-up tree automaton and becomes tractable (PTIME-complete, in fact) when a deterministic bottom-up automaton is used.

Archivio istituzionale della ricerca - Università degli Studi di Udine

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Bounded repairability for regular tree languages

Author: Puppis Gabriele
Riveros Cristian
Staworko Slawek
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

We consider the problem of repairing unranked trees (e.g., XML documents) satisfying a given restriction specification R (e.g., a DTD) into unranked trees satisfying a given target specification T. Specifically, we focus on the question of whether one can get from any tree in a regular language R to some tree in another regular language T with a finite, uniformly bounded number of edit operations (i.e., deletions and insertions of nodes). We give effective characterizations of the pairs of specifications R and T for which such a uniform bound exists, and we study the complexity of the problem under different representations of the regular tree languages (e.g., non-deterministic stepwise automata, deterministic stepwise automata, DTDs). Finally, we point out some connections with the analogous problem for regular languages of words.
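The word analogue mentioned at the end of this abstract can be illustrated with a small sketch. It is not the paper's decision procedure: it only measures, on finite samples, how many insertions and deletions are needed to move each sample word of R into a sample of T. The function names and the sample languages (a*, (aa)*, b*) are assumptions chosen for illustration, and a finite sample can suggest, but never prove, that a uniform bound exists.

```python
def indel_distance(u: str, v: str) -> int:
    """Minimal number of letter insertions and deletions turning u into v.

    Computed as len(u) + len(v) - 2 * LCS(u, v), where LCS is the length
    of a longest common subsequence (row-by-row dynamic programming).
    """
    prev = [0] * (len(v) + 1)
    for a in u:
        curr = [0]
        for j, b in enumerate(v, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return len(u) + len(v) - 2 * prev[-1]


def worst_case_repair(sample_r, sample_t) -> int:
    """Largest repair cost over sample_r, each word repaired into its
    nearest word of sample_t. Both samples must be non-empty."""
    return max(min(indel_distance(u, v) for v in sample_t) for u in sample_r)


# Hypothetical finite samples of the word languages a*, (aa)* and b*.
sample_a_star = ["a" * n for n in range(9)]
sample_even_a = ["a" * n for n in range(0, 12, 2)]
sample_b_star = ["b" * n for n in range(12)]

# Stays at 1 however long the words get: delete or insert one "a".
print(worst_case_repair(sample_a_star, sample_even_a))
# Grows with the longest sample word: every "a" has to be deleted.
print(worst_case_repair(sample_a_star, sample_b_star))
```

On these samples the first cost stays constant as the sample grows while the second grows with word length, which is the distinction between bounded and unbounded repairability that the abstract studies for trees.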

HAL - Lille 3

Crossref

Archivio istituzionale della ricerca - Università degli Studi di Udine

INRIA a CCSD electronic archive server

Edinburgh Research Explorer

Bounded Repairability for Regular Tree Languages

Author: Boobna Utsav
Carme Julien
Chen Shan
Cristian Riveros
Emde Boas Peter Van
Gabriele Puppis
Grahne Gösta
Pierre Bourhis
Staworko Slawomir
Sławek Staworko
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref

Which DTDs are streaming bounded repairable?

Author: Cristian Riveros
Gabriele Puppis
Pierre Bourhis
Publication date: 31/03/2020

Integrity constraint management concerns both checking whether data is valid and taking action to restore correctness when invalid data is discovered. In XML the notion of valid data can be captured by schema languages such as Document Type Definitions (DTDs) and more generally XML schemas. DTDs have the property that constraint checking can be done in streaming fashion. In this paper we consider when the corresponding action to restore validity, repair, can be done in streaming fashion. We formalize this as the problem of determining, given a DTD, whether or not a streaming procedure exists that transforms an input document so as to satisfy the DTD, using a number of edits independent of the document. We show that this problem is decidable. In fact, we show the decidability of a more general problem, allowing a more general class of schemas than DTDs, and requiring a repair procedure that works only for documents that are already known to satisfy another class of constraints. The decision procedure relies on a new analysis of the structure of DTDs, reducing to a novel notion of game played on pushdown systems associated with the schemas.

CiteSeerX

Edit Distance for Pushdown Automata

Author: Chatterjee Krishnendu
Henzinger Thomas A.
Ibsen-Jensen Rasmus
Otop Jan
Publication date: 01/01/2017

The edit distance between two words $w_1, w_2$ is the minimal number of word operations (letter insertions, deletions, and substitutions) necessary to transform $w_1$ to $w_2$. The edit distance generalizes to languages $\mathcal{L}_1, \mathcal{L}_2$, where the edit distance from $\mathcal{L}_1$ to $\mathcal{L}_2$ is the minimal number $k$ such that for every word from $\mathcal{L}_1$ there exists a word in $\mathcal{L}_2$ with edit distance at most $k$. We study the edit distance computation problem between pushdown automata and their subclasses. The problem of computing edit distance to a pushdown automaton is undecidable, and in practice, the interesting question is to compute the edit distance from a pushdown automaton (the implementation, a standard model for programs with recursion) to a regular language (the specification). In this work, we present a complete picture of decidability and complexity for the following problems: (1) deciding whether, for a given threshold $k$, the edit distance from a pushdown automaton to a finite automaton is at most $k$, and (2) deciding whether the edit distance from a pushdown automaton to a finite automaton is finite. Comment: An extended version of a paper accepted to ICALP 2015 with the same title. The paper has been accepted to the LMCS journal.
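The two definitions in this abstract can be made concrete with a short sketch. It covers only the easy case of explicitly listed finite languages, not the pushdown-automaton setting the paper analyses, and the function names are assumptions introduced here. The word distance is the standard Levenshtein dynamic program.

```python
def edit_distance(w1: str, w2: str) -> int:
    """Levenshtein distance: minimal number of letter insertions,
    deletions, and substitutions transforming w1 into w2."""
    prev = list(range(len(w2) + 1))  # distances from "" to prefixes of w2
    for i, a in enumerate(w1, start=1):
        curr = [i]  # distance from w1[:i] to ""
        for j, b in enumerate(w2, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # substitute a by b, or keep it
            ))
        prev = curr
    return prev[-1]


def language_edit_distance(l1, l2) -> int:
    """Minimal k such that every word of l1 is within edit distance k of
    some word of l2. Assumes l1 and l2 are finite and l2 is non-empty;
    an empty l1 gives 0."""
    return max((min(edit_distance(u, v) for v in l2) for u in l1), default=0)


print(edit_distance("kitten", "sitting"))             # 3
print(language_edit_distance({"ab", "abc"}, {"abc"}))  # 1
print(language_edit_distance({"abc"}, {"ab", "abc"}))  # 0
```

The last two calls show that the language-level distance is directed: the distance from one language to another need not equal the distance in the reverse direction, which matches the abstract's "from" and "to" wording.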

arXiv.org e-Print Archive

Episciences.org

IST PubRep

IST Austria: PubRep (Institute of Science and Technology)

IST Austria Technical Report

Author: Chatterjee Krishnendu
Henzinger Thomas A
Ibsen-Jensen Rasmus
Otop Jan
Publication venue: IST Austria
Publication date: 01/01/2015

The edit distance between two words w1, w2 is the minimal number of word operations (letter insertions, deletions, and substitutions) necessary to transform w1 to w2. The edit distance generalizes to languages L1, L2, where the edit distance is the minimal number k such that for every word from L1 there exists a word in L2 with edit distance at most k. We study the edit distance computation problem between pushdown automata and their subclasses. The problem of computing edit distance to a pushdown automaton is undecidable, and in practice, the interesting question is to compute the edit distance from a pushdown automaton (the implementation, a standard model for programs with recursion) to a regular language (the specification). In this work, we present a complete picture of decidability and complexity for deciding whether, for a given threshold k, the edit distance from a pushdown automaton to a finite automaton is at most k.

IST Austria: PubRep (Institute of Science and Technology)

SPIDER-WEB enables stable, repairable, and encryptible algorithms under arbitrary local biochemical constraints in DNA-based storage

Author: Lan Zhaojun
Ping Zhi
Shen Yue
Xu Xun
Zhang Haoling
Zhang Wenwei
Zhang Yiwei
Publication date: 10/04/2022

DNA has been considered as a promising medium for storing digital information. Despite the biochemical progress in DNA synthesis and sequencing, novel coding algorithms need to be constructed under the specific constraints in DNA-based storage. Many functional operations and storage carriers were introduced in recent years, bringing in various biochemical constraints including but not confined to long single-nucleotide repeats and abnormal GC content. Existing coding algorithms are not applicable or unstable due to more local biochemical constraints and their combinations. In this paper, we design a graph-based architecture, named SPIDER-WEB, to generate corresponding graph-based algorithms under arbitrary local biochemical constraints. These generated coding algorithms could be used to encode arbitrary digital data as DNA sequences directly or serve as a benchmark for the follow-up construction of coding algorithms. To further consider recovery and security issues existing in the storage field, it also provides pluggable algorithmic patches based on the generated coding algorithms: path-based correcting and mapping shuffling. They provide approaches for probabilistic error correction and symmetric encryption respectively. Comment: 30 pages; 12 figures; 2 tables

arXiv.org e-Print Archive

On Distances Between Words with Parameters

Author: Bourhis Pierre
Boussidan Aaron
Gambette Philippe
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)
Publication date: 01/01/2023

The edit distance between parameterized words is a generalization of the classical edit distance where it is allowed to map particular letters of the first word, called parameters, to parameters of the second word before computing the distance. This problem has been introduced in particular for detection of code duplication, and the notion of words with parameters has also been used with different semantics in other fields. The complexity of several variants of edit distances between parameterized words has been studied; however, the complexity of the most natural one, the Levenshtein distance, remained open. In this paper, we solve this open question and close the exhaustive analysis of all cases of parameterized word matching and function matching, showing that these problems are NP-complete. To this aim, we also provide a comparison of the different problems, exhibiting several equivalences between them. We also provide and implement a MaxSAT encoding of the problem, as well as a simple FPT algorithm in the alphabet size, and study their efficiency on real data in the context of theater play structure comparison.

Dagstuhl Research Online Publication Server

Edit Distance for Pushdown Automata

Author: A Aho
AK Chandra
G Pighizzini
JE Hopcroft
M Benedikt
M Mohri
P Gawrychowski
T Okuda
TA Henzinger
VI Levenshtein
Y Lifshits
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2015

The edit distance between two words w1, w2 is the minimal number of word operations (letter insertions, deletions, and substitutions) necessary to transform w1 to w2. The edit distance generalizes to languages L1, L2, where the edit distance is the minimal number k such that for every word from L1 there exists a word in L2 with edit distance at most k. We study the edit distance computation problem between pushdown automata and their subclasses. The problem of computing edit distance to pushdown automata is undecidable, and in practice, the interesting question is to compute the edit distance from a pushdown automaton (the implementation, a standard model for programs with recursion) to a regular language (the specification). In this work, we present a complete picture of decidability and complexity for deciding whether, for a given threshold k, the edit distance from a pushdown automaton to a finite automaton is at most k.

University of Liverpool Repository

Crossref

IST Austria: PubRep (Institute of Science and Technology)

IST PubRep