Edit Distance for Pushdown Automata

Authors: Krishnendu Chatterjee, Thomas A. Henzinger, Rasmus Ibsen-Jensen, Jan Otop
Publication date: 01/01/2017

The edit distance between two words $w_1, w_2$ is the minimal number of word operations (letter insertions, deletions, and substitutions) necessary to transform $w_1$ into $w_2$. The edit distance generalizes to languages $\mathcal{L}_1, \mathcal{L}_2$, where the edit distance from $\mathcal{L}_1$ to $\mathcal{L}_2$ is the minimal number $k$ such that for every word from $\mathcal{L}_1$ there exists a word in $\mathcal{L}_2$ with edit distance at most $k$. We study the edit distance computation problem between pushdown automata and their subclasses. The problem of computing edit distance to a pushdown automaton is undecidable, and in practice, the interesting question is to compute the edit distance from a pushdown automaton (the implementation, a standard model for programs with recursion) to a regular language (the specification). In this work, we present a complete picture of decidability and complexity for the following problems: (1) deciding whether, for a given threshold $k$, the edit distance from a pushdown automaton to a finite automaton is at most $k$, and (2) deciding whether the edit distance from a pushdown automaton to a finite automaton is finite.

Comment: An extended version of a paper accepted to ICALP 2015 with the same title. The paper has been accepted to the LMCS journa
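To make the two definitions in this abstract concrete, here is a minimal Python sketch. It assumes both languages are given as finite sets of words, which is far simpler than the pushdown and finite automata the paper studies; the function names `edit_distance` and `language_edit_distance` are illustrative and do not come from the paper.

```python
import math


def edit_distance(w1: str, w2: str) -> int:
    """Minimal number of insertions, deletions, and substitutions turning w1 into w2."""
    # prev[j] holds the distance between the current prefix of w1 and w2[:j].
    prev = list(range(len(w2) + 1))
    for i, a in enumerate(w1, start=1):
        curr = [i]  # distance from w1[:i] to the empty prefix of w2
        for j, b in enumerate(w2, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a
                curr[j - 1] + 1,         # insert b
                prev[j - 1] + (a != b),  # substitute a by b (free if equal)
            ))
        prev = curr
    return prev[-1]


def language_edit_distance(l1, l2):
    """Smallest k such that every word of l1 is within distance k of some word of l2.

    Assumption: l1 and l2 are finite collections of words. Returns math.inf
    when l1 is non-empty and l2 is empty.
    """
    return max(
        (min((edit_distance(u, v) for v in l2), default=math.inf) for u in l1),
        default=0,
    )
```

The language-level distance is not symmetric: `language_edit_distance({"ab", "abc"}, {"ab"})` is 1, while swapping the arguments gives 0, which matches the "from implementation to specification" reading in the abstract.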

arXiv.org e-Print Archive

Episciences.org

IST PubRep

IST Austria: PubRep (Institute of Science and Technology)

Streaming Property Testing of Visibly Pushdown Languages

Authors: Michel de Rougemont, Nathanaël François, Frédéric Magniez, Olivier Serre
Publication date: 03/11/2015

In the context of language recognition, we demonstrate the superiority of streaming property testers over streaming algorithms and property testers, when they are not combined. Initiated by Feigenbaum et al., a streaming property tester is a streaming algorithm recognizing a language under the property testing approximation: it must distinguish inputs of the language from those that are $\varepsilon$-far from it, while using the smallest possible memory (rather than limiting its number of input queries). Our main result is a streaming $\varepsilon$-property tester for visibly pushdown languages (VPL) with one-sided error using memory space $\mathrm{poly}((\log n) / \varepsilon)$. This construction relies on a (non-streaming) property tester for weighted regular languages based on a previous tester by Alon et al. We provide a simple application of this tester for streaming testing special cases of instances of VPL that are already hard for both streaming algorithms and property testers. Our main algorithm combines two steps. The first is an original simulation of visibly pushdown automata using a stack of small height whose items may have linear size. In the second step, those items are replaced by small sketches. Those sketches rely on a notion of suffix-sampling we introduce. This sampling is the key idea connecting our streaming tester algorithm to property testers.

Comment: 23 pages. Major modifications in the presentatio
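For readers unfamiliar with visibly pushdown languages, the sketch below shows the exact, non-approximate baseline: a one-pass recognizer for one simple VPL, well-nested brackets, in which call symbols push, return symbols pop, and every other symbol is internal. This is not the paper's tester. It keeps the full stack, so its memory can grow linearly with the input, which is the cost the small-stack simulation and sketches described above are meant to avoid. The alphabet partition `CALLS` and the function name `accepts` are assumptions chosen for illustration.

```python
# Hypothetical alphabet partition: each call symbol maps to its matching return symbol.
CALLS = {"(": ")", "[": "]"}
RETURNS = set(CALLS.values())


def accepts(stream) -> bool:
    """Read the input once, left to right, and decide membership exactly."""
    stack = []
    for symbol in stream:
        if symbol in CALLS:
            # Call symbol: push the return symbol we expect later.
            stack.append(CALLS[symbol])
        elif symbol in RETURNS:
            # Return symbol: it must match the most recent unmatched call.
            if not stack or stack.pop() != symbol:
                return False
        # Any other symbol is internal and leaves the stack untouched.
    # Accept only if every call was matched.
    return not stack
```

Because the stack action depends only on the type of the input symbol, the stack height after each prefix is determined by the input alone, which is the defining feature of visibly pushdown automata.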

arXiv.org e-Print Archive

HAL Descartes

Hal-Diderot

Error-tolerant Finite State Recognition with Applications to Morphological Analysis and Spelling Correction

Author: Kemal Oflazer
Publication date: 21/07/1995

Error-tolerant recognition enables the recognition of strings that deviate mildly from any string in the regular set recognized by the underlying finite state recognizer. Such recognition has applications in error-tolerant morphological processing, spelling correction, and approximate string matching in information retrieval. After a description of the concepts and algorithms involved, we give examples from two applications. In the context of morphological analysis, error-tolerant recognition allows misspelled input word forms to be corrected and morphologically analyzed concurrently. We present an application of this to error-tolerant analysis of agglutinative morphology of Turkish words. The algorithm can be applied to morphological analysis of any language whose morphology is fully captured by a single (and possibly very large) finite state transducer, regardless of the word formation processes and morphographemic phenomena involved. In the context of spelling correction, error-tolerant recognition can be used to enumerate correct candidate forms from a given misspelled string within a certain edit distance. Again, it can be applied to any language with a word list comprising all inflected forms, or whose morphology is fully described by a finite state transducer. We present experimental results for spelling correction for a number of languages. These results indicate that such recognition works very efficiently for candidate generation in spelling correction for many European languages such as English, Dutch, French, German, Italian (and others) with very large word lists of root and inflected forms (some containing well over 200,000 forms), generating all candidate solutions within 10 to 45 milliseconds (with edit distance 1) on a SparcStation 10/41. For spelling correction in Turkish, error-tolerant

Comment: Replaces 9504031. gzipped, uuencoded postscript file. To appear in Computational Linguistics Volume 22 No:1, 1996. Also available as ftp://ftp.cs.bilkent.edu.tr/pub/ko/clpaper9512.ps.
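The candidate-generation idea in this abstract (enumerate all word-list entries within a given edit distance of a misspelled string) can be sketched in Python as follows. This is a common trie-based variant with a cut-off, offered as an assumption-laden illustration rather than the paper's algorithm: the word list is stored as a nested-dictionary trie standing in for the finite state recognizer, only insertions, deletions, and substitutions are counted, and the names `build_trie`, `candidates`, and `END` are hypothetical.

```python
END = ""  # end-of-word marker; never collides with a one-character edge label


def build_trie(words):
    """Store the word list as nested dicts, one edge per letter."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word
    return root


def candidates(trie, query, k):
    """Return sorted (word, distance) pairs for all words within distance k of query."""
    results = []

    def walk(node, row):
        # row[j] is the edit distance between the trie path so far and query[:j].
        if END in node and row[-1] <= k:
            results.append((node[END], row[-1]))
        for ch, child in node.items():
            if ch == END:
                continue
            new_row = [row[0] + 1]
            for j, q in enumerate(query, start=1):
                new_row.append(min(
                    row[j] + 1,             # extra letter on the trie path
                    new_row[j - 1] + 1,     # extra letter in the query
                    row[j - 1] + (q != ch), # match or substitution
                ))
            # Cut-off: every extension of this path is already too far away.
            if min(new_row) <= k:
                walk(child, new_row)

    walk(trie, list(range(len(query) + 1)))
    return sorted(results)
```

For example, with the word list `["cat", "cart", "care", "dog"]`, the call `candidates(build_trie(words), "cat", 1)` yields `cart` at distance 1 and `cat` at distance 0, while `care` (distance 2) is pruned.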

arXiv.org e-Print Archive

CiteSeerX

Bilkent University Institutional Repository