
    Finite Automata for the Sub- and Superword Closure of CFLs: Descriptional and Computational Complexity

    We answer two open questions by (Gruber, Holzer, Kutrib, 2009) on the state complexity of representing sub- or superword closures of context-free grammars (CFGs): (1) We prove a (tight) upper bound of $2^{\mathcal{O}(n)}$ on the size of nondeterministic finite automata (NFAs) representing the subword closure of a CFG of size $n$. (2) We present a family of CFGs for which the minimal deterministic finite automata representing their subword closure match the upper bound of $2^{2^{\mathcal{O}(n)}}$ that follows from (1). Furthermore, we prove that the inequivalence problem for NFAs representing sub- or superword-closed languages is only NP-complete, as opposed to PSPACE-complete for general NFAs. Finally, we extend our results into an approximation method to attack inequivalence problems for CFGs.
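
    The subword closure in question is the downward closure under the scattered-subword (subsequence) order, which is regular for every context-free language. As a rough illustration of the operation itself on a finite sample, not of the paper's NFA construction or its size bounds, the following Python sketch enumerates the closure directly; the function name and sample are ours.

        from itertools import combinations

        def subword_closure(words):
            # Downward (scattered-subword) closure of a finite language: every word
            # obtainable by deleting zero or more letters from some word in the set.
            closure = set()
            for w in words:
                for k in range(len(w) + 1):
                    for positions in combinations(range(len(w)), k):
                        closure.add("".join(w[i] for i in positions))
            return closure

        # The subword closure is always regular, even for non-regular languages;
        # here we close a finite sample drawn from {a^n b^n | n >= 0}.
        print(sorted(subword_closure({"", "ab", "aabb"})))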

    On the Commutative Equivalence of Context-Free Languages

    The problem of the commutative equivalence of context-free and regular languages is studied. In particular, conditions are investigated that ensure that a context-free language of exponential growth is commutatively equivalent to a regular language.
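
    Commutative equivalence of two languages means there is a bijection between them that maps every word to a permutation (anagram) of itself, i.e. the multisets of Parikh vectors coincide. The Python sketch below checks that condition for finite languages only; it illustrates the definition, not the paper's results on context-free and regular languages, and its names are invented.

        from collections import Counter

        def parikh(word, alphabet):
            # Parikh vector: number of occurrences of each letter of the alphabet.
            return tuple(word.count(a) for a in alphabet)

        def commutatively_equivalent(lang1, lang2, alphabet):
            # Finite languages are commutatively equivalent iff some bijection maps
            # each word to an anagram of itself, i.e. the Parikh-vector multisets match.
            return (Counter(parikh(w, alphabet) for w in lang1)
                    == Counter(parikh(w, alphabet) for w in lang2))

        alphabet = ("a", "b")
        print(commutatively_equivalent({"ab", "aabb"}, {"ba", "abab"}, alphabet))  # True
        print(commutatively_equivalent({"ab"}, {"aa"}, alphabet))                  # False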

    Efficient Normal-Form Parsing for Combinatory Categorial Grammar

    Under categorial grammars that have powerful rules like composition, a simple n-word sentence can have exponentially many parses. Generating all parses is inefficient and obscures whatever true semantic ambiguities are in the input. This paper addresses the problem for a fairly general form of Combinatory Categorial Grammar, by means of an efficient, correct, and easy-to-implement normal-form parsing technique. The parser is proved to find exactly one parse in each semantic equivalence class of allowable parses; that is, spurious ambiguity (as carefully defined) is shown to be both safely and completely eliminated.
    Comment: 8 pages, LaTeX packaged with three .sty files, also uses cgloss4e.st
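
    The spurious ambiguity at issue arises from combinators like forward composition, which let a chain of composable categories be bracketed in Catalan-many ways while yielding the same semantics. The toy Python counter below makes that explosion visible; it implements only forward application and forward composition over invented category names and is not the normal-form parser of the paper.

        def combine(x, y):
            # Forward application  X/Y  Y   => X
            # Forward composition  X/Y  Y/Z => X/Z   (simple slash categories only)
            results = []
            if "/" in x:
                res, arg = x.split("/", 1)
                if y == arg:
                    results.append(res)
                if "/" in y:
                    y_res, y_arg = y.split("/", 1)
                    if y_res == arg:
                        results.append(res + "/" + y_arg)
            return results

        def count_parses(cats):
            # CKY-style chart that counts binary derivations over each span.
            n = len(cats)
            chart = [[{} for _ in range(n + 1)] for _ in range(n)]
            for i, c in enumerate(cats):
                chart[i][i + 1][c] = 1
            for span in range(2, n + 1):
                for i in range(n - span + 1):
                    j = i + span
                    for k in range(i + 1, j):
                        for lc, ln in chart[i][k].items():
                            for rc, rn in chart[k][j].items():
                                for res in combine(lc, rc):
                                    chart[i][j][res] = chart[i][j].get(res, 0) + ln * rn
            return chart[0][n]

        # Four words whose categories compose in a chain: five (Catalan(3)) distinct
        # derivations, all with the same meaning -- the spurious ambiguity in question.
        print(count_parses(["A/B", "B/C", "C/D", "D"]))   # {'A': 5}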

    Data-Oriented Language Processing. An Overview

    During the last few years, a new approach to language processing has started to emerge, which has become known under various labels such as "data-oriented parsing", "corpus-based interpretation", and "tree-bank grammar" (cf. van den Berg et al. 1994; Bod 1992-96; Bod et al. 1996a/b; Bonnema 1996; Charniak 1996a/b; Goodman 1996; Kaplan 1996; Rajman 1995a/b; Scha 1990-92; Sekine & Grishman 1995; Sima'an et al. 1994; Sima'an 1995-96; Tugwell 1995). This approach, which we will call "data-oriented processing" or "DOP", embodies the assumption that human language perception and production works with representations of concrete past language experiences, rather than with abstract linguistic rules. The models that instantiate this approach therefore maintain large corpora of linguistic representations of previously occurring utterances. When processing a new input utterance, analyses of this utterance are constructed by combining fragments from the corpus; the occurrence-frequencies of the fragments are used to estimate which analysis is the most probable one. In this paper we give an in-depth discussion of a data-oriented processing model which employs a corpus of labelled phrase-structure trees. Then we review some other models that instantiate the DOP approach. Many of these models also employ labelled phrase-structure trees, but use different criteria for extracting fragments from the corpus or employ different disambiguation strategies (Bod 1996b; Charniak 1996a/b; Goodman 1996; Rajman 1995a/b; Sekine & Grishman 1995; Sima'an 1995-96); other models use richer formalisms for their corpus annotations (van den Berg et al. 1994; Bod et al., 1996a/b; Bonnema 1996; Kaplan 1996; Tugwell 1995).
    Comment: 34 pages, Postscrip
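
    A drastically reduced sketch of the frequency idea: if only depth-one fragments are extracted, DOP collapses to a PCFG read off the treebank, and the probability of an analysis is the product of the relative frequencies of its fragments. The toy treebank, tree encoding, and function names below are invented for illustration and do not reproduce the full DOP model discussed in the paper.

        from collections import Counter

        # Toy treebank: trees as (label, child, ...) tuples; leaves are plain strings.
        treebank = [
            ("S", ("NP", "she"), ("VP", ("V", "saw"), ("NP", "stars"))),
            ("S", ("NP", "stars"), ("VP", ("V", "shine"))),
        ]

        def fragments(tree):
            # Depth-one fragments only, i.e. the CFG rules occurring in the tree.
            label, *children = tree
            yield label, tuple(c[0] if isinstance(c, tuple) else c for c in children)
            for c in children:
                if isinstance(c, tuple):
                    yield from fragments(c)

        counts = Counter(f for t in treebank for f in fragments(t))
        root_totals = Counter()
        for (root, _), n in counts.items():
            root_totals[root] += n

        def probability(tree):
            # Relative-frequency score of an analysis: product of fragment probabilities.
            p = 1.0
            for root, body in fragments(tree):
                p *= counts[(root, body)] / root_totals[root]
            return p

        print(probability(("S", ("NP", "she"), ("VP", ("V", "shine")))))   # 1/12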