367 research outputs found
An Alternative Conception of Tree-Adjoining Derivation
The precise formulation of derivation for tree-adjoining grammars has
important ramifications for a wide variety of uses of the formalism, from
syntactic analysis to semantic interpretation and statistical language
modeling. We argue that the definition of tree-adjoining derivation must be
reformulated in order to manifest the proper linguistic dependencies in
derivations. The particular proposal is both precisely characterizable through
a definition of TAG derivations as equivalence classes of ordered derivation
trees, and computationally operational, by virtue of a compilation to linear
indexed grammars together with an efficient algorithm for recognition and
parsing according to the compiled grammar. Comment: 33 pages
An automata characterisation for multiple context-free languages
We introduce tree stack automata as a new class of automata with storage and
identify a restricted form of tree stack automata that recognises exactly the
multiple context-free languages. Comment: This is an extended version of a paper with the same title accepted
at the 20th International Conference on Developments in Language Theory (DLT
2016)
Multiple Context-Free Tree Grammars: Lexicalization and Characterization
Multiple (simple) context-free tree grammars are investigated, where "simple"
means "linear and nondeleting". Every multiple context-free tree grammar that
is finitely ambiguous can be lexicalized; i.e., it can be transformed into an
equivalent one (generating the same tree language) in which each rule of the
grammar contains a lexical symbol. Due to this transformation, the rank of the
nonterminals increases at most by 1, and the multiplicity (or fan-out) of the
grammar increases at most by the maximal rank of the lexical symbols; in
particular, the multiplicity does not increase when all lexical symbols have
rank 0. Multiple context-free tree grammars have the same tree generating power
as multi-component tree adjoining grammars (provided the latter can use a
root-marker). Moreover, every multi-component tree adjoining grammar that is
finitely ambiguous can be lexicalized. Multiple context-free tree grammars have
the same string generating power as multiple context-free (string) grammars and
polynomial time parsing algorithms. A tree language can be generated by a
multiple context-free tree grammar if and only if it is the image of a regular
tree language under a deterministic finite-copying macro tree transducer.
Multiple context-free tree grammars can be used as a synchronous translation
device. Comment: 78 pages, 13 figures
A prototype system for machine translation from English to South African Sign Language using synchronous tree adjoining grammars
Thesis (MSc)--University of Stellenbosch, 2007. ENGLISH ABSTRACT: Machine translation, especially machine translation for sign languages, remains an active research
area. Sign language machine translation presents unique challenges to the whole machine translation
process. In this thesis a prototype machine translation system is presented. This system is
designed to translate English text into a gloss-based representation of South African Sign Language
(SASL).
In order to perform the machine translation, a transfer-based approach was taken. English
text is parsed into an intermediate representation. Translation rules are then applied to this
intermediate representation to transform it into an equivalent intermediate representation for the
SASL glosses. For both these intermediate representations, a tree adjoining grammar (TAG)
formalism is used. As part of the prototype machine translation system, a TAG parser was
implemented.
The translation rules used by the system were derived from a SASL phrase book. This phrase
book was also used to create a small gloss-based SASL TAG grammar. Lastly, some additional
tools, for the editing of TAG trees, were also added to the prototype system. AFRIKAANS SUMMARY (translated): Machine translation, especially machine translation for sign languages, remains an active research area. Machine translation
for sign languages poses unique challenges to the whole machine translation process. In this thesis
we present a prototype machine translation system. This system is designed to translate English text
into a gloss-based representation of South African Sign Language (SASL).
Our translation system uses a transfer approach to machine translation. English
text is parsed into an intermediate form. Translation rules are applied to this intermediate
form to transform it into an equivalent intermediate form for the SASL glosses. Tree adjoining
grammars (TAGs) are used for both of these intermediate forms. As part of the
prototype machine translation system, a TAG parser was implemented.
The translation rules used by the system were derived from a SASL phrase book. This
phrase book was also used to develop a small TAG for SASL glosses. Lastly, additional
utilities for editing TAG trees were developed.
Parsing for agile modeling
Agile modeling refers to a set of methods that allow for a quick initial development of an importer and its further refinement. These requirements are not met simultaneously by current parsing technology. Problems with parsing became a bottleneck in our research on agile modeling.
In this thesis we introduce a novel approach to specifying and building parsers. Our approach allows for expressive, tolerant and composable parsers without sacrificing performance. The approach is based on a context-sensitive extension of parsing expression grammars that allows a grammar engineer to specify complex language restrictions. To ensure high parsing performance we automatically analyze a grammar definition and choose different parsing strategies for different parts of the grammar.
We show that context-sensitive parsing expression grammars allow for highly composable, tolerant and variable-grained parsers that can be easily refined. Choosing different parsing strategies ensures high parser performance without sacrificing the expressiveness of the underlying grammars.
Tree-Adjoining Grammars and Lexicalized Grammars
In this paper, we will describe a tree generating system called tree-adjoining grammar (TAG) and state some of the recent results about TAGs. The work on TAGs is motivated by linguistic considerations. However, a number of formal results have been established for TAGs, which, we believe, would be of interest to researchers in tree grammars and tree automata. After giving a short introduction to TAG, we briefly state these results concerning both the properties of the string sets and tree sets (Section 2). We will also describe the notion of lexicalization of grammars (Section 3) and investigate the relationship of lexicalization to context-free grammars (CFGs) and TAGs (Section 4).
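As a rough illustration of the adjoining operation that gives TAG its name, the following is a minimal Python sketch and is not taken from the paper. The `Node` class, the `adjoin` and `yield_of` helpers, and the example trees are all assumptions made here for illustration. An auxiliary tree whose root and foot node carry the same label is spliced into a tree at a node with that label, and the excised subtree is re-attached under the foot node.

```python
from __future__ import annotations

import copy
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node; is_foot marks the foot node of an auxiliary tree."""
    label: str
    children: list[Node] = field(default_factory=list)
    is_foot: bool = False


def find_foot(node: Node) -> Node | None:
    """Return the first foot node found in a depth-first search, or None."""
    if node.is_foot:
        return node
    for child in node.children:
        found = find_foot(child)
        if found is not None:
            return found
    return None


def adjoin(tree: Node, address: list[int], aux: Node) -> Node:
    """Adjoin aux at the node reached by following child indices in address.

    Works on copies, so neither input tree is modified.
    """
    tree, aux = copy.deepcopy(tree), copy.deepcopy(aux)
    target = tree
    for index in address:
        target = target.children[index]
    foot = find_foot(aux)
    if foot is None or aux.label != target.label or foot.label != target.label:
        raise ValueError("root and foot of aux must match the target label")
    # The subtree below the target moves under the foot node ...
    foot.children = target.children
    foot.is_foot = False
    # ... and the auxiliary tree takes the target's place.
    target.children = aux.children
    return tree


def yield_of(node: Node) -> list[str]:
    """Collect leaf labels from left to right."""
    if not node.children:
        return [node.label]
    return [leaf for child in node.children for leaf in yield_of(child)]


# Illustrative trees (assumed here): an initial tree and an adverb auxiliary tree.
initial = Node("S", [
    Node("NP", [Node("Harry")]),
    Node("VP", [Node("V", [Node("likes")]), Node("NP", [Node("peanuts")])]),
])
auxiliary = Node("VP", [Node("Adv", [Node("really")]), Node("VP", is_foot=True)])

derived = adjoin(initial, [1], auxiliary)
print(" ".join(yield_of(derived)))
```

Here the address `[1]` selects the VP node of the initial tree. The sketch omits substitution, adjoining constraints and derivation trees, all of which a full TAG implementation would need.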