2,534 research outputs found
Structure preserving transformations on non-left-recursive grammars
We will be concerned with grammar covers, The first part of this paper presents a general framework for covers. The second part introduces a transformation from nonleft-recursive grammars to grammars in Greibach normal form. An investigation of the structure preserving properties of this transformation, which serves also as an illustration of our framework for covers, is presented
Interpretation and reduction of attribute grammars
An attribute grammar (AG) is in reduced form if in all its derivation trees every attribute contributes to the translation. We prove that, eventhough AG are generally not in reduced form, they can be reduced, i.e., put into reduced form, without modifying their translations. This is shown first for noncircular AG and then for arbitrary AG. In both cases the reduction consists of easy (almost syntactic) transformations which do not change the semantic domain of the AG. These easy transformations are formalized by introducing the notion of AG interpretation as an extension to AG of the concept of context-free grammar form. Finally we prove that any general algorithm for reducing even the simple class of L-AG needs exponential time (in the size of the input AG) infinitely often
Corpora and evaluation tools for multilingual named entity grammar development
We present an effort for the development of multilingual named entity grammars in a unification-based finite-state formalism (SProUT). Following an extended version of the MUC7 standard, we have developed Named Entity Recognition grammars for German, Chinese, Japanese, French, Spanish, English, and Czech. The grammars recognize person names, organizations, geographical locations, currency, time and date expressions. Subgrammars and gazetteers are shared as much as possible for the grammars of the different languages. Multilingual corpora from the business domain are used for grammar development and evaluation. The annotation format (named entity and other linguistic information) is described. We present an evaluation tool which provides detailed statistics and diagnostics, allows for partial matching of annotations, and supports user-defined mappings between different annotation and grammar output formats
Words by convention
Existing metasemantic projects presuppose that word- (or sentence-) types are part of the non-semantic base. We propose a new strategy: an endogenous account of word types, that is, one where word types are fixed as part of the metasemantics. On this view, it is the conventions of truthfulness and trust that ground not only the meaning of the words (meaning by convention) but also what the word type is of each particular token utterance (words by convention). The same treatment extends to identifying the populations through which the conventions prevail. We consider whether this proposal leads to new underdetermination challenges for metasemantics, and make a case that it does not
An Abstract Machine for Unification Grammars
This work describes the design and implementation of an abstract machine,
Amalia, for the linguistic formalism ALE, which is based on typed feature
structures. This formalism is one of the most widely accepted in computational
linguistics and has been used for designing grammars in various linguistic
theories, most notably HPSG. Amalia is composed of data structures and a set of
instructions, augmented by a compiler from the grammatical formalism to the
abstract instructions, and a (portable) interpreter of the abstract
instructions. The effect of each instruction is defined using a low-level
language that can be executed on ordinary hardware.
The advantages of the abstract machine approach are twofold. From a
theoretical point of view, the abstract machine gives a well-defined
operational semantics to the grammatical formalism. This ensures that grammars
specified using our system are endowed with well defined meaning. It enables,
for example, to formally verify the correctness of a compiler for HPSG, given
an independent definition. From a practical point of view, Amalia is the first
system that employs a direct compilation scheme for unification grammars that
are based on typed feature structures. The use of amalia results in a much
improved performance over existing systems.
In order to test the machine on a realistic application, we have developed a
small-scale, HPSG-based grammar for a fragment of the Hebrew language, using
Amalia as the development platform. This is the first application of HPSG to a
Semitic language.Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks,
pst-node, psfig, fullname and a macros fil
- …