A Logic-based Approach for Recognizing Textual Entailment Supported by Ontological Background Knowledge
We present the architecture and the evaluation of a new system for
recognizing textual entailment (RTE). In RTE the task is to automatically
identify the type of logical relation between two input texts; in
particular, we are interested in proving the existence of an entailment
between them. We conceive our system as a modular environment allowing for
high-coverage syntactic and semantic text analysis combined with logical
inference. For the syntactic and semantic analysis we combine a deep
semantic analysis with a shallow one supported by statistical models in
order to increase the quality and accuracy of the results. For RTE we use
first-order logical inference, employing model-theoretic techniques and
automated reasoning tools. The inference is supported by problem-relevant
background knowledge extracted automatically and on demand from external
sources such as WordNet, YAGO, and OpenCyc, or from other, more
experimental sources, e.g., manually defined presupposition resolutions or
axiomatized general and common-sense knowledge. The results show that
fine-grained and consistent knowledge from diverse sources is a necessary
condition for the correctness and traceability of results.
Comment: 25 pages, 10 figures
Robust Grammatical Analysis for Spoken Dialogue Systems
We argue that grammatical analysis is a viable alternative to concept
spotting for processing spoken input in a practical spoken dialogue system. We
discuss the structure of the grammar, and a model for robust parsing which
combines linguistic and statistical sources of information. We discuss test
results suggesting that grammatical processing allows fast and accurate
processing of spoken input.
Comment: Accepted for JNL
An Abstract Machine for Unification Grammars
This work describes the design and implementation of an abstract machine,
Amalia, for the linguistic formalism ALE, which is based on typed feature
structures. This formalism is one of the most widely accepted in computational
linguistics and has been used for designing grammars in various linguistic
theories, most notably HPSG. Amalia is composed of data structures and a set of
instructions, augmented by a compiler from the grammatical formalism to the
abstract instructions, and a (portable) interpreter of the abstract
instructions. The effect of each instruction is defined using a low-level
language that can be executed on ordinary hardware.
The advantages of the abstract machine approach are twofold. From a
theoretical point of view, the abstract machine gives a well-defined
operational semantics to the grammatical formalism. This ensures that grammars
specified using our system are endowed with a well-defined meaning. It makes
it possible, for example, to formally verify the correctness of a compiler
for HPSG, given an independent definition. From a practical point of view,
Amalia is the first system that employs a direct compilation scheme for
unification grammars based on typed feature structures. Using Amalia results
in much improved performance over existing systems.
In order to test the machine on a realistic application, we have developed a
small-scale, HPSG-based grammar for a fragment of the Hebrew language, using
Amalia as the development platform. This is the first application of HPSG to a
Semitic language.
Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks,
pst-node, psfig, fullname and a macros file
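The central operation Amalia compiles is unification of typed feature structures; as a minimal sketch of that operation alone, the toy below unifies untyped feature structures encoded as nested dicts (no type hierarchy, no reentrancy, no compilation — all of which the real machine handles).

```python
# Toy unification of feature structures as nested dicts.
# Atomic values clash unless equal; dict features are unified recursively.

def unify(fs1, fs2):
    """Return the unification of two feature structures, or None on clash."""
    if not isinstance(fs1, dict) or not isinstance(fs2, dict):
        return fs1 if fs1 == fs2 else None  # atomic value clash
    result = dict(fs1)
    for feat, val in fs2.items():
        if feat in result:
            sub = unify(result[feat], val)
            if sub is None:
                return None
            result[feat] = sub
        else:
            result[feat] = val
    return result

subj = {"cat": "np", "agr": {"num": "sg"}}
constraint = {"agr": {"per": "3"}}
print(unify(subj, constraint))  # {'cat': 'np', 'agr': {'num': 'sg', 'per': '3'}}
```

An abstract machine gains its speed precisely by compiling such recursive traversals into flat instruction sequences specialized to the grammar, rather than interpreting the structures at runtime as this sketch does.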
A Labelled Analytic Theorem Proving Environment for Categorial Grammar
We present a system for the investigation of computational properties of
categorial grammar parsing based on a labelled analytic tableaux theorem
prover. This proof method allows us to take a modular approach, in which the
basic grammar can be kept constant, while a range of categorial calculi can be
captured by assigning different properties to the labelling algebra. The
theorem proving strategy is particularly well suited to the treatment of
categorial grammar, because it allows us to distribute the computational cost
between the algorithm which deals with the grammatical types and the algebraic
checker which constrains the derivation.
Comment: 11 pages, LaTeX2e, uses examples.sty and a4wide.sty
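The system itself proves derivability with a labelled analytic tableaux prover whose labelling algebra selects the calculus; as a much simpler illustration of what is being decided, the sketch below checks categorial derivability in the plain AB calculus (forward and backward application only, no labels) using a CKY-style chart. The category encoding as tuples is an assumption for this sketch.

```python
# Categories: atoms are strings ("np", "s"); complex categories are tuples
# ("/", result, argument) for X/Y and ("\\", result, argument) for X\Y.

def combine(left, right):
    """All categories derivable from the adjacent pair (left, right)."""
    out = []
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        out.append(left[1])   # forward application:  X/Y , Y  =>  X
    if isinstance(right, tuple) and right[0] == "\\" and right[2] == left:
        out.append(right[1])  # backward application: Y , X\Y  =>  X
    return out

def derives(cats, goal):
    """CKY check: can the category sequence reduce to goal?"""
    n = len(cats)
    chart = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, c in enumerate(cats):
        chart[i][i + 1].add(c)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for k in range(i + 1, i + span):
                for l in chart[i][k]:
                    for r in chart[k][i + span]:
                        chart[i][i + span].update(combine(l, r))
    return goal in chart[0][n]

iv = ("\\", "s", "np")          # intransitive verb: s\np
print(derives(["np", iv], "s"))  # True: "John sleeps"
```

Stronger calculi in the Lambek family add rules such as composition and type raising; in the labelled tableaux setting those differences are carried by the labelling algebra rather than by new combination rules, which is what keeps the basic grammar constant across calculi.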
From UBGs to CFGs: A practical corpus-driven approach
We present a simple and intuitive, unsound, corpus-driven approximation method for turning unification-based grammars (UBGs), such as HPSG, CLE, or PATR-II, into context-free grammars (CFGs). The method is unsound in that it does not generate a CFG whose language is a true superset of the language accepted by the original unification-based grammar. It is corpus-driven in that it relies on a corpus of parsed sentences and generates broader CFGs when given more input samples. Our open approach can be fine-tuned in different directions, allowing us to monotonically come close to the original parse trees by shifting more information into the context-free symbols. The approach has been fully implemented in Java. This report updates and extends the paper presented at the International Colloquium on Grammatical Inference (ICGI 2004) and presents further measurements.
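The corpus-driven core of such a method can be sketched as reading one context-free production off every local tree in the parsed corpus; more trees yield more productions, hence a broader CFG. The `(label, children...)` tuple encoding below is a hypothetical simplification, not the paper's representation, and it omits the key refinement step of shifting feature information into the symbols.

```python
# Read off CFG productions from a corpus of parse trees.
# A tree is (label, child1, child2, ...); a leaf has no children.

def extract_rules(tree, rules):
    """Add one CFG production per local tree, recursively."""
    label, *children = tree
    if not children:
        return  # leaf: no production
    rhs = tuple(child[0] for child in children)
    rules.setdefault(label, set()).add(rhs)
    for child in children:
        extract_rules(child, rules)

corpus = [
    ("S", ("NP", ("Det",), ("N",)), ("VP", ("V",))),
    ("S", ("NP", ("N",)), ("VP", ("V",), ("NP", ("N",)))),
]
grammar = {}
for tree in corpus:
    extract_rules(tree, grammar)
print(sorted(grammar["NP"]))  # [('Det', 'N'), ('N',)]
```

The resulting CFG overgenerates relative to each individual tree (any extracted NP rule can now occur in any NP position), which is why enriching the symbols with selected feature values is needed to approach the original parse trees.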
Incremental syntax generation with tree adjoining grammars
With the increasing capacity of AI systems, the design of human-computer interfaces has become a favorite research topic in AI. In this paper we focus on aspects of the output of a computer. The architecture of a sentence generation component, embedded in the WIP system, is described. The main emphasis lies on the motivation for the incremental style of processing and on the encoding of adequate linguistic units as rules of a Lexicalized Tree Adjoining Grammar with Unification.
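The TAG operation that makes incremental extension of an already-built sentence structure possible is adjunction. As a minimal sketch, the toy below splices an auxiliary tree into an initial tree; the `(label, children...)` tuple encoding and the `"VP*"` foot-node marker are assumptions for this illustration, and real lexicalized TAGs additionally carry feature structures and adjoining constraints.

```python
# Toy TAG adjunction: splice auxiliary tree `aux` in at each node of
# `tree` labelled `target`; the excised subtree replaces the foot node
# of `aux` (marked by a label ending in "*").

def adjoin(tree, aux, target):
    label, *children = tree
    if label == target:
        return replace_foot(aux, (label, *children))
    return (label, *[adjoin(c, aux, target) for c in children])

def replace_foot(aux, subtree):
    label, *children = aux
    if label.endswith("*"):  # foot node: splice the excised subtree here
        return subtree
    return (label, *[replace_foot(c, subtree) for c in children])

initial = ("S", ("NP", ("john",)), ("VP", ("sleeps",)))
aux = ("VP", ("Adv", ("often",)), ("VP*",))  # auxiliary tree for an adverb
print(adjoin(initial, aux, "VP"))
```

Adjunction is what suits TAG to incremental generation: new material arriving from the planner (here the adverb) can be worked into the middle of a tree that has already been committed to, without rebuilding it.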
Approximate text generation from non-hierarchical representations in a declarative framework
This thesis is on Natural Language Generation. It describes a linguistic realisation
system that translates the semantic information encoded in a conceptual graph into an
English language sentence. The use of a non-hierarchically structured semantic representation (conceptual graphs) and an approximate matching between semantic structures allows us to investigate a more general version of the sentence generation problem
where one is not pre-committed to a choice of the syntactically prominent elements in
the initial semantics. We show clearly how the semantic structure is declaratively related to a linguistically motivated syntactic representation: we use D-Tree Grammars,
which stem from work on Tree-Adjoining Grammars. The declarative specification of
the mapping between semantics and syntax allows different processing strategies
to be exploited. A number of generation strategies have been considered: a pure top-down strategy and a chart-based generation technique which allows partially successful
computations to be reused in other branches of the search space. Having a generator
with increased paraphrasing power as a consequence of using non-hierarchical input
and approximate matching raises the issue of whether certain 'better' paraphrases can be
generated before others. We investigate preference-based processing in the context of
generation.
Juri Apresjan and the Development of Semantics and Lexicography
The major aim of this article is to highlight Juri Apresjan's impact on the development of linguistic semantics and theoretical lexicography. In order to achieve this goal, a number of issues of paramount importance, which have always been in the focus of attention in Apresjan's publications, have to be discussed: (a) the notion of the "naïve picture of the world", i.e. language-specific folk categorization encoded in the lexical and grammatical semantics of a particular language, as opposed to the supposedly universal and language-independent system of scientific concepts; (b) basic properties of the formal metalanguage of semantic description, its explanatory power and applicability in dictionary-making; and (c) representation of synonymy in a bilingual and a monolingual dictionary of synonyms designed within the framework of systematic lexicography. In addition, considerable attention is given to two basic categories of systematic lexicography, the "lexicographic portrait" and the "lexicographic type", as well as the zonal structure of dictionary articles.
Keywords: bilingual dictionary, commonsense (everyday) knowledge, definition, dictionary of synonyms, expert knowledge, integrated lexicographic description, lexicographic portrait, lexicographic type, metalanguage, naïve picture of the world, scientific picture of the world, synonym series, systematic lexicography, translation dictionary, zonal structure (of a dictionary entry)
Incremental syntactic generation of natural language with tree adjoining grammars
This document combines the basic ideas of my master's thesis, which was developed within the WIP project, with new results from my work as a member of WIP, as far as they concern the integration and further development of the implemented system. ISGT (German: 'Inkrementeller Syntaktischer Generierer natürlicher Sprache mit TAGs') is a syntactic component for a text generation system and is based on Tree Adjoining Grammars. It is lexically guided and consists of two levels of syntactic processing: a component that computes the hierarchical structure of the sentence under construction (hierarchical level) and a component that computes the word positions and utters the sentence (positional level). The central aim of this work has been to design a syntactic generator that computes sentences in an incremental fashion. The realization of the incremental syntactic generator has been supported by a distributed parallel model that is used to speed up the computation of single parts of the sentence.