1,135 research outputs found

    A Logic-based Approach for Recognizing Textual Entailment Supported by Ontological Background Knowledge

    We present the architecture and the evaluation of a new system for recognizing textual entailment (RTE). In RTE we want to automatically identify the type of logical relation between two input texts. In particular, we are interested in proving the existence of an entailment between them. We conceive our system as a modular environment allowing for a high-coverage syntactic and semantic text analysis combined with logical inference. For the syntactic and semantic analysis we combine a deep semantic analysis with a shallow one supported by statistical models in order to increase the quality and the accuracy of results. For RTE we use first-order logical inference employing model-theoretic techniques and automated reasoning tools. The inference is supported with problem-relevant background knowledge extracted automatically and on demand from external sources such as WordNet, YAGO, and OpenCyc, or from other, more experimental sources with, e.g., manually defined presupposition resolutions, or with axiomatized general and common sense knowledge. The results show that fine-grained and consistent knowledge coming from diverse sources is a necessary condition determining the correctness and traceability of results. Comment: 25 pages, 10 figures

    Robust Grammatical Analysis for Spoken Dialogue Systems

    We argue that grammatical analysis is a viable alternative to concept spotting for processing spoken input in a practical spoken dialogue system. We discuss the structure of the grammar, and a model for robust parsing which combines linguistic and statistical sources of information. We discuss test results suggesting that grammatical processing allows fast and accurate processing of spoken input. Comment: Accepted for JNL

    An Abstract Machine for Unification Grammars

    This work describes the design and implementation of an abstract machine, Amalia, for the linguistic formalism ALE, which is based on typed feature structures. This formalism is one of the most widely accepted in computational linguistics and has been used for designing grammars in various linguistic theories, most notably HPSG. Amalia is composed of data structures and a set of instructions, augmented by a compiler from the grammatical formalism to the abstract instructions, and a (portable) interpreter of the abstract instructions. The effect of each instruction is defined using a low-level language that can be executed on ordinary hardware. The advantages of the abstract machine approach are twofold. From a theoretical point of view, the abstract machine gives a well-defined operational semantics to the grammatical formalism. This ensures that grammars specified using our system are endowed with a well-defined meaning. It makes it possible, for example, to formally verify the correctness of a compiler for HPSG, given an independent definition. From a practical point of view, Amalia is the first system that employs a direct compilation scheme for unification grammars that are based on typed feature structures. The use of Amalia results in much improved performance over existing systems. In order to test the machine on a realistic application, we have developed a small-scale, HPSG-based grammar for a fragment of the Hebrew language, using Amalia as the development platform. This is the first application of HPSG to a Semitic language. Comment: Doctoral Thesis, 96 pages, many postscript figures, uses pstricks, pst-node, psfig, fullname and a macros file

    A Labelled Analytic Theorem Proving Environment for Categorial Grammar

    We present a system for the investigation of computational properties of categorial grammar parsing based on a labelled analytic tableaux theorem prover. This proof method allows us to take a modular approach, in which the basic grammar can be kept constant, while a range of categorial calculi can be captured by assigning different properties to the labelling algebra. The theorem proving strategy is particularly well suited to the treatment of categorial grammar, because it allows us to distribute the computational cost between the algorithm which deals with the grammatical types and the algebraic checker which constrains the derivation. Comment: 11 pages, LaTeX2e, uses examples.sty and a4wide.st

    From UBGs to CFGs: A practical corpus-driven approach

    We present a simple and intuitive unsound corpus-driven approximation method for turning unification-based grammars (UBGs), such as HPSG, CLE, or PATR-II, into context-free grammars (CFGs). The method is unsound in that it does not generate a CFG whose language is a true superset of the language accepted by the original unification-based grammar. It is a corpus-driven method in that it relies on a corpus of parsed sentences and generates broader CFGs when given more input samples. Our open approach can be fine-tuned in different directions, allowing us to monotonically approach the original parse trees by shifting more information into the context-free symbols. The approach has been fully implemented in Java. This report updates and extends the paper presented at the International Colloquium on Grammatical Inference (ICGI 2004) and presents further measurements.

    Incremental syntax generation with tree adjoining grammars

    With the increasing capacity of AI systems, the design of human-computer interfaces has become a favorite research topic in AI. In this paper we focus on aspects of the output of a computer. The architecture of a sentence generation component, embedded in the WIP system, is described. The main emphasis is laid on the motivation for the incremental style of processing and the encoding of adequate linguistic units as rules of a Lexicalized Tree Adjoining Grammar with Unification.

    Approximate text generation from non-hierarchical representations in a declarative framework

    This thesis is on Natural Language Generation. It describes a linguistic realisation system that translates the semantic information encoded in a conceptual graph into an English language sentence. The use of a non-hierarchically structured semantic representation (conceptual graphs) and an approximate matching between semantic structures allows us to investigate a more general version of the sentence generation problem where one is not pre-committed to a choice of the syntactically prominent elements in the initial semantics. We show clearly how the semantic structure is declaratively related to a linguistically motivated syntactic representation: we use D-Tree Grammars, which stem from work on Tree-Adjoining Grammars. The declarative specification of the mapping between semantics and syntax allows for different processing strategies to be exploited. A number of generation strategies have been considered: a pure top-down strategy and a chart-based generation technique which allows partially successful computations to be reused in other branches of the search space. Having a generator with increased paraphrasing power as a consequence of using non-hierarchical input and approximate matching raises the issue of whether certain 'better' paraphrases can be generated before others. We investigate preference-based processing in the context of generation.

    Juri Apresjan and the Development of Semantics and Lexicography

    The major aim of this article is to highlight Juri Apresjan's impact on the development of linguistic semantics and theoretical lexicography. In order to achieve this goal, a number of issues of paramount importance, which have always been in the focus of attention in Apresjan's publications, have to be discussed: (a) the notion of "naïve picture of the world", i.e. language-specific folk categorization encoded in the lexical and grammatical semantics of a particular language, as opposed to the supposedly universal and language-independent system of scientific concepts; (b) basic properties of the formal metalanguage of semantic description, its explanatory power and applicability in dictionary-making; and (c) representation of synonymy in a bilingual and a monolingual dictionary of synonyms designed within the framework of systematic lexicography. In addition, considerable attention has been given to two basic categories of systematic lexicography, "lexicographic portrait" and "lexicographic type", as well as the zonal structure of dictionary articles. Keywords: bilingual dictionary, commonsense (everyday) knowledge, definition, dictionary of synonyms, expert knowledge, integrated lexicographic description, lexicographic portrait, lexicographic type, metalanguage, naïve picture of the world, scientific picture of the world, synonym series, systematic lexicography, translation dictionary, zonal structure (of a dictionary entry)

    Incremental syntactic generation of natural language with tree adjoining grammars

    This document combines the basic ideas of my master's thesis, which was developed within the WIP project, with new results from my work as a member of WIP, as far as they concern the integration and further development of the implemented system. ISGT (in German 'Inkrementeller Syntaktischer Generierer natürlicher Sprache mit TAGs') is a syntactic component for a text generation system and is based on Tree Adjoining Grammars. It is lexically guided and consists of two levels of syntactic processing: a component that computes the hierarchical structure of the sentence under construction (hierarchical level) and a component that computes the word position and utters the sentence (positional level). The central aim of this work has been to design a syntactic generator that computes sentences in an incremental fashion. The realization of the incremental syntactic generator has been supported by a distributed parallel model that is used to speed up the computation of single parts of the sentence.