43,039 research outputs found

    Building a Generation Knowledge Source using Internet-Accessible Newswire

    Full text link
    In this paper, we describe a method for automatic creation of a knowledge source for text generation using information extraction over the Internet. We present a prototype system called PROFILE which uses a client-server architecture to extract noun-phrase descriptions of entities such as people, places, and organizations. The system serves two purposes: as an information extraction tool, it allows users to search for textual descriptions of entities; as a utility to generate functional descriptions (FD), it is used in a functional-unification based generation system. We present an evaluation of the approach and its applications to natural language generation and summarization.Comment: 8 pages, uses eps

    Morphological Productivity in the Lexicon

    Full text link
    In this paper we outline a lexical organization for Turkish that makes use of lexical rules for inflections, derivations, and lexical category changes to control the proliferation of lexical entries. Lexical rules handle changes in grammatical roles, enforce type constraints, and control the mapping of subcategorization frames in valency-changing operations. A lexical inheritance hierarchy facilitates the enforcement of type constraints. Semantic compositions in inflections and derivations are constrained by the properties of the terms and predicates. The design has been tested as part of a HPSG grammar for Turkish. In terms of performance, run-time execution of the rules seems to be a far better alternative than pre-compilation. The latter causes exponential growth in the lexicon due to intensive use of inflections and derivations in Turkish.Comment: 10 pages LaTeX, {lingmacros,avm,psfig}.sty, 1 figure, 1 bibtex fil

    FrameNet CNL: a Knowledge Representation and Information Extraction Language

    Full text link
    The paper presents a FrameNet-based information extraction and knowledge representation framework, called FrameNet-CNL. The framework is used on natural language documents and represents the extracted knowledge in a tailor-made Frame-ontology from which unambiguous FrameNet-CNL paraphrase text can be generated automatically in multiple languages. This approach brings together the fields of information extraction and CNL, because a source text can be considered belonging to FrameNet-CNL, if information extraction parser produces the correct knowledge representation as a result. We describe a state-of-the-art information extraction parser used by a national news agency and speculate that FrameNet-CNL eventually could shape the natural language subset used for writing the newswire articles.Comment: CNL-2014 camera-ready version. The final publication is available at link.springer.co
    • …
    corecore