8,644 research outputs found

    Frequency vs. Association for Constraint Selection in Usage-Based Construction Grammar

    Get PDF
    A usage-based Construction Grammar (CxG) posits that slot-constraints generalize from common exemplar constructions. But what is the best model of constraint generalization? This paper evaluates competing frequency-based and association-based models across eight languages using a metric derived from the Minimum Description Length paradigm. The experiments show that association-based models produce better generalizations across all languages by a significant margin

    A Survey of Paraphrasing and Textual Entailment Methods

    Full text link
    Paraphrasing methods recognize, generate, or extract phrases, sentences, or longer natural language expressions that convey almost the same information. Textual entailment methods, on the other hand, recognize, generate, or extract pairs of natural language expressions, such that a human who reads (and trusts) the first element of a pair would most likely infer that the other element is also true. Paraphrasing can be seen as bidirectional textual entailment and methods from the two areas are often similar. Both kinds of methods are useful, at least in principle, in a wide range of natural language processing applications, including question answering, summarization, text generation, and machine translation. We summarize key ideas from the two areas by considering in turn recognition, generation, and extraction methods, also pointing to prominent articles and resources.Comment: Technical Report, Natural Language Processing Group, Department of Informatics, Athens University of Economics and Business, Greece, 201

    A Theme-Rewriting Approach for Generating Algebra Word Problems

    Full text link
    Texts present coherent stories that have a particular theme or overall setting, for example science fiction or western. In this paper, we present a text generation method called {\it rewriting} that edits existing human-authored narratives to change their theme without changing the underlying story. We apply the approach to math word problems, where it might help students stay more engaged by quickly transforming all of their homework assignments to the theme of their favorite movie without changing the math concepts that are being taught. Our rewriting method uses a two-stage decoding process, which proposes new words from the target theme and scores the resulting stories according to a number of factors defining aspects of syntactic, semantic, and thematic coherence. Experiments demonstrate that the final stories typically represent the new theme well while still testing the original math concepts, outperforming a number of baselines. We also release a new dataset of human-authored rewrites of math word problems in several themes.Comment: To appear EMNLP 201

    Grip Force Reveals the Context Sensitivity of Language-Induced Motor Activity during “Action Words

    Get PDF
    Studies demonstrating the involvement of motor brain structures in language processing typically focus on \ud time windows beyond the latencies of lexical-semantic access. Consequently, such studies remain inconclusive regarding whether motor brain structures are recruited directly in language processing or through post-linguistic conceptual imagery. In the present study, we introduce a grip-force sensor that allows online measurements of language-induced motor activity during sentence listening. We use this tool to investigate whether language-induced motor activity remains constant or is modulated in negative, as opposed to affirmative, linguistic contexts. Our findings demonstrate that this simple experimental paradigm can be used to study the online crosstalk between language and the motor systems in an ecological and economical manner. Our data further confirm that the motor brain structures that can be called upon during action word processing are not mandatorily involved; the crosstalk is asymmetrically\ud governed by the linguistic context and not vice versa

    Some Reflections on the Task of Content Determination in the Context of Multi-Document Summarization of Evolving Events

    Full text link
    Despite its importance, the task of summarizing evolving events has received small attention by researchers in the field of multi-document summariztion. In a previous paper (Afantenos et al. 2007) we have presented a methodology for the automatic summarization of documents, emitted by multiple sources, which describe the evolution of an event. At the heart of this methodology lies the identification of similarities and differences between the various documents, in two axes: the synchronic and the diachronic. This is achieved by the introduction of the notion of Synchronic and Diachronic Relations. Those relations connect the messages that are found in the documents, resulting thus in a graph which we call grid. Although the creation of the grid completes the Document Planning phase of a typical NLG architecture, it can be the case that the number of messages contained in a grid is very large, exceeding thus the required compression rate. In this paper we provide some initial thoughts on a probabilistic model which can be applied at the Content Determination stage, and which tries to alleviate this problem.Comment: 5 pages, 2 figure
    corecore