1,146 research outputs found

    The Automatic Interpretation of Nominalizations

    This paper discusses the interpretation of nominalizations in domain-independent wide-coverage text. We present a statistical model which interprets nominalizations based on the co-occurrence of verb-argument tuples in a large balanced corpus. We propose an algorithm which treats the interpretation task as a disambiguation problem and achieves a performance of approximately 80% by combining partial parsing, smoothing techniques and domain-independent taxonomic information (e.g., WordNet).

    Empirical methods for the study of denotation in nominalizations in Spanish

    This article deals with deverbal nominalizations in Spanish; concretely, we focus on the denotative distinction between event and result nominalizations. The goal of this work is twofold: first, to detect the most relevant features for this denotative distinction; and, second, to build an automatic classification system of deverbal nominalizations according to their denotation. We have based our study on theoretical hypotheses dealing with this semantic distinction and we have analyzed them empirically by means of Machine Learning techniques which are the basis of the ADN-Classifier. This is the first tool that aims to automatically classify deverbal nominalizations into event, result, or underspecified denotation types in Spanish. The ADN-Classifier has helped us to quantitatively evaluate the validity of our claims regarding deverbal nominalizations. We set up a series of experiments in order to test the ADN-Classifier with different models and in different realistic scenarios depending on the knowledge resources and natural language processors available. The ADN-Classifier achieved good results (87.20% accuracy).

    Nominalization and Alternations in Biomedical Language

    Background: This paper presents data on alternations in the argument structure of common domain-specific verbs and their associated verbal nominalizations in the PennBioIE corpus. Alternation is the term in theoretical linguistics for variations in the surface syntactic form of verbs, e.g. the different forms of "stimulate" in "FSH stimulates follicular development" and "follicular development is stimulated by FSH". The data is used to assess the implications of alternations for biomedical text mining systems and to test the fit of the sublanguage model to biomedical texts. Methodology/Principal Findings: We examined 1,872 tokens of the ten most common domain-specific verbs or their zero-related nouns in the PennBioIE corpus and labelled them for the presence or absence of three alternations. We then annotated the arguments of 746 tokens of the nominalizations related to these verbs and counted alternations related to the presence or absence of arguments and to the syntactic position of non-absent arguments. We found that alternations are quite common both for verbs and for nominalizations. We also found a previously undescribed alternation involving an adjectival present participle. Conclusions/Significance: We found that even in this semantically restricted domain, alternations are quite common, and alternations involving nominalizations are exceptionally diverse. Nonetheless, the sublanguage model applies to biomedica

    Iarg-AnCora: Spanish corpus annotated with implicit arguments

    This article presents the Spanish Iarg-AnCora corpus (400k words, 13,883 sentences) annotated with the implicit arguments of deverbal nominalizations (18,397 occurrences). We describe the methodology used to create it, focusing on the annotation scheme and criteria adopted. The corpus was manually annotated and an interannotator agreement test was conducted (81% observed agreement) in order to ensure the reliability of the final resource. The annotation of implicit arguments results in an important gain in argument and thematic role coverage (128% on average). It is the first freely available, wide-coverage corpus annotated with implicit arguments for the Spanish language. This corpus can subsequently be used by machine learning-based semantic role labeling systems, and for the linguistic analysis of implicit arguments grounded on real data. Semantic analyzers are essential components of current language technology applications, which need to obtain a deeper understanding of the text in order to make inferences at the highest level to obtain qualitative improvements in the results.

    Deverbal semantics and the Montagovian generative lexicon

    We propose a lexical account of action nominals, in particular of deverbal nominalisations, whose meaning is related to the event expressed by their base verb. The literature about nominalisations often assumes that the semantics of the base verb completely defines the structure of action nominals. We argue that the information in the base verb is not sufficient to completely determine the semantics of action nominals. We exhibit some data from different languages, especially from Romance languages, which show that nominalisations focus on some aspects of the verb semantics. The selected aspects, however, seem to be idiosyncratic and do not automatically result from the internal structure of the verb nor from its interaction with the morphological suffix. We therefore propose a partially lexicalist view of deverbal nouns. It is made precise and computable by using the Montagovian Generative Lexicon, a type theoretical framework introduced by Bassac, Mery and Retoré in this journal in 2010. This extension of Montague semantics with a richer type system easily incorporates lexical phenomena like the semantics of action nominals, in particular deverbals, including their polysemy and (in)felicitous copredications. Comment: A revised version will appear in the Journal of Logic, Language and Informatio

    Discourse Deixis and Coreference: Evidence from AnCora

    Proceedings of the Second Workshop on Anaphora Resolution (WAR II). Editor: Christer Johansson. NEALT Proceedings Series, Vol. 2 (2008), 73-82. © 2008 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/7129