7 research outputs found

    Semantic Annotation of Deverbal Nominalizations in the Spanish AnCora Corpus

    Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 187-198. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891

    IARG-AnCora: Annotating AnCora corpus with implicit arguments

    Iarg-AnCora aims to annotate the implicit arguments of deverbal nominalizations in the AnCora corpus. This corpus will serve as the basis for automatic semantic role labeling systems based on machine learning techniques. Semantic analyzers are essential components of current language technology applications, which need a deeper understanding of the text in order to make higher-level inferences and thereby obtain qualitative improvements in their results.
    Complementary action (FFI2011-13737-E), associated with the TextMess 2.0 project (TIN2009-13391-C04-03/04).
    Taulé Delor, M.; Peris, A.; Martí Antonín, MA.; Moreno Boronat, LA.; Rodríguez, H.; Moreda, P. (2012). IARG-AnCora: Anotación de los corpus AnCora con argumentos implícitos. Procesamiento del Lenguaje Natural. 49:181-184. http://hdl.handle.net/10251/29863

    IARG-AnCora: Anotación de los corpus AnCora con argumentos implícitos [IARG-AnCora: Annotating the AnCora corpora with implicit arguments]

    Iarg-AnCora aims to annotate the implicit arguments of deverbal nominalizations in the AnCora corpus. This corpus will serve as the basis for automatic semantic role labeling systems based on machine learning techniques. Semantic analyzers are essential components of current language technology applications, which need a deeper understanding of the text in order to make higher-level inferences and thereby obtain qualitative improvements in their results.

    AnCora-Nom: un léxico de nominalizaciones deverbales del español [AnCora-Nom: A Spanish lexicon of deverbal nominalizations]

    This paper describes a new lexical resource: AnCora-Nom, a Spanish lexicon of deverbal nominalizations. At present, it contains 1,655 lexical entries and 3,094 senses. Each sense has an associated denotation type, and the mapping of nominal complements to arguments and their corresponding theta roles is also annotated. A particularly interesting feature of this lexicon is that it was extracted automatically from the annotated AnCora-Es corpus. AnCora-Nom was derived taking into account not only the information directly related to deverbal nominalizations but also the morphological and syntactic-semantic information previously annotated in the corpus.
    This research has received support from the projects Text-Knowledge 2.0 (TIN2009-13391-C04-04) and AnCora-Net (FFI2009-06497-E/FILO) from the Spanish Ministry of Science and Innovation, and an FPU grant (AP2007-01028) from the Spanish Ministry of Education.

    AnCora-Nom: A Spanish lexicon of deverbal nominalizations

    This paper describes a new lexical resource: AnCora-Nom, a Spanish lexicon of deverbal nominalizations. At present, it contains 1,655 lexical entries and 3,094 senses. Each sense has an associated denotation type, and the mapping of nominal complements to arguments and their corresponding theta roles is also annotated. A particularly interesting feature of this lexicon is that it was extracted automatically from the annotated AnCora-Es corpus. AnCora-Nom was derived taking into account not only the information directly related to nominalizations but also the morphological and syntactic-semantic information annotated in the corpus, such as WordNet synsets, the specifier type of the nominalization, and its morphological number (singular or plural).

    Iarg-AnCora: Spanish corpus annotated with implicit arguments

    This article presents the Spanish Iarg-AnCora corpus (400k words, 13,883 sentences) annotated with the implicit arguments of deverbal nominalizations (18,397 occurrences). We describe the methodology used to create it, focusing on the annotation scheme and criteria adopted. The corpus was manually annotated, and an inter-annotator agreement test was conducted (81% observed agreement) to ensure the reliability of the final resource. Annotating implicit arguments yields an important gain in argument and thematic role coverage (128% on average). It is the first freely available, wide-coverage corpus annotated with implicit arguments for Spanish. The corpus can subsequently be used by machine learning-based semantic role labeling systems and for the linguistic analysis of implicit arguments grounded in real data. Semantic analyzers are essential components of current language technology applications, which need a deeper understanding of the text in order to make inferences at the highest level and obtain qualitative improvements in the results.

    Proceedings

    Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 268 pages. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891