
    Les collocations comme indice pour distinguer les genres textuels

    This study sets out to test how effective collocations are as an index for distinguishing text genres. It also has the twofold aim of exploring variation in Italian with computational methods and of testing a new association measure for the study of collocations. Four types of collocations were analysed (verb-noun, noun-adjective, noun-noun and noun-preposition-noun) in six different text genres, three written (literary texts, academic texts and school essays) and three spoken (conversations, speeches and film dialogues). The frequency of collocations across genres shows that each text type has specific preferences for specific collocation types; frequency alone and the written/spoken distinction alone, however, cannot account for this distribution under a coherent model. For this purpose, the statistical measure of lexical gravity appears to be more effective, as we will try to show.
    Collocations as an Index for Distinguishing Text Genres
    This paper aims to incorporate collocations as an index to distinguish text genres: our main hypothesis is that collocations, like other linguistic features, are potentially suitable for identifying genres. This is thus mostly an exploratory study, aimed at verifying this hypothesis and at taking a deeper look at register variation across different genres in Italian with computational and statistical methods. Furthermore, in a broader perspective, this study might contribute significantly to other fields, such as automatic genre identification [Santini 2004], the measurement of text cohesion [Louwerse et al. 2004] or text readability, where detecting collocations as a marker of genre can increase the accuracy of computational tools devoted to these tasks.

    Automatic Acquisition of Knowledge About Multiword Predicates

    PACLIC 19 / Taipei, Taiwan / December 1-3, 200

    Multiword expressions at length and in depth

    The annual workshop on multiword expressions has taken place since 2001 in conjunction with major computational linguistics conferences and attracts the attention of an ever-growing community working on a variety of languages, linguistic phenomena and related computational processing issues. MWE 2017 took place in Valencia, Spain, and represented a vibrant panorama of the current research landscape on the computational treatment of multiword expressions, featuring many high-quality submissions. Furthermore, MWE 2017 included the first shared task on multilingual identification of verbal multiword expressions. The shared task, with extended communal work, has developed important multilingual resources and mobilised several research groups in computational linguistics worldwide. This book contains extended versions of selected papers from the workshop. Authors worked hard to include detailed explanations, broader and deeper analyses, and new exciting results, which were thoroughly reviewed by an internationally renowned committee. We hope that this distinctly joint effort will provide a meaningful and useful snapshot of the multilingual state of the art in multiword expressions modelling and processing, and will be a point of reference for future work.

    D6.1: Technologies and Tools for Lexical Acquisition

    This report describes the technologies and tools to be used for Lexical Acquisition in PANACEA. It includes descriptions of existing technologies and tools which can be built on and improved within PANACEA, as well as of new technologies and tools to be developed and integrated into the PANACEA platform. The report also specifies the Lexical Resources to be produced. Four main areas of lexical acquisition are included: Subcategorization frames (SCFs), Selectional Preferences (SPs), Lexical-semantic Classes (LCs), for both nouns and verbs, and Multi-Word Expressions (MWEs).