PARSEME Survey on MWE Resources

Abstract

This paper summarizes the first results of an ongoing survey on multiword resources carried out within the IC1207 COST Action PARSEME (PARSing and Multi-word Expressions). Despite the availability of language resource catalogues and the inventory of multiword data sets on the SIGLEX-MWE website, multiword resources are scattered and difficult to find. In many cases, language resources such as corpora, treebanks, or lexical databases include multiwords as part of their data or take them into consideration in their annotations. However, these resources need to be centralized so that other researchers can subsequently use them. The final aim of this survey is thus to create a portal where researchers can find multiword resources or multiword-aware language resources for their research. We report on how the survey was designed and analyze the data gathered so far. We also discuss the problems we detected on examining the data and possible ways of enhancing the survey.
