Automatic extraction of Arabic multiword expressions

Attia, Mohammed; Pecina, Pavel; Toral, Antonio; Tounsi, Lamia; van Genabith, Josef

Authors: Mohammed Attia, Pavel Pecina, Antonio Toral, Lamia Tounsi, Josef van Genabith
Publication date: 1 January 2010

Abstract

In this paper we investigate the automatic acquisition of Arabic multiword expressions (MWEs). We propose three complementary approaches to extracting MWEs from available data resources. The first approach relies on the correspondence asymmetries between Arabic Wikipedia titles and titles in 21 different languages. The second approach collects English MWEs from Princeton WordNet 3.0, translates the collection into Arabic using Google Translate, and uses different search engines to validate the output. The third approach uses lexical association measures to extract MWEs from a large unannotated corpus. We experimentally explore the feasibility of each approach and measure the quality and coverage of the output against gold standards.
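
As an illustration of the third approach, the sketch below ranks adjacent word pairs by pointwise mutual information (PMI), one commonly used lexical association measure. The abstract does not say which measures the authors use, so PMI is an assumption here, as are the function name `pmi_bigrams`, the `min_count` threshold, and the toy English token list that stands in for a large unannotated Arabic corpus.

```python
import math
from collections import Counter


def pmi_bigrams(tokens, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information (PMI).

    Hypothetical helper for illustration only; not the authors' method.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        # Rare pairs get unreliable PMI estimates, so skip them.
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # PMI compares the observed pair probability with the
        # probability expected if the two words were independent.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring pairs first: these are the MWE candidates.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy corpus (assumed data, English for readability).
tokens = (
    "the united nations met and the united nations voted "
    "while the council met"
).split()
ranked = pmi_bigrams(tokens)
for pair, score in ranked:
    print(pair, round(score, 2))
```

On this toy input, only pairs seen at least twice are scored, and the pair of two otherwise rare words ranks above the pair that starts with the frequent word "the". In practice the candidate list would still need validation against a gold standard, as the abstract describes.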

Available Versions

DCU Online Research Access Service

oai:doras.dcu.ie:16155
