2 research outputs found

Sentiment analysis tools should take account of the number of exclamation marks!!!

Author: Ameur H.
Battaglino C.
Bonny I.
Cambria E.
Gang G.
Gill A. J.
Kalman Y. M.
Lobur M.
Naradhipa A. R.
Teh P. L.
Urabe Y.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 11/12/2015

There are various factors that affect the sentiment level expressed in textual comments. Capitalization of letters tends to mark something for attention, and repetition of letters tends to strengthen the emotion. Emoticons are used to help visualize facial expressions, which can affect understanding of text. In this paper, we show the effect of the number of exclamation marks used by testing twelve online sentiment tools. We present opinions gathered from 500 respondents towards “like” and “dislike” values, with a varying number of exclamation marks. Results show that only 20% of the online sentiment tools tested considered the number of exclamation marks in their returned scores. However, results from our human raters show that the more exclamation marks are used in positive comments, the higher their “like” values are compared with the same comments with fewer exclamation marks. Similarly, adding more exclamation marks to negative comments results in a higher “dislike”

Crossref

Lancaster E-Prints

Script Independent Morphological Segmentation for Arabic Maghrebi Dialects: An Application to Machine Translation

Author: Harrat Salima
Meftouh Karima
Smaïli Kamel
Publication venue: 'Instituto Politecnico Nacional/Centro de Investigacion en Computacion'
Publication date: 01/01/2019

This research deals with resource creation for under-resourced languages. We try to adapt resources that exist for other, better-resourced languages to process less-resourced ones. We focus on the Arabic dialects of the Maghreb, namely Algerian, Moroccan and Tunisian. We first adapt a well-known statistical word segmenter to segment Algerian dialect texts written in both Arabic and Latin scripts. We demonstrate that unsupervised morphological segmentation can be applied to Arabic dialects regardless of the script used. Next, we use this kind of segmentation to improve statistical machine translation scores between the three Maghrebi dialects and French. We use a parallel multidialectal corpus that includes six Arabic dialects in addition to MSA and French. We achieved interesting results. Regarding word segmentation, the rate of correctly segmented words reached 70% for those written in Latin script and 79% for those written in Arabic script. For machine translation, the unsupervised morphological segmentation helped to decrease out-of-vocabulary word rates by a minimum of 35%

INRIA a CCSD electronic archive server

Hal-Diderot