14,144 research outputs found

    A Machine Learning Approach For Opinion Holder Extraction In Arabic Language

    Full text link
    Opinion mining aims at extracting useful subjective information from reliable amounts of text. Opinion mining holder recognition is a task that has not been considered yet in Arabic Language. This task essentially requires deep understanding of clauses structures. Unfortunately, the lack of a robust, publicly available, Arabic parser further complicates the research. This paper presents a leading research for the opinion holder extraction in Arabic news independent from any lexical parsers. We investigate constructing a comprehensive feature set to compensate the lack of parsing structural outcomes. The proposed feature set is tuned from English previous works coupled with our proposed semantic field and named entities features. Our feature analysis is based on Conditional Random Fields (CRF) and semi-supervised pattern recognition techniques. Different research models are evaluated via cross-validation experiments achieving 54.03 F-measure. We publicly release our own research outcome corpus and lexicon for opinion mining community to encourage further research

    SemEval-2016 task 5 : aspect based sentiment analysis

    Get PDF
    International audienceThis paper describes the SemEval 2016 shared task on Aspect Based Sentiment Analysis (ABSA), a continuation of the respective tasks of 2014 and 2015. In its third year, the task provided 19 training and 20 testing datasets for 8 languages and 7 domains, as well as a common evaluation procedure. From these datasets, 25 were for sentence-level and 14 for text-level ABSA; the latter was introduced for the first time as a subtask in SemEval. The task attracted 245 submissions from 29 teams

    Seminar Users in the Arabic Twitter Sphere

    Full text link
    We introduce the notion of "seminar users", who are social media users engaged in propaganda in support of a political entity. We develop a framework that can identify such users with 84.4% precision and 76.1% recall. While our dataset is from the Arab region, omitting language-specific features has only a minor impact on classification performance, and thus, our approach could work for detecting seminar users in other parts of the world and in other languages. We further explored a controversial political topic to observe the prevalence and potential potency of such users. In our case study, we found that 25% of the users engaged in the topic are in fact seminar users and their tweets make nearly a third of the on-topic tweets. Moreover, they are often successful in affecting mainstream discourse with coordinated hashtag campaigns.Comment: to appear in SocInfo 201
    corecore