3,131 research outputs found

A Machine Learning Approach For Opinion Holder Extraction In Arabic Language

Author: AbdelRahman Samir
Elarnaoty Mohamed
Fahmy Aly
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 06/04/2012

Opinion mining aims at extracting useful subjective information from reliable amounts of text. Opinion holder recognition is a task that has not yet been addressed for the Arabic language. The task essentially requires a deep understanding of clause structures, and the lack of a robust, publicly available Arabic parser further complicates the research. This paper presents leading research on opinion holder extraction in Arabic news that is independent of any lexical parsers. We investigate constructing a comprehensive feature set to compensate for the lack of structural parsing output. The proposed feature set is adapted from previous work on English and coupled with our proposed semantic field and named entity features. Our feature analysis is based on Conditional Random Fields (CRF) and semi-supervised pattern recognition techniques. Different research models are evaluated via cross-validation experiments, achieving an F-measure of 54.03. We publicly release our corpus and lexicon to the opinion mining community to encourage further research.

arXiv.org e-Print Archive

Exploring manuscripts: sharing ancient wisdoms across the semantic web

Author: Hedges Mark
Jordanous Anna
Lawrence K Faith
Tupman Charlotte
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

Recent work in digital humanities has seen researchers increasingly producing online editions of texts and manuscripts, particularly in adoption of the TEI XML format for online publishing. The benefits of semantic web techniques are underexplored in such research, however, with a lack of sharing and communication of research information. The Sharing Ancient Wisdoms (SAWS) project applies linked data practices to enhance and expand on what is possible with these digital text editions. Focussing on Greek and Arabic collections of ancient wise sayings, which are often related to each other, we use RDF to annotate and extract semantic information from the TEI documents as RDF triples. This allows researchers to explore the conceptual networks that arise from these interconnected sayings. The SAWS project advocates a semantic-web-based methodology, enhancing rather than replacing current workflow processes, for digital humanities researchers to share their findings and collectively benefit from each other’s work.

CiteSeerX

Sarcasm Detection is Way Too Easy! An Empirical Comparison of Human and Machine Sarcasm Detection

Author: Abu Farha Ibrahim
Magdy Walid
Oprea Silviu Vlad
Wilson Steven
Publication date: 02/02/2023

Deceptive Opinions Detection Using New Proposed Arabic Semantic Features

Author: Aldwairi Monther
Azizi Nabiha
Chekkai Nassira
Salah Marwa Hadj
Schwab Didier
Zemmal Nawel
Zenakhra Djamel
Ziani Amel
Publication venue: ZU Scholars
Publication date: 01/01/2021

Some users post false reviews to promote or devalue others' products and services. This practice is known as deceptive opinion spam, where spammers try to profit from posting untruthful reviews. We therefore conducted this work to develop and implement new semantic features that improve Arabic deception detection. These features were inspired by the study of discourse parsing and rhetorical relations in Arabic. Given the importance of the phrase unit in the Arabic language and in grammatical studies, we analyzed and selected the most frequently used unit markers and relations to calculate the proposed features. These features were then used to represent the review texts in the classification phase. Several previous works have shown the Support Vector Machine (SVM) classifier to be the most accurate classification technique in this area. However, annotated Arabic resources remain scarce, especially for deception detection, as it is considered a new research area. We therefore used a semi-supervised SVM to overcome this problem by exploiting unlabeled data.

Hal - Université Grenoble Alpes

Hal-Diderot