2 research outputs found
Sentiment Analysis of Spanish Words of Arabic Origin Related to Islam: A Social Network Analysis
From the arrival of Muslims in 711 until their expulsion in the 1600s, the Arabic language was present in Spain for more than eight centuries. Although social networks have become a valuable resource for mining sentiment, no previous research has investigated lay sentiment towards Spanish words of Arabic etymology related to Islamic terminology. This study aims to analyze Spanish words of Arabic origin related to Islam. A random sample of 4586 out of 45860 tweets was used to evaluate general sentiment towards some of these words. An expert-predefined Spanish lexicon of around 6800 seed adjectives was used to conduct the analysis. Results indicate a generally positive sentiment towards several Spanish words of Arabic etymology related to Islam. By implementing both a qualitative and a quantitative methodology to analyze tweets’ sentiments towards Spanish words of Arabic etymology, this research adds breadth and depth to the debate over Arabic linguistic influence on Spanish vocabulary.
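The sampling and lexicon-based scoring described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the six-word `LEXICON`, the function names, the punctuation handling, and the simple sum-of-polarities rule are all assumptions made for illustration, and the study's actual lexicon contains around 6800 seed adjectives.

```python
import random

# Hypothetical mini-lexicon (assumption); maps an adjective to a polarity.
LEXICON = {"bueno": 1, "hermoso": 1, "rico": 1,
           "malo": -1, "feo": -1, "terrible": -1}

def sample_tweets(tweets, fraction=0.1, seed=0):
    """Draw a random sample without replacement (10% by default)."""
    k = max(1, round(len(tweets) * fraction))
    return random.Random(seed).sample(tweets, k)

def score(tweet, lexicon=LEXICON):
    """Sum the polarities of all lexicon adjectives found in the tweet."""
    tokens = [t.strip(".,;:!?¡¿\"'()").lower() for t in tweet.split()]
    return sum(lexicon.get(t, 0) for t in tokens)

def label(s):
    """Map a numeric score to a coarse sentiment label."""
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

For example, `label(score("El alcázar es hermoso y bueno."))` gives `"positive"` under this toy lexicon, because two positive adjectives are matched and none are negative.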
ANALYZING CUSTOMER REVIEWS IN TURKISH USING MACHINE LEARNING AND DATA SCIENCE METHODOLOGIES
Digital life, especially after the introduction of Web 2.0, has significantly altered
human relations, giving everyone the “right of public speech”. Ideas, emotions,
and opinions on many topics are generously shared in virtual environments. A new-age,
global, digital word of mouth is shaping a society in which knowledge is the most
influential power. Because social media data is highly dynamic in both volume and
form, automatic processing is indispensable.
Natural Language Processing, together with Machine Learning techniques, plays
an important role in analyzing written text. Traditional techniques from the
literature become more powerful when combined into hybrid ones that also account for
the characteristic properties of the language and the domain-specific data. Although
every step of the text classification chain is important, adequate feature
selection has a notably large impact on classification accuracy.
In this study, a simple document-level classification of the sentiment polarity of
subjective texts in Turkish is performed. The domains include customer reviews of
company products, movies, and healthcare services, where the task is to decide whether
a comment is positive or negative. Another domain consists of doctors’ notes on
patients’ symptoms, with the aim of predicting, and thus recommending, some of the most
frequently used medical tests according to general doctors’ procedures.
The features used included some or all of the distinct word roots together with their
binary or frequency information. Linear or vector analysis of the feature sets was done
employing Machine Learning algorithms provided by the Weka tool. A hybrid feature
set was proposed and found to be more efficient. It combines binary vectors with
frequency meta-features from the nodes and leaves of a J48 tree classifier, for all
features or for a correlation-based selected subset, improving both prediction accuracy
and classification performance.
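
The binary and frequency representations mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: whitespace tokens stand in for word roots (no Turkish stemming is performed), all function names are hypothetical, and the "hybrid" vector here is a plain concatenation of the two representations. The J48-derived meta-features and the Weka pipeline used in the study are not reproduced.

```python
from collections import Counter

def tokenize(text):
    """Whitespace tokenizer; tokens stand in for word roots (assumption)."""
    return text.lower().split()

def build_vocab(docs):
    """Sorted list of all distinct tokens across the documents."""
    return sorted({tok for d in docs for tok in tokenize(d)})

def binary_vector(doc, vocab):
    """1 if the vocabulary item occurs in the document, else 0."""
    present = set(tokenize(doc))
    return [1 if w in present else 0 for w in vocab]

def frequency_vector(doc, vocab):
    """Number of occurrences of each vocabulary item in the document."""
    counts = Counter(tokenize(doc))
    return [counts[w] for w in vocab]

def hybrid_vector(doc, vocab):
    """Illustrative hybrid: binary features followed by frequency features."""
    return binary_vector(doc, vocab) + frequency_vector(doc, vocab)
```

With the two toy reviews `"ürün çok çok güzel"` and `"ürün kötü"`, the vocabulary has four entries, so each hybrid vector has eight components. A vector of this kind could then be passed to any classifier.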