15 research outputs found

    Arabic text classification methods: Systematic literature review of primary studies

    Recent research on Big Data has proposed and evaluated a number of advanced techniques for extracting meaningful information from the large, complex volume of data available on the World Wide Web. Accurate text analysis usually begins with a Text Classification (TC) method. A review of the recent literature in this area shows that most studies focus on English (and other scripts), while attempts to classify Arabic texts remain very limited. Hence, we contribute the first Systematic Literature Review (SLR) that uses a strict search protocol to summarize the key characteristics of the TC techniques and methods used to classify Arabic text. This work also aims to identify and share scientific evidence of the gap in the current literature, to help suggest areas for further research. Our SLR explicitly uses empirical evidence as the deciding factor for including studies, and then concludes which classifier produced more accurate results. Further, our findings identify the lack of standardized corpora for Arabic text: authors compile their own, and most of the work focuses on Modern Arabic, with very little done on Colloquial Arabic despite its wide use on social media networks such as Twitter. In total, 1464 papers were surveyed, from which 48 primary studies were included and analyzed.

    Applications of Mining Arabic Text: A Review

    Since the emergence of text mining, there has been some interest in applying text mining tasks to text written in Arabic, and researchers face several challenges in doing so. These tasks include Arabic text summarization, which is one of the challenging open research areas in natural language processing (NLP) and text mining, Arabic text categorization, and Arabic sentiment analysis. This chapter reviews past and current research and trends in these areas, along with future challenges that need to be tackled. It also presents case studies for two of the reviewed approaches.

    Interest identification from browser tab titles: A systematic literature review

    Modeling and understanding users' interests has become an essential part of our daily lives. A variety of business processes and a growing number of companies employ various tools to this end. The outcomes of these identification strategies benefit both companies and users: the former are more likely to offer services to the customers who really need them, while the latter are more likely to get the service they desire. Many works have been carried out in the area of user interest identification. As a result, it might not be easy for researchers, developers, and users to orient themselves in the field; that is, to find the tools and methods that they most need, to identify ripe areas for further investigation, and to propose the development and adoption of new research plans. In this study, to overcome these potential shortcomings, we performed a systematic literature review on user interest identification, using browser tab titles as input data. Our goal is to offer readers a resource that systematically guides and reliably orients researchers, developers, and users in this very vast domain. Our findings demonstrate that the majority of the research carried out in the field gathers data either from social networks (such as Twitter, Instagram and Facebook) or from search engines, leaving open the question of what to do when such data is not available.

    Arabic semantic similarity approach for farmers’ complaints

    Semantic similarity is applied in many areas of Natural Language Processing, such as information retrieval, text classification, plagiarism detection, and others. Many researchers have used semantic similarity for English texts, but few have used it for Arabic, due to the ambiguity of Arabic concepts in both sense and morphology. Therefore, the first contribution of this paper is a semantic similarity approach for Arabic sentences. Nowadays, the world faces the global problem of coronavirus disease. In light of these circumstances and the imposition of distancing, it is difficult for farmers to meet agricultural experts in person to obtain advice and find suitable solutions for their agricultural complaints. In addition, most farmers still use traditional practices. Thus, our second contribution is helping farmers resolve their Arabic agricultural complaints using our proposed approach. The Latent Semantic Analysis approach is applied to retrieve the problem that is semantically most related to a farmer's complaint and to find the related solution for the farmer. Two weighting schemes are used for data representation in this approach: Term Frequency and Term Frequency-Inverse Document Frequency. The proposed model also classifies the big agricultural dataset and the submitted farmer complaint by crop type, using a MapReduce Support Vector Machine, to improve the semantic similarity results. The proposed approach with Term Frequency-Inverse Document Frequency-based Latent Semantic Analysis performed better than its counterparts, with an F-measure of 86.7%.
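
    The retrieval idea in this abstract can be illustrated with a minimal sketch of TF-IDF weighting followed by cosine similarity. It is not the authors' implementation: the truncated-SVD projection that defines Latent Semantic Analysis and the MapReduce SVM step are omitted, the function names are hypothetical, and the English complaints are invented placeholders standing in for Arabic text.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF, so terms found in every document keep a small positive weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def most_similar(query, corpus):
    """Return (index of the closest corpus entry, list of all similarity scores)."""
    # Assumption: whitespace tokenization; real Arabic text would need
    # normalization and morphological preprocessing first.
    docs = [text.lower().split() for text in corpus + [query]]
    vectors = tfidf_vectors(docs)
    query_vec = vectors[-1]
    scores = [cosine(query_vec, v) for v in vectors[:-1]]
    return max(range(len(scores)), key=scores.__getitem__), scores


# Invented placeholder complaints (not data from the paper).
corpus = [
    "yellow leaves on tomato plants",
    "wheat stems broken by wind",
    "white insects under cucumber leaves",
]
query = "tomato leaves turning yellow"
best, scores = most_similar(query, corpus)
print(best, [round(s, 3) for s in scores])
```

    Here the query shares three terms with the first complaint, one with the third, and none with the second, so the first complaint is retrieved. A full LSA pipeline would additionally project these vectors into a low-rank space before comparing them.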

    A review of sentiment analysis research in Arabic language

    Sentiment analysis is a task of natural language processing which has recently attracted increasing attention. However, sentiment analysis research has mainly been carried out for the English language. Although Arabic is ramping up as one of the most used languages on the Internet, only a few studies have focused on Arabic sentiment analysis so far. In this paper, we carry out an in-depth qualitative study of the most important research works in this context by presenting limits and strengths of existing approaches. In particular, we survey both approaches that leverage machine translation or transfer learning to adapt English resources to Arabic and approaches that stem directly from the Arabic language

    Mobile Application for Analysis of Sentiments in Twitter

    Sentiment analysis is a very popular technique for studying social networks. One of the most popular and fastest-growing social networks for microblogging is Twitter, since it lets people express their opinions in short, simple sentences. These texts are generated daily, and for this reason people commonly want to know what the current topics are and what follows from them. In this work, we propose to implement a mobile application that gives people information, such as a degree of positive or negative polarity, on any topic relevant to society, thereby helping people make the best decision. The application will use several text classification techniques jointly. These techniques focus on machine learning and lexicon-based approaches.
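
    As a rough illustration of the lexicon-based side of such a polarity score, the sketch below counts positive and negative words per tweet and averages the result over a topic. The word lists, function names, and sample tweets are all hypothetical, and the machine learning classifiers that the abstract says are combined with the lexicon are not shown.

```python
# Hypothetical toy lexicons; a real system would use a curated sentiment lexicon.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "awful"}


def polarity(text):
    """Score one text in [-1, 1]: +1 all positive hits, -1 all negative, 0 if none or balanced."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def topic_polarity(tweets):
    """Average the per-tweet polarity to get a single degree of polarity for a topic."""
    scores = [polarity(t) for t in tweets]
    return sum(scores) / len(scores) if scores else 0.0


# Invented sample tweets about one topic.
tweets = [
    "I love this great phone!",
    "terrible service, bad app",
    "love it but bad battery",
]
print([polarity(t) for t in tweets], topic_polarity(tweets))
```

    Averaging is only one possible aggregation; weighting tweets by recency or reach would be an equally plausible design choice.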