Persian Text Classification using naive Bayes algorithms and Support Vector Machine algorithm

Authors: Novikova Galina, Rezaeian Naeim
Publication venue: IAES Indonesia Section
Publication date: 26/03/2020
Text classification, the automatic assignment of documents to predefined categories, is one of the primary steps toward knowledge extraction from raw textual data. In such tasks, words are treated as a set of features. Because traditional feature selection methods produce high-dimensional, sparse feature vectors, most of the text classification methods proposed for this purpose lack performance and accuracy. Many algorithms have been applied to the problem of automatic text categorization, which is why we tried newer methods such as information extraction, natural language processing, and machine learning. This paper proposes an innovative approach to improving classification performance on Persian text. Naive Bayes classifiers, which are widely used for text classification in machine learning, are based on conditional probability. We compared the Gaussian, Multinomial, and Bernoulli variants of naive Bayes with the SVM algorithm. For statistical text representation, TF, TF-IDF, and character-level 3-grams [6,9] were used. Finally, experimental results on 10 newsgroups
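To make the representation and classifier named in the abstract concrete, here is a minimal sketch of character-level 3-gram term counts (TF) fed to a multinomial naive Bayes classifier with Laplace smoothing. It is not the authors' implementation: it uses only the Python standard library, it omits the TF-IDF weighting and the Gaussian, Bernoulli, and SVM variants, and the class name, the smoothing value `alpha=1.0`, and the English toy documents and labels are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class MultinomialNaiveBayes:
    """Multinomial naive Bayes over character 3-gram counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # documents per class
        self.feature_counts = defaultdict(Counter)   # n-gram counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            grams = char_ngrams(doc)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, doc):
        n_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            total = sum(self.feature_counts[label].values())
            # log prior + sum of smoothed log likelihoods of each n-gram
            score = math.log(count / n_docs)
            for gram in char_ngrams(doc):
                if gram not in self.vocab:
                    continue  # ignore n-grams never seen in training
                numerator = self.feature_counts[label][gram] + self.alpha
                score += math.log(numerator / (total + self.alpha * vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus; the paper's Persian data is not reproduced here.
train_docs = [
    "football match goal team",
    "team wins the football cup",
    "stock market shares fall",
    "bank raises market rates",
]
train_labels = ["sport", "sport", "economy", "economy"]

model = MultinomialNaiveBayes().fit(train_docs, train_labels)
prediction = model.predict("football team")
print(prediction)
```

Character n-grams avoid the need for a word tokenizer, which is one plausible reason to use them for a language such as Persian, though the abstract does not state the authors' motivation.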

Indonesian Journal of Electrical Engineering and Informatics (IJEEI)