3 research outputs found

    A novel classifier based on meaning for text classification

    Ganiz, Murat Can (Dogus Author) -- Akyokuş, Selim (Dogus Author) -- Conference full title: International Symposium on Innovations in Intelligent Systems and Applications, INISTA 2015; Madrid; Spain; 2 September 2015 through 4 September 2015.
    Text classification is one of the key methods used in text mining. Generally, traditional classification algorithms from the machine learning field are used in text classification. These algorithms are primarily designed for structured data. In this paper, we propose a new classifier for textual data, called the Supervised Meaning Classifier (SMC). The new SMC classifier uses a meaning measure, which is based on the Helmholtz principle from Gestalt theory. In SMC, the meaningfulness of terms in the context of classes is calculated and used to classify a document. Experimental results show that the new SMC classifier outperforms the traditional Multinomial Naïve Bayes (MNB) and Support Vector Machine (SVM) classifiers, especially when the training data is limited.

    Evaluation of classification models for language processing

    Kilimci, Zeynep Hilal (Dogus Author) -- Ganiz, Murat Can (Dogus Author) -- Conference full title: International Symposium on Innovations in Intelligent Systems and Applications, INISTA 2015; Madrid; Spain; 2 September 2015 through 4 September 2015.
    Naïve Bayes is a commonly used algorithm in text categorization because of its easy implementation and low complexity. Naïve Bayes has two main event models used for text categorization: the multivariate Bernoulli model and the multinomial model. A very large number of studies choose the multinomial model with Laplace smoothing based solely on the assumption that it performs better than the multivariate Bernoulli model under almost any conditions. This study aims to shed some light on this widely adopted assumption by analyzing Naïve Bayes event models and smoothing methods from a different perspective. To clarify the difference between the event models of Naïve Bayes, their classification performance is compared on datasets in two languages, English and Turkish. Results of our extensive experiments demonstrate that the superior performance of the multinomial model is not observed all the time. On the other hand, the multivariate Bernoulli model can perform well when combined with an appropriate smoothing method under different training data size conditions.
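
    The two event models named in this abstract differ in what they count: the multinomial model uses term occurrence counts, while the multivariate Bernoulli model uses only term presence or absence per document and also scores the absent vocabulary terms. The following is a minimal sketch of that difference, not the authors' implementation. The function names `train` and `predict`, the `alpha` parameter, and the toy data are illustrative assumptions, and only additive (Laplace, alpha = 1) smoothing is shown; the other smoothing methods the study evaluates are not specified in the abstract.

```python
import math
from collections import Counter


def train(docs, labels, model="multinomial", alpha=1.0):
    """Fit a toy Naive Bayes classifier (hypothetical helper, alpha > 0 assumed).

    docs   : list of token lists
    labels : class label for each document
    model  : "multinomial" (term counts) or "bernoulli" (term presence)
    alpha  : additive smoothing constant; alpha = 1 is Laplace smoothing
    """
    vocab = sorted({t for d in docs for t in d})
    params = {}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior = math.log(len(class_docs) / len(docs))
        if model == "multinomial":
            # P(t | c) from total term occurrences in the class
            counts = Counter(t for d in class_docs for t in d)
            total = sum(counts.values())
            probs = {t: (counts[t] + alpha) / (total + alpha * len(vocab))
                     for t in vocab}
        else:
            # P(t present | c) from the number of class documents containing t
            doc_freq = Counter(t for d in class_docs for t in set(d))
            probs = {t: (doc_freq[t] + alpha) / (len(class_docs) + 2 * alpha)
                     for t in vocab}
        params[c] = (log_prior, probs)
    return {"model": model, "vocab": vocab, "params": params}


def predict(clf, doc):
    """Return the class with the highest log-posterior score for doc."""
    scores = {}
    for c, (log_prior, probs) in clf["params"].items():
        score = log_prior
        if clf["model"] == "multinomial":
            # Each occurrence of a known term contributes once
            for t in doc:
                if t in probs:
                    score += math.log(probs[t])
        else:
            # Every vocabulary term contributes, whether present or absent
            present = set(doc)
            for t in clf["vocab"]:
                p = probs[t]
                score += math.log(p if t in present else 1.0 - p)
        scores[c] = score
    return max(scores, key=scores.get)


# Toy data, invented for illustration only
docs = [["goal", "match", "team"], ["team", "win", "goal"],
        ["election", "vote", "party"], ["party", "vote", "win"]]
labels = ["sport", "sport", "politics", "politics"]

for name in ("multinomial", "bernoulli"):
    clf = train(docs, labels, model=name)
    print(name, predict(clf, ["goal", "team"]))
```

    On data this small both models agree; the abstract's point is that their relative performance can change with the smoothing method and the amount of training data, which a sketch like this does not measure.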