263 research outputs found

    Implementation of Document Classification using Naïve Bayes Classifier for the Performance and Level of Accuracy of Document Searching using Boyer-Moore Algorithm

    A search engine is a program that searches for data in a database based on keywords entered by the user. Several algorithms have been used for this purpose, one of which is Boyer-Moore. However, this method suffers from slow search speed and low accuracy of search results when used on a large scale. In this study, documents are classified using the Naive Bayes Classifier to overcome these problems. Experiments on 1000 documents showed that searching over classified documents is faster than searching over unclassified documents. However, the accuracy level was the same in both cases.
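
    The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: restrict a keyword scan to documents that already carry a matching class label. It uses the Boyer-Moore-Horspool simplification (bad-character shifts only), not the full Boyer-Moore algorithm, and the names `horspool_search`, `search_documents`, and the `(category, text)` document layout are assumptions made for illustration. The category labels are taken as given here; in the study they would come from the Naive Bayes classifier.

```python
def build_shift_table(pattern: str) -> dict:
    """Bad-character shifts: for each character in the pattern (except the
    last position), the distance from its rightmost occurrence to the end."""
    m = len(pattern)
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def horspool_search(text: str, pattern: str) -> list:
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    shift = build_shift_table(pattern)
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift according to the text character aligned with the pattern's
        # last position; characters absent from the table skip a full length.
        pos += shift.get(text[pos + m - 1], m)
    return hits


def search_documents(docs: dict, keyword: str, category: str = None) -> list:
    """docs maps doc_id -> (category, text). When a category is given, only
    documents with that (hypothetical, pre-assigned) label are scanned."""
    results = []
    for doc_id, (doc_category, text) in docs.items():
        if category is not None and doc_category != category:
            continue  # skipped without scanning: the source of the speed-up
        if horspool_search(text, keyword):
            results.append(doc_id)
    return results


# Toy collection, invented for this example.
docs = {
    1: ("sports", "the match ended in a draw"),
    2: ("business", "shares match expectations"),
    3: ("sports", "training resumes"),
}
```

    Passing a category shrinks the set of texts that are scanned, which is one plausible reading of why the classified collection was faster to search.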

    Monitoring Indonesian online news for COVID-19 event detection using deep learning

    Even though coronavirus disease 2019 (COVID-19) vaccination has been carried out, preparedness for a possible next outbreak wave is still needed given new mutations and virus variants. A near real-time surveillance system is required to enable stakeholders, especially the public, to respond in a timely manner. Due to the hierarchical structure, epidemic reporting is usually slow, particularly when passing jurisdictional borders. This condition could lead to time gaps in public awareness of new and emerging infectious disease events. Online news is a potential source for COVID-19 monitoring because it reports almost every infectious disease incident globally. However, the news reports not only COVID-19 events but also various information related to COVID-19 topics, such as the economic impact, health tips, and others. We developed a framework for online news monitoring and applied sentence classification for news titles using deep learning to distinguish between COVID-19 events and non-event news. The classification results showed that the fine-tuned bidirectional encoder representations from transformers (BERT) trained with Bahasa Indonesia achieved the highest performance (accuracy: 95.16%, precision: 94.71%, recall: 94.32%, F1-score: 94.51%). Interestingly, our framework was able to identify news reporting the new COVID strain from the United Kingdom (UK) as event news, 13 days before the Indonesian officials closed the border to foreigners.

    A Performance Evaluation of Classifiers Employ Language Dependent Tools for Indonesian Text

    This paper evaluates the performance of Maximum Entropy (MaxEnt), Support Vector Machine (SVM) and Naïve Bayes (NB) techniques for Indonesian text classification. The performance of the MaxEnt and SVM techniques is compared against the baseline NB technique. We also investigate the effect that language-dependent tools, such as Indonesian stemming and stop-word removal, have on the text classification performance of these techniques. Up to now, there has been no experimental report on the effect of an Indonesian stemmer on text classification accuracy. From our experiments, we conclude that maximum entropy generally performs better than the other classifiers. Language-dependent tools such as stemming and stop-word removal have only a small effect on the accuracy of text classification. However, the stemmed approach scored the highest average accuracy and, because it reduces the dimension of the feature vectors used in classification, it is a viable step in the pre-processing stage.

    Exploring multinomial naïve Bayes for Yorùbá text document classification

    The recent increase in the emergence of Nigerian language text online motivates this paper, which considers the problem of classifying text documents written in the Yorùbá language into one of a few pre-designated classes. Text document classification/categorization research is well established for English and many other languages; this is not so for Nigerian languages. This paper evaluated the performance of a multinomial Naive Bayes model learned on a research dataset consisting of 100 samples of text each from the business, sporting, entertainment, technology and political domains, separately on unigram, bigram and trigram features obtained using the bag-of-words representation approach. Results show that the performance of the model over unigram and bigram features is comparable but significantly better than that of a model learned on trigram features. The results generally indicate a possibility for the practical application of the NB algorithm to the classification of text documents written in the Yorùbá language. Keywords: Supervised learning, text classification, Yorùbá language, text mining, BoW Representation
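
    To make the setup concrete, here is a minimal sketch of a multinomial Naive Bayes classifier over bag-of-words n-gram counts. It is not the authors' implementation: the class name, whitespace tokenization, Laplace smoothing with `alpha=1.0`, and the tiny English toy data are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def ngrams(text: str, n: int) -> list:
    """Whitespace-tokenize and return the word n-grams of the text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class MultinomialNB:
    """Multinomial Naive Bayes on n-gram counts with additive smoothing."""

    def __init__(self, n: int = 1, alpha: float = 1.0):
        self.n = n          # 1 = unigram, 2 = bigram, 3 = trigram features
        self.alpha = alpha  # smoothing constant (an assumed default)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(ngrams(text, self.n))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(doc_count / total_docs)  # log prior
            for feat in ngrams(text, self.n):
                if feat in self.vocab:  # n-grams unseen in training are ignored
                    score += math.log((counts[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example (not the paper's dataset).
texts = [
    "the team won the match",
    "the striker scored a goal",
    "shares rose on the stock market",
    "the bank raised interest rates",
]
labels = ["sport", "sport", "business", "business"]
```

    Fitting one model per value of `n` on the same data mirrors the unigram, bigram and trigram comparison described above. Higher-order n-grams are sparser, so with a small corpus fewer test features match the training vocabulary.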

    Comparative Analysis of KNN, Naïve Bayes and SVM Algorithms for Movie Genres Classification Based on Synopsis.

    Text classification is the process of assigning a text to the correct label. Text classification in natural language processing is a challenging task that requires accuracy to get correct results. Manual text classification tends to be inefficient because it requires a lot of time and expert involvement. Using machine learning for automatic text classification can be a solution to this problem. KNN, Naive Bayes, and SVM are among the most widely used algorithms for classification problems, especially text classification. In this study, we compare the KNN, Naive Bayes, and SVM algorithms on the problem of classifying movie genres based on a synopsis, using datasets obtained from Kaggle.com and the IMDB Dataset. The results indicate that, across the 12 experiments, Support Vector Machine (SVM) is the best-performing algorithm, with accuracies of 90%, 93%, 65%, and 63%. It is hoped that this research can help determine the best algorithm for text classification.

    Challenges of Sarcasm Detection for Social Network : A Literature Review

    Nowadays, sarcasm recognition and detection are simplified by knowledge from various domains, including computer science, social science, psychology, mathematics, and many more. This article aims to explain trends in sentiment analysis, especially sarcasm detection, over the last ten years and its future direction. We review journal articles with the keyword “sarcasm” in the title, published from 2008 to 2018. The articles were classified by the most frequently discussed topics, among others the dataset, pre-processing, annotations, approaches, features, context, and methods used. The significant increase in the number of articles on “sarcasm” in recent years indicates that research in this area still has enormous opportunities. Research on “sarcasm” has also become very interesting because only a few researchers offer solutions for unstructured language. Some hybrid approaches combining classification and feature extraction are used to identify sarcastic sentences using deep learning models. This article provides a further explanation of the most widely used algorithms for sarcasm detection with social media as the object. The end of this article also shows that a critical aspect of future research on sarcastic sentences is the use of datasets in various languages that address the unstructured data problem; together with contextual information, this will detect sarcastic sentences effectively and improve existing performance.

    Facial Emotional Expressions Of Life-Like Character Based On Text Classifier And Fuzzy Logic

    A system consisting of a text classifier and a Fuzzy Inference System (FIS) is proposed to build a life-like virtual character capable of expressing emotion from text input. The system classifies the emotional content of sentences from text input and expresses the corresponding emotion through a facial expression. Text input is classified using the text classifier, while the facial expressions of the life-like character are controlled by the FIS using results from the text classifier. A number of text classifier methods are employed, and their performance is evaluated using Leave-One-Out Cross Validation. In real-world applications such as animated movies, the life-like virtual character of the proposed system needs to be animated. As a demonstration, examples of facial expressions with the corresponding text input, produced by our implementation of the system, are shown. The system is able to show facial expressions that blend a mixture of emotions. This paper also describes the animation characteristics of the system, using the neutral expression as the center of the facial expression transition from one emotion to another. Emotion transition can be viewed as a gradual decrease or increase of emotion intensity from one emotion toward another. Experimental results show that animation of the life-like character expressing emotion transition can be generated automatically using the proposed system.
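
    The evaluation protocol named above, Leave-One-Out Cross Validation, can be sketched as follows. This is an illustration only: the function names are hypothetical, and the majority-class baseline stands in for whichever text classifiers the paper actually evaluated, which the abstract does not list.

```python
from collections import Counter


def leave_one_out_accuracy(samples, labels, train_fn, predict_fn):
    """Hold out each sample once, train on the rest, and report the
    fraction of held-out samples that were predicted correctly."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train_fn(train_x, train_y)
        if predict_fn(model, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def train_majority(train_x, train_y):
    """Placeholder classifier: remember the most frequent training label."""
    return Counter(train_y).most_common(1)[0][0]


def predict_majority(model, sample):
    """The placeholder ignores the input and returns the stored label."""
    return model


# Toy sentences and emotion labels, invented for this example.
samples = ["what a lovely day", "i am delighted", "this is wonderful", "i miss you"]
labels = ["joy", "joy", "joy", "sadness"]
accuracy = leave_one_out_accuracy(samples, labels, train_majority, predict_majority)
```

    Because `train_fn` and `predict_fn` are passed in, several classifiers can be compared under the same splits by swapping those two arguments.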