42 research outputs found

    Textual Content and Engagement Correlation Analysis with Naive Bayes

    As sentiment analysis software continues to improve, it has become possible to test whether the sentiment of a piece of content correlates with its engagement. By combining two platforms, we were able to show a moderate correlation between content sentiment and content engagement. There are further correlations between numeric properties of the content, such as content length and title length, and content consumption and engagement. The measured values show a strong negative correlation between content length and content consumption. The content platform was the Medium.com social network, and the sentiment software was an online tool based on an enhanced Naïve Bayes model. To find correlations we used Pearson's correlation coefficient, because it gives both the magnitude of the association and the direction of the relationship.
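As an illustration of the measure named in this abstract, the following is a minimal sketch of Pearson's correlation coefficient in plain Python. The variable names and the sample numbers are invented for the example and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's r for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: article length in words and fraction of readers who finish.
content_length = [300, 800, 1500, 2500, 4000]
read_ratio = [0.90, 0.75, 0.60, 0.40, 0.25]

r = pearson_r(content_length, read_ratio)
print(f"r = {r:.3f}")  # the sign gives the direction, |r| the strength
```

The sign of `r` carries the direction of the relationship and its absolute value the magnitude, which is the property the abstract cites as the reason for choosing this coefficient.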

    Value of expressions behind the letter capitalization in product reviews

    Consumer product reviews are a source of opinions and expressions about purchased items or services, so it is essential to understand the true meaning behind review text. One way to do so is to analyze the sentiments, expressions and emotions behind the text. However, reviews are written in different styles. One widely used style is letter capitalization, which is commonly used to strengthen an expression or convey a louder tone. This paper explores the value of expression behind letter capitalization in product reviews. We compared fully capitalized text, text with one capitalized word and text without capitalization from the readers' perspective by asking them to rate the text on a Likert scale. Furthermore, we tested two samples of text, with and without capitalization, on 27 available online sentiment tools, in order to check how current sentiment tools treat letter capitalization in their sentiment scores. Results show that letter capitalization is able to convey a different level of expression. If the review is positive, capitalization makes it more positive. Similarly, for negative reviews, capitalization tends to increase negativity.

    Effect of N-Gram on Document Classification on the Naïve Bayes Classifier Algorithm

    News has become a major need for everyone, since it provides the information people require. News can be distributed through print mass media, electronic mass media and online media. The channels for spreading news have grown very rapidly, so the amount of information to be managed is larger and the number of words to be classified is not small either. Therefore, we need a system for classifying unstructured documents. In this study, the words in a document are processed with N-Gram as a feature generation step. The document classification process is carried out using the Naïve Bayes Classifier algorithm. This study examines the effect of N-Gram on document classification with the Naïve Bayes Classifier algorithm. The classification accuracy is 32.68% when N-Gram is applied and 84.97% when it is not. The decrease is attributed to the number of features produced by N-Gram splitting that are unique to, or dominant in, another category. These results show that applying N-Gram in document classification with the Naïve Bayes Classifier algorithm decreases the performance of the classification.
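The pipeline described in this abstract, N-Gram feature generation feeding a Naïve Bayes classifier, can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say whether word or character N-Grams were used (word N-Grams are assumed here), the Laplace smoothing is a standard choice rather than the authors' confirmed setting, and the four training headlines are invented.

```python
import math
from collections import Counter, defaultdict


def word_ngrams(text, n):
    """Split text on whitespace and return its word n-grams as strings."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NaiveBayesNgram:
    """Multinomial Naive Bayes over word n-gram counts (Laplace smoothing)."""

    def __init__(self, n=1, alpha=1.0):
        self.n = n
        self.alpha = alpha

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            feats = word_ngrams(text, self.n)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, text):
        feats = word_ngrams(text, self.n)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.class_counts[label] / total_docs)
            for feat in feats:
                score += math.log((counts[feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy corpus, not the news data used in the study.
texts = [
    "team wins the final match",
    "striker scores late goal",
    "parliament passes the budget bill",
    "minister announces new election",
]
labels = ["sport", "sport", "politics", "politics"]

unigram_model = NaiveBayesNgram(n=1).fit(texts, labels)
bigram_model = NaiveBayesNgram(n=2).fit(texts, labels)
print(unigram_model.predict("the team scores a goal"))
print(len(unigram_model.vocab), len(bigram_model.vocab))
```

With `n=2` every feature is a two-word sequence, so a new document shares far fewer features with the training set than it does with single words. That sparsity is one plausible reason an N-Gram model can underperform, though the sketch does not reproduce the accuracy figures reported above.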

    Hidden sentiment behind letter repetition in online reviews

    Minimal research has been done on how letter repetition affects readers' perception of expressed sentiment within a text. To the best of the researchers' knowledge, no studies have tested samples of text with letter repetition using sentiment tools. The main aim of this paper is to investigate whether letter repetition in product reviews is perceived to have any sentiment value, based on ratings by individual participants and analyses using sentiment tools. This study collected and analysed 1,041 consumer reviews in the form of online comments using the UCREL Wmatrix system, and simulated emotional words within the comments to contain repeated letters. A group of 500 participants rated 15 positive comments and 15 negative comments and their respective simulated counterparts, while 32 sentiment tools were used to analyse a positive comment paired with its simulated counterpart and a negative comment paired with its simulated counterpart. Results indicate that readers perceive letter repetition to amplify a comment's sentiment value, and that the effect is stronger in negative comments than in positive comments. On the other hand, analyses using sentiment tools show that a majority of these tools are unable to detect letter repetition within a word and instead treat the word as a spelling mistake. As consumers and online users in general have been found to use letter repetition to intensify and express their sentiments in their comments, this study's findings suggest that letter repetition processing in any text-based mechanism needs to be enhanced. The outcome of this paper is useful for improving sentiment analysis measurement in marketing applications.
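The gap this abstract identifies, tools treating a word with repeated letters as a misspelling, can be illustrated with a small preprocessing sketch. It is not the method used in the paper: the regular expression, the tiny `LEXICON` set and the `normalize` helper are hypothetical, and a real system would use a full dictionary.

```python
import re

# Three or more consecutive occurrences of the same letter.
_REPEAT = re.compile(r"([a-z])\1{2,}")

# Hypothetical mini-lexicon of known words, for illustration only.
LEXICON = {"good", "love", "so", "bad", "cool"}


def normalize(word, lexicon=LEXICON):
    """Return (base_word, was_elongated) for a possibly elongated word.

    Tries collapsing each run to two letters, then to one, and keeps the
    first candidate found in the lexicon. The flag can be passed to a
    sentiment scorer as an intensity cue instead of being discarded.
    """
    lowered = word.lower()
    if _REPEAT.search(lowered) is None:
        return lowered, False
    for keep in (2, 1):
        candidate = _REPEAT.sub(lambda m: m.group(1) * keep, lowered)
        if candidate in lexicon:
            return candidate, True
    # No known base form found: return the input but still flag it.
    return lowered, True


for token in ["goooood", "loooove", "Sooooo", "bad"]:
    print(token, "->", normalize(token))
```

Trying two letters before one matters for words such as "good", where collapsing every run to a single letter would produce a different word.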

    Sentiment analysis on film review in Gujarati language using machine learning

    Sentiment analysis is by far one of the most fundamental areas of natural language processing. It deals with characterizing text in order to determine the attitude of its source. That attitude may be a form of appreciation (positive) or criticism (negative). This paper compares the results obtained by applying classification algorithms with different classifiers, for instance K-nearest neighbors and multinomial naive Bayes. These techniques are used to label a review as either a positive or a negative comment. The data was collected on the basis of polarity film datasets, and a comparison with available results has been made for a thorough evaluation. This paper investigates the influence of word-level count vectorization and term frequency-inverse document frequency (TF-IDF) on film sentiment analysis. We concluded that the multinomial Naive Bayes (MNB) classifier generates more accurate results with the TF-IDF vectorizer than with CountVectorizer, while the K-nearest-neighbors (KNN) classifier has the same accuracy with TF-IDF and CountVectorizer.
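The two feature representations compared in this abstract can be contrasted with a short sketch. It uses the textbook weighting tf × ln(N / df) in plain Python; library implementations such as scikit-learn's `TfidfVectorizer` apply smoothing and normalization by default, so their numbers will differ. The three example reviews are invented.

```python
import math
from collections import Counter


def count_vectors(docs):
    """Return (vocab, rows) where each row holds raw term counts per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    rows = []
    for doc in tokenized:
        freq = Counter(doc)
        rows.append([freq[term] for term in vocab])
    return vocab, rows


def tfidf_vectors(docs):
    """Return (vocab, rows) with counts reweighted by ln(N / document frequency)."""
    vocab, counts = count_vectors(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    return vocab, [[c * w for c, w in zip(row, idf)] for row in counts]


# Hypothetical reviews, not the dataset used in the paper.
docs = [
    "the film was great",
    "the film was boring",
    "the story was great great",
]

vocab, counts = count_vectors(docs)
_, tfidf = tfidf_vectors(docs)
for term, c, w in zip(vocab, counts[1], tfidf[1]):
    print(f"{term:8s} count={c} tfidf={w:.3f}")
```

Words that occur in every review, such as "the" and "was", receive a TF-IDF weight of zero, while a word confined to one review, such as "boring", is weighted most heavily. Raw counts treat all of them alike.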

    Classification of Final Project Abstracts of Diploma III Students at Politeknik Harapan Bersama Tegal

    A student's final project abstract is the essence of the research the student carried out. Final projects cover a variety of themes. Classifying the abstracts by theme remains difficult, however, as the accuracy of previous studies has not reached 90%. This study therefore classifies final project abstracts to make final projects easier to find, and also determines the best attributes from the text processing results using Naive Bayes classification. The problem is that there is no standard guideline for choosing the parameters of this method, so they are chosen experimentally. A method is needed that can solve this problem so that the resulting parameters are closer to optimal. The solution applied is to use a Genetic Algorithm (GA) with Naïve Bayes to determine the best attributes. The results show that optimization with the Genetic Algorithm makes it easier to find optimal parameter values and increases the accuracy of the Naïve Bayes algorithm. The resulting model can therefore be used by people looking for final project references to find suitable ones based on the words of the best attributes.