783 research outputs found

    Detecting emotions using a combination of bidirectional encoder representations from transformers embedding and bidirectional long short-term memory

    Emotion detection in text is one of the most difficult topics in natural language understanding (NLU), because human emotions are hard to interpret without facial expressions. Because the structure of Indonesian differs from that of other languages, this study focuses on emotion detection in Indonesian text. The nine experimental scenarios combine three word embeddings (bidirectional encoder representations from transformers (BERT), Word2Vec, and GloVe) with three emotion detection models (bidirectional long short-term memory (BiLSTM), LSTM, and convolutional neural network (CNN)). BERT-BiLSTM achieves the highest accuracy: 88.28%, 88.42%, and 89.20% on the Commuter Line, Transjakarta, and Commuter Line+Transjakarta data, respectively. In general, BiLSTM produces the highest accuracy, followed by LSTM and then CNN. Among the word embeddings, BERT outperforms Word2Vec and GloVe. BERT-BiLSTM also yields the highest precision, recall, and F1-measure in each data scenario. According to these results, BERT-BiLSTM improves classification performance compared with previous studies that used only BERT or only BiLSTM for emotion detection in Indonesian text.

    Flagging clickbait in Indonesian online news websites using fine-tuned transformers

    Click counts are related to the amount of money that online advertisers pay to news sites. This business model has pushed some news sites to resort to clickbait, i.e., headlines that use hyperbolic or intriguing words, and sometimes unfinished sentences, to deliberately tease readers. Some Indonesian online news sites have also adopted clickbait, which indirectly degrades the credibility of other established news sites. A neural network was trained to detect clickbait headlines; it uses the pre-trained multilingual bidirectional encoder representations from transformers (BERT) language model as an embedding layer, followed by a 100-node hidden layer and a sigmoid classifier. With a total of 6,632 headlines as a training dataset, the classifier performed well. Evaluated with 5-fold cross-validation, it achieved an accuracy of 0.914, an F1-score of 0.914, a precision of 0.916, and a receiver operating characteristic area under the curve (ROC-AUC) of 0.92. The results show that multilingual BERT is usable for Indonesian text classification and can be enhanced further. Future possibilities, societal impact, and limitations of clickbait detection are discussed.
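    The evaluation protocol named in this abstract (5-fold cross-validation reporting accuracy, precision, and F1) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names `k_fold_indices`, `binary_metrics`, and `cross_validate`, the shuffling seed, and the `predict_fold` callback are all hypothetical, and ROC-AUC is left out because it needs classifier scores rather than hard labels.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices once and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    return [indices[i::k] for i in range(k)]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for labels in {0, 1} (1 = clickbait)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def cross_validate(labels, predict_fold, k=5, seed=0):
    """Average the metrics over k folds.

    predict_fold(train_idx, test_idx) is a hypothetical callback that trains
    on train_idx and returns one predicted label per entry of test_idx.
    """
    folds = k_fold_indices(len(labels), k, seed)
    totals = {"accuracy": 0.0, "precision": 0.0, "recall": 0.0, "f1": 0.0}
    for i, test_idx in enumerate(folds):
        # Every fold except the held-out one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        y_true = [labels[j] for j in test_idx]
        y_pred = predict_fold(train_idx, test_idx)
        for name, value in binary_metrics(y_true, y_pred).items():
            totals[name] += value
    return {name: value / k for name, value in totals.items()}
```

    With the 6,632 headlines mentioned above, `k_fold_indices(6632)` would hold out roughly 1,326 headlines per fold; a real `predict_fold` would fine-tune or fit the classifier on the remaining four folds before predicting.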

    Big five personality prediction based in Indonesian tweets using machine learning methods

    The popularity of social media has drawn the attention of researchers, who have conducted cross-disciplinary studies examining the relationship between personality traits and behavior on social media. Most current work focuses on personality prediction from English texts, while Indonesian has received scant attention. This research therefore aims to predict users' personalities from Indonesian social media text using machine learning techniques. The paper evaluates several techniques, including naive Bayes (NB), K-nearest neighbors (KNN), and support vector machine (SVM), using semantic features that include emotion, sentiment, and publicly available Twitter profile information. Personality is predicted according to the Big Five personality model, which the authors consider the most appropriate model for predicting user personality on social media. The relationships between the semantic features and the Big Five personality dimensions are examined. The experimental results indicate that the Big Five personality dimensions exhibit distinct emotional, sentimental, and social characteristics, and that SVM outperformed NB and KNN for Indonesian. In addition, several Indonesian terms are observed that refer specifically to each personality type, each of which has distinct emotional, sentimental, and social features.