3 research outputs found

    StEduCov: An Explored and Benchmarked Dataset on Stance Detection in Tweets towards Online Education during COVID-19 Pandemic

    In this paper, we present StEduCov, an annotated dataset for the analysis of stances toward online education during the COVID-19 pandemic. StEduCov consists of 16,572 tweets gathered over 15 months, from March 2020 to May 2021, using the Twitter API. The tweets were manually annotated into the classes agree, disagree or neutral. We benchmarked the dataset using state-of-the-art and traditional machine learning models. Specifically, we trained deep learning models (bidirectional encoder representations from transformers, long short-term memory, convolutional neural networks, attention-based biLSTM and Naive Bayes SVM) in addition to naive Bayes, logistic regression, support vector machines, decision trees, K-nearest neighbor and random forest. The average accuracy of these models in 10-fold cross-validation ranged from 75% to (Formula presented.)% for binary stance classification and from (Formula presented.)% to 68% for multi-class stance classification. Performance was affected by high vocabulary overlap between classes and by unreliable transfer learning when deep models pre-trained on general text were applied to specific domains such as COVID-19 and distance education. 2022 by the authors.
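The 10-fold cross-validation protocol mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' benchmark code: the helper names (`k_fold_indices`, `cross_val_accuracy`, `fit_majority`, `predict_majority`), the majority-class stand-in model, and the toy data are all assumptions introduced here.

```python
from collections import Counter
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_accuracy(fit, predict, texts, labels, k=10):
    """Train on k-1 folds, score on the held-out fold, and average over folds."""
    scores = []
    for held_out in k_fold_indices(len(texts), k):
        held = set(held_out)
        train = [i for i in range(len(texts)) if i not in held]
        model = fit([texts[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, texts[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / k

# Hypothetical stand-in model: always predicts the most frequent training label.
def fit_majority(texts, labels):
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, text):
    return model

# Toy data (not from StEduCov): 30 placeholder tweets with three stance labels.
texts = [f"tweet {i}" for i in range(30)]
labels = ["agree"] * 15 + ["disagree"] * 9 + ["neutral"] * 6
mean_acc = cross_val_accuracy(fit_majority, predict_majority, texts, labels, k=10)
```

Any of the classifiers listed in the abstract could be substituted for the majority-class baseline by supplying a different `fit`/`predict` pair.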

    Empathy and Persona of English vs. Arabic Chatbots: A Survey and Future Directions

    There is a high demand for chatbots across a wide range of sectors. Human-like chatbots engage meaningfully in dialogues while interpreting and expressing emotions and remaining consistent by understanding the user's personality. Though substantial progress has been achieved in developing empathetic chatbots for English, work on Arabic chatbots is still in its early stages due to various challenges associated with the language constructs and dialects. This survey reviews recent literature on approaches to empathetic response generation, persona modelling and datasets for developing chatbots in the English language. In addition, it presents the challenges of applying these approaches to Arabic and outlines some solutions. We focus on open-domain chatbots developed as end-to-end generative systems because of their ability to learn and infer language and emotions. Accordingly, we identify four open problems pertaining to gaps in Arabic and English work, namely: (1) feature representation learning based on multiple dialects; (2) modelling the various facets of a persona and emotions; (3) datasets; and (4) evaluation metrics. 2022, Springer Nature Switzerland AG. Acknowledgments: This work was made possible by NPRP13S-0112-200037 grant from Qatar National Research Fund (a member of Qatar Foundation). The statements made herein are solely the responsibility of the authors.

    Attention-Based Model for Accurate Stance Detection

    Effective representation learning is an essential building block for many natural language processing tasks, such as stance detection, that humans perform implicitly. Stance detection can assist in understanding how individuals react to certain information by revealing the user's stance on a particular topic. In this work, we propose a new attention-based model for learning feature representations and show its effectiveness in the task of stance detection. The proposed model is based on transfer learning and multi-head attention mechanisms. Specifically, we use BERT and word2vec models to learn text representation vectors from the data and pass both of them simultaneously to the multi-head attention layer to help focus on the most informative features. We present five variations of the model, each with a different combination of BERT and word2vec embeddings for the query and value parameters of the attention layer. The performance of the proposed model is evaluated against multiple baseline and state-of-the-art models. The best of the five proposed variations improved accuracy by 0.4% on average and achieved 68.4% accuracy for multi-class classification, while the best accuracy for binary classification was 86.1%, a 1.3% improvement. 2022, Springer Nature Switzerland AG.
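The idea of feeding two embedding types into the query and value inputs of a multi-head attention layer can be sketched as below. This is a rough NumPy illustration under stated assumptions, not the model from the paper: random arrays stand in for BERT and word2vec vectors, random matrices stand in for learned projection weights, keys are derived from the value input (a common default in attention APIs), the output projection is omitted, and the function name `multi_head_attention` and all dimensions are chosen here for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(query, value, num_heads, d_model, rng):
    """query: (T, d_q), value: (T, d_v); keys are derived from `value`."""
    d_head = d_model // num_heads
    # Random projections stand in for learned weights (assumption).
    w_q = rng.normal(size=(query.shape[1], d_model)) / np.sqrt(query.shape[1])
    w_k = rng.normal(size=(value.shape[1], d_model)) / np.sqrt(value.shape[1])
    w_v = rng.normal(size=(value.shape[1], d_model)) / np.sqrt(value.shape[1])

    def split(x):
        # (T, d_model) -> (heads, T, d_head)
        return x.reshape(x.shape[0], num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(query @ w_q), split(value @ w_k), split(value @ w_v)
    # Scaled dot-product attention weights per head: (heads, T, T)
    weights = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))
    heads = weights @ v  # (heads, T, d_head)
    # Concatenate the heads back into (T, d_model)
    out = heads.transpose(1, 0, 2).reshape(query.shape[0], d_model)
    return out, weights

rng = np.random.default_rng(0)
tokens = 12
bert_like = rng.normal(size=(tokens, 768))  # stand-in for BERT token vectors
w2v_like = rng.normal(size=(tokens, 300))   # stand-in for word2vec vectors
out, weights = multi_head_attention(bert_like, w2v_like, num_heads=4, d_model=64, rng=rng)
```

Swapping which embedding is passed as `query` and which as `value` gives the kind of combinations the abstract refers to as model variations.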