988 research outputs found

    Sentiment analysis in MOOCs: a case study

    Proceeding of: 2018 IEEE Global Engineering Education Conference (EDUCON2018), 17-20 April, 2018, Santa Cruz de Tenerife, Canary Islands, Spain.
    Forum messages in MOOCs (Massive Open Online Courses) are the most important source of information about the social interactions happening in these courses. Forum messages can be analyzed to detect patterns and learners' behaviors. In particular, sentiment analysis (e.g., classification into positive and negative messages) can be used as a first step for identifying complex emotions, such as excitement, frustration or boredom. The aim of this work is to compare different machine learning algorithms for sentiment analysis, using a real case study to check how the results can provide information about learners' emotions or patterns in the MOOC. Both supervised and unsupervised (lexicon-based) algorithms were used for the sentiment analysis. The best approaches found were Random Forest and one lexicon-based method, which used dictionaries of words. The analysis of the case study also showed an evolution of positivity over time, with the best moment at the beginning of the course and the worst near the deadlines of peer-review assessments.
    This work has been co-funded by the Madrid Regional Government, through the eMadrid Excellence Network (S2013/ICE-2715), by the European Commission through Erasmus+ projects MOOC-Maker (561533-EPP-1-2015-1-ESEPPKA2-CBHE-JP), SHEILA (562080-EPP-1-2015-1-BEEPPKA3-PI-FORWARD), and LALA (586120-EPP-1-2017-1-ES-EPPKA2-CBHE-JP), and by the Spanish Ministry of Economy and Competitiveness, projects SNOLA (TIN2015-71669-REDT), RESET (TIN2014-53199-C3-1-R) and Smartlet (TIN2017-85179-C3-1-R). The latter is financed by the State Research Agency in Spain (AEI) and the European Regional Development Fund (FEDER). It has also been supported by the Spanish Ministry of Education, Culture and Sport, under an FPU fellowship (FPU016/00526).

    MOOCs Meet Measurement Theory: A Topic-Modelling Approach

    This paper adapts topic models to the psychometric testing of MOOC students based on their online forum postings. Measurement theory from education and psychology provides statistical models for quantifying a person's attainment of intangible attributes such as attitudes, abilities or intelligence. Such models infer latent skill levels by relating them to individuals' observed responses on a series of items such as quiz questions. The set of items can be used to measure a latent skill if individuals' responses on them conform to a Guttman scale. Such well-scaled items differentiate between individuals, and the inferred levels span the entire range from the most basic to the most advanced. In practice, education researchers manually devise items (quiz questions) while optimising well-scaled conformance. Due to the costly nature and expert requirements of this process, psychometric testing has found limited use in everyday teaching. We aim to develop usable measurement models for highly-instrumented MOOC delivery platforms, by using participation in automatically-extracted online forum topics as items. The challenge is to formalise the Guttman scale educational constraint and incorporate it into topic models. To favour topics that automatically conform to a Guttman scale, we introduce a novel regularisation into non-negative matrix factorisation-based topic modelling. We demonstrate the suitability of our approach with both quantitative experiments on three Coursera MOOCs, and with a qualitative survey of topic interpretability on two MOOCs by domain expert interviews. Comment: 12 pages, 9 figures; accepted into AAAI'201
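    The Guttman scale condition mentioned in this abstract can be illustrated with a minimal sketch. It assumes a simplified setting that is not taken from the paper: a binary response matrix in which rows are learners and columns are items (1 means the learner responded positively to, or participated in, the item). The function name `is_guttman_scale` and the sample matrices are hypothetical. The check only tests perfect conformance; it does not reproduce the regularised matrix factorisation the authors propose.

```python
from typing import List


def is_guttman_scale(responses: List[List[int]]) -> bool:
    """Return True if a binary response matrix forms a perfect Guttman scale.

    Rows are learners, columns are items. The matrix conforms when the
    items can be ordered from easiest to hardest so that every learner's
    row is a run of 1s followed by a run of 0s.
    """
    if not responses:
        return True
    n_items = len(responses[0])
    # Items endorsed by more learners are treated as easier.
    totals = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -totals[j])
    for row in responses:
        ordered = [row[j] for j in order]
        # A 0 followed later by a 1 means a harder item was passed
        # while an easier one was failed.
        if any(a < b for a, b in zip(ordered, ordered[1:])):
            return False
    return True


# Hypothetical data: each learner passes a prefix of the items.
conforming = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
# Hypothetical data: the last two items are passed by disjoint learners.
violating = [[1, 0, 1], [1, 1, 0], [1, 0, 0]]

print(is_guttman_scale(conforming))
print(is_guttman_scale(violating))
```

    Sorting by column totals is sufficient here because, on a perfect scale, two items endorsed by the same number of learners must have identical columns, so ties cannot change the outcome.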

    Sentiment Analysis of Nigerian Students’ Tweets on Education: A Data Mining Approach

    This paper investigates data mining technologies by collecting tweets from Nigerian university students about how they feel about the current state of the Nigerian university system. The tweets, collected using the Twitter Application, were pre-processed and then converted from text to a vector representation using a feature extraction technique such as Bag-of-Words. The proposed sentiment analysis architecture was designed using UML, and the Naïve Bayes classifier (NBC), a simple but effective classifier, was applied to compute the class probabilities and determine the polarity of the education dataset, labelling each tweet as negative or positive. After data cleaning, 4016 of the collected records were used. Positive sentiments accounted for 40.56% and negative sentiments for 59.44% of the data. The dataset was divided into training and test sets in a 70:30 ratio, with the Naïve Bayes classifier trained on the training set and evaluated on the test set. Because the models were trained on unbalanced data, we employed more relevant evaluation metrics such as precision, recall, F1-score, and balanced accuracy. The classifier's prediction accuracy, misclassification error rate, recall, precision, and F1-score were 63%, 37%, 63%, 62%, and 62% respectively. All analyses were completed using the Python programming language and the Natural Language Toolkit (NLTK) packages. The outcome of each prediction is the class with the highest likelihood. These predictions can be used by the Nigerian Government to improve the educational system and help students receive a better education.
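    The evaluation metrics named in this abstract can be computed from the four cells of a binary confusion matrix. The sketch below is illustrative only: the function name `binary_metrics` and the counts passed to it are hypothetical and are not the paper's data, and it treats one class as "positive" rather than averaging over both classes, which the paper may have done.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute common metrics from binary confusion-matrix counts."""

    def ratio(num: int, den: int) -> float:
        # Guard against empty denominators (e.g. no predicted positives).
        return num / den if den else 0.0

    total = tp + fp + fn + tn
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # true positive rate
    specificity = ratio(tn, tn + fp)     # true negative rate
    f1_den = precision + recall
    return {
        "accuracy": ratio(tp + tn, total),
        "error_rate": ratio(fp + fn, total),
        "precision": precision,
        "recall": recall,
        "f1": (2 * precision * recall / f1_den) if f1_den else 0.0,
        # Balanced accuracy averages the recall of both classes, which
        # makes it less sensitive to class imbalance than accuracy.
        "balanced_accuracy": (recall + specificity) / 2,
    }


# Hypothetical counts chosen for illustration (not from the paper).
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

    Accuracy and misclassification error rate always sum to 1, which is consistent with the 63% and 37% figures reported above.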

    A Probabilistic Approach to Modeling Socio-Behavioral Interactions

    In our ever-increasingly connected world, it is essential to build computational models that represent, reason, and model the underlying characteristics of real-world networks. Data generated from these networks are often heterogeneous, interlinked, and exhibit rich multi-relational graph structures having unobserved latent characteristics. My work focuses on building computational models for representing and reasoning about rich, heterogeneous, interlinked graph data. In my research, I model socio-behavioral interactions and predict user behavior patterns in two important online interaction platforms: online courses and online professional networks. Structured data from these interaction platforms contain rich behavioral and interaction data, and provide an opportunity to design machine learning methods for understanding and interpreting user behavior. The data also contains unstructured data, such as natural language text from forum posts and other online discussions. My research aims at constructing a family of probabilistic models for modeling social interactions involving both structured and unstructured data. In the early part of this thesis, I present a family of probabilistic models for online courses for: 1) modeling student engagement, 2) predicting student completion and dropouts, 3) modeling student sentiment toward various course aspects (e.g., content vs. logistics), 4) detecting coarse and fine-grained course aspects (e.g., grading, video, content), and 5) modeling evolution of topics in repeated offerings of online courses. These methods have the potential to improve student experience and focus limited instructor resources in ways that will have the most impact. In the latter part of this thesis, I present methods to model multi-relational influence in online professional networks. I test the effectiveness of this model via experimentation on the professional network, LinkedIn. 
    My models can potentially be adapted to address a wide range of problems in real-world networks including predicting user interests, user retention, personalization, and making recommendations.

    An algorithm and a tool for the automatic grading of MOOC learners from their contributions in the discussion forum

    MOOCs (massive open online courses) have a built-in forum where learners can share experiences as well as ask questions and get answers. Nevertheless, the work of the learners in the MOOC forum is usually not taken into account when calculating their grade in the course, due to the difficulty of automating the calculation of that grade in a context with a very large number of learners. In some situations, discussion forums might even be the only available evidence to grade learners. In other situations, forum interactions could serve as a complement for calculating the grade in addition to traditional summative assessment activities. This paper proposes an algorithm to automatically calculate learners' grades in the MOOC forum, considering both the quantitative dimension and the relevance of their contributions. In addition, the algorithm has been implemented within a web application, providing instructors with a visual and a numerical representation of the grade for each learner. An exploratory analysis is carried out to assess the algorithm and the tool with a MOOC on programming, obtaining a moderate positive correlation between the forum grades provided by the algorithm and the grades obtained through the summative assessment activities. Nevertheless, the complementary analysis conducted indicates that this correlation may not be enough to use the forum grades as predictors of the grades obtained through summative assessment activities.
    This work was supported in part by the FEDER/Ministerio de Ciencia, Innovación y Universidades; Agencia Estatal de Investigación, through the Smartlet Project under Grant TIN2017-85179-C3-1-R, and in part by the Madrid Regional Government through the e-Madrid-CM Project under Grant S2018/TCS-4307, a project which is co-funded by the European Structural Funds (FSE and FEDER). Partial support has also been received from the European Commission through Erasmus+ Capacity Building in the Field of Higher Education projects, more specifically through projects LALA (586120-EPP-1-2017-1-ES-EPPKA2-CBHE-JP), InnovaT (598758-EPP-1-2018-1-AT-EPPKA2-CBHE-JP), and PROF-XXI (609767-EPP-1-2019-1-ES-EPPKA2-CBHE-JP).

    ASPECT-BASED SENTIMENT ANALYSIS FOR UNIVERSITY TEACHING ANALYTICS

    Aspect-based sentiment analysis (ABSA) is a natural language processing method for analyzing sentiments in large amounts of unstructured text at the fine-grained level of individual aspects. In this research work, we apply it to analyze open text replies from surveys regarding online teaching. Like most other educational institutions, Copenhagen Business School (CBS) had to shift to online teaching from one day to the next. Using ABSA, we investigated the impact of this forced online learning experiment on teaching quality in the spring semester of 2020. Our findings reveal that students disliked online teaching due to insufficient information and unadjusted teaching methods. However, students liked its flexibility and the possibility to learn at an individual pace. We show that ABSA can extract valuable information in an easily interpretable manner to support teaching and learning processes. Finally, our findings show that ABSA is a valuable tool to analyze unstructured text quantitatively.

    Computational Sociolinguistics: A Survey

    Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges. Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

    SocialVisTUM: An Interactive Visualization Toolkit for Correlated Neural Topic Models on Social Media Opinion Mining

    Recent research in opinion mining proposed word embedding-based topic modeling methods that provide superior coherence compared to traditional topic modeling. In this paper, we demonstrate how these methods can be used to display correlated topic models on social media texts using SocialVisTUM, our proposed interactive visualization toolkit. It displays a graph with topics as nodes and their correlations as edges. Further details are displayed interactively to support the exploration of large text collections, e.g., representative words and sentences of topics, topic and sentiment distributions, hierarchical topic clustering, and customizable, predefined topic labels. The toolkit optimizes automatically on custom data for optimal coherence. We show a working instance of the toolkit on data crawled from English social media discussions about organic food consumption. The visualization confirms findings of a qualitative consumer research study. SocialVisTUM and its training procedures are accessible online. Comment: Demo paper accepted for publication on RANLP 2021; 8 pages, 5 figures, 1 tabl