338 research outputs found

    Analyzing the behavior of students regarding learning activities, badges, and academic dishonesty in MOOC environment

    The ‘big data’ scene has brought new improvement opportunities to most products and services, including education. Web-based learning has become widespread over the last decade and, together with the Massive Open Online Course (MOOC) phenomenon, it has enabled the collection of large and rich data samples on the interaction of students with these online educational environments. We have identified several areas in the literature that still need improvement and further research, particularly in the context of MOOCs and Small Private Online Courses (SPOCs), where we focus our data analysis on the platforms Khan Academy, Open edX and Coursera. More specifically, we work on learning analytics visualization dashboards and carry out an evaluation of these visual analytics tools. Additionally, we delve into the activity and behavior of students with regular and optional activities, badges, and their academically dishonest conduct online. The analysis of student activity and behavior is divided into three parts. First, an exploratory analysis provides descriptive and inferential statistics, such as correlations and group comparisons, as well as numerous visualizations that help convey understandable information. Second, we apply clustering analysis to find different student profiles for different purposes, e.g., to analyze the potential adaptation of learning experiences and its pedagogical implications. Third, we provide three machine learning models: two predict learning outcomes (learning gains and certificate accomplishment) and one classifies submissions as illicit or not. We also use these models to discuss the importance of variables. Finally, we discuss our results in terms of the motivation of students, student profiling, instructional design, potential actuators and the evaluation of visual analytics dashboards, providing recommendations to improve future educational experiments.
    International Mention in the doctoral degree. Official Doctoral Programme in Telematics Engineering. President: Davinia Hernández Leo. Secretary: Luis Sánchez Fernández. Member: Adolfo Ruiz Callej

    Predicting Certification in MOOCs based on Students’ Weekly Activities

    Massive Open Online Courses (MOOCs) have been growing rapidly, offering low-cost knowledge for both learners and content providers. However, the level of course purchasing is currently very low: fewer than 1% of the students enrolled on a given online course opt to purchase its certificate. This can seriously affect the business model of MOOCs. Nevertheless, research on learners’ purchasing behaviour in MOOCs remains limited. Thus, the umbrella question that this work tackles is whether learners’ data can predict their purchasing decision (certification). Our fine-grained analysis attempts to uncover the latent correlation between learner activities and their decision to purchase. We used a relatively large dataset of 5 courses across 23 runs, obtained from the less-studied MOOC platform FutureLearn, to (1) statistically compare the activities of non-paying learners with those of course purchasers, and (2) predict course certification using different classifiers, optimising for this naturally strongly imbalanced dataset. Our results show that learner activities are good predictors of course purchasability; still, the main challenge was that of early prediction. Using only each student’s number of step accesses, attempts, and correct and wrong answers, our models achieve promising accuracies, ranging between 0.81 and 0.95 across the five courses. The outcomes of this study are expected to help design future courses and predict the profitability of future runs; they may also help determine what personalisation features could be provided to increase MOOC revenue

    Predicting Paid Certification in Massive Open Online Courses

    Massive open online courses (MOOCs) have been proliferating because of the free or low-cost offering of content for learners, attracting the attention of many stakeholders across the entire educational landscape. Since 2012, coined as “the Year of the MOOCs”, several platforms have gathered millions of learners in just a decade. Nevertheless, the certification rate of both free and paid courses has been low, and only about 4.5–13% and 1–3%, respectively, of the total number of enrolled learners obtain a certificate at the end of their courses. Still, most research concentrates on completion, ignoring the certification problem, and especially its financial aspects. Thus, the research described in the present thesis aimed to investigate paid certification in MOOCs, for the first time, in a comprehensive way, and as early as the first week of the course, by exploring its various levels. First, the latent correlation between learner activities and their paid certification decisions was examined by (1) statistically comparing the activities of non-paying learners with course purchasers and (2) predicting paid certification using different machine learning (ML) techniques. Our temporal (weekly) analysis showed statistical significance at various levels when comparing the activities of non-paying learners with those of the certificate purchasers across the five courses analysed. Furthermore, we used the learner’s activities (number of step accesses, attempts, correct and wrong answers, and time spent on learning steps) to build our paid certification predictor, which achieved promising balanced accuracies (BAs), ranging from 0.77 to 0.95. Having employed simple predictions based on a few clickstream variables, we then analysed more in-depth what other information can be extracted from MOOC interaction (namely discussion forums) for paid certification prediction. 
However, to better explore the learners’ discussion forums, we built, as an original contribution, MOOCSent, a cross-platform review-based sentiment classifier, using over 1.2 million MOOC sentiment-labelled reviews. MOOCSent addresses various limitations of current sentiment classifiers, including (1) using a single source of data (previous literature on sentiment classification in MOOCs was based on single platforms only, and hence less generalisable, with a relatively low number of instances compared to our dataset); (2) limited model outputs, where most current models are 2-polar classifiers (positive or negative only); (3) disregarding important sentiment indicators, such as emojis and emoticons, during text embedding; and (4) reporting average performance metrics only, which prevents the evaluation of model performance at the level of each class (sentiment). Finally, with the help of MOOCSent, we used the learners’ discussion forums to predict paid certification, after annotating learners’ comments and replies with their sentiment. This multi-input model takes raw data (learner textual inputs), the sentiment classification generated by MOOCSent, computed features (number of likes received for each textual input), and several features extracted from the texts (character counts, word counts, and part-of-speech (POS) tags for each textual instance). This experiment adopted various deep predictive approaches, specifically ones that allow a multi-input architecture, to investigate early (i.e., weekly) whether data obtained from MOOC learners’ interaction in discussion forums can predict learners’ purchase decisions (certification). 
Considering the staggeringly low rate of paid certification in MOOCs, the present thesis contributes to the field of MOOC learner analytics by predicting paid certification, for the first time, at such a comprehensive (with data from over 200 thousand learners from 5 courses in different disciplines), actionable (analysing learners’ decisions from the first week of the course) and longitudinal (with 23 runs from 2013 to 2017) scale. The present thesis contributes by (1) investigating various conventional and deep ML approaches for predicting paid certification in MOOCs using learner clickstreams (Chapter 5) and course discussion forums (Chapter 7), (2) building the largest MOOC sentiment classifier (MOOCSent), based on learners’ reviews of courses from the leading MOOC platforms, namely Coursera, FutureLearn and Udemy, which handles emojis and emoticons using dedicated lexicons that contain over three thousand corresponding explanatory words/phrases, and (3) proposing and developing, for the first time, a multi-input model for predicting certification based on data from discussion forums, which simultaneously processes the textual (comments and replies) and numerical (number of likes posted and received, sentiments) data from the forums, adapting a suitable classifier for each type of data, as explained in detail in Chapter 7
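
The abstract above reports balanced accuracies (BAs) rather than plain accuracy, which matters when only a few percent of learners purchase a certificate. The following is a minimal sketch of that metric, not the thesis's code: the function names and the toy labels (10 purchasers among 1,000 learners) are assumptions made purely for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of learners whose label was predicted correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the recall on each class (1 = purchased, 0 = did not purchase)."""
    positives = sum(1 for t in y_true if t == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present in y_true")
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return 0.5 * (true_pos / positives + true_neg / negatives)


# Hypothetical toy data: 1,000 learners, 10 of whom purchased (1%).
y_true = [1] * 10 + [0] * 990
# A trivial "nobody purchases" predictor.
always_no = [0] * 1000

plain = accuracy(y_true, always_no)
balanced = balanced_accuracy(y_true, always_no)
print(plain, balanced)
```

On this toy data the trivial predictor looks strong under plain accuracy but scores no better than chance under balanced accuracy, which is why the latter is the more informative figure for strongly imbalanced certification data.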

    An approach to build in situ models for the prediction of the decrease of academic engagement indicators in Massive Open Online Courses

    Scientific Production. The early detection of learners who are expected to disengage from typical MOOC tasks, such as watching lecture videos or submitting assignments, is necessary to enable timely interventions aimed at preventing it. This can be done by predicting the decrease of academic engagement indicators that can be derived for different MOOC tasks and computed for each learner. A posteriori prediction models can yield good performance but cannot be built using the information that is available in an ongoing course at the moment the predictions are required. This paper proposes an approach to build in situ prediction models using such information. Models were derived following both approaches and employed to predict the decrease of three indicators that quantify the engagement of learners with the main tasks typically proposed in a MOOC: watching lectures, solving finger exercises, and submitting assignments. The results show that in situ models performed well in predicting all engagement indicators, thus showing the feasibility of the proposed approach. This performance was very similar to that of a posteriori models, which have the clear disadvantage that they cannot be used to make predictions in an ongoing course based on its data.
    Ministerio de Economía, Industria y Competitividad (Projects TIN2014-53199-C3-2-R (AEI, FEDER), TIN2017-85179-C3-2-R). Junta de Castilla y León (research project support programme, Ref. VA277U14). European Commission (Project 588438-EPP-1-2017-1-EL-EPPKA2-KA

    Who is a Student: Completion in Coursera Courses at Duke University

    Much of the interest in MOOCs centers on questions about who completes them. Duke’s Coursera-based Massive Open Online Courses (MOOCs) confirm many demographic trends previously delineated by researchers at peer institutions. Consistent with previous research, this study found that individuals who speak English as a first language and who have already earned at least a bachelor’s degree are the most likely to complete a Coursera course. MOOC researchers to date have not, however, developed clear operational definitions of who constitutes a learner at the outset of the course. This paper proposes some possible definitions to standardize future research. Further, this study looked at factors that predict different learner participation levels and investigated which activities predict Coursera course completion. Study results indicated that viewing online forums and participating in online discussions are both predictive of course completion. The findings suggest that the socio-demographic composition of the group being investigated will depend on how researchers elect to define what a student is. Thus, while any of the definitions presented in this paper may be appropriate, depending on what is being studied, the decision of which definition to use should be intentional

    Generalizing predictive models of admission test success based on online interactions

    This article belongs to the Special Issue Sustainability of Learning Analytics. To start medical or dentistry studies in Flanders, prospective students need to pass a central admission test. A blended program with four Small Private Online Courses (SPOCs) was designed to support those students. The logs from the platform provide an opportunity to delve into the learners' interactions and to develop predictive models to forecast success in the test. Moreover, the use of different courses makes it possible to analyze how models generalize across courses. This article has the following objectives: (1) to develop and analyze predictive models to forecast who will pass the admission test, (2) to discover which variables have the greatest effect on success in different courses, (3) to analyze to what extent models can be generalized to other courses and subsequent cohorts, and (4) to discuss the conditions needed to achieve generalizability. The results show that the average grade in SPOC exercises using only first attempts is the best predictor and that predictive models can be transferred with enough reliability when some context-related conditions are met. The best performance is achieved when transferring within the same cohort to other SPOCs in a similar context. The performance is still acceptable in a consecutive edition of a course. These findings support the sustainability of predictive models. This work was partially funded by the LALA project (grant no. 586120-EPP-1-2017-1-ES-EPPKA2-CBHE-JP). The LALA project has been funded with support from the European Commission. In addition, this work has been partially funded by FEDER/Ministerio de Ciencia, Innovación y Universidades—Agencia Estatal de Investigación/project Smartlet (TIN2017-85179-C3-1-R) and by the Madrid Regional Government through the project e-Madrid-CM (S2018/TCS-4307). The latter is also cofinanced by the Structural Funds (FSE and FEDER). It has also been supported by the Spanish Ministry of Science, Innovation, and Universities, under an FPU fellowship (FPU016/00526
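
The abstract above names the average grade in SPOC exercises, counting only first attempts, as the best predictor. As a rough illustration of how such a feature might be derived from platform logs, here is a minimal sketch; the log schema (learner, exercise, attempt number, grade), the function name and the sample records are all hypothetical and are not taken from the article.

```python
from collections import defaultdict


def first_attempt_average(attempts):
    """Average grade per learner, using only the first attempt at each exercise.

    `attempts` is an iterable of (learner_id, exercise_id, attempt_number, grade)
    tuples; this schema is an assumption made for the example.
    """
    # Keep the lowest-numbered attempt for every (learner, exercise) pair.
    first = {}
    for learner, exercise, attempt_no, grade in attempts:
        key = (learner, exercise)
        if key not in first or attempt_no < first[key][0]:
            first[key] = (attempt_no, grade)

    # Accumulate [sum of grades, number of exercises] per learner.
    totals = defaultdict(lambda: [0.0, 0])
    for (learner, _exercise), (_attempt_no, grade) in first.items():
        totals[learner][0] += grade
        totals[learner][1] += 1

    return {learner: total / count for learner, (total, count) in totals.items()}


# Hypothetical log: "ana" retries ex1, but only her first try counts.
log = [
    ("ana", "ex1", 1, 0.5),
    ("ana", "ex1", 2, 1.0),
    ("ana", "ex2", 1, 1.0),
    ("ben", "ex1", 1, 0.0),
]
features = first_attempt_average(log)
print(features)
```

Ignoring retries keeps the feature from being inflated by learners who resubmit until they get full marks, which is one plausible reason a first-attempt average could carry more signal than a best-attempt average.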