Analyzing the behavior of students regarding learning activities, badges, and academic dishonesty in MOOC environment
The ‘big data’ scene has brought new improvement opportunities to most products and services, including education. Web-based learning has become very widespread over the last decade, which, in conjunction with the Massive Open Online Course (MOOC) phenomenon, has enabled the collection of large and rich data samples on the interaction of students with these online educational environments.
We have detected several areas in the literature that still need improvement and further research, particularly in the context of MOOCs and Small Private Online Courses (SPOCs); we focus our data analysis on the Khan Academy, Open edX and Coursera platforms. More specifically, we work on learning analytics visualization dashboards and carry out an evaluation of these visual analytics tools. Additionally, we delve into the activity and behavior of students with regular and optional activities, badges, and online academically dishonest conduct. The analysis of student activity and behavior is divided into, first, an exploratory analysis providing descriptive and inferential statistics, such as correlations and group comparisons, together with numerous visualizations that help convey the information in an understandable way. Second, we apply cluster analysis to find different student profiles for different purposes, e.g. to analyze the potential adaptation of learning experiences and its pedagogical implications. Third, we provide three machine learning models: two to predict learning outcomes (learning gains and certificate attainment) and one to classify submissions as illicit or not. We also use these models to discuss the importance of the variables involved.
Finally, we discuss our results in terms of student motivation, student profiling, instructional design, potential actuators and the evaluation of visual analytics dashboards, providing recommendations to improve future educational experiments.
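As a rough illustration of the clustering step described above, the following is a minimal sketch that assumes a hypothetical table of per-student activity features; the feature names and values are illustrative, not the variables actually extracted in the thesis.

```python
# Minimal sketch of the profile-clustering idea (hypothetical feature names,
# not the actual variables or platform exports used in the thesis).
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Hypothetical per-student activity features aggregated from platform logs.
students = pd.DataFrame({
    "video_minutes":       [120, 15, 300, 45, 200, 5],
    "exercises_attempted": [40, 5, 90, 20, 70, 2],
    "badges_earned":       [3, 0, 8, 1, 6, 0],
})

X = StandardScaler().fit_transform(students)

# Try a small range of cluster counts and keep the best by silhouette score.
best_k, best_score = 2, -1.0
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)
    if score > best_score:
        best_k, best_score = k, score

students["profile"] = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
print(students.groupby("profile").mean())  # inspect the resulting student profiles
```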
El Proceso de Implementación de Analíticas de Aprendizaje
With the takeoff in popularity of the learning analytics field during the last decade, numerous research studies have emerged and public opinion has echoed this trend as well. However, the fact is that the impact the field has had in practice has been quite limited, and there has been little transfer to educational institutions. One of the possible causes is the high complexity of the field and the lack of clear implementation processes; therefore, in this work we propose a pragmatic implementation process for learning analytics in five stages: 1) learning environments, 2) raw data capture, 3) data tidying and feature engineering, 4) analysis and modelling, and 5) educational application. In addition, we review a series of cross-cutting factors that affect this implementation, such as technology, learning sciences, privacy, institutions and educational policies. The detailed process can be helpful for researchers, educational data analysts, teachers and educational institutions that are looking to start working in this area. Achieving the true potential of learning analytics will require close collaboration and conversation between all the actors involved in its development, which might eventually lead to the desired systematic and productive implementation.
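To make the shape of the five-stage process more concrete, the sketch below chains the stages as plain functions. The function names and the toy feature (an event count per student) are hypothetical illustrations, not part of the paper's proposal.

```python
# Sketch of the five-stage structure as a chain of functions (hypothetical
# names; the paper proposes a process, not this particular code).
from typing import Any

def capture_raw_data(learning_environment: Any) -> list[dict]:
    """Stage 2: export raw interaction events (clicks, video plays, submissions)."""
    return learning_environment.export_events()

def tidy_and_engineer_features(events: list[dict]) -> dict:
    """Stage 3: clean events and aggregate them into per-student features."""
    features: dict[str, dict[str, float]] = {}
    for event in events:
        student = features.setdefault(event["user_id"], {"n_events": 0.0})
        student["n_events"] += 1
    return features

def analyse_and_model(features: dict) -> dict:
    """Stage 4: descriptive statistics and/or predictive models."""
    total = sum(f["n_events"] for f in features.values())
    return {"avg_events_per_student": total / max(len(features), 1)}

def apply_in_education(results: dict) -> None:
    """Stage 5: feed results back to teachers and students (dashboards, alerts)."""
    print("Report for instructors:", results)

# Stage 1 (the learning environment) is whatever platform produces the events.
```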
Analítica del aprendizaje y educación basada en datos: Un campo en expansión
The growing presence of digital mediation systems in most educational spaces —whether face-to-face or not, formalized or open, and at basic or lifelong learning levels— has accelerated the advance of learning analytics and made the use of data in education a common practice. Digital educational tools facilitate the interaction between students, teachers and learning resources in the digital world and generate a remarkable volume of data that can be analyzed by applying a variety of methodologies. Thus, research focused on the information generated by student activity in digital spaces has risen exponentially. Based on this evidence, this special issue presents a set of studies in the fields of data-driven educational research and digital learning, which enrich knowledge about learning processes and the management of teaching in digitally mediated spaces.
A Survey on Data-Driven Evaluation of Competencies and Capabilities Across Multimedia Environments
The rapid evolution of technology directly impacts the skills and jobs that will be needed in the next decade. Users can, intentionally or unintentionally, develop different skills by creating, interacting with, and consuming content from online environments and portals where informal learning can emerge. These environments generate large amounts of data; therefore, big data can have a significant impact on education. Moreover, the educational landscape has been shifting from a focus on content to a focus on the competencies and capabilities that will prepare our society for an unknown future during the 21st century. Therefore, the main goal of this literature survey is to examine diverse technology-mediated environments that can generate rich data sets through user interaction and whose data can be used to explicitly or implicitly perform a data-driven evaluation of different competencies and capabilities. We thoroughly and comprehensively surveyed the state of the art to identify and analyse digital environments, the data they produce and the capabilities they can measure and/or develop. Our survey revealed four key multimedia environments that fulfilled our goal: sites for content sharing and consumption, video games, online learning platforms and social networks. Moreover, different methods were used to measure a large array of diverse capabilities such as expertise, language proficiency and soft skills. Our results demonstrate the potential of data from diverse digital environments to support the development of lifelong and lifewide 21st-century capabilities for the future society.
Identifying Experts in Question & Answer Portals: A Case Study on Data Science Competencies in Reddit
The irreplaceable key to the triumph of Question & Answer (Q&A) platforms is their users, who provide high-quality answers to the challenging questions posted across various topics of interest. Recently, the expert finding problem has attracted much attention in information retrieval research. In this work, we inspect the feasibility of a supervised learning model to identify data science experts on Reddit. Our method is based on manual coding, in which two data science experts labelled comments as expert, non-expert and out-of-scope. We present a semi-supervised approach using the activity behaviour of every user, including Natural Language Processing (NLP), crowdsourced and user feature sets. We conclude that the NLP and user feature sets contribute the most to the identification of these three classes, which means that this method can generalise well within the domain. Moreover, we characterise different types of users, which can be helpful for detecting them in the future.
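As a hedged sketch of the general idea of combining an NLP feature set with a user feature set in a single classifier, something like the following could be used; the data, feature names and pipeline are illustrative assumptions, not the paper's actual Reddit features or labelling.

```python
# Minimal sketch of a three-class comment classifier (expert / non-expert /
# out-of-scope) combining text and user-activity features. Data and feature
# names are illustrative, not the paper's actual Reddit feature sets.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

comments = pd.DataFrame({
    "body": ["Use cross-validation to avoid overfitting",
             "lol nice meme",
             "What is the best laptop for gaming?"],
    "author_karma":   [5400, 120, 300],
    "author_n_posts": [210, 15, 40],
    "label": ["expert", "non-expert", "out-of-scope"],
})

preprocess = ColumnTransformer([
    ("text", TfidfVectorizer(), "body"),                          # NLP feature set
    ("user", "passthrough", ["author_karma", "author_n_posts"]),  # user feature set
])

model = Pipeline([("features", preprocess),
                  ("clf", LogisticRegression(max_iter=1000))])
X = comments[["body", "author_karma", "author_n_posts"]]
model.fit(X, comments["label"])
print(model.predict(X))  # sanity check on the toy data
```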
Analyzing and testing viewability methods in an advertising network
Many current online businesses base their revenue models entirely on earnings from online advertising. A problematic fact is that, according to recent studies, more than half of display ads are not detected as viewable. The Interactive Advertising Bureau (IAB) has defined a viewable impression as one in which at least 50% of the ad's pixels are rendered in the viewport for at least one continuous second. Although there is agreement in the industry on this definition for measuring viewable impressions, there are no systematic methodologies on how it should be implemented, nor on the trustworthiness of these methods. In fact, the Media Rating Council (MRC) announced that there are inconsistencies across multiple reports attempting to measure this metric. In order to understand the magnitude of the problem, we conduct an analysis of different methods to track viewable impressions. Then, we test a subset of geometric and strong-interaction methods on a webpage registered in the worldwide ad network ExoClick, which currently serves over 7 billion geo-targeted ads a day to a global network of 65,000 web/mobile publisher platforms. We find that the Intersection Observer API is the method that detects the most viewable impressions, given its robustness to the technological constraints faced by the rest of the available implementations. The motivation of this work is to better understand the limitations and advantages of such methods, which can have an impact at the standardisation level in the online advertising industry, as well as to provide guidelines for future research based on the lessons learned. This work was possible thanks to the support of the “Plan de Doctorados Industriales de la Secretaría de Universidades e Investigación del Departamento de Empresa y Conocimiento de la Generalitat de Catalunya” and the Spanish Ministry of Economy and Competitiveness through the Juan de la Cierva Formación program (FJCI-2017-34926). We also want to thank ExoClick for their support in conducting this research.
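As an illustration of the IAB definition quoted above (at least 50% of the ad's pixels in the viewport for at least one continuous second), the following is a minimal geometric sketch of the rule itself, not of any of the production measurement methods compared in the paper.

```python
# Sketch of the IAB viewability rule: >=50% of the ad's pixels inside the
# viewport for at least one continuous second. Rectangles are (left, top,
# right, bottom) in CSS pixels; the sample measurements are hypothetical.
def visible_fraction(ad, viewport):
    left   = max(ad[0], viewport[0])
    top    = max(ad[1], viewport[1])
    right  = min(ad[2], viewport[2])
    bottom = min(ad[3], viewport[3])
    if right <= left or bottom <= top:
        return 0.0
    ad_area = (ad[2] - ad[0]) * (ad[3] - ad[1])
    return ((right - left) * (bottom - top)) / ad_area

def is_viewable(samples, min_fraction=0.5, min_seconds=1.0):
    """samples: list of (timestamp_seconds, ad_rect, viewport_rect) sorted by time."""
    start = None
    for t, ad, viewport in samples:
        if visible_fraction(ad, viewport) >= min_fraction:
            start = t if start is None else start
            if t - start >= min_seconds:
                return True
        else:
            start = None  # visibility interrupted, restart the one-second timer
    return False

viewport = (0, 0, 1280, 800)
ad_rects = [(0, 700 - 50 * i, 300, 950 - 50 * i) for i in range(6)]  # ad scrolls into view
print(is_viewable([(0.25 * i, rect, viewport) for i, rect in enumerate(ad_rects)]))
```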
Technology, computation and artificial intelligence to improve the web ecosystem
A Systematic Literature Review of Digital Game-based Assessment Empirical Studies: Current Trends and Open Challenges
Technology has become an essential part of our everyday life, and its use in
educational environments keeps growing. In addition, games are one of the most
popular activities across cultures and ages, and there is ample evidence that
supports the benefits of using games for assessment. This field is commonly
known as game-based assessment (GBA), which refers to the use of games to
assess learners' competencies, skills, or knowledge. This paper analyzes the
current status of the GBA field by performing the first systematic literature
review on empirical GBA studies, based on 66 research papers that used digital
GBAs to determine: (1) the context where the study has been applied, (2) the
primary purpose, (3) the knowledge domain of the game used, (4) game/tool
availability, (5) the size of the data sample, (6) the data science techniques
and algorithms applied, (7) the targeted stakeholders of the study, and (8)
what limitations and challenges are reported by authors. Based on the
categories established and our analysis, the findings suggest that GBAs are
mainly used in formal education and for assessment purposes, and most GBAs
focus on assessing STEM content and cognitive skills. Furthermore, the current
limitations indicate that future GBA research would benefit from the use of
bigger data samples and more specialized algorithms. Based on our results, we
discuss the status of the field with the current trends and the open challenges
(including replication and validation problems) providing recommendations for
the future research agenda of the GBA field.Comment: 23 pages, 12 figures, 1 tabl
Identifying Professional Photographers Through Image Quality and Aesthetics in Flickr
Our generation has seen an undoubted rise in the use of social media and, specifically, of photo and video sharing platforms. These sites have proved their ability to yield rich data sets through user interaction, data that can be used to perform a data-driven evaluation of capabilities. Nevertheless, this study reveals the lack of suitable data sets in photo and video sharing platforms and of evaluation processes across them. Accordingly, our first contribution is the creation of one of the largest labelled data sets on Flickr, with multimodal data that has been open-sourced as part of this contribution. Based on these data, we explored machine learning models and concluded that it is feasible to predict whether or not a user is a professional photographer, based on self-reported occupation labels and several feature representations drawn from the user, photo and crowdsourced sets. We also examined the relationship between the aesthetics and technical quality of a picture and the social activity around that picture. Finally, we describe which characteristics differentiate professional photographers from non-professionals. As far as we know, the results presented in this work represent an important novelty in user expertise identification, which researchers from various domains can use for different applications.
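As a hedged sketch of how one might rank which characteristics separate professionals from non-professionals, the snippet below uses hypothetical features and values rather than the released Flickr data set or the paper's actual models.

```python
# Sketch of ranking which user/photo features separate professional from
# non-professional photographers (hypothetical features and values, not the
# released Flickr data set).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

users = pd.DataFrame({
    "aesthetic_score":   [0.81, 0.35, 0.90, 0.42, 0.77, 0.30],
    "technical_quality": [0.90, 0.40, 0.85, 0.50, 0.80, 0.45],
    "n_followers":       [5000, 30, 12000, 80, 2500, 10],
    "is_professional":   [1, 0, 1, 0, 1, 0],   # self-reported occupation label
})

X = users.drop(columns="is_professional")
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X, users["is_professional"])

# Which characteristics differentiate professionals from non-professionals?
print(pd.Series(clf.feature_importances_, index=X.columns).sort_values(ascending=False))
```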
Design, Implementation and Evaluation of SPOCs at the Universidad Carlos III de Madrid
The Universidad Carlos III de Madrid has been offering several face-to-face remedial courses for new students to review or learn concepts and practical skills that they should know before starting their degree program. During 2012 and 2013, our University adopted MOOC-like technologies to support some of these courses, so that a blended learning methodology could be applied in a particular educational context, i.e. by using SPOCs (Small Private Online Courses). This paper gathers a list of issues, challenges and solutions encountered when implementing these SPOCs. Based on these challenges and issues, a design process is proposed for the implementation of SPOCs. In addition, an evaluation is presented of the different uses of the offered courses based on indicators such as the number of videos accessed, the number of exercises accessed, the number of videos completed, the number of exercises correctly solved and the time spent on the platform. Work partially funded by the RESET project under grant no. TIN2014-53199-C3-1-R (funded by the Spanish Ministry of Economy and Competitiveness), the REMEDISS project under grant no. IPT-2012-0882-430000 (funded by the Spanish Ministry of Economy and Competitiveness) and the “eMadrid” project (funded by the Regional Government of Madrid) under grant no. S2013/ICE-2715. Carlos Delgado Kloos wishes to acknowledge support from Fundación CajaMadrid to visit Harvard University and MIT in the academic year 2012-13.
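As a small illustration of how such indicators could be derived from raw platform event logs, the sketch below uses a hypothetical event table and column names; the actual Khan Academy data model differs.

```python
# Sketch of computing per-student course-use indicators like those listed
# above from a hypothetical event log (event names and columns are illustrative).
import pandas as pd

events = pd.DataFrame({
    "user_id":    [1, 1, 1, 2, 2, 3],
    "event_type": ["video_play", "video_complete", "exercise_correct",
                   "video_play", "exercise_attempt", "video_play"],
    "seconds_on_platform": [300, 0, 120, 60, 90, 30],
})

indicators = events.groupby("user_id").agg(
    videos_accessed=("event_type", lambda s: (s == "video_play").sum()),
    videos_completed=("event_type", lambda s: (s == "video_complete").sum()),
    exercises_correct=("event_type", lambda s: (s == "exercise_correct").sum()),
    time_on_platform=("seconds_on_platform", "sum"),
)
print(indicators)
```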