Predicting Engagement in Video Lectures
The explosion of Open Educational Resources (OERs) in recent years
creates the demand for scalable, automatic approaches to process and evaluate
OERs, with the end goal of identifying and recommending the most suitable
educational materials for learners. We focus on building models to find the
characteristics and features involved in context-agnostic engagement (i.e.
population-based), a seldom-researched topic compared to other contextualised
and personalised approaches that focus more on individual learner engagement.
Learner engagement is arguably a more reliable measure than popularity/number
of views, is more abundant than user ratings, and has been shown to be a
crucial component in achieving learning outcomes. In this work, we explore the
idea of building a predictive model for population-based engagement in
education. We introduce a novel, large dataset of video lectures for predicting
context-agnostic engagement and propose both cross-modal and modality-specific
feature sets to achieve this task. We further test different strategies for
quantifying learner engagement signals. We demonstrate the use of our approach
in the case of data scarcity. Additionally, we perform a sensitivity analysis
of the best performing model, which shows promising performance and can be
easily integrated into an educational recommender system for OERs.
Comment: In Proceedings of the International Conference on Educational Data Mining
202
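As a hedged illustration of the kind of model this abstract describes, the sketch below frames context-agnostic engagement prediction as a regression task over video-level features. The feature names, the synthetic data, and the engagement target (median fraction of the video watched) are illustrative assumptions, not the paper's actual feature set or labels.

```python
# Illustrative sketch only: population-based engagement prediction as
# regression over context-agnostic video features. All features and
# targets below are synthetic placeholders, not the paper's data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Hypothetical cross-modal features: duration (minutes), speech rate
# (words per minute), title word count, fraction of slides with text.
X = np.column_stack([
    rng.uniform(5, 90, n),
    rng.uniform(90, 200, n),
    rng.integers(3, 15, n),
    rng.uniform(0, 1, n),
])
# Synthetic target: median watched fraction in [0, 1]; shorter lectures
# tend to be watched more fully in this toy setup.
y = np.clip(0.9 - 0.006 * X[:, 0] + rng.normal(0, 0.05, n), 0, 1)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
score = model.score(X_te, y_te)  # R^2 on held-out lectures
```

A model of this shape is what "easily integrated into an educational recommender system" suggests: the trained regressor scores any new lecture from its content features alone, without per-learner data.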
X5Learn: A Personalised Learning Companion at the Intersection of AI and HCI
X5Learn (available at https://x5learn.org ) is a human-centered, AI-powered platform for supporting access to free online educational resources. X5Learn provides users with a number of educational tools for interacting with open educational videos, and a set of tools adapted to suit the pedagogical preferences of users. It is intended to support both teachers and students alike. For teachers, it provides a powerful platform to reuse, revise, remix, and redistribute open courseware produced by others, including videos, PDFs, exercises, and other online material. For students, it provides a scaffolded and informative interface to select content to watch, read, make notes on, and review, as well as a powerful personalised recommendation system that can optimise learning paths and adjust to the user's learning preferences. What makes X5Learn stand out from other educational platforms is how it combines human-centered design with AI algorithms and software tools, with the goal of making it intuitive and easy to use while keeping the AI transparent to the user. We present the core search tool of X5Learn, intended to support exploring open educational materials.
PEEK: A Large Dataset of Learner Engagement with Educational Videos
Educational recommenders have received much less attention in comparison to
e-commerce and entertainment-related recommenders, even though efficient
intelligent tutors have great potential to improve learning gains. One of the
main challenges in advancing this research direction is the scarcity of large,
publicly available datasets. In this work, we release a large, novel dataset of
learners engaging with educational videos in the wild. The dataset, named
Personalised Educational Engagement with Knowledge Topics (PEEK), is the first
publicly available dataset of this nature. The video lectures have been
associated with Wikipedia concepts related to the material of the lecture, thus
providing a human-intuitive taxonomy. We believe that granular learner
engagement signals in unison with rich content representations will pave the
way to building powerful personalization algorithms that will revolutionise
educational and informational recommendation systems. Towards this goal, we 1)
construct a novel dataset from a popular video lecture repository, 2) identify
a set of benchmark algorithms to model engagement, and 3) run extensive
experimentation on the PEEK dataset to demonstrate its value. Our experiments
with the dataset show promise in building powerful informational recommender
systems. The dataset and the supporting code are publicly available.
VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement
With the emergence of e-learning and personalised education, the production
and distribution of digital educational resources have boomed. Video lectures
have now become one of the primary modalities for imparting knowledge to the masses in
the current digital age. The rapid creation of video lecture content challenges
the currently established human-centred moderation and quality assurance
pipeline, demanding more efficient, scalable and automatic solutions for
managing learning resources. Although a few datasets related to engagement with
educational videos exist, there is still an important need for data and
research aimed at understanding learner engagement with scientific video
lectures. This paper introduces VLEngagement, a novel dataset that consists of
content-based and video-specific features extracted from publicly available
scientific video lectures and several metrics related to user engagement. We
introduce several novel tasks related to predicting and understanding
context-agnostic engagement in video lectures, providing preliminary baselines.
To our knowledge, this is the largest and most diverse publicly available
dataset that deals with such tasks. The extraction of Wikipedia topic-based
features also allows associating more sophisticated Wikipedia-based features with
the dataset to improve the performance in these tasks. The dataset, helper
tools and example code snippets are available publicly at
https://github.com/sahanbull/context-agnostic-engagemen
Watch Less and Uncover More: Could Navigation Tools Help Users Search and Explore Videos?
Prior research has shown how "content preview tools" improve the
speed and accuracy of user relevance judgements across different information retrieval tasks. This paper describes a novel user interface tool, the Content Flow Bar, designed to allow users to quickly identify relevant fragments within informational videos to facilitate browsing, through a cognitively augmented form of navigation. It achieves this by providing semantic "snippets" that enable the user to rapidly scan through video content. The tool provides visually appealing pop-ups that appear in a time-series bar at the bottom of each video, allowing users to see in advance and at a glance how topics evolve in the content. We conducted a user study to evaluate how the tool changes the user's search experience in video retrieval, as well as how it supports exploration and information seeking. The user questionnaire revealed that participants found the Content Flow Bar helpful and enjoyable for finding relevant information in videos. The interaction logs of the user study, where participants interacted with the tool to complete two informational tasks, showed that it holds promise for enhancing the discoverability of content both across and within videos. This potential could drive a new generation of navigation tools in search and information retrieval.
Power to the Learner: Towards Human-Intuitive and Integrative Recommendations with Open Educational Resources
Educational recommenders have received much less attention in comparison with e-commerce- and entertainment-related recommenders, even though efficient intelligent tutors have the potential to improve learning gains and enable advances in education that are essential to achieving the world's sustainability agenda. Through this work, we make foundational advances towards building a state-aware, integrative educational recommender. The proposed recommender accounts for the learners' interests and knowledge as well as content novelty and popularity, with the end goal of improving predictions of learner engagement in a lifelong-learning educational video platform. Towards achieving this goal, we (i) formulate and evaluate multiple probabilistic graphical models to capture learner interest; (ii) identify and experiment with multiple probabilistic and ensemble approaches to combine interest, novelty, and knowledge representations; and (iii) identify and experiment with different hybrid recommender approaches that fuse in population-based engagement prediction to address the cold-start problem, i.e., the scarcity of data in the early stages of a user session, a common challenge in recommendation systems. Our experiments with an in-the-wild interaction dataset of more than 20,000 learners show clear performance advantages from integrating content popularity, learner interest, novelty, and knowledge aspects in an informational recommender system, while preserving scalability. Our recommendation system integrates a human-intuitive representation at its core, and we argue that this transparency will prove important in efforts to give agency to learners in interacting with, collaborating on, and governing their own educational algorithms.
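The cold-start fusion idea described above can be sketched, under loud assumptions, as a convex combination that leans on population-based engagement early in a session and shifts toward personalised signals as interaction data accumulates. This is not the paper's actual model; the `Candidate` fields, the equal weighting of personal signals, and the `warmup` schedule are all hypothetical.

```python
# Hypothetical sketch of hybrid score fusion for cold-start handling.
# Not the paper's model: fields, weights, and schedule are illustrative.
from dataclasses import dataclass

@dataclass
class Candidate:
    video_id: str
    popularity: float  # population-based engagement prediction, in [0, 1]
    interest: float    # match with the learner's inferred interests, [0, 1]
    novelty: float     # dissimilarity to already-watched content, [0, 1]
    knowledge: float   # fit with the learner's estimated knowledge, [0, 1]

def hybrid_score(c: Candidate, n_events: int, warmup: int = 20) -> float:
    """Weight popularity heavily while the session is cold (few events),
    shifting linearly toward personalised signals as data accumulates."""
    w_pop = max(0.0, 1.0 - n_events / warmup)
    personal = (c.interest + c.novelty + c.knowledge) / 3
    return w_pop * c.popularity + (1 - w_pop) * personal

c = Candidate("lec42", popularity=0.8, interest=0.2, novelty=0.5, knowledge=0.4)
cold = hybrid_score(c, n_events=0)   # dominated by popularity
warm = hybrid_score(c, n_events=40)  # dominated by personal signals
```

The design choice mirrors the abstract's motivation: population-based engagement is the only signal available before a learner has interacted, so any fused score must degrade gracefully to it at `n_events = 0`.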
A Closer Look into Recent Video-based Learning Research: A Comprehensive Review of Video Characteristics, Tools, Technologies, and Learning Effectiveness
People increasingly use videos on the Web as a source for learning. To
support this way of learning, researchers and developers are continuously
developing tools, proposing guidelines, analyzing data, and conducting
experiments. However, it is still not clear what characteristics a video should
have to be an effective learning medium. In this paper, we present a
comprehensive review of 257 articles on video-based learning for the period
from 2016 to 2021. One of the aims of the review is to identify the video
characteristics that have been explored by previous work. Based on our
analysis, we suggest a taxonomy which organizes the video characteristics and
contextual aspects into eight categories: (1) audio features, (2) visual
features, (3) textual features, (4) instructor behavior, (5) learner
activities, (6) interactive features (quizzes, etc.), (7) production style, and
(8) instructional design. Also, we identify four representative research
directions: (1) proposals of tools to support video-based learning, (2) studies
with controlled experiments, (3) data analysis studies, and (4) proposals of
design guidelines for learning videos. We find that the most explored
characteristics are textual features followed by visual features, learner
activities, and interactive features. Text of transcripts, video frames, and
images (figures and illustrations) are most frequently used by tools that
support learning through videos. Learner activity is heavily explored
through log files in data analysis studies, and interactive features have been
frequently scrutinized in controlled experiments. We complement our review by
contrasting research findings that investigate the impact of video
characteristics on the learning effectiveness, report on tasks and technologies
used to develop tools that support learning, and summarize trends in design
guidelines for producing learning videos.
Prince of Songkla University Students' Dropout Prediction Using Machine Learning
Student retention rate plays a critical role and serves as an essential indicator of a tertiary institution's success. However, not all first-time students complete their program at the same institution within a specified period of time: some students drop out of the program. Prince of Songkla University Hatyai Campus is no exception. From Academic Years 2013-2017, the student dropout rate rose by 19.18%. This research study adopted data mining and machine learning techniques to explore factors that predict the likelihood of a student dropping out, and to create a learning model of five decision trees to be used for predicting student dropouts. Data were collected from 33,930 students of Prince of Songkla University Hatyai Campus, from 6 intakes spanning Academic Years 2015-2020, with 39 variables. The collected data cover students' learning achievements, students' basic information, and students' family background. Data were classified into two categories: undergraduate and postgraduate. For the undergraduate category, the study found that the Light Gradient Boosting Machine is the most appropriate method, as it yielded the highest area under the curve, 93.03%, and an accuracy of 89.99%. The top factors that predict the likelihood of student dropout include the accumulated (overall) grade point average (GPAX), academic year, Grade Point Average (GPA), semester, pre-university GPAX, and pre-university English scores, respectively. For the postgraduate category, the study found that Random Forest is the most appropriate method, as it yielded the highest area under the curve, 78.86%, and an accuracy of 85.28%.
The top factors that predict the likelihood of student dropout include GPAX, academic year, semester, social and humanity sciences, GPA, supplementary class, and Plan A, A2-Type, respectively. In the final procedure, the researcher applied the obtained models to make predictions on actual data, and visually presented the results of the analysis in a dashboard report, which can be used for monitoring possible risks. This will enable the respective staff to give immediate assistance to students who are in need or show a likelihood of dropping out, and help the management board in making decisions and devising management plans to minimize the dropout rate in their institution.
Automatic understanding of multimodal content for Web-based learning
Web-based learning has become an integral part of everyday life for people of all ages and backgrounds. On the one hand, the advantages of this type of learning, such as availability, accessibility, flexibility, and cost, are apparent. On the other hand, the oversupply of content can lead to learners struggling to find optimal resources efficiently. The interdisciplinary research field Search as Learning is concerned with the analysis and improvement of Web-based learning processes, on both the learner side and the computer science side.
So far, automatic approaches that assess and recommend learning resources in Search as Learning (SAL) focus on textual, resource, and behavioral features. However, these approaches commonly ignore multimodal aspects. This work addresses this research gap by proposing several approaches that address the question of how multimodal retrieval methods can help support learning on the Web. First, we evaluate whether textual metadata of the TIB AV-Portal can be exploited and enriched by semantic word embeddings to generate video recommendations and, in addition, a video summarization technique to improve exploratory search. Then we turn to the challenging task of knowledge gain prediction that estimates the potential learning success given a specific learning resource. We used data from two user studies for our approaches. The first one observes the knowledge gain when learning with videos in a Massive Open Online Course (MOOC) setting, while the second one provides an informal Web-based learning setting where the subjects have unrestricted access to the Internet. We then extend the purely textual features to include visual, audio, and cross-modal features for a holistic representation of learning resources. By correlating these features with the achieved knowledge gain, we can estimate the impact of a particular learning resource on learning success.
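The correlation analysis mentioned at the end of the paragraph above can be sketched in miniature: relate one resource feature to measured knowledge gain across learning sessions. The feature name and the numbers below are synthetic assumptions, not the thesis's data.

```python
# Minimal sketch (synthetic data) of correlating a multimodal resource
# feature with achieved knowledge gain, as described above.
import numpy as np

rng = np.random.default_rng(7)
n = 100
# Hypothetical visual feature: fraction of video frames showing text slides.
text_slide_fraction = rng.uniform(0, 1, n)
# Synthetic knowledge gain (post-test minus pre-test score), loosely
# tied to the feature plus noise.
knowledge_gain = 0.3 * text_slide_fraction + rng.normal(0, 0.1, n)

# Pearson correlation between the feature and the knowledge gain.
r = np.corrcoef(text_slide_fraction, knowledge_gain)[0, 1]
```

A positive, sizeable `r` for a feature would suggest, as the thesis argues, that the feature carries signal about a resource's impact on learning success; cross-modal features can be screened the same way.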
We further investigate the influence of multimodal data on the learning process by examining how the combination of visual and textual content generally conveys information. For this purpose, we draw on work from linguistics and visual communications, which investigated the relationship between image and text by means of different metrics and categorizations for several decades. We concretize these metrics to enable their compatibility for machine learning purposes. This process includes the derivation of semantic image-text classes from these metrics. We evaluate all proposals with comprehensive experiments and discuss their impacts and limitations at the end of the thesis.