Predicting Engagement in Video Lectures
The explosion of Open Educational Resources (OERs) in recent years
creates the demand for scalable, automatic approaches to process and evaluate
OERs, with the end goal of identifying and recommending the most suitable
educational materials for learners. We focus on building models to find the
characteristics and features involved in context-agnostic engagement (i.e.
population-based), a seldom-researched topic compared to other contextualised
and personalised approaches that focus more on individual learner engagement.
Learner engagement is arguably a more reliable measure than popularity/number
of views, is more abundant than user ratings, and has been shown to be a
crucial component in achieving learning outcomes. In this work, we explore the
idea of building a predictive model for population-based engagement in
education. We introduce a novel, large dataset of video lectures for predicting
context-agnostic engagement and propose both cross-modal and modality-specific
feature sets to achieve this task. We further test different strategies for
quantifying learner engagement signals. We demonstrate the use of our approach
in the case of data scarcity. Additionally, we perform a sensitivity analysis
of the best performing model, which shows promising performance and can be
easily integrated into an educational recommender system for OERs.
Comment: In Proceedings of the International Conference on Educational Data Mining
202
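As a hedged illustration of the kind of model this abstract describes, the sketch below frames context-agnostic engagement prediction as a regression task over video-level features. The feature names, the synthetic data, and the engagement target (median fraction of the video watched) are illustrative assumptions, not the paper's actual feature set or labels.

```python
# Illustrative sketch only: population-based engagement prediction as
# regression over context-agnostic video features. All features and
# targets below are synthetic placeholders, not the paper's data.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Hypothetical cross-modal features: duration (minutes), speech rate
# (words per minute), title word count, fraction of slides with text.
X = np.column_stack([
    rng.uniform(5, 90, n),
    rng.uniform(90, 200, n),
    rng.integers(3, 15, n),
    rng.uniform(0, 1, n),
])
# Synthetic target: median watched fraction in [0, 1]; shorter lectures
# tend to be watched more fully in this toy setup.
y = np.clip(0.9 - 0.006 * X[:, 0] + rng.normal(0, 0.05, n), 0, 1)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
score = model.score(X_te, y_te)  # R^2 on held-out lectures
```

A model of this shape is what "easily integrated into an educational recommender system" suggests: the trained regressor scores any new lecture from its content features alone, without per-learner data.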
X5Learn: A Personalised Learning Companion at the Intersection of AI and HCI
X5Learn (available at https://x5learn.org ) is a human-centered, AI-powered platform for supporting access to free online educational resources. X5Learn provides users with a number of educational tools for interacting with open educational videos, and a set of tools adapted to suit the pedagogical preferences of users. It is intended to support both teachers and students alike. For teachers, it provides a powerful platform to reuse, revise, remix, and redistribute open courseware produced by others, including videos, PDFs, exercises, and other online material. For students, it provides a scaffolded and informative interface to select content to watch, read, make notes on, and review, as well as a powerful personalised recommendation system that can optimise learning paths and adjust to the user's learning preferences. What makes X5Learn stand out from other educational platforms is how it combines human-centered design with AI algorithms and software tools, with the goal of making it intuitive and easy to use while keeping the AI transparent to the user. We present the core search tool of X5Learn, intended to support exploring open educational materials.
PEEK: A Large Dataset of Learner Engagement with Educational Videos
Educational recommenders have received much less attention in comparison to
e-commerce and entertainment-related recommenders, even though efficient
intelligent tutors have great potential to improve learning gains. One of the
main challenges in advancing this research direction is the scarcity of large,
publicly available datasets. In this work, we release a large, novel dataset of
learners engaging with educational videos in the wild. The dataset, named
Personalised Educational Engagement with Knowledge Topics (PEEK), is the first
publicly available dataset of this nature. The video lectures have been
associated with Wikipedia concepts related to the material of the lecture, thus
providing a human-intuitive taxonomy. We believe that granular learner
engagement signals in unison with rich content representations will pave the
way to building powerful personalization algorithms that will revolutionise
educational and informational recommendation systems. Towards this goal, we 1)
construct a novel dataset from a popular video lecture repository, 2) identify
a set of benchmark algorithms to model engagement, and 3) run extensive
experimentation on the PEEK dataset to demonstrate its value. Our experiments
with the dataset show promise in building powerful informational recommender
systems. The dataset and the supporting code are publicly available.
VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement
With the emergence of e-learning and personalised education, the production
and distribution of digital educational resources have boomed. Video lectures
have now become one of the primary modalities for imparting knowledge to the masses in
the current digital age. The rapid creation of video lecture content challenges
the currently established human-centred moderation and quality assurance
pipeline, demanding more efficient, scalable and automatic solutions for
managing learning resources. Although a few datasets related to engagement with
educational videos exist, there is still an important need for data and
research aimed at understanding learner engagement with scientific video
lectures. This paper introduces VLEngagement, a novel dataset that consists of
content-based and video-specific features extracted from publicly available
scientific video lectures and several metrics related to user engagement. We
introduce several novel tasks related to predicting and understanding
context-agnostic engagement in video lectures, providing preliminary baselines.
To our knowledge, this is the largest and most diverse publicly available
dataset that deals with such tasks. The extraction of Wikipedia topic-based
features also allows associating more sophisticated Wikipedia-based features with
the dataset to improve the performance in these tasks. The dataset, helper
tools and example code snippets are available publicly at
https://github.com/sahanbull/context-agnostic-engagemen
Watch Less and Uncover More: Could Navigation Tools Help Users Search and Explore Videos?
Prior research has shown how "content preview tools" improve the
speed and accuracy of user relevance judgements across different information retrieval tasks. This paper describes a novel user interface tool, the Content Flow Bar, designed to allow users to quickly identify relevant fragments within informational videos to facilitate browsing, through a cognitively augmented form of navigation. It achieves this by providing semantic "snippets" that enable the user to rapidly scan through video content. The tool provides visually appealing pop-ups that appear in a time-series bar at the bottom of each video, allowing users to see in advance and at a glance how topics evolve in the content. We conducted a user study to evaluate how the tool changes the user's search experience in video retrieval, as well as how it supports exploration and information seeking. The user questionnaire revealed that participants found the Content Flow Bar helpful and enjoyable for finding relevant information in videos. The interaction logs of the user study, where participants interacted with the tool to complete two informational tasks, showed that it holds promise for enhancing the discoverability of content both across and within videos. This potential could drive a new generation of navigation tools in search and information retrieval.
Power to the Learner: Towards Human-Intuitive and Integrative Recommendations with Open Educational Resources
Educational recommenders have received much less attention in comparison with e-commerce- and entertainment-related recommenders, even though efficient intelligent tutors have the potential to improve learning gains and enable advances in education that are essential to achieving the world's sustainability agenda. Through this work, we make foundational advances towards building a state-aware, integrative educational recommender. The proposed recommender accounts for the learners' interests and knowledge as well as content novelty and popularity, with the end goal of improving predictions of learner engagement in a lifelong-learning educational video platform. Towards achieving this goal, we (i) formulate and evaluate multiple probabilistic graphical models to capture learner interest; (ii) identify and experiment with multiple probabilistic and ensemble approaches to combine interest, novelty, and knowledge representations; and (iii) identify and experiment with different hybrid recommender approaches that fuse in population-based engagement prediction to address the cold-start problem, i.e., the scarcity of data in the early stages of a user session, a common challenge in recommendation systems. Our experiments with an in-the-wild interaction dataset of more than 20,000 learners show clear performance advantages from integrating content popularity, learner interest, novelty, and knowledge aspects in an informational recommender system, while preserving scalability. Our recommendation system integrates a human-intuitive representation at its core, and we argue that this transparency will prove important in efforts to give agency to learners in interacting with, collaborating on, and governing their own educational algorithms.
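The cold-start fusion idea described above can be sketched, under loud assumptions, as a convex combination that leans on population-based engagement early in a session and shifts toward personalised signals as interaction data accumulates. This is not the paper's actual model; the `Candidate` fields, the equal weighting of personal signals, and the `warmup` schedule are all hypothetical.

```python
# Hypothetical sketch of hybrid score fusion for cold-start handling.
# Not the paper's model: fields, weights, and schedule are illustrative.
from dataclasses import dataclass

@dataclass
class Candidate:
    video_id: str
    popularity: float  # population-based engagement prediction, in [0, 1]
    interest: float    # match with the learner's inferred interests, [0, 1]
    novelty: float     # dissimilarity to already-watched content, [0, 1]
    knowledge: float   # fit with the learner's estimated knowledge, [0, 1]

def hybrid_score(c: Candidate, n_events: int, warmup: int = 20) -> float:
    """Weight popularity heavily while the session is cold (few events),
    shifting linearly toward personalised signals as data accumulates."""
    w_pop = max(0.0, 1.0 - n_events / warmup)
    personal = (c.interest + c.novelty + c.knowledge) / 3
    return w_pop * c.popularity + (1 - w_pop) * personal

c = Candidate("lec42", popularity=0.8, interest=0.2, novelty=0.5, knowledge=0.4)
cold = hybrid_score(c, n_events=0)   # dominated by popularity
warm = hybrid_score(c, n_events=40)  # dominated by personal signals
```

The design choice mirrors the abstract's motivation: population-based engagement is the only signal available before a learner has interacted, so any fused score must degrade gracefully to it at `n_events = 0`.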
A Closer Look into Recent Video-based Learning Research: A Comprehensive Review of Video Characteristics, Tools, Technologies, and Learning Effectiveness
People increasingly use videos on the Web as a source for learning. To
support this way of learning, researchers and developers are continuously
developing tools, proposing guidelines, analyzing data, and conducting
experiments. However, it is still not clear what characteristics a video should
have to be an effective learning medium. In this paper, we present a
comprehensive review of 257 articles on video-based learning for the period
from 2016 to 2021. One of the aims of the review is to identify the video
characteristics that have been explored by previous work. Based on our
analysis, we suggest a taxonomy which organizes the video characteristics and
contextual aspects into eight categories: (1) audio features, (2) visual
features, (3) textual features, (4) instructor behavior, (5) learner
activities, (6) interactive features (quizzes, etc.), (7) production style, and
(8) instructional design. Also, we identify four representative research
directions: (1) proposals of tools to support video-based learning, (2) studies
with controlled experiments, (3) data analysis studies, and (4) proposals of
design guidelines for learning videos. We find that the most explored
characteristics are textual features followed by visual features, learner
activities, and interactive features. Text of transcripts, video frames, and
images (figures and illustrations) are most frequently used by tools that
support learning through videos. Learner activity is heavily explored
through log files in data analysis studies, and interactive features have been
frequently scrutinized in controlled experiments. We complement our review by
contrasting research findings that investigate the impact of video
characteristics on the learning effectiveness, report on tasks and technologies
used to develop tools that support learning, and summarize trends in design
guidelines for producing learning videos.
Prince of Songkla University Students' Dropout Prediction Using Machine Learning
Student retention rate plays a critical role and serves as an essential indicator of a tertiary institution's success. However, not all first-time students complete their program at the same institution within a specified period of time: some students drop out of the program. Prince of Songkla University Hatyai Campus is no exception. From Academic Years 2013-2017, the student dropout rate rose by 19.18%. This research study adopted data mining and machine learning techniques to explore factors that predict the likelihood of a student dropping out, and to create a learning model of five decision trees to be used for predicting student dropouts. Data were collected from 33,930 students of Prince of Songkla University Hatyai Campus, from 6 intakes spanning Academic Years 2015-2020, with 39 variables. The collected data cover students' learning achievements, students' basic information, and students' family background. Data were classified into two categories: undergraduate and postgraduate. For the undergraduate category, the study found that the Light Gradient Boosting Machine is the most appropriate method, as it yielded the highest area under the curve, 93.03%, and an accuracy of 89.99%. The top factors that predict the likelihood of student dropout include the accumulated (overall) grade point average (GPAX), academic year, Grade Point Average (GPA), semester, pre-university GPAX, and pre-university English scores, respectively. For the postgraduate category, the study found that Random Forest is the most appropriate method, as it yielded the highest area under the curve, 78.86%, and an accuracy of 85.28%.
The top factors that predict the likelihood of student dropout include GPAX, academic year, semester, social and humanity sciences, GPA, supplementary class, and Plan A, A2-Type, respectively. In the final procedure, the researcher applied the obtained models to make predictions on actual data, and visually presented the results of the analysis in a dashboard report, which can be used for monitoring possible risks. This will enable the respective staff to give immediate assistance to students who are in need or show a likelihood of dropping out, and help the management board in making decisions and devising management plans to minimize the dropout rate in their institution.
Automatic understanding of multimodal content for Web-based learning
Web-based learning has become an integral part of everyday life for people of all ages and backgrounds. On the one hand, the advantages of this type of learning, such as availability, accessibility, flexibility, and cost, are apparent. On the other hand, the oversupply of content can lead to learners struggling to find optimal resources efficiently. The interdisciplinary research field Search as Learning is concerned with the analysis and improvement of Web-based learning processes, on both the learner side and the computer science side.
So far, automatic approaches that assess and recommend learning resources in Search as Learning (SAL) focus on textual, resource, and behavioral features. However, these approaches commonly ignore multimodal aspects. This work addresses this research gap by proposing several approaches that address the question of how multimodal retrieval methods can help support learning on the Web. First, we evaluate whether textual metadata of the TIB AV-Portal can be exploited and enriched by semantic word embeddings to generate video recommendations and, in addition, a video summarization technique to improve exploratory search. Then we turn to the challenging task of knowledge gain prediction that estimates the potential learning success given a specific learning resource. We used data from two user studies for our approaches. The first one observes the knowledge gain when learning with videos in a Massive Open Online Course (MOOC) setting, while the second one provides an informal Web-based learning setting where the subjects have unrestricted access to the Internet. We then extend the purely textual features to include visual, audio, and cross-modal features for a holistic representation of learning resources. By correlating these features with the achieved knowledge gain, we can estimate the impact of a particular learning resource on learning success.
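The correlation analysis mentioned at the end of the paragraph above can be sketched in miniature: relate one resource feature to measured knowledge gain across learning sessions. The feature name and the numbers below are synthetic assumptions, not the thesis's data.

```python
# Minimal sketch (synthetic data) of correlating a multimodal resource
# feature with achieved knowledge gain, as described above.
import numpy as np

rng = np.random.default_rng(7)
n = 100
# Hypothetical visual feature: fraction of video frames showing text slides.
text_slide_fraction = rng.uniform(0, 1, n)
# Synthetic knowledge gain (post-test minus pre-test score), loosely
# tied to the feature plus noise.
knowledge_gain = 0.3 * text_slide_fraction + rng.normal(0, 0.1, n)

# Pearson correlation between the feature and the knowledge gain.
r = np.corrcoef(text_slide_fraction, knowledge_gain)[0, 1]
```

A positive, sizeable `r` for a feature would suggest, as the thesis argues, that the feature carries signal about a resource's impact on learning success; cross-modal features can be screened the same way.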
We further investigate the influence of multimodal data on the learning process by examining how the combination of visual and textual content generally conveys information. For this purpose, we draw on work from linguistics and visual communications, which investigated the relationship between image and text by means of different metrics and categorizations for several decades. We concretize these metrics to enable their compatibility for machine learning purposes. This process includes the derivation of semantic image-text classes from these metrics. We evaluate all proposals with comprehensive experiments and discuss their impacts and limitations at the end of the thesis.