6 research outputs found

    Current and future multimodal learning analytics data challenges

    Multimodal Learning Analytics (MMLA) captures, integrates, and analyzes learning traces from different sources in order to obtain a more holistic understanding of the learning process, wherever it happens. MMLA leverages the increasingly widespread availability of diverse sensors, high-frequency data collection technologies, and sophisticated machine learning and artificial intelligence techniques. The aim of this workshop is twofold: first, to expose participants to, and develop, different multimodal datasets that reflect how MMLA can bring new insights and opportunities to investigate complex learning processes and environments; second, to collaboratively identify a set of grand challenges for further MMLA research, built upon the foundations of previous workshops on the topic.

    Investigating Correlations of Automatically Extracted Multimodal Features and Lecture Video Quality

    Ranking and recommendation of multimedia content such as videos is usually realized with respect to the relevance to a user query. However, for lecture videos and MOOCs (Massive Open Online Courses), it is not only required to retrieve relevant videos, but particularly to find lecture videos of high quality that facilitate learning, for instance, independent of the video's or speaker's popularity. Thus, metadata about a lecture video's quality are crucial features for learning contexts, e.g., lecture video recommendation in Search as Learning scenarios. In this paper, we investigate whether automatically extracted features are correlated with quality aspects of a video. A set of scholarly videos from a MOOC is analyzed regarding audio, linguistic, and visual features. Furthermore, a set of cross-modal features is proposed, derived by combining transcripts, audio, video, and slide content. A user study is conducted to investigate the correlations between the automatically collected features and human ratings of quality aspects of a lecture video. Finally, the impact of our features on the knowledge gain of the participants is discussed.
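
    A minimal sketch of the kind of correlation analysis described above, assuming a per-video table of automatically extracted features and mean human quality ratings; the file name, column names, and feature set are hypothetical illustrations, not those used in the paper.

```python
# Hypothetical sketch: correlating automatically extracted lecture-video
# features with human quality ratings (file and column names are assumed).
import pandas as pd
from scipy.stats import spearmanr

# Assumed layout: one row per video, automatic features plus a mean human rating.
df = pd.read_csv("video_features_and_ratings.csv")  # hypothetical file
features = ["speech_rate", "pitch_variation", "slide_text_density", "shot_change_rate"]

for feature in features:
    # Spearman rank correlation is robust to non-linear but monotonic relations.
    rho, p = spearmanr(df[feature], df["mean_quality_rating"])
    print(f"{feature}: rho={rho:.2f}, p={p:.3f}")
```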

    Teaching Analytics: Towards Automatic Extraction of Orchestration Graphs Using Wearable Sensors

    "Teaching analytics" is the application of learning analytics techniques to understand teaching and learning processes, and eventually enable supportive interventions. However, in the case of (often, half-improvised) teaching in face-to-face classrooms, such interventions would require first an understanding of what the teacher actually did, as the starting point for teacher reflection and inquiry. Currently, such teacher enactment characterization requires costly manual coding by researchers. This paper presents a case study exploring the potential of machine learning techniques to automatically extract teaching actions during classroom enactment, from five data sources collected using wearable sensors (eye-tracking, EEG, accelerometer, audio and video). Our results highlight the feasibility of this approach, with high levels of accuracy in determining the social plane of interaction (90%, k=0.8). The reliable detection of concrete teaching activity (e.g., explanation vs. questioning) accurately still remains challenging (67%, k=0.56), a fact that will prompt further research on multimodal features and models for teaching activity extraction, as well as the collection of a larger multimodal dataset to improve the accuracy and generalizability of these methods

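    The classification task reported above (predicting the social plane of interaction from windowed wearable-sensor features) could be sketched as follows; the data files, window layout, and choice of a random forest are illustrative assumptions rather than the study's actual pipeline.

```python
# Hypothetical sketch: classifying the social plane of interaction
# from windowed multimodal sensor features (data layout is assumed).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score

# X: one row per time window of eye-tracking/EEG/accelerometer/audio features.
# y: social plane label per window, e.g. "class", "group", "individual".
X = np.load("window_features.npy")   # hypothetical file
y = np.load("window_labels.npy")     # hypothetical file

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
clf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)

pred = clf.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred))
print("Cohen's kappa:", cohen_kappa_score(y_test, pred))
```

    Cohen's kappa is reported alongside accuracy because it corrects for agreement expected by chance, which matters when the label distribution is imbalanced.
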
    Automatic understanding of multimodal content for Web-based learning

    Web-based learning has become an integral part of everyday life for all ages and backgrounds. On the one hand, the advantages of this type of learning, such as availability, accessibility, flexibility, and cost, are apparent. On the other hand, the oversupply of content can lead to learners struggling to find optimal resources efficiently. The interdisciplinary research field Search as Learning (SAL) is concerned with the analysis and improvement of Web-based learning processes, both on the learner side and on the computer science side. So far, automatic approaches that assess and recommend learning resources in SAL focus on textual, resource, and behavioral features. However, these approaches commonly ignore multimodal aspects. This work addresses this research gap by proposing several approaches that address the question of how multimodal retrieval methods can help support learning on the Web. First, we evaluate whether textual metadata of the TIB AV-Portal can be exploited and enriched by semantic word embeddings to generate video recommendations and, in addition, a video summarization technique to improve exploratory search. Then we turn to the challenging task of knowledge gain prediction, which estimates the potential learning success given a specific learning resource. We used data from two user studies for our approaches. The first one observes the knowledge gain when learning with videos in a Massive Open Online Course (MOOC) setting, while the second one provides an informal Web-based learning setting where the subjects have unrestricted access to the Internet. We then extend the purely textual features to include visual, audio, and cross-modal features for a holistic representation of learning resources. By correlating these features with the achieved knowledge gain, we can estimate the impact of a particular learning resource on learning success. We further investigate the influence of multimodal data on the learning process by examining how the combination of visual and textual content generally conveys information. For this purpose, we draw on work from linguistics and visual communication, which has investigated the relationship between image and text by means of different metrics and categorizations for several decades. We concretize these metrics to enable their compatibility for machine learning purposes. This process includes the derivation of semantic image-text classes from these metrics. We evaluate all proposals with comprehensive experiments and discuss their impacts and limitations at the end of the thesis.
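
    One way to realize the embedding-based video recommendation step mentioned in the abstract is sketched below; the averaging of pre-trained word vectors, the catalogue structure, and the function names are assumptions for illustration, not the thesis's implementation.

```python
# Hypothetical sketch: recommending videos by semantic similarity of their
# textual metadata, using averaged pre-trained word embeddings.
import numpy as np

def embed(text, word_vectors):
    """Average the word vectors of all in-vocabulary tokens of a text."""
    vecs = [word_vectors[t] for t in text.lower().split() if t in word_vectors]
    dim = next(iter(word_vectors.values())).shape
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def recommend(query_metadata, catalogue, word_vectors, k=5):
    """Return the k catalogue titles whose metadata is most similar to the query."""
    q = embed(query_metadata, word_vectors)
    scored = []
    for title, metadata in catalogue.items():
        v = embed(metadata, word_vectors)
        denom = np.linalg.norm(q) * np.linalg.norm(v)
        scored.append((float(q @ v / denom) if denom else 0.0, title))
    return [title for _, title in sorted(scored, reverse=True)[:k]]

# word_vectors: token -> vector mapping loaded from pre-trained embeddings.
# catalogue: video title -> concatenated metadata (abstract, keywords, transcript).
```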

    The Big Five: Addressing Recurrent Multimodal Learning Data Challenges

    The analysis of multimodal data in learning is a growing field of research, which has led to the development of different analytics solutions. However, there is no standardised approach to handling multimodal data. In this paper, we outline five recurrent challenges in the analysis of multimodal data: data collection, storage, annotation, processing, and exploitation. For each of these challenges, we envision possible solutions. The prototypes for some of the proposed solutions will be discussed during the Multimodal Challenge of the fourth Learning Analytics & Knowledge Hackathon, a two-day hands-on workshop in which the authors will open up the prototypes for trials, validation, and feedback.

    Multimodal Challenge: Analytics Beyond User-computer Interaction Data

    This contribution describes one of the challenges explored in the Fourth LAK Hackathon. This challenge aims at shifting the focus away from learning situations that can be easily traced through user-computer interaction data and concentrating more on user-world interaction events, typical of co-located and practice-based learning experiences. This mission, pursued by the multimodal learning analytics (MMLA) community, seeks to bridge the gap between digital and physical learning spaces. The “multimodal” approach consists in combining learners’ motoric actions with physiological responses and data about the learning contexts. These data can be collected through multiple wearable sensors and Internet of Things (IoT) devices. This Hackathon table will confront three main challenges arising from the analysis and valorisation of multimodal datasets: 1) data collection and storing, 2) data annotation, and 3) data processing and exploitation. Some research questions that will be considered in this Hackathon challenge are the following: How can raw sensor data streams be processed to extract relevant features? Which data mining and machine learning techniques can be applied? How can two action recordings be compared? How can sensor data be combined with the Experience API (xAPI)? What are meaningful visualisations for these data?
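
    As one possible answer to the question of how two action recordings can be compared, the sketch below uses dynamic time warping on one-dimensional sensor traces; both the choice of DTW and the synthetic example data are assumptions for illustration, not a method prescribed by the challenge.

```python
# Hypothetical sketch: comparing two action recordings (e.g. accelerometer
# magnitude traces of the same gesture) with dynamic time warping.
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping between two 1-D series."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

# Two synthetic recordings of the same movement, sampled at different speeds.
recording_a = np.sin(np.linspace(0, 2 * np.pi, 100))
recording_b = np.sin(np.linspace(0, 2 * np.pi, 130))
print("DTW distance:", dtw_distance(recording_a, recording_b))
```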