Search CORE

2,988 research outputs found

Multimodal music information processing and retrieval: survey and future challenges

Author: Avanzini Federico
Ntalampiras Stavros
Simonetta Federico
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/02/2019
Field of study

Towards improving the performance in various music information processing tasks, recent studies exploit different modalities able to capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion, and gestural data, video recordings, editorial or cultural tags, lyrics and album cover arts. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the application they address. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that Music Information Retrieval and Sound and Music Computing research communities should focus in the next years

arXiv.org e-Print Archive

Crossref

Unobtrusive Assessment Of Student Engagement Levels In Online Classroom Environment Using Emotion Analysis

Author: Anbusegaran Sasirekha
Publication venue: Digital Commons@Georgia Southern
Publication date: 01/01/2021
Field of study

Measuring student engagement has emerged as a significant factor in the process of learning and a good indicator of the knowledge retention capacity of the student. As synchronous online classes have become more prevalent in recent years, gauging a student\u27s attention level is more critical in validating the progress of every student in an online classroom environment. This paper details the study on profiling the student attentiveness to different gradients of engagement level using multiple machine learning models. Results from the high accuracy model and the confidence score obtained from the cloud-based computer vision platform - Amazon Rekognition were then used to statistically validate any correlation between student attentiveness and emotions. This statistical analysis helps to identify the significant emotions that are essential in gauging various engagement levels. This study identified emotions like calm, happy, surprise, and fear are critical in gauging the student\u27s attention level. These findings help in the earlier detection of students with lower attention levels, consequently helping the instructors focus their support and guidance on the students in need, leading to a better online learning environment

Georgia Southern University: Digital Commons@Georgia Southern

Multi-Armed Bandits for Intelligent Tutoring Systems

Author: Benjamin Clement
Didier Roy
Inria Bordeaux
Manuel Lopes
Pierre-yves Oudeyer
Publication venue
Publication date: 01/01/2015
Field of study

We present an approach to Intelligent Tutoring Systems which adaptively personalizes sequences of learning activities to maximize skills acquired by students, taking into account the limited time and motivational resources. At a given point in time, the system proposes to the students the activity which makes them progress faster. We introduce two algorithms that rely on the empirical estimation of the learning progress, RiARiT that uses information about the difficulty of each exercise and ZPDES that uses much less knowledge about the problem. The system is based on the combination of three approaches. First, it leverages recent models of intrinsically motivated learning by transposing them to active teaching, relying on empirical estimation of learning progress provided by specific activities to particular students. Second, it uses state-of-the-art Multi-Arm Bandit (MAB) techniques to efficiently manage the exploration/exploitation challenge of this optimization process. Third, it leverages expert knowledge to constrain and bootstrap initial exploration of the MAB, while requiring only coarse guidance information of the expert and allowing the system to deal with didactic gaps in its knowledge. The system is evaluated in a scenario where 7-8 year old schoolchildren learn how to decompose numbers while manipulating money. Systematic experiments are presented with simulated students, followed by results of a user study across a population of 400 school children

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Data-Driven Intelligent Tutoring System for Accelerating Practical Skills Development:A Deep Learning Approach

Author: C Frasson
CW Wessner
DE Rumelhart
DK Jain
F Gutierrez
J Gutierrez
JA Kulik
LMY Chan
P Phobun
RJ Williams
T Fawcett
Publication venue: Springer
Publication date: 01/01/2021
Field of study

Crossref

University of Twente Research Information

Personalized face and gesture analysis using hierarchical neural networks

Author: Joshi Ajjen Das
Publication venue
Publication date: 05/02/2019
Field of study

The video-based computational analyses of human face and gesture signals encompass a myriad of challenging research problems involving computer vision, machine learning and human computer interaction. In this thesis, we focus on the following challenges: a) the classification of hand and body gestures along with the temporal localization of their occurrence in a continuous stream, b) the recognition of facial expressivity levels in people with Parkinson's Disease using multimodal feature representations, c) the prediction of student learning outcomes in intelligent tutoring systems using affect signals, and d) the personalization of machine learning models, which can adapt to subject and group-specific nuances in facial and gestural behavior. Specifically, we first conduct a quantitative comparison of two approaches to the problem of segmenting and classifying gestures on two benchmark gesture datasets: a method that simultaneously segments and classifies gestures versus a cascaded method that performs the tasks sequentially. Second, we introduce a framework that computationally predicts an accurate score for facial expressivity and validate it on a dataset of interview videos of people with Parkinson's disease. Third, based on a unique dataset of videos of students interacting with MathSpring, an intelligent tutoring system, collected by our collaborative research team, we build models to predict learning outcomes from their facial affect signals. Finally, we propose a novel solution to a relatively unexplored area in automatic face and gesture analysis research: personalization of models to individuals and groups. We develop hierarchical Bayesian neural networks to overcome the challenges posed by group or subject-specific variations in face and gesture signals. We successfully validate our formulation on the problems of personalized subject-specific gesture classification, context-specific facial expressivity recognition and student-specific learning outcome prediction. We demonstrate the flexibility of our hierarchical framework by validating the utility of both fully connected and recurrent neural architectures

Boston University Institutional Repository (OpenBU)

An original framework for understanding human actions and body language by using deep neural networks

Author: MASSARONI CRISTIANO
Publication venue
Publication date: 28/02/2020
Field of study

The evolution of both fields of Computer Vision (CV) and Artificial Neural Networks (ANNs) has allowed the development of efficient automatic systems for the analysis of people's behaviour. By studying hand movements it is possible to recognize gestures, often used by people to communicate information in a non-verbal way. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. While the processing of body movements play a key role in the action recognition and affective computing fields. The former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions based on their poses and movements; both are essential tasks in many computer vision applications, including event recognition, and video surveillance. In this Ph.D. thesis, an original framework for understanding Actions and body language is presented. The framework is composed of three main modules: in the first one, a Long Short Term Memory Recurrent Neural Networks (LSTM-RNNs) based method for the Recognition of Sign Language and Semaphoric Hand Gestures is proposed; the second module presents a solution based on 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; finally, in the last module, a solution for basic non-acted emotion recognition by using 3D skeleton and Deep Neural Networks (DNNs) is provided. The performances of RNN-LSTMs are explored in depth, due to their ability to model the long term contextual information of temporal sequences, making them suitable for analysing body movements. All the modules were tested by using challenging datasets, well known in the state of the art, showing remarkable results compared to the current literature methods

Archivio della ricerca- Università di Roma La Sapienza

Balanced difficulty task finder: an adaptive recommendation method for learning tasks based on the concept of state of flow

Author: Abolpour Mofrad Asieh
Arntzen Erik
Goodwin Morten
Hammer Hugo Lewi
Yazidi Anis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2020
Field of study

publishedVersio

NORA - Norwegian Open Research Archives

Agder University Research Archive

The Multimodal Tutor: Adaptive Feedback from Multimodal Experiences

Author: Di Mitri D.
Publication venue: Open Universiteit
Publication date: 04/09/2020
Field of study

This doctoral thesis describes the journey of ideation, prototyping and empirical testing of the Multimodal Tutor, a system designed for providing digital feedback that supports psychomotor skills acquisition using learning and multimodal data capturing. The feedback is given in real-time with machine-driven assessment of the learner's task execution. The predictions are tailored by supervised machine learning models trained with human annotated samples. The main contributions of this thesis are: a literature survey on multimodal data for learning, a conceptual model (the Multimodal Learning Analytics Model), a technological framework (the Multimodal Pipeline), a data annotation tool (the Visual Inspection Tool) and a case study in Cardiopulmonary Resuscitation training (CPR Tutor). The CPR Tutor generates real-time, adaptive feedback using kinematic and myographic data and neural networks

Open University of the Netherlands Research Portal