VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement

Authors: Bulathwela Sahan; Perez-Ortiz Maria; Shawe-Taylor John; Yilmaz Emine
Publication date: 02/11/2020

With the emergence of e-learning and personalised education, the production and distribution of digital educational resources have boomed. Video lectures have become one of the primary modalities for imparting knowledge to the masses in the current digital age. The rapid creation of video lecture content challenges the established human-centred moderation and quality assurance pipeline, demanding more efficient, scalable and automatic solutions for managing learning resources. Although a few datasets related to engagement with educational videos exist, there is still an important need for data and research aimed at understanding learner engagement with scientific video lectures. This paper introduces VLEngagement, a novel dataset that consists of content-based and video-specific features extracted from publicly available scientific video lectures, together with several metrics related to user engagement. We introduce several novel tasks related to predicting and understanding context-agnostic engagement in video lectures, and provide preliminary baselines. To our knowledge, this is the largest and most diverse publicly available dataset that deals with such tasks. The extraction of Wikipedia topic-based features also allows more sophisticated Wikipedia-based features to be associated with the dataset to improve performance on these tasks. The dataset, helper tools and example code snippets are available publicly at https://github.com/sahanbull/context-agnostic-engagemen

arXiv.org e-Print Archive

UCL Discovery

Social analytics for health integration, intelligence, and monitoring

Author: Ji Xiang
Publication venue: Digital Commons @ NJIT
Publication date: 31/08/2015

Patient-generated social health data are now abundant, and healthcare is shifting from an authoritative, provider-centric model to collaborative, patient-oriented care. The aim of this dissertation is to provide a Social Health Analytics framework that uses social data to address interdisciplinary research challenges in Big Data Science and Health Informatics. Specific research issues and objectives are described below.
The first objective is semantic integration of heterogeneous health data sources, which can vary from structured to unstructured and include patient-generated social data as well as authoritative data. An information seeker has to spend time selecting information from many websites and integrating it into a coherent mental model. An integrated health data model is designed to accommodate data features from different sources. The model uses semantic linked data for lightweight integration and supports a set of analytics and inferences over the data sources. A prototype analytical and reasoning tool called “Social InfoButtons”, which can be linked from existing EHR systems, is developed so that doctors can understand and take into consideration the behaviors, patterns or trends of patients’ healthcare practices during a patient’s care. The tool can also offer insights that help public health officials make better-informed policy decisions.
The second objective is near-real-time monitoring of disease outbreaks using social media. Research on epidemic detection based on search query terms entered by millions of users is limited by the fact that query terms are not easily accessible to non-affiliated researchers. Publicly available Twitter data is therefore exploited to develop the Epidemics Outbreak and Spread Detection System (EOSDS). EOSDS provides four visual analytics tools for monitoring epidemics, namely Instance Map, Distribution Map, Filter Map, and Sentiment Trend, to investigate public health threats in space and time.
The third objective is to capture, analyze and quantify public health concerns through sentiment classification of Twitter data. Traditional public health surveillance systems struggle to detect and monitor health-related concerns and changes in public attitudes to health-related issues because of their expense and significant time delays. A two-step sentiment classification model is built to measure the concern. In the first step, Personal tweets are distinguished from Non-Personal tweets. In the second step, Personal Negative tweets are further separated from Personal Non-Negative tweets. In the proposed classification, training data is labeled by an emotion-oriented, clue-based method, and three Machine Learning models are trained and tested. A Measure of Concern (MOC) is computed based on the number of Personal Negative sentiment tweets. A timeline trend of the MOC is also generated to monitor public concern levels, which is important for health emergency resource allocation and policy making.
The fourth objective is predicting medical condition incidence and progression trajectories by using patients’ self-reported data on PatientsLikeMe. Some medical conditions are correlated with each other to a measurable degree (“comorbidities”). A prediction model is provided to predict comorbidities, rank future conditions by their likelihood, and predict the possible progression trajectories given an observed medical condition. The novel models for trajectory prediction of medical conditions are validated to cover the comorbidities reported in the medical literature.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Can Population-based Engagement Improve Personalisation? A Novel Dataset and Experiments

Authors: Bulathwela Sahan; Perez-Ortiz Maria; Shawe-Taylor John; Verma Meghana; Yilmaz Emine
Publication venue: International Educational Data Mining Society
Publication date: 18/07/2022

This work explores how population-based engagement prediction can address cold-start at scale in large learning resource collections. This paper introduces i) VLE, a novel dataset that consists of content- and video-based features extracted from publicly available scientific video lectures, coupled with implicit and explicit signals related to learner engagement, ii) two standard tasks related to predicting and ranking context-agnostic engagement in video lectures, with preliminary baselines, and iii) a set of experiments that validate the usefulness of the proposed dataset. Our experimental results indicate that the newly proposed VLE dataset leads to context-agnostic engagement prediction models that are significantly more performant than ones based on previous datasets, mainly attributable to the increase in training examples. The VLE dataset’s suitability for building models for Computer Science/Artificial Intelligence education focused on e-learning/MOOC use cases is also evidenced. Further experiments combining the built model with a personalising algorithm show promising improvements in addressing the cold-start problem encountered in educational recommenders. To our knowledge, this is the largest and most diverse publicly available dataset that deals with learner engagement prediction tasks. The dataset, helper tools, descriptive statistics and example code snippets are available publicly.

UCL Discovery

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Automatic cataloguing of Web resources on a personalized taxonomy

Authors: Bidarra José; Escudeiro Paula
Publication date: 01/01/2008

Information overload is a major concern that retrieval systems face. Information is ubiquitous and available from many distinct sources, and the main issue is to get just the right piece of information that satisfies our specific needs. Many of these sources organize their informational resources according to a given ontology. However, these ontologies are static and do not allow for personalization. This degrades the value of the service if there is no easy mental mapping between the user's specific needs and the general source ontology. Organizing informational resources according to particular needs might increase users’ satisfaction and save their time. In this paper we present a methodology to filter and organize informational resources according to users’ interests, thus providing users with a personalized edition of the resource tailored to their specific needs. We believe that this methodology may be applied in educational scenarios, where a repository of educational objects is organized according to specific objectives, automatically producing specific courseware. Our experimental results confirm that it is possible to automatically personalize document resources with high precision at a reduced editor workload. This work is supported by the POSC/EIA/58367/2004/Site-o-Matic Project (Fundação Ciência e Tecnologia), FEDER, and the Programa de Financiamento Plurianual de Unidades de I & D. We would like to thank the Expresso newspaper for their support throughout this work.

Repositório Aberto da Universidade Aberta

The snowflake effect: the future of mashups and learning

Author: Hodgins Wayne
Publication venue: British Educational Communications and Technology Agency (BECTA)
Publication date: 01/01/2008

Emerging technologies for learning report - Article exploring web mashups and their potential for education.

Digital Education Resource Archive