18,313 research outputs found
VLEngagement: A Dataset of Scientific Video Lectures for Evaluating Population-based Engagement
With the emergence of e-learning and personalised education, the production
and distribution of digital educational resources have boomed. Video lectures
have now become one of the primary modalities to impart knowledge to masses in
the current digital age. The rapid creation of video lecture content challenges
the currently established human-centred moderation and quality assurance
pipeline, demanding more efficient, scalable and automatic solutions for
managing learning resources. Although a few datasets related to engagement with
educational videos exist, there is still an important need for data and
research aimed at understanding learner engagement with scientific video
lectures. This paper introduces VLEngagement, a novel dataset that consists of
content-based and video-specific features extracted from publicly available
scientific video lectures and several metrics related to user engagement. We
introduce several novel tasks related to predicting and understanding
context-agnostic engagement in video lectures, providing preliminary baselines.
This is the largest and most diverse publicly available dataset to our
knowledge that deals with such tasks. The extraction of Wikipedia topic-based
features also allows associating more sophisticated Wikipedia-based features to
the dataset to improve the performance in these tasks. The dataset, helper
tools and example code snippets are available publicly at
https://github.com/sahanbull/context-agnostic-engagemen
Social analytics for health integration, intelligence, and monitoring
Patient-generated social health data are now abundant, and healthcare is changing from an authoritative, provider-centric model to collaborative, patient-oriented care. The aim of this dissertation is to provide a Social Health Analytics framework that uses social data to address interdisciplinary research challenges in Big Data Science and Health Informatics. Specific research issues and objectives are described below.
The first objective is semantic integration of heterogeneous health data sources, which can vary from structured to unstructured and include patient-generated social data as well as authoritative data. An information seeker has to spend time selecting information from many websites and integrating it into a coherent mental model. An integrated health data model is designed to accommodate data features from different sources. The model uses semantic linked data for lightweight integration and supports a set of analytics and inferences over the data sources. A prototype analytical and reasoning tool called “Social InfoButtons”, which can be linked from existing EHR systems, is developed so that doctors can understand and take into consideration the behaviors, patterns or trends of patients’ healthcare practices during a patient’s care. The tool can also offer insights that help public health officials make better-informed policy decisions.
The second objective is near-real-time monitoring of disease outbreaks using social media. Research on epidemic detection based on search query terms entered by millions of users is limited by the fact that query terms are not easily accessible to non-affiliated researchers. Publicly available Twitter data is exploited to develop the Epidemics Outbreak and Spread Detection System (EOSDS). EOSDS provides four visual analytics tools for monitoring epidemics, i.e., Instance Map, Distribution Map, Filter Map, and Sentiment Trend, to investigate public health threats in space and time.
The third objective is to capture, analyze and quantify public health concerns through sentiment classification of Twitter data. Traditional public health surveillance systems find it hard to detect and monitor health-related concerns and changes in public attitudes to health-related issues because of their expense and significant time delays. A two-step sentiment classification model is built to measure the concern. In the first step, Personal tweets are distinguished from Non-Personal tweets. In the second step, Personal Negative tweets are further separated from Personal Non-Negative tweets. In the proposed classification, training data is labeled by an emotion-oriented, clue-based method, and three Machine Learning models are trained and tested. A Measure of Concern (MOC) is computed based on the number of Personal Negative sentiment tweets. A timeline trend of the MOC is also generated to monitor public concern levels, which is important for health emergency resource allocation and policy making.
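The two-step filtering and counting described above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact MOC formula, so the sketch simply counts Personal Negative tweets per day, and the keyword rules (`toy_is_personal`, `toy_is_negative`), the function name `measure_of_concern` and the sample tweets are hypothetical stand-ins for the trained classifiers and real data.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Tuple

Tweet = Tuple[str, str]  # (day label, tweet text)


def measure_of_concern(
    tweets: Iterable[Tweet],
    is_personal: Callable[[str], bool],
    is_negative: Callable[[str], bool],
) -> Dict[str, int]:
    """Count Personal Negative tweets per day (an assumed, simplified MOC)."""
    counts: Counter = Counter()
    for day, text in tweets:
        # Step 1: discard Non-Personal tweets.
        if not is_personal(text):
            continue
        # Step 2: among Personal tweets, keep only the Negative ones.
        if is_negative(text):
            counts[day] += 1
    return dict(counts)


# Hypothetical keyword rules standing in for the trained classifiers.
def toy_is_personal(text: str) -> bool:
    return any(word in ("i", "my", "me") for word in text.lower().split())


def toy_is_negative(text: str) -> bool:
    return any(clue in text.lower() for clue in ("scared", "worried", "afraid"))


# Made-up sample tweets, for illustration only.
sample = [
    ("day-1", "I am worried about the flu"),
    ("day-1", "Flu cases reported in the city"),
    ("day-2", "My kids are sick and I am scared"),
    ("day-2", "I got my flu shot today"),
    ("day-2", "So afraid of this outbreak, my whole office is ill"),
]

moc_by_day = measure_of_concern(sample, toy_is_personal, toy_is_negative)
print(moc_by_day)
```

Passing the two classifiers in as functions keeps the counting logic independent of whichever trained models are used for each step.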
The fourth objective is predicting medical condition incidence and progression trajectories by using patients’ self-reported data on PatientsLikeMe. Some medical conditions are correlated with each other to a measurable degree (“comorbidities”). A prediction model is provided to predict comorbidities, rank future conditions by their likelihood, and predict the possible progression trajectories given an observed medical condition. The novel models for trajectory prediction of medical conditions are validated to cover the comorbidities reported in the medical literature.
Can Population-based Engagement Improve Personalisation? A Novel Dataset and Experiments
This work explores how population-based engagement prediction can address cold-start at scale in large learning resource collections. This paper introduces i) VLE, a novel dataset that consists of content-based and video-based features extracted from publicly available scientific video lectures coupled with implicit and explicit signals related to learner engagement, ii) two standard tasks related to predicting and ranking context-agnostic engagement in video lectures with preliminary baselines and iii) a set of experiments that validate the usefulness of the proposed dataset. Our experimental results indicate that the newly proposed VLE dataset leads to context-agnostic engagement prediction models that are significantly more performant than ones based on previous datasets, mainly attributable to the increase in training examples. The VLE dataset’s suitability for building models for Computer Science/Artificial Intelligence education focused on e-learning/MOOC use-cases is also evidenced. Further experiments in combining the built model with a personalising algorithm show promising improvements in addressing the cold-start problem encountered in educational recommenders. This is the largest and most diverse publicly available dataset to our knowledge that deals with learner engagement prediction tasks. The dataset, helper tools, descriptive statistics and example code snippets are available publicly.
Automatic cataloguing of Web resources on a personalized taxonomy
Information overload is a major concern that retrieval systems face. Information is
ubiquitous, available from many distinct sources and the main issue is to get just the right piece of
information that might satisfy our specific needs. Many of these sources organize their
informational resources on a given ontology. However, these ontologies are static and do not allow
for personalization. This fact degrades the value of the service if there is no easy mental mapping
between user specific needs and the general source ontology. Organizing informational resources
according to particular needs might increase users’ satisfaction and save their time. In this paper we
present a methodology to filter and organize informational resources according to users’ interests,
thus giving users a personalized edition of the resource, especially tailored towards their
specific needs. We believe that this methodology may be applied in educational scenarios, where
we have a repository of educational objects that are organized according to specific objectives,
automatically producing specific courseware. Our experimental results confirm that it is possible to
automatically personalize document resources with high precision at a reduced editor workload. This work is supported by the POSC/EIA/58367/2004/Site-o-Matic Project (Fundação Ciência e Tecnologia),
FEDER and Programa de Financiamento Plurianual de Unidades de I & D. We would like to thank the Expresso newspaper for their support throughout this work.
The snowflake effect: the future of mashups and learning
Emerging technologies for learning report - Article exploring web mashups and their potential for education