Forecasting audience increase on YouTube
User profiles constructed on Social Web platforms are often motivated by the need to maximise user reputation within a community. Subscriber, or follower, counts are an indicator of the influence and standing that the user has, where greater values indicate a greater perception or regard for what the user has to say or share. However, at present there is little understanding of the factors that lead to an increase in such audience levels, and of how a user's behaviour can affect their reputation. In this paper we attempt to fill this gap by examining data collected from YouTube over regular time intervals. We explore the correlation between the subscriber counts and several behaviour features, extracted from both the user's profile and the content they have shared. Through the use of a Multiple Linear Regression model we are able to forecast the audience levels that users will yield based on observed behaviour. Combining such a model with an exhaustive feature selection process, we yield statistically significant performance over a baseline model containing all features.
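As a rough illustration of the idea in this abstract, the sketch below fits an ordinary least-squares model on every non-empty feature subset and keeps the subset with the lowest held-out error. The function names, the use of held-out mean squared error as the selection criterion, and the normal-equation solver are all assumptions made for illustration; the abstract does not state how the authors scored candidate subsets.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares coefficients via the normal equations; an intercept term is prepended."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    """Apply the fitted intercept and slopes to one feature row."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def pick(row, subset):
    """Keep only the columns named in subset."""
    return [row[i] for i in subset]


def best_subset(train_x, train_y, test_x, test_y):
    """Exhaustively try every non-empty feature subset (hypothetical criterion: held-out MSE)."""
    n_features = len(train_x[0])
    best = None
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            coef = fit_ols([pick(r, subset) for r in train_x], train_y)
            errors = [(predict(coef, pick(r, subset)) - t) ** 2
                      for r, t in zip(test_x, test_y)]
            mse = sum(errors) / len(errors)
            if best is None or mse < best[0]:
                best = (mse, subset)
    return best
```

Exhaustive search costs 2^k - 1 model fits for k features, so it is only practical for a small feature set such as a handful of profile and content measures.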
MOOCs Meet Measurement Theory: A Topic-Modelling Approach
This paper adapts topic models to the psychometric testing of MOOC students
based on their online forum postings. Measurement theory from education and
psychology provides statistical models for quantifying a person's attainment of
intangible attributes such as attitudes, abilities or intelligence. Such models
infer latent skill levels by relating them to individuals' observed responses
on a series of items such as quiz questions. The set of items can be used to
measure a latent skill if individuals' responses on them conform to a Guttman
scale. Such well-scaled items differentiate between individuals, and inferred
levels span the entire range from the most basic to the most advanced. In practice,
education researchers manually devise items (quiz questions) while optimising
well-scaled conformance. Due to the costly nature and expert requirements of
this process, psychometric testing has found limited use in everyday teaching.
We aim to develop usable measurement models for highly-instrumented MOOC
delivery platforms, by using participation in automatically-extracted online
forum topics as items. The challenge is to formalise the Guttman scale
educational constraint and incorporate it into topic models. To favour topics
that automatically conform to a Guttman scale, we introduce a novel
regularisation into non-negative matrix factorisation-based topic modelling. We
demonstrate the suitability of our approach with both quantitative experiments
on three Coursera MOOCs, and with a qualitative survey of topic
interpretability on two MOOCs by domain expert interviews.
Comment: 12 pages, 9 figures; accepted into AAAI'201
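To make the Guttman scale condition mentioned in this abstract concrete, here is a minimal sketch that checks whether a binary person-by-item response matrix forms a perfect Guttman scale: once items are ordered from easiest to hardest, every person who succeeds on an item also succeeds on all easier ones. The function name and the matrix layout are assumptions for illustration; the paper itself relaxes this hard condition into a regulariser for matrix factorisation, which is not reproduced here.

```python
def is_guttman_scale(responses):
    """Return True if the 0/1 matrix (rows = persons, columns = items) is a perfect Guttman scale."""
    n_items = len(responses[0])
    # Items passed by more people are treated as easier.
    totals = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -totals[j])
    for row in responses:
        ordered = [row[j] for j in order]
        # Along the easy-to-hard ordering a response pattern must never go from 0 back to 1.
        if any(a < b for a, b in zip(ordered, ordered[1:])):
            return False
    return True
```

In the forum setting described above, a row would stand for a student and a column for participation in one extracted topic.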
Lightweight Adaptation of Classifiers to Users and Contexts: Trends of the Emerging Domain
Intelligent computer applications need to adapt their behaviour to contexts and users, but conventional classifier adaptation methods require long data collection and/or training times. Therefore classifier adaptation is often performed as follows: at design time, application developers define typical usage contexts and provide reasoning models for each of these contexts, and then at runtime an appropriate model is selected from the available ones. Typically, the definition of usage contexts and reasoning models relies heavily on domain knowledge. However, in practice many applications are used in such diverse situations that no developer can predict them all and collect adequate training and test databases for each situation. Such applications have to adapt to a new user or unknown context at runtime just from interaction with the user, preferably in fairly lightweight ways, that is, requiring limited user effort to collect training data and limited time to perform the adaptation. This paper analyses adaptation trends in several emerging domains and outlines promising ideas proposed for making multimodal classifiers user-specific and context-specific without significant user effort, detailed domain knowledge, and/or complete retraining of the classifiers. Based on this analysis, this paper identifies important application characteristics and presents guidelines for considering these characteristics in adaptation design.
Dirichlet belief networks for topic structure learning
Recently, considerable research effort has been devoted to developing deep
architectures for topic models to learn topic structures. Although several deep
models have been proposed to learn better topic proportions of documents, how
to leverage the benefits of deep structures for learning word distributions of
topics has not yet been rigorously studied. Here we propose a new multi-layer
generative process on word distributions of topics, where each layer consists
of a set of topics and each topic is drawn from a mixture of the topics of the
layer above. As the topics in all layers can be directly interpreted by words,
the proposed model is able to discover interpretable topic hierarchies. As a
self-contained module, our model can be flexibly adapted to different kinds of
topic models to improve their modelling accuracy and interpretability.
Extensive experiments on text corpora demonstrate the advantages of the
proposed model.
Comment: accepted in NIPS 201
Continuous Estimation of Emotions in Speech by Dynamic Cooperative Speaker Models
Automatic emotion recognition from speech has recently focused on the prediction of time-continuous dimensions (e.g., arousal and valence) of spontaneous and realistic expressions of emotion, as found in real-life interactions. However, the automatic prediction of such emotions poses several challenges, such as the subjectivity found in the definition of a gold standard from a pool of raters and the issue of data scarcity in training models. In this work, we introduce a novel emotion recognition system based on an ensemble of single-speaker-regression-models (SSRMs). The estimation of emotion is provided by combining a subset of the initial pool of SSRMs, selecting those that are most concordant with one another. The proposed approach allows the addition or removal of speakers from the ensemble without the necessity to re-build the entire machine learning system. The simplicity of this aggregation strategy, coupled with the flexibility assured by the modular architecture and the promising results obtained on the RECOLA database, highlights the potential implications of the proposed method in a real-life scenario and in particular in WEB-based applications.
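The selection step described in this abstract can be sketched as follows, assuming each single-speaker model has already produced a prediction trace for the same utterance. The use of the concordance correlation coefficient as the agreement measure, the fixed number of models kept, and the plain averaging are assumptions for illustration; the abstract does not specify these details.

```python
from statistics import fmean


def ccc(x, y):
    """Concordance correlation coefficient between two equal-length traces."""
    mx, my = fmean(x), fmean(y)
    vx = fmean((a - mx) ** 2 for a in x)
    vy = fmean((b - my) ** 2 for b in y)
    cov = fmean((a - mx) * (b - my) for a, b in zip(x, y))
    return 2 * cov / (vx + vy + (mx - my) ** 2)


def select_and_fuse(predictions, keep):
    """Keep the `keep` models that agree most with the rest and average their traces.

    predictions maps a (hypothetical) model name to its list of per-frame outputs.
    """
    names = list(predictions)
    agreement = {
        n: fmean(ccc(predictions[n], predictions[m]) for m in names if m != n)
        for n in names
    }
    selected = sorted(names, key=lambda n: agreement[n], reverse=True)[:keep]
    fused = [fmean(frame) for frame in zip(*(predictions[n] for n in selected))]
    return sorted(selected), fused
```

Because each model is scored only against the outputs of the others, adding or removing a speaker model changes the dictionary passed in and nothing else, which mirrors the modularity the abstract emphasises.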