PREDICTING THE INDIVIDUAL MOOD LEVEL BASED ON DIARY DATA
Understanding mood changes of individuals with depressive disorders is crucial in order to guide personalized therapeutic interventions. Based on diary data, in which clients of an online depression treatment report their activities as free text, we categorize these activities and predict the mood level of clients. We apply a bag-of-words text-mining approach for activity categorization and explore recurrent neural networks to support this task. Using the identified activities, we develop partial ordered logit models with varying levels of heterogeneity among clients to predict their mood. We estimate the parameters of these models by employing Markov Chain Monte Carlo techniques and compare the models regarding their predictive performance. Thus, by combining text-mining and Bayesian estimation techniques, we apply a two-stage analysis approach in order to reveal relationships between various activity categories and the individual mood level. Our findings indicate that the mood level is influenced negatively when participants report sickness or rumination. Social activities have a positive influence on the mood. By understanding the influences of daily activities on the individual mood level, we hope to improve the efficacy of online behavior therapy, provide support in the context of clinical decision-making, and contribute to the development of personalized interventions.
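As a rough illustration of the model family mentioned in this abstract, the sketch below computes category probabilities for a plain ordered logit model. It is a simplification, not the authors' partial ordered logit specification or their MCMC estimation: the function name, the linear predictor `eta`, and the cutpoint values are all assumptions made for the example.

```python
import math


def ordered_logit_probs(eta, cutpoints):
    """Return P(y = k) for each ordered category under a plain ordered logit.

    eta       -- linear predictor (e.g. a weighted sum of activity indicators);
                 hypothetical input, not taken from the paper
    cutpoints -- increasing thresholds separating adjacent mood levels
    """
    # Cumulative probabilities P(y <= k) = sigmoid(cutpoint_k - eta).
    cdf = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cdf.append(1.0)  # the highest category closes the distribution

    # Category probabilities are differences of adjacent cumulative values.
    probs = []
    previous = 0.0
    for value in cdf:
        probs.append(value - previous)
        previous = value
    return probs


# Three mood levels (low, medium, high) with illustrative cutpoints.
baseline = ordered_logit_probs(0.0, [-1.0, 1.0])
after_social_activity = ordered_logit_probs(2.0, [-1.0, 1.0])
```

In this setup a larger `eta` shifts probability mass toward the higher mood categories, which is how a positive coefficient for an activity category would show up.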
Text-based Sentiment Analysis and Music Emotion Recognition
Nowadays, with the expansion of social media, large amounts of user-generated
texts like tweets, blog posts or product reviews are shared online. Sentiment polarity
analysis of such texts has become highly attractive and is utilized in recommender
systems, market predictions, business intelligence and more. We also witness deep
learning techniques becoming top performers on those types of tasks. There are
however several problems that need to be solved for efficient use of deep neural
networks on text mining and text polarity analysis.
First of all, deep neural networks are data hungry. They need to be fed with
datasets that are big in size, cleaned and preprocessed as well as properly labeled.
Second, the modern natural language processing concept of word embeddings as a
dense and distributed text feature representation solves sparsity and dimensionality
problems of the traditional bag-of-words model. Still, there are various uncertainties
regarding the use of word vectors: should they be generated from the same dataset
that is used to train the model, or is it better to source them from big and popular
collections that work as generic text feature representations? Third, it is not easy for
practitioners to find a simple and highly effective deep learning setup for various
document lengths and types. Recurrent neural networks are weak with longer texts
and optimal convolution-pooling combinations are not easily conceived. It is thus
convenient to have generic neural network architectures that are effective and can
adapt to various texts, encapsulating much of design complexity.
This thesis addresses the above problems to provide methodological and practical
insights for utilizing neural networks on sentiment analysis of texts and achieving
state-of-the-art results. Regarding the first problem, the effectiveness of various
crowdsourcing alternatives is explored and two medium-sized and emotion-labeled
song datasets are created utilizing social tags. One of the research interests of Telecom
Italia was the exploration of relations between music emotional stimulation and
driving style. Consequently, a context-aware music recommender system that aims
to enhance driving comfort and safety was also designed. To address the second
problem, a series of experiments with large text collections of various contents and
domains were conducted. Word embeddings with different parameters were evaluated
and results revealed that their quality is influenced (mostly but not only) by the
size of texts they were created from. When working with small text datasets, it is
thus important to source word features from popular and generic word embedding
collections. Regarding the third problem, a series of experiments involving convolutional
and max-pooling neural layers were conducted. Various patterns relating
text properties and network parameters with optimal classification accuracy were
observed. Combining convolutions of words, bigrams, and trigrams with regional
max-pooling layers in a couple of stacks produced the best results. The derived
architecture achieves competitive performance on sentiment polarity analysis of
movie, business and product reviews.
Given that labeled data are becoming the bottleneck of the current deep learning
systems, a future research direction could be the exploration of various data programming
possibilities for constructing even bigger labeled datasets. Investigation
of feature-level or decision-level ensemble techniques in the context of deep neural
networks could also be fruitful. Different feature types usually represent complementary
characteristics of data. Combining word embedding and traditional text
features or utilizing recurrent networks on document splits and then aggregating the
predictions could further increase the prediction accuracy of such models.
Effectiveness of dismantling strategies on moderated vs. unmoderated online social platforms
Online social networks are the perfect test bed to better understand
large-scale human behavior in interacting contexts. Although they are broadly
used and studied, little is known about how their terms of service and posting
rules affect the way users interact and information spreads. Acknowledging the
relation between network connectivity and functionality, we compare the
robustness of two different online social platforms, Twitter and Gab, with
respect to dismantling strategies based on the recursive censoring of users
characterized by social prominence (degree) or intensity of inflammatory
content (sentiment). We find that the moderated (Twitter) vs unmoderated (Gab)
character of the network is not a discriminating factor for intervention
effectiveness. We find, however, that more complex strategies based upon the
combination of topological and content features may be effective for network
dismantling. Our results provide useful indications to design better strategies
for countervailing the production and dissemination of anti-social content in
online social platforms.
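The degree-based strategy described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it treats the network as an undirected graph, recomputes degrees after every removal, and uses the size of the largest connected component as the robustness measure. The function names and the toy edge list are hypothetical.

```python
from collections import deque


def largest_component_size(adj):
    """Size of the largest connected component, found by breadth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def dismantle_by_degree(edges, steps):
    """Remove the highest-degree node `steps` times, recomputing degrees each time.

    Returns the largest-component size before any removal and after each one.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    sizes = [largest_component_size(adj)]
    for _ in range(steps):
        if not adj:
            break
        # Ties in degree are broken by node name to keep the run deterministic.
        target = max(adj, key=lambda n: (len(adj[n]), str(n)))
        for neighbour in adj.pop(target):
            if neighbour in adj:
                adj[neighbour].discard(target)
        sizes.append(largest_component_size(adj))
    return sizes


# Toy network: a hub "h" linked to four users, plus one extra tie between a and b.
toy_edges = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"), ("a", "b")]
trajectory = dismantle_by_degree(toy_edges, steps=2)
```

A sentiment-based variant would only change the `key` used to pick `target`, for example ranking users by a per-user score instead of degree.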
Understanding Communication Patterns in MOOCs: Combining Data Mining and qualitative methods
Massive Open Online Courses (MOOCs) offer unprecedented opportunities to
learn at scale. Within a few years, the phenomenon of crowd-based learning has
gained enormous popularity with millions of learners across the globe
participating in courses ranging from Popular Music to Astrophysics. They have
captured the imaginations of many, attracting significant media attention -
with The New York Times naming 2012 "The Year of the MOOC." For those engaged
in learning analytics and educational data mining, MOOCs have provided an
exciting opportunity to develop innovative methodologies that harness big data
in education.
Comment: Preprint of a chapter to appear in "Data Mining and Learning
Analytics: Applications in Educational Research"
Improving tag recommendation using social networks
In this paper we address the task of recommending additional tags to partially annotated media objects, in our case images. We propose an extendable framework that can recommend tags using a combination of different personalised and collective contexts. We combine information from four contexts: (1) all the photos in the system, (2) a user's own photos, (3) the photos of a user's social contacts, and (4) the photos posted in the groups of which a user is a member. Variants of methods (1) and (2) have been proposed in previous work, but the use of (3) and (4) is novel.
For each of the contexts we use the same probabilistic model and Borda Count based aggregation approach to generate recommendations from different contexts into a unified ranking of recommended tags. We evaluate our system using a large set of real-world data from Flickr. We show that by using personalised contexts we can significantly improve tag recommendation compared to using collective knowledge alone. We also analyse our experimental results to explore the capabilities of our system with respect to a user's social behaviour
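The Borda Count aggregation step mentioned in this abstract can be sketched as below. It is a minimal illustration, assuming each context has already produced its own ranked tag list; the scoring variant (a tag in position p of a list of length n earns n - p points), the alphabetical tie-break, the function name, and the example tags are assumptions rather than details from the paper.

```python
from collections import defaultdict


def borda_aggregate(rankings, top_k=None):
    """Merge several ranked tag lists into a single ranking using Borda count."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, tag in enumerate(ranking):
            # The first tag in a list earns n points, the last earns 1.
            scores[tag] += n - position

    # Sort by descending score; break ties alphabetically for determinism.
    ordered = sorted(scores, key=lambda tag: (-scores[tag], tag))
    return ordered if top_k is None else ordered[:top_k]


# Hypothetical per-context recommendations for one partially tagged photo.
context_rankings = [
    ["beach", "sea", "sun"],    # e.g. collective context
    ["sea", "beach", "sand"],   # e.g. the user's own photos
    ["sea", "sun", "holiday"],  # e.g. photos of social contacts
]
recommended = borda_aggregate(context_rankings, top_k=3)
```

Tags that appear high in several contexts accumulate the most points, so a tag supported by both personal and collective evidence outranks one that only a single context suggests.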