Exploiting Behavioral Consistence for Universal User Representation
User modeling is critical for developing personalized services in industry. A
common approach to user modeling is to learn user representations that can be
distinguished by their interests or preferences. In this work, we focus on
developing a universal user representation model. The obtained universal
representations are expected to contain rich information, and be applicable to
various downstream applications without further modifications (e.g., user
preference prediction and user profiling). Accordingly, we can be free from the
heavy work of training task-specific models for every downstream task as in
previous works. Specifically, we propose the Self-supervised User Modeling Network
(SUMN) to encode behavior data into the universal representation. It includes
two key components. The first one is a new learning objective, which guides the
model to fully identify and preserve valuable user information under a
self-supervised learning framework. The other one is a multi-hop aggregation
layer, which improves the model's capacity to aggregate diverse behaviors.
Extensive experiments on benchmark datasets show that our approach can
outperform state-of-the-art unsupervised representation methods, and even
compete with supervised ones. Comment: Preprint of accepted AAAI2021 paper
Learning Representations of Social Media Users
User representations are routinely used in recommendation systems by platform
developers, targeted advertisements by marketers, and by public policy
researchers to gauge public opinion across demographic groups. Computer
scientists consider the problem of inferring user representations more
abstractly; how does one extract a stable user representation - effective for
many downstream tasks - from a medium as noisy and complicated as social media?
The quality of a user representation is ultimately task-dependent (e.g. does
it improve classifier performance, make more accurate recommendations in a
recommendation system) but there are proxies that are less sensitive to the
specific task. Is the representation predictive of latent properties such as a
person's demographic features, socioeconomic class, or mental health state? Is
it predictive of the user's future behavior?
In this thesis, we begin by showing how user representations can be learned
from multiple types of user behavior on social media. We apply several
extensions of generalized canonical correlation analysis to learn these
representations and evaluate them on three tasks: predicting future hashtag
mentions, friending behavior, and demographic features. We then show how user
features can be employed as distant supervision to improve topic model fit.
Finally, we show how user features can be integrated into and improve existing
classifiers in the multitask learning framework. We treat user representations
- ground truth gender and mental health features - as auxiliary tasks to
improve mental health state prediction. We also use distributed user
representations learned in the first chapter to improve tweet-level stance
classifiers, showing that distant user information can inform classification
tasks at the granularity of a single message. Comment: PhD thesis
Domain-based user embedding for competing events on social media
Online social networks offer vast opportunities for computational social
science, but effective user embedding is crucial for downstream tasks.
Traditionally, researchers have used pre-defined network-based user features,
such as degree and centrality measures, and/or content-based features, such as
posts and reposts. However, these measures may not capture the complex
characteristics of social media users. In this study, we propose a user
embedding method based on the URL domain co-occurrence network, which is simple
but effective for representing social media users in competing events. We
assessed the performance of this method in binary classification tasks using
benchmark datasets that included Twitter users related to COVID-19 infodemic
topics (QAnon, Biden, Ivermectin). Our results revealed that user embeddings
generated directly from the retweet network, and those based on language,
performed below expectations. In contrast, our domain-based embeddings
outperformed these methods while reducing computation time. These findings
suggest that the domain-based user embedding can serve as an effective tool to
characterize social media users participating in competing events, such as
political campaigns and public health crises. Comment: Computational social science applicatio
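The abstract above does not spell out how the embeddings are built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that two domains co-occur when the same user shares both, that a domain is represented by its normalized co-occurrence row, and that a user is represented by the average of the vectors of the domains they shared. The function names and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def domain_cooccurrence(user_domains):
    """Count, for each ordered domain pair, how many users shared both domains."""
    counts = defaultdict(float)
    for domains in user_domains.values():
        for a, b in combinations(sorted(set(domains)), 2):
            counts[(a, b)] += 1.0
            counts[(b, a)] += 1.0
    return counts


def user_embeddings(user_domains):
    """Return (vocab, {user: vector}) under the assumptions stated above."""
    vocab = sorted({d for ds in user_domains.values() for d in ds})
    counts = domain_cooccurrence(user_domains)

    # Assumed domain representation: L1-normalized co-occurrence row.
    domain_vecs = {}
    for d in vocab:
        row = [counts.get((d, other), 0.0) for other in vocab]
        total = sum(row)
        domain_vecs[d] = [x / total for x in row] if total else row

    # Assumed user representation: mean of the vectors of shared domains.
    embeddings = {}
    for user, ds in user_domains.items():
        unique = sorted(set(ds))
        if not unique:
            embeddings[user] = [0.0] * len(vocab)
            continue
        vecs = [domain_vecs[d] for d in unique]
        embeddings[user] = [sum(col) / len(vecs) for col in zip(*vecs)]
    return vocab, embeddings


if __name__ == "__main__":
    # Hypothetical toy data: user id -> domains of the URLs they shared.
    toy = {
        "u1": ["a.example", "b.example"],
        "u2": ["a.example", "b.example"],
        "u3": ["c.example", "d.example"],
    }
    vocab, emb = user_embeddings(toy)
    for user, vec in emb.items():
        print(user, vec)
```

In this toy example, users who share the same set of domains receive identical vectors, and the resulting vectors could then be passed to any binary classifier.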
"When they say weed causes depression, but it's your fav antidepressant": Knowledge-aware Attention Framework for Relationship Extraction
With the increasing legalization of medical and recreational use of cannabis,
more research is needed to understand the association between depression and
consumer behavior related to cannabis consumption. Big social media data has
the potential to provide deeper insights about these associations to public health
analysts. In this interdisciplinary study, we demonstrate the value of
incorporating domain-specific knowledge in the learning process to identify the
relationships between cannabis use and depression. We develop an end-to-end
knowledge-infused deep learning framework (Gated-K-BERT) that leverages the
pre-trained BERT language representation model and a domain-specific declarative
knowledge source (Drug Abuse Ontology (DAO)) to jointly extract entities and
their relationship using a gated fusion sharing mechanism. Our model is further
tailored to focus on the entities mentioned in the sentence through
an entity-position-aware attention layer, where the ontology is used to locate the
target entities' positions. Experimental results show that including the
knowledge-aware attentive representation in association with BERT can extract
the cannabis-depression relationship with better coverage in comparison to the
state-of-the-art relation extractor
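The abstract names a gated fusion mechanism without giving its equations. As a rough illustration only, a common form of gated fusion computes a sigmoid gate from the concatenated text and knowledge vectors and uses it to blend the two element-wise. The sketch below assumes that form; the function name, parameters, and weights are hypothetical and are not taken from Gated-K-BERT.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec, knowledge_vec, gate_weights, gate_bias):
    """Blend two equal-length vectors with a learned element-wise gate.

    Assumed form: g = sigmoid(W [t; k] + b), fused = g * t + (1 - g) * k.
    gate_weights has one row per output dimension, each of length 2 * dim.
    """
    if len(text_vec) != len(knowledge_vec):
        raise ValueError("text_vec and knowledge_vec must have the same length")
    concat = list(text_vec) + list(knowledge_vec)
    fused = []
    for i, (t, k) in enumerate(zip(text_vec, knowledge_vec)):
        z = sum(w * x for w, x in zip(gate_weights[i], concat)) + gate_bias[i]
        g = sigmoid(z)
        fused.append(g * t + (1.0 - g) * k)
    return fused


if __name__ == "__main__":
    text = [1.0, 2.0]        # e.g. a contextual text representation
    knowledge = [3.0, 4.0]   # e.g. an ontology-derived representation
    zero_w = [[0.0] * 4, [0.0] * 4]
    # With zero weights and bias the gate is 0.5, so the result is the average.
    print(gated_fusion(text, knowledge, zero_w, [0.0, 0.0]))
```

A gate close to 1 keeps the text representation, while a gate close to 0 keeps the knowledge representation; in a trained model the weights would be learned rather than set by hand.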
Detecting Mental Distresses Using Social Behavior Analysis in the Context of COVID-19: A Survey
Online social media provides a channel for monitoring people's social behaviors from which to infer and detect their mental distresses. During the COVID-19 pandemic, online social networks were increasingly used to express opinions, views, and moods due to the restrictions on physical activities and in-person meetings, leading to a significant amount of diverse user-generated social media content. This offers a unique opportunity to examine how COVID-19 changed global behaviors regarding its ramifications on mental well-being. In this article, we survey the literature on social media analysis for the detection of mental distress, with a special emphasis on the studies published since the COVID-19 outbreak. We analyze relevant research and its characteristics and propose new approaches to organizing the large number of studies arising from this emerging research area, thus drawing new views, insights, and knowledge for interested communities. Specifically, we first classify the studies in terms of feature extraction types, language usage patterns, aesthetic preferences, and online behaviors. We then explore various methods (including machine learning and deep learning techniques) for detecting mental health problems. Building upon the in-depth review, we present our findings and discuss future research directions and niche areas in detecting mental health problems using social media data. We also elaborate on the challenges of this fast-growing research area, such as technical issues in deploying such systems at scale as well as privacy and ethical concerns