13,884 research outputs found
False News On Social Media: A Data-Driven Survey
In the past few years, the research community has dedicated growing interest
to the issue of false news circulating on social networks. The widespread
attention on detecting and characterizing false news has been motivated by
considerable backlashes of this threat against the real world. As a matter of
fact, social media platforms exhibit peculiar characteristics, with respect to
traditional news outlets, which have been particularly favorable to the
proliferation of deceptive information. They also present unique challenges for
all kind of potential interventions on the subject. As this issue becomes of
global concern, it is also gaining more attention in academia. The aim of this
survey is to offer a comprehensive study on the recent advances in terms of
detection, characterization and mitigation of false news that propagate on
social media, as well as the challenges and the open questions that await
future research on the field. We use a data-driven approach, focusing on a
classification of the features that are used in each study to characterize
false information and on the datasets used for instructing classification
methods. At the end of the survey, we highlight emerging approaches that look
most promising for addressing false news
Leveraging Node Attributes for Incomplete Relational Data
Relational data are usually highly incomplete in practice, which inspires us
to leverage side information to improve the performance of community detection
and link prediction. This paper presents a Bayesian probabilistic approach that
incorporates various kinds of node attributes encoded in binary form in
relational models with Poisson likelihood. Our method works flexibly with both
directed and undirected relational networks. The inference can be done by
efficient Gibbs sampling which leverages sparsity of both networks and node
attributes. Extensive experiments show that our models achieve the
state-of-the-art link prediction results, especially with highly incomplete
relational data.Comment: Appearing in ICML 201
Extracting Implicit Social Relation for Social Recommendation Techniques in User Rating Prediction
Recommendation plays an increasingly important role in our daily lives.
Recommender systems automatically suggest items to users that might be
interesting for them. Recent studies illustrate that incorporating social trust
in Matrix Factorization methods demonstrably improves accuracy of rating
prediction. Such approaches mainly use the trust scores explicitly expressed by
users. However, it is often challenging to have users provide explicit trust
scores of each other. There exist quite a few works, which propose Trust
Metrics to compute and predict trust scores between users based on their
interactions. In this paper, first we present how social relation can be
extracted from users' ratings to items by describing Hellinger distance between
users in recommender systems. Then, we propose to incorporate the predicted
trust scores into social matrix factorization models. By analyzing social
relation extraction from three well-known real-world datasets, which both:
trust and recommendation data available, we conclude that using the implicit
social relation in social recommendation techniques has almost the same
performance compared to the actual trust scores explicitly expressed by users.
Hence, we build our method, called Hell-TrustSVD, on top of the
state-of-the-art social recommendation technique to incorporate both the
extracted implicit social relations and ratings given by users on the
prediction of items for an active user. To the best of our knowledge, this is
the first work to extend TrustSVD with extracted social trust information. The
experimental results support the idea of employing implicit trust into matrix
factorization whenever explicit trust is not available, can perform much better
than the state-of-the-art approaches in user rating prediction
Russian word sense induction by clustering averaged word embeddings
The paper reports our participation in the shared task on word sense
induction and disambiguation for the Russian language (RUSSE-2018). Our team
was ranked 2nd for the wiki-wiki dataset (containing mostly homonyms) and 5th
for the bts-rnc and active-dict datasets (containing mostly polysemous words)
among all 19 participants.
The method we employed was extremely naive. It implied representing contexts
of ambiguous words as averaged word embedding vectors, using off-the-shelf
pre-trained distributional models. Then, these vector representations were
clustered with mainstream clustering techniques, thus producing the groups
corresponding to the ambiguous word senses. As a side result, we show that word
embedding models trained on small but balanced corpora can be superior to those
trained on large but noisy data - not only in intrinsic evaluation, but also in
downstream tasks like word sense induction.Comment: Proceedings of the 24rd International Conference on Computational
Linguistics and Intellectual Technologies (Dialogue-2018
Semi-Supervised Learning for Neural Keyphrase Generation
We study the problem of generating keyphrases that summarize the key points
for a given document. While sequence-to-sequence (seq2seq) models have achieved
remarkable performance on this task (Meng et al., 2017), model training often
relies on large amounts of labeled data, which is only applicable to
resource-rich domains. In this paper, we propose semi-supervised keyphrase
generation methods by leveraging both labeled data and large-scale unlabeled
samples for learning. Two strategies are proposed. First, unlabeled documents
are first tagged with synthetic keyphrases obtained from unsupervised keyphrase
extraction methods or a selflearning algorithm, and then combined with labeled
samples for training. Furthermore, we investigate a multi-task learning
framework to jointly learn to generate keyphrases as well as the titles of the
articles. Experimental results show that our semi-supervised learning-based
methods outperform a state-of-the-art model trained with labeled data only.Comment: To appear in EMNLP 2018 (12 pages, 7 figures, 6 tables
Hete-CF: Social-Based Collaborative Filtering Recommendation using Heterogeneous Relations
Collaborative filtering algorithms haven been widely used in recommender
systems. However, they often suffer from the data sparsity and cold start
problems. With the increasing popularity of social media, these problems may be
solved by using social-based recommendation. Social-based recommendation, as an
emerging research area, uses social information to help mitigate the data
sparsity and cold start problems, and it has been demonstrated that the
social-based recommendation algorithms can efficiently improve the
recommendation performance. However, few of the existing algorithms have
considered using multiple types of relations within one social network. In this
paper, we investigate the social-based recommendation algorithms on
heterogeneous social networks and proposed Hete-CF, a Social Collaborative
Filtering algorithm using heterogeneous relations. Distinct from the exiting
methods, Hete-CF can effectively utilize multiple types of relations in a
heterogeneous social network. In addition, Hete-CF is a general approach and
can be used in arbitrary social networks, including event based social
networks, location based social networks, and any other types of heterogeneous
information networks associated with social information. The experimental
results on two real-world data sets, DBLP (a typical heterogeneous information
network) and Meetup (a typical event based social network) show the
effectiveness and efficiency of our algorithm
- …