13,884 research outputs found

    False News On Social Media: A Data-Driven Survey

    Full text link
    In the past few years, the research community has dedicated growing interest to the issue of false news circulating on social networks. The widespread attention on detecting and characterizing false news has been motivated by considerable backlashes of this threat against the real world. As a matter of fact, social media platforms exhibit peculiar characteristics, with respect to traditional news outlets, which have been particularly favorable to the proliferation of deceptive information. They also present unique challenges for all kind of potential interventions on the subject. As this issue becomes of global concern, it is also gaining more attention in academia. The aim of this survey is to offer a comprehensive study on the recent advances in terms of detection, characterization and mitigation of false news that propagate on social media, as well as the challenges and the open questions that await future research on the field. We use a data-driven approach, focusing on a classification of the features that are used in each study to characterize false information and on the datasets used for instructing classification methods. At the end of the survey, we highlight emerging approaches that look most promising for addressing false news

    Leveraging Node Attributes for Incomplete Relational Data

    Full text link
    Relational data are usually highly incomplete in practice, which inspires us to leverage side information to improve the performance of community detection and link prediction. This paper presents a Bayesian probabilistic approach that incorporates various kinds of node attributes encoded in binary form in relational models with Poisson likelihood. Our method works flexibly with both directed and undirected relational networks. The inference can be done by efficient Gibbs sampling which leverages sparsity of both networks and node attributes. Extensive experiments show that our models achieve the state-of-the-art link prediction results, especially with highly incomplete relational data.Comment: Appearing in ICML 201

    Extracting Implicit Social Relation for Social Recommendation Techniques in User Rating Prediction

    Full text link
    Recommendation plays an increasingly important role in our daily lives. Recommender systems automatically suggest items to users that might be interesting for them. Recent studies illustrate that incorporating social trust in Matrix Factorization methods demonstrably improves accuracy of rating prediction. Such approaches mainly use the trust scores explicitly expressed by users. However, it is often challenging to have users provide explicit trust scores of each other. There exist quite a few works, which propose Trust Metrics to compute and predict trust scores between users based on their interactions. In this paper, first we present how social relation can be extracted from users' ratings to items by describing Hellinger distance between users in recommender systems. Then, we propose to incorporate the predicted trust scores into social matrix factorization models. By analyzing social relation extraction from three well-known real-world datasets, which both: trust and recommendation data available, we conclude that using the implicit social relation in social recommendation techniques has almost the same performance compared to the actual trust scores explicitly expressed by users. Hence, we build our method, called Hell-TrustSVD, on top of the state-of-the-art social recommendation technique to incorporate both the extracted implicit social relations and ratings given by users on the prediction of items for an active user. To the best of our knowledge, this is the first work to extend TrustSVD with extracted social trust information. The experimental results support the idea of employing implicit trust into matrix factorization whenever explicit trust is not available, can perform much better than the state-of-the-art approaches in user rating prediction

    Russian word sense induction by clustering averaged word embeddings

    Full text link
    The paper reports our participation in the shared task on word sense induction and disambiguation for the Russian language (RUSSE-2018). Our team was ranked 2nd for the wiki-wiki dataset (containing mostly homonyms) and 5th for the bts-rnc and active-dict datasets (containing mostly polysemous words) among all 19 participants. The method we employed was extremely naive. It implied representing contexts of ambiguous words as averaged word embedding vectors, using off-the-shelf pre-trained distributional models. Then, these vector representations were clustered with mainstream clustering techniques, thus producing the groups corresponding to the ambiguous word senses. As a side result, we show that word embedding models trained on small but balanced corpora can be superior to those trained on large but noisy data - not only in intrinsic evaluation, but also in downstream tasks like word sense induction.Comment: Proceedings of the 24rd International Conference on Computational Linguistics and Intellectual Technologies (Dialogue-2018

    Semi-Supervised Learning for Neural Keyphrase Generation

    Full text link
    We study the problem of generating keyphrases that summarize the key points for a given document. While sequence-to-sequence (seq2seq) models have achieved remarkable performance on this task (Meng et al., 2017), model training often relies on large amounts of labeled data, which is only applicable to resource-rich domains. In this paper, we propose semi-supervised keyphrase generation methods by leveraging both labeled data and large-scale unlabeled samples for learning. Two strategies are proposed. First, unlabeled documents are first tagged with synthetic keyphrases obtained from unsupervised keyphrase extraction methods or a selflearning algorithm, and then combined with labeled samples for training. Furthermore, we investigate a multi-task learning framework to jointly learn to generate keyphrases as well as the titles of the articles. Experimental results show that our semi-supervised learning-based methods outperform a state-of-the-art model trained with labeled data only.Comment: To appear in EMNLP 2018 (12 pages, 7 figures, 6 tables

    Hete-CF: Social-Based Collaborative Filtering Recommendation using Heterogeneous Relations

    Full text link
    Collaborative filtering algorithms haven been widely used in recommender systems. However, they often suffer from the data sparsity and cold start problems. With the increasing popularity of social media, these problems may be solved by using social-based recommendation. Social-based recommendation, as an emerging research area, uses social information to help mitigate the data sparsity and cold start problems, and it has been demonstrated that the social-based recommendation algorithms can efficiently improve the recommendation performance. However, few of the existing algorithms have considered using multiple types of relations within one social network. In this paper, we investigate the social-based recommendation algorithms on heterogeneous social networks and proposed Hete-CF, a Social Collaborative Filtering algorithm using heterogeneous relations. Distinct from the exiting methods, Hete-CF can effectively utilize multiple types of relations in a heterogeneous social network. In addition, Hete-CF is a general approach and can be used in arbitrary social networks, including event based social networks, location based social networks, and any other types of heterogeneous information networks associated with social information. The experimental results on two real-world data sets, DBLP (a typical heterogeneous information network) and Meetup (a typical event based social network) show the effectiveness and efficiency of our algorithm
    • …
    corecore