10,025 research outputs found

    False News On Social Media: A Data-Driven Survey

    Full text link
    In the past few years, the research community has dedicated growing interest to the issue of false news circulating on social networks. The widespread attention on detecting and characterizing false news has been motivated by considerable backlashes of this threat against the real world. As a matter of fact, social media platforms exhibit peculiar characteristics, with respect to traditional news outlets, which have been particularly favorable to the proliferation of deceptive information. They also present unique challenges for all kind of potential interventions on the subject. As this issue becomes of global concern, it is also gaining more attention in academia. The aim of this survey is to offer a comprehensive study on the recent advances in terms of detection, characterization and mitigation of false news that propagate on social media, as well as the challenges and the open questions that await future research on the field. We use a data-driven approach, focusing on a classification of the features that are used in each study to characterize false information and on the datasets used for instructing classification methods. At the end of the survey, we highlight emerging approaches that look most promising for addressing false news

    Information System Articulation Development - Managing Veracity Attributes and Quantifying Relationship with Readability of Textual Data

    Get PDF
    Often the textual data are either disorganized or misinterpreted because of unstructured Big Data in multiple dimensions. Managing readable textual alphanumeric data and its analytics is challenging. In spatial dimensions, the facts can be ambiguous and inconsistent, posing interpretation and new knowledge discovery challenges. The information can be wordy, erratic, and noisy. The research aims to assimilate the data characteristics through Information System (IS) artefacts that are appropriate to data analytics, especially in application domains that involve big data sources. Data heterogeneity and multidimensionality can make and preclude IS-guided veracity models in the data integration process, including customer analytics services. The veracity of big data thus can impact visualization and value, including knowledge enhancement in the vast amount of textual data qualitatively. The manner the veracity features construed in each schematic, semantic and syntactic attribute dimension in several IS artefacts and relevant documents can enhance the readability of textual data robustly

    Towards Evaluating Veracity of Textual Statements on the Web

    Get PDF
    The quality of digital information on the web has been disquieting due to the absence of careful checking. Consequently, a large volume of false textual information is being produced and disseminated with misstatements of facts. The potential negative influence on the public, especially in time-sensitive emergencies, is a growing concern. This concern has motivated this thesis to deal with the problem of veracity evaluation. In this thesis, we set out to develop machine learning models for the veracity evaluation of textual claims based on stance and user engagements. Such evaluation is achieved from three aspects: news stance detection engaged user replies in social media and the engagement dynamics. First of all, we study stance detection in the context of online news articles where a claim is predicted to be true if it is supported by the evidential articles. We propose to manifest a hierarchical structure among stance classes: the high-level aims at identifying relatedness, while the low-level aims at classifying, those identified as related, into the other three classes, i.e., agree, disagree, and discuss. This model disentangles the semantic difference of related/unrelated and the other three stances and helps address the class imbalance problem. Beyond news articles, user replies on social media platforms also contain stances and can infer claim veracity. Claims and user replies in social media are usually short and can be ambiguous; to deal with semantic ambiguity, we design a deep latent variable model with a latent distribution to allow multimodal semantic distribution. Also, marginalizing the latent distribution enables the model to be more robust in relatively smalls-sized datasets. Thirdly, we extend the above content-based models by tracking the dynamics of user engagement in misinformation propagation. To capture these dynamics, we formulate user engagements as a dynamic graph and extract its temporal evolution patterns and geometric features based on an attention-modified Temporal Point Process. This allows to forecast the cumulative number of engaged users and can be useful in assessing the threat level of an individual piece of misinformation. The ability to evaluate veracity and forecast the scale growth of engagement networks serves to practically assist the minimization of online false information’s negative impacts

    Use of a controlled experiment and computational models to measure the impact of sequential peer exposures on decision making

    Full text link
    It is widely believed that one's peers influence product adoption behaviors. This relationship has been linked to the number of signals a decision-maker receives in a social network. But it is unclear if these same principles hold when the pattern by which it receives these signals vary and when peer influence is directed towards choices which are not optimal. To investigate that, we manipulate social signal exposure in an online controlled experiment using a game with human participants. Each participant in the game makes a decision among choices with differing utilities. We observe the following: (1) even in the presence of monetary risks and previously acquired knowledge of the choices, decision-makers tend to deviate from the obvious optimal decision when their peers make similar decision which we call the influence decision, (2) when the quantity of social signals vary over time, the forwarding probability of the influence decision and therefore being responsive to social influence does not necessarily correlate proportionally to the absolute quantity of signals. To better understand how these rules of peer influence could be used in modeling applications of real world diffusion and in networked environments, we use our behavioral findings to simulate spreading dynamics in real world case studies. We specifically try to see how cumulative influence plays out in the presence of user uncertainty and measure its outcome on rumor diffusion, which we model as an example of sub-optimal choice diffusion. Together, our simulation results indicate that sequential peer effects from the influence decision overcomes individual uncertainty to guide faster rumor diffusion over time. However, when the rate of diffusion is slow in the beginning, user uncertainty can have a substantial role compared to peer influence in deciding the adoption trajectory of a piece of questionable information

    The Impact of Twitter Features on Credibility Ratings - An Explorative Examination Combining Psychological Measurements and Feature Based Selection Methods

    Get PDF
    In a post-truth age determined by Social Media channels providing large amounts of information of questionable credibility while at the same time people increasingly tend to rely on online information, the ability to detect whether content is believable is developing into an important challenge. Most of the work in that field suggested automated approaches to perform binary classification to determine information veracity. Recipients´ perspectives and multidimensional psychological credibility measurements have rarely been considered. To fill this gap and gain more insights into the impact of a tweet´s features on perceived credibility, we conducted a survey asking participants (N=2626) to rate the credibility of crises related tweets. The resulting 24.823 ratings were used for an explorative feature selection analysis revealing that mostly meta-related features like the number of followers of the author, the count of tweets produced and the ratio of tweet number and days since account creation affect credibility judgements

    Taming Uncertainty in Big Data - Evidence from Social Media in Urban Areas

    Get PDF
    While the classic definition of Big Data included the dimensions volume, velocity, and variety, a fourth dimension, veracity, has recently come to the attention of researchers and practitioners. The increasing amount of user-generated data associated with the rise of social media emphasizes the need for methods to deal with the uncertainty inherent to these data sources. In this paper we address one aspect of uncertainty by developing a new methodology to establish the reliability of user-generated data based upon causal links with recurring patterns. We associate a large data set of geo-tagged Twitter messages in San Francisco with points of interest, such as bars, restaurants, or museums, within the city. This model is validated by causal relationships between a point of interest and the amount of messages in its vicinity. We subsequently analyze the behavior of these messages over time using a jackknifing procedure to identify categories of points of interest that exhibit consistent patterns over time. Ultimately, we condense this analysis into an indicator that gives evidence on the certainty of a data set based on these causal relationships and recurring patterns in temporal and spatial dimensions
    • …
    corecore