
    Collaborative Inference of Coexisting Information Diffusions

    Recently, diffusion history inference has become an emerging research topic due to its great benefits for various applications; its purpose is to reconstruct the missing histories of information diffusion traces from incomplete observations. Existing methods, however, often focus on a single information diffusion trace, while in a real-world social network multiple information diffusions often coexist over the same network. In this paper, we propose a novel approach called the Collaborative Inference Model (CIM) for the problem of inferring coexisting information diffusions. By exploiting the synergy between the coexisting information diffusions, CIM holistically models multiple information diffusions as a sparse 4th-order tensor called the Coexisting Diffusions Tensor (CDT), without any prior assumption about diffusion models, and collaboratively infers the histories of the coexisting diffusions via a low-rank approximation of the CDT with a fusion of heterogeneous constraints generated from additional data sources. To improve efficiency, we further propose an optimized algorithm called the Time Window based Parallel Decomposition Algorithm (TWPDA), which speeds up the inference without compromising accuracy by exploiting the temporal locality of information diffusions. Extensive experiments on real-world and synthetic datasets verify the effectiveness and efficiency of CIM and TWPDA.
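    The low-rank tensor approximation at the heart of this approach can be illustrated with a minimal CP decomposition of a 4th-order array via alternating least squares. This is a generic sketch, not the paper's CIM: it omits the sparsity handling and the fused heterogeneous constraints, and all names here are illustrative.

```python
import numpy as np

def khatri_rao(mats):
    """Column-wise Khatri-Rao product of a list of factor matrices."""
    out = mats[0]
    for m in mats[1:]:
        out = np.einsum('ir,jr->ijr', out, m).reshape(-1, out.shape[1])
    return out

def cp_als(T, rank, iters=200, seed=0):
    """Rank-`rank` CP approximation of tensor T by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((s, rank)) for s in T.shape]
    for _ in range(iters):
        for mode in range(T.ndim):
            # Mode-n unfolding with the remaining axes kept in increasing
            # order (C-order flattening), matching khatri_rao's row order.
            U = np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)
            Z = khatri_rao([f for m, f in enumerate(factors) if m != mode])
            factors[mode] = np.linalg.lstsq(Z, U.T, rcond=None)[0].T
    return factors

def reconstruct(factors):
    """Rebuild a 4th-order tensor (like the CDT) from its CP factors."""
    return np.einsum('ir,jr,kr,lr->ijkl', *factors)
```

    On an exactly low-rank tensor this recovers the original to near machine precision; the real setting adds noise, missing entries, and side-information constraints.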

    Estimating community feedback effect on topic choice in social media with predictive modeling

    Social media users post content on various topics. A defining feature of social media is that other users can provide feedback, called community feedback, on that content in the form of comments, replies, and retweets. We hypothesize that the amount of feedback received influences the choice of topics on which a social media user posts. This hypothesis is challenging to test, however, because user heterogeneity and external confounders complicate measuring the feedback effect. Here, we investigate it with a predictive approach based on an interpretable model of an author's decision to continue the topic of their previous post. We account for confounding factors, including the author's topic preferences and unobserved external factors such as news and social events, by optimizing predictive accuracy. This approach lets us identify which users are susceptible to community feedback. Overall, we find that 33% of active users on Reddit and 14% on Twitter are influenced by community feedback. The model suggests that this feedback alters the probability of topic continuation by up to 14%, depending on the user and the amount of feedback.
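    The abstract does not specify the interpretable model, but the idea of estimating a feedback effect on topic continuation can be sketched as a logistic regression fit by gradient descent on synthetic data. The generative form, coefficients, and feature choice (log of feedback count) below are assumptions for illustration, not the authors' model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic (feedback, topic-continuation) pairs. Assumed generative
# form, purely for illustration:
# P(continue topic) = sigmoid(-1.0 + 0.6 * log1p(feedback)).
n = 5000
feedback = rng.poisson(5, size=n)
x = np.log1p(feedback)
p_true = 1 / (1 + np.exp(-(-1.0 + 0.6 * x)))
y = (rng.random(n) < p_true).astype(float)

# Fit a logistic regression (intercept + slope) by gradient descent.
X = np.column_stack([np.ones(n), x])
w = np.zeros(2)
for _ in range(3000):
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.5 * X.T @ (p - y) / n
b0, b1 = w

# Estimated feedback effect: change in continuation probability from
# zero feedback to the 90th-percentile feedback level.
hi = np.quantile(x, 0.9)
effect = 1 / (1 + np.exp(-(b0 + b1 * hi))) - 1 / (1 + np.exp(-b0))
```

    A per-user fit of this kind would let one flag users whose estimated `effect` is meaningfully positive as "susceptible to community feedback".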

    The impact of social bots on public COVID-19 perceptions during the 2020 U.S. presidential election

    Increasing evidence suggests that a growing amount of disruptive and harmful online content is generated by rogue actors known as malicious social bots. These are autonomous or semi-autonomous entities that can share, like, and post messages for detrimental purposes. Several authors have highlighted one strategy employed by these automated actors: conflict framing of issues. In this thesis, I examine the characteristics and potential role of social bots in shaping perceptions of COVID-19 during the highly polarized period of the 2020 U.S. presidential election. I leverage several computational methods to analyze the bots' characteristics (strategies and behaviors) and their potential impact on users' COVID-19 perception, using Twitter data from the 2020 U.S. presidential election. The results show that conservative bots send more conspiracy tweets than their liberal counterparts, yet both human and bot users express positive sentiment toward COVID-19. Social bots do not send more negative tweets or retweets over time than human users. Finally, no evidence suggests that either the negativity of the bots' content or their online proportion affects users' COVID-19 perception.

    Statistical Methods for Analyzing Time Series Data Drawn from Complex Social Systems

    The rise of human interaction in digital environments has led to an abundance of behavioral traces. These traces allow for model-based investigation of human-human and human-machine interaction "in the wild." Stochastic models allow us to both predict and understand human behavior. In this thesis, we present statistical procedures for learning such models from the behavioral traces left in digital environments. First, we develop a non-parametric method for smoothing time series data corrupted by serially correlated noise. The method determines the simplest smoothing of the data that simultaneously yields the simplest residuals, where the simplicity of the residuals is measured by their statistical complexity. We find that complexity-regularized regression outperforms generalized cross-validation in the presence of serially correlated noise. Next, we cast the task of modeling individual-level user behavior on social media into a predictive framework. We demonstrate the performance of two contrasting approaches, computational mechanics and echo state networks, on a heterogeneous data set drawn from user behavior on Twitter, and show that user behavior is well modeled as a process with self-feedback. The two approaches perform very similarly for most users, but the users on which they differ highlight the challenges of applying predictive models to dynamic social data. We then expand the predictive problem to modeling the aggregate behavior of large collections of users. We use three models, corresponding to seasonal, aggregate-autoregressive, and aggregation-of-individuals approaches, and find that their performance at predicting times of high activity depends strongly on the tradeoff between true and false positives, with no method dominating. Our results highlight the challenges and opportunities involved in modeling complex social systems, and demonstrate how influencers interested in forecasting potential user engagement can use complexity modeling to make better decisions. Finally, we turn from a predictive to a descriptive framework and investigate how well user behavior can be attributed to time of day, self-memory, and social inputs. The resulting models describe how a user processes their past behavior and their social inputs. We find that despite the diversity of observed behavior, most inferred models fall into a small subclass of all possible finitary processes. Thus, our work demonstrates that user behavior, while quite complex, belies simple underlying computational structures.
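    The aggregate-prediction problem, flagging times of high activity under a true/false positive tradeoff, can be sketched with the simplest of the three model families: a seasonal (hour-of-day) baseline. The synthetic series, thresholds, and "top quartile" definition below are illustrative assumptions, not the thesis's data or models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic hourly activity: a daily cycle plus AR(1) (serially
# correlated) noise, standing in for aggregate user activity.
n_days = 60
hours = np.arange(n_days * 24)
hod = hours % 24
seasonal = 10 + 8 * np.sin(2 * np.pi * hod / 24)
noise = np.zeros(hours.size)
for t in range(1, hours.size):
    noise[t] = 0.7 * noise[t - 1] + rng.normal(0, 2)
activity = seasonal + noise

# Seasonal model: mean activity per hour of day, fit on the first
# 40 days, evaluated on the last 20.
split = 40 * 24
train, test = activity[:split], activity[split:]
profile = np.array([train[hod[:split] == h].mean() for h in range(24)])
pred = profile[hod[split:]]

# "High activity" = top quartile of the test period. Sweeping the
# decision threshold traces out the true/false-positive tradeoff.
high = test > np.quantile(test, 0.75)
tprs, fprs = [], []
for thresh in np.quantile(pred, [0.5, 0.75, 0.9]):
    flag = pred > thresh
    tprs.append((flag & high).sum() / high.sum())
    fprs.append((flag & ~high).sum() / (~high).sum())
```

    Lowering the threshold raises both rates at once, which is exactly the tradeoff the thesis reports: which model "wins" depends on how false positives are costed.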

    Predictive Analysis on Twitter: Techniques and Applications

    Predictive analysis of social media data has attracted considerable attention from both the research community and the business world because of the essential and actionable information it can provide. Over the years, extensive experimentation and analysis have been carried out using Twitter data in domains such as healthcare, public health, politics, the social sciences, and demographics. In this chapter, we discuss techniques, approaches, and state-of-the-art applications of predictive analysis of Twitter data. Specifically, we present fine-grained analysis involving aspects such as sentiment and emotion, describe the use of domain knowledge in coarse-grained analysis of Twitter data for making decisions and taking actions, and relate a few success stories.

    Prediction Markets, Social Media and Information Efficiency

    We consider the impact of breaking news on market prices. We measure activity on the micro-blogging platform Twitter surrounding a unique, newsworthy, and identifiable event, and investigate subsequent movements of betting prices on the prominent betting exchange Betfair. The event we use is the Bigotgate scandal, which occurred during the 2010 UK General Election campaign. We use recent developments in time series econometrics to identify and quantify movements in both Twitter activity and Betfair prices, and compare the timings of the two. We find that the response of market prices appears somewhat sluggish and is indicative of market inefficiency: Betfair prices adjust with a delay, and there is evidence of post-news drift. This slow movement may be explained by the need for corroborating evidence from more traditional forms of media. Once important tweeters begin to tweet, notably the breaking-news Twitter feeds of traditional media outlets, prices begin to move.
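    The timing comparison between Twitter activity and price movements can be illustrated by estimating the response delay as the lag that maximizes the cross-correlation of the two series. The synthetic burst, noise level, and 15-step delay below are made up for illustration; the paper's econometric identification is considerably more involved.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic illustration: a burst of Twitter activity centred at
# t = 100 and a price response that follows 15 steps later (the true
# delay here is an assumption of this sketch, not a paper result).
t = np.arange(500)
tweets = np.exp(-(t - 100.0) ** 2 / 50) + 0.05 * rng.standard_normal(500)
price_move = np.exp(-(t - 115.0) ** 2 / 50) + 0.05 * rng.standard_normal(500)

# Estimate the delay as the lag that maximises the cross-correlation
# of the mean-centred series.
a = tweets - tweets.mean()
b = price_move - price_move.mean()
xcorr = np.correlate(b, a, mode='full')
lags = np.arange(-len(a) + 1, len(a))   # lag of price_move behind tweets
est_delay = lags[np.argmax(xcorr)]
```

    A positive `est_delay` is the "sluggish" pattern the paper describes: prices moving only after the Twitter burst.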

    Identifying Twitter users who repost unreliable news sources with linguistic information

    Social media has become a popular source of online news consumption for millions of users worldwide. However, it has also become a primary platform for spreading disinformation, with severe societal implications. Automatically identifying social media users who are likely to propagate posts from handles of unreliable news sources sometime in the future is of utmost importance for the early detection and prevention of disinformation diffusion in a network, yet it has not been explored. To that end, we present a novel task: predicting whether a user will repost content from Twitter handles of unreliable news sources by leveraging linguistic information from the user's own posts. We develop a new dataset of approximately 6.2K Twitter users mapped into two categories: (1) those who have reposted content from unreliable news sources; and (2) those who repost content only from reliable sources. For this task, we evaluate a battery of supervised machine learning models as well as state-of-the-art neural models, achieving up to 79.7 macro-F1. In addition, our linguistic feature analysis uncovers differences in language use and style between the two user categories.
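    The shape of this task, classifying users from the language of their own posts, can be sketched with a tiny bag-of-words naive Bayes classifier. The texts, labels, and model choice below are invented for illustration; the paper evaluates a battery of supervised and neural models on roughly 6.2K real users.

```python
import re
from collections import Counter

import numpy as np

# Toy users: concatenated posts and a label (1 = has reposted
# unreliable sources, 0 = reliable only). All texts are invented.
users = [
    ("they are hiding the truth wake up sheeple do your own research", 1),
    ("the mainstream media lies about everything trust no one", 1),
    ("great new study published today on climate science", 0),
    ("enjoyed this in-depth report from the local newspaper", 0),
]

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

vocab = sorted({w for text, _ in users for w in tokenize(text)})

def features(text):
    counts = Counter(tokenize(text))
    return np.array([counts[w] for w in vocab], float)

X = np.array([features(t) for t, _ in users])
y = np.array([lab for _, lab in users])

# Multinomial naive Bayes with add-one smoothing.
def fit_nb(X, y):
    logp, prior = {}, {}
    for c in (0, 1):
        wc = X[y == c].sum(axis=0) + 1.0        # smoothed word counts
        logp[c] = np.log(wc / wc.sum())
        prior[c] = np.log((y == c).mean())
    return logp, prior

def predict(x, logp, prior):
    return max((0, 1), key=lambda c: prior[c] + x @ logp[c])

logp, prior = fit_nb(X, y)
```

    Inspecting the per-class word log-probabilities is a crude analogue of the paper's linguistic feature analysis: the words that most separate the classes reveal differences in language use.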