
    Homofilia por tópicos no espalhamento de memes em redes sociais online (Topical homophily in the spread of memes on online social networks)

    Advisor: André Santanchè. Dissertation (master's), Universidade Estadual de Campinas, Instituto de Computação.
    Abstract: One of the central problems in computational social science is to understand how information spreads in online social networks. Some works state that people using these networks may be unable to cope with the amount of information because of cognitive restrictions, which limits the attention they can spend reading and sharing messages. A competition scenario emerges in which the memes carried by messages compete to be remembered and shared so that they outlast others. This research aims to build empirical evidence that homophily plays a role in the success of each meme in that competition. Homophily is an effect observed when people prefer to interact with those they identify with. Using data gathered from Twitter, we clustered memes into topics, which we then used to characterize homophily. We ran a computational experiment, based on a simplified memory model of meme adoption, and verified that adoption is influenced by topical homophily.
    Degree: Master's in Computer Science (Mestre em Ciência da Computação). 131090/2017-8 CNP

    Information spreading during emergencies and anomalous events

    The most critical time for information to spread is in the aftermath of a serious emergency, crisis, or disaster. Individuals affected by such situations can now turn to an array of communication channels, from mobile phone calls and text messages to social media posts, when alerting social ties. These channels drastically improve the speed of information in a time-sensitive event, and provide extant records of human dynamics during and after the event. Retrospective analysis of such anomalous events provides researchers with a class of "found experiments" that may be used to better understand social spreading. In this chapter, we study information spreading triggered by a number of emergency events, including the Boston Marathon Bombing and a plane crash at a western European airport. We also contrast the information that can be gleaned from social media data with that from mobile phone data, and we estimate the rate of anomalous events in a mobile phone dataset using a proposed anomaly detection method. Comment: 19 pages, 11 figures

    Dynamics of Information Diffusion and Social Sensing

    Statistical inference using social sensors is an area that has witnessed remarkable progress and is relevant in applications including localizing events for targeted advertising, marketing, localizing natural disasters, and predicting investor sentiment in financial markets. This chapter presents a tutorial description of four important aspects of sensing-based information diffusion in social networks from a communications/signal processing perspective. First, diffusion models for information exchange in large-scale social networks, together with social sensing via social media networks such as Twitter, are considered. Second, Bayesian social learning models and risk-averse social learning are considered, with applications in finance and online reputation systems. Third, the principle of revealed preferences arising in microeconomic theory is used to parse datasets to determine whether social sensors are utility maximizers and then to determine their utility functions. Finally, the interaction of social sensors with YouTube channel owners is studied using time series analysis methods. All four topics are explained in the context of actual experimental datasets from health networks, social media and psychological experiments. Algorithms are also given that exploit the above models to infer underlying events based on social sensing. The overview, insights, models and algorithms presented in this chapter stem from recent developments in network science, economics and signal processing. At a deeper level, this chapter considers mean field dynamics of networks, risk-averse Bayesian social learning, filtering and quickest change detection, data incest in decision making over a directed acyclic graph of social sensors, inverse optimization problems for utility function estimation (revealed preferences) and statistical modeling of interacting social sensors in YouTube social networks. Comment: arXiv admin note: text overlap with arXiv:1405.112

    Social Sharing Design

    This dissertation studies the effects of sharing mechanisms and content characteristics on social sharing processes. Social sharing describes any exchange of resources available in a social system (news, products, ideas, behaviors, etc.). The dissertation consists of four empirical studies, each addressing a different research question. The first empirical project focuses on the effects of user control over the sharing process, preservation of the user’s privacy, and symbolic expressions of self-focus. The results from a laboratory experiment and two field studies reveal that content sharing is negatively affected by sharing mechanisms that allow greater control over the sharing process, aim to preserve the user’s privacy, or express a self-focus. The second research project investigates how sharing mechanisms that allow users not to disclose their identity affect social sharing. The results show that content related to controversial topics is less likely to be shared on Facebook, whereas it is actively discussed on discussion boards. The third research project analyzes how the payment of incentives influences social sharing. The results of three field experiments show that the payment of incentives increases the number of consumer reviews. Moreover, paid customers write less positive reviews and are less willing to make recommendations to their peers. The last study explores whether positive or negative content is shared with peers. The results show that the relationship between content’s positivity and its virality follows an inverted U-shape.

    Measuring Collective Attention in Online Content: Sampling, Engagement, and Network Effects

    The production and consumption of online content have been increasing rapidly, whereas human attention is a scarce resource. Understanding how content captures collective attention has become a challenge of growing importance. In this thesis, we tackle this challenge on three fronts -- quantifying sampling effects of social media data; measuring engagement behaviors towards online content; and estimating network effects induced by recommender systems. Data sampling is a fundamental problem. To obtain a list of items, one common method is sampling based on item prevalence in social media streams. However, social data is often noisy and incomplete, which may affect the subsequent observations. For each item, user behaviors can be conceptualized as two steps -- the first step is relevant to the content appeal, measured by the number of clicks; the second step is relevant to the content quality, measured by post-click metrics, e.g., dwell time, likes, or comments. We categorize online attention (behaviors) into two classes: popularity (clicking) and engagement (watching, liking, or commenting). Moreover, modern platforms use recommender systems to present users with a tailored content display that maximizes satisfaction. The recommendation alters the appeal of an item by changing its ranking, and consequently impacts its popularity. Our research is enabled by the data available from the largest video hosting site, YouTube. We use YouTube URLs shared on Twitter as a sampling protocol to obtain a collection of videos, and we track their prevalence from 2015 to 2019. This method creates a longitudinal dataset consisting of more than 5 billion tweets. Although the volume is substantial, we find that Twitter still subsamples the data. Our dataset covers about 80% of all tweets with YouTube URLs. We present a comprehensive measurement study of the Twitter sampling effects across different timescales and different subjects. 
We find that the volume of missing tweets can be estimated from Twitter rate limit messages, that true entity ranking can be inferred from sampled observations, and that sampling compromises the quality of network and diffusion models. Next, we present the first large-scale measurement study of how users collectively engage with YouTube videos. We study the time and percentage of each video being watched. We propose a duration-calibrated metric, called relative engagement, which is correlated with recognized notions of content quality, stable over time, and predictable even before a video's upload. Lastly, we examine the network effects induced by the YouTube recommender system. We construct the recommendation network for 60,740 music videos from 4,435 professional artists. An edge indicates that the target video is recommended on the webpage of the source video. We discover a popularity bias -- videos are disproportionately recommended towards more popular videos. We use the bow-tie structure to characterize the network and find that the largest strongly connected component consists of 23.1% of videos while occupying 82.6% of attention. We also build models to estimate the latent influence between videos and artists. By taking into account the network structure, we can predict video popularity 9.7% better than other baselines. Altogether, we explore the collective consumption patterns of human attention towards online content. Methods and findings from this thesis can be used by content producers, hosting sites, and online users alike to improve content production, advertising strategies, and recommender systems. We expect that our new metrics, methods, and observations can generalize to other multimedia platforms such as the music streaming service Spotify.
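
The largest-strongly-connected-component measurement mentioned in this abstract can be illustrated with a minimal sketch. This is not the thesis's code: the function names, the toy graph, and the view counts are all hypothetical, and "attention" is assumed here to mean a per-video view count. The sketch finds strongly connected components with Kosaraju's algorithm and reports what share of videos and of views falls in the largest one.

```python
from collections import defaultdict


def strongly_connected_components(nodes, edges):
    """Return the SCCs (as sets) of a directed graph via Kosaraju's algorithm."""
    forward, backward = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)

    # Pass 1: iterative DFS on the forward graph, recording finish order.
    visited, finish_order = set(), []
    for start in nodes:
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(forward[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(forward[nxt])))
                    break
            else:
                # All neighbours explored: this node is finished.
                finish_order.append(node)
                stack.pop()

    # Pass 2: DFS on the reversed graph in decreasing finish order.
    assigned, components = set(), []
    for start in reversed(finish_order):
        if start in assigned:
            continue
        assigned.add(start)
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in backward[node]:
                if nxt not in assigned:
                    assigned.add(nxt)
                    stack.append(nxt)
        components.append(component)
    return components


def largest_scc_shares(views, edges):
    """Share of videos and of views inside the largest SCC.

    Assumes every edge endpoint is a key of `views` (hypothetical input format).
    """
    components = strongly_connected_components(list(views), edges)
    largest = max(components, key=len)
    video_share = len(largest) / len(views)
    attention_share = sum(views[v] for v in largest) / sum(views.values())
    return largest, video_share, attention_share


# Toy recommendation graph: an edge (s, t) means t is recommended on s's page.
views = {"a": 500, "b": 300, "c": 150, "d": 40, "e": 10}
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a"), ("c", "e")]
largest, video_share, attention_share = largest_scc_shares(views, edges)
print(sorted(largest), video_share, attention_share)
```

In the toy graph, videos a, b and c recommend one another in a cycle and form the core, d only points into the core, and e is only pointed to from it. A small core can therefore hold most of the views, which is the kind of imbalance the abstract reports.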

    Signaling and Reciprocity:Robust Decentralized Information Flows in Social, Communication, and Computer Networks

    Complex networks exist for a number of purposes. The neural, metabolic and food networks ensure our survival, while the social, economic, transportation and communication networks allow us to prosper. Independently of the purposes and particularities of the physical embodiment of the networks, one of their fundamental functions is the delivery of information from one part of the network to another. Gossip and diseases diffuse in the social networks, electrochemical signals propagate in the neural networks and data packets travel in the Internet. Engineering networks for robust information flows is a challenging task. First, the mechanism through which the network forms and changes its topology needs to be defined. Second, within a given topology, the information must be routed to the appropriate recipients. Third, both the network formation and the routing mechanisms need to be robust against a wide spectrum of failures and adversaries. Fourth, the network formation, routing and failure recovery must operate under the resource constraints, either intrinsic or extrinsic to the network. Finally, the autonomously operating parts of the network must be incentivized to contribute their resources to facilitate the information flows. This thesis tackles the above challenges within the context of several types of networks: 1) peer-to-peer overlays – computers interconnected over the Internet to form an overlay in which participants provide various services to one another, 2) mobile ad-hoc networks – mobile nodes distributed in physical space communicating wirelessly with the goal of delivering data from one part of the network to another, 3) file-sharing networks – networks whose participants interconnect over the Internet to exchange files, 4) social networks – humans disseminating and consuming information through the network of social relationships. The thesis makes several contributions. 
Firstly, we propose a general algorithm which, given a set of nodes embedded in an arbitrary metric space, interconnects them into a network that efficiently routes information. We apply the algorithm to peer-to-peer overlays and experimentally demonstrate its high performance, scalability, and resilience to continuous peer arrivals and departures. We then shift our focus to the problem of the reliability of routing in peer-to-peer overlays. Each overlay peer has limited resources, and when they are exhausted, overlay messages are ultimately delayed or lost. All the solutions addressing this problem rely on message redundancy, which significantly increases the resource costs of fault-tolerance. We propose a bandwidth-efficient single-path Forward Feedback Protocol (FFP) for overlay message routing in which successfully delivered messages are followed by a feedback signal to reinforce the routing paths. Internet testbed evaluation shows that FFP uses 2-5 times less network bandwidth than the existing protocols relying on message redundancy, while achieving comparable fault-tolerance levels under a variety of failure scenarios. While the Forward Feedback Protocol is robust to message loss and delays, it is vulnerable to malicious message injection. We address this and other security problems by proposing Castor, a variant of FFP for mobile ad-hoc networks (MANETs). In Castor, we use the same general mechanism as in FFP: each time a message is routed, the routing path is either reinforced or weakened by the feedback signal depending on whether the routing succeeded or not. However, unlike FFP, Castor employs cryptographic mechanisms for ensuring the integrity and authenticity of the messages. We compare Castor to four other MANET routing protocols. 
Despite Castor's simplicity, it achieves up to 40% higher packet delivery rates than the other protocols and recovers at least twice as fast as the other protocols in a wide range of attacks and failure scenarios. Both of our protocols, FFP and Castor, rely on simple signaling to improve routing robustness in peer-to-peer and mobile ad-hoc networks. Given the success of the signaling mechanism in shaping the information flows in these two types of networks, we examine whether signaling plays a similarly crucial role in online social networks. We characterize the propagation of URLs in the social network of Twitter. The data analysis uncovers several statistical regularities in the user activity, the social graph, the structure of the URL cascades, and the communication and signaling dynamics. Based on these results, we propose a propagation model that accurately predicts which users are likely to mention which URLs. We outline a number of applications where social network information flow modelling would be crucial: content ranking and filtering, viral marketing and spam detection. Finally, we consider the problem of freeriding in peer-to-peer file-sharing applications, where users download data from others but never reciprocate by uploading. To address the problem, we propose a variant of the BitTorrent system in which two peers are only allowed to connect if their owners know one another in the real world. When users know which other users their BitTorrent client connects to, they are more likely to cooperate. The social network becomes the content distribution network, and the freeriding problem is solved by leveraging social norms and reciprocity to stabilize cooperation rather than relying on technological means. Our extensive simulation shows that the social network topology is an efficient and scalable content distribution medium that is also robust to freeriding.

    Collective attention in online social networks

    Social media is an ever-present tool in modern society, and its widespread usage positions it as a valuable source of insights into society at large. The study of collective attention in particular is one application that benefits from the scale of social media data. In this thesis we investigate how collective attention manifests on social media and how it can be understood. We approach this challenge from several perspectives across network and data science. We first focus on a period of increased media attention to climate change to see how robust the previously observed polarised structures are under a collective attention event. Our experiments show that while the level of engagement with the climate change debate increases, there is little disruption to the existing polarised structure in the communication network. Understanding the climate media debate requires addressing a methodological concern about the most effective method for weighting bipartite network projections with respect to the accuracy of community detection. We test seven weighting schemes on constructed networks with known community structure and then use the preferred methodology we identify to study collective attention in the climate change debate on Twitter. Following on from this, we investigate how collective attention changes over the course of a single, longer-running event, namely the COVID-19 pandemic. We measure how the disruption to in-person social interactions caused by attempts to limit the spread of COVID-19 in England and Wales has affected social interaction patterns as they appear on Twitter. Using a dataset of tweets with location tags, we examine how spatial attention to locations and collective attention to discussion topics are affected by social distancing and population movement restrictions in different stages of the pandemic. 
Finally, we present a new analysis framework for collective attention events that allows direct comparisons across different time and volume scales, such as those seen in the climate change and COVID-19 experiments. We demonstrate that this approach performs better than traditional approaches that rely on binning the time series at certain resolutions, and we comment on the mechanistic properties highlighted by our new methodology. Engineering and Physical Sciences Research Council (EPSRC)