
    Web-scale provenance reconstruction of implicit information diffusion on social media

    Fast, massive, and viral data diffused on social media affects a large share of the online population, and thus the (prospective) information diffusion mechanisms behind it are of great interest to researchers. The (retrospective) provenance of such data is equally important, because it contributes to understanding the relevance and trustworthiness of the information. Furthermore, computing provenance in a timely way is crucial for particular use cases and practitioners, such as online journalists who promptly need to assess specific pieces of information. Social media currently provide insufficient mechanisms for provenance tracking, publication, and generation, while the state of the art in social media research focuses mainly on explicit diffusion mechanisms (like retweets on Twitter or reshares on Facebook). Implicit diffusion mechanisms remain understudied because they are difficult to capture and to properly understand. On the technical side, the state of the art for provenance reconstruction evaluates small datasets after the fact, sidestepping the scale and speed requirements of current social media data. In this paper, we investigate the mechanisms of implicit information diffusion by computing its fine-grained provenance. We show that explicit mechanisms are insufficient to capture influence, and our analysis unravels a significant part of the implicit interactions and influence in social media. Our approach works incrementally and can be scaled up to cover truly Web-scale scenarios such as major events. We can process datasets of up to several million messages on a single machine, at rates that cover bursty behaviour, without compromising result quality. By doing so, we provide online journalists and social media users in general with fine-grained provenance reconstruction that sheds light on implicit interactions not captured by social media providers. These results are provided in an online fashion, which also allows for fast relevance and trustworthiness assessment.
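    The incremental reconstruction described above can be illustrated as a streaming near-duplicate index: each incoming message is matched against earlier messages by textual overlap, so implicit reuse (copying content without a retweet) surfaces as a candidate provenance link. This is a minimal sketch of the general idea, not the paper's actual algorithm; the shingling scheme, similarity threshold, and class/function names are assumptions.

```python
from collections import defaultdict

def shingles(text, k=3):
    """Word k-grams used as a cheap textual fingerprint (an assumed scheme)."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Set-overlap similarity between two shingle sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

class ProvenanceIndex:
    """Incrementally link each new message to earlier near-duplicates."""
    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.index = defaultdict(set)   # shingle -> ids of earlier messages
        self.fingerprints = {}          # message id -> shingle set

    def add(self, msg_id, text):
        """Index a new message; return earlier messages it likely derives from."""
        fp = shingles(text)
        # candidate provenance: earlier messages sharing at least one shingle
        candidates = set().union(*(self.index[s] for s in fp)) if fp else set()
        sources = [c for c in candidates
                   if jaccard(fp, self.fingerprints[c]) >= self.threshold]
        for s in fp:
            self.index[s].add(msg_id)
        self.fingerprints[msg_id] = fp
        return sources
```

    Because only shingle buckets touched by the new message are inspected, each message is processed against a small candidate set rather than the whole history, which is what makes an incremental, single-machine setup plausible at high message rates.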

    From Information Cascade to Knowledge Transfer: Predictive Analyses on Social Networks

    As social media continues to influence our daily life, much research has focused on analyzing the characteristics of social networks and tracking how information flows through social media. Information cascade research originated from the study of information diffusion, which focused on how decision making is affected by others depending on the network structure; the SIR (Susceptible, Infected, Removed) model is one example of such a study. Current research on information cascades focuses mainly on three open questions: diffusion models, network inference, and influence maximization. Unlike these studies, this dissertation aims at a better understanding of the problem of who will transfer information to whom. In particular, we investigate how knowledge is transferred in social media. The process of transferring knowledge resembles the information cascades observed in other social networks in that both processes transfer particular information from a holder to users who do not yet have it. The study first works on understanding information cascades by detecting information outbreaks on Twitter and the factors affecting the cascades. We then analyze how knowledge is transferred through the adoption of research topics among scholars in the DBLP network. However, knowledge transfer cannot be well modeled from scholars' publications, since a "publication" action results from many complicated factors beyond knowledge transfer alone. We therefore turn to Q&A forums, a different type of social media that explicitly contains the process of transferring knowledge, where knowledge transfer is embodied in the question-and-answer process. This dissertation further investigates Stack Overflow, a popular Q&A forum, and models how knowledge is transferred among Stack Overflow users. Knowledge transfer includes two parts: whether a question will receive answers, and whether an answer will be accepted. 
    By investigating these two problems, we find that the knowledge transfer process is affected by a temporal factor and by the knowledge level, defined as the combination of user reputation and posted text. Taking these factors into consideration, this work proposes TKTM (Time-based Knowledge Transfer Modeling), in which the likelihood that a user transfers knowledge to another is modeled as a continuous function of time and of the knowledge level being transferred. TKTM is applied to several predictive problems: how many user accounts will be involved in a thread over time to provide answers and comments; who will provide an answer; and who will provide the accepted answer. The results are compared to NetRate, QLI, and regression methods such as random forests and linear regression. In all experiments, TKTM significantly outperforms the other methods.
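    The core idea, a transfer likelihood that is a continuous function of elapsed time and knowledge level, can be sketched with a simple knowledge-scaled exponential decay. The functional form, the rate parameter, and all names below are illustrative assumptions; the abstract does not give TKTM's exact model.

```python
import math

def transfer_likelihood(delta_t, knowledge_level, rate=0.1):
    """Illustrative transfer likelihood: exponential decay in the time
    since the question was posted, scaled by the answerer's knowledge
    level. An assumed stand-in for TKTM's continuous-time model."""
    return knowledge_level * rate * math.exp(-rate * delta_t)

def rank_answerers(candidates, delta_t):
    """Rank candidate users (dicts with a 'knowledge' field) by their
    modeled likelihood of transferring knowledge at time delta_t."""
    return sorted(candidates,
                  key=lambda u: transfer_likelihood(delta_t, u["knowledge"]),
                  reverse=True)
```

    Under a model of this shape, prediction tasks like "who will provide the answer" reduce to ranking candidates by likelihood at the current elapsed time, which is why both temporal and knowledge-level factors matter.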

    An Analysis of the Network Structure of Influential Communities in Twitter

    Over the past years, online social networks have become a major target for marketing strategies, generating a need for methods to efficiently spread information through these networks. Close-knit communities have developed on these platforms as groups of users connect with like-minded individuals. In this thesis we use data pulled from Twitter's API, together with simulations designed to mirror the Twitter network, to pursue an in-depth analysis of the network structure and influence of these communities. Through this analysis we draw several conclusions. First, the influence of users in these communities is correlated with the total number of followers in their neighborhood. Second, influential communities tend to be more tightly clustered than other areas of the network. Using these observations, we develop an algorithm to detect influential communities on Twitter and show that correctly prioritizing connections yields significant gains in message visibility.
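    The two observations above, influence correlating with total neighborhood followers and influential communities being tightly clustered, correspond to two per-user statistics that are straightforward to compute. A minimal sketch over an assumed undirected adjacency-dict representation (the thesis's own data structures and algorithm are not reproduced here):

```python
def neighborhood_followers(followers, graph, user):
    """Total follower count across a user's direct neighborhood
    (the user plus all direct neighbors)."""
    return followers[user] + sum(followers[v] for v in graph.get(user, ()))

def clustering_coefficient(graph, user):
    """Local clustering coefficient: the fraction of a user's neighbor
    pairs that are themselves connected (undirected graph)."""
    nbrs = list(graph.get(user, ()))
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in graph.get(nbrs[i], ()))
    return 2 * links / (k * (k - 1))
```

    A detection heuristic in the spirit of the thesis could then score candidate communities by combining these two signals, prioritizing tightly clustered regions whose members have follower-rich neighborhoods.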

    Computing Twitter Influence with a GPU

    Master's thesis in informatics (INF399, MAMN-INF, MAMN-PRO)

    Influential spreaders in the political Twitter sphere of the 2013 Malaysian general election

    Purpose – The article investigates influential political spreaders on Twitter before and after the 2013 Malaysian General Election (MGE2013), for the purpose of understanding whether the political sphere within Twitter reflects the intentions, popularity, and influence of political figures in the year Malaysia had its first 'social media election'. Design/methodology/approach – A Big Data approach was used to acquire a series of longitudinal datasets during the election period. The work differs from existing methods that focus on general statistics such as follower counts, numbers of supporters, and sentiment analysis. A retweeting network was extracted from tweets and retweets and mapped onto a novel information flow and propagation network we developed. We conducted quantitative studies using k-shell decomposition, which enables the construction of a quantitative Twitter political propagation sphere in which members positioned in the core areas are more influential than those in the outer circles and periphery. Findings – We conducted a comparative study of the influential members of the Twitter political propagation sphere on election day and the day after. We found that the representatives of political parties located at the center of the propagation network were the winners of the election. This may indicate that influential power within Twitter is positively related to the final election results, at least in MGE2013. Furthermore, a number of non-politicians located at the center of the propagation network also significantly influenced the election. Research limitations/implications – This research is based on a single large electoral campaign, in a specific election period and within one nation. While the results are significant and meaningful, more case studies are needed before the approach can be generally applied to identifying potential winning candidates in future social-media-fueled political elections. 
    Practical implications – We presented a simple yet effective model for identifying influential spreaders in the Twitter political sphere. Applying our approach yielded the conclusion that the online 'coreness' score has a significant influence on the final offline electoral results. This presents great opportunities for applying our methodology to the upcoming 2018 Malaysian General Election. The discovery presented here can be used to understand how the different players of political parties engage in the election game on Twitter, and our approach can also be adopted as a factor of influence for offline electoral activities. A quantitative approach to electoral results that are strongly influenced by social media means that comparative studies can be made in future elections. Originality/value – Existing works on the general elections of various nations have either bypassed or ignored the subtle links between online and offline influence propagation. Modeling influence from social media with a longitudinal and multilayered approach is also rarely studied. This simple yet effective method provides a new perspective for understanding how different players behave and mutually shape each other over time in the election game.
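    The k-shell decomposition used above assigns each node a coreness by iteratively peeling minimum-degree nodes: a node's shell index is the degree threshold k at which it gets removed, and higher coreness means closer to the network core. A minimal pure-Python sketch on an undirected adjacency dict (the article's retweet-network construction is not reproduced here):

```python
def k_shell(graph):
    """k-shell (k-core) decomposition of an undirected adjacency dict
    mapping node -> set of neighbors. Returns node -> shell index."""
    adj = {v: set(nbrs) for v, nbrs in graph.items()}  # working copy
    shell, k = {}, 0
    while adj:
        # raise k to the current minimum degree, then peel exhaustively
        k = max(k, min(len(nbrs) for nbrs in adj.values()))
        while True:
            peel = [v for v, nbrs in adj.items() if len(nbrs) <= k]
            if not peel:
                break
            for v in peel:
                shell[v] = k
                for u in adj.pop(v):
                    if u in adj:
                        adj[u].discard(v)
    return shell
```

    Ranking accounts by their shell index then yields the layered "propagation sphere" the article describes, with the highest-coreness accounts forming the influential core.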

    When Less is More: Systematic Analysis of Cascade-based Community Detection

    Information diffusion, the spreading of infectious diseases, and the spreading of rumors are fundamental processes occurring in real-life networks. In many practical cases one can observe when nodes become infected, but the underlying network over which a contagion or information propagates is hidden. Inferring properties of the underlying network is important, since these properties can be used for containing infections, forecasting, viral marketing, and so on. Moreover, for many applications it is sufficient to recover only coarse, high-level properties of this network rather than all of its edges. In this paper we conduct a systematic and extensive analysis of the following problem: given only the infection times, find communities of highly interconnected nodes. We carry out a thorough comparison between existing and new approaches on several large datasets and cover methodological challenges that are specific to this problem. One of the main conclusions is that the most stable performance, and the most significant improvement over the current state of the art, is achieved by our proposed simple heuristic approaches, which are agnostic to the particular graph structure and epidemic model. We also show that some well-known community detection algorithms can be enhanced by including edge weights based on the cascade data.
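    The final observation, that standard community detection improves when edges are weighted from cascade data, can be illustrated with one simple weighting rule: count how often one node is infected shortly after another across all cascades. The time window and the pair-counting rule below are assumptions for illustration, not the paper's exact heuristic.

```python
from collections import Counter
from itertools import combinations

def cascade_edge_weights(cascades, window=2.0):
    """Weight a candidate directed edge (u, v) by how many cascades infect
    v within `window` time units after u. Each cascade is a list of
    (node, infection_time) pairs; the underlying network is unobserved."""
    weights = Counter()
    for cascade in cascades:
        events = sorted(cascade, key=lambda e: e[1])  # order by infection time
        for (u, tu), (v, tv) in combinations(events, 2):
            if 0 < tv - tu <= window:
                weights[(u, v)] += 1
    return weights
```

    The resulting weights can be fed to any weighted community detection algorithm, concentrating edge mass on node pairs that repeatedly co-occur in cascades and thus plausibly share a community.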