177 research outputs found

    From past to present: spam detection and identifying opinion leaders in social networks

    On microblogging sites, which gain more users every day, a wide range of ideas quickly emerge, spread, and create interactive environments. In some cases, in Turkey as well as in the rest of the world, events have been published on microblogging sites before appearing in visual, audio, and printed news sources. Thanks to the rapid flow of information in social networks, a piece of news can reach millions of people in seconds. In this context, social media can be seen as one of the most important sources of information affecting public opinion. Once the information in social networks became accessible, researchers began to use it in their studies. As studies on spam detection and identification of opinion leaders gained popularity, surveys on these topics began to be published. This study also shows the importance of spam detection and identification of opinion leaders in social networks. Data collected from social platforms, especially in recent years, has been the source for many state-of-the-art applications. There are independent surveys that focus on filtering spam content and detecting influencers on social networks. This survey analyzes both spam detection and opinion leader identification studies and categorizes them by methodology. As far as we know, there is no survey that covers approaches for both spam detection and opinion leader identification in social networks. This survey contains an overview of the past and recent advances in both spam detection and opinion leader identification studies in social networks. 
Furthermore, readers of this survey have the opportunity of understanding general aspects of different studies about spam detection and opinion leader identification while observing key points and comparisons of these studies. This work is supported in part by the Scientific and Technological Research Council of Turkey (TUBITAK) through grant number 118E315 and grant number 120E187. Points of view in this document are those of the authors and do not necessarily represent the official position or policies of TUBITAK. Publisher's Version. Emerging Sources Citation Index (ESCI). Q4. WOS:00080858480001

    Understanding Bots on Social Media - An Application in Disaster Response

    Social media has become a primary platform for real-time information sharing among users. News on social media spreads faster than through traditional outlets, and millions of users turn to these platforms to receive the latest updates on major events, especially disasters. Social media bridges the gap between the people who are affected by disasters, volunteers who offer contributions, and first responders. On the other hand, social media is fertile ground for malicious users who purposefully disturb the relief processes facilitated on social media. These malicious users take advantage of social bots to overrun social media posts with fake images, rumors, and false information. This process causes distress and prevents actionable information from reaching the affected people. Social bots are automated accounts that are controlled by a malicious user, and these bots have become prevalent on social media in recent years. In spite of existing efforts towards understanding and removing bots on social media, there are at least two drawbacks associated with current bot detection algorithms: general-purpose bot detection methods are designed to be conservative and do not label a user as a bot unless the algorithm is highly confident, and they overlook the effect of users who are manipulated by bots and (unintentionally) spread their content. This study is threefold. First, I design a machine learning model that uses the content and context of social media posts to detect actionable ones among them; it specifically focuses on tweets in which people ask for help after major disasters. Second, I focus on bots that can facilitate the spread of malicious content during disasters. I propose two methods for detecting bots on social media with a focus on the recall of the detection. Third, I study the characteristics of users who spread the content of malicious actors. 
These features have the potential to improve methods that detect malicious content such as fake news. Dissertation/Thesis. Doctoral Dissertation, Computer Science 201
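
The trade-off between a conservative detector and a recall-focused one can be illustrated with a minimal sketch. This is not the dissertation's method: the scores, labels, and thresholds below are hypothetical, and the function `precision_recall` is introduced only for illustration. It assumes a detector that outputs a bot-likelihood score per account and labels an account as a bot when the score reaches a threshold.

```python
def precision_recall(scores, labels, threshold):
    """Compute precision and recall when accounts scoring >= threshold are flagged as bots."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    # Convention (an assumption): an empty denominator yields 1.0.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Hypothetical bot-likelihood scores and ground-truth labels (True = bot).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
labels = [True, True, False, True, False, False]

# A conservative threshold flags only high-confidence accounts.
conservative = precision_recall(scores, labels, threshold=0.9)
# A lower threshold catches more bots at the cost of some false positives.
recall_oriented = precision_recall(scores, labels, threshold=0.35)

print("conservative (precision, recall):", conservative)
print("recall-oriented (precision, recall):", recall_oriented)
```

On this toy data the conservative threshold flags one of the three bots with no false positives, while the lower threshold flags all three bots along with one human account.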

    Twitter and society


    Discovering and Mitigating Social Data Bias

    Exabytes of data are created online every day. This deluge of data is nowhere more apparent than on social media. Naturally, finding ways to leverage this unprecedented source of human information is an active area of research. Social media platforms have become laboratories for conducting experiments about people at scales thought unimaginable only a few years ago. Researchers and practitioners use social media to extract actionable patterns such as where aid should be distributed in a crisis. However, the validity of these patterns relies on having a representative dataset. As this dissertation shows, the data collected from social media is seldom representative of the activity of the site itself, and less so of human activity. This means that the results of many studies are limited by the quality of the data they collect. The finding that social media data is biased inspires the main challenge addressed by this thesis. I introduce three sets of methodologies to correct for bias. First, I design methods to deal with data collection bias. I offer a methodology that can find bias within a social media dataset. This methodology works by comparing the collected data with other sources to find bias in a stream. The dissertation also outlines a data collection strategy that minimizes the amount of bias that will appear in a given dataset. It introduces a crawling strategy that mitigates the amount of bias in the resulting dataset. Second, I introduce a methodology to identify bots and shills within a social media dataset. This directly addresses the concern that the users of a social media site are not representative. Applying these methodologies allows the population under study on a social media site to better match that of the real world. Finally, the dissertation discusses perceptual biases, explains how they affect analysis, and introduces computational approaches to mitigate them. 
The results of the dissertation allow for the discovery and removal of different levels of bias within a social media dataset. This has important implications for social media mining, namely that the behavioral patterns and insights extracted from social media will be more representative of the populations under study. Dissertation/Thesis. Doctoral Dissertation, Computer Science 201

    Graph Mining for Cybersecurity: A Survey

    The explosive growth of cyber attacks, such as malware, spam, and intrusions, has had severe consequences for society. Securing cyberspace has become an utmost concern for organizations and governments. Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities. In recent years, with the proliferation of graph mining techniques, many researchers have investigated these techniques for capturing correlations between cyber entities and achieving high performance. It is imperative to summarize existing graph-based cybersecurity solutions to provide a guide for future studies. Therefore, as a key contribution of this paper, we provide a comprehensive review of graph mining for cybersecurity, including an overview of cybersecurity tasks, the typical graph mining techniques, and the general process of applying them to cybersecurity, as well as various solutions for different cybersecurity tasks. For each task, we probe into relevant methods and highlight the graph types, graph approaches, and task levels in their modeling. Furthermore, we collect open datasets and toolkits for graph-based cybersecurity. Finally, we outline potential directions for future research in this field.

    Disrupting networks of hate: Characterising hateful networks and removing critical nodes

    Hateful individuals and groups have increasingly been using the Internet to express their ideas, spread their beliefs, and recruit new members. Understanding the network characteristics of these hateful groups could help understand individuals' exposure to hate and derive intervention strategies to mitigate the dangers of such networks by disrupting communications. This article analyses two hateful followers networks and three hateful retweet networks of Twitter users who post content subsequently classified by human annotators as containing hateful content. Our analysis shows similar connectivity characteristics between the hateful followers networks and likewise between the hateful retweet networks. The study shows that the hateful networks exhibit higher connectivity characteristics when compared to other "risky" networks, which can be seen as a risk in terms of the likelihood of exposure to, and propagation of, online hate. Three network performance metrics are used to quantify the hateful content exposure and contagion: giant component (GC) size, density and average shortest path. In order to efficiently identify nodes whose removal reduces the flow of hate in a network, we propose a range of structured node-removal strategies and test their effectiveness. Results show that removing users with a high degree is most effective in reducing the hateful followers network connectivity (GC size and density), and therefore reducing the risk of exposure to cyberhate and stemming its propagation.
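
Two of the metrics named in this abstract, giant component size and density, together with degree-based node removal, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the article's implementation: the graph is treated as undirected with no self-loops, the toy edge list is invented, and the helper names (`build_graph`, `giant_component_size`, `density`, `remove_highest_degree`) are hypothetical. Average shortest path is omitted for brevity.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def giant_component_size(adj):
    """Return the number of nodes in the largest connected component (BFS)."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


def density(adj):
    """Undirected density: 2m / (n(n-1)), where m is the edge count and n the node count."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nbs) for nbs in adj.values()) / 2
    return 2 * m / (n * (n - 1))


def remove_highest_degree(adj, k):
    """Return a copy with the k highest-degree nodes removed, recomputing degrees each round."""
    g = {u: set(vs) for u, vs in adj.items()}
    for _ in range(min(k, len(g))):
        # Ties are broken by node name so the result is deterministic.
        target = max(g, key=lambda u: (len(g[u]), str(u)))
        for nb in g.pop(target):
            if nb in g:
                g[nb].discard(target)
    return g


# Invented toy network: hub "h" linked to four accounts, plus one edge a-b.
toy = build_graph([("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"), ("a", "b")])
pruned = remove_highest_degree(toy, 1)

print("before:", giant_component_size(toy), density(toy))
print("after: ", giant_component_size(pruned), density(pruned))
```

In this toy case, removing the single hub shrinks the giant component from five nodes to two and lowers the density, which mirrors the kind of effect the abstract reports for high-degree removal.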