
    Characterising and evaluating online communities from live microblogging user interactions

    Microblogging (mainly represented by Twitter) is a type of social media that focuses on fast, open, real-time communication using short messages between users and their followers. Its open nature and agile content sharing have made it a compelling and popular platform that generates large amounts of content by the minute. Community-finding techniques are a promising approach for organising this massive content, but the literature has no agreed standard definition of a user community for the microblogging use case, which leads to unreliable ground-truth data and evaluation. In this work, we differentiate between functional and structural definitions of communities for microblogging. A functional community groups its users by a common, independent social function, e.g. fans of the same football team, while a structural community is defined exclusively by the connectivity of its members in a network, e.g. by modularity. We build and characterise eight types of functional communities to be used as user-labelled ground truth, and five types of live user-interaction networks from Twitter. We then evaluate thirteen popular structural community definitions using five different Twitter datasets, exploring their goodness and robustness for detecting the functional ground truth under different perturbation strategies. Our results show that definitions based on internal connectivity, e.g. Triangle Participation Ratio, Fraction Over Median Degree or Conductance, work best for the Twitter use case and are very robust. On the other hand, classic scores such as Modularity are limited and do not fit well, due to the sparsity and noise of microblogging. An implementation of our experimental framework is also made available.
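The abstract names three structural scores without defining them. The following is a minimal Python sketch of how they are commonly computed on an undirected graph; it is not the authors' framework. The function names, the adjacency-dictionary representation and the toy graph are assumptions made for illustration, and the formulas follow the usual textbook definitions, which may differ in detail from those used in the paper.

```python
from statistics import median


def build_graph(edges):
    """Build an undirected adjacency dict {node: set(neighbours)}, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def conductance(adj, community):
    """Boundary edges divided by (2 * internal edges + boundary edges); lower is better."""
    s = set(community)
    # Each internal edge is seen from both endpoints, hence the division by 2.
    internal = sum(1 for u in s for v in adj.get(u, ()) if v in s) // 2
    boundary = sum(1 for u in s for v in adj.get(u, ()) if v not in s)
    denom = 2 * internal + boundary
    return boundary / denom if denom else 0.0


def triangle_participation_ratio(adj, community):
    """Fraction of community members that sit in at least one triangle inside the community."""
    s = set(community)
    if not s:
        return 0.0

    def in_triangle(u):
        nbrs = adj.get(u, set()) & s
        # u is in a triangle if two of its internal neighbours are themselves connected.
        return any(adj[v] & nbrs for v in nbrs)

    return sum(in_triangle(u) for u in s) / len(s)


def fraction_over_median_degree(adj, community):
    """Fraction of members whose internal degree exceeds the median degree of the whole graph."""
    s = set(community)
    if not s:
        return 0.0
    d_m = median(len(nbrs) for nbrs in adj.values())
    return sum(len(adj.get(u, set()) & s) > d_m for u in s) / len(s)


# Hypothetical toy graph: a 4-clique (a, b, c, d) attached to a chain (e, f, g, h).
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("f", "g"), ("g", "h")]
graph = build_graph(edges)
clique = {"a", "b", "c", "d"}
chain = {"e", "f", "g", "h"}

for name, members in (("clique", clique), ("chain", chain)):
    print(name,
          round(conductance(graph, members), 3),
          triangle_participation_ratio(graph, members),
          fraction_over_median_degree(graph, members))
```

On this toy graph the densely connected clique scores well on all three measures, while the chain has no triangles and no member above the median degree, which illustrates why scores based on internal connectivity can separate cohesive groups from loosely linked ones.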