Characterising and evaluating online communities from live microblogging user interactions
Microblogging (mainly represented by Twitter) is
a type of social media that focuses on fast, open, real-time
communication using short messages between users and their
followers. This system is attractive due to its open nature and
agile content sharing, leading to a compelling and popular social
media platform which generates large amounts of content by
the minute. Community finding techniques are an interesting
approach for organising this massive content, but there is no
clear agreement in the literature for a standard definition of user
community for the microblogging use case, leading to unreliable
ground-truth data and evaluation. In this work, we differentiate
between functional and structural definitions of communities
for microblogging. A functional community groups its users by
a common independent social function, e.g. fans of the same
football team, while in a structural community membership
depends exclusively on connectivity in a network, e.g. as measured by
modularity. We build and characterise eight types of functional
communities to be used as user-labelled ground-truth and five
types of live user interaction networks from Twitter. We then
evaluate thirteen popular structural community definitions using
five different Twitter datasets, exploring their goodness and
robustness for detecting the functional ground-truth under different
perturbation strategies. Our results show that definitions
based on internal connectivity, e.g. Triangle Participation Ratio,
Fraction Over Median Degree or Conductance, work best for the
Twitter use case and are very robust. On the other hand, classic
scores such as Modularity are limited and do not fit very well due
to the sparsity and noise of microblogging. An implementation
of our experimental framework is also made available.
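
To make the structural definitions named in the abstract concrete, here is a minimal sketch of Conductance, Triangle Participation Ratio and Fraction Over Median Degree. It assumes an undirected, unweighted interaction graph and the commonly used formulations of these scores. The function names and the toy graph are illustrative assumptions and are not taken from the authors' experimental framework.

```python
from statistics import median


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def conductance(adj, community):
    """Boundary edges divided by the total degree (volume) of the community.

    Lower values mean the community is better separated from the rest.
    Assumes every community member is a node of the graph.
    """
    members = set(community)
    cut = sum(1 for u in members for v in adj[u] if v not in members)
    # Each internal edge is seen from both endpoints, so this is 2 * internal edges.
    internal_twice = sum(1 for u in members for v in adj[u] if v in members)
    volume = internal_twice + cut
    return cut / volume if volume else 0.0


def triangle_participation_ratio(adj, community):
    """Fraction of members that belong to at least one triangle inside the community."""
    members = set(community)

    def in_internal_triangle(u):
        internal_nbrs = adj[u] & members
        # u closes a triangle if two of its internal neighbours are adjacent.
        return any(adj[v] & internal_nbrs for v in internal_nbrs)

    return sum(in_internal_triangle(u) for u in members) / len(members)


def fraction_over_median_degree(adj, community):
    """Fraction of members whose internal degree exceeds the graph-wide median degree."""
    members = set(community)
    median_degree = median(len(nbrs) for nbrs in adj.values())
    return sum(len(adj[u] & members) > median_degree for u in members) / len(members)


# Toy graph (hypothetical): two triangles joined by a single bridge edge c-d.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("c", "d"),
             ("d", "e"), ("e", "f"), ("d", "f")]
adj = build_adjacency(toy_edges)
candidate = {"a", "b", "c"}

print("conductance:", conductance(adj, candidate))
print("TPR:", triangle_participation_ratio(adj, candidate))
print("FOMD:", fraction_over_median_degree(adj, candidate))
```

In an evaluation like the one described, each functional ground-truth community would be passed as `community` and scored against an interaction network. The toy community above is only a stand-in for such a labelled group.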