The development of social media user stance detection and bot detection
methods relies heavily on large-scale, high-quality benchmarks. However, in
addition to low annotation quality, existing benchmarks generally have
incomplete user relationships, which hinders graph-based account detection
research. To address these issues, we propose a Multi-Relational Graph-Based
Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based
benchmark for account detection. To our knowledge, MGTAB is built on the
largest original dataset in the field, with over 1.55 million users and 130
million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of
relationships, ensuring high-quality annotation and diverse relations. In
MGTAB, the user features comprise the 20 user property features with the
greatest information gain together with features extracted from user tweets.
In addition, we performed a
thorough evaluation of MGTAB and other public datasets. Our experiments show
that graph-based approaches are generally more effective than feature-based
approaches and perform better when multiple relations are introduced. By
analyzing the experimental results, we identify effective approaches for
account detection and
provide potential future research directions in this field. Our benchmark and
standardized evaluation procedures are freely available at:
https://github.com/GraphDetec/MGTAB.

Comment: 14 pages, 7 figures
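
The abstract says that the 20 user property features with the greatest information gain are retained. As a rough illustration of that kind of ranking, and not the authors' actual pipeline, the sketch below computes information gain for discrete features and keeps the top k. The function names, the toy feature names, and the toy data are all hypothetical, and continuous properties would need to be discretized first.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def top_k_features(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)[:k]


# Hypothetical toy data: three binary profile properties for six accounts.
labels = ["bot", "bot", "bot", "human", "human", "human"]
columns = {
    "default_profile_image": [1, 1, 1, 0, 0, 0],  # separates the two classes exactly
    "verified": [0, 0, 0, 0, 1, 1],               # partially informative
    "has_url": [0, 1, 0, 1, 0, 1],                # close to uninformative
}
selected = top_k_features(columns, labels, k=2)
print(selected)
```

On this toy data the feature that separates the classes exactly ranks first, and the weakest feature is dropped when k is 2.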