Automated Detection of Human Users in Twitter
This paper compares Support Vector Machine (SVM) classification and a number of clustering approaches for separating human from non-human users on Twitter in order to identify normal human activity. These approaches achieve similar F1 scores of 90%, with both experiencing difficulties in classifying human users who behave abnormally. A second-stage classification step was then used to further separate non-human users into brands, celebrities and promoters/information, achieving an average F1 score of 74%. These scores were achieved by reducing the size of the feature space using stepwise feature selection and category balancing from manual inspection of classification results.
A Network Topology Approach to Bot Classification
Automated social agents, or bots, are increasingly becoming a problem on
social media platforms. There is a growing body of literature and multiple
tools to aid in the detection of such agents on online social networking
platforms. We propose that the social network topology of a user would be
sufficient to determine whether the user is an automated agent or a human. To
test this, we use a publicly available dataset containing users on Twitter
labelled as either automated social agent or human. Using an unsupervised
machine learning approach, we obtain a detection accuracy rate of 70%.
Who tweets? Deriving the demographic characteristics of age, occupation and social class from Twitter user meta-data
This paper specifies, designs and critically evaluates two tools for the automated identification of demographic data (age, occupation and social class) from the profile descriptions of Twitter users in the United Kingdom (UK). Meta-data routinely collected through the Collaborative Social Media Observatory (COSMOS: http://www.cosmosproject.net/) relating to UK Twitter users is matched with the occupational lookup tables between job and social class provided by the Office for National Statistics (ONS) using SOC2010. Using expert human validation, the validity and reliability of the automated matching process is critically assessed and a prospective class distribution of UK Twitter users is offered with 2011 Census baseline comparisons. The pattern matching rules for identifying age are explained and enacted following a discussion on how to minimise false positives. The age distribution of Twitter users, as identified using the tool, is presented alongside the age distribution of the UK population from the 2011 Census. The automated occupation detection tool reliably identifies certain occupational groups, such as professionals, for which job titles cannot be confused with hobbies or are used in common parlance within alternative contexts. An alternative explanation on the prevalence of hobbies is that the creative sector is overrepresented on Twitter compared to 2011 Census data. The age detection tool illustrates the youthfulness of Twitter users compared to the general UK population as of the 2011 Census according to proportions, but projections demonstrate that there is still potentially a large number of older platform users. It is possible to detect “signatures” of both occupation and age from Twitter meta-data with varying degrees of accuracy (particularly dependent on occupational groups) but further confirmatory work is needed.
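As a rough illustration of what pattern matching for age on profile descriptions might look like, the following is a minimal sketch only. The abstract does not give the paper's actual rules, so the regular expressions, the `extract_age` function name and the 13 to 99 plausibility bounds are all assumptions made here for illustration. Requiring an explicit cue such as "years old" or "aged" is one simple way to reduce false positives from unrelated numbers.

```python
import re
from typing import Optional

# Hypothetical patterns (assumptions, not the paper's rules): a number is
# only accepted as an age when it appears next to an explicit age cue.
AGE_PATTERNS = [
    # "21 years old", "21 yrs old", "21yo", "21 y/o"
    re.compile(r"\b(\d{1,2})\s*(?:years?\s*old|yrs?\s*old|y/?o)\b", re.IGNORECASE),
    # "age 34", "aged 34", "age: 34"
    re.compile(r"\bage[d:]?\s*(\d{1,2})\b", re.IGNORECASE),
]


def extract_age(description: str, min_age: int = 13, max_age: int = 99) -> Optional[int]:
    """Return an age found in a profile description, or None.

    The min_age/max_age bounds are illustrative assumptions used to
    discard implausible values.
    """
    for pattern in AGE_PATTERNS:
        match = pattern.search(description)
        if match:
            age = int(match.group(1))
            if min_age <= age <= max_age:
                return age
    return None


if __name__ == "__main__":
    for text in [
        "21 years old, coffee lover",
        "Mum of 2, aged 34",
        "Working here for 10 years",
    ]:
        print(repr(text), extract_age(text))
```

A bare duration such as "for 10 years" is deliberately not matched, which shows the kind of false positive that cue-based rules are meant to avoid.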
Social Bots: Human-Like by Means of Human Control?
Social bots are currently regarded as an influential but also somewhat
mysterious factor in public discourse and opinion making. They are considered
to be capable of massively distributing propaganda in social and online media
and their application is even suspected to be partly responsible for recent
election results. Astonishingly, the term `Social Bot' is not well defined and
different scientific disciplines use divergent definitions. This work starts
with a balanced definition attempt, before providing an overview of how social
bots actually work (taking the example of Twitter) and what their current
technical limitations are. Despite recent research progress in Deep Learning
and Big Data, there are many activities bots cannot handle well. We then
discuss how bot capabilities can be extended and controlled by integrating
humans into the process and reason that this is currently the most promising
way to go in order to realize effective interactions with other humans.
Comment: 36 pages, 13 figures