Machine Learning as a Tool for Wildlife Management and Research: The Case of Wild Pig-Related Content on Twitter
Wild pigs (Sus scrofa) are a non-native, invasive species that cause considerable damage and transmit a variety of diseases to livestock, people, and wildlife. We explored Twitter, the most popular social media micro-blogging platform, to demonstrate how social media data can be leveraged to investigate social identity and sentiment toward wild pigs. In doing so, we employed a sophisticated machine learning approach to investigate: (1) the overall sentiment associated with the dataset, (2) online identities via user profile descriptions, and (3) the extent to which sentiment varied by online identity. Results indicated that the largest groups of online identity represented in our dataset were females and people whose occupation was in journalism and media communication. While the majority of our data indicated a negative sentiment toward wild pigs and other related search terms, users who identified with agriculture-related occupations had more favorable sentiment. Overall, this article is an important starting point for further investigation of the use of social media data and social identity in the context of wild pigs and other invasive species.
Towards Real-Time, Country-Level Location Classification of Worldwide Tweets
In contrast to much previous work that has focused on location classification
of tweets restricted to a specific country, here we undertake the task in a
broader context by classifying global tweets at the country level, which is so
far unexplored in a real-time scenario. We analyse the extent to which a
tweet's country of origin can be determined by making use of eight
tweet-inherent features for classification. Furthermore, we use two datasets,
collected a year apart from each other, to analyse the extent to which a model
trained from historical tweets can still be leveraged for classification of new
tweets. With classification experiments on all 217 countries in our datasets,
as well as on the top 25 countries, we offer some insights into the best use of
tweet-inherent features for an accurate country-level classification of tweets.
We find that the use of a single feature, such as the use of tweet content
alone -- the most widely used feature in previous work -- leaves much to be
desired. Choosing an appropriate combination of both tweet content and metadata
can actually lead to substantial improvements of between 20% and 50%. We
observe that tweet content, the user's self-reported location and the user's
real name, all of which are inherent in a tweet and available in a real-time
scenario, are particularly useful to determine the country of origin. We also
experiment on the applicability of a model trained on historical tweets to
classify new tweets, finding that the choice of a particular combination of
features whose utility does not fade over time can actually lead to comparable
performance, avoiding the need to retrain. However, the difficulty of achieving
accurate classification increases slightly for countries with multiple
commonalities, especially for English- and Spanish-speaking countries.
Comment: Accepted for publication in IEEE Transactions on Knowledge and Data
Engineering (IEEE TKDE)
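The idea of combining tweet content with metadata can be illustrated with a minimal sketch. It is not the authors' classifier: it swaps in a small multinomial naive Bayes model written from scratch, and the field names (`content`, `user_location`, `user_name`), the toy tweets, and the country labels are all hypothetical. Each token is prefixed with the field it came from, so the same word in the tweet text and in the self-reported location counts as two different features.

```python
import math
from collections import Counter, defaultdict

# Hypothetical field names; the abstract names the features but gives no schema.
FIELDS = ("content", "user_location", "user_name")


def featurize(tweet, fields=FIELDS):
    """Turn a tweet dict into field-prefixed tokens, e.g. 'user_location=madrid'."""
    tokens = []
    for field in fields:
        for word in tweet.get(field, "").lower().split():
            tokens.append(f"{field}={word}")
    return tokens


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, tweets, countries, fields=FIELDS):
        self.fields = fields
        self.class_counts = Counter(countries)
        self.token_counts = defaultdict(Counter)
        self.vocab = set()
        for tweet, country in zip(tweets, countries):
            for token in featurize(tweet, fields):
                self.token_counts[country][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, tweet):
        total = sum(self.class_counts.values())
        best_country, best_score = None, -math.inf
        for country, n in self.class_counts.items():
            counts = self.token_counts[country]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for token in featurize(tweet, self.fields):
                if token in self.vocab:  # skip tokens never seen in training
                    score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_country, best_score = country, score
        return best_country


# Toy training data, invented purely for illustration.
train = [
    {"content": "great tapas tonight", "user_location": "madrid spain",
     "user_name": "ana garcia"},
    {"content": "great tacos tonight", "user_location": "cdmx mexico",
     "user_name": "luis hernandez"},
    {"content": "lovely rain again", "user_location": "london uk",
     "user_name": "tom smith"},
]
labels = ["ES", "MX", "GB"]

model = NaiveBayes().fit(train, labels)
new_tweet = {"content": "great food tonight", "user_location": "madrid"}
print(model.predict(new_tweet))
```

In this toy example the tweet text alone cannot separate the Spanish tweet from the Mexican one, while the self-reported location can. That mirrors, in a very simplified way, the abstract's point that content alone leaves much to be desired.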
Profile Update: The Effects of Identity Disclosure on Network Connections and Language
Our social identities determine how we interact and engage with the world
surrounding us. In online settings, individuals can make these identities
explicit by including them in their public biography, possibly signaling a
change to what is important to them and how they should be viewed. Here, we
perform the first large-scale study on Twitter that examines behavioral changes
following identity signal addition on Twitter profiles. Combining social
networks with NLP and quasi-experimental analyses, we discover that after
disclosing an identity on their profiles, users (1) generate more tweets
containing language that aligns with their identity and (2) connect more to
same-identity users. We also examine whether adding an identity signal
increases the number of offensive replies and find that (3) the combined effect
of disclosing identity via both tweets and profiles is associated with a
reduced number of offensive replies from others.
#WhoAmI in 160 characters?: Classifying social identities based on Twitter profile descriptions
We combine social theory and NLP methods to classify English-speaking Twitter users’ online social identity in profile descriptions. We conduct two text classification experiments. In Experiment 1 we use a 5-category online social identity classification based on identity and self-categorization theories. While we are able to automatically classify two identity categories (Relational and Occupational), automatic classification of the other three identities (Political, Ethnic/religious and Stigmatized) is challenging. In Experiment 2 we test a merger of such identities based on theoretical arguments. We find that by combining these identities we can improve the predictive performance of the classifiers in the experiment. Our study shows how social theory can be used to guide NLP methods, and how such methods provide input to revisit traditional social theory that is strongly consolidated in offline settings.
Stepping Out of the Ivory Tower for Ocean Literacy
The Ocean Literacy movement is predominantly driven forward by scientists and educators working in subject areas associated with ocean science. While some in the scientific community have heeded the responsibility to communicate with the general public to increase scientific literacy, reaching and engaging with diverse audiences remains a challenge. Many academic institutions, research centers, and individual scientists use social network sites (SNSs) like Twitter not only to promote conferences, journal publications, and scientific reports, but also to disseminate resources and information that have the potential to increase the scientific literacy of diverse audiences. As more people turn to social media for news and information, SNSs like Twitter have great potential to increase ocean literacy, so long as disseminators understand the best practices and limitations of SNS communication. This study analyzed the Twitter account of MaREI – Ireland’s Centre for Marine and Renewable Energy – coordinated by University College Cork, Ireland, as a case study. We looked specifically at posts related to ocean literacy to determine what types of audiences are being engaged and what factors need to be considered to increase engagement with intended audiences. Two main findings are presented in this paper. First, we present overall user retweet frequency as a function of post characteristics, highlighting features significant in influencing users’ retweet behavior. Second, we separate users into two types – INREACH and OUTREACH – and identify post characteristics that are statistically relevant in increasing the probability of engaging with an OUTREACH user. The results of this study provide novel insight into the ways in which science-based Twitter users can better use the platform as a vector for science communication and outreach.