8 research outputs found

    Machine Learning as a Tool for Wildlife Management and Research: The Case of Wild Pig-Related Content on Twitter

    Wild pigs (Sus scrofa) are a non-native, invasive species that cause considerable damage and transmit a variety of diseases to livestock, people, and wildlife. We explored Twitter, the most popular social media micro-blogging platform, to demonstrate how social media data can be leveraged to investigate social identity and sentiment toward wild pigs. In doing so, we employed a sophisticated machine learning approach to investigate: (1) the overall sentiment associated with the dataset, (2) online identities via user profile descriptions, and (3) the extent to which sentiment varied by online identity. Results indicated that the largest groups of online identity represented in our dataset were females and people whose occupation was in journalism and media communication. While the majority of our data indicated a negative sentiment toward wild pigs and other related search terms, users who identified with agriculture-related occupations had more favorable sentiment. Overall, this article is an important starting point for further investigation of the use of social media data and social identity in the context of wild pigs and other invasive species.

    Towards Real-Time, Country-Level Location Classification of Worldwide Tweets

    In contrast to much previous work that has focused on location classification of tweets restricted to a specific country, here we undertake the task in a broader context by classifying global tweets at the country level, which is so far unexplored in a real-time scenario. We analyse the extent to which a tweet's country of origin can be determined by making use of eight tweet-inherent features for classification. Furthermore, we use two datasets, collected a year apart from each other, to analyse the extent to which a model trained from historical tweets can still be leveraged for classification of new tweets. With classification experiments on all 217 countries in our datasets, as well as on the top 25 countries, we offer some insights into the best use of tweet-inherent features for an accurate country-level classification of tweets. We find that the use of a single feature, such as the use of tweet content alone -- the most widely used feature in previous work -- leaves much to be desired. Choosing an appropriate combination of both tweet content and metadata can actually lead to substantial improvements of between 20% and 50%. We observe that tweet content, the user's self-reported location and the user's real name, all of which are inherent in a tweet and available in a real-time scenario, are particularly useful to determine the country of origin. We also experiment on the applicability of a model trained on historical tweets to classify new tweets, finding that the choice of a particular combination of features whose utility does not fade over time can actually lead to comparable performance, avoiding the need to retrain. However, the difficulty of achieving accurate classification increases slightly for countries with multiple commonalities, especially for English- and Spanish-speaking countries. Comment: Accepted for publication in IEEE Transactions on Knowledge and Data Engineering (IEEE TKDE).

    Profile Update: The Effects of Identity Disclosure on Network Connections and Language

    Our social identities determine how we interact and engage with the world surrounding us. In online settings, individuals can make these identities explicit by including them in their public biography, possibly signaling a change to what is important to them and how they should be viewed. Here, we perform the first large-scale study on Twitter that examines behavioral changes following identity signal addition on Twitter profiles. Combining social networks with NLP and quasi-experimental analyses, we discover that after disclosing an identity on their profiles, users (1) generate more tweets containing language that aligns with their identity and (2) connect more to same-identity users. We also examine whether adding an identity signal increases the number of offensive replies and find that (3) the combined effect of disclosing identity via both tweets and profiles is associated with a reduced number of offensive replies from others.

    #WhoAmI in 160 characters?: Classifying social identities based on Twitter profile descriptions

    We combine social theory and NLP methods to classify English-speaking Twitter users’ online social identity in profile descriptions. We conduct two text classification experiments. In Experiment 1 we use a 5-category online social identity classification based on identity and self-categorization theories. While we are able to automatically classify two identity categories (Relational and Occupational), automatic classification of the other three identities (Political, Ethnic/religious and Stigmatized) is challenging. In Experiment 2 we test a merger of such identities based on theoretical arguments. We find that by combining these identities we can improve the predictive performance of the classifiers in the experiment. Our study shows how social theory can be used to guide NLP methods, and how such methods provide input to revisit traditional social theory that is strongly consolidated in offline settings.

    Stepping Out of the Ivory Tower for Ocean Literacy

    The Ocean Literacy movement is predominantly driven forward by scientists and educators working in subject areas associated with ocean science. While some in the scientific community have heeded the responsibility to communicate with the general public to increase scientific literacy, reaching and engaging with diverse audiences remains a challenge. Many academic institutions, research centers, and individual scientists use social network sites (SNS) like Twitter not only to promote conferences, journal publications, and scientific reports, but also to disseminate resources and information that have the potential to increase the scientific literacy of diverse audiences. As more people turn to social media for news and information, SNSs like Twitter have a great potential to increase ocean literacy, so long as disseminators understand the best practices and limitations of SNS communication. This study analyzed the Twitter account of MaREI – Ireland’s Centre for Marine and Renewable Energy – coordinated by University College Cork Ireland, as a case study. We looked specifically at posts related to ocean literacy to determine what types of audiences are being engaged and what factors need to be considered to increase engagement with intended audiences. Two main findings are presented in this paper. First, we present overall user retweet frequency as a function of post characteristics, highlighting features significant in influencing users’ retweet behavior. Second, we separate users into two types – INREACH and OUTREACH – and identify post characteristics that are statistically relevant in increasing the probability of engaging with an OUTREACH user. The results of this study provide novel insight into the ways in which science-based Twitter users can better use the platform as a vector for science communication and outreach.
