Automatic Identification of Personal Life Events in Twitter

Author: Harith Alani
Lisa A Thomas
Miriam Fernandez
Pam Briggs
Paul Mulholland
Thomas Dickinson
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2015

New social media has led to an explosion in personal digital data that encompasses both those expressions of self chosen by the individual and reflections of self provided by other, third parties. The resulting Digital Personhood (DP) data is complex, and for many users it is too easy to become lost in the mire of digital data. This paper studies the automatic detection of personal life events in Twitter. Six relevant life events are considered from psychological research: beginning school; first full-time job; falling in love; marriage; having children; and parent's death. We define a variety of features (user, content, semantic and interaction) to capture the characteristics of those life events and present the results of several classification methods to automatically identify these events in Twitter.

CiteSeerX

Crossref

Open Research Online (The Open University)

Identifying Prominent Life Events on Twitter

Author: Alani Harith
Briggs Pam
Dickinson Thomas
Fernández Miriam
Mulholland Paul
Thomas Lisa A.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/09/2015

Social media is a common place for people to post and share digital reflections of their life events, including major events such as getting married, having children, graduating, etc. Although the creation of such posts is straightforward, the identification of events on online media remains a challenge. Much research in recent years has focused on extracting major events from Twitter, such as earthquakes, storms, and floods. This paper, however, targets the automatic detection of personal life events, focusing on five events that psychologists found to be the most prominent in people's lives. We define a variety of features (user, content, semantic and interaction) to capture the characteristics of those life events and present the results of several classification methods to automatically identify these events in Twitter. Our proposed classification methods obtain results between 0.84 and 0.92 F1-measure for the different types of life events. A novel contribution of this work also lies in a new corpus of tweets, which has been annotated using crowdsourcing and which constitutes, to the best of our knowledge, the first publicly available dataset for the automatic identification of personal life events from Twitter.

Northumbria Research Link

Open Research Online (The Open University)

Social media mining for identification and exploration of health-related information from pregnant women

Author: Chandrashekar Pramod Bharadwaj
Magge Arjun
Sarker Abeed
Gonzalez Graciela
Publication date: 01/01/1989

Widespread use of social media has led to the generation of substantial amounts of information about individuals, including health-related information. Social media provides the opportunity to study health-related information about selected population groups who may be of interest for a particular study. In this paper, we explore the possibility of utilizing social media to perform targeted data collection and analysis from a particular population group: pregnant women. We hypothesize that we can use social media to identify cohorts of pregnant women and follow them over time to analyze crucial health-related information. To identify potentially pregnant women, we employ simple rule-based searches that attempt to detect pregnancy announcements with moderate precision. To further filter out false positives and noise, we employ a supervised classifier trained on a small amount of hand-annotated data. We then collect their posts over time to create longitudinal health timelines and attempt to divide the timelines into different pregnancy trimesters. Finally, we assess the usefulness of the timelines by performing a preliminary analysis to estimate drug intake patterns of our cohort at different trimesters. Our rule-based cohort identification technique collected 53,820 users over thirty months from Twitter. Our pregnancy announcement classification technique achieved an F-measure of 0.81 for the pregnancy class, resulting in 34,895 user timelines. Analysis of the timelines revealed that pertinent health-related information, such as drug intake and adverse reactions, can be mined from the data. Our approach to using user timelines in this fashion has produced very encouraging results and can be employed for other important tasks where cohorts, for which health-related information may not be available from other sources, are required to be followed over time to derive population-based estimates.

arXiv.org e-Print Archive

Wageningen University & Research Publications

Characterizing Geo-located Tweets in Brazilian Megacities

Author: Christina Gagnon (4247860)
Elizabeth Ottoni (4247866)
Luc DesGroseillers (59022)
Rémy Beaujois (314942)
Sami HSine (4247869)
Stéphanie Mollet (4247875)
Wildriss Viranaicken (347964)
Xin Zhang (35492)
Publication date: 01/01/2017

This work presents a framework for collecting, processing and mining geo-located tweets in order to extract meaningful and actionable knowledge in the context of smart cities. We collected and characterized more than 9M tweets from the two biggest cities in Brazil, Rio de Janeiro and São Paulo. We performed topic modeling using the Latent Dirichlet Allocation model to produce an unsupervised distribution of semantic topics over the stream of geo-located tweets as well as a distribution of words over those topics. We manually labeled and aggregated similar topics, obtaining a total of 29 different topics across both cities. Results showed similarities in the majority of topics for both cities, reflecting similar interests and concerns among the population of Rio de Janeiro and São Paulo. Nevertheless, some specific topics are more predominant in one of the cities.

arXiv.org e-Print Archive

Crossref

FigShare

Characterizing Geo-located Tweets in Brazilian Megacities

Author: Cacho Nélio
Pasquali Arian
Pereira João
Rossetti Rosaldo
Saleiro Pedro
Publication date: 06/09/2017

arXiv.org e-Print Archive

Crossref

Understanding the Roots of Radicalisation on Twitter

Author: Borum Randy
Cano Basave Amparo Elizabeth
Hassan Saif MatthewRowe
Jonathon Morgan Berger
Schmid Alex P
Vergani Matteo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2018

In an increasingly digital world, identifying signs of online extremism sits at the top of the priority list for counter-extremist agencies. Researchers and governments are investing in the creation of advanced information technologies to identify and counter extremism through intelligent large-scale analysis of online data. However, to the best of our knowledge, these technologies are neither based on, nor do they take advantage of, the existing theories and studies of radicalisation. In this paper we propose a computational approach for detecting and predicting the radicalisation influence a user is exposed to, grounded on the notion of ‘roots of radicalisation’ from social science models. This approach has been applied to analyse and compare the radicalisation level of 112 pro-ISIS vs. 112 “general” Twitter users. Our results show the effectiveness of our proposed algorithms in detecting and predicting radicalisation influence, obtaining up to 0.9 F1-measure for detection and between 0.7 and 0.8 precision for prediction. While this is an initial attempt towards the effective combination of social and computational perspectives, more work is needed to bridge these disciplines, and to build on their strengths to target the problem of online radicalisation.

Crossref

Open Research Online (The Open University)