12 research outputs found

An analysis of the user occupational class through Twitter content

Author: Aletras N
Lampos V
Preoţiuc-Pietro D
Publication venue: The Association for Computational Linguistics
Publication date: 01/07/2015

Social media content can be used as a complementary source to the traditional methods for extracting and studying collective social attributes. This study focuses on the prediction of the occupational class for a public user profile. Our analysis is conducted on a new annotated corpus of Twitter users, their respective job titles, posted textual content and platform-related attributes. We frame our task as classification using latent feature representations such as word clusters and embeddings. The employed linear and, especially, non-linear methods can predict a user’s occupational class with strong accuracy for the coarsest level of a standard occupation taxonomy which includes nine classes. Combined with a qualitative assessment, the derived results confirm the feasibility of our approach in inferring a new user attribute that can be embedded in a multitude of downstream applications.

UCL Discovery

Point-of-interest type inference from social media text

Author: Aletras N.
Preoţiuc-Pietro D.
Villegas D.S.

Physical places help shape how we perceive the experiences we have there. For the first time, we study the relationship between social media text and the type of the place from where it was posted, whether a park, restaurant, or someplace else. To facilitate this, we introduce a novel data set of ∼200,000 English tweets published from 2,761 different points-of-interest in the U.S., enriched with place type information. We train classifiers to predict the type of the location a tweet was sent from that reach a macro F1 of 43.67 across eight classes and uncover the linguistic markers associated with each type of place. The ability to predict semantic place information from a tweet has applications in recommendation systems, personalization services and cultural geography.

White Rose Research Online

Analyzing political parody in social media

Author: Aletras N
Maronikolakis A
Preoţiuc-Pietro D
Villegas DS

Parody is a figurative device used to imitate an entity for comedic or critical purposes and represents a widespread phenomenon in social media through many popular parody accounts. In this paper, we present the first computational study of parody. We introduce a new publicly available data set of tweets from real politicians and their corresponding parody accounts. We run a battery of supervised machine learning models for automatically detecting parody tweets with an emphasis on robustness by testing on tweets from accounts unseen in training, across different genders and across countries. Our results show that political parody tweets can be predicted with an accuracy up to 90%. Finally, we identify the markers of parody through a linguistic analysis. Beyond research in linguistics and political communication, accurately and automatically detecting parody is important to improving fact checking for journalists and analytics such as sentiment analysis through filtering out parodical utterances.

White Rose Research Online

Automatically identifying complaints in social media

Author: Aletras N.
Găman M.
Preoţiuc-Pietro D.
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2019

Complaining is a basic speech act regularly used in human and computer-mediated communication to express a negative mismatch between reality and expectations in a particular situation. Automatically identifying complaints in social media is of utmost importance for organizations or brands to improve the customer experience or in developing dialogue systems for handling and responding to complaints. In this paper, we introduce the first systematic analysis of complaints in computational linguistics. We collect a new annotated data set of written complaints expressed in English on Twitter. We present an extensive linguistic analysis of complaining as a speech act in social media and train strong feature-based and neural models of complaints across nine domains achieving a predictive performance of up to 79 F1 using distant supervision.

arXiv.org e-Print Archive

Crossref

White Rose Research Online

Studying user income through language, behaviour and affect in social media

Author: AJ Smola
B Bernstein
BW Roberts
CE Rasmussen
D Freedman
D Kahneman
D Preoţiuc-Pietro
D Rout
Daniel Preoţiuc-Pietro
DM Blei
E Diener
E Snelson
F Pedregosa
FD Blau
H Zou
HA Schwartz
HB Mann
J Bollen
J Cohen
J Eisenstein
L Sloan
Lidia Adriana Braunstein
Nikolaos Aletras
P Ekman
P Elias
RM Neal
Svitlana Volkova
TA Judge
V Lampos
Vasileios Lampos
VN Vapnik
W Ng
W Youyou
Yoram Bachrach
Publication venue: Public Library of Science (PLoS)
Publication date: 22/09/2015

Automatically inferring user demographics from social media posts is useful for both social science research and a range of downstream applications in marketing and politics. We present the first extensive study where user behaviour on Twitter is used to build a predictive model of income. We apply non-linear methods for regression, i.e. Gaussian Processes, achieving strong correlation between predicted and actual user income. This allows us to shed light on the factors that characterise income on Twitter and analyse their interplay with user emotions and sentiment, perceived psycho-demographics and language use expressed through the topics of their posts. Our analysis uncovers correlations between different feature categories and income. Some of these reflect common belief, e.g. higher perceived education and intelligence indicate higher earnings, or known differences, e.g. gender and age differences. Others are novel findings, e.g. higher income users express more fear and anger, whereas lower income users express more of the time emotion and opinions.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

UCL Discovery

PubMed Central

White Rose Research Online

Predicting and Characterising User Impact on Twitter

Author: Lampos V
Aletras N
Preoţiuc-Pietro D
Cohn T
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2014

The open structure of online social networks and their uncurated nature give rise to problems of user credibility and influence. In this paper, we address the task of predicting the impact of Twitter users based only on features under their direct control, such as usage statistics and the text posted in their tweets. We approach the problem as regression and apply linear as well as non-linear learning methods to predict a user impact score, estimated by combining the numbers of the user's followers, followees and listings. The experimental results point out that a strong prediction performance is achieved, especially for models based on the Gaussian Processes framework. Hence, we can interpret various modelling components, transforming them into indirect 'suggestions' for impact boosting. © 2014 Association for Computational Linguistics

Crossref

UCL Discovery

Predicting Consumers’ Decision-Making Styles by Analyzing Digital Footprints on Facebook

Author: Blei D. M.
Jeong H. J.
Jyun-Han Wu
Kotler P.
Preoţiuc-Pietro D.
Salton G.
Yu-Jen Hsu
Yuh-Jen Chen
Yuh-Min Chen
Publication venue: World Scientific Pub Co Pte Lt

Crossref

The added value of online user-generated content in traditional methods for influenza surveillance

Author: A Correa
D Lazer
D Preoţiuc-Pietro
DN Klaucke
DR Olson
E Yom-Tov
GI Eysenbach
J Ginsberg
JW Buehler
L Simonsen
M Santillana
M Wagner
MJ Paul
PM Polgreen
RR German
T Mikolov
T Vega
V Lampos
Publication venue: Springer Science and Business Media LLC

Crossref

The canary in the city: indicator groups as predictors of local rent increases

Author: A Farseev
A Noulas
A Pentland
A Poorthuis
AA Siddig
AD Singleton
B Hawelka
B Hecht
C Ratti
C Zhong
D Hristova
D Lazer
D Preoţiuc-Pietro
DJ Hammel
EA Holt
EC Delmelle
EM Hoover
F Calabrese
F Luo
J Blumenstock
J Petersen
KE Case
KP Schwirian
L Gabrielli
L Lees
M Birkin
M Kosinski
M Scheffer
M Schläpfer
M Tizzoni
N Smith
R Atkinson
R Jurdak
RJ Hill
RL Glass
S Hasan
S Isaacman
S Zukin
SR Carpenter
T Shelton
TH Rashidi
V Frias-Martinez
X Lu
Y Kryvasheyeu
Y Liu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2018

As cities grow, certain neighborhoods experience a particularly high demand for housing, resulting in escalating rents. Despite far-reaching socioeconomic consequences, it remains difficult to predict when and where urban neighborhoods will face such changes. To tackle this challenge, we adapt the concept of ‘bioindicators’, borrowed from ecology, to the urban context. The objective is to use an ‘indicator group’ of people to assess the quality of a complex environment and its changes over time. Specifically, we analyze 92 million geolocated Twitter records across five US cities, allowing us to derive socio-economic user profiles based on individual movement patterns. As a proof-of-concept, we define users with a ‘high-income-profile’ as an indicator group and show that their visitation patterns are a suitable indicator for expected future rent increases in different neighborhoods. The concept of indicator groups highlights the potential of closely monitoring only a specific subset of the population, rather than the population as a whole. If the indicator group is defined appropriately for the phenomenon of interest, this approach can yield early predictions while simultaneously reducing the amount of data that needs to be collected and analyzed.

Repository for Publications and Research Data

Crossref

Directory of Open Access Journals

DR-NTU (Digital Repository of NTU)

The individual dynamics of affective expression on social media

Author: A Beasley
AA Augustine
AB Warriner
AJ Reagan
B Rimé
C von Scheve
CE Osgood
CW Gardiner
D Garcia
D Preoţiuc-Pietro
D Ruths
DB Yaden
DL Sackett
DM Greenberg
E Dejonckheere
E Ferrara
E Kennedy-Moore
F Celli
F Schweitzer
FN Ribeiro
GY Lim
IB Wood
IM Kloumann
J Berger
J Bollen
J Boucher
JA Russell
K Niven
M Kosinski
M Skowron
M Vlasceanu
MD Gurven
MM Bradley
N Schwarz
NH Frijda
P Koval
P Kuppens
PS Dodds
R Alvarez
R Core Team
R Fan
RF Baumeister
RW Picard
S Gobron
S Mohammad
S Zheng
SA Golder
SD Gosling
WN McPhee
Z Tufekci
Publication venue: Springer Science and Business Media LLC

Crossref