Demographic Inference and Representative Population Estimates from Multilingual Social Media Data
Social media provide access to behavioural data at an unprecedented scale and
granularity. However, using these data to understand phenomena in a broader
population is difficult due to their non-representativeness and the bias of
statistical inference tools towards dominant languages and groups. While
demographic attribute inference could be used to mitigate such bias, current
techniques are almost entirely monolingual and fail to work in a global
environment. We address these challenges by combining multilingual demographic
inference with post-stratification to create a more representative population
sample. To learn demographic attributes, we create a new multimodal deep neural
architecture for joint classification of age, gender, and organization-status
of social media users that operates in 32 languages. This method substantially
outperforms the current state of the art while also reducing algorithmic bias. To
correct for sampling biases, we propose fully interpretable multilevel
regression methods that estimate inclusion probabilities from inferred joint
population counts and ground-truth population counts. In a large experiment
over multilingual heterogeneous European regions, we show that our demographic
inference and bias correction together allow for more accurate estimates of
populations and make a significant step towards representative social sensing
in downstream applications with multilingual social media.
Comment: 12 pages, 10 figures, Proceedings of the 2019 World Wide Web Conference (WWW '19)
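The bias-correction idea in this abstract (comparing inferred platform counts with ground-truth population counts per demographic cell) can be illustrated with a minimal sketch. It uses plain per-cell ratios and classic post-stratification rather than the multilevel regression the authors propose, and the function names and cell labels are hypothetical, chosen only for illustration.

```python
def inclusion_probabilities(inferred_counts, census_counts):
    """Estimate, per demographic cell, the share of the population on the platform.

    Assumption: a plain ratio of inferred users to census population.
    The abstract describes multilevel regression; this is a simpler stand-in.
    """
    return {
        cell: inferred_counts.get(cell, 0) / population
        for cell, population in census_counts.items()
        if population > 0
    }


def poststratified_mean(cell_means, census_counts):
    """Reweight per-cell sample means by each cell's true population share."""
    total = sum(census_counts[cell] for cell in cell_means)
    return sum(cell_means[cell] * census_counts[cell] for cell in cell_means) / total


# Hypothetical cells: (gender, age band). All numbers are made up.
inferred = {("f", "18-29"): 200, ("m", "18-29"): 300, ("f", "30+"): 100, ("m", "30+"): 400}
census = {("f", "18-29"): 1000, ("m", "18-29"): 1000, ("f", "30+"): 2000, ("m", "30+"): 2000}
cell_means = {("f", "18-29"): 0.5, ("m", "18-29"): 0.4, ("f", "30+"): 0.8, ("m", "30+"): 0.6}

probs = inclusion_probabilities(inferred, census)
corrected = poststratified_mean(cell_means, census)
```

Cells that are under-represented on the platform (low inclusion probability) receive more weight in the corrected estimate than they would in a naive average over sampled users.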
Longitudinal Study of Child Face Recognition
We present a longitudinal study of face recognition performance on the Children
Longitudinal Face (CLF) dataset containing 3,682 face images of 919 subjects,
in the age group [2, 18] years. Each subject has at least four face images
acquired over a time span of up to six years. Face comparison scores are
obtained from (i) a state-of-the-art COTS matcher (COTS-A), (ii) an open-source
matcher (FaceNet), and (iii) a simple sum fusion of scores obtained from COTS-A
and FaceNet matchers. To improve the performance of the open-source FaceNet
matcher for child face recognition, we fine-tune it on an
independent training set of 3,294 face images of 1,119 children in the age
group [3, 18] years. Multilevel statistical models are fit to genuine
comparison scores from the CLF dataset to determine the decrease in face
recognition accuracy over time. Additionally, we analyze both the verification
and open-set identification accuracies in order to evaluate state-of-the-art
face recognition technology for tracing and identifying children lost at a
young age as victims of child trafficking or abduction.
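The "simple sum fusion" of two matchers' scores mentioned in this abstract can be sketched as follows. The abstract does not say how scores are brought onto a common scale, so the min-max normalization step here is an assumption, and the function names are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def sum_fusion(scores_a, scores_b):
    """Fuse two matchers' comparison scores by adding normalized scores.

    Assumption: both lists score the same face pairs in the same order.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must cover the same comparisons")
    norm_a = min_max_normalize(scores_a)
    norm_b = min_max_normalize(scores_b)
    return [a + b for a, b in zip(norm_a, norm_b)]


# Made-up scores from two matchers on three face pairs.
fused = sum_fusion([0.0, 5.0, 10.0], [0.2, 0.4, 0.6])
```

Normalizing first keeps the matcher with the larger raw score range from dominating the sum.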
What Twitter Profile and Posted Images Reveal About Depression and Anxiety
Previous work has found strong links between the choice of social media
images and users' emotions, demographics and personality traits. In this study,
we examine which attributes of profile and posted images are associated with
depression and anxiety of Twitter users. We used a sample of 28,749 Facebook
users to build a language prediction model of survey-reported depression and
anxiety, and validated it on Twitter on a sample of 887 users who had taken
anxiety and depression surveys. We then applied it to a different set of 4,132
Twitter users to impute language-based depression and anxiety labels, and
extracted interpretable features of posted and profile pictures to uncover the
associations with users' depression and anxiety, controlling for demographics.
For depression, we find that profile pictures suppress positive emotions rather
than display more negative emotions, likely because of social media
self-presentation biases. They also tend to show only the user's face
(rather than the user in a group of friends), marking an increased focus on the
self that is emblematic of depression. Posted images are dominated by grayscale and
low aesthetic cohesion across a variety of image features. Profile images of
anxious users are similarly marked by grayscale and low aesthetic cohesion, but
less so than those of depressed users. Finally, we show that image features can
be used to predict depression and anxiety, and that multitask learning that
includes a joint modeling of demographics improves prediction performance.
Overall, we find that the image attributes that mark depression and anxiety
offer a rich lens into these conditions largely congruent with the
psychological literature, and that images on Twitter allow inferences about the
mental health status of users.
Comment: ICWSM 201
A panel model for predicting the diversity of internal temperatures from English dwellings
Using panel methods, a model for predicting daily mean internal temperature demand across a heterogeneous domestic building stock is developed. The model offers an important link that connects building stock models to human behaviour. It represents the first time a panel model has been used to estimate the dynamics of internal temperature demand from the natural daily fluctuations of external temperature combined with important behavioural, socio-demographic and building efficiency variables. The model is able to predict internal temperatures across a heterogeneous building stock to within ~0.71°C at 95% confidence and explain 45% of the variance of internal temperature between dwellings. The model confirms hypotheses from sociology and psychology that habitual behaviours are important drivers of home energy consumption. In addition, the model offers the possibility to quantify take-back (direct rebound effect) owing to increased internal temperatures from the installation of energy efficiency measures. The presence of thermostats or thermostatic radiator valves (TRVs) is shown to reduce average internal temperatures; the use of an automatic timer, however, is statistically insignificant. The number of occupants, household income and occupant age are all important factors that explain a proportion of internal temperature demand. Households with children or retired occupants are shown to have higher average internal temperatures than households without. As expected, building typology, building age, roof insulation thickness, wall U-value and the proportion of double glazing all have positive and statistically significant effects on daily mean internal temperature. In summary, the model can be used as a tool to predict internal temperatures or for making statistical inferences. However, its primary contribution is the ability to calibrate existing building stock models to account for behavioural and socio-demographic effects, making it possible to back out more accurate predictions of domestic energy demand.
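As a rough illustration of what a panel model of this kind might look like, the following is a generic random-effects specification. The abstract does not give the authors' actual equation, so every symbol here is an assumption: $T^{\text{int}}_{it}$ is the daily mean internal temperature of dwelling $i$ on day $t$, $T^{\text{ext}}_{it}$ the external temperature, $\mathbf{x}_i$ the time-invariant behavioural, socio-demographic and building efficiency variables, $u_i$ a dwelling-level effect, and $\varepsilon_{it}$ the idiosyncratic error.

```latex
T^{\text{int}}_{it} = \alpha + \beta \, T^{\text{ext}}_{it}
  + \boldsymbol{\gamma}^{\top} \mathbf{x}_i + u_i + \varepsilon_{it},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
\varepsilon_{it} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

In a form like this, $\beta$ captures the response to daily external temperature fluctuations, while $\boldsymbol{\gamma}$ captures differences between dwellings such as the presence of a thermostat or the number of occupants.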