Demographic Inference and Representative Population Estimates from Multilingual Social Media Data

Author: Alzahrani Sultan
Bergsma Shane
Bethlehem Jelke G
Buolamwini Joy
Chen Xin
Ciot Morgane
Compton Ryan
Goot Rob
Goswami Sumit
Hecht Brent
Huang Gao
Jung Soon-Gyo
Kim Yoon
McCorriston James
Mislove Alan
Nguyen Dong
Rosenthal Sara
Sap Maarten
Schler Jonathan
Zamal Faiyaz Al
Zhang Jinxue
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2019

Social media provide access to behavioural data at an unprecedented scale and granularity. However, using these data to understand phenomena in a broader population is difficult because the data are non-representative and statistical inference tools are biased towards dominant languages and groups. While demographic attribute inference could be used to mitigate such bias, current techniques are almost entirely monolingual and fail to work in a global environment. We address these challenges by combining multilingual demographic inference with post-stratification to create a more representative population sample. To learn demographic attributes, we create a new multimodal deep neural architecture for joint classification of age, gender, and organization-status of social media users that operates in 32 languages. This method substantially outperforms the current state of the art while also reducing algorithmic bias. To correct for sampling biases, we propose fully interpretable multilevel regression methods that estimate inclusion probabilities from inferred joint population counts and ground-truth population counts. In a large experiment over multilingual heterogeneous European regions, we show that our demographic inference and bias correction together allow for more accurate estimates of populations and make a significant step towards representative social sensing in downstream applications with multilingual social media. Comment: 12 pages, 10 figures, Proceedings of the 2019 World Wide Web Conference (WWW '19)
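
The bias-correction idea in the abstract above (estimating inclusion probabilities from inferred counts and ground-truth counts) can be illustrated with a minimal sketch. This is plain cell-wise post-stratification, not the multilevel regression method the paper proposes; the function names, the demographic cells, and all numbers are hypothetical and chosen only for illustration.

```python
def poststratification_weights(sample_counts, population_counts):
    """Cell-wise weights: inverse of the estimated inclusion probability.

    sample_counts: users inferred to fall in each demographic cell (hypothetical).
    population_counts: ground-truth (e.g. census) count for the same cells.
    """
    weights = {}
    for cell, population in population_counts.items():
        n = sample_counts.get(cell, 0)
        if n == 0 or population == 0:
            continue  # an unobserved cell cannot be reweighted
        inclusion_prob = n / population  # share of the cell that is in the sample
        weights[cell] = 1.0 / inclusion_prob
    return weights


def weighted_mean(cell_means, sample_counts, weights):
    """Population estimate of a quantity measured per cell in the sample."""
    total = sum(sample_counts[c] * weights[c] for c in weights)
    return sum(cell_means[c] * sample_counts[c] * weights[c] for c in weights) / total


# Illustrative numbers only: younger users are over-represented in the sample.
sample = {("18-29", "F"): 400, ("18-29", "M"): 500, ("40+", "F"): 50, ("40+", "M"): 50}
census = {("18-29", "F"): 1000, ("18-29", "M"): 1000, ("40+", "F"): 2000, ("40+", "M"): 2000}
means = {("18-29", "F"): 0.2, ("18-29", "M"): 0.4, ("40+", "F"): 0.6, ("40+", "M"): 0.8}

w = poststratification_weights(sample, census)
estimate = weighted_mean(means, sample, w)
```

With these weights each cell contributes in proportion to its census size rather than its sample size, so the under-sampled older cells are no longer swamped by the over-sampled younger ones.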

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

Universaar

Longitudinal Study of Child Face Recognition

Author: Deb Debayan
Jain Anil K.
Nain Neeta
Publication date: 10/11/2017

We present a longitudinal study of face recognition performance on the Children Longitudinal Face (CLF) dataset, which contains 3,682 face images of 919 subjects in the age group [2, 18] years. Each subject has at least four face images acquired over a time span of up to six years. Face comparison scores are obtained from (i) a state-of-the-art COTS matcher (COTS-A), (ii) an open-source matcher (FaceNet), and (iii) a simple sum fusion of scores obtained from the COTS-A and FaceNet matchers. To improve the performance of the open-source FaceNet matcher for child face recognition, we fine-tune it on an independent training set of 3,294 face images of 1,119 children in the age group [3, 18] years. Multilevel statistical models are fit to genuine comparison scores from the CLF dataset to determine the decrease in face recognition accuracy over time. Additionally, we analyze both the verification and open-set identification accuracies in order to evaluate state-of-the-art face recognition technology for tracing and identifying children lost at a young age as victims of child trafficking or abduction.
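
As a rough illustration of the "simple sum fusion" of matcher scores mentioned in the abstract above, the sketch below adds the scores of two matchers for the same comparisons. The min-max normalization step is an assumption made here so that the two score scales are comparable; the abstract does not say how, or whether, scores were normalized, and the function names and example scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale scores to [0, 1]; an assumed preprocessing step, not from the abstract."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]  # degenerate case: all scores identical
    return [(s - lo) / (hi - lo) for s in scores]


def sum_fusion(scores_a, scores_b):
    """Fuse two matchers by summing their normalized scores per comparison."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both matchers must score the same comparisons")
    norm_a = min_max_normalize(scores_a)
    norm_b = min_max_normalize(scores_b)
    return [a + b for a, b in zip(norm_a, norm_b)]


# Hypothetical scores for three face comparisons from two matchers.
fused = sum_fusion([0.0, 5.0, 10.0], [0.2, 0.4, 0.6])
```

A higher fused score indicates that both matchers, taken together, consider the two images more likely to show the same person.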

arXiv.org e-Print Archive

Crossref

What Twitter Profile and Posted Images Reveal About Depression and Anxiety

Author: Eichstaedt Johannes C.
Guntuku Sharath Chandra
Preotiuc-Pietro Daniel
Ungar Lyle H.
Publication date: 04/04/2019

Previous work has found strong links between the choice of social media images and users' emotions, demographics and personality traits. In this study, we examine which attributes of profile and posted images are associated with depression and anxiety of Twitter users. We used a sample of 28,749 Facebook users to build a language-based prediction model of survey-reported depression and anxiety, and validated it on Twitter on a sample of 887 users who had taken anxiety and depression surveys. We then applied it to a different set of 4,132 Twitter users to impute language-based depression and anxiety labels, and extracted interpretable features of posted and profile pictures to uncover the associations with users' depression and anxiety, controlling for demographics. For depression, we find that profile pictures suppress positive emotions rather than display more negative emotions, likely because of social media self-presentation biases. They also tend to show the single face of the user (rather than show her in groups of friends), marking an increased focus on the self, which is emblematic of depression. Posted images are dominated by grayscale and low aesthetic cohesion across a variety of image features. Profile images of anxious users are similarly marked by grayscale and low aesthetic cohesion, but less so than those of depressed users. Finally, we show that image features can be used to predict depression and anxiety, and that multitask learning that includes a joint modeling of demographics improves prediction performance. Overall, we find that the image attributes that mark depression and anxiety offer a rich lens into these conditions largely congruent with the psychological literature, and that images on Twitter allow inferences about the mental health status of users. Comment: ICWSM 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Age estimation from face images: Human vs. machine performance

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

A panel model for predicting the diversity of internal temperatures from English dwellings

Author: Crawford-Brown D
Gentry MI
Kelly S
Lomas K
Pollitt M
Shipworth DT
Shipworth MD
Wright A
Publication venue: Tyndall Centre for Climate Change Research
Publication date: 01/01/2012

Using panel methods, a model for predicting daily mean internal temperature demand across a heterogeneous domestic building stock is developed. The model offers an important link that connects building stock models to human behaviour. It represents the first time a panel model has been used to estimate the dynamics of internal temperature demand from the natural daily fluctuations of external temperature combined with important behavioural, socio-demographic and building efficiency variables. The model is able to predict internal temperatures across a heterogeneous building stock to within ~0.71°C at 95% confidence and explains 45% of the variance of internal temperature between dwellings. The model confirms hypotheses from sociology and psychology that habitual behaviours are important drivers of home energy consumption. In addition, the model offers the possibility to quantify take-back (direct rebound effect) owing to increased internal temperatures from the installation of energy efficiency measures. The presence of thermostats or thermostatic radiator valves (TRV) is shown to reduce average internal temperatures; however, the use of an automatic timer is statistically insignificant. The number of occupants, household income and occupant age are all important factors that explain a proportion of internal temperature demand. Households with children or retired occupants are shown to have higher average internal temperatures than households without. As expected, building typology, building age, roof insulation thickness, wall U-value and the proportion of double glazing all have positive and statistically significant effects on daily mean internal temperature. In summary, the model can be used as a tool to predict internal temperatures or for making statistical inferences. However, its primary contribution is the ability to calibrate existing building stock models to account for behaviour and socio-demographic effects, making it possible to back out more accurate predictions of domestic energy demand.

UCL Discovery