43,082 research outputs found

    Demographic Inference and Representative Population Estimates from Multilingual Social Media Data

    Get PDF
    Social media provide access to behavioural data at an unprecedented scale and granularity. However, using these data to understand phenomena in a broader population is difficult due to their non-representativeness and the bias of statistical inference tools towards dominant languages and groups. While demographic attribute inference could be used to mitigate such bias, current techniques are almost entirely monolingual and fail to work in a global environment. We address these challenges by combining multilingual demographic inference with post-stratification to create a more representative population sample. To learn demographic attributes, we create a new multimodal deep neural architecture for joint classification of age, gender, and organization-status of social media users that operates in 32 languages. This method substantially outperforms current state of the art while also reducing algorithmic bias. To correct for sampling biases, we propose fully interpretable multilevel regression methods that estimate inclusion probabilities from inferred joint population counts and ground-truth population counts. In a large experiment over multilingual heterogeneous European regions, we show that our demographic inference and bias correction together allow for more accurate estimates of populations and make a significant step towards representative social sensing in downstream applications with multilingual social media.Comment: 12 pages, 10 figures, Proceedings of the 2019 World Wide Web Conference (WWW '19

    Longitudinal Study of Child Face Recognition

    Full text link
    We present a longitudinal study of face recognition performance on Children Longitudinal Face (CLF) dataset containing 3,682 face images of 919 subjects, in the age group [2, 18] years. Each subject has at least four face images acquired over a time span of up to six years. Face comparison scores are obtained from (i) a state-of-the-art COTS matcher (COTS-A), (ii) an open-source matcher (FaceNet), and (iii) a simple sum fusion of scores obtained from COTS-A and FaceNet matchers. To improve the performance of the open-source FaceNet matcher for child face recognition, we were able to fine-tune it on an independent training set of 3,294 face images of 1,119 children in the age group [3, 18] years. Multilevel statistical models are fit to genuine comparison scores from the CLF dataset to determine the decrease in face recognition accuracy over time. Additionally, we analyze both the verification and open-set identification accuracies in order to evaluate state-of-the-art face recognition technology for tracing and identifying children lost at a young age as victims of child trafficking or abduction

    What Twitter Profile and Posted Images Reveal About Depression and Anxiety

    Full text link
    Previous work has found strong links between the choice of social media images and users' emotions, demographics and personality traits. In this study, we examine which attributes of profile and posted images are associated with depression and anxiety of Twitter users. We used a sample of 28,749 Facebook users to build a language prediction model of survey-reported depression and anxiety, and validated it on Twitter on a sample of 887 users who had taken anxiety and depression surveys. We then applied it to a different set of 4,132 Twitter users to impute language-based depression and anxiety labels, and extracted interpretable features of posted and profile pictures to uncover the associations with users' depression and anxiety, controlling for demographics. For depression, we find that profile pictures suppress positive emotions rather than display more negative emotions, likely because of social media self-presentation biases. They also tend to show the single face of the user (rather than show her in groups of friends), marking increased focus on the self, emblematic for depression. Posted images are dominated by grayscale and low aesthetic cohesion across a variety of image features. Profile images of anxious users are similarly marked by grayscale and low aesthetic cohesion, but less so than those of depressed users. Finally, we show that image features can be used to predict depression and anxiety, and that multitask learning that includes a joint modeling of demographics improves prediction performance. Overall, we find that the image attributes that mark depression and anxiety offer a rich lens into these conditions largely congruent with the psychological literature, and that images on Twitter allow inferences about the mental health status of users.Comment: ICWSM 201

    Age estimation from face images: Human vs. machine performance

    Full text link

    A panel model for predicting the diversity of internal temperatures from English dwellings

    Get PDF
    Using panel methods, a model for predicting daily mean internal temperature demand across a heterogeneous domestic building stock is developed. The model offers an important link that connects building stock models to human behaviour. It represents the first time a panel model has been used to estimate the dynamics of internal temperature demand from the natural daily fluctuations of external temperature combined with important behavioural, socio-demographic and building efficiency variables. The model is able to predict internal temperatures across a heterogeneous building stock to within ~0.71°C at 95% confidence and explain 45% of the variance of internal temperature between dwellings. The model confirms hypothesis from sociology and psychology that habitual behaviours are important drivers of home energy consumption. In addition, the model offers the possibility to quantify take-back (direct rebound effect) owing to increased internal temperatures from the installation of energy efficiency measures. The presence of thermostats or thermostatic radiator valves (TRV) are shown to reduce average internal temperatures, however, the use of an automatic timer is statistically insignificant. The number of occupants, household income and occupant age are all important factors that explain a proportion of internal temperature demand. Households with children or retired occupants are shown to have higher average internal temperatures than households who do not. As expected, building typology, building age, roof insulation thickness, wall U-value and the proportion of double glazing all have positive and statistically significant effects on daily mean internal temperature. In summary, the model can be used as a tool to predict internal temperatures or for making statistical inferences. However, its primary contribution offers the ability to calibrate existing building stock models to account for behaviour and socio-demographic effects making it possible to back-out more accurate predictions of domestic energy demand
    corecore