Studying Migrant Assimilation Through Facebook Interests
Migrants' assimilation is a major challenge for European societies, in part
because of the sudden surge of refugees in recent years and in part because of
long-term demographic trends. In this paper, we use Facebook's data for
advertisers to study the levels of assimilation of Arabic-speaking migrants in
Germany, as seen through the interests they express online. Our results
indicate a gradient of assimilation along demographic lines, language spoken
and country of origin. Given the difficulty of collecting timely migration data,
in particular for traits related to cultural assimilation, the methods that we
develop and the results that we provide open new lines of research that
computational social scientists are well-positioned to address.
Comment: Accepted as a short paper at Social Informatics 2018
(https://socinfo2018.hse.ru/). Please cite the SocInfo versio
White, Man, and Highly Followed: Gender and Race Inequalities in Twitter
Social media is considered a democratic space in which people connect and
interact with each other regardless of their gender, race, or any other
demographic factor. Despite numerous efforts that explore demographic factors
in social media, it is still unclear whether social media perpetuates old
inequalities from the offline world. In this paper, we attempt to identify
the gender and race of Twitter users located in the U.S. using advanced image
processing algorithms from Face++. Then, we investigate how different
demographic groups (i.e. male/female, Asian/Black/White) connect with each
other. We quantify the extent to which these groups follow and interact with
one another and the extent to which these connections and interactions are
reflected in inequalities on Twitter. Our analysis shows that users identified
as White and male tend to attain higher positions on Twitter, in terms of the
number of followers and the number of times they appear in users' lists. We
hope our effort can stimulate the development of new theories of demographic
information in the online space.
Comment: In Proceedings of the IEEE/WIC/ACM International Conference on Web
Intelligence (WI'17). Leipzig, Germany. August 201
Identifying tweets from Syria refugees using a Random Forest classifier
Social unrest and a violent atmosphere can force a vast number of people to flee their country. While governments and international aid organizations need migration data to inform their decisions, this data is often delayed because it is tedious to collect and publish. Recent studies have recognized the increasing use of social networking platforms among refugees to seek help and express their hardship during their journeys. This paper investigates the feasibility of accurately extracting and identifying tweets from Syrian refugees. A robust framework has been developed to find, retrieve, clean and classify tweets from Syria. This includes the development of a Random Forest classifier, which automatically determines which tweets are from Syrian refugees. Testing the classifier with samples of historical Twitter data produced a promising correct classification rate of 81%. This preliminary study demonstrates that refugees' messages can potentially be identified accurately and extracted from social media data mixed with many unwanted messages, which enables further work on studying refugee issues and predicting migration patterns.
Big Data and Migration Statistics
This article presents the first part of a work devoted to the application of innovative approaches to migration statistics, their directions and priorities. It focuses on the emerging use of Big Data in measuring migration. It is concluded that in the foreseeable future, Big Data will find its niche among the sources of information on population movements. At present, however, it can only be used to estimate various forms of short-term population mobility and shifts in spatial distribution at certain moments or over certain periods of time. The criteria for identifying migrants and migration that are used in official statistics, above all the concept of place of usual residence, cannot be applied to Big Data. Another important limitation is the lack of variables characterizing the structure of migration flows and stocks. It is concluded that Big Data is not yet suitable as an alternative to the traditional sources of information for producing reliable and comprehensive statistics on migration. The potential of the latter is far from exhausted, but the current situation is characterized by a complex of problems that require the implementation of advanced technological solutions. Hopes for an improvement in the situation are associated with the establishment of the population register of Russia.
Inferring Social Media Users’ Demographics from Profile Pictures: A Face++ Analysis on Twitter Users
In this research, we evaluate the applicability of using facial recognition on social media profile pictures to infer the gender, race, and age of the account owners, leveraging a well-known commercial image service, Face++. Our goal is to determine the feasibility of this approach for actual system implementation. Using a dataset of approximately 10,000 Twitter profile pictures, we use Face++ to classify this set of images for gender, race, and age. We determine that about 30% of these profile pictures contain identifiable images of people using the current state-of-the-art automated means. We then employ human evaluations to manually tag both the set of images that were determined to contain faces and the set that was determined not to contain faces, comparing the results to Face++. Of the thirty percent that Face++ identified as containing a face, about 80% are more likely than not the account holder based on our manual classification, with a variety of issues in the remaining 20%. For the images in which Face++ was unable to detect a face even though one actually appeared, we isolate a variety of likely issues preventing detection. Overall, we find the applicability of automatic facial recognition to infer demographics for system development to be problematic, despite the reported high accuracy achieved on image test collections.
What Twitter Profile and Posted Images Reveal About Depression and Anxiety
Previous work has found strong links between the choice of social media
images and users' emotions, demographics and personality traits. In this study,
we examine which attributes of profile and posted images are associated with
depression and anxiety of Twitter users. We used a sample of 28,749 Facebook
users to build a language prediction model of survey-reported depression and
anxiety, and validated it on Twitter on a sample of 887 users who had taken
anxiety and depression surveys. We then applied it to a different set of 4,132
Twitter users to impute language-based depression and anxiety labels, and
extracted interpretable features of posted and profile pictures to uncover the
associations with users' depression and anxiety, controlling for demographics.
For depression, we find that profile pictures suppress positive emotions rather
than display more negative emotions, likely because of social media
self-presentation biases. They also tend to show the single face of the user
(rather than show her in groups of friends), marking increased focus on the
self, emblematic for depression. Posted images are dominated by grayscale and
low aesthetic cohesion across a variety of image features. Profile images of
anxious users are similarly marked by grayscale and low aesthetic cohesion, but
less so than those of depressed users. Finally, we show that image features can
be used to predict depression and anxiety, and that multitask learning that
includes a joint modeling of demographics improves prediction performance.
Overall, we find that the image attributes that mark depression and anxiety
offer a rich lens into these conditions largely congruent with the
psychological literature, and that images on Twitter allow inferences about the
mental health status of users.
Comment: ICWSM 201