Gender Prediction of Indonesian Twitter Users Using Tweet and Profile Features
The increasing use of social media generates huge amounts of data, which in turn triggers research into social media analytics. Social media content can be analyzed to explore public opinion on an issue or to provide insights that serve as proxy indicators of real-world events. Understanding the demographics of social media users can increase the potential for applications of sentiment analysis, topic modeling, and other analytical tasks. To map demographics, we need to know the latent attributes of users, such as age, gender, occupation and location of residence. Since these attributes are not directly available, we need to infer them from the social media data. This study aims to predict the gender attribute given a Twitter user account. We conducted experiments with several supervised classifiers with feature extraction, including the use of word embedding representations. The results of this study indicate that the combination of features extracted from Tweet contents and user profile structured data can predict the gender of Twitter users in Indonesia with accuracy above 80%.
Using social media to inform policy making: to whom are we listening?
The dominance of social media gives today’s web users a venue for expressing their views and sharing their experiences with others. With well over a billion active users, social networking sites (SNS) have become dynamic sources of information on people’s interests, needs and opinions and are considered an extremely rich source of content for reaching many millions of people. This creates a revolutionary opportunity for governments to learn about citizens and to engage with them more effectively. The potential is there for eParticipation applications to go from simply informing the public to unprecedented levels of interaction and engagement between Policy Makers (PMs) and the community, involving the public in deliberation processes leading to legislation.
Despite its great potential, several concerns arise from the exploitation of social media, especially when used to inform policy making. Among these issues we can highlight the lack of awareness of the characteristics of those citizens discussing policy topics in social media, and the lack of awareness of the characteristics of their discussions. Although some studies have emerged in the last few years that aim to capture the demographics of social media users (e.g., gender, age, geographical location), they tend not to focus on those specific users participating in policy discussions. Understanding who the users discussing policy in social media are, and how policy topics are debated, could help assess how their views and opinions should be weighted and considered to inform policy making.
Aiming to provide a step forward in this direction, this paper investigates the characteristics of over 8K users involved in policy discussions on Twitter. These discussions were collected by monitoring, for one week, 42 different political topics selected by sixteen PMs from different political institutions in Germany. Our results indicate that: (i) a high volume of conversations around policy topics does not come from citizens, but from news agencies and other organisations; (ii) the average user discussing policy topics on Twitter is more active, popular and engaged than the average Twitter user; and (iii) users engaged in social media conversations around policy topics tend to be geographically concentrated in constituencies with high population density. Regarding the analysed conversations, a small subset of topics is extensively discussed while the majority go relatively unnoticed.
#greysanatomy vs. #yankees: Demographics and Hashtag Use on Twitter
Demographics, in particular gender, age, and race, are a key predictor of
human behavior. Despite the significant effect that demographics have, most
scientific studies using online social media do not consider this factor,
mainly due to the lack of such information. In this work, we use
state-of-the-art face analysis software to infer gender, age, and race from
profile images of 350K Twitter users from New York. For the period from
November 1, 2014 to October 31, 2015, we study which hashtags are used by
different demographic groups. Though we find considerable overlap for the most
popular hashtags, there are also many group-specific hashtags.
Comment: This is a preprint of an article appearing at ICWSM 201
Where are my followers? Understanding the Locality Effect in Twitter
Twitter is one of the most used applications on the current Internet, with
more than 200M accounts created so far. Like other large-scale systems, Twitter
can benefit from exploiting the Locality effect existing among its users.
In this paper we perform the first comprehensive study of the Locality effect
of Twitter. For this purpose we have collected the geographical location of
around 1M Twitter users and 16M of their followers. Our results demonstrate
that language and cultural characteristics determine the level of Locality
expected for different countries. Countries whose language is not English,
such as Brazil, typically show a high intra-country Locality, whereas
countries where English is an official or co-official language suffer
from an external Locality effect. That is, their users have a larger number of
followers in the US than within their own country. There are two reasons for
this: first, the US is the dominant country in Twitter, accounting for around
half of the users, and second, these countries share a common language and
cultural characteristics with the US.
White, Man, and Highly Followed: Gender and Race Inequalities in Twitter
Social media is considered a democratic space in which people connect and
interact with each other regardless of their gender, race, or any other
demographic factor. Despite numerous efforts that explore demographic factors
in social media, it is still unclear whether social media perpetuates old
inequalities from the offline world. In this paper, we attempt to identify
gender and race of Twitter users located in the U.S. using advanced image
processing algorithms from Face++. Then, we investigate how different
demographic groups (i.e. male/female, Asian/Black/White) connect with each
other. We quantify the extent to which one group follows and interacts with
another, and the extent to which these connections and interactions are
reflected in inequalities in
Twitter. Our analysis shows that users identified as White and male tend to
attain higher positions in Twitter, in terms of the number of followers and
number of times they appear in users' lists. We hope our effort can stimulate
the development of new theories of demographic information in the online space.
Comment: In Proceedings of the IEEE/WIC/ACM International Conference on Web
Intelligence (WI'17). Leipzig, Germany. August 201
“I just want to be skinny.”: A content analysis of tweets expressing eating disorder symptoms
There is increasing concern about online communities that promote eating disorder (ED) behaviors through messages and/or images that encourage a “thin ideal” (i.e., promotion of thinness as attractive) and harmful weight loss/weight control practices. The purpose of this paper is to assess body image and ED-related content on Twitter and provide a deeper understanding of EDs that may be used for future studies and online-based interventions. Tweets containing ED or body image-related keywords were collected from January 1 to January 31, 2015 (N = 28,642). A random sample (n = 3000) was assessed for expressions of behaviors that align with subscales of the Eating Disorder Examination (EDE) 16.0. Demographic characteristics were inferred using a social media analytics company. We found that 2,584 of the 3,000 tweets were ED-related; 65% expressed a preoccupation with body shape, 13% displayed issues related to food/eating/calories, and 4% expressed placing a high level of importance on body weight. Most tweets were sent by girls (90%) who were ≤19 years old (77%). Our findings stress a need to better understand if and how ED-related content on social media can be used for targeting prevention and intervention messages towards those who are in need and could potentially benefit from these efforts.