Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.
Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
Echoes of power: Language effects and power differences in social interaction
Understanding social interaction within groups is key to analyzing online
communities. Most current work focuses on structural properties: who talks to
whom, and how such interactions form larger network structures. The
interactions themselves, however, generally take place in the form of natural
language, either spoken or written, and one could reasonably suppose that
signals manifested in language might also provide information about roles,
status, and other aspects of the group's dynamics. To date, however, finding
such domain-independent language-based signals has been a challenge.
Here, we show that in group discussions power differentials between
participants are subtly revealed by how much one individual immediately echoes
the linguistic style of the person they are responding to. Starting from this
observation, we propose an analysis framework based on linguistic coordination
that can be used to shed light on power relationships and that works
consistently across multiple types of power, including a more "static" form
of power based on status differences, and a more "situational" form of power in
which one individual experiences a type of dependence on another. Using this
framework, we study how conversational behavior can reveal power relationships
in two very different settings: discussions among Wikipedians and arguments
before the U.S. Supreme Court.
Comment: v3 is the camera-ready for the Proceedings of WWW 2012. Changes from
v2 include additional technical analysis. See
http://www.cs.cornell.edu/~cristian/www2012 for data and more inf
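As a rough illustration of the echoing idea in the abstract above, the following Python sketch scores coordination as how much more likely a reply is to contain a class of style words when the message it answers contains that class, compared with the reply's overall rate. The function names, the word list, and the toy exchanges are assumptions made for illustration; they are not the authors' code or data.

```python
from typing import Iterable, Optional, Set, Tuple


def has_marker(text: str, marker_words: Set[str]) -> bool:
    """Return True if any token of `text` belongs to the marker class."""
    tokens = text.lower().split()
    return any(tok.strip(".,!?;:\"'()") in marker_words for tok in tokens)


def coordination(
    exchanges: Iterable[Tuple[str, str]], marker_words: Set[str]
) -> Optional[float]:
    """Score how strongly replies echo a marker class.

    Each exchange is (original_message, reply). The score is
    P(reply has marker | original has marker) - P(reply has marker).
    Returns None when the score is undefined (no exchanges, or no
    original message contains the marker).
    """
    exchanges = list(exchanges)
    if not exchanges:
        return None
    orig_has = [has_marker(orig, marker_words) for orig, _ in exchanges]
    reply_has = [has_marker(reply, marker_words) for _, reply in exchanges]
    n_trigger = sum(orig_has)
    if n_trigger == 0:
        return None
    p_reply = sum(reply_has) / len(exchanges)
    p_reply_given_orig = (
        sum(1 for o, r in zip(orig_has, reply_has) if o and r) / n_trigger
    )
    return p_reply_given_orig - p_reply


# Hypothetical marker class and toy data, for illustration only.
ARTICLES = {"a", "an", "the"}
toy_exchanges = [
    ("the cat sat", "a dog ran"),
    ("the end", "fine"),
    ("hello", "ok"),
    ("hi there", "sure"),
]
score = coordination(toy_exchanges, ARTICLES)
```

A positive score means replies use the marker class more often right after a message that used it. Comparing such scores between groups of repliers is one way to look for the power differences the abstract describes.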
All Who Wander: On the Prevalence and Characteristics of Multi-community Engagement
Although analyzing user behavior within individual communities is an active
and rich research domain, people usually interact with multiple communities
both on- and off-line. How do users act in such multi-community environments?
Although there are a host of intriguing aspects to this question, it has
received much less attention in the research community in comparison to the
intra-community case. In this paper, we examine three aspects of
multi-community engagement: the sequence of communities that users post to, the
language that users employ in those communities, and the feedback that users
receive, using longitudinal posting behavior on Reddit as our main data source,
and DBLP for auxiliary experiments. We also demonstrate the effectiveness of
features drawn from these aspects in predicting users' future level of
activity.
One might expect that a user's trajectory mimics the "settling-down" process
in real life: an initial exploration of sub-communities before settling down
into a few niches. However, we find that the users in our data continually post
in new communities; moreover, as time goes on, they post increasingly evenly
among a more diverse set of smaller communities. Interestingly, it seems that
users who eventually leave the community are "destined" to do so from the very
beginning, in the sense of showing significantly different "wandering" patterns
very early on in their trajectories; this finding has potentially important
design implications for community maintainers. Our multi-community perspective
also allows us to investigate the "situation vs. personality" debate from
language usage across different communities.
Comment: 11 pages, data available at
https://chenhaot.com/pages/multi-community.html, Proceedings of WWW 2015
(updated references
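One simple way to make "posting increasingly evenly among a more diverse set of communities" concrete is the Shannon entropy of a user's posts over communities within a time window. This measure is an assumption chosen for illustration and is not necessarily the one used in the paper; the function name and example data are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable


def posting_entropy(communities: Iterable[str]) -> float:
    """Shannon entropy (in bits) of a user's posts over communities.

    `communities` holds one community name per post in a time window.
    The value is 0.0 when all posts go to a single community and grows
    as posts spread more evenly over more communities.
    """
    counts = Counter(communities)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical windows for one user: early posts are concentrated,
# later posts are spread evenly over four communities.
early_window = ["pics", "pics", "pics", "pics"]
late_window = ["pics", "askscience", "books", "running"]
early_entropy = posting_entropy(early_window)
late_entropy = posting_entropy(late_window)
```

Tracking this value across successive windows would show whether a user settles into a few niches (entropy falling) or keeps wandering (entropy rising).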
Understanding The Hobbit: the cross-national and cross-linguistic reception of a global media product in Belgium, France and the Netherlands
The Hobbit franchise, like many global media products, reaches audiences worldwide. Audience members apparently consume a uniform media product. But do they? The World Hobbit Project offers a new and exciting opportunity to explore differences and similarities, for it provides us with audiences' understandings of the trilogy across languages and nationalities. In this paper we conduct a statistical analysis of differences and similarities in understandings of The Hobbit trilogy between Belgium, the Netherlands, and France, both in what audiences do and do not feel The Hobbit films to be. Analyzing this particular region in Europe provides an extraordinary opportunity, for the World Hobbit Project allows us to compare on the level of language (the Dutch- and French-speaking Belgian regions with the Netherlands and France, respectively), as well as on the level of national identities (comparing the three countries with one another). In doing so, we are able to further understand what informs geographical and linguistic differences in the consumption of a uniform media product. As such, this paper touches upon cultural hegemony, cross-border flows of fiction, language and cultural proximity.
Loyalty in Online Communities
Loyalty is an essential component of multi-community engagement. When users
have the choice to engage with a variety of different communities, they often
become loyal to just one, focusing on that community at the expense of others.
However, it is unclear how loyalty is manifested in user behavior, or whether
loyalty is encouraged by certain community characteristics.
In this paper we operationalize loyalty as a user-community relation: users
loyal to a community consistently prefer it over all others; loyal communities
retain their loyal users over time. By exploring this relation using a large
dataset of discussion communities from Reddit, we reveal that loyalty is
manifested in remarkably consistent behaviors across a wide spectrum of
communities. Loyal users employ language that signals collective identity and
engage with more esoteric, less popular content, indicating they may play a
curational role in surfacing new material. Loyal communities have denser
user-user interaction networks and lower rates of triadic closure, suggesting
that community-level loyalty is associated with more cohesive interactions and
less fragmentation into subgroups. We exploit these general patterns to predict
future rates of loyalty. Our results show that a user's propensity to become
loyal is apparent from their first interactions with a community, suggesting
that some users are intrinsically loyal from the very beginning.
Comment: Extended version of a paper appearing in the Proceedings of ICWSM
2017 (with the same title); please cite the official ICWSM versio
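The user-side definition above ("users loyal to a community consistently prefer it over all others") can be sketched in Python as follows. The 50% share threshold, the division into time periods, and all names are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from typing import List, Optional, Sequence


def preferred_community(
    posts: Sequence[str], min_share: float = 0.5
) -> Optional[str]:
    """Return the community receiving more than `min_share` of the posts.

    `posts` holds one community name per post. With min_share >= 0.5
    at most one community can qualify, so the result is unambiguous.
    Returns None if no community exceeds the threshold.
    """
    if not posts:
        return None
    community, count = Counter(posts).most_common(1)[0]
    return community if count / len(posts) > min_share else None


def loyal_community(
    posts_by_period: List[Sequence[str]], min_share: float = 0.5
) -> Optional[str]:
    """Return the community a user prefers in every period, if any.

    A user counts as loyal here only when the same community is
    preferred in each period (hypothetical operationalization).
    """
    if not posts_by_period:
        return None
    preferences = {preferred_community(p, min_share) for p in posts_by_period}
    if len(preferences) == 1:
        return next(iter(preferences))  # None if no period had a preference
    return None


# Hypothetical users: one keeps preferring "a", one switches from "a" to "b".
steady_user = [["a", "a", "b"], ["a", "a", "a", "c"]]
switching_user = [["a", "a", "b"], ["b", "b", "a"]]
```

Under this sketch, `loyal_community(steady_user)` names a single community, while the switching user has none. The community-side notion (retaining loyal users over time) could then be estimated as the fraction of a community's loyal users who remain loyal in a later period.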