Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges.
Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
Language, Twitter and Academic Conferences
Using Twitter during academic conferences is a way of engaging and connecting
an audience inherently multicultural by the nature of scientific collaboration.
English is expected to be the lingua franca bridging the communication and
integration between native speakers of different mother tongues. However,
little research has been done to support this assumption. In this paper we
examined how integrated language communities are by analyzing scholars'
tweets posted during 26 Computer Science conferences over a time span of five years.
We found that although English is the most popular language used to tweet
during conferences, a significant proportion of people also tweet in other
languages. In addition, people who tweet solely in English interact mostly
within the same group (English monolinguals), while people who speak other
languages tend to show a more diverse interaction with other language groups.
Finally, we also found that the people who interact with other Twitter users
show a more diverse language distribution, while people who do not interact
mostly post tweets in a single language. These results suggest a relation
between the number of languages a user speaks and the interaction
dynamics of online communities.
Comment: 4 pages, 3 figures, 4 tables, submitted to ACM Hypertext and Social
Media 201
Community and Social Interaction in Digital Religious Discourse in Nigeria, Ghana and Cameroon
Since the advent of the Internet, religion has maintained a very strong online presence. This study examines how African Christianity is negotiated and practised on the Internet. The main objectives are to investigate to what extent online worshippers in Nigeria, Ghana and Cameroon constitute (online) communities and how interactive the social networks of the churches are. This study shows that some important criteria for community are met by African digital worshippers. However, the interaction flow is mostly one-to-many, so members do not regularly interact with one another as they would in offline worship. Worshippers view the forums as a sacred space solely for spiritual matters and not for sharing social or individual feelings and problems. Nevertheless, the introduction of social media networks such as Facebook, Twitter, YouTube and interactive forums is an interesting and promising new development in religious worship in Africa.
Sí o no, què penses? Catalonian Independence and Linguistic Identity on Social Media
Political identity is often manifested in language variation, but the
relationship between the two is still relatively unexplored from a quantitative
perspective. This study examines the use of Catalan, a language local to the
semi-autonomous region of Catalonia in Spain, on Twitter in discourse related
to the 2017 independence referendum. We corroborate prior findings that
pro-independence tweets are more likely to include the local language than
anti-independence tweets. We also find that Catalan is used more often in
referendum-related discourse than in other contexts, contrary to prior findings
on language variation. This suggests a strong role for the Catalan language in
the expression of Catalonian political identity.
Comment: NAACL 201
Cross-language Wikipedia Editing of Okinawa, Japan
This article analyzes users who edit Wikipedia articles about Okinawa, Japan,
in English and Japanese. It finds these users are among the most active and
dedicated users in their primary languages, where they make many large,
high-quality edits. However, when these users edit in their non-primary
languages, they tend to make edits of a different type that are overall smaller
in size and more often restricted to the narrow set of articles that exist in
both languages. Design changes to motivate wider contributions from users in
their non-primary languages and to encourage multilingual users to transfer
more information across language divides are presented.
Comment: In Proceedings of the SIGCHI Conference on Human Factors in Computing
Systems, CHI 2015. AC
European Language Ecology and Bilingualism with English on Twitter
The present paper deals with Flemish adolescents' informal computer-mediated communication (CMC) in a large corpus (2.9 million tokens) of chat conversations. We analyze deviations from written standard Dutch and possible correlations with the teenagers' gender, age and educational track. The concept of non-standardness is operationalized by means of a wide range of features that serve different purposes, related to the chatspeak maxims of orality, brevity and expressiveness. It will be demonstrated how the different social variables affect non-standard writing, and, more importantly, how they interact with each other. While the findings for age and education correspond to our expectations (more non-standard markers are used by younger adolescents and students in practice-oriented educational tracks), the results for gender (no significant difference between girls and boys) do not: they call for a more fine-grained analysis of non-standard writing, in which features relating to different chat principles are examined separately.