    Computational Sociolinguistics: A Survey

    Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges. Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

    Language, Twitter and Academic Conferences

    Using Twitter during academic conferences is a way of engaging and connecting an audience that is inherently multicultural by the nature of scientific collaboration. English is expected to be the lingua franca bridging communication and integration between speakers of different mother tongues. However, little research has been done to support this assumption. In this paper we analyze how integrated language communities are by examining scholars' tweets from 26 Computer Science conferences over a time span of five years. We found that although English is the most popular language used to tweet during conferences, a significant proportion of people also tweet in other languages. In addition, people who tweet solely in English interact mostly within the same group (English monolinguals), while people who speak other languages tend to show more diverse interaction with other language groups. Finally, we also found that people who interact with other Twitter users show a more diverse language distribution, while people who do not interact mostly post tweets in a single language. These results suggest that the number of languages a user speaks can affect the interaction dynamics of online communities. Comment: 4 pages, 3 figures, 4 tables, submitted to ACM Hypertext and Social Media 201

    Community and Social Interaction in Digital Religious Discourse in Nigeria, Ghana and Cameroon

    Since the advent of the Internet, religion has maintained a very strong online presence. This study examines how African Christianity is negotiated and practised on the Internet. The main objectives are to investigate to what extent online worshippers in Nigeria, Ghana and Cameroon constitute (online) communities and how interactive the social networks of the churches are. This study shows that some important criteria for community are met by African digital worshippers. However, interaction flow is mostly one-to-many, so members do not regularly interact with one another as they would in offline worship. Worshippers view the forums as a sacred space solely for spiritual matters and not for sharing social or individual feelings and problems. However, the introduction of social media networks such as Facebook, Twitter, YouTube and interactive forums is an interesting and promising new development in religious worship in Africa.

    Political discourse in post-digital societies

    Sí o no, què penses? Catalonian Independence and Linguistic Identity on Social Media

    Political identity is often manifested in language variation, but the relationship between the two is still relatively unexplored from a quantitative perspective. This study examines the use of Catalan, a language local to the semi-autonomous region of Catalonia in Spain, on Twitter in discourse related to the 2017 independence referendum. We corroborate prior findings that pro-independence tweets are more likely to include the local language than anti-independence tweets. We also find that Catalan is used more often in referendum-related discourse than in other contexts, contrary to prior findings on language variation. This suggests a strong role for the Catalan language in the expression of Catalonian political identity. Comment: NAACL 201

    Cross-language Wikipedia Editing of Okinawa, Japan

    This article analyzes users who edit Wikipedia articles about Okinawa, Japan, in English and Japanese. It finds these users are among the most active and dedicated users in their primary languages, where they make many large, high-quality edits. However, when these users edit in their non-primary languages, they tend to make edits of a different type that are overall smaller in size and more often restricted to the narrow set of articles that exist in both languages. Design changes to motivate wider contributions from users in their non-primary languages and to encourage multilingual users to transfer more information across language divides are presented. Comment: In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 2015. AC

    European Language Ecology and Bilingualism with English on Twitter

    The present paper deals with Flemish adolescents' informal computer-mediated communication (CMC) in a large corpus (2.9 million tokens) of chat conversations. We analyze deviations from written standard Dutch and possible correlations with the teenagers' gender, age and educational track. The concept of non-standardness is operationalized by means of a wide range of features that serve different purposes, related to the chatspeak maxims of orality, brevity and expressiveness. It will be demonstrated how the different social variables impact on non-standard writing, and, more importantly, how they interact with each other. While the findings for age and education correspond to our expectations (more non-standard markers are used by younger adolescents and students in practice-oriented educational tracks), the results for gender (no significant difference between girls and boys) do not: they call for a more fine-grained analysis of non-standard writing, in which features relating to different chat principles are examined separately.