Computational Sociolinguistics: A Survey
Language is a social phenomenon and variation is inherent to its social
nature. Recently, there has been a surge of interest within the computational
linguistics (CL) community in the social dimension of language. In this article
we present a survey of the emerging field of "Computational Sociolinguistics"
that reflects this increased interest. We aim to provide a comprehensive
overview of CL research on sociolinguistic themes, featuring topics such as the
relation between language and social identity, language use in social
interaction and multilingual communication. Moreover, we demonstrate the
potential for synergy between the research communities involved, by showing how
the large-scale data-driven methods that are widely used in CL can complement
existing sociolinguistic studies, and how sociolinguistics can inform and
challenge the methods and assumptions employed in CL studies. We hope to convey
the possible benefits of a closer collaboration between the two communities and
conclude with a discussion of open challenges. Comment: To appear in Computational Linguistics. Accepted for publication:
18th February, 201
Traditional magic or European occultism? Commercial fortune-telling and magic in post-Soviet Russia and their relationship to Russian tradition
The article examines the vibrant commercial magic and fortune-telling industry in Russia today. Based on fieldwork in Petersburg conducted in 2006, supplemented by printed and, in particular, web material, it seeks to show that, despite the many similarities with its counterparts in Europe and North America, Russian fortune-telling and magic are clearly shaped by local traditions. In the context of the article, tradition is taken to include not just rural folk magic and divination, but also urban traditions of the late imperial period as well as those resulting from Soviet policies and practices. It emerges that, as far as magic services are concerned, the range of services offered is that demanded by the client, largely stemming from folk tradition. By contrast, discourse, approach and ritual often owe much to Western esoteric literature, and perhaps also to pre-Revolutionary occultism and the Soviet interest in psychics. In the case of fortune-telling, today’s professionals (gypsies apart) have adopted more complex and sophisticated ways of telling the future (tarot and astrology): the old ways of fortune-telling are so widely known that professionals must offer clients something different. Tradition survives in many ways, sometimes transmuted, sometimes partial, but it makes the Russian magic and fortune-telling scene distinctive.
A Decade of Shared Tasks in Digital Text Forensics at PAN
Digital text forensics aims at examining the originality and
credibility of information in electronic documents and, in this regard, at extracting and analyzing information about the authors of these documents. The research field has developed substantially during the last decade. PAN is a series of shared tasks that started in 2009 and has contributed significantly to drawing the research community's attention to well-defined digital text forensics tasks. Several benchmark datasets have been developed to assess the state-of-the-art performance in a wide range of tasks. In this paper, we present the evolution of both the examined tasks and the developed datasets during the last decade. We also briefly introduce the upcoming PAN 2019 shared tasks. We are indebted to many colleagues and friends who contributed greatly to PAN's tasks: Maik Anderka, Shlomo Argamon, Alberto Barrón-Cedeño, Fabio Celli, Fabio Crestani, Walter Daelemans, Andreas Eiselt, Tim Gollub,
Parth Gupta, Matthias Hagen, Teresa Holfeld, Patrick Juola, Giacomo Inches, Mike
Kestemont, Moshe Koppel, Manuel Montes-y-Gómez, Aurelio Lopez-Lopez, Francisco
Rangel, Miguel Angel Sánchez-Pérez, Günther Specht, Michael Tschuggnall, and Ben
Verhoeven. Our special thanks go to PAN's sponsors throughout the years and not
least to the hundreds of participants. Potthast, M.; Rosso, P.; Stamatatos, E.; Stein, B. (2019). A Decade of Shared Tasks in Digital Text Forensics at PAN. Lecture Notes in Computer Science. 11438:291-300. https://doi.org/10.1007/978-3-030-15719-7_39
Stance Prediction for Russian: Data and Analysis
Stance detection is a critical component of rumour and fake news
identification. It involves extracting the stance a particular author
takes towards a given claim, where both are expressed in text. This paper investigates
stance classification for Russian. It introduces a new dataset, RuStance, of
Russian tweets and news comments from multiple sources, covering multiple
stories, together with text classification approaches to stance detection that
serve as benchmarks on this data. As well as presenting this
openly-available dataset, the first of its kind for Russian, the paper presents
a baseline for stance prediction in the language.