

A Computational Study in the Detection of English–Spanish Code-Switches

Author: Polanco Yohamy C
Publication venue: CUNY Academic Works
Publication date: 01/02/2021

Code-switching is the linguistic phenomenon in which a multilingual person alternates between two or more languages in a conversation, whether spoken or written. This thesis studies the automatic detection of code-switching between English and Spanish in two corpora. Twitter and other social media sites provide an abundance of linguistic data that researchers can use for experiments. Collecting data is fairly easy for a study of monolingual text, but collecting code-switched data is harder because APIs accept only one language as a parameter. This thesis focuses on identifying code-switching in both Twitter data and the Miami-Bangor corpus through three experiments. The first is a logistic regression model that attempts to distinguish code-switched data from monolingual data. The second uses a novel Word2Vec average nearest neighbor (WANN) classifier, based on word embeddings, to detect code-switching. The third uses Doc2Vec, where the model uses the mean vector of each document to learn to distinguish between code-switched and monolingual data. Each experiment is performed twice, once with tweets and once with the Miami-Bangor corpus. The results show that the WANN model performs best on Twitter data and the Doc2Vec model performs best on the Miami-Bangor corpus. However, both approaches did well, and their performance is comparable.
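
The averaged-embedding, nearest-neighbor idea mentioned in the abstract can be illustrated with a small sketch. This is not the thesis's WANN or Doc2Vec implementation, whose details are not given here. The embeddings are hand-made two-dimensional toy values rather than trained Word2Vec vectors, and the names EMBEDDINGS, TRAINING, mean_vector, cosine, and classify are hypothetical. Each document is reduced to the mean of its word vectors and labeled with the class of its most similar training document.

```python
import math

# Hypothetical toy embeddings (NOT trained Word2Vec vectors): axis 0 loosely
# stands for "English-like" words, axis 1 for "Spanish-like" words.
EMBEDDINGS = {
    "i": (1.0, 0.1), "want": (0.9, 0.0), "coffee": (0.8, 0.2),
    "the": (1.0, 0.0), "house": (0.9, 0.1),
    "quiero": (0.0, 1.0), "café": (0.2, 0.9), "la": (0.1, 1.0),
    "casa": (0.0, 0.9), "pero": (0.1, 0.9),
}

# Hypothetical labeled examples standing in for a training corpus.
TRAINING = [
    ("i want the house".split(), "monolingual"),
    ("quiero la casa".split(), "monolingual"),
    ("i want la casa".split(), "code-switched"),
    ("quiero coffee pero the house".split(), "code-switched"),
]


def mean_vector(tokens):
    """Average the embeddings of the tokens that have a known vector."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        raise ValueError("no known tokens in document")
    dims = len(vectors[0])
    return tuple(sum(v[d] for v in vectors) / len(vectors) for d in range(dims))


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(tokens):
    """Return the label of the training document nearest to the query."""
    query = mean_vector(tokens)
    best_label, best_sim = None, -1.0
    for train_tokens, label in TRAINING:
        sim = cosine(query, mean_vector(train_tokens))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label


print(classify("the house pero la casa".split()))
print(classify("i want coffee".split()))
```

With real data, the toy dictionary would be replaced by vectors from a trained embedding model, and a single nearest neighbor would likely give way to a vote over several neighbors.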

City University of New York

Proceedings of the 10th Linguistic Annotation Workshop held in conjunction with ACL 2016 (LAW-X 2016), August 11, 2016, Berlin, Germany

Authors: Friedrich Annemarie; Tomanek Katrin
Publication date: 01/01/2016

OPUS Augsburg

Code-switching during church sermons: implications on language development.

Author: Dladla Celimpilo Piety
Publication date: 01/01/2017

Master of Arts in the School of Arts. University of KwaZulu-Natal, Pietermaritzburg, 2017.

ResearchSpace@UKZN