1 research outputs found

    Token Level Code-Switching Detection Using Wikipedia as a Lexical Resource

    No full text
    We present a novel lexicon-based classification approach for code-switching detection on Twitter. The main aim is to develop a simple lexical look-up classifier based on frequency information retrieved from Wikipedia. We evaluate the classifier using three different language pairs: Spanish-English, Dutch-English, and German-Turkish. The results indicate that our figures for Spanish-English are competitive with current state of the art classifiers, even though the approach is simplistic and based solely on word frequency information
    corecore