
    Assessing the Nigerianness of SMS Text-Messages in English

    In the history of the English language certain developments have left significant linguistic marks on the language. As new developments and cultural forms occur, new words and styles of expression evolve with them and spread. This is true of the new linguistic style that is associated with the Global System for Mobile Communications (GSM) revolution in Nigeria since 2001. GSM has brought with it a variety of English that is situationally distinctive and context sensitive (Awonusi, 2004:45).

    Towards shared datasets for normalization research

    In this paper we present a Dutch and English dataset that can serve as a gold standard for evaluating text normalization approaches. With the combination of text messages, message board posts and tweets, these datasets represent a variety of user-generated content. All data was manually normalized to its standard form using newly developed guidelines. We perform automatic lexical normalization experiments on these datasets using statistical machine translation techniques. We focus on both the word and character level and find that we can improve the BLEU score by ca. 20% for both languages. In order for this user-generated content data to be released publicly to the research community, some issues first need to be resolved. These are discussed in closer detail by focussing on the current legislation and by investigating previous similar data collection projects. With this discussion we hope to shed some light on various difficulties researchers are facing when trying to share social media data.

    Gender Differences in the Text Messaging of Young Jordanian University Students: An Analysis of Linguistic Feature

    In spite of being extensively studied in face-to-face communication, gender differences remain widely unexplored within text messaging. The objectives of this study are to explore gender differences in the use of linguistic features in the text messaging of young Jordanian male and female university students with regard to (1) lexical features (abbreviations, acronyms, shortenings, borrowing, derivation, blending, compounding, and conversion), (2) syntactic features (deletion of subject pronoun, deletion of subject pronoun and auxiliary, deletion of copular/modal verb, and deletion of article), and (3) typographical features (punctuation, letter and number homophones, phonetic spellings, onomatopoeic words, and emoticons). Theoretically, the study is guided by Bodomo and Lee's model of Technology-conditioned Language Change and Use and Herring's approach of Computer-Mediated Discourse Analysis. Three techniques of qualitative data collection were used: open-ended questionnaires, user diaries and semi-structured interviews to elicit information on the features reflected in the text messages of the students. One hundred students responded to a questionnaire while twenty students participated in semi-structured interviews. The sixty students who participated in the user diaries provided a corpus of 1,612 text messages which were analyzed according to the gender of the senders. The messages were also analyzed for occurrences of lexical, syntactic, and typographical features, and compared for differences across gender. Lexical features were categorized based on Yule's (2009) categorization of word-formation processes while syntactic and typographical features were categorized according to Hård af Segerstad's (2002) and Thurlow's (2003) typology of linguistic features of text messaging. The findings of this study reveal the existence of gender differences in the text messages of the Jordanian students in all three categories of linguistic features. The females tend to use more lexical features than males, whereas the males tend to favor the deletion of syntactic features more than females. In terms of typographical features, the males tend to use more letter and number homophones and phonetic spelling than females while the females tend to use more punctuation, onomatopoeic words and emoticons than males. The findings corroborate previous findings on differences across gender in text messaging as well as in face-to-face and computer-mediated communication. This study contributes to the literature related to the study of language in terms of the use of some of the linguistic features and their variations in text messaging between males and females. Some implications and recommendations are provided in this study.

    Information-theoretic causal inference of lexical flow

    This volume seeks to infer large phylogenetic networks from phonetically encoded lexical data and contribute in this way to the historical study of language varieties. The technical step that enables progress in this case is the use of causal inference algorithms. Sample sets of words from language varieties are preprocessed into automatically inferred cognate sets, and then modeled as information-theoretic variables based on an intuitive measure of cognate overlap. Causal inference is then applied to these variables in order to determine the existence and direction of influence among the varieties. The directed arcs in the resulting graph structures can be interpreted as reflecting the existence and directionality of lexical flow, a unified model which subsumes inheritance and borrowing as the two main ways of transmission that shape the basic lexicon of languages.

    THE SELF-REPORTED INFLUENCE OF USING SMS LANGUAGE IN TEXTING AND SOCIAL MEDIA ON SAUDI STUDENTS' ACADEMIC WRITING

    SMS language is a form of written English that is often used in informal, computer-mediated communications like texting, online chat, and social media. It is known for shortening many words using acronyms and other forms of abbreviation. SMS language is increasingly well documented in the scholarly literature and its impact on students' formal academic writing is a topic of debate. This study uses a mixed-methods approach to investigate students' perspectives on their own use of SMS language and how it might affect their formal academic writing. The sample was composed of final-year university students who are native Arabic speakers and acquired English as a second language. The data was collected through the use of quantitative questionnaires and supplemented with semi-structured qualitative interviews. The findings revealed that virtually all participants used features of SMS language in their online communications. Still, they struggled to recognize some of the most well-documented, commonly used abbreviations. The interviews showed that time, convenience, and character limits were the primary motivators for students to use SMS language. The findings also indicated that at least some students can recall having made spelling or sentence construction errors in formal academic writing that they attribute to their reliance on SMS language in their digital communications. Further scholarly attention is strongly indicated, and possible directions for future research are described.