35,655 research outputs found

    Using word and phrase abbreviation patterns to extract age from Twitter microtexts

    The wealth of texts available publicly online for analysis is ever increasing. Much work in computational linguistics focuses on syntactic, contextual, morphological and phonetic analysis on written documents, vocal recordings, or texts on the internet. Twitter messages present a unique challenge for computational linguistic analysis due to their constrained size. The constraint of 140 characters often prompts users to abbreviate words and phrases. Additionally, as an informal writing medium, messages are not expected to adhere to grammatically or orthographically standard English. As such, Twitter messages are noisy and do not necessarily conform to standard writing conventions of linguistic corpora, often requiring special pre-processing before advanced analysis can be done. In the area of computational linguistics, there is an interest in determining latent attributes of an author. Attributes such as author gender can be determined with some amount of success from many sources, using various methods, such as analysis of shallow linguistic patterns or topic. Author age is more difficult to determine, but previous research has been somewhat successful at classifying age as a binary (e.g. over or under 30), ternary, or even as a continuous variable using various techniques. Twitter messages present a difficult problem for latent user attribute analysis, due to the pre-processing necessary for many computational linguistics analysis tasks. An added logistical challenge is that very few latent attributes are explicitly defined by users on Twitter. Twitter messages are a part of an enormous data set, but the data set must be independently annotated for latent writer attributes not defined through the Twitter API before any classification on such attributes can be done. The actual classification problem is another particular challenge due to restrictions on tweet length. 
Previous work has shown that word and phrase abbreviation patterns used on Twitter can be indicative of some latent user attributes, such as geographic region or the Twitter client (iPhone, Android, Twitter website, etc.) used to make posts. Language change has generally been posited as being driven by women. This study explores whether there are age-related patterns, or changes in those patterns over time, evident in Twitter posts from a variety of English authors. This work presents a growable data set annotated by Twitter users themselves for age and other useful attributes. The study also presents an extension of prior work on Twitter abbreviation patterns, which shows that word and phrase abbreviation patterns can be used toward determining user age. Notable results include classification accuracy of up to 83%, which was 63% above the relative majority-class baseline (ZeroR in Weka), when classifying user ages into 6 equally sized age bins using a multilayer perceptron network classifier.
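    The majority-class baseline mentioned in the abstract (ZeroR in Weka) always predicts the most frequent class, so its accuracy equals that class's share of the data. The following is a minimal sketch of that idea, not the study's code: the function name `zero_r_accuracy` and the synthetic labels (6 bins of 100 users each) are assumptions made for illustration.

```python
from collections import Counter


def zero_r_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    # most_common(1) returns a one-element list of (label, count).
    _, majority_count = Counter(labels).most_common(1)[0]
    return majority_count / len(labels)


# Hypothetical data: 6 age bins with exactly 100 users in each bin.
labels = [bin_id for bin_id in range(6) for _ in range(100)]
baseline = zero_r_accuracy(labels)
print(f"ZeroR baseline accuracy: {baseline:.3f}")
```

    With perfectly balanced bins the baseline is 1/6, so any classifier has to beat roughly 17% accuracy to be informative. If "63% above" is read as percentage points, the abstract's figures would imply a baseline near 20%, which suggests the real bins were only approximately equal; that reading is an inference, not something the abstract states.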

    Exploring Identities in Online Music Fandoms: How Identities Formed in Online Fan Communities Affect Real Life Identities

    This thesis set out to explore the identities formed by members of online fandom communities, and to determine the ways in which those identities affect their real-life, offline identities. This qualitative study encountered elements related to stereotypes of young women who are fans of mainstream pop music, and provided insight into their experiences through interviews with five long-time members of online boy band fandoms. The study asked whether fans prefer to keep their fandom identities internal or let them reflect outward, how one's online identity affects or translates to their real-life identity, and which experiences in the online fandom were the most impactful to the individual's real-life identity or led to new knowledge. It was revealed that the online community platform is the place where fans gather to enjoy a shared passion, but it is the relationships and discussions held on the site between fans that truly affect an individual and their identity, rather than the more superficial elements of being in a fandom. Through fandom discourse, members found social support and solidarity with one another.

    Youth and Unions

    [Excerpt] Following a suggestion from the Cornell ILR Labor Advisory Counsel in early 2009, Cornell ILR began studying the relationships between young workers and unions. Marlena Fontes, a Cornell student, worked with Cornell Extension Faculty Ken Margolies and others during the summer of 2009 on the study. The study is based on a literature review, survey research, observations and focus groups. The report provides a glimpse into the issues facing young people and unions, and into how unions are seeking to organize and involve young workers and members. The table on page 9 summarizes the survey research conducted by Ms. Fontes and two other Cornell summer Fellows.

    A Sociolinguistic Study of Code Choice among Saudis on Twitter

    The present study is an attempt to explore a new dimension of language use: how Arabic is utilized on social media, Twitter in particular. It examines codeswitching (CS) in its written form between standard Arabic (SA) and Saudi dialect (SD). It aims to answer three research questions, namely: 1. What are the functions of using CS on Saudi Twitter? Are these functions different from the functions of CS in face-to-face interactions? 2. Do patterns of CS differ by gender and education? 3. Do patterns of CS differ by topic? The current study adopts the sociolinguistic approach and provides a qualitative descriptive and quantitative analysis of 7350 tweets, which were collected between December 2016 and July 2017 from 210 Saudi Twitter accounts diversified in terms of gender and education. The goal was to compare the motivations for CS in the written form with those that have been identified in face-to-face interactions, and to explore whether CS patterns would differ by gender and education. An additional 500 tweets were collected to investigate whether or not CS patterns would change by topic. The findings revealed that the Saudi Twitter community utilized SA more than the SD. The study revealed that CS to SA is correlated with prestige, importance, sophistication, and seriousness. It revealed that the Saudi Twitter community switched to SA for the following social motivations: 1. to introduce formulaic expressions 2. to emphasize a point 3. to quote 4. to shift from comic to serious tone 5. to take a pedantic stand. In contrast, the SD, or the Low variety, is associated with sarcasm, informality, low prestige, and everyday topics. It revealed that the Saudi Twitter community switched to the SD for the following social motivations: 1. for a specific intended meaning 2. for sarcasm and criticism 3. for quotations 4. for exemplifying and simplification 5. for introducing daily-life sayings 6. for scolding and personal attack or insult 7. for common usage. Regarding the role of topic in CS patterns, the present study provided evidence against Ferguson’s prediction (1959), in which he associated code choice with the topic and situation. It revealed that CS occurred in different contexts that varied in their formality and informality. Therefore, the study provided evidence that CS occurs to perform intended functions. As for gender, the study found that men utilized SA more than women, which confirms the previous findings of Ibrahim (1986), Abd-El-Jawad (1987), Badawi (1973), Haeri (1996a), Schmidt (1974), and Walters (1996) that women with the same level of education as men use SA less than men. Regarding education, the present study found that the Saudi Twitter users with high and college education used SA more than their counterparts with less than college education. However, the current study should have considered age in addition to gender and education, because education by itself might be “a proxy variable” that could act on behalf of other less obvious independent variables (Al-Wer 2009). The findings of the present study suggest studying each community independently, as each community differs in terms of its social variables, language attitudes, perceptions, and language policies. Finally, the study emphasizes the importance of teaching SA to Arabic learners, placing less focus on dialects due to the stability of SA, and designing as well as developing curriculums accordingly.
    PhD, Near Eastern Studies, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144071/1/saeedaa_1.pd

    Spartan Daily August 30, 2012

    Volume 139, Issue 3. https://scholarworks.sjsu.edu/spartandaily/1319/thumbnail.jp

    Technology Criticism in the Classroom (Chapter in The Nature of Technology)

    I first heard about a tragedy in Tucson, not from major television news networks, but from a direct message sent by a politically active friend who was attending the political gathering where a mass shooting took place, including the shooting of an Arizona congresswoman, Gabrielle Giffords. While the television news sputtered around trying to offer details (initially wrongly claiming that she was dead, likely from pressure to be the first to report big news), I found myself reading Google News, piecing together Facebook posts, e-mailing friends and reading Twitter updates.

    Spartan Daily, April 10, 2018

    Volume 150, Issue 29. https://scholarworks.sjsu.edu/spartan_daily_2018/1028/thumbnail.jp

    Spartan Daily, April 25, 2019

    Volume 152, Issue 37. https://scholarworks.sjsu.edu/spartan_daily_2019/1036/thumbnail.jp

    Spartan Daily September 27, 2012

    Volume 139, Issue 16. https://scholarworks.sjsu.edu/spartandaily/1332/thumbnail.jp

    Spartan Daily, October 5, 2017

    Volume 149, Issue 19. https://scholarworks.sjsu.edu/spartan_daily_2017/1060/thumbnail.jp