5 research outputs found

    Definitely Islands? Experimental investigation of definite islands

    Get PDF
    Experimental work on islands has used formal acceptability judgment studies to quantify the severity of different island violations. The current study uses this approach to probe the (in-)violability of definite islands, an understudied island type, in offline and online measures. We conducted two acceptability judgment studies and found a modest island effect. However, rating distributions appear bimodal across definites and indefinites. We also conducted a self-paced reading experiment, but found no significant effects. Overall, offline, definite islands differ from other uniform islands, but online, the results are more complicated.
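    One common way acceptability judgment studies quantify island severity is a differences-in-differences (DD) score over a 2x2 factorial design crossing structure (island vs. non-island) and extraction distance (short vs. long). The sketch below illustrates that measure with invented z-scored ratings; it describes the general method, not necessarily this study's design or data.

    # Differences-in-differences (DD) score sketch; ratings are invented.
    mean_rating = {
        ("non_island", "short"): 0.8,
        ("non_island", "long"): 0.4,
        ("island", "short"): 0.7,
        ("island", "long"): -0.5,
    }

    # Cost of a long dependency outside vs. inside the island environment.
    d_non_island = mean_rating[("non_island", "short")] - mean_rating[("non_island", "long")]
    d_island = mean_rating[("island", "short")] - mean_rating[("island", "long")]

    dd_score = d_island - d_non_island  # > 0 indicates a superadditive island effect
    print(f"DD score: {dd_score:.2f}")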

    Trip the freaking light fantastic: Syntactic structure in English verbal idioms

    Get PDF
    In past scholarship, idioms have been discussed from a mostly semantic perspective; authors have been primarily concerned with how idiomatic meaning is composed and stored (Swinney and Cutler 1979; Gibbs 1980; 1986). This article investigates idioms’ syntactic behavior and concludes that all verbal idioms of English have stored, internal syntactic structure. Vacuous modification (i.e., modification that does not contribute to the semantics of the phrase), metalinguistic modification (i.e., modification that indicates non-literal readings), aspect, and subject-oriented adverbs (SOAs) are used to test a variety of idioms for evidence of syntactic structure. There are restrictions on the syntactic processes some idioms can undergo (e.g., passivization and raising constructions). However, this is not due to a lack of internal syntax, but to how idiomatic meaning is mapped onto that internal syntax.

    Count or Context: Investigating Methods of Text Analysis

    Get PDF
    Using text as a source of psychological and cognitive information has become a popular subject (Robinson, Navea & Ickes, 2013; Donahue, Liang & Druckman, 2014; Wolfe & Goldman, 2003). To do this, researchers use a variety of methods to analyze text, and Linguistic Inquiry and Word Count (LIWC) has become one of the more common techniques. LIWC is a token-based method that contains multiple dictionaries representing various psychological states (positive affect, leisure, religion, social words) and keeps a running tabulation of how many words in a given text occur in each category. Latent Semantic Analysis (LSA) is a context-based method that uses statistics to calculate similarity between different texts based on the surrounding words. As a common strategy for analyzing text for psychological states, it is important that LIWC be truly representative of the aspects it explores. The dictionaries must accurately represent the categories they measure to be an authentic assessment of the analyzed psychological and cognitive states. The current study seeks to use LSA to improve LIWC. The hypothesis is that a combination of the two methods will perform better than the application of a single token-based method. LIWC and two other token-based methods were compared to a combination LSA-token method. These methods were applied to a set of headlines that had been previously judged by humans in terms of emotion and positive/negative valence. The first part of the experiment compared the token-based methods to confirm that they were different from each other but still successful measures of the stimuli. The second part compared the correlation between the token-based methods and the correct responses for the pre-tagged data against the correlation between the combination method and the pre-tagged stimuli. The findings did not support the hypothesis, as the combination method performed worse than the token-based methods. These results, however, suggest further investigation into the power of LSA and its reliance on context. Specifically, LSA may be suited to the analysis of longer, more semantically complex texts, not short, basic samples like the headlines used in this study.
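    The sketch below illustrates, under simplified assumptions, the two kinds of method the abstract contrasts: a LIWC-style token count over category dictionaries and an LSA-style cosine similarity over reduced text vectors. The category word lists, example headline, and vectors are invented for illustration; they are not LIWC's dictionaries, the study's stimuli, or a real LSA model.

    # Minimal sketch contrasting a LIWC-style token count with an LSA-style
    # similarity score. Word lists and vectors are made up for illustration.
    import numpy as np

    # Token-based: count how many words in a text fall in each category.
    CATEGORIES = {
        "positive_affect": {"happy", "joy", "love", "glad"},
        "negative_affect": {"sad", "angry", "fear", "hate"},
    }

    def token_counts(text):
        tokens = text.lower().split()
        return {cat: sum(t in words for t in tokens)
                for cat, words in CATEGORIES.items()}

    # Context-based: compare texts by the cosine of their (reduced) vectors.
    # A real LSA pipeline would build a term-document matrix and apply SVD;
    # these short vectors stand in for already-reduced LSA representations.
    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    headline = "flood leaves thousands homeless and angry"
    print(token_counts(headline))  # {'positive_affect': 0, 'negative_affect': 1}

    vec_headline = np.array([0.2, 0.9, 0.1])
    vec_sadness_seed = np.array([0.1, 0.8, 0.3])
    print(cosine(vec_headline, vec_sadness_seed))  # similarity to an emotion "seed" text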

    Corpus-Guided Contrast Sets for Morphosyntactic Feature Detection in Low-Resource English Varieties

    Full text link
    The study of language variation examines how language varies between and within different groups of speakers, shedding light on how we use language to construct identities and how social contexts affect language use. A common method is to identify instances of a certain linguistic feature - say, the zero copula construction - in a corpus, and analyze the feature's distribution across speakers, topics, and other variables, to either gain a qualitative understanding of the feature's function or systematically measure variation. In this paper, we explore the challenging task of automatic morphosyntactic feature detection in low-resource English varieties. We present a human-in-the-loop approach to generate and filter effective contrast sets via corpus-guided edits. We show that our approach improves feature detection for both Indian English and African American English, demonstrate how it can assist linguistic research, and release our fine-tuned models for use by other researchers. (Comment: Field Matters Workshop at COLING 2022)
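    As a rough illustration of the idea, the sketch below shows what a contrast set for one feature (the zero copula) might look like and how a detector could be scored on it. The sentences, edits, and the placeholder predict() heuristic are invented; they are not the paper's data, contrast sets, or fine-tuned models.

    # A toy contrast set: each example with the feature is paired with a
    # minimal edit that removes it. All examples are invented.
    contrast_set = [
        # (sentence, has_zero_copula)
        ("She at the store right now", True),
        ("She is at the store right now", False),   # minimal edit of the line above
        ("They real quiet today", True),
        ("They are real quiet today", False),
    ]

    def predict(sentence):
        # Stand-in for a fine-tuned feature detector; replace with a real model.
        tokens = sentence.lower().split()
        return not any(t in {"is", "are", "was", "were", "be"} for t in tokens)

    correct = sum(predict(s) == label for s, label in contrast_set)
    print(f"contrast-set accuracy: {correct}/{len(contrast_set)}")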

    Investigation of the Effect of Contextual Factors on BIN Production in AAE

    Get PDF
    Treatments of African American English (AAE) in the literature have focused primarily on morphosyntactic differences from mainstream American English. One of these differences is found in the tense and aspect system. While both dialects have a present perfect use of “been”, AAE also has a stressed variant of “been”, termed BIN. This aspectual marker is featured in the literature, but the main focus has been on its prosodic qualities. It differs from present perfect been in that it has the semantics of a remote past marker (Rickford 1973, Rickford 1975, Green 1998). For a comprehensive understanding of AAE’s tense-aspect system, both syntactic-semantic and discourse-pragmatic aspects of these markers need to be studied as well. We conducted a production experiment with members of an AAE-speaking community in Southwest Louisiana, followed by an acceptability judgment task. The purpose of the experiment is twofold. First, it allows us to examine BIN production in canonical BIN environments and non-BIN environments. Second, by paying close attention to the context these environments occur in, we can also examine the influence of discourse-pragmatic factors (LONG-TIME, TEMPORAL JUST, POLAR QUESTIONS) on BIN production in unambiguous environments, as well as in ambiguous environments. The factors LONG-TIME and TEMPORAL JUST are found to be significant predictors of BIN production. Furthermore, there is a significant difference in ambiguity, such that the unambiguous contexts predicted BIN slightly less. Overall, the results of the experiment suggest that speakers are consistent in their BIN production for expected BIN environments, but more variable in the non-BIN environments for both unambiguous and ambiguous contexts. This raises the interesting question of why speakers are more variable in the non-BIN environments, as well as the question of what the discourse-pragmatic factors are actually capturing. Together, however, these results suggest that there are a variety of components that can influence BIN production. Future work could further investigate these components.
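    The sketch below shows one plausible shape for this kind of analysis: a logistic regression predicting BIN production from binary discourse-pragmatic factors. The simulated data frame, variable names, and coefficient values are invented for illustration only and do not reflect the experiment's data or results; the study itself may well have used a different (e.g., mixed-effects) model.

    # Illustrative-only simulation of a BIN-production analysis.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 200
    df = pd.DataFrame({
        "long_time": rng.integers(0, 2, n),      # context contains "a long time"
        "temporal_just": rng.integers(0, 2, n),  # context contains temporal "just"
        "ambiguous": rng.integers(0, 2, n),      # BIN and present-perfect readings both possible
    })
    # Arbitrary coefficients used only to simulate responses.
    lin_pred = 0.9 * df.long_time - 0.8 * df.temporal_just + 0.4 * df.ambiguous - 0.2
    df["bin_produced"] = rng.binomial(1, (1 / (1 + np.exp(-lin_pred))).to_numpy())

    model = smf.logit("bin_produced ~ long_time + temporal_just + ambiguous", data=df).fit()
    print(model.summary())

    In practice, repeated observations per speaker would usually call for a mixed-effects logistic regression with a random intercept for speaker; the formula above is only the fixed-effects core of such a model.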